Sample records for audio machine text-to-speech

  1. A Hakka Text-To-Speech System

    Microsoft Academic Search

    Hsiu-min Yu; Hsin-te Hwang; Dong-yi Lin; Sin-horng Chen

    2006-01-01

    In this paper, the implementation of a Hakka text-to-speech (TTS) system is presented. The system is designed on the same principle used to develop the Mandarin and Min-Nan TTS systems proposed previously. It takes 671 base-syllables as basic synthesis units and uses a recurrent neural network (RNN)-based prosody generator to generate proper prosodic parameters for synthesizing natural output speech.

  2. Audio-Visual Teaching Machines.

    ERIC Educational Resources Information Center

    Dorsett, Loyd G.

    An audiovisual teaching machine (AVTM) presents programed audio and visual material simultaneously to a student and accepts his response. If his response is correct, the machine proceeds with the lesson; if it is incorrect, the machine so indicates and permits another choice (linear) or automatically presents supplementary material (branching).…

  3. Building a Prototype Text to Speech for Sanskrit

    NASA Astrophysics Data System (ADS)

    Mahananda, Baiju; Raju, C. M. S.; Patil, Ramalinga Reddy; Jha, Narayana; Varakhedi, Shrinivasa; Kishore, Prahallad

    This paper describes the work done in building a prototype text-to-speech system for Sanskrit. A basic prototype text-to-speech system is built using a simplified Sanskrit phone set and a unit selection technique, in which prerecorded sub-word units are concatenated to synthesize a sentence. We also discuss the issues involved in building a full-fledged text-to-speech system for Sanskrit.

  4. A grammatical component for a text-to-speech system

    Microsoft Academic Search

    R. Delmonte; G. Mian; G. Tisato

    1986-01-01

    A grammatical component to supply information to a text-to-speech system is presented. It is composed of four modules: a lexicon, a morphological recognizer, a syntactic preanalyzer, and a parser. The parser is composed of a bottom-up algorithm implementing a context-free grammar for Italian simply as Recursive Transition Networks (RTN). No conditions are introduced on the arcs: this will constitute the topic

  5. A text-to-speech system for Italian

    Microsoft Academic Search

    Rodolfo Delmonte; G. Mian; G. Tisato

    1984-01-01

    A system for the automatic translation of any text of Italian into naturally fluent speech is presented. The system, planned for use in a reading machine for the blind, is built up around a Phonological Processor (hence FP) and synthesizes speech by joining LPC coded diphones. The FP maps into prosodic structures the phonological rules of Italian. Structural information is provided

  6. Use Pronunciation by Analogy for text to speech system in Persian language

    Microsoft Academic Search

    Ali Jowharpour; Masha allah abbasi dezfuli; Mohammad hosein Yektaee

    2011-01-01

    The interest in text to speech synthesis increased in the world. Text to speech has been developed for many popular languages such as English, Spanish and French, and much research and development has been applied to those languages. Persian, on the other hand, has been given little attention compared to other languages of similar importance, and the research in Persian is still in

  7. "Look What I Did!": Student Conferences with Text-to-Speech Software

    ERIC Educational Resources Information Center

    Young, Chase; Stover, Katie

    2014-01-01

    The authors describe a strategy that empowers students to edit and revise their own writing. Students input their writing in to text-to-speech software that rereads the text aloud. While listening, students make necessary revisions and edits.

  8. Use Pronunciation by Analogy for text to speech system in Persian language

    E-print Network

    Jowharpour, Ali; Yektaee, Mohammad hosein

    2011-01-01

    The interest in text to speech synthesis increased in the world. Text to speech has been developed for many popular languages such as English, Spanish and French, and much research and development has been applied to those languages. Persian, on the other hand, has been given little attention compared to other languages of similar importance, and the research in Persian is still in its infancy. The Persian language has many difficulties and exceptions that increase the complexity of text to speech systems; for example, short vowels are absent in written text, and homograph words exist. In this paper we propose a new method for Persian text to phonetic conversion that is based on pronunciation by analogy in words, semantic relations and grammatical rules for finding the proper phonetic form. Keywords: PbA, text to speech, Persian language, FPbA

  9. Using Polysyllabic units for Text to Speech Synthesis in Indian languages

    E-print Network

    Sivalingam, Krishna M.

    Using Polysyllabic units for Text to Speech Synthesis in Indian languages Vinodh.M.V., Ashwin units. Firstly, a phone based TTS is built. Later, a monosyllable cluster unit TTS is built. It is observed that the quality of the synthesized sentences can improve if polysyllable units are used (when

  10. Construction of the acoustic inventory for a Greek text-to-speech concatenative synthesis system

    Microsoft Academic Search

    Costas Christogiannis; Theodora Varvarigou; A. Zappa; Yiannis Vamvakoulas; Chilin Shih; A. Arvaniti

    2000-01-01

    The development of the Greek text-to-speech (TTS) system by NTUA is based on the method of concatenative synthesis and follows the Bell Labs approach to this technique. Concatenative synthesis is one of the simplest methods for speech synthesis and at the same time bypasses most of the problems encountered by articulatory and formant synthesis techniques. The method relies on designing

  11. A rule-based approach to build a text-to-speech system for Romanian

    Microsoft Academic Search

    Ovidiu Buza; Gavril Toderean; Jozsef Domokos

    2010-01-01

    We present in this article our approach for building a text-to-speech system for Romanian. Main stages of this work were: voice signal analysis, region segmentation, construction of acoustic database, text analysis, unit and prosody detection, unit matching, concatenation and speech synthesis. In our approach we consider word syllables as basic units and stress indicating intrasegmental prosody. A special characteristic of

  12. Integrating Text-to-Speech Software into Pedagogically Sound Teaching and Learning Scenarios

    ERIC Educational Resources Information Center

    Rughooputh, S. D. D. V.; Santally, M. I.

    2009-01-01

    This paper presents a new technique of delivery of classes--an instructional technique which will no doubt revolutionize the teaching and learning, whether for on-campus, blended or online modules. This is based on the simple task of instructionally incorporating text-to-speech software embedded in the lecture slides that will simulate exactly the…

  13. Basic Research and Implementation Decisions for a Text-to-Speech Synthesis System in Romanian

    Microsoft Academic Search

    Dragos Burileanu

    2002-01-01

    Speech synthesis is one of the most language-dependent domains of speech technology. In particular, the natural language processing stage of a text-to-speech (TTS) system contains the largest part of the linguistic knowledge for a given language. In this respect, one can state that building a high-quality TTS system for a new language involves many theoretical and technical challenges. Especially, extensive

  14. Quality Preserving Compression of a Concatenative Text-To-Speech Acoustic Database

    Microsoft Academic Search

    Tamar Shoham; David Malah; Slava Shechtman

    2012-01-01

    A concatenative text-to-speech (CTTS) synthesizer requires a large acoustic database for high-quality speech synthesis. This database consists of many acoustic leaves, each containing a number of short, compressed, speech segments. In this paper, we propose two algorithms for recompression of the acoustic database, by recompressing the data in each acoustic leaf, without compromising the perceptual quality of the obtained synthesized

  15. Multi voice text to speech synthesis based on the instantaneous parametric voice conversion

    Microsoft Academic Search

    Elias Azarov; Alexander A. Petrovsky; Piotr Zubrycki

    2010-01-01

    The paper describes an approach to text-to-speech synthesis based on processing in harmonic domain. A special harmonic analysis technique is presented that provides accurate estimation of instantaneous harmonic parameters. The technique is based on narrow band filtering aligned to the fundamental frequency, which improves estimation accuracy of higher-order harmonics with rapid frequency changes. The advanced analysis ensures natural-sounding amplitude, pitch

  16. A fuzzy decision tree-based duration model for Standard Yorùbá text-to-speech synthesis

    Microsoft Academic Search

    Odétúnjí A. Odéjobí; Shun Ha Sylvia Wong; Anthony J. Beaumont

    2007-01-01

    In this paper, we present syllable-based duration modelling in the context of a prosody model for Standard Yorùbá (SY) text-to-speech (TTS) synthesis applications. Our prosody model is conceptualised around a modular holistic framework. This framework is implemented using the Relational Tree (R-Tree) techniques. An important feature of our R-Tree framework is its flexibility in that it facilitates the independent implementation

  17. A Computational Model of Intonation for Yorùbá Text-to-Speech Synthesis: Design and Analysis

    Microsoft Academic Search

    Odétúnjí A. Odéjobí; Anthony J. Beaumont; Shun Ha Sylvia Wong

    2004-01-01

    In this paper we present the design and analysis of an intonation model for text-to-speech (TTS) synthesis applications using a combination of Relational Tree (RT) and Fuzzy Logic (FL) technologies. The model is demonstrated using the Standard Yorùbá (SY) language. In the proposed intonation model, phonological information extracted from text is converted into an RT. RT is a sophisticated data

  18. A joint prosody evaluation of French text-to-speech synthesis systems

    Microsoft Academic Search

    Marie-Neige Garcia; Christophe d'Alessandro; Gérard Bailly; Philippe Boula de Mareüil; Michel Morel

    2006-01-01

    This paper reports on prosodic evaluation in the framework of the EVALDA/EvaSy project for text-to-speech (TTS) evaluation for the French language. Prosody is evaluated using a prosodic transplantation paradigm. Intonation contours generated by the synthesis systems are transplanted on a common segmental content. Both diphone based synthesis and natural speech are used. Five TTS systems are tested along with natural

  19. Faking it: Synthetic text-to-speech synthesis for under-resourced languages - Experimental design

    Microsoft Academic Search

    Harold Somers

    Speech synthesis or text-to-speech (TTS) systems are currently available for a number of the world's major languages, but for thousands of the world's 'minor' languages no such technology is available. While awaiting the development of such technology, we would like to try the stop-gap solution of using an existing TTS system for a major language (the base language) to 'fake'

  20. A Text-to-Speech Platform for Variable Length Optimal Unit Searching Using Perceptual Cost Functions

    Microsoft Academic Search

    Minkyu Lee; Daniel P. Lopresti; Joseph P. Olive

    In concatenative Text-to-Speech, the size of the speech corpus is closely related to synthetic speech quality. In this paper, we describe our work on a new corpus-based Bell Labs' TTS system. This encompasses large acoustic inventories with a rich set of annotations, models and data structures for representing and managing such inventories, and an optimal unit selection algorithm that

  1. Sparte: A text-to-speech machine using synthesis by diphones

    Microsoft Academic Search

    J.-L. Courbon; F. Emerard

    1982-01-01

    An operational synthesis system for the French language has been developed at the CNET. Vocal output of any message in the language may be obtained from input typed on a keyboard. This speech synthesis equipment is built in an autonomous cabinet of fairly small dimensions; it can also be used as a single board with a V24-RS232 connection. The device

  2. Adaptive and Longitudinal Pharmaceutical Care Instruction Using an Interactive Voice Response/Text-to-Speech System

    PubMed Central

    Hussein, Gamal; Kawahara, Nancy

    2006-01-01

    Objectives: To develop a course structure that would more closely simulate the actual provision of pharmaceutical care. Design: An interactive voice response/text-to-speech system (hardware and software) for obtaining patient data was designed and used in a pharmaceutical care laboratory. Students called the system to collect data, listen to progress notes, make recommendations, and update the pharmaceutical care plan for virtual patients. Laboratory time was utilized to evaluate patient progress and respond to recommendations as well as to identify and solve drug-related problems. Assessment: Students' recorded communications with the system and completed care plans were evaluated and a competency-based final examination was administered. Peer evaluations and course evaluations were administered. Conclusion: This innovative approach challenged students and promoted interactive learning. Student evaluations indicated we achieved our objective of creating a course that more closely simulated the actual provision of pharmaceutical care. PMID:17149416

  3. Advancements in text-to-speech technology and implications for AAC applications

    NASA Astrophysics Data System (ADS)

    Syrdal, Ann K.

    2003-10-01

    Intelligibility was the initial focus in text-to-speech (TTS) research, since it is clearly a necessary condition for the application of the technology. Sufficiently high intelligibility (approximating human speech) has been achieved in the last decade by the better formant-based and concatenative TTS systems. This led to commercially available TTS systems for highly motivated users, particularly the blind and vocally impaired. Some unnatural qualities of TTS were exploited by these users, such as very fast speaking rates and altered pitch ranges for flagging relevant information. Recently, the focus in TTS research has turned to improving naturalness, so that synthetic speech sounds more human and less robotic. Unit selection approaches to concatenative synthesis have dramatically improved TTS quality, although at the cost of larger and more complex systems. This advancement in naturalness has made TTS technology more acceptable to the general public. The vocally impaired appreciate a more natural voice with which to represent themselves when communicating with others. Unit selection TTS does not achieve such high speaking rates as the earlier TTS systems, however, which is a disadvantage to some AAC device users. An important new research emphasis is to improve and increase the range of emotional expressiveness of TTS.

  4. Segmental intelligibility of four currently used text-to-speech synthesis methods.

    PubMed

    Venkatagiri, Horabail S

    2003-04-01

    The study investigated the segmental intelligibility of four currently available text-to-speech (TTS) products under 0-dB and 5-dB signal-to-noise ratios. The products were IBM ViaVoice version 5.1, which uses formant coding, Festival version 1.4.2, a diphone-based LPC TTS product, AT&T Next-Gen, a half-phone-based TTS product that uses harmonic-plus-noise method for synthesis, and FlexVoice2, a hybrid TTS product that combines concatenative and formant coding techniques. Overall, concatenative techniques were more intelligible than formant or hybrid techniques, with formant coding slightly better at modeling vowels and concatenative techniques marginally better at synthesizing consonants. No TTS product was better at resisting noise interference than others, although all were more intelligible at 5 dB than at 0-dB SNR. The better TTS products in this study were, on the average, 22% less intelligible and had about 3 times more phoneme errors than human voice under comparable listening conditions. The hybrid TTS technology of FlexVoice had the lowest intelligibility and highest error rates. There were discernible patterns of errors for stops, fricatives, and nasals. Unrestricted TTS output--e-mail messages, news reports, and so on--under high noise conditions prevalent in automobiles, airports, etc. will likely challenge the listeners. PMID:12703720
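    The 0-dB and 5-dB listening conditions in the record above can be illustrated with a minimal sketch. It assumes the signal-to-noise ratio is defined as the ratio of mean speech power to mean noise power; the abstract does not say how the noise was actually mixed, and the function names here are hypothetical.

```python
import math

def mean_power(samples):
    """Average power (mean squared amplitude) of a sequence of samples."""
    return sum(s * s for s in samples) / len(samples)

def measured_snr_db(speech, noise):
    """Signal-to-noise ratio in decibels, as a ratio of mean powers."""
    return 10 * math.log10(mean_power(speech) / mean_power(noise))

def scale_noise_for_snr(speech, noise, target_snr_db):
    """Return a scaled copy of `noise` that yields the requested SNR.

    Assumption: both inputs are equal-rate sample sequences with
    non-zero power; this is an illustration, not the study's procedure.
    """
    target_noise_power = mean_power(speech) / (10 ** (target_snr_db / 10))
    gain = math.sqrt(target_noise_power / mean_power(noise))
    return [gain * n for n in noise]

speech = [1.0, -1.0, 1.0, -1.0]
noise = [0.5, -0.25, 0.25, -0.5]
noise_0db = scale_noise_for_snr(speech, noise, 0)
noise_5db = scale_noise_for_snr(speech, noise, 5)
```

    At 0 dB the scaled noise carries the same mean power as the speech; at 5 dB the noise power is lower by a factor of about 3.16.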

  5. A joint intelligibility evaluation of French text-to-speech synthesis systems: the EvaSy SUS/ACR campaign

    Microsoft Academic Search

    Philippe Boula de Mareüil; Christophe d'Alessandro; Alexander Raake; Gérard Bailly; Marie-Neige Garcia; Michel Morel

    The EVALDA/EvaSy project is dedicated to the evaluation of text-to-speech synthesis systems for the French language. It is subdivided into four components: evaluation of the grapheme-to-phoneme conversion module (Boula de Mareüil et al., 2005), evaluation of prosody (Garcia et al., 2006), evaluation of intelligibility, and global evaluation of the quality of the synthesised speech. This paper reports on the key

  6. A Text-to-Speech Platform for Variable Length Optimal Unit Searching Using Perception Based Cost Functions

    Microsoft Academic Search

    Minkyu Lee; Daniel P. Lopresti; Joseph P. Olive

    2003-01-01

    In concatenative Text-to-Speech, the size of the speech corpus is closely related to synthetic speech quality. In this paper, we describe our work on a new corpus-based Bell Labs' TTS system. This encompasses large acoustic inventories with a rich set of annotations, models and data structures for representing and managing such inventories, and an optimal unit selection algorithm that accommodates

  7. Event detection in field sports video using audio-visual features and a support vector machine

    Microsoft Academic Search

    David A. Sadlier; Noel E. O'Connor

    2005-01-01

    In this paper, we propose a novel audio-visual feature-based framework for event detection in broadcast video of multiple different field sports. Features indicating significant events are selected and robust detectors built. These features are rooted in characteristics common to all genres of field sports. The evidence gathered by the feature detectors is combined by means of a support vector machine,

  8. On the implications of machine virtualization for DRM and fair use: a case study of a virtual audio device driver

    Microsoft Academic Search

    Ninad Ghodke; Renato J. O. Figueiredo

    2004-01-01

    This paper examines the architecture of present day systems and shows that they are not trustworthy enough to support certain DRM features/restrictions, even when the DRM delivery system exclusively utilizes signed and protected operating system components. This weakness was discovered while creating a technique for remote transfer of audio streams generated by a Virtual Machine Monitor (VMM), to achieve network

  9. Study of an Audio Playback Machine Storage, Distribution, and Repair System. Options for Machine Operation. Study II, Part 1, Phase 2, Final Report.

    ERIC Educational Resources Information Center

    ManTech Technical Services Corp., Fairfax, VA.

    This report presents the results of a management study of audio playback equipment operations conducted by the National Library Service, Library of Congress, its associated network of state and local machine lending agencies (MLA), and other parties that play a role in current operations. The objectives were to document current operations,…

  10. Pontine Nucleus audio stimuli detection & modeling for brain machine interface rehabilitation of conditional learning

    Microsoft Academic Search

    Hanan Shteingart; Aryeh Taub; Hagit Messer

    2009-01-01

    In order to establish a brain-machine interface (BMI) system that rehabilitates damaged cerebellum function of discrete motor learning, the detection of conditional and unconditional stimuli (CS and US) onset times based on electro-physiology recordings analysis is necessary. These signals are relayed through brainstem areas called Pontine Nucleus (PN) and the Inferior Olive (IO) respectively. In this paper we focus on

  11. Audio 2008: Audio Fixation

    ERIC Educational Resources Information Center

    Kaye, Alan L.

    2008-01-01

    Take a look around the bus or subway and see just how many people are bumping along to an iPod or an MP3 player. What they are listening to is their secret, but the many signature earbuds in sight should give one a real sense of just how pervasive digital audio has become. This article describes how that popularity is mirrored in library audio

  12. CREATING AUDIO KEYWORDS FOR EVENT DETECTION IN SOCCER VIDEO

    Microsoft Academic Search

    Changsheng Xu; Qi Tian; Heng Mui; Keng Terrace

    This paper presents a novel framework called audio keywords to assist event detection in soccer video. Audio keyword is a middle-level representation that can bridge the gap between low-level features and high-level semantics. Audio keywords are created from low-level audio features by using support vector machine learning. The created audio keywords can be used to detect semantic events in soccer video

  13. Creating audio keywords for event detection in soccer video

    Microsoft Academic Search

    Min Xu; N. C. Maddage; Changsheng Xu; M. Kankanhalli; Qi Tian

    2003-01-01

    This paper presents a novel framework called audio keywords to assist event detection in soccer video. Audio keyword is a middle-level representation that can bridge the gap between low-level features and high-level semantics. Audio keywords are created from low-level audio features by using support vector machine learning. The created audio keywords can be used to detect semantic events in soccer

  14. Audio Mining

    NSDL National Science Digital Library

    Leske, Cavin.

    2002-01-01

    Occasionally referred to as audio indexing, audio mining is a computerized task involving the processing of an audio file, extracting the dialog and creating a textual transcript, and searching the transcript for certain words or phrases. Considering the amount of audio content on the Internet and other sources, it is clear that audio mining is a growing technology. To get an idea of what audio mining is and how it can be used, people can read this article from the Cutter Consortium (1). It lists six broad areas that can benefit from using the technology and briefly discusses each one. A more detailed introduction is offered on the Leavitt Communications Web site (2). This article delves into how audio mining works by giving a basic technical understanding of the process. A new method of searching an audio file, dubbed the "phonetic search engine," is compared to traditional methods in this white paper (3). A publication from the Compaq Cambridge Research Laboratory (4) discusses ways of collecting and analyzing information from an audio file. It also mentions SpeechBot, a Web-based tool for multimedia retrieval. Several papers can be downloaded from the home page of a research project studying the National Gallery of the Spoken Word (5). The repository is comprised of massive historical audio content, and the team at the University of Colorado is investigating phrase recognition to index the data. Have you ever had a tune stuck in your head, but not known the name of the artist or song title? The Musical Audio-Mining project (6) is working on ways to search for information about a song simply by humming part of it. Audio mining can also be used in the War on Terrorism, as is described in this article of Federal Computer Week (7). Massive amounts of recorded phone conversations are intercepted by the government each day, and audio mining would be an efficient way to sort through irrelevant material and catch suspicious activity. The World Wide Web Consortium released this draft of the Voice Extensible Markup Language (8), which could have applications for the audio mining community.
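    The final step named in the record above, searching a transcript for certain words or phrases, can be sketched as follows. This is only an illustration: it assumes a speech recognizer has already produced a time-stamped transcript as a list of (start_seconds, word) pairs, and both that data layout and the function name are hypothetical.

```python
def find_phrase(transcript, phrase):
    """Return the start times at which `phrase` occurs in `transcript`.

    `transcript` is a list of (start_seconds, word) pairs (an assumed
    layout). Matching ignores case and surrounding punctuation.
    """
    words = [w.lower().strip(".,!?;:\"'") for _, w in transcript]
    target = phrase.lower().split()
    if not target:
        return []
    hits = []
    for i in range(len(words) - len(target) + 1):
        # Compare a window of transcript words against the phrase.
        if words[i:i + len(target)] == target:
            hits.append(transcript[i][0])
    return hits

transcript = [
    (0.0, "Audio"), (0.4, "mining"), (0.9, "is"), (1.1, "a"),
    (1.3, "growing"), (1.8, "technology."),
    (2.5, "Audio"), (2.9, "Mining"), (3.3, "works."),
]
hits = find_phrase(transcript, "audio mining")
```

    Returning time offsets rather than word positions lets a listener jump straight to the matching point in the recording.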

  15. Fast transcription of unstructured audio recordings

    Microsoft Academic Search

    Brandon C. Roy; Deb Roy

    2009-01-01

    We introduce a new method for human-machine collaborative speech transcription that is significantly faster than existing transcription methods. In this approach, automatic audio processing algorithms are used to robustly detect speech in audio recordings and split speech into short, easy to transcribe segments. Sequences of speech segments are loaded into a transcription interface that enables a human

  16. Probabilistic Modeling Paradigms for Audio Source Separation

    E-print Network

    Plumbley, Mark

    Probabilistic Modeling Paradigms for Audio Source Separation Emmanuel Vincent IRISA-INRIA, France-of-the-art systems. KEYWORDS Source separation, latent variable model, spatial model, spectral model, Bayesian, M. E. Davies. Probabilistic modeling paradigms for audio source separation. In W. Wang (Ed), Machine

  17. Content-Based Audio Classification and Retrieval by Support Vector Machines (IEEE Transactions on Neural Networks, Vol. 14, No. 1, January 2003, p. 209)

    E-print Network

    Guo, Guodong

    Euclidean (Mahalanobis) distance and the nearest neighbor (NN) rule are used to classify the query sound as the audio features. A tree-structured vector quantizer is used to partition the feature vector space

  18. Modem/Audio Integration Concurrent Audio And

    E-print Network

    Maher, Robert C.

    Modem/Audio Integration. Concurrent Audio And Modem Acceleration. Dr. Rob Maher, Engineering Manager. Introduction and Scope. Impact of Audio/Modem Acceleration. Features and Cost

  19. Learning Auditory Models of Machine Voices (2005 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, October 16-19, 2005, New Paltz, NY)

    E-print Network

    Ellis, Dan

    to traditional therapy akin to art therapy and music therapy, utilizes the sounds of machines as relational@ee.columbia.edu ABSTRACT Vocal imitation is often found useful in Machine Therapy sessions as it creates an emphatic by our work in Machine Therapy in which humans try to vocally imitate machines, but this task also

  20. Video salient event classification using audio features

    NASA Astrophysics Data System (ADS)

    Corchs, Silvia; Ciocca, Gianluigi; Fiori, Massimiliano; Gasparini, Francesca

    2014-03-01

    The aim of this work is to detect the events in video sequences that are salient with respect to the audio signal. In particular, we focus on the audio analysis of a video, with the goal of finding which are the significant features to detect audio-salient events. In our work we have extracted the audio tracks from videos of different sport events. For each video, we have manually labeled the salient audio-events using the binary markings. On each frame, features in both time and frequency domains have been considered. These features have been used to train different classifiers: Classification and Regression Trees, Support Vector Machine, and k-Nearest Neighbor. The classification performances are reported in terms of confusion matrices.

  1. Detecting double compression of audio signal

    NASA Astrophysics Data System (ADS)

    Yang, Rui; Shi, Yun Q.; Huang, Jiwu

    2010-01-01

    MP3 is the most popular audio format in daily life; for example, music downloaded from the Internet and files saved on digital recorders are often in MP3 format. However, low-bitrate MP3s are often transcoded to a high bitrate, since high-bitrate files have higher commercial value. Audio recordings made on digital recorders can also be doctored easily with pervasive audio editing software. This paper presents two methods for the detection of double MP3 compression. The methods are essential for finding fake-quality MP3s and for audio forensics. The proposed methods use support vector machine classifiers with feature vectors formed by the distributions of the first digits of the quantized MDCT (modified discrete cosine transform) coefficients. Extensive experiments demonstrate the effectiveness of the proposed methods. To the best of our knowledge, this work is the first to detect double compression of audio signals.
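    The feature described in the record above, the distribution of the first digits of quantized MDCT coefficients, can be sketched as follows. This is an illustrative assumption rather than the authors' implementation: it takes the quantized coefficients as a plain list of integers (decoding them from an MP3 file is not shown), skips zeros, and returns the relative frequency of leading digits 1 through 9. The resulting nine-element vector is the kind of input that could be passed to a support vector machine classifier.

```python
def first_digit_histogram(coefficients):
    """Relative frequency of leading digits 1..9 among non-zero integers.

    `coefficients` stands in for quantized MDCT coefficients (assumed
    to be already extracted). Returns a list of nine floats.
    """
    counts = [0] * 9
    for c in coefficients:
        c = abs(int(c))
        if c == 0:
            continue  # zero has no leading digit in 1..9
        while c >= 10:
            c //= 10  # strip trailing digits until one digit remains
        counts[c - 1] += 1
    total = sum(counts)
    if total == 0:
        return [0.0] * 9
    return [n / total for n in counts]

features = first_digit_histogram([1, 12, 25, -3, 0, 100])
```

    In this toy input, three of the five non-zero values start with 1, one starts with 2, and one starts with 3.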

  2. Violence Content Classification Using Audio Features

    Microsoft Academic Search

    Theodoros Giannakopoulos; Dimitrios I. Kosmopoulos; Andreas Aristidou; Sergios Theodoridis

    2006-01-01

    This work studies the problem of violence detection in audio data, which can be used for automated content rating. We employ some popular frame-level audio features both from the time and frequency domain. Afterwards, several statistics of the calculated feature sequences are fed as input to a Support Vector Machine classifier, which decides about the segment content with respect to

  3. RECENT ADVANCES IN MULTILINGUAL TEXT-TO-SPEECH SYNTHESIS

    Microsoft Academic Search

    Bernd M; Juergen Schroeter; Jan van Santen; Richard Sproat; Joseph Olive

    1996-01-01

    In this paper we will discuss recent advances in multilingual text-to-speech (TTS) synthesis research at AT&T Bell Laboratories. The TTS system developed at AT&T Bell Laboratories generates synthetic speech by concatenating segments of natural speech. The architecture of the system is designed as a modular pipeline where each module handles one particular step in the process of converting text into speech. Besides conceptual and computational advantages, the modular structure has

  4. BIBLIOGRAPHY Text-to-speech in Vocabulary Acquisition and Student

    E-print Network

    . Cambridge, UK: Cambridge University Press. Collins-Thompson, K. & Callan, J. (2004). A language modeling-9). Dundee, England: University of Abertay Dundee. Fender, M. (2003). English word recognition and word.S.A. Horst, M., Cobb, T. & Meara, P. (1998). Beyond Clockwork Orange: Acquiring second Language Vocabulary

  5. Prosodic Word Boundaries Prediction for Mandarin Text-to-Speech

    Microsoft Academic Search

    YanQiu Shao; JiQing Han; Ting Liu; YongZhen Zhao

    In Mandarin speech, the Prosodic Word (PW) is the basic rhythmic unit instead of Lexical Word (LW), and the naturalness of TTS will be directly influenced by the segmentation of PW. Most of the PWs are the combination of some LWs. In this paper, three models, i.e. a directed acyclic graph (DAG) model, segmentation model and Markov Model (MM) combined

  6. Designing Audio Environments - Not Audio Interfaces

    Microsoft Academic Search

    Arthur Murphy

    In this paper we will first consider several issues regarding audio-based access to GUIs and information. We will survey several approaches for GUI access and audio-related research efforts. The primary concern of our work is providing environments for non-visual access to hyper-linked information. We will propose several concepts and metaphors for the design of audio environments, and present a preliminary prototype that embodies some of these concepts. The results of

  7. Multilingual Video and Audio News Alerting

    Microsoft Academic Search

    David D. Palmer; Patrick Bray; Andrew Merlino; Francis Kubala

    This paper describes a fully-automated real-time broadcast news video and audio processing system. The system combines speech recognition, machine translation, and cross-lingual information retrieval components to enable real-time alerting from live English and Arabic news sources.

  8. Audio Engineering Society

    NSDL National Science Digital Library

    Audio Engineering Society, Inc.

    The Audio Engineering Society (AES), now in its fifth decade, is the only professional society devoted exclusively to audio technology. Its membership consists of leading engineers, scientists and other authorities throughout the world. The Web site has links to information about audio education, events, careers and more.

  9. Blind Audio Source Separation

    Microsoft Academic Search

    Emmanuel Vincent; Maria G. Jafari; Samer A. Abdallah; Mark D. Plumbley; Mike E. Davies

    Most audio signals are mixtures of several audio sources which are active simultaneously. For example, live debates are mixtures of several speakers, music CDs are mixtures of musical instruments and singers, and movie soundtracks are mixtures of speech, music and natural sounds. Blind Audio Source Separation (BASS) is the problem of recovering each source signal from a given mixture signal.
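
To make the mixture model concrete, here is a minimal sketch, assuming an instantaneous two-source, two-channel mix (no delays or reverberation) with a mixing matrix chosen purely for illustration. It is not blind: it inverts a known matrix, which is the step a BASS method has to approximate without knowing that matrix. All function and variable names are hypothetical and do not come from the paper.

```python
import math

def mix(sources, A):
    """Instantaneous mixing: each output channel is a weighted sum of the inputs."""
    n = len(sources[0])
    return [[sum(A[i][j] * sources[j][t] for j in range(len(sources)))
             for t in range(n)]
            for i in range(len(A))]

def unmix_2x2(mixtures, A):
    """Recover two sources by inverting a KNOWN 2x2 mixing matrix."""
    (a, b), (c, d) = A
    det = a * d - b * c
    inv = [[d / det, -b / det], [-c / det, a / det]]
    return mix(mixtures, inv)

# Two toy sources sampled at 8 kHz: a 440 Hz tone and a 30 Hz square wave.
fs = 8000
s1 = [math.sin(2 * math.pi * 440 * t / fs) for t in range(fs // 10)]
s2 = [1.0 if math.sin(2 * math.pi * 30 * t / fs) >= 0 else -1.0
      for t in range(fs // 10)]
A = [[0.8, 0.3], [0.2, 0.7]]  # assumed mixing matrix, for illustration only

x = mix([s1, s2], A)    # what the two microphones would record
est = unmix_2x2(x, A)   # separation when A is known
max_err = max(abs(e - s)
              for est_ch, src in zip(est, [s1, s2])
              for e, s in zip(est_ch, src))
```

In the blind setting only `x` is available, so a method must estimate the unmixing matrix from statistical properties of the sources.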

  10. AudioNet

    NSDL National Science Digital Library

    For Internauts with RealAudio 1.0 capability (a 14.4 modem) try AudioNet, the "Broadcast Network of the Internet." AudioNet offers live broadcasts of over ten different talk radio stations, including WOR--New York, WTEM--Washington D.C., and XTRA--San Diego. It also offers several music radio stations, a selection of audio books, and numerous live (and recent) sporting events such as NIT and NCAA Men's and Women's basketball games and college baseball games. http://www.audionet.com/ Free RealAudio 1.0 and 2.0 players can be downloaded from the above sites. RealAudio 2.0 players will play RealAudio 1.0 sites, but 1.0 players will not play 2.0 sites. For more information on this and other plug-ins, visit the Scout Toolkit: webtools/plugins.html

  11. Audio-Visual Speech Source Separation Bertrand Rivet, Wenwu Wang, Syed Mohsen Naqvi and Jonathon A. Chambers

    E-print Network

    Boyer, Edmond

    1 Audio-Visual Speech Source Separation Bertrand Rivet, Wenwu Wang, Syed Mohsen Naqvi and Jonathon of audio-visual speech source separation which targets at doing likewise in a machine. Success in audio-visual speech source separation building from early methods which simply use the visual modality

  12. Audio Engineers: Sound Weavers

    NSDL National Science Digital Library

    Integrated Teaching and Learning Program,

    Students are introduced to audio engineers, discovering the type of environment in which they work and exactly what they do on a day-to-day basis. Students come to realize that audio engineers help produce their favorite music and movies.

  13. Audio detection algorithms

    NASA Astrophysics Data System (ADS)

    Neta, B.; Mansager, B.

    1992-08-01

    Audio information concerning targets generally includes direction, frequencies, and energy levels. One use of audio cueing is to use direction information to help determine where more sensitive visual direction and acquisition sensors should be directed. Generally, use of audio cueing will shorten times required for visual detection, although there could be circumstances where the audio information is misleading and degrades visual performance. Audio signatures can also be useful for helping classify the emanating platform, as well as to provide estimates of its velocity. The Janus combat simulation is the premier high resolution model used by the Army and other agencies to conduct research. This model has a visual detection model which essentially incorporates algorithms as described by Hartman (1985). The model in its current form does not have any sound cueing capability. This report is part of a research effort to investigate the utility of developing such a capability.

  14. An introduction to super audio CD and DVD-Audio

    Microsoft Academic Search

    K. Konstantinides

    2003-01-01

    Highlights the latest developments in consumer audio and specifically in DVD-Audio and SACD. The DVD-Audio specification allows for up to 24-b PCM data and uses the Meridian lossless packing (MLP) algorithm to provide up to six channels of high-quality, multichannel audio at sampling rates of up to 96 kHz for six channels or 192 kHz for two channels. Super-audio CD
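
The figures quoted in this abstract can be turned into raw data rates with simple arithmetic. The sketch below computes uncompressed PCM bit rates only; it does not model Meridian lossless packing, whose compression ratio the abstract does not state. The function name is hypothetical, and the CD figure (16 bit, 44.1 kHz, stereo) is added for comparison.

```python
def pcm_bit_rate(bits_per_sample, sample_rate_hz, channels):
    """Raw (uncompressed) PCM bit rate in bits per second."""
    return bits_per_sample * sample_rate_hz * channels

six_ch = pcm_bit_rate(24, 96_000, 6)    # 13_824_000 bit/s
two_ch = pcm_bit_rate(24, 192_000, 2)   # 9_216_000 bit/s
cd = pcm_bit_rate(16, 44_100, 2)        # 1_411_200 bit/s, CD audio for comparison
```

The six-channel case needs roughly ten times the raw rate of CD audio, which gives some sense of why a lossless packing step is part of the specification.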

  15. Audio-visual Segmentation and "The Cocktail Party Effect" Trevor Darrell1

    E-print Network

    Darrell, Trevor

    independent sources. Prior statistical approaches to source separation often took audio-only approaches. The "blind source separation" problem has been studied extensively in the machine learning literature audio recognition, a single source must first be isolated. Existing solutions to this problem generally

  16. Acoustic chase : designing an interactive audio environment to stimulate human body movement

    E-print Network

    Schiessl, Simon Karl Josef, 1972-

    2004-01-01

    An immersive audio environment was created that explores how humans react to commands imposed by a machine generating its acoustic stimuli on the basis of tracked body movement. In this environment, different states of ...

  17. Signal Processing for Audio HCI

    Microsoft Academic Search

    Dmitry N. Zotkin; Ramani Duraiswami

    This chapter reviews recent advances in computer audio processing from the viewpoint of improving the human-computer interface. Microphone arrays are described as basic tools for untethered audio acquisition, and principles for the synthesis of realistic virtual audio are outlined. The influence of room acoustics on audio acquisition and production is also considered. The chapter finishes with a review of several

  18. 3D Audio System

    NASA Technical Reports Server (NTRS)

    1992-01-01

    Ames Research Center research into virtual reality led to the development of the Convolvotron, a high speed digital audio processing system that delivers three-dimensional sound over headphones. It consists of a two-card set designed for use with a personal computer. The Convolvotron's primary application is presentation of 3D audio signals over headphones. Four independent sound sources are filtered with large time-varying filters that compensate for motion. The perceived location of the sound remains constant. Possible applications are in air traffic control towers or airplane cockpits, hearing and perception research and virtual reality development.

  19. Women's Audio Mission

    NSDL National Science Digital Library

    Women's Audio Mission

    Get the inside scoop on the recording industry. The Women's Audio Mission is dedicated to helping women and girls in the field. Not only can you learn more about the industry on these pages, you can see the way women have made their mark in it.

  20. Automatic audio morphing

    Microsoft Academic Search

    Malcolm Slaney; Michele Covell; Bud Lassiter

    1996-01-01

    This paper describes techniques to automatically morph from one sound to another. Audio morphing is accomplished by representing the sound in a multi-dimensional space that is warped or modified to produce a desired result. The multi-dimensional space encodes the spectral shape and pitch on orthogonal axes. After matching components of the sound, a morph smoothly interpolates the amplitudes to describe

  1. IEEE TRANSACTIONS ON MULTIMEDIA, VOL. 7, NO. 2, APRIL 2005 243 Audio/Visual Mapping With Cross-Modal

    E-print Network

    Gutierrez-Osuna, Ricardo

    -driven facial animation. This paper presents a brief overview of these models, followed by an analysis THE GOAL OF audio/visual (A/V) mapping is to produce accurate, synchronized and perceptually natural anima practical benefits in human-machine interfaces [1], since the combination of audio and visual information

  2. Maher Audio Enhancement via Time-Frequency Filtering AUDIO ENHANCEMENT

    E-print Network

    Maher, Robert C.

    Maher Audio Enhancement via Time-Frequency Filtering AUDIO ENHANCEMENT USING NONLINEAR TIME on a time-varying spectral representation of the noisy signal. The enhancement process adapts to the instantaneous signal behavior and alters the noisy signal so that the enhanced output signal is higher

  3. Engaging Students with Audio Feedback

    ERIC Educational Resources Information Center

    Cann, Alan

    2014-01-01

    Students express widespread dissatisfaction with academic feedback. Teaching staff perceive a frequent lack of student engagement with written feedback, much of which goes uncollected or unread. Published evidence shows that audio feedback is highly acceptable to students but is underused. This paper explores methods to produce and deliver audio

  4. ASDF: AUDIO SCENE DESCRIPTION FORMAT

    Microsoft Academic Search

    Matthias Geier; Sascha Spors

    The Audio Scene Description Format (ASDF) is a collaboratively evolving format for the storage and interchange of static, dynamic and interactive spatial audio content. This position paper briefly describes the current status and raises a list of open questions which shall be addressed in the panel discussion.

  5. Audio Watermarking with Error Correction

    E-print Network

    Chadha, Aman; Goel, Rishabh; Dave, Hiren; Roja, M Mani

    2011-01-01

    In recent times, communication through the internet has tremendously facilitated the distribution of multimedia data. Although this is indubitably a boon, one of its repercussions is that it has also given impetus to the notorious issue of online music piracy. Unethical attempts can also be made to deliberately alter such copyrighted data and thus, misuse it. Copyright violation by means of unauthorized distribution, as well as unauthorized tampering of copyrighted audio data is an important technological and research issue. Audio watermarking has been proposed as a solution to tackle this issue. The main purpose of audio watermarking is to protect against possible threats to the audio data and in case of copyright violation or unauthorized tampering, authenticity of such data can be disputed by virtue of audio watermarking.

  6. Audio Based Event Detection for Multimedia Surveillance

    Microsoft Academic Search

    Pradeep K. Atrey; Namunu C. Maddage; Mohan S. Kankanhalli

    2006-01-01

    With the increasing use of audio sensors in surveillance and monitoring applications, event detection using audio streams has emerged as an important research problem. This paper presents a hierarchical approach for audio based event detection for surveillance. The proposed approach first classifies a given audio frame into vocal and nonvocal events, and then performs further classification into normal and excited

  7. A text-to-speech system for Spanish with a frequency domain based prosodic modification algorithm

    Microsoft Academic Search

    E. R. Banga; E. López-Gonzalo; C. Garcia-Mateo

    1993-01-01

    From the input text, the linguistic-prosodic module obtains the phonetic transcription and prosodic marks that reflect both the syntactic structure and some rhythmical constraints. The synthesis module is a variation of the MBE (multiband excitation) vocoder with an LPC (linear predictive coding) filter that is very flexible for prosodic modifications. From a parametrized acoustic database, the algorithm decodes the speech

  8. Intonation contour realisation for Standard Yorůbá text-to-speech synthesis: A fuzzy computational approach

    Microsoft Academic Search

    Odétúnjí A. Odéjobí; Anthony J. Beaumont; Shun Ha Sylvia Wong

    2006-01-01

    This paper presents a novel intonation modelling approach and demonstrates its applicability using the Standard Yorůbá language. Our approach is motivated by the theory that abstract and realised forms of intonation and other dimensions of prosody should be modelled within a modular and unified framework. In our model, this framework is implemented using the Relational Tree (R-Tree) technique. The R-Tree

  9. UNICEF Video/Audio

    NSDL National Science Digital Library

    UNICEF is known throughout the world for their focus on the health, education, equality and protection of children. They produce a number of helpful research reports and policy briefs, and as visitors to this site will find out, a good deal of audio and visual material in the form of podcasts, video news reports, and radio programs. Visitors to the UNICEF Radio area will find a wide range of radio reports on topics such as Nigeria's efforts to contain outbreaks of avian influenza and the effects of floods in Mozambique on children. Visitors interested in podcasts will be impressed with the offerings here, as they include over one hundred total archived programs, and visitors can also sign up to receive each new addition to this collection.

  10. Metrological digital audio reconstruction

    DOEpatents

    Fadeyev; Vitaliy (Berkeley, CA), Haber; Carl (Berkeley, CA)

    2004-02-19

    Audio information stored in the undulations of grooves in a medium such as a phonograph record may be reconstructed, with little or no contact, by measuring the groove shape using precision metrology methods coupled with digital image processing and numerical analysis. The effects of damage, wear, and contamination may be compensated, in many cases, through image processing and analysis methods. The speed and data handling capacity of available computing hardware make this approach practical. Two examples used a general purpose optical metrology system to study a 50 year old 78 r.p.m. phonograph record and a commercial confocal scanning probe to study a 1920's celluloid Edison cylinder. Comparisons are presented with stylus playback of the samples and with a digitally re-mastered version of an original magnetic recording. There is also a more extensive implementation of this approach, with dedicated hardware and software.

  11. 47 CFR 73.403 - Digital audio broadcasting service requirements.

    Code of Federal Regulations, 2014 CFR

    2014-10-01

    ...false Digital audio broadcasting service requirements...Telecommunication FEDERAL COMMUNICATIONS COMMISSION (CONTINUED...SERVICES Digital Audio Broadcasting § 73.403 Digital audio broadcasting service...

  12. 47 CFR 73.403 - Digital audio broadcasting service requirements.

    Code of Federal Regulations, 2012 CFR

    2012-10-01

    ...false Digital audio broadcasting service requirements...Telecommunication FEDERAL COMMUNICATIONS COMMISSION (CONTINUED...SERVICES Digital Audio Broadcasting § 73.403 Digital audio broadcasting service...

  13. 47 CFR 73.403 - Digital audio broadcasting service requirements.

    Code of Federal Regulations, 2013 CFR

    2013-10-01

    ...false Digital audio broadcasting service requirements...Telecommunication FEDERAL COMMUNICATIONS COMMISSION (CONTINUED...SERVICES Digital Audio Broadcasting § 73.403 Digital audio broadcasting service...

  14. The Audio-Tutorial System

    ERIC Educational Resources Information Center

    Postlethwait, S. N.

    1970-01-01

    Describes the audio-tutorial program in Botany at Purdue University. Advantages include adaptability to individual student needs, integration of laboratory activities and information giving, and flexibility in use of media and means of presentation. (EB)

  15. Plasmon-Assisted Audio Recording

    E-print Network

    Chen, Hao

    We present the first demonstration of the recording of optically encoded audio onto a plasmonic nanostructure. Analogous to the ‘‘optical sound’’ approach used in the early twentieth century to store sound on photographic ...

  16. AUDIO SOURCE SEPARATION USING SPARSITY

    Microsoft Academic Search

    A. Aïssa-el-bey; K. Abed-meraim; Y. Grenier

    In this paper, we are interested in blind source separation from instantaneous mixtures of audio signals. Using the sparsity property of audio signals, we propose an iterative method that relies on a relative gradient technique which minimizes a contrast function based on the ℓp norm. This norm is considered as a good sparsity measure. The simulations show that the proposed method outperforms other

  17. The Timbre Toolbox: extracting audio descriptors from musical signals.

    PubMed

    Peeters, Geoffroy; Giordano, Bruno L; Susini, Patrick; Misdariis, Nicolas; McAdams, Stephen

    2011-11-01

    The analysis of musical signals to extract audio descriptors that can potentially characterize their timbre has been disparate and often too focused on a particular small set of sounds. The Timbre Toolbox provides a comprehensive set of descriptors that can be useful in perceptual research, as well as in music information retrieval and machine-learning approaches to content-based retrieval in large sound databases. Sound events are first analyzed in terms of various input representations (short-term Fourier transform, harmonic sinusoidal components, an auditory model based on the equivalent rectangular bandwidth concept, the energy envelope). A large number of audio descriptors are then derived from each of these representations to capture temporal, spectral, spectrotemporal, and energetic properties of the sound events. Some descriptors are global, providing a single value for the whole sound event, whereas others are time-varying. Robust descriptive statistics are used to characterize the time-varying descriptors. To examine the information redundancy across audio descriptors, correlational analysis followed by hierarchical clustering is performed. This analysis suggests ten classes of relatively independent audio descriptors, showing that the Timbre Toolbox is a multidimensional instrument for the measurement of the acoustical structure of complex sound signals. PMID:22087919

  18. Audio watermarking and partial encryption

    NASA Astrophysics Data System (ADS)

    Steinebach, Martin; Zmudzinski, Sascha; Bolke, Torsten

    2005-03-01

    Today two technologies are applied when protecting audio data in digital rights management (DRM) environments: encryption and digital watermarking. Encryption renders the data unreadable for those not in the possession of a key enabling decryption. This is especially of interest for access control, as usage of the audio data is restricted to those owning a key. Digital watermarking adds additional information into an audio file without influencing quality or file size. This additional information can be used for inserting copyright information or a customer identity into the audio file. The latter method is of special interest for DRM as it is the only protection mechanism enabling tracing illegal usage to a certain customer even after the audio data has escaped the secure DRM environment. Existing methods combine these methods by first embedding the watermark and then encrypting the content. As a more efficient alternative, we introduce a combined watermarking and encryption scheme where both mechanisms are transparent to each other. A watermark is embedded in and detected from an encrypted or unencrypted file. The watermark also does not influence the encryption mechanism. The only requirement for this method is a common key available to both algorithms.

  19. 50 CFR 27.72 - Audio equipment.

    Code of Federal Regulations, 2011 CFR

    2011-10-01

    ...27.72 Audio equipment. The operation or use of audio devices including radios, recording and playback devices, loudspeakers, television sets, public address systems and musical instruments so as to cause unreasonable disturbance to others in the...

  20. 50 CFR 27.72 - Audio equipment.

    Code of Federal Regulations, 2010 CFR

    2010-10-01

    ...27.72 Audio equipment. The operation or use of audio devices including radios, recording and playback devices, loudspeakers, television sets, public address systems and musical instruments so as to cause unreasonable disturbance to others in the...

  1. Digital Audio Application to Short Wave Broadcasting

    NASA Technical Reports Server (NTRS)

    Chen, Edward Y.

    1997-01-01

    Digital audio is becoming prevalent not only in consumer electronics, but also in different broadcasting media. Terrestrial analog audio broadcasting in the AM and FM bands will eventually be replaced by digital systems.

  2. Robust spread-spectrum audio watermarking

    Microsoft Academic Search

    Darko Kirovski; Henrique Malvar

    2001-01-01

    We present several mechanisms that enable effective spread-spectrum audio watermarking systems: prevention against detection desynchronization, cepstrum filtering, and chess watermarks. We have incorporated these techniques into a system capable of reliably detecting a watermark in an audio clip that has been modified using a composition of attacks that degrade the original audio characteristics well beyond the limit of acceptable quality.
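
As background for this record, the basic spread-spectrum idea can be sketched as follows: a key-dependent pseudo-random ±1 sequence is added to the signal at low amplitude, and detection correlates the received signal with the same sequence. This is a minimal sketch of that textbook scheme only. It does not implement the paper's mechanisms (desynchronization prevention, cepstrum filtering, chess watermarks), all names are hypothetical, and the strength `alpha` is exaggerated for illustration rather than tuned for inaudibility.

```python
import math
import random

def chips(key, n):
    """Key-dependent pseudo-random +/-1 sequence (the spreading sequence)."""
    rng = random.Random(key)
    return [1.0 if rng.random() < 0.5 else -1.0 for _ in range(n)]

def embed(signal, key, alpha):
    """Add the spreading sequence, scaled by alpha, to every sample."""
    return [s + alpha * c for s, c in zip(signal, chips(key, len(signal)))]

def correlate(signal, key):
    """Normalized correlation between the signal and the key's sequence."""
    c = chips(key, len(signal))
    return sum(s * ci for s, ci in zip(signal, c)) / len(signal)

def detect(signal, key, alpha):
    """Declare the watermark present if the correlation exceeds alpha / 2."""
    return correlate(signal, key) > alpha / 2

n, alpha = 8192, 0.2
host = [math.sin(2 * math.pi * 440 * t / 8000) for t in range(n)]  # toy host signal
marked = embed(host, key=1234, alpha=alpha)
```

With the correct key the correlation is close to `alpha`; with no watermark or a different key it is close to zero, and that gap is what the detector thresholds.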

  3. 36 CFR 2.12 - Audio disturbances.

    Code of Federal Regulations, 2010 CFR

    2010-07-01

    ...2010-07-01 2010-07-01 false Audio disturbances. 2.12 Section 2.12...PUBLIC USE AND RECREATION § 2.12 Audio disturbances. (a) The following are...motor vehicle, motorized toy, or an audio device, such as a radio,...

  4. 36 CFR 1002.12 - Audio disturbances.

    Code of Federal Regulations, 2010 CFR

    2010-07-01

    ...2010-07-01 2010-07-01 false Audio disturbances. 1002.12 Section 1002...PUBLIC USE AND RECREATION § 1002.12 Audio disturbances. (a) The following are...motor vehicle, motorized toy, or an audio device, such as a radio,...

  5. Robust audio watermarking in the time domain

    Microsoft Academic Search

    Paraskevi Bassia; Ioannis Pitas; Nikos Nikolaidis

    2001-01-01

    The audio watermarking method proposed in this paper offers copyright protection to an audio signal by time domain processing. The strength of audio signal modifications is limited by the necessity to produce an output signal that is perceptually similar to the original one. The watermarking method presented here does not require the use of the original signal for watermark detection.

  6. Instructional Video Content Analysis Using Audio Information

    Microsoft Academic Search

    Ying Li; Chitra Dorai

    2006-01-01

    Automatic media content analysis and understanding for efficient topic searching and browsing are current challenges in the management of e-learning content repositories. This paper presents our current work on analyzing and structuralizing instructional videos using pure audio information. Specifically, an audio classification scheme is first developed to partition the sound-track of an instructional video into homogeneous audio segments where each

  7. Digital audio watermarking in the cepstrum domain

    Microsoft Academic Search

    Sang-Kwang Lee; Yo-Sung Ho

    2000-01-01

    We propose a digital audio watermarking technique in the cepstrum domain. We insert a digital watermark into the cepstral components of the audio signal using a technique analogous to spread spectrum communications, hiding a narrow band signal in a wideband channel. In our method, we use pseudo-random sequences to watermark the audio signal. The watermark is then weighted in the

  8. Kenneth S. Goldstein Audio Recordings

    NSDL National Science Digital Library

    This remarkable collection consists of over 850 audio reels recorded primarily by Dr. Kenneth S. Goldstein. He was a folklorist, record producer, and teacher who happened to also find time to serve as chairman of the department of folklore and folklife at the University of Pennsylvania. These audio tapes include interviews with musicians and storytellers, recitations of folktales from Newfoundland and Labrador, Pennsylvania, and Scotland. First-time visitors might do well to look over the English Language Folktale reels and then move on to perform their own detailed search across the entire archive. Visitors can also elect to receive updates on the collection via their RSS feed.

  9. Unsupervised speech/music classification using one-class support vector machines

    Microsoft Academic Search

    S. Omid Sadjadi; S. M. Ahadi; Oldooz Hazrati

    2007-01-01

    Audio classification is an important issue in current audio processing and content analysis research. Speech/music classification is one of the most interesting branches of audio signal classification. In this paper we present an unsupervised clustering method, based on one-class support vector machines (OCSVM) and inspired by the classical K-means algorithm, which effectively classifies speech/music signals. First, relevant features are extracted

  10. AudioMath: blind children learning mathematics through audio

    Microsoft Academic Search

    J H Sánchez; H E Flores

    Diverse studies using computer applications have been implemented to improve the learning of children with visual disabilities. A growing line of research uses audio-based interactive interfaces to enhance learning and cognition in these children. The development of short-term memory and mathematics learning through virtual environments has not been emphasized in these studies. This work presents the design, development, and usability

  11. Development of Learning Modules for Machine Shop Occupations. Final Report.

    ERIC Educational Resources Information Center

    Kent, Randall

    This final report contains an eight-page narrative and materials/products of a program to produce (the final) sixty-eight individualized machine shop skill tasks modules (and fifty-two master audio tapes for students with serious reading disabilities). The narrative also describes the determination of the vital few skills used by machine tool…

  12. Cluster: Metals. Course: Machine Shop. Research Project.

    ERIC Educational Resources Information Center

    Sanford - Lee County Schools, NC.

    The set of 13 units is designed for use with an instructor in actual machine shop practice and is also keyed to audio visual and textual materials. Each unit contains a series of task packages which: specify prerequisites within the series (minimum is Unit 1); provide a narrative rationale for learning; list both general and specific objectives in…

  13. Scan for Author Audio Interview

    E-print Network

    VIEWPOINT Scan for Author Audio Interview Primary Health Care in Low-Income Countries Building.9 billion in 2010.2,3 With approxi- mately 1 billion low-income recipients for the preponder- ance in low-income settings. As one key example, as of 2000, the first-line treatment for malaria was still

  14. Haptic design for digital audio

    Microsoft Academic Search

    Lonny L. Chu

    2002-01-01

    This paper describes work on a system that incorporates haptic feedback into digital sound editing software. Currently, the most common methods for navigating through digital audio are with computer keyboards and mice. Occasionally, peripheral consoles may be used that provide passive knobs. In each of these cases, the user relies on visual and aural feedback, without any haptic feedback related

  15. Single Channel Audio Source Separation

    Microsoft Academic Search

    W. L. WOO; S. S. DLAY

    2008-01-01

    Blind source separation is an advanced statistical tool that has found widespread use in many signal processing applications. However, the crux topic based on one channel audio source separation has not fully developed to enable its way to laboratory implementation. The main idea approach to single channel blind source separation is based on exploiting the inherent time structure of sources

  16. Audio Engineering Society Convention Paper

    E-print Network

    Wichmann, Felix

    of the Audio Engineering Society. Comparison of frequency-warped representations for source separation-frequency representations for the pur- pose of sound source separation from stereo mixtures. Such transformations enhance on the localization and detection of the sources, as well as on the quality of the separated signals. Specifically, we

  17. Audio steganography using bit modification

    Microsoft Academic Search

    K. Gopalan

    2003-01-01

    A method of embedding a covert audio message in a cover utterance for secure communication is presented. The covert message is represented in a compressed form with possibly encryption and/or encoding for added security. One bit in each of the samples of a given cover utterance is altered in accordance with the data bits and a key. The same key
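
The abstract does not say which bit is modified or how the key enters, so the following is only a sketch under two stated assumptions: the least significant bit of each integer PCM sample carries one message bit, and the key seeds a pseudo-random keystream that is XORed with the message bits. All names are hypothetical and this is not presented as the author's algorithm.

```python
import random

def keystream(key, n):
    """Key-dependent pseudo-random bit sequence (assumed role of the key)."""
    rng = random.Random(key)
    return [rng.getrandbits(1) for _ in range(n)]

def embed_bits(samples, bits, key):
    """Write (bit XOR keystream) into the least significant bit of each sample."""
    if len(bits) > len(samples):
        raise ValueError("cover is too short for the message")
    ks = keystream(key, len(bits))
    out = list(samples)
    for i, (b, k) in enumerate(zip(bits, ks)):
        out[i] = (out[i] & ~1) | (b ^ k)
    return out

def extract_bits(samples, n_bits, key):
    """Read the least significant bits back and undo the keystream."""
    ks = keystream(key, n_bits)
    return [(samples[i] & 1) ^ ks[i] for i in range(n_bits)]

cover = [1000, -2001, 302, 45, -7, 12000, 8, -9]  # illustrative PCM samples
message = [1, 0, 1, 1, 0, 0, 1, 0]
stego = embed_bits(cover, message, key=42)
recovered = extract_bits(stego, len(message), key=42)
```

Each sample changes by at most one quantization step, and extraction returns the original bits only when the same key is used to regenerate the keystream.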

  18. Audio Engineering Society Convention Paper

    E-print Network

    Milios, Evangelos E.

    Roulette Several of the more popular ray-based methods in- clude image sources [1], ray tracing [9 of the Audio Engineering Society. Acoustical Modeling Using a Russian Roulette Strategy Bill Kapralos1 One of the problems with geometric (ray) based acoustical modeling approaches is handling

  19. Audio/ Videoconferencing Packages: High Cost

    ERIC Educational Resources Information Center

    Murillo, Sonia; Rizzuto, Mary; Sawyers, Urel

    2005-01-01

    This report compares two integrated course delivery packages: "Centra 6" and "WebEx". Both applications feature asynchronous and synchronous audio communications for online education and training. They are relatively costly products, and provide useful comparisons with the two less expensive products to be evaluated in the following report #53.…

  20. Audio Engineering Society Convention Paper

    E-print Network

    Ferri, Massimo

    and Saturation model of color space. An experiment was run on both sighted and blind participants in order Render of Colors Adding color information to this system, offers a more sophisticated tool to blind of the Audio Engineering Society. Acoustic Rendering for Color Information Ludovico Ausiello1 , Emanuele

  1. Audio Information Retrieval (AIR) Tools

    Microsoft Academic Search

    George Tzanetakis; Perry Cook

    Most of the work in Music Information Retrieval (MIR) and analysis has been performed using symbolic representations like MIDI. The recent advances in computing power and network connectivity have made large amounts of raw digital audio data available in the form of unstructured monolithic sound files. In this work the focus is on tools that work directly on real world

  2. Audio-Visual Materials Catalog.

    ERIC Educational Resources Information Center

    Anderson (M.D.) Hospital and Tumor Inst., Houston, TX.

    This catalog lists 27 audiovisual programs produced by the Department of Medical Communications of the University of Texas M. D. Anderson Hospital and Tumor Institute for public distribution. Video tapes, 16 mm. motion pictures and slide/audio series are presented dealing mostly with cancer and related subjects. The programs are intended for…

  3. Simple Machines

    NSDL National Science Digital Library

    Mr. Oldroyd

    2007-09-26

    Online Simple Machines Assignment OBJECTIVES: Students will be able to name and describe all seven simple machines. Students will be able to identify simple machines that they use everyday. Example: Clock = Gear INSTRUCTIONS: 1. Click on the Simple Machines Glossary page and familiarize yourself with the seven simple machines. Simple Machines Glossary Page 2. Students are to click on ...

  4. The Art and Science of Audio Book Production.

    ERIC Educational Resources Information Center

    Library of Congress, Washington, DC. National Library Service for the Blind and Physically Handicapped.

    The diverse skills and technologies necessary to achieve high levels of artistic and technical quality in audio book production are called audio book art and science. This document explains the principles of audio book art and science and is divided into three sections: "Communication Art in Audio Book Production," "Science in Audio Book…

  5. Structured Models for Semantic Analysis of Audio Content

    E-print Network

    Eskenazi, Maxine

    Structured Models for Semantic Analysis of Audio Content Sourish Chaudhuri CMU-LTI-13-005 Language structure for universal audio analysis have been relatively limited. Prior work in analysis of audio content Chaudhuri Keywords: audio content analysis, audio semantic analysis, acoustic unit descriptors

  6. Collusion-resistant audio fingerprinting system in the modulated complex lapped transform domain.

    PubMed

    Garcia-Hernandez, Jose Juan; Feregrino-Uribe, Claudia; Cumplido, Rene

    2013-01-01

    The collusion-resistant fingerprinting paradigm seems to be a practical solution to the piracy problem as it allows media owners to detect any unauthorized copy and trace it back to the dishonest users. Despite the billion-dollar losses in the music industry, most of the collusion-resistant fingerprinting systems are devoted to digital images and very few to audio signals. In this paper, state-of-the-art collusion-resistant fingerprinting ideas are extended to audio signals and the corresponding parameters and operation conditions are proposed. Moreover, in order to carry out fingerprint detection using just a fraction of the pirate audio clip, block-based embedding and its corresponding detector are proposed. Extensive simulations show the robustness of the proposed system against average collusion attack. Moreover, by using an efficient Fast Fourier Transform core and standard computers it is shown that the proposed system is suitable for real-world scenarios. PMID:23762455

  7. Collusion-Resistant Audio Fingerprinting System in the Modulated Complex Lapped Transform Domain

    PubMed Central

    Garcia-Hernandez, Jose Juan; Feregrino-Uribe, Claudia; Cumplido, Rene

    2013-01-01

    The collusion-resistant fingerprinting paradigm seems to be a practical solution to the piracy problem as it allows media owners to detect any unauthorized copy and trace it back to the dishonest users. Despite the billion-dollar losses in the music industry, most of the collusion-resistant fingerprinting systems are devoted to digital images and very few to audio signals. In this paper, state-of-the-art collusion-resistant fingerprinting ideas are extended to audio signals and the corresponding parameters and operation conditions are proposed. Moreover, in order to carry out fingerprint detection using just a fraction of the pirate audio clip, block-based embedding and its corresponding detector are proposed. Extensive simulations show the robustness of the proposed system against average collusion attack. Moreover, by using an efficient Fast Fourier Transform core and standard computers it is shown that the proposed system is suitable for real-world scenarios. PMID:23762455

  8. Audio indexing using speaker identification

    NASA Astrophysics Data System (ADS)

    Wilcox, Lynn D.; Kimber, Don; Chen, Francine R.

    1994-10-01

    In this paper, a technique for audio indexing based on speaker identification is proposed. When speakers are known a priori, a speaker index can be created in real time using the Viterbi algorithm to segment the audio into intervals from a single talker. Segmentation is performed using a hidden Markov model network consisting of interconnected speaker sub-networks. Speaker training data is used to initialize sub-networks for each speaker. Sub-networks can also be used to model silence, or non-speech sounds such as musical themes. When no prior knowledge of the speakers is available, unsupervised segmentation is performed using a non-real-time iterative algorithm. The speaker sub-networks are first initialized, and segmentation is performed by iteratively generating a segmentation using the Viterbi algorithm, and retraining the sub-networks based on the results of the segmentation. Since the accuracy of the speaker segmentation depends on how well the speaker sub-networks are initialized, agglomerative clustering is used to approximately segment the audio according to speaker for initialization of the speaker sub-networks. The distance measure for the agglomerative clustering is a likelihood ratio in which speech segments are characterized by Gaussian distributions. The distance between merged segments is recomputed at each stage of the clustering, and a duration model is used to bias the likelihood ratio. Segmentation accuracy using agglomerative clustering initialization matches accuracy using initialization with speaker labeled data.
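
    The Viterbi segmentation step described in this abstract can be illustrated with a minimal sketch. It is not the authors' system: it assumes one HMM state per speaker, a single one-dimensional Gaussian per state, and a fixed self-transition probability. The function names, the `stay_prob` parameter, and the toy data are all hypothetical.

```python
import math


def gaussian_logpdf(x, mean, var):
    """Log density of a 1-D Gaussian; stands in for a speaker sub-network."""
    return -0.5 * (math.log(2 * math.pi * var) + (x - mean) ** 2 / var)


def viterbi_segment(frames, speakers, stay_prob=0.9):
    """Label each frame with the most likely speaker (assumed toy model).

    frames:   non-empty list of scalar features, one per audio frame
    speakers: dict mapping speaker name -> (mean, variance)
    """
    names = list(speakers)
    n = len(names)
    log_stay = math.log(stay_prob)
    log_switch = math.log((1 - stay_prob) / (n - 1)) if n > 1 else float("-inf")

    # Initial scores: uniform prior over speakers plus first-frame likelihood.
    score = [gaussian_logpdf(frames[0], *speakers[s]) - math.log(n) for s in names]
    back = []
    for x in frames[1:]:
        new_score, pointers = [], []
        for j, s in enumerate(names):
            # Best predecessor state for ending in speaker j at this frame.
            best_i = max(
                range(n),
                key=lambda i: score[i] + (log_stay if i == j else log_switch),
            )
            trans = log_stay if best_i == j else log_switch
            new_score.append(score[best_i] + trans + gaussian_logpdf(x, *speakers[s]))
            pointers.append(best_i)
        score = new_score
        back.append(pointers)

    # Trace the best path backwards from the best final state.
    state = max(range(n), key=lambda i: score[i])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return [names[i] for i in path]
```

    Runs of identical labels in the returned list correspond to single-talker intervals. In the unsupervised case the abstract describes, the Gaussian parameters would be re-estimated from each new segmentation and the decoding repeated.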

  9. Aeronautical audio broadcasting via satellite

    NASA Technical Reports Server (NTRS)

    Tzeng, Forrest F.

    1993-01-01

    A system design for aeronautical audio broadcasting, with C-band uplink and L-band downlink, via Inmarsat space segments is presented. Near-transparent-quality compression of 5-kHz bandwidth audio at 20.5 kbit/s is achieved based on a hybrid technique employing linear predictive modeling and transform-domain residual quantization. Concatenated Reed-Solomon/convolutional codes with quadrature phase shift keying are selected for bandwidth and power efficiency. An RF bandwidth of 25 kHz per channel and a decoded bit error rate of 10^-6 at an E_b/N_0 of 3.75 dB are obtained. An interleaver, scrambler, modem synchronization, and frame format were designed, and frequency-division multiple access was selected over code-division multiple access. A link budget computation based on a worst-case scenario indicates sufficient system power margins. Transponder occupancy analysis for 72 audio channels demonstrates ample remaining capacity to accommodate emerging aeronautical services.

  10. Simple Machines

    NSDL National Science Digital Library

    AWOL

    2006-11-15

    This activity is designed to help you learn about simple machines and to have fun doing so! First, use this website to learn background information on the basics of simple machines. Try the quiz! Simple Machines Learning Site Next, play a game that tests your ability to identify simple machines.... Edheads: Simple Machines Finally, view this video to see how students your age applied simple machines to do a cool task... Building Simple Machines: A Glass of Milk, Please ...

  11. Working with HTML5 Audio and Video

    Microsoft Academic Search

    Peter Lubbers; Brian Albers; Frank Salim

    In this chapter, we’ll explore what you can do with two important HTML5 elements, audio and video, and we’ll show you how they can be used to create compelling applications. The audio and video elements add new media options to HTML5 applications that allow you to use audio and video without plugins while providing a common, integrated, and scriptable API.

  12. 47 CFR 73.403 - Digital audio broadcasting service requirements.

    Code of Federal Regulations, 2010 CFR

    2010-10-01

    ...2010-10-01 2010-10-01 false Digital audio broadcasting service requirements. 73.403...SERVICES RADIO BROADCAST SERVICES Digital Audio Broadcasting § 73.403 Digital audio broadcasting service requirements....

  13. Audio issues in MIR evaluation: Overview of audio formats

    E-print Network

    Reiss, Josh

    system capable of playing WAV AES31 1. physical data transport - How files move between systems via removable media or networks 2. audio file format - Broadcast Wave format 3. simple project structure - Audio Decision List (ADL) 4. object-oriented project structure Audio Codecs · MPEG-2 · MP3 · MPEG-4 AAC

  14. The Silicon Audio an audio-data compression and storage system with a semiconductor memory card

    Microsoft Academic Search

    A. Sugiyama; M. Iwadare; N. Ohdate; T. Manabe; H. Takano; O. Kitabatake; E. Hirao

    1995-01-01

    A new audio-data compression and storage system, the Silicon Audio, is presented. It employs the MPEG/Audio Layer II algorithm for data compression, which has been standardized by ISO (International Organization for Standardization). A semiconductor memory card is used to store the compressed signal. Decoding is carried out by a general purpose digital signal processor and a specially designed gate array

  15. Three-Dimensional Audio Client Library

    NASA Technical Reports Server (NTRS)

    Rizzi, Stephen A.

    2005-01-01

    The Three-Dimensional Audio Client Library (3DAudio library) is a group of software routines written to facilitate development of both stand-alone (audio only) and immersive virtual-reality application programs that utilize three-dimensional audio displays. The library is intended to enable the development of three-dimensional audio client application programs by use of a code base common to multiple audio server computers. The 3DAudio library calls vendor-specific audio client libraries and currently supports the AuSIM Gold-Server and Lake Huron audio servers. 3DAudio library routines contain common functions for (1) initiation and termination of a client/audio server session, (2) configuration-file input, (3) positioning functions, (4) coordinate transformations, (5) audio transport functions, (6) rendering functions, (7) debugging functions, and (8) event-list-sequencing functions. The 3DAudio software is written in the C++ programming language and currently operates under the Linux, IRIX, and Windows operating systems.

  16. Environmental Sound Classification using Hybrid SVM/KNN Classifier and MPEG7 Audio Low-Level Descriptor

    Microsoft Academic Search

    Jia-ching Wang; Jhing-fa Wang; Kuok Wai He; Cheng-shu Hsu

    2006-01-01

    In this paper, we present a new environmental sound classification architecture. The proposed sound classifier operates at the frame level and fuses the support vector machine (SVM) and the k nearest neighbor rule (KNN). In feature selection, three MPEG-7 audio low-level descriptors, spectrum centroid, spectrum spread, and spectrum flatness are used as the sound features to exploit their ability in
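
    The three descriptors named in this abstract can be sketched for a single frame as follows. This is an illustration only: it uses the common linear-frequency textbook definitions of centroid, spread, and flatness, which is an assumption and not the exact MPEG-7 formulation (the standard works on a log-frequency scale and in sub-bands). The function names are hypothetical, and the SVM/KNN fusion itself is not shown.

```python
import cmath
import math


def power_spectrum(frame):
    """Power of the non-negative-frequency DFT bins (naive O(n^2) DFT)."""
    n = len(frame)
    spectrum = []
    for k in range(n // 2 + 1):
        acc = sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        spectrum.append(abs(acc) ** 2)
    return spectrum


def spectral_descriptors(frame, sample_rate):
    """Return (centroid_hz, spread_hz, flatness) for one frame."""
    power = power_spectrum(frame)
    total = sum(power)
    if total == 0:
        # Silent frame: the descriptors are undefined, so report zeros.
        return 0.0, 0.0, 0.0
    n = len(frame)
    freqs = [k * sample_rate / n for k in range(len(power))]

    # Centroid: power-weighted mean frequency.
    centroid = sum(f * p for f, p in zip(freqs, power)) / total
    # Spread: power-weighted standard deviation around the centroid.
    spread = math.sqrt(
        sum((f - centroid) ** 2 * p for f, p in zip(freqs, power)) / total
    )
    # Flatness: geometric mean over arithmetic mean of the power spectrum.
    eps = 1e-12  # avoids log(0) on empty bins
    geometric = math.exp(sum(math.log(p + eps) for p in power) / len(power))
    flatness = geometric / (total / len(power) + eps)
    return centroid, spread, flatness
```

    A pure tone yields a flatness near 0 and a noise-like frame yields a value closer to 1, which is one reason such features can help separate tonal from noisy environmental sounds.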

  17. Partially Supervised Speaker Clustering (IEEE Transactions on Pattern Analysis and Machine Intelligence)

    E-print Network

    Hasegawa-Johnson, Mark

    IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, VOL. X, NO. X, XXX 20XX 1 Partially, audio, video, text, etc.), one significant goal being to associate the identity of the content, audio, video, text, etc.) so that the identity of the content (face, voice, keywords, etc.) can

  18. Let's Hear It for Audio Mining

    NSDL National Science Digital Library

    Leavitt, Neal

    A detailed introduction is offered on the Leavitt Communications Web site. This article delves into how audio mining works by giving a basic technical understanding of the process. Approaches to audio mining are discussed, as well as how the technology works, performance, languages, and the challenges faced by designers.

  19. Improving Audio Quality in Distance Learning Applications.

    ERIC Educational Resources Information Center

    Richardson, Craig H.

    This paper discusses common causes of problems encountered with audio systems in distance learning networks and offers practical suggestions for correcting the problems. Problems and discussions are divided into nine categories: (1) acoustics, including reverberant classrooms leading to distorted or garbled voices, as well as one-dimensional audio

  20. Audio Indexing of Arabic broadcast news

    Microsoft Academic Search

    J. Billa; M. Noamany; A. Srivastava; D. Liu; R. Stone; J. Xu; J. Makhoul; F. Kubala

    2002-01-01

    This paper describes the development of the BBN Audio Indexing System for broadcast news in Arabic. Key issues addressed in this work revolve around the three major components of the audio indexing system: automatic speech recognition, speaker identification, and named entity identification. The system deals with several challenges introduced by the Arabic language, including the absence of short vowels in

  1. Dual Audio TV Instruction: A Broadcast Experiment.

    ERIC Educational Resources Information Center

    Borton, Terry; And Others

    An experiment assessed the potential effectiveness of "dual audio television instruction" (DATI) as a mass education medium. The DATI consisted of a radio program heard by children while they watched television shows. The audio instructor did not talk when the television characters spoke, but used the "quiet" times to help with reading, define…

  2. MOSIEVIUS: FEATURE DRIVEN INTERACTIVE AUDIO MOSAICING

    Microsoft Academic Search

    Ari Lazier; Perry Cook

    2003-01-01

    The process of creating an audio mosaic consists of the concatenation of segments of sound. Segments are chosen to correspond best with a description of a target sound specified by the desired features of the final mosaic. Current audio mosaicing techniques take advantage of the description of future target units in order to make more intelligent

  3. Embedded system for audio source separation

    Microsoft Academic Search

    Laurentiu Frangu; M. Măzărel; C. Chiculiță; S. Epure

    2010-01-01

    The paper presents an embedded system, designed for audio signal processing. It consists of a microphone array, the signal conditioning circuits, the AD converter, 2 DSP modules and the communication circuits, placed on a single board (excepting the microphone array). Up to 8 microphones receive the audio signals, which are recorded and processed, in order to provide information about the

  4. 3D Audio and Acoustic Environment Modeling

    Microsoft Academic Search

    William G. Gardner

    1999-01-01

    Abstract: Recently, there has been a proliferation of... This paper describes the 3D audio and acoustic environment modeling technology developed by Wave Arts, Inc. Wave Arts technology is the result of extensive research and development by Bill Gardner, a graduate of the MIT Media Lab. Dr. Gardner's research at the Media Lab focussed on the key technologies of 3D audio:

  5. Audio Engineering Society Convention Paper 8892

    E-print Network

    Reiss, Josh

    -varying equalization, real-time, spectral distribution, automatic mixing, and audio production. Ma et al. reiss@eecs.qmul.ac.uk dawn.black@eecs.qmul.ac.uk ABSTRACT A new approach for automatically equalizing an audio signal towards recordings. A real-time C++ VST plug-in and an off-line Matlab implementation have been created

  6. Mixed Type Audio Classification using Sinusoidal Parameters

    Microsoft Academic Search

    P. M. B. Mahale; A. Sayadiyan; K. Faez

    2008-01-01

    A preprocessing stage is unavoidable in every audio application, including music/speech separation, speech or speaker recognition, and audio transcription, to determine which class each frame belongs to: speech only, music only, or mixture. Such classification can significantly lower the computational complexity due to factorial search commonly used in many model-based systems including monaural separation systems as well

  7. Enhancing Manual Scan Registration Using Audio Cues

    NASA Astrophysics Data System (ADS)

    Ntsoko, T.; Sithole, G.

    2014-04-01

    Indoor mapping and modelling requires that acquired data be processed by editing, fusing, formatting the data, amongst other operations. Currently the manual interaction the user has with the point cloud (data) while processing it is visual. Visual interaction does have limitations, however. One way of dealing with these limitations is to augment audio in point cloud processing. Audio augmentation entails associating points of interest in the point cloud with audio objects. In coarse scan registration, reverberation, intensity and frequency audio cues were exploited to help the user estimate depth and occupancy of space of points of interest. Depth estimations were made reliably well when intensity and frequency were both used as depth cues. Coarse changes of depth could be estimated in this manner. The depth between surfaces can therefore be estimated with the aid of the audio objects. Sound reflections of an audio object provided reliable information of the object surroundings in some instances. For a point/area of interest in the point cloud, these reflections can be used to determine the unseen events around that point/area of interest. Other processing techniques could benefit from this while other information is estimated using other audio cues like binaural cues and Head Related Transfer Functions. These other cues could be used in position estimations of audio objects to aid in problems such as indoor navigation problems.

  8. Digital Audio Radio Field Tests

    NASA Technical Reports Server (NTRS)

    Hollansworth, James E.

    1997-01-01

    Radio history continues to be made at the NASA Lewis Research Center with the beginning of phase two of Digital Audio Radio testing conducted by the Consumer Electronic Manufacturers Association (a sector of the Electronic Industries Association and the National Radio Systems Committee) and cosponsored by the Electronic Industries Association and the National Association of Broadcasters. The bulk of the field testing of the four systems should be complete by the end of October 1996, with results available soon thereafter. Lewis hosted phase one of the testing process, which included laboratory testing of seven proposed digital audio radio systems and modes (see the following table). Two of the proposed systems operate in two modes, thus making a total of nine systems for testing. These nine systems are divided into the following types of transmission: in-band on channel (IBOC), in-band adjacent channel (IBAC), and new bands - the L-band (1452 to 1492 MHz) and the S-band (2310 to 2360 MHz).

  9. Using audio fingerprinting for duplicate detection and thumbnail generation

    Microsoft Academic Search

    Christopher J. C. Burges; Dan Plastina; John C. Platt; Erin Renshaw; Henrique S. Malvar

    2005-01-01

    Audio fingerprinting is a powerful tool for identifying file-based or streaming audio, using a database of fingerprints. The paper presents two new applications of audio fingerprinting: duplicate detection, whose goal is to identify duplicate audio clips in a set, even if they differ in compression quality or duration, and thumbnail generation, which aims to provide a representative short clip of

  10. EMD and psychoacoustic model based watermarking for audio

    Microsoft Academic Search

    Liang Wang; Sabu Emmanuel; Mohan S. Kankanhalli

    2010-01-01

    The audio watermarking method proposed in this paper offers copyright protection for audio without requiring the original signal for watermark detection. The analysis filterbank decomposition, the psychoacoustic model and the empirical mode decomposition (EMD) are the three key techniques used in the novel audio watermarking method. Unlike the traditional audio watermarking algorithms where the watermark bits

  11. A Scalable System for 3D Audio Ray Tracing

    Microsoft Academic Search

    Wolfgang Mueller; Frank Ullmann

    1999-01-01

    Abstract: Though several approaches in sound processing are denoted as 3D audio, very few of them generate high-quality 3D audio information which allows listeners to exactly locate sound sources in three-dimensional space. We present an approach to enhance sound by high-quality 3D audio information through acoustic ray tracing where 3D audio is offline processed with digital

  12. Hierarchical classification of audio data for archiving and retrieving

    Microsoft Academic Search

    Tong Zhang; C.-C. Jay Kuo

    1999-01-01

    A hierarchical system for audio classification and retrieval based on audio content analysis is presented in this paper. The system consists of three stages. The first stage is called the coarse-level audio classification and segmentation, where audio recordings are classified and segmented into speech, music, several types of environmental sounds, and silence, based on morphological and statistical analysis of temporal

  13. Audio signal representations for indexing in the transform domain

    E-print Network

    Richard, Gaël

    and music is now widely stored and diffused in digital form. This revolution is mainly due to the spread as the state-of-the-art standard for (near-)transparent audio coding. More recently, the digital revolution - Audio coding - I. INTRODUCTION Digital audio has progressively replaced analog audio since the 80s

  14. Machine Learning

    Microsoft Academic Search

    Xin Yao; Yong Liu

    Machine learning is a very active sub-field of artificial intelligence concerned with the development of computational models of learning. Machine learning is inspired by the work in several disciplines: cognitive sciences, computer science, statistics, computational complexity, information theory, control theory, philosophy, and biology. Simply speaking, machine learning is learning by machine. From a computational point of view, machine learning refers

  15. High-Fidelity Piezoelectric Audio Device

    NASA Technical Reports Server (NTRS)

    Woodward, Stanley E.; Fox, Robert L.; Bryant, Robert G.

    2003-01-01

    ModalMax is a very innovative means of harnessing the vibration of a piezoelectric actuator to produce an energy efficient low-profile device with high-bandwidth high-fidelity audio response. The piezoelectric audio device outperforms many commercially available speakers made using speaker cones. The piezoelectric device weighs substantially less (4 g) than the speaker cones which use magnets (10 g). ModalMax devices have extreme fabrication simplicity. The entire audio device is fabricated by lamination. The simplicity of the design lends itself to lower cost. The piezoelectric audio device can be used without its acoustic chambers and thereby resulting in a very low thickness of 0.023 in. (0.58 mm). The piezoelectric audio device can be completely encapsulated, which makes it very attractive for use in wet environments. Encapsulation does not significantly alter the audio response. Its small size (see Figure 1) is applicable to many consumer electronic products, such as pagers, portable radios, headphones, laptop computers, computer monitors, toys, and electronic games. The audio device can also be used in automobile or aircraft sound systems.

  16. How to Make an Audio Tape Bow

    NSDL National Science Digital Library

    Science Museum of Minnesota

    2012-06-26

    From this How To slide show, you create an Audio Tape Bow that can play distorted audio sounds by running it across a tape head. Learners will open up cassette tapes and used tape players to see how they work. Then, they will dismantle some of the parts in order to create and design a new instrument. The How To includes a video of the sound the Audio Tape Bow makes when run across a tape head. This activity is great for exploring the way sounds are recorded and the technology used to play them back.

  17. Sparsity and Synchrony in Blind Audio-Visual Source Separation

    Microsoft Academic Search

    Anna Llagostera Casanovas; Gianluca Monaci; Pierre Vandergheynst; Remi Gribonval

    Abstract: When dealing with an audio-visual scene, humans tend to integrate both modalities by assessing the temporal synchrony between relevant audio and video events. In this way we are able to detect and separate audio-visual sources. Inspired by these observations, we propose a novel method to decompose sequences made of a video signal and a one-microphone audio track into audio-visual sources. First,

  18. REISS ET AL. COMPRESSION FOR SUPER AUDIO CD Audio Engineering Society

    E-print Network

    Reiss, Josh

    a DSD encoded signal on other media, such as hard disk drives. For 6 channel DSD, this would require 12, Philips and Sony have devised and implemented a new audio storage format known as Super Audio Compact Disc compact discs use 16 bit PCM encoding at 44.1kHz, DSD uses 1-bit sampling of audio at 64x44.1kHz. Thus
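
    The raw data rates implied by the figures quoted in this excerpt (16-bit PCM at 44.1 kHz for compact discs, 1-bit DSD at 64 x 44.1 kHz, six channels) can be worked out with a short calculation. This is only arithmetic on those quoted figures: the function names are hypothetical, the stereo CD case is an added assumption for comparison, and the rates are uncompressed, before any of the compression the paper discusses.

```python
CD_SAMPLE_RATE_HZ = 44_100  # compact disc sampling rate quoted in the excerpt


def pcm_bit_rate(bits_per_sample, sample_rate_hz, channels):
    """Uncompressed PCM bit rate in bit/s."""
    return bits_per_sample * sample_rate_hz * channels


def dsd_bit_rate(channels, oversampling=64):
    """Uncompressed DSD bit rate in bit/s: 1 bit per sample at 64 x 44.1 kHz."""
    return 1 * oversampling * CD_SAMPLE_RATE_HZ * channels


cd_stereo = pcm_bit_rate(16, CD_SAMPLE_RATE_HZ, 2)  # 1,411,200 bit/s
dsd_mono = dsd_bit_rate(1)                          # 2,822,400 bit/s
dsd_six = dsd_bit_rate(6)                           # 16,934,400 bit/s
```

    One DSD channel therefore carries four times the raw bit rate of one 16-bit, 44.1 kHz PCM channel (705,600 bit/s), which gives a sense of why compression matters for multichannel DSD storage.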

  19. History Channel: Audio and Video

    NSDL National Science Digital Library

    It's perhaps a bit of a stretch of the imagination to think of a place that would include both a clip of Spiro Agnew speaking out on what he perceived to be the biases of television news coverage and some archival footage of Depression-era gangsters, but it's all right here on the History Channel's Audio and Video online archive. The speech archive is quite nice, and may prove to be both edifying and entertaining. Visitors can browse the speech archive by topics (such as War & Diplomacy) or alphabetically. Some of the clips offered here include comments by the scientist Wernher von Braun after hearing that the U.S.S.R. had landed a spacecraft on the moon. The video clip section is also quite well-developed, as it contains clips of the trial of Adolf Eichmann and the breaking of the sound barrier.

  20. Simple Machines

    NSDL National Science Digital Library

    At this website, EdHeads, a nonprofit, offers five interactive, animated modules to educate second- through sixth-graders about simple machines. By identifying the many machines located throughout a house, students can learn about fulcrums, wheels and axles, levers, pulleys, inclined planes, and much more. The website is equipped with simple animations to help children understand how the machines work. After students have a handle on simple machines, they can begin to see how they work together to create compound machines. The website also provides a brief glossary summarizing nine types of simple machines. This site is also reviewed in the February 18, 2005 NSDL Physical Sciences Report.

  1. CERN automatic audio-conference service

    E-print Network

    Sierra Moral, R

    2009-01-01

    Scientists from all over the world need to collaborate with CERN on a daily basis. They must be able to communicate effectively on their joint projects at any time; as a result telephone conferences have become indispensable and widely used. Managed by 6 operators, CERN already has more than 20000 hours and 5700 audio-conferences per year. However, the traditional telephone based audio-conference system needed to be modernized in three ways. Firstly, to provide the participants with more autonomy in the organization of their conferences; secondly, to eliminate the constraints of manual intervention by operators; and thirdly, to integrate the audio-conferences into a collaborative working framework. The large number, and hence cost, of the conferences prohibited externalization and so the CERN telecommunications team drew up a specification to implement a new system. It was decided to use a new commercial collaborative audio-conference solution based on the SIP protocol. The system was tested as the first Euro...

  2. Onset Detection in Musical Audio Signals

    Microsoft Academic Search

    Stephen Hainsworth; Malcolm Macleod

    2003-01-01

    This paper presents work on changepoint detection in musical audio signals, focusing on the case where there are note changes with low associated energy variation. Several methods are described and results of the best are presented.

  3. Time-frequency Masking EECS 352: Machine Perception of

    E-print Network

    Pardo, Bryan

    Time-frequency Masking EECS 352: Machine Perception of Music & Audio Zafar Rafii, Winter 2014. The Short-Time Fourier Transform (STFT) is a succession of local Fourier Transforms (FT): STFT = real spectrogram (time, frequency) + j * imaginary spectrogram (time, frequency)
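
    The idea of an STFT as a succession of local Fourier transforms, and of a mask applied to its time-frequency bins, can be sketched in a few lines. This is a teaching sketch and not taken from the course slides: the Hann window, the naive DFT, the binary form of the mask, and the function names are all assumptions.

```python
import cmath
import math


def stft(signal, frame_len, hop):
    """Complex STFT: one windowed DFT per frame, non-negative bins only."""
    # Periodic Hann window (an assumed choice of analysis window).
    window = [0.5 - 0.5 * math.cos(2 * math.pi * t / frame_len) for t in range(frame_len)]
    frames = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        seg = [signal[start + t] * window[t] for t in range(frame_len)]
        frames.append([
            sum(seg[t] * cmath.exp(-2j * math.pi * k * t / frame_len)
                for t in range(frame_len))
            for k in range(frame_len // 2 + 1)
        ])
    return frames


def binary_mask(spec, keep_bins):
    """Keep the listed frequency bins in every frame and zero all others."""
    return [
        [value if k in keep_bins else 0j for k, value in enumerate(frame)]
        for frame in spec
    ]
```

    Each entry of the result is a complex number whose real and imaginary parts correspond to the two spectrograms the excerpt mentions. A practical separation system would estimate the mask from the mixture and then invert the masked STFT, which this sketch leaves out.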

  4. Audio Gallery: Scientists and Social Responsibility

    NSDL National Science Digital Library

    This online audio gallery is from the Museum's Seminars on Science, a series of distance-learning courses designed to help educators meet the new national science standards. Scientists and Social Responsibility, part of the Frontiers in Physical Science seminar, is available in broadband and modem formats and with a printable PDF transcript. The audio discusses some of the social-responsibility issues that scientists are grappling with today.

  5. Spread-spectrum watermarking of audio signals

    Microsoft Academic Search

    Darko Kirovski; Henrique S. Malvar

    2003-01-01

    Watermarking has become a technology of choice for a broad range of multimedia copyright protection applications. Watermarks have also been used to embed format-independent metadata in audio/video signals in a way that is robust to common editing. In this paper, we present several novel mechanisms for effective encoding and detection of direct-sequence spread-spectrum watermarks in audio signals. The developed techniques
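
    The basic principle of direct-sequence spread-spectrum watermarking, stripped of the paper's own mechanisms, can be sketched as follows. It assumes a one-bit payload, a key-seeded pseudo-random chip sequence of +1/-1 values, embedding directly on raw samples, and blind detection by correlation. The function names, the `strength` parameter, and the choice of domain are illustrative assumptions, not the authors' design.

```python
import random


def chip_sequence(length, key):
    """Key-seeded pseudo-random sequence of +1/-1 chips."""
    rng = random.Random(key)
    return [rng.choice((-1.0, 1.0)) for _ in range(length)]


def embed(host, bit, key, strength=0.1):
    """Add the chip sequence, with its sign set by the payload bit."""
    chips = chip_sequence(len(host), key)
    sign = 1.0 if bit else -1.0
    return [h + strength * sign * c for h, c in zip(host, chips)]


def detect(signal, key):
    """Blind detection: the sign of the correlation with the chips gives the bit."""
    chips = chip_sequence(len(signal), key)
    correlation = sum(s * c for s, c in zip(signal, chips)) / len(signal)
    return correlation > 0
```

    Detection needs only the key, not the original host signal. The host acts as noise in the correlation, so longer chip sequences give a more reliable decision at the same embedding strength.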

  6. Hierarchical system for content-based audio classification and retrieval

    NASA Astrophysics Data System (ADS)

    Zhang, Tong; Kuo, C.-C. Jay

    1998-10-01

    A hierarchical system for audio classification and retrieval based on audio content analysis is presented in this paper. The system consists of three stages. The audio recordings are first classified and segmented into speech, music, several types of environmental sounds, and silence, based on morphological and statistical analysis of temporal curves of the energy function, the average zero-crossing rate, and the fundamental frequency of audio signals. The first stage is called the coarse-level audio classification and segmentation. Then, environmental sounds are classified into finer classes such as applause, rain, birds' sound, etc., which is called the fine-level audio classification. The second stage is based on time-frequency analysis of audio signals and the use of the hidden Markov model (HMM) for classification. In the third stage, the query-by-example audio retrieval is implemented where similar sounds can be found according to the input sample audio. The way of modeling audio features with the hidden Markov model, the procedures of audio classification and retrieval, and the experimental results are described. It is shown that, with the proposed new system, audio recordings can be automatically segmented and classified into basic types in real time with an accuracy higher than 90%. Examples of audio fine classification and audio retrieval with the proposed HMM-based method are also provided.
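
    Two of the temporal curves used in the coarse-level stage, short-time energy and zero-crossing rate, can be computed per frame as in the sketch below. It uses common textbook definitions, which may differ in detail from the authors'. The function names and framing parameters are hypothetical, and fundamental-frequency estimation is not shown.

```python
def short_time_energy(frame):
    """Mean squared amplitude of one frame."""
    return sum(x * x for x in frame) / len(frame)


def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(
        1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0)
    )
    return crossings / (len(frame) - 1)


def frame_features(signal, frame_len, hop):
    """Return one (energy, zero-crossing rate) pair per frame."""
    features = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = signal[start:start + frame_len]
        features.append((short_time_energy(frame), zero_crossing_rate(frame)))
    return features
```

    Plotted over time, these per-frame values form the kind of curves the abstract refers to. Silence, for example, shows up as a run of frames with very low energy.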

  7. Development of multilingual medical reception support system with text-to-speech function to combine utterance data with voice synthesis

    Microsoft Academic Search

    Mai Miyabe; Takashi Yoshino

    2010-01-01

    The need for multilingual communication in Japan has increased. In the medical field, there exists a serious problem when it comes to communications between hospital staff and foreign patients. Currently, medical translators accompany patients to medical care facilities, and the number of requests for medical translators is increasing. However, medical translators cannot provide support at all times, especially in cases

  8. A joint intelligibility evaluation of French text-to-speech synthesis systems: the EvaSy SUS/ACR campaign

    E-print Network

    Boula de Mareüil, Philippe

    /ACR campaign Philippe Boula de Mareüil1 , Christophe d'Alessandro1 , Alexander Raake1 , Gérard Bailly2 , Marie module (Boula de Mareüil et al., 2005), evaluation of prosody (Garcia et al., 2006), evaluation

  9. A joint prosody evaluation of French text-to-speech synthesis systems Marie-Neige Garcia1

    E-print Network

    Boula de Mareüil, Philippe

    , Christophe d'Alessandro2 , Gérard Bailly3 , Philippe Boula de Mareüil2 , Michel Morel4 1 ELDA, 55-57 rue components: evaluation of the grapheme-to-phoneme conversion module (Boula de Mareüil et al., 2005 speech (Boula de Mareüil et al., 2006). One of the aims of the project is to assess the quality

  10. SIMPLE MACHINES

    NSDL National Science Digital Library

    Mrs. MacHose

    2007-03-10

    You will be learning about several types of simple machines. Have fun!! Review the first website (which is right here!! Simple machines). It has information about simple machines. DON'T click until you read all directions!!! Prepare to discuss each type in class. You will need to take some basic notes about each machine, using a bubble-map format. Don't forget ...

  11. Sewing Machines!!

    NSDL National Science Digital Library

    Miss. Walker

    2008-10-20

    Learn the Parts of a Sewing Machine This should help you understand the history of sewing machines and how they work. For this assignment, answer these questions on a sheet of paper and bring it to class. Click on this link to go to a site which will briefly explain the history of the sewing machine: wikipedia Was ...

  12. Simple Machines

    NSDL National Science Digital Library

    This is a lesson about simple machines and how they relate to robots. Learners will gain an understanding of simple machines and how they may be used in our everyday lives. Students will also have an opportunity to design a Rube Goldberg Machine of their own. This is lesson 10 of 16 in the MarsBots learning module.

  13. Electrostatic Machines

    NSDL National Science Digital Library

    De Queiroz, Antonio Carlos M.

    This website from Antonio Carlos M. De Queiroz, an associate professor at the Federal University of Rio de Janeiro, illustrates a number of different electrostatic machines. The site includes details and images of machines built by the professor as well as many other historical machines of this type. Some information is also available in Portuguese.

  14. AUDIO SPECTRUM PROJECTION BASED ON SEVERAL BASIS DECOMPOSITION ALGORITHMS APPLIED TO GENERAL SOUND RECOGNITION AND AUDIO

    E-print Network

    Wichmann, Felix

    AUDIO SPECTRUM PROJECTION BASED ON SEVERAL BASIS DECOMPOSITION ALGORITHMS APPLIED TO GENERAL SOUND- formance of MPEG-7 Audio Spectrum Projection (ASP) features based on basis decomposition vs. Mel-scale Fre method of the MPEG-7 sound recognition framework is based on the projection of a spectrum onto a low

  15. Could Audio-Described Films Benefit from Audio Introductions? An Audience Response Study

    ERIC Educational Resources Information Center

    Romero-Fresco, Pablo; Fryer, Louise

    2013-01-01

    Introduction: Time constraints limit the quantity and type of information conveyed in audio description (AD) for films, in particular the cinematic aspects. Inspired by introductory notes for theatre AD, this study developed audio introductions (AIs) for "Slumdog Millionaire" and "Man on Wire." Each AI comprised 10 minutes of…

  16. Interaction with Machine Improvisation

    NASA Astrophysics Data System (ADS)

    Assayag, Gerard; Bloch, George; Cont, Arshia; Dubnov, Shlomo

    We describe two multi-agent architectures for improvisation-oriented musician-machine interaction systems that learn in real time from human performers. The improvisation kernel is based on sequence modeling and statistical learning. We present two frameworks of interaction with this kernel. In the first, the stylistic interaction is guided by a human operator in front of an interactive computer environment. In the second framework, the stylistic interaction is delegated to machine intelligence, and therefore knowledge propagation and decision are taken care of by the computer alone. The first framework involves a hybrid architecture using two popular composition/performance environments, Max and OpenMusic, that are put to work and communicate together, each one handling the process at a different time/memory scale. The second framework shares the same representational schemes with the first but uses an Active Learning architecture based on collaborative, competitive and memory-based learning to handle stylistic interactions. Both systems are capable of processing real-time audio/video as well as MIDI. After discussing the general cognitive background of improvisation practices, the statistical modelling tools and the concurrent agent architecture are presented. Then, an Active Learning scheme is described and considered in terms of using different improvisation regimes for improvisation planning. Finally, we provide more details about the different system implementations and describe several performances with the system.

  17. Digital Multicasting of Multiple Audio Streams

    NASA Technical Reports Server (NTRS)

    Macha, Mitchell; Bullock, John

    2007-01-01

    The Mission Control Center Voice Over Internet Protocol (MCC VOIP) system (see figure) comprises hardware and software that effect simultaneous, nearly real-time transmission of as many as 14 different audio streams to authorized listeners via the MCC intranet and/or the Internet. The original version of the MCC VOIP system was conceived to enable flight-support personnel located in offices outside a spacecraft mission control center to monitor audio loops within the mission control center. Different versions of the MCC VOIP system could be used for a variety of public and commercial purposes - for example, to enable members of the general public to monitor one or more NASA audio streams through their home computers, to enable air-traffic supervisors to monitor communication between airline pilots and air-traffic controllers in training, and to monitor conferences among brokers in a stock exchange. At the transmitting end, the audio-distribution process begins with feeding the audio signals to analog-to-digital converters. The resulting digital streams are sent through the MCC intranet, using a user datagram protocol (UDP), to a server that converts them to encrypted data packets. The encrypted data packets are then routed to the personal computers of authorized users by use of multicasting techniques. The total data-processing load on the portion of the system upstream of and including the encryption server is the total load imposed by all of the audio streams being encoded, regardless of the number of the listeners or the number of streams being monitored concurrently by the listeners. The personal computer of a user authorized to listen is equipped with special-purpose MCC audio-player software. When the user launches the program, the user is prompted to provide identification and a password. In one of two access-control provisions, the program is hard-coded to validate the user's identity and password against a list maintained on a domain-controller computer at the MCC. In the other access-control provision, the program verifies that the user is authorized to have access to the audio streams. Once both access-control checks are completed, the audio software presents a graphical display that includes audio-stream-selection buttons and volume-control sliders. The user can select all or any subset of the available audio streams and can adjust the volume of each stream independently of that of the other streams. The audio-player program spawns a "read" process for the selected stream(s). The spawned process sends, to the router(s), a "multicast-join" request for the selected streams. The router(s) responds to the request by sending the encrypted multicast packets to the spawned process. The spawned process receives the encrypted multicast packets and sends a decryption packet to audio-driver software. As the volume or muting features are changed by the user, interrupts are sent to the spawned process to change the corresponding attributes sent to the audio-driver software. The total latency of this system - that is, the total time from the origination of the audio signals to generation of sound at a listener's computer - lies between four and six seconds.
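
    On an IP network, the "multicast-join" request in this record corresponds to a UDP socket joining a multicast group. The following is a minimal sketch using Python's standard socket module, assuming IPv4. The group address 239.1.2.3, the port 5004, and both function names are hypothetical and are not taken from the MCC VOIP system; authentication and decryption are omitted.

```python
import socket


def build_membership_request(group, interface="0.0.0.0"):
    """Pack the 8-byte ip_mreq structure: group address + local interface."""
    return socket.inet_aton(group) + socket.inet_aton(interface)


def open_multicast_receiver(group="239.1.2.3", port=5004):
    """Open a UDP socket and ask the network to deliver packets for `group`.

    The group and port defaults are placeholder values for illustration.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind(("", port))
    # Joining the group causes the host to send the membership report
    # that routers use to start forwarding the stream.
    sock.setsockopt(
        socket.IPPROTO_IP,
        socket.IP_ADD_MEMBERSHIP,
        build_membership_request(group),
    )
    return sock
```

    A receiver would then loop on sock.recvfrom(...) and pass each payload to its decryption and playback steps. Because the sender transmits each stream once regardless of how many receivers join, the upstream load does not grow with the number of listeners, which matches the behaviour the abstract describes.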

  18. The Effect of Audio and Visual Aids on Task Performance in Distributed Collaborative Virtual Environments

    NASA Astrophysics Data System (ADS)

    Ullah, Sehat; Richard, Paul; Otman, Samir; Mallem, Malik

    2009-03-01

    Collaborative virtual environments (CVEs) have recently gained the attention of many researchers due to their numerous potential application domains. Cooperative virtual environments, where users simultaneously manipulate objects, are one of the subfields of CVEs. In this paper we present a framework that enables two users to cooperatively manipulate objects in a virtual environment while sitting at two separate machines connected through a local network. In addition, the article presents the use of sensory feedback (audio and visual) and investigates its effects on cooperation and users' performance. Six volunteer subjects had to cooperatively perform a peg-in-hole task. Results revealed that visual and auditory aids increase users' performance. However, the majority of the users preferred visual feedback to audio. We hope this framework will greatly help in the development of CAD systems that allow designers to design collaboratively at a distance. Other potential application domains include cooperative assembly, surgical training, and rehabilitation systems.

  19. REPET for Background/Foreground Separation in Audio

    E-print Network

    Pardo, Bryan

    appears as an exploitable cue for source separation in audio. By identifying and extracting the repeatingChapter 14 REPET for Background/Foreground Separation in Audio Zafar Rafii, Antoine Liutkus Source Separation, 395 Signals and Communication Technology, DOI: 10

  20. Podscanning : audio microcontent and synchronous communication for mobile devices

    E-print Network

    Wheeler, Patrick Sean

    2010-01-01

    Over the past decade, computationally powerful audio communication devices have become commonplace. Mobile devices have high storage capacity for digital audio, and smartphones or networked PDAs can be used to stream ...

  1. Machine Learning

    Microsoft Academic Search

    Crina Grosan; Ajith Abraham

    Machine Learning[6][8][12] is concerned with the study of building computer programs that automatically improve and/or adapt their performance through experience. Machine learning can be thought of as “programming by example” [11]. Machine learning has much in common with other domains such as statistics and probability theory (understanding the phenomena that have generated the data), data mining (finding patterns in the

  2. Music genre classification based on MPEG7 audio features

    Microsoft Academic Search

    Song Li; Haifeng Li; Lin Ma

    2010-01-01

    Music genre is a useful tool which can categorize large music database. MPEG-7 is an international standard for managing audiovisual contents. MPEG-7 audio features are used for audio classification, music instrument classification and speech recognition. In this paper, we present a series of experiments to evaluate the MPEG-7 audio features for music genre classification performance. For the feature extraction, we

  3. AUDIO-VIDEO EVENT RECOGNITION SYSTEM FOR PUBLIC TRANSPORT SECURITY

    E-print Network

    Paris-Sud XI, Université de

    AUDIO-VIDEO EVENT RECOGNITION SYSTEM FOR PUBLIC TRANSPORT SECURITY Van-Thinh Vu Quoc-Cuong Pham vehicle. The system comprises six modules including in particular three novel ones: (i) Face Detection and Tracking, (ii) Audio Event Detection and (iii) Audio-Video Scenario Recognition. The Face Detection

  4. Increasing Access to Learning With Hybrid Audio-Data Collaboration

    Microsoft Academic Search

    Michael W. Freeman; Lawrence W. Grimes; J. Ray Holliday

    2000-01-01

    Internet enabled hybrid audio-data collaboration delivers high quality audio over telephone lines and data interaction over packet switched Internet connections, thus distributing the transmission load between two highly accessible but limited bandwidth media. This paper explores the need for hybrid audio-data collaboration and describes two complementary studies comparing the performance and satisfaction of groups of graduate students taking an introductory

  5. AGGLOMERATIVE CLUSTERING IN SPARSE ATOMIC DECOMPOSITIONS OF AUDIO SIGNALS

    E-print Network

    California at Santa Barbara, University of

    signals [2], blind source separation [3], and audio signal coding [4]. One can incorporate a diverse AGGLOMERATIVE CLUSTERING IN SPARSE ATOMIC DECOMPOSITIONS OF AUDIO SIGNALS Bob L. Sturm, John J clustering of atoms in sparse atomic decompositions of audio signals. Our goal is to demonstrate

  6. A dynamic grouping technique for ink and audio notes

    Microsoft Academic Search

    Patrick Chiu; Lynn Wilcox

    1998-01-01

    In this paper, we describe a technique for dynamically grouping digital ink and audio to support user interaction in freeform note-taking systems. For ink, groups of strokes might correspond to words, lines, or paragraphs of handwritten text. For audio, groups might be a complete spoken phrase or a speaker turn in a conversation. Ink and audio grouping is important for

  7. Automatically segmenting and clustering minimal-impact personal audio archives

    E-print Network

    Ellis, Dan

    Automatically segmenting and clustering minimal-impact personal audio archives Daniel P.W. Ellis, and experimenting with methods to index and access the resulting data. Audio archives have several distinctive no attention from the research community. At the same time, continuous audio archives are minimally intrusive

  8. Audio Source Separation Using Hierarchical Phase-Invariant Models

    E-print Network

    Paris-Sud XI, Université de

    Audio Source Separation Using Hierarchical Phase-Invariant Models Emmanuel Vincent METISS Group-invariant modeling could form the basis of future modular source separation systems. 1 Introduction Most audio source separation consists of analyzing a given audio recording so as to estimate the signal produced

  9. Reconstructing audio signals from modified non-coherent hilbert envelopes

    Microsoft Academic Search

    Joachim Thiemann; Peter Kabal

    2007-01-01

    In this paper, we present a speech and audio analysis-synthesis method based on a Basilar Membrane (BM) model. The audio signal is represented in this method by the Hilbert envelopes of the responses to complex gammatone filters uniformly spaced on a critical band scale. We show that for speech and audio signals, a perceptually equivalent signal can be reconstructed
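
    To illustrate the representation named in this record, the sketch below builds one complex gammatone filter and takes the magnitude of its output, which approximates the Hilbert envelope of the corresponding real gammatone response. It is not the authors' Basilar Membrane model: the bandwidth uses the widely used Glasberg-Moore ERB approximation, and the filter order, kernel duration, and function names are assumptions chosen for illustration.

```python
import cmath
import math


def erb(fc):
    """Equivalent rectangular bandwidth in Hz (Glasberg-Moore approximation)."""
    return 24.7 * (4.37 * fc / 1000.0 + 1.0)


def complex_gammatone(fc, fs, order=4, duration=0.025):
    """Sampled impulse response t^(n-1) * exp(-2*pi*b*t) * exp(j*2*pi*fc*t)."""
    b = 1.019 * erb(fc)  # commonly used bandwidth scaling (assumed here)
    n_taps = int(round(duration * fs))
    kernel = []
    for k in range(n_taps):
        t = k / fs
        gain = (t ** (order - 1)) * math.exp(-2.0 * math.pi * b * t)
        kernel.append(gain * cmath.exp(2j * math.pi * fc * t))
    return kernel


def envelope(signal, kernel):
    """Magnitude of the complex filter output (direct-form convolution)."""
    out = []
    for i in range(len(signal)):
        acc = 0j
        for k in range(min(i + 1, len(kernel))):
            acc += kernel[k] * signal[i - k]
        out.append(abs(acc))
    return out
```

    A full analysis stage would run one such filter per band, with centre frequencies spaced on a critical band scale, and keep one envelope per band; the direct convolution here favours clarity over speed.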

  10. Wideband speech and audio coding using gammatone filter banks

    Microsoft Academic Search

    Eliathamby Ambikairajah; Julien Epps; Lee Lin

    2001-01-01

    Considerable research attention has been directed towards speech and audio coding algorithms capable of producing high quality coded speech and audio; however, few of these use signal representations which account for temporal as well as spectral detail. This paper presents a new technique for 16 kHz wideband speech and audio coding, whereby analysis and synthesis are performed using a linear

  11. Analysis of Audio Packet Loss in the Internet

    Microsoft Academic Search

    Jean-chrysostome Bolot; Hugues Crépin; Andrés Vega-garcía

    1995-01-01

    We consider the problem of distributing audio data over networks such as the Internet that do not provide support for real-time applications. Experiments with such networks indicate that audio quality is mediocre in large part because of excessive audio packet losses. In this paper, we show using measurements over the Internet as well as analytic modeling that the number of

  12. Audio signal representations for indexing in the transform domain

    E-print Network

    Paris 7 - Denis Diderot, Université

    and diffused in digital form. This revolution is mainly due to the spread of audio coding technologies, which coding. More recently, the digital revolution gave birth to another research domain known as automatic Digital audio has progressively replaced analog audio since the 80s and music is now widely stored

  13. High performance MPEG-audio decoder IC

    NASA Astrophysics Data System (ADS)

    Thorn, M.; Benbassat, G.; Cyr, K.; Li, S.; Gill, M.; Kam, D.; Walker, K.; Look, P.; Eldridge, C.; Ng, P.

    The emerging digital audio and video compression technology brings both an opportunity and a new challenge to IC design. The pervasive application of compression technology to consumer electronics will require high volume, low cost IC's and fast time to market of the prototypes and production units. At the same time, the algorithms used in the compression technology result in complex VLSI IC's. The conflicting challenges of algorithm complexity, low cost, and fast time to market have an impact on device architecture and design methodology. The work presented in this paper is about the design of a dedicated, high precision, Motion Picture Expert Group (MPEG) audio decoder.

  14. Sparse audio representations using the MCLT

    Microsoft Academic Search

    Mike E. Davies; Laurent Daudet

    2006-01-01

    We consider sparse representations of audio based around the modulated complex lapped transform (MCLT) and a generalized iteratively reweighted least squares algorithm which can be interpreted as a variation of expectation maximization. We compare this mildly overcomplete representation to the more traditional modified discrete cosine transform (MDCT) in terms of coding cost and explore the possibility of extending it to a dual-resolution analysis using a pair of MCLT transforms, illustrating its potential application for audio modification.

  15. High performance MPEG-audio decoder IC

    NASA Technical Reports Server (NTRS)

    Thorn, M.; Benbassat, G.; Cyr, K.; Li, S.; Gill, M.; Kam, D.; Walker, K.; Look, P.; Eldridge, C.; Ng, P.

    1993-01-01

    The emerging digital audio and video compression technology brings both an opportunity and a new challenge to IC design. The pervasive application of compression technology to consumer electronics will require high volume, low cost IC's and fast time to market of the prototypes and production units. At the same time, the algorithms used in the compression technology result in complex VLSI IC's. The conflicting challenges of algorithm complexity, low cost, and fast time to market have an impact on device architecture and design methodology. The work presented in this paper is about the design of a dedicated, high precision, Motion Picture Expert Group (MPEG) audio decoder.

  16. Enhancing Navigation Skills through Audio Gaming

    PubMed Central

    Sánchez, Jaime; Sáenz, Mauricio; Pascual-Leone, Alvaro; Merabet, Lotfi

    2014-01-01

    We present the design, development and initial cognitive evaluation of an Audio-based Environment Simulator (AbES). This software allows a blind user to navigate through a virtual representation of a real space for the purposes of training orientation and mobility skills. Our findings indicate that users feel satisfied and self-confident when interacting with the audio-based interface, and the embedded sounds allow them to correctly orient themselves and navigate within the virtual world. Furthermore, users are able to transfer spatial information acquired through virtual interactions into real world navigation and problem solving tasks. PMID:25505796

  17. Precision Machining

    NSDL National Science Digital Library

    Leske, Cavin.

    Basic machining processes are introduced on a Web site that is devoted to engineering fundamentals (1). Descriptions and illustrations of drilling, turning, grinding, and other common processes are provided for people with little to no prior machining knowledge. A waterjet is a non-traditional machining technology that uses high pressure streams of water with abrasive additives rather than solid cutting instruments to slice through metal and other materials. An in-depth discussion of waterjet operation and applications is available from Southern Methodist University (2). Waterjets are often cited as being much more precise than traditional machining techniques. The Waterjet Video Vault (3) contains clips of waterjet machines in action. The video of the foam cutting procedure is especially interesting, as it shows how quick and accurate the machining process can be. An online guide to cross process machining, which incorporates elements from various conventional and unconventional techniques, is provided by the Mechanical Engineering Department at Columbia University (4). Some remarkable and innovative techniques that have surfaced over the past few years are outlined, including underwater laser machining and plasma-assisted machining. Entirely different and exotic machining techniques are required for creating microelectromechanical systems (MEMS) and other extremely small devices. The Caltech Micromachining Laboratory (5) maintains an archive of research highlights and papers on its homepage, including a paper on a MEMS-driven flapping wing for a palm-sized aerial vehicle. An online article from Modern Machine Shop (6) outlines some new technologies and research in the area of high speed machining. A particularly interesting section of the article describes a system developed at the University of Florida that aims to enable micromachining to achieve rotational speeds of standard machining processes, specifically up to a half million rotations per minute. 
Cutting edge waterjet innovations are the subject of a February 2003 feature from a publication of the Society of Manufacturing Engineers (7). Extremely high pressure nozzles are being developed to improve cutting speed, and enhanced software for controlling machine movements is also a focus of study. This news article (8) from June 20, 2003 describes an electrochemical machining process that is being used to fabricate complex nanostructures. The work, produced by German and U.S. researchers, has the potential to compete with current lithographic processes.

  18. Audio Steganography Using Bit Modification - A Tradeoff on Perceptibility and Data Robustness for Large Payload Audio Embedding

    Microsoft Academic Search

    Kaliappan Gopalan; Qidong Shi

    2010-01-01

    Audio steganography using bit modification of time domain audio samples is a simple technique for multimedia data embedding with potential for large payload. Depending on the index of the bit used to modify the samples in accordance with the data to be hidden, the resulting stego audio signal may become perceptible and/or susceptible to incorrect retrieval of the hidden data.
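
    As a minimal sketch of bit modification in general (not the authors' exact scheme), the code below writes one payload bit into a chosen bit position of each integer audio sample and reads it back. The function names and the use of plain Python integers as samples are assumptions for illustration.

```python
def embed_bits(samples, bits, bit_index=0):
    """Return a copy of `samples` with one payload bit stored at `bit_index`."""
    if len(bits) > len(samples):
        raise ValueError("payload is longer than the host signal")
    mask = 1 << bit_index
    stego = list(samples)
    for i, bit in enumerate(bits):
        # Clear the target bit, then set it if the payload bit is 1.
        stego[i] = (stego[i] & ~mask) | (mask if bit else 0)
    return stego


def extract_bits(samples, count, bit_index=0):
    """Read back `count` payload bits from the given bit position."""
    return [(s >> bit_index) & 1 for s in samples[:count]]
```

    Each sample changes by either 0 or 2**bit_index, so a higher bit index makes the change larger and more likely to be audible, while the lowest bits are the easiest to corrupt with noise or re-quantization. That is the perceptibility versus robustness tradeoff named in the title.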

  19. Comparison of MPEG7 audio spectrum projection features and MFCC applied to speaker recognition, sound classification and audio segmentation

    Microsoft Academic Search

    Hyoung-Gook Kim; Thomas Sikora

    2004-01-01

    We evaluate the MPEG-7 audio spectrum projection (ASP) features for general sound recognition performance against the well established MFCC. The recognition tasks of interest are speaker recognition, sound classification, and segmentation of audio using sound/speaker identification. For sound classification we use three approaches: direct approach; hierarchical approach without hints; hierarchical approach with hints. For audio segmentation, the MPEG-7 ASP features

  20. Simple Machines

    NSDL National Science Digital Library

    KET

    2010-11-16

    How do you get a glove and a ball up to your tree house? One answer is to use a pulley. A pulley is a simple machine. In this original KET interactive, children learn about the basic workings of three simple machines.

  1. Excavating machines

    SciTech Connect

    Plummer, D.

    1980-10-21

    The excavating machine has a cutter carrying boom carried by a boom support member which can be swung about an axis extending in the direction of the roadway. The machine includes a cutter unit and a stay unit each of which is releasably anchorable in the roadway and each of which can be advanced relative to the other unit.

  2. AudioBIFS: Describing Audio Scenes with MPEG4 Multimedia Standard

    Microsoft Academic Search

    Eric D. Scheirer; Jyri Huopaniemi

    1999-01-01

    We present an overview of the AudioBIFS system, part of the Binary Format for Scene Description (BIFS) tool in the MPEG-4 International Standard. AudioBIFS is the tool that integrates the synthetic and natural sound coding functions in MPEG-4. It allows the flexible construction of soundtracks and sound scenes using compressed sound, sound synthesis, streaming audio, interactive and terminal-dependent presentation, three-dimensional (3-D) spatialization, environmental auralization, and dynamic...

  3. Audio Arduino -- an ALSA (Advanced Linux Sound Architecture) Audio Driver for FTDI-based Arduinos

    Microsoft Academic Search

    Smilen Dimitrov; Stefania Serafin

    2011-01-01

    A contemporary PC user typically expects a sound card to be a piece of hardware that: can be manipulated by 'audio' software (most typically exemplified by 'media players'); and allows interfacing of the PC to audio reproduction and/or recording equipment. As such, a 'sound card' can be considered to be a system that encompasses design decisions on both hardware and software levels -- that

  4. Voltage Surges in Audio-Frequency Apparatus

    Microsoft Academic Search

    E. H. Fisher

    1929-01-01

    Transient voltages of over 2000 are shown to occur across the secondary when the normal plate current of a high inductance audio transformer is opened. The oscillograph and inverted vacuum tube are used to bring out further that these transients are oscillations of definite frequency and magnitude, depending on the primary current, inductance, and secondary distributed capacity. The manner of

  5. Push-pull audio amplifier theory

    Microsoft Academic Search

    M. Melehy

    1957-01-01

    Assuming nonlinear tube characteristics, this paper presents: 1) a single mathematical analysis of instantaneous relations applicable to all classes of operation of push-pull audio vacuum tube amplifiers; 2) a mathematical derivation, applicable to all classes of operation, of the composite load line equation and a study of the amplifier frequency response both at medium and low frequencies, with third and

  6. Sparse audio representations using the MCLT

    Microsoft Academic Search

    M. E. Davies; L. Daudet

    We consider sparse representations of audio based around the modulated complex lapped transform (MCLT) and a generalized iteratively reweighted least squares algorithm which can be interpreted as a variation of expectation maximization. We compare this mildly overcomplete representation to the more traditional modified discrete cosine transform (MDCT) in terms of coding cost and explore the possibility of extending it to

  7. Spanish for Agricultural Purposes: The Audio Program.

    ERIC Educational Resources Information Center

    Mainous, Bruce H.; And Others

    The manual, which is meant to accompany and supplement the basic manual and to support the audio component of "Spanish for Agricultural Purposes," a one-semester course for North American agriculture specialists preparing to work in Latin America, consists of exercises to supplement readings presented in the course's basic manual and to…

  8. Audio-Tutorial Instruction; An Expanded Approach.

    ERIC Educational Resources Information Center

    Herrick, Merlyn C.

    The University of Missouri-Columbia School of Medicine is developing an audio-tutorial system with several unique features. A Didactor, a device which provides most of the capabilities of computer-assisted instruction but at a fraction of the cost, is the center of the system. The Didactor is combined with tape recordings and slides to present a…

  9. Audio Source Separation: Solutions and Problems

    Microsoft Academic Search

    Nikolaos Mitianoudis; Mike Davies

    2002-01-01

    SUMMARY The problem of separating out a number of audio sources observed from an array of microphones in a real room environment has received a great deal of attention in the past decade. While there are now a number of workable methods that can even deal with relatively high reverberation (18), a number of interesting problems still remain. In this

  10. Audio source separation of convolutive mixtures

    Microsoft Academic Search

    Nikolaos Mitianoudis; Michael E. Davies

    2003-01-01

    The problem of separation of audio sources recorded in a real world situation is well established in modern literature. A method to solve this problem is Blind Source Separation (BSS) using Independent Component Analysis (ICA). The recording environment is usually modelled as convolutive. Previous research on ICA of instantaneous mixtures provided solid background for the separation of

  11. Performance measurement in blind audio source separation

    Microsoft Academic Search

    Emmanuel Vincent; Rémi Gribonval; Cédric Févotte

    2006-01-01

    In this paper, we discuss the evaluation of blind audio source separation (BASS) algorithms. Depending on the exact application, different distortions can be allowed between an estimated source and the wanted true source. We consider four different sets of such allowed distortions, from time-invariant gains to time-varying filters. In each case, we decompose the estimated source into a true

  12. AUDIO-READER MAX KADE CENTER

    E-print Network

    SPORTS PAVILION ANDERSON STRENGTH CENTER PARKING GARAGE & OFFICES BURGE UNION HOGLUND-MAUPIN STADIUMBAEHR AUDIO-READER CENTER MAX KADE CENTER (SUDLER HOUSE) K.J.H.K. TRIANGLE SUNFLOWER APARTMENTS 1 DEVELOPMENT CENTER LEARNED HALL BURTHALL A B E C D GREEN HALL NAISMITHHALL (PRIVATE) STOUFFER PLACE HOUSE

  13. An ESL Audio-Script Writing Workshop

    ERIC Educational Resources Information Center

    Miller, Carla

    2012-01-01

    The roles of dialogue, collaborative writing, and authentic communication have been explored as effective strategies in second language writing classrooms. In this article, the stages of an innovative, multi-skill writing method, which embeds students' personal voices into the writing process, are explored. A 10-step ESL Audio Script Writing Model…

  14. Providing Students with Formative Audio Feedback

    ERIC Educational Resources Information Center

    Brearley, Francis Q.; Cullen, W. Rod

    2012-01-01

    The provision of timely and constructive feedback is increasingly challenging for busy academics. Ensuring effective student engagement with feedback is equally difficult. Increasingly, studies have explored provision of audio recorded feedback to enhance effectiveness and engagement with feedback. Few, if any, of these focus on purely formative…

  15. Robust Multiplicative Patchwork Method for Audio Watermarking

    Microsoft Academic Search

    Nima Khademi Kalantari; Mohannad Ali Akhaee; Seyed Mohammad Ahadi; Hamidreza Amindavar

    2009-01-01

    This paper presents a Multiplicative Patchwork Method (MPM) for audio watermarking. The watermark signal is embedded by selecting two subsets of the host signal features and modifying one subset multiplicatively regarding the watermark data, whereas another subset is left unchanged. The method is implemented in wavelet domain and approximation coefficients are used to embed data. In order to have an

  16. SIMAC: SEMANTIC INTERACTION WITH MUSIC AUDIO CONTENTS

    Microsoft Academic Search

    Perfecto Herrera; Juan Bello; Gerhard Widmer; Mark Sandler; Òscar Celma; Fabio Vignoli; Elias Pampalk; Pedro Cano; Steffen Pauws; Xavier Serra

    2005-01-01

    The SIMAC project addresses the study and development of innovative components for a music information retrieval system. The key feature is the usage and exploitation of semantic descriptors of musical content that are automatically extracted from music audio files. These descriptors are generated in two ways: as derivations and combinations of lower-level descriptors and as generalizations induced from

  17. Music Icons: Procedural Glyphs for Audio Files

    Microsoft Academic Search

    Philipp Kolhoff; Jacqueline Preuß; Jörn Loviscach

    2006-01-01

    Nowadays, a personal music collection may comprise thousands of MP3 files. Visualization can help the user to gain an overview and to find similar songs inside so large a set. We describe a method to create icons from audio files in such a way that songs which the user considers similar receive similar icons. This allows visual data mining

  18. Music Icons: Procedural Glyphs for Audio Files

    Microsoft Academic Search

    Philipp Kolhoff Jacqueline

    Nowadays, a personal music collection may comprise thousands of MP3 files. Visualization can help the user to gain an overview and to find similar songs inside so large a set. We describe a method to create icons from audio files in such a way that songs which the user considers similar receive similar icons. This allows visual data mining

  19. Probabilistic Modeling Paradigms for Audio Source Separation

    E-print Network

    Paris-Sud XI, Université de

    Probabilistic Modeling Paradigms for Audio Source Separation Emmanuel Vincent INRIA, Centre Inria form the basis of future state-of-the-art systems. KEYWORDS Source separation, latent variable model sound scenes result from the superposition of several sources, which can be separately perceived

  20. Using Salient Envelope Features for Audio Coding

    E-print Network

    Kabal, Peter

    of gammatone filters, and then computing the Hilbert envelopes of the responses. Relevant points) is modelled by analysis of the audio signal with a set of gammatone filters. In contrast to Feldbauer of the gammatone filter responses. Previous work [3] has shown that this information is sufficient to reconstruct

  1. AUDIO MASKING AND TIME-FREQUENCY EXPANSIONS

    Microsoft Academic Search

    Bernard Mulgrew

    2006-01-01

    An alternative mechanism for audio masking is postulated. This mechanism is derived as a solution to the classic problem of representing a signal as a linear combination of basis functions which are only approximately orthogonal and hence are prone to leakage. This mechanism involves augmenting each basis function or filter with an auxiliary filter. In this combined

  2. CERN automatic audio-conference service

    NASA Astrophysics Data System (ADS)

    Sierra Moral, Rodrigo

    2010-04-01

    Scientists from all over the world need to collaborate with CERN on a daily basis. They must be able to communicate effectively on their joint projects at any time; as a result telephone conferences have become indispensable and widely used. Managed by 6 operators, CERN already has more than 20000 hours and 5700 audio-conferences per year. However, the traditional telephone based audio-conference system needed to be modernized in three ways. Firstly, to provide the participants with more autonomy in the organization of their conferences; secondly, to eliminate the constraints of manual intervention by operators; and thirdly, to integrate the audio-conferences into a collaborative working framework. The large number, and hence cost, of the conferences prohibited externalization and so the CERN telecommunications team drew up a specification to implement a new system. It was decided to use a new commercial collaborative audio-conference solution based on the SIP protocol. The system was tested as the first European pilot and several improvements (such as billing, security, redundancy...) were implemented based on CERN's recommendations. The new automatic conference system has been operational since the second half of 2006. It is very popular with users and has doubled the number of conferences in the past two years.

  3. The FreeBSD Audio Driver

    Microsoft Academic Search

    Luigi Rizzo

    1997-01-01

    We recently developed an audio driver in the FreeBSD operating system. In this work, we decided to consider compatibility with existing software interfaces only as a secondary issue, to be implemented at a later time and only for those applications which could not be adapted to the new software interface. This turned out to be a significant advantage, since it let us design the

  4. Audio effects to enhance spatial information displays

    Microsoft Academic Search

    Davide Rocchesso

    2002-01-01

    Vision and hearing are inherently different perceptual systems. The vision system is tuned to extract spatial information of surfaces and edges, while the auditory system is better suited to detect the temporal behavior of sources. However there are spatial audio effects that can be used to overcome some limitations of visual displays, especially when time-varying complex scenarios have to be

  5. Digital audio broadcasting: an interactive services architecture

    Microsoft Academic Search

    Nikos Manouselis; Pythagoras Karampiperis

    2001-01-01

    Digital media technologies offer enhanced multimedia signal broadcasting and description of the signal on content. Digital audio broadcasting (DAB) is a media standard with extended multimedia capabilities, offering novel services to users world-wide. Each digital broadcasting standard though, cannot be separately viewed from the development of Internet radio broadcasting. We introduce a multi-agent based system architecture, specially designed to provide

  6. Machine therapy

    E-print Network

    Dobson, Kelly E. (Kelly Elizabeth), 1970-

    2007-01-01

    Machine Therapy is a new practice combining art, design, psychoanalysis, and engineering work in ways that access and reveal the vital, though often unnoticed, relevance of people's interactions and relationships with ...

  7. Math Machines

    NSDL National Science Digital Library

    The mission of the Math Machines organization is to "improve the quality of mathematical education, enhance the transfer of mathematical thinking into other classes, and increase students' ability to apply rigorous mathematics outside the classroom." Their website supports a National Science Foundation ATE grant-supported project designed to improve teaching in the areas of Mathematics, Science, and Technology at the high school and college levels. This improved learning results from using math, science, and technology principles to build and control various machines such as pointers and robots or "math machines", which are simple devices that provide an immediate, physical, dynamic expression to abstract mathematical equations. The website provides information links on Educational Theory, Classroom Activities, Project Workshops, Calculators & Programs, and Machine Construction Instructions for Building: Closed Circuits, Servo Motors, Controllers, Robot Boards and more. There is also contact information, an FAQ section, as well as upcoming events.

  8. Simple Machines

    NSDL National Science Digital Library

    This series of three interactive, multimedia activities introduce and demonstrate the properties of six simple machines. Specifically, the lessons show how levers, pulleys, inclined planes, screws, wheels and axles, and wedges can reduce the amount of work done by humans. After learning about the characteristics of each classification, users can try to find the simple machines that make up a lawn mower. By inspecting the mower from different angles, several simple machines are revealed and must be identified. The final activity lets users test their knowledge of the mechanics of simple machines. Following a builder through each stage of constructing a tree house, users can apply equations to determine the mechanical advantage supplied by using the tools.

  9. Monel Machining

    NASA Technical Reports Server (NTRS)

    1983-01-01

    Castle Industries, Inc. is a small machine shop manufacturing replacement plumbing repair parts, such as faucet, tub and ballcock seats. Therese Castley, president of Castle decided to introduce Monel because it offered a chance to improve competitiveness and expand the product line. Before expanding, Castley sought NERAC assistance on Monel technology. NERAC (New England Research Application Center) provided an information package which proved very helpful. The NASA database was included in NERAC's search and yielded a wealth of information on machining Monel.

  10. [Intermodal timing cues for audio-visual speech recognition].

    PubMed

    Hashimoto, Masahiro; Kumashiro, Masaharu

    2004-06-01

    The purpose of this study was to investigate the limitations of lip-reading advantages for Japanese young adults by desynchronizing visual and auditory information in speech. In the experiment, audio-visual speech stimuli were presented under six test conditions: audio-alone, and audio-visually with either 0, 60, 120, 240 or 480 ms of audio delay. The stimuli were video recordings of the face of a Japanese woman speaking long and short Japanese sentences. The intelligibility of the audio-visual stimuli was measured as a function of audio delay in sixteen untrained young subjects. Speech intelligibility under the audio-delay condition of less than 120 ms was significantly better than that under the audio-alone condition. On the other hand, the delay of 120 ms corresponded to the mean mora duration measured for the audio stimuli. The results implied that audio delays of up to 120 ms would not disrupt lip-reading advantage, because visual and auditory information in speech seemed to be integrated on a syllabic time scale. Potential applications of this research include noisy workplaces in which a worker must extract relevant speech from all the other competing noises. PMID:15244074

  11. Content-based classification and retrieval of audio

    NASA Astrophysics Data System (ADS)

    Zhang, Tong; Kuo, C.-C. Jay

    1998-10-01

    An on-line audio classification and segmentation system is presented in this research, where audio recordings are classified and segmented into speech, music, several types of environmental sounds and silence based on audio content analysis. This is the first step of our continuing work towards a general content-based audio classification and retrieval system. The extracted audio features include temporal curves of the energy function, the average zero-crossing rate, the fundamental frequency of audio signals, as well as statistical and morphological features of these curves. The classification result is achieved through a threshold-based heuristic procedure. The audio database that we have built, details of feature extraction, classification and segmentation procedures, and experimental results are described. It is shown that, with the proposed new system, audio recordings can be automatically segmented and classified into basic types in real time with an accuracy of over 90 percent. Outlines of further classification of audio into finer types and a query-by-example audio retrieval system on top of the coarse classification are also introduced.
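
    The two simplest features named in the abstract above, the short-time energy function and the zero-crossing rate, can be sketched as follows. This is only an illustration: the frame sizes, the thresholds, and the `label_frame` rule are assumptions made here and are not taken from the paper, whose actual heuristic procedure is not given in the abstract.

```python
import math

def frame_signal(samples, frame_len, hop):
    """Split a sequence of samples into (possibly overlapping) frames."""
    return [samples[i:i + frame_len]
            for i in range(0, len(samples) - frame_len + 1, hop)]

def short_time_energy(frame):
    """Mean squared amplitude of one frame."""
    return sum(x * x for x in frame) / len(frame)

def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)

def label_frame(frame, energy_floor=1e-4, zcr_split=0.1):
    """Toy threshold rule; both thresholds are illustrative assumptions."""
    if short_time_energy(frame) < energy_floor:
        return "silence"
    return "high-zcr" if zero_crossing_rate(frame) > zcr_split else "low-zcr"

# A slow sine (5 cycles over 1000 samples) and a sample-rate alternation.
tone = [math.sin(2 * math.pi * 5 * n / 1000) for n in range(1000)]
buzz = [1.0, -1.0] * 50
```

    A real classifier would track how these values evolve over many frames (the "temporal curves" of the abstract) rather than label single frames in isolation.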

  12. Three-dimensional audio using loudspeakers

    NASA Astrophysics Data System (ADS)

    Gardner, William G.

    1997-12-01

    3-D audio systems, which can surround a listener with sounds at arbitrary locations, are an important part of immersive interfaces. A new approach is presented for implementing 3-D audio using a pair of conventional loudspeakers. The new idea is to use the tracked position of the listener's head to optimize the acoustical presentation, and thus produce a much more realistic illusion over a larger listening area than existing loudspeaker 3-D audio systems. By using a remote head tracker, for instance based on computer vision, an immersive audio environment can be created without donning headphones or other equipment. The general approach to a 3-D audio system is to reconstruct the acoustic pressures at the listener's ears that would result from the natural listening situation to be simulated. To accomplish this using loudspeakers requires that first, the ear signals corresponding to the target scene are synthesized by appropriately encoding directional cues, a process known as 'binaural synthesis,' and second, these signals are delivered to the listener by inverting the transmission paths that exist from the speakers to the listener, a process known as 'crosstalk cancellation.' Existing crosstalk cancellation systems only function at a fixed listening location; when the listener moves away from the equalization zone, the 3-D illusion is lost. Steering the equalization zone to the tracked listener preserves the 3-D illusion over a large listening volume, thus simulating a reconstructed soundfield, and also provides dynamic localization cues by maintaining stationary external sound sources during head motion. This dissertation will discuss the theory, implementation, and testing of a head-tracked loudspeaker 3-D audio system. Crosstalk cancellers that can be steered to the location of a tracked listener will be described. The objective performance of these systems has been evaluated using simulations and acoustical measurements made at the ears of human subjects. 
    Many sound localization experiments were also conducted; the results show that head-tracking both significantly improves localization when the listener is displaced from the ideal listening location, and also enables dynamic localization cues. (Copies available exclusively from MIT Libraries, Rm. 14-0551, Cambridge, MA 02139-4307. Ph. 617-253-5668; Fax 617-253-1690.)

  13. IEEE Transactions on Consumer Electronics Visible Light Communication for Audio Systems

    E-print Network

    Pang, Grantham

    LEDs, in which current fed to the LEDs is modulated and encoded with audio information or messages. The audio system provides audio signal transmission in a free space optical link. The receiver, combined to be constantly illuminated. Keywords: audio systems, light emitting diodes; audio broadcasting. 1. Introduction

  14. Segmentation of expiratory and inspiratory sounds in baby cry audio recordings using hidden Markov models.

    PubMed

    Aucouturier, Jean-Julien; Nonaka, Yulri; Katahira, Kentaro; Okanoya, Kazuo

    2011-11-01

    The paper describes an application of machine learning techniques to identify expiratory and inspiratory phases from the audio recording of human baby cries. Crying episodes were recorded from 14 infants, spanning four vocalization contexts in their first 12 months of age; recordings from three individuals were annotated manually to identify expiratory and inspiratory sounds and used as training examples to segment automatically the recordings of the other 11 individuals. The proposed algorithm uses a hidden Markov model architecture, in which state likelihoods are estimated either with Gaussian mixture models or by converting the classification decisions of a support vector machine. The algorithm yields up to 95% classification precision (86% average), and its ability generalizes over different babies, different ages, and vocalization contexts. The technique offers an opportunity to quantify expiration duration, count the crying rate, and other time-related characteristics of baby crying for screening, diagnosis, and research purposes over large populations of infants. PMID:22087925

  15. Audio-visual integration and saccadic inhibition.

    PubMed

    Makovac, Elena; Buonocore, Antimo; McIntosh, Robert D

    2015-07-01

    Saccades operate a continuous selection between competing targets at different locations. This competition has been mostly investigated in the visual context, and it is well known that a visual distractor can interfere with a saccade toward a visual target. Here, we investigated whether multimodal, audio-visual targets confer stronger resilience against visual distraction. Saccades to audio-visual targets had shorter latencies than saccades to unisensory stimuli. This facilitation exceeded the level that could be explained by simple probability summation, indicating that multisensory integration had occurred. The magnitude of inhibition induced by a visual distractor was comparable for saccades to unisensory and multisensory targets, but the duration of the inhibition was shorter for multimodal targets. We conclude that multisensory integration can allow a saccade plan to be reestablished more rapidly following saccadic inhibition. PMID:25599266

  16. Digital audio and video broadcasting by satellite

    NASA Astrophysics Data System (ADS)

    Yoshino, Takehiko

    In parallel with the progress of the practical use of satellite broadcasting and Hi-Vision or high-definition television technologies, research activities are also in progress to replace the conventional analog broadcasting services with a digital version. What we call 'digitalization' is not a mere technical matter but an important subject which will help promote multichannel or multimedia applications and, accordingly, can change the old concept of mass media, such as television or radio. NHK Science and Technical Research Laboratories has promoted studies of digital bandwidth compression, transmission, and application techniques. The following topics are covered: the trend of digital broadcasting; features of Integrated Services Digital Broadcasting (ISDB); compression encoding and transmission; transmission bit rate in 12 GHz band; number of digital TV transmission channels; multichannel pulse code modulation (PCM) audio broadcasting system via communication satellite; digital Hi-Vision broadcasting; and development of digital audio broadcasting (DAB) for mobile reception in Japan.

  17. Audio feature extraction using probability distribution function

    NASA Astrophysics Data System (ADS)

    Suhaib, A.; Wan, Khairunizam; Aziz, Azri A.; Hazry, D.; Razlan, Zuradzman M.; Shahriman A., B.

    2015-05-01

    Voice recognition has been one of the popular applications in the robotics field. It has also recently been used in biometric and multimedia information retrieval systems. This technology is the result of successive research on audio feature extraction analysis. The probability distribution function (PDF) is a statistical method that is usually used as one of the processes in complex feature extraction methods such as GMM and PCA. In this paper, a new method for audio feature extraction is proposed that uses the PDF alone as the feature extraction method for speech analysis. Certain pre-processing techniques are performed prior to the proposed feature extraction method. Subsequently, the PDF values for each frame of sampled voice signals obtained from a number of individuals are plotted. From the experimental results, it can be seen visually from the plotted data that each individual's voice has comparable PDF values and shapes.

  18. Tubes and transistors in audio amplifiers

    Microsoft Academic Search

    O. M. Reshetnikov; R. K. Khestanov; Y. V. Chernykh

    1985-01-01

    The alleged differences between tube and transistor high-fidelity sound reproduction channels, in terms of subjective versus objective evaluation of reception quality, were studied. For testing, the preamplifier stage behind the phonograph pickup was singled out, the Audio Research SP-6C tube preamplifier being compared with a specially built RIAA transistor preamplifier-corrector, so as to separate interaction of this stage from interaction

  19. Statistical modeling of DCT coefficients for audio

    Microsoft Academic Search

    Cuiping Wang; Li Guo; Yifang Wei; Yujie Wang

    2010-01-01

    Based on the sharp-peak and heavy-tail of statistical distribution of discrete cosine transform (DCT) coefficients for audio, generalized Gaussian distribution (GGD) and alpha-stable distribution are usually employed as modeling tools. In this paper, the Kullback-Leibler Divergence is applied to measure the difference between modeling result and the true distribution, and the experiment results show that compared with alpha-stable distribution, the
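
    The comparison described above rests on the Kullback-Leibler divergence between an empirical distribution and a fitted model. A minimal discrete sketch follows; the histogram binning and the `eps` guard are assumptions made here for illustration, and fitting the GGD or alpha-stable model itself is outside this sketch.

```python
import math

def histogram(values, edges):
    """Normalized histogram: fraction of values in each [edges[i], edges[i+1]) bin."""
    counts = [0] * (len(edges) - 1)
    for v in values:
        for i in range(len(counts)):
            if edges[i] <= v < edges[i + 1]:
                counts[i] += 1
                break
    total = sum(counts)
    return [c / total for c in counts]

def kl_divergence(p, q, eps=1e-12):
    """D_KL(p || q) in nats for two discrete distributions over the same bins.

    Bins with p == 0 contribute nothing; q is floored at eps to avoid log(0).
    """
    return sum(pi * math.log(pi / max(qi, eps)) for pi, qi in zip(p, q) if pi > 0)
```

    Here `p` would be the histogram of observed DCT coefficients and `q` the candidate model's probability mass over the same bins; the model with the smaller divergence is the closer fit.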

  20. Multirate adaptive filtering for immersive audio

    Microsoft Academic Search

    Jong-soong Lim; Chris Kyriakakis

    2001-01-01

    This paper describes a method for implementing immersive audio rendering filters for single or multiple listeners and loudspeakers. In particular, the paper is focused on the case of single or two listeners with different loudspeaker arrays to determine the weighting vectors for the necessary FIR and IIR filters using the LMS (least-mean-squares) adaptive inverse algorithm. It describes transform-domain LMS adaptive
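
    For readers unfamiliar with the LMS update mentioned above, the following is a minimal time-domain sketch. It shows plain LMS system identification, not the adaptive inverse or transform-domain variants the paper discusses; the two-tap "unknown" system, the step size `mu`, and the random test signal are all hypothetical choices made here.

```python
import random

def lms_identify(x, d, num_taps, mu):
    """Adapt FIR weights w so that sum_k w[k] * x[n-k] tracks d[n]."""
    w = [0.0] * num_taps
    for n in range(len(x)):
        # Most recent num_taps inputs, newest first, zero-padded at the start.
        u = [x[n - k] if n - k >= 0 else 0.0 for k in range(num_taps)]
        y = sum(wk * uk for wk, uk in zip(w, u))   # filter output
        e = d[n] - y                               # instantaneous error
        w = [wk + mu * e * uk for wk, uk in zip(w, u)]  # LMS weight update
    return w

rng = random.Random(0)
x = [rng.uniform(-1.0, 1.0) for _ in range(5000)]
true_w = [0.5, -0.3]  # hypothetical system to be identified
d = [true_w[0] * x[n] + (true_w[1] * x[n - 1] if n > 0 else 0.0)
     for n in range(len(x))]
w = lms_identify(x, d, num_taps=2, mu=0.1)
```

    With a noise-free desired signal and a small step size, the weights settle on the true coefficients; an inverse configuration would instead place the adaptive filter in cascade with the system and compare against a delayed input.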

  1. RTP Payload for Redundant Audio Data

    Microsoft Academic Search

    Andres Vega-garcia; Colin Perkins; Isidor Kouvelas; Jean-chrysostome Bolot; Orion Hodson; Sacha Fosse-parisis; Vicky Hardman

    1997-01-01

    This document describes a payload format for use with the real-time transport protocol (RTP), version 2, for encoding redundant audio data. The primary motivation for the scheme described herein is the development of audio conferencing tools for use with lossy packet networks such as the Internet Mbone, although this scheme is not limited to such applications. Perkins et al., INTERNET-DRAFT, 25 July 1997. 1 Introduction: If multimedia

  2. A digital audio/video interleaving system. [for Shuttle Orbiter]

    NASA Technical Reports Server (NTRS)

    Richards, R. W.

    1978-01-01

    A method of interleaving an audio signal with its associated video signal for simultaneous transmission or recording, and the subsequent separation of the two signals, is described. Comparisons are made between the new audio signal interleaving system and the Skylab Pam audio/video interleaving system, pointing out improvements gained by using the digital audio/video interleaving system. It was found that the digital technique is the simplest, most effective and most reliable method for interleaving audio and/or other types of data into the video signal for the Shuttle Orbiter application. Details of the design of a multiplexer capable of accommodating two basic data channels, each consisting of a single 31.5-kb/s digital bit stream, are given. An adaptive slope delta modulation system is introduced to digitize audio signals, producing a high immunity of word intelligibility to channel errors, primarily due to the robust nature of the delta-modulation algorithm.
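
    The idea behind adaptive delta modulation, one bit per sample with a step size that grows when the coder falls behind the signal, can be sketched as below. This is a generic textbook-style illustration and not the Shuttle Orbiter design: the step limits and the `grow` and `shrink` factors are assumptions chosen here, since the abstract does not give the adaptation rule.

```python
def _adapt(step, bit, prev, step_min, step_max, grow, shrink):
    """Grow the step after a repeated bit, shrink it after an alternation."""
    if bit == prev:
        return min(step * grow, step_max)
    return max(step * shrink, step_min)

def adm_encode(samples, step_min=0.01, step_max=1.0, grow=1.5, shrink=0.66):
    """Emit 1 if the input is at or above the running estimate, else 0."""
    bits, estimate, step, prev = [], 0.0, step_min, None
    for s in samples:
        bit = 1 if s >= estimate else 0
        step = _adapt(step, bit, prev, step_min, step_max, grow, shrink)
        estimate += step if bit else -step
        bits.append(bit)
        prev = bit
    return bits

def adm_decode(bits, step_min=0.01, step_max=1.0, grow=1.5, shrink=0.66):
    """Mirror the encoder's step adaptation to rebuild the staircase estimate."""
    out, estimate, step, prev = [], 0.0, step_min, None
    for bit in bits:
        step = _adapt(step, bit, prev, step_min, step_max, grow, shrink)
        estimate += step if bit else -step
        out.append(estimate)
        prev = bit
    return out
```

    Because each bit only nudges the estimate up or down by the current step, a single flipped bit perturbs the output slightly instead of corrupting a whole PCM word, which is the kind of robustness to channel errors the abstract refers to.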

  3. An active headrest for personal audio.

    PubMed

    Elliott, Stephen J; Jones, Matthew

    2006-05-01

    There is an increasing need for personal audio systems, which generate sounds that are clearly audible to one listener but are not audible to other listeners nearby. Of particular interest in this paper are listeners sitting in adjacent seats in aircraft or land vehicles. Although personal audio could then be achieved with headsets, it would be safer and more comfortable if loudspeakers in the seat headrests could be actively controlled to generate an acceptable level of acoustic isolation. In this paper a number of approaches to this problem are investigated, but the most successful involves a pair of loudspeakers on one side of the headrest, driven together to reproduce an audio signal for a listener in that seat and also to attenuate the pressures in the adjacent seat. The performance of this technique is investigated using simple analytic models and also with a practical implementation, tested in an anechoic chamber and a small room. It is found that significant attenuations, of between 5 and 25 dB, can be achieved in the crosstalk between the seats for frequencies up to about 2 kHz. PMID:16708929

  4. Realization of audio transmission node for vehicular MOST network

    Microsoft Academic Search

    Wang Jian-guo; Yang Bing

    2010-01-01

    A set of three-node MOST (Media Oriented Systems Transport) audio system is designed in this work, which adopts the C8051F043 single chip as the controller, and OS8104 chip as the transceiver of network. In sequence, a hardware circuit diagram of the audio node and software procedure diagram of the control node and audio node are shown, realizing the transmission of

  5. Audio classification based on MPEG7 spectral basis representations

    Microsoft Academic Search

    Hyoung-gook Kim; Nicolas Moreau; Thomas Sikora

    2004-01-01

    In this paper, we present an MPEG-7-based audio classification and retrieval technique targeted for analysis of film material. The technique consists of low-level descriptors and high-level description schemes. For low-level descriptors, low-dimensional features such as audio spectrum projection based on audio spectrum basis descriptors are produced in order to find a balanced tradeoff between reducing dimensionality and retaining maximum information

  6. For Immediate Release --Tuesday, May 13, 2014 University of Lethbridge digital audio arts majors place

    E-print Network

    Seldin, Jonathan P.

    digital audio arts majors place second at international recording competition A University of Lethbridge team of digital audio arts majors earned the Runner Up of L Shure Recording Team was also selected randomly from digital audio arts

  7. 37 CFR 201.27 - Initial notice of distribution of digital audio recording devices or media.

    Code of Federal Regulations, 2010 CFR

    2010-07-01

    ...notice of distribution of digital audio recording devices... COPYRIGHT OFFICE, LIBRARY OF CONGRESS COPYRIGHT...notice of distribution of digital audio recording devices...charge, by contacting the Library of Congress, Copyright...first distribution of digital audio recording...

  8. 37 CFR 201.28 - Statements of Account for digital audio recording devices or media.

    Code of Federal Regulations, 2010 CFR

    2010-07-01

    ...Statements of Account for digital audio recording devices... COPYRIGHT OFFICE, LIBRARY OF CONGRESS COPYRIGHT...Statements of Account for digital audio recording devices...Licensing Division, Library of Congress. Forms...Statement of Account for Digital Audio Recording...

  9. 76 FR 79755 - First Meeting: RTCA Special Committee 226 Audio Systems and Equipment

    Federal Register 2010, 2011, 2012, 2013, 2014

    2011-12-22

    ...First Meeting: RTCA Special Committee 226 Audio Systems and Equipment AGENCY: Federal...Notice of RTCA Special Committee 226, Audio Systems and Equipment...meeting of RTCA Special Committee 226, Audio Systems and Equipment, for the first...

  10. 78 FR 18416 - Sixth Meeting: RTCA Special Committee 226, Audio Systems and Equipment

    Federal Register 2010, 2011, 2012, 2013, 2014

    2013-03-26

    ...Meeting: RTCA Special Committee 226, Audio Systems and Equipment AGENCY: Federal...Notice of RTCA Special Committee 226, Audio Systems and Equipment...meeting of the RTCA Special Committee 226, Audio Systems and Equipment. DATES: The...

  11. 77 FR 37732 - Fourteenth Meeting: RTCA Special Committee 224, Audio Systems and Equipment

    Federal Register 2010, 2011, 2012, 2013, 2014

    2012-06-22

    ...Meeting: RTCA Special Committee 224, Audio Systems and Equipment AGENCY: Federal...Notice of RTCA Special Committee 224, Audio Systems and Equipment...meeting of RTCA Special Committee 224, Audio Systems and Equipment. DATES: The...

  12. 47 CFR 73.9005 - Compliance requirements for covered demodulator products: Audio.

    Code of Federal Regulations, 2010 CFR

    2010-10-01

    ...requirements for covered demodulator products: Audio. 73.9005 Section 73.9005 Telecommunication...requirements for covered demodulator products: Audio. Except as otherwise provided in...demodulator products shall not output the audio portions of unscreened content or of...

  13. 78 FR 57673 - Eighth Meeting: RTCA Special Committee 226, Audio Systems and Equipment

    Federal Register 2010, 2011, 2012, 2013, 2014

    2013-09-19

    ...Meeting: RTCA Special Committee 226, Audio Systems and Equipment AGENCY: Federal...Notice of RTCA Special Committee 226, Audio Systems and Equipment...meeting of the RTCA Special Committee 226, Audio Systems and Equipment. DATES: The...

  14. Texas A&M Information Technology Audio Visual Surveillance Technology Committee

    E-print Network

    ...............................................................................................................7 General Audio video surveillance technology may be used by Texas A&M University and other SystemTexas A&M Information Technology Audio Visual Surveillance Technology Committee Audio Visual Surveillance Technology Operational Standards Table of Contents General

  15. Audio scene segmentation for video with generic content

    NASA Astrophysics Data System (ADS)

    Niu, Feng; Goela, Naveen; Divakaran, Ajay; Abdel-Mottaleb, Mohamed

    2008-01-01

    In this paper, we present a content-adaptive audio texture based method to segment video into audio scenes. The audio scene is modeled as a semantically consistent chunk of audio data. Our algorithm is based on "semantic audio texture analysis." At first, we train GMM models for basic audio classes such as speech, music, etc. Then we define the semantic audio texture based on those classes. We study and present two types of scene changes, those corresponding to an overall audio texture change and those corresponding to a special "transition marker" used by the content creator, such as a short stretch of music in a sitcom or silence in dramatic content. Unlike prior work using genre specific heuristics, such as some methods presented for detecting commercials, we adaptively find out if such special transition markers are being used and if so, which of the base classes are being used as markers without any prior knowledge about the content. Our experimental results show that our proposed audio scene segmentation works well across a wide variety of broadcast content genres.

  16. Workout Machine

    NASA Technical Reports Server (NTRS)

    1995-01-01

    The Orbotron is a tri-axle exercise machine patterned after a NASA training simulator for astronaut orientation in the microgravity of space. It has three orbiting rings corresponding to roll, pitch and yaw. The user is in the middle of the inner ring with the stomach remaining in the center of all axes, eliminating dizziness. Human power starts the rings spinning, unlike the NASA air-powered system. Marketed by Fantasy Factory (formerly Orbotron, Inc.), the machine can improve aerobic capacity, strength and endurance in five to seven minute workouts.

  17. 50 CFR 27.71 - Commercial filming and still photography and audio recording.

    Code of Federal Regulations, 2014 CFR

    2014-10-01

    ... Commercial filming and still photography and audio recording. 27...Disturbing Violations: Filming, Photography, and Light and Sound Equipment... Commercial filming and still photography and audio recording....

  18. Function Machine

    NSDL National Science Digital Library

    2011-01-01

    This Java applet allows learners to explore simple linear functions. Students determine the algebraic form of a linear equation by entering inputs into the machine and by looking for patterns in the outputs. The function rules available are: integers from -10 to 10 are either added to, subtracted from, or multiplied by the input x to yield the output y.

  19. Decoding Machine

    NSDL National Science Digital Library

    2012-10-22

    In this math lesson, learners explore variables and their uses. Learners pretend to be FBI agents and make a TOP SECRET tool that enables them to decode and find the values of hidden messages and words. Learners make their simple "decoding machines" out of paper and tape.

  20. 2007 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics October 21-24, 2007, New Paltz, NY LINEAR REGRESSION ON SPARSE FEATURES FOR SINGLE-CHANNEL SPEECH

    E-print Network

    source separation (BSS) using little or no prior information about the signals; and machine learning meth2007 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics October 21-24, 2007, New Paltz, NY LINEAR REGRESSION ON SPARSE FEATURES FOR SINGLE-CHANNEL SPEECH SEPARATION Mikkel N

  1. Combination of Audio and Lyrics Features for Genre Classification in Digital Audio Collections

    E-print Network

    Rauber, Andreas

    of music - once a track contains trumpet sounds it will most likely be assigned to genres like `Jazz of audio files is their sound - there are, however, more ways of describing a song, for instance its lyrics, which describe songs in terms of content words. Lyrics of music may be orthogonal to its sound

  2. Comparing audio descriptors for singing voice detection in music audio files

    Microsoft Academic Search

    Martín Rocamora; Perfecto Herrera; Julio Herrera

    Given the relevance of the singing voice in popular western music, a system able to reliably identify those portions of a music audio file containing vocals would be very useful. In this work, we explore already used descriptors to perform this task and compare the performance of a statistical classifier using each kind of them, concluding that MFCC are

  3. Instructional Audio Guidelines: Four Design Principles to Consider for Every Instructional Audio Design Effort

    ERIC Educational Resources Information Center

    Carter, Curtis W.

    2012-01-01

    This article contends that instructional designers and developers should attend to four particular design principles when creating instructional audio. Support for this view is presented by referencing the limited research that has been done in this area, and by indicating how and why each of the four principles is important to the design process.…

  4. Spoken Documents: Creating Searchable Archives from Continuous Audio

    Microsoft Academic Search

    Sean Colbath; Francis Kubala; Daben Liu; Amit Srivastava

    2000-01-01

    Current search technologies for audio rely on the cataloguer of the data to provide additional keywords or metadata to enable retrieval. This can lead to haphazard cataloging and misleading searches, and provides the end user with no summarization, editing, or information extraction capabilities. One obvious way to tackle this problem is to transcribe the speech-based audio using automatic speech recognition

  5. Implementation of Direct Sequence Spread Spectrum steganography on audio data

    Microsoft Academic Search

    Rizky M. Nugraha

    2011-01-01

    Image steganography has been widely developed, and many algorithms exist for it. Interest in using audio data as the cover object in steganography, by contrast, emerged later than for image data. This paper discusses the implementation of steganography in audio data using the Direct Sequence Spread Spectrum method. The spread spectrum method is often used to send hidden message
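
    A minimal sketch of the general direct-sequence spread-spectrum idea, assuming a textbook formulation rather than the paper's implementation: each message bit is multiplied by a block of pseudo-noise chips in {-1, +1} and added to the cover samples at low amplitude, and the receiver recovers the bit from the sign of the correlation with the same chips. The names `pn_sequence`, `embed`, and `extract`, the seed-as-key convention, and the parameters `chips_per_bit` and `alpha` are all hypothetical.

```python
import random


def pn_sequence(length, seed):
    """Pseudo-noise chips in {-1, +1}; the seed plays the role of a shared key."""
    rng = random.Random(seed)
    return [rng.choice((-1, 1)) for _ in range(length)]


def embed(cover, bits, seed, chips_per_bit=64, alpha=0.005):
    """Spread each bit over chips_per_bit samples and add it to the cover."""
    needed = len(bits) * chips_per_bit
    if len(cover) < needed:
        raise ValueError("cover signal too short for this message")
    pn = pn_sequence(needed, seed)
    stego = list(cover)
    for i in range(needed):
        symbol = 1 if bits[i // chips_per_bit] else -1
        stego[i] += alpha * symbol * pn[i]
    return stego


def extract(stego, n_bits, seed, chips_per_bit=64):
    """Despread: correlate each block with the same chips and take the sign."""
    pn = pn_sequence(n_bits * chips_per_bit, seed)
    bits = []
    for b in range(n_bits):
        start = b * chips_per_bit
        corr = sum(stego[start + j] * pn[start + j] for j in range(chips_per_bit))
        bits.append(1 if corr > 0 else 0)
    return bits
```

    On a real recording the cover itself contributes to the correlation and acts as noise, so recovery is only reliable when `alpha * chips_per_bit` is large relative to that contribution; raising either value trades audibility or capacity for robustness.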

  6. Audio-Assisted Video Browsing for DVD Recorders

    Microsoft Academic Search

    Ajay Divakaran; Isao Otsuka; Regunathan Radhakrishnan; Kazuhiko Nakane; Masaharu Ogawa

    2004-01-01

    We present an audio-assisted video browsing system for a Hard Disk Drive (HDD) enhanced DVD recorder. We focus on our sports highlights extraction based on audio classification. We have systematically established that sports highlights are indicated by the presence of audience reaction such as cheering, applause and the commentators' excited speech. That enables us to develop a common highlights extraction

  7. Making the Most of Audio. Technology in Language Learning Series.

    ERIC Educational Resources Information Center

    Barley, Anthony

    Prepared for practicing language teachers, this book's aim is to help them make the most of audio, a readily accessible resource. The book shows, with the help of numerous practical examples, how a range of language skills can be developed. Most examples are in French. Chapters cover the following information: (1) making the most of audio (e.g.,…

  8. Let Their Voices Be Heard! Building a Multicultural Audio Collection.

    ERIC Educational Resources Information Center

    Tucker, Judith Cook

    1992-01-01

    Discusses building a multicultural audio collection for a library. Gives some guidelines about selecting materials that really represent different cultures. Audio materials that are considered fall roughly into the categories of children's stories, didactic materials, oral histories, poetry and folktales, and music. The goal is an authentic…

  9. TV datacasting system: new opportunities in the audio subcarrier

    Microsoft Academic Search

    Lee Jin-Soo

    1999-01-01

    This paper describes a new TV datacasting system for digital data broadcast, referred to as S-Channel, using digital modulation over the TV first audio subcarrier in Korea. The system is designed to achieve high capacity communication rates over the TV channel. I examine research and development results pertaining to a new method for on-carrier TV datacasting using the main audio

  10. Underdetermined Instantaneous Audio Source Separation via Local Gaussian Modeling

    E-print Network

    Paris-Sud XI, Université de

    and inverting the STFT. For audio data, a common sparse prior such as the Laplacian [1], a mixture of Gaussians [2] or a generalized Gaussian [3], is usually assumed for all source coefficients. This model suffers ... Underdetermined Instantaneous Audio Source Separation via Local Gaussian Modeling. Emmanuel Vincent

  11. Alpha-Stable Distributions in Signal Processing of Audio Signals

    E-print Network

    Mosegaard, Klaus

    Alpha-Stable Distributions in Signal Processing of Audio Signals Preben Kidmose, Department of Mathematical Modelling, Section for Digital Signal Processing, Technical University of Denmark, Building 321 the applicability of stable distributions in audio processing, a classical problem from statistical signal

  12. Normalized Auditory Attention Levels for Automatic Audio Surveillance

    E-print Network

    Dupont, Stéphane

    on audio material consisting of security-relevant audio events (e.g., gun shot, glass breaking, woman security represents a major challenge for public authorities and a profitable market for private companies. By abnormal, we mean sudden and unexpected sounds (e.g., gun shot, glass breaking, woman scream, siren sound

  13. Redundant coding of simulated tactile key clicks with audio signals

    Microsoft Academic Search

    Hsiang-Yu Chen; Jaeyoung Park; Hong Z. Tan; Steve Dai

    2010-01-01

    The present study examined the efficacy of using audio cues for redundant coding of tactile key clicks simulated with a piezoelectric actuator. The tactile stimuli consisted of six raised cosine pulses at two levels of frequency and three levels of amplitude. An absolute identification experiment was conducted to measure the information transfers associated with the tactile-audio signal set. Results from

  14. AUDIO FINGERPRINTING: COMBINING COMPUTER VISION & DATA STREAM PROCESSING

    E-print Network

    Tomkins, Andrew

    AUDIO FINGERPRINTING: COMBINING COMPUTER VISION & DATA STREAM PROCESSING Shumeet Baluja & Michele a combination of computer-vision techniques and large-scale-data-stream processing algorithms to create compact capabilities for small snippets of audio that have been degraded in a variety of manners, including competing

  15. Concentrated Spectrogram of audio acoustic signals -a comparative study

    E-print Network

    Paris-Sud XI, Université de

    , France 405 The paper presents results of time-frequency analysis of audio acoustic discrete-time in the whole domain (in contrast to scalograms). Specifically for Short-time Fourier transform (STFT), shapes ... Concentrated Spectrogram of audio acoustic signals - a comparative study K. Czarnecki, M. Moszy

  16. MUSICAL GENRE CLASSIFICATION OF AUDIO DATA USING SOURCE SEPARATION TECHNIQUES

    Microsoft Academic Search

    P. S. Lampropoulou; A. S. Lampropoulos; G. A. Tsihrintzis

    We propose a two-step, audio feature-based musical genre classification methodology. First, we identify and separate the various musical instrument sources in the audio signal, using the convolutive sparse coding algorithm. Next, we extract classification features from the separated signals that correspond to distinct musical instrument sources. The methodology is evaluated and its performance is assessed.

  17. Blind separation of audio sources using modal decomposition

    Microsoft Academic Search

    A. Aissa-El-Bey; K. Abed-Meraim; Y. Grenier

    2005-01-01

    This paper introduces new algorithms for the blind separation of audio sources using modal decomposition. Indeed, audio signals and, in particular, musical signals can be well approximated by a sum of damped sinusoidal (modal) components. Based on this representation, we propose a two-step approach consisting of a signal analysis (extraction of the modal components) followed by a signal synthesis (pairing

  18. Further experiments on audio-visual speech source separation

    Microsoft Academic Search

    David Sodoyer; Laurent Girin; Christian Jutten; Jean-Luc Schwartz

    Looking at the speaker's face seems useful to better hear a speech signal and extract it from competing sources before identification. This might result in elaborating new speech enhancement or extraction techniques exploiting the audio-visual coherence of speech stimuli. In this paper, we present a set of experiments on a novel algorithm plugging audio-visual coherence estimated by statistical tools, on

  19. Developing an audio-visual speech source separation algorithm

    Microsoft Academic Search

    David Sodoyer; Laurent Girin; Christian Jutten; Jean-luc Schwartz

    2004-01-01

    Looking at the speaker's face is useful to better hear a speech signal and extract it from competing sources before identification. This might result in elaborating new speech enhancement or extraction techniques exploiting the audio-visual coherence of speech stimuli. In this paper, a novel algorithm plugging audio-visual coherence estimated by statistical tools on classical blind source separation algorithms

  20. Effects of Audio Coding on ICA Performance: an Experimental Study

    E-print Network

    Boyer, Edmond

    files. Blind Source Separation (BSS), and in particular Independent Component Analysis (ICA), have a mixture of the decompressed audio sources provided by the other users. This scenario also occurs in multi-track audio formats [9], where the original tracks (i.e., the sources in a BSS problem) are individually

  1. A fixed point solution for convolved audio source separation

    Microsoft Academic Search

    Nikolaos Mitianoudis; Mike Davies

    2001-01-01

    We examine the problem of blind audio source separation using independent component analysis (ICA). In order to separate audio sources recorded in a real recording environment, we need to model the mixing process as convolutional. Many methods have been introduced for separating convolved mixtures, the most successful of which require working in the frequency domain. This paper proposes a fixed-point

  2. A new wavelet based blind audio source separation using Kurtosis

    Microsoft Academic Search

    M. R. Mirarab; M. A. Sobhani; A. A. Nasiri

    2010-01-01

    We consider the problem of blind audio source separation. A method to solve this problem is blind source separation (BSS) using independent component analysis (ICA). ICA exploits the non-Gaussianity of source in the mixtures. In this paper we propose a new wavelet based ICA method using Kurtosis for blind audio source separation. In this method, the observations are transformed into
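
    The non-Gaussianity measure mentioned above can be illustrated with a small sketch. It only computes excess kurtosis and shows that mixing two sources pushes it toward the Gaussian value of zero, which is the effect kurtosis-based ICA tries to undo; it is not the wavelet-based method of the paper. The function name `excess_kurtosis` and the toy two-valued sources are assumptions made for the example.

```python
import math


def excess_kurtosis(samples):
    """Fourth standardized moment minus 3 (zero for a Gaussian).

    Assumes the samples are not all identical (non-zero variance).
    """
    n = len(samples)
    mean = sum(samples) / n
    m2 = sum((x - mean) ** 2 for x in samples) / n
    m4 = sum((x - mean) ** 4 for x in samples) / n
    return m4 / (m2 ** 2) - 3.0


# Two toy sources taking the values +1 and -1 in different patterns.
s1 = [1.0, -1.0, 1.0, -1.0] * 25
s2 = [1.0, 1.0, -1.0, -1.0] * 25

# An equal-weight, unit-variance mixture of the two sources.
mix = [(a + b) / math.sqrt(2) for a, b in zip(s1, s2)]
```

    Here each source has an excess kurtosis of -2, while the mixture is close to -1: the mixture is nearer to Gaussian than either source, so an unmixing direction can be sought by maximizing the absolute kurtosis of the output.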

  3. Beyond Podcasting: Creative Approaches to Designing Educational Audio

    ERIC Educational Resources Information Center

    Middleton, Andrew

    2009-01-01

    This paper discusses a university-wide pilot designed to encourage academics to creatively explore learner-centred applications for digital audio. Participation in the pilot was diverse in terms of technical competence, confidence and contextual requirements and there was little prior experience of working with digital audio. Many innovative…

  4. Effect of Audio vs. Video on Aural Discrimination of Vowels

    ERIC Educational Resources Information Center

    McCrocklin, Shannon

    2012-01-01

    Despite the growing use of media in the classroom, the effects of using audio versus video in pronunciation teaching have been largely ignored. To analyze the impact of the use of audio or video training on aural discrimination of vowels, 61 participants (all students at a large American university) took a pre-test followed by two training…

  5. Use of Audio Modification in Science Vocabulary Assessment

    ERIC Educational Resources Information Center

    Adiguzel, Tufan

    2011-01-01

    The purposes of this study were to examine the utilization of audio modification in vocabulary assessment in school subject areas, specifically in elementary science, and to present a web-based key vocabulary assessment tool for the elementary school level. Audio-recorded readings were used to replace independent student readings as the task…

  6. A Case Study on Audio Feedback with Geography Undergraduates

    ERIC Educational Resources Information Center

    Rodway-Dyer, Sue; Knight, Jasper; Dunne, Elizabeth

    2011-01-01

    Several small-scale studies have suggested that audio feedback can help students to reflect on their learning and to develop deep learning approaches that are associated with higher attainment in assessments. For this case study, Geography undergraduates were given audio feedback on a written essay assignment, alongside traditional written…

  7. Using Audio Books to Improve Reading and Academic Performance

    ERIC Educational Resources Information Center

    Montgomery, Joel R.

    2009-01-01

    This article highlights significant research about what below grade-level reading means in middle school classrooms and suggests a tested approach to improve reading comprehension levels significantly by using audio books. The use of these audio books can improve reading and academic performance for both English language learners (ELLs) and for…

  8. Exposing audio data to the web: an API and prototype

    Microsoft Academic Search

    David Humphrey; Corban Brook; Alistair MacDonald

    2010-01-01

    The HTML5 specification introduces the audio and video media elements, and with them the opportunity to change the way media is integrated on the web. The current HTML5 media API provides ways to play and get limited information about audio and video, but no way to programmatically access or create such media. In this paper we present an enhanced API

  9. Adaptive Wavelet Quantization Index Modulation Technique for Audio Watermarking

    E-print Network

    Chang, Pao-Chi

    Adaptive Wavelet Quantization Index Modulation Technique for Audio Watermarking Jong-Tzy Wang 1 (QIM) that is often used in watermarking can provide a tradeoff between robustness and transparency. In this paper, we propose a robust audio watermarking technique which adopts the wavelet QIM method
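
    The robustness/transparency tradeoff of QIM can be seen in a basic scalar version, sketched here under stated assumptions: this is plain quantization index modulation on a single value, not the adaptive wavelet scheme of the paper, and the names `qim_embed`, `qim_decode`, and `step` are hypothetical. A bit is embedded by snapping the value to one of two interleaved grids; decoding picks the grid that lies closer.

```python
def qim_embed(x, bit, step=1.0):
    """Quantize x onto the grid for `bit`: multiples of step, shifted by step/2 for bit 1."""
    offset = step / 2 if bit else 0.0
    return round((x - offset) / step) * step + offset


def qim_decode(y, step=1.0):
    """Return the bit whose grid has the point nearest to y."""
    def distance(bit):
        return abs(y - qim_embed(y, bit, step))

    return 0 if distance(0) <= distance(1) else 1
```

    A larger `step` tolerates more noise before the decoder picks the wrong grid (perturbations below a quarter of the step are safe) but moves each coefficient further from its original value, up to half a step.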

  10. A Surveillance System based on Audio and Video Sensory Agents

    E-print Network

    Menegatti, Emanuele

    A Surveillance System based on Audio and Video Sensory Agents cooperating with a Mobile Robot for video-conferences, but not for surveillance purposes. Our approach is more similar to the one described with: Institute ISIB of CNR, Padua, Italy Abstract We present a surveillance system that uses audio

  11. Audio Podcasting in a Tablet PC-Enhanced Biochemistry Course

    ERIC Educational Resources Information Center

    Lyles, Heather; Robertson, Brian; Mangino, Michael; Cox, James R.

    2007-01-01

    This report describes the effects of making audio podcasts of all lectures in a large, basic biochemistry course promptly available to students. The audio podcasts complement a previously described approach in which a tablet PC is used to annotate PowerPoint slides with digital ink to produce electronic notes that can be archived. The fundamentals…

  12. Comparison between Munich and Gammachirp models in audio compression

    Microsoft Academic Search

    Khalil Abid; Kais Ouni

    2009-01-01

    In this paper we present a new design of a psychoacoustic model for audio coding following the model used in the standard MPEG-1 audio layer 3. This architecture is based on appropriate wavelet packet decomposition instead of a short term Fourier transformation. Its important characteristic is to propose an analysis of the frequency bands that come closer to the critical

  13. Control Mechanisms for Packet Audio in the Internet

    Microsoft Academic Search

    Jean-chrysostome Bolot; Andrés Vega-garcía

    1996-01-01

    The Internet provides a single class best effort service. From an application's point of view, this service amounts in practice to providing channels with time-varying characteristics such as delay and loss distributions. One way to support real time applications such as interactive audio given this service is to use control mechanisms that adapt the audio coding and decoding processes based

  14. Early investigations into subjective audio quality assessment using brainwave responses

    Microsoft Academic Search

    Charles D. Creusere; Srikant R. Siddenki; Joe Hardin; Jim Kroger

    2011-01-01

    In this work, we take the first steps towards quantifying changes in the perceived quality of audio by directly measuring human brainwave responses using a high-resolution electroencephalograph (EEG). Specifically, human subjects are presented with audio whose quality varies with time while being monitored by a 128-channel EEG; some of the time, they move a slider bar up and down to

  15. Conversation Clock: Visualizing audio patterns in co-located groups

    Microsoft Academic Search

    Tony Bergstrom; Karrie Karahalios

    2007-01-01

    Aural conversation is ephemeral by nature. The interaction history of conversation fades as the present moment demands the attention of participants. In this paper, we explore the nature of group interaction by augmenting aural conversation with a persistent visualization of audio input. This visualization, Conversation Clock, displays individual contribution via audio input and provides a corresponding social mirror over the

  16. Research on dynamic range control used to audio directional system

    Microsoft Academic Search

    Jicai Liang; Song Gao; Yi Li

    2011-01-01

    The dynamic range control (DRC) algorithm was applied in research on the audio directional system. According to the features of the audio directional system, the implementation of the static curve was modified, and a Simulink-based model of DRC was built. The calculation results demonstrate that DRC is an effective optimal method to enhance the main sound
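
    For orientation, a conventional hard-knee compressor static curve can be sketched as follows. This is the standard textbook form, assumed here for illustration; the modified curve and the Simulink model from the paper are not reproduced, and the names `static_curve_db`, `gain_db`, `threshold_db`, and `ratio` are hypothetical.

```python
def static_curve_db(level_db, threshold_db=-20.0, ratio=4.0):
    """Map an input level (dB) to an output level (dB).

    Below the threshold the level passes unchanged; above it, every `ratio` dB
    of input yields 1 dB of output.
    """
    if level_db <= threshold_db:
        return level_db
    return threshold_db + (level_db - threshold_db) / ratio


def gain_db(level_db, threshold_db=-20.0, ratio=4.0):
    """Gain (dB) the compressor applies at this input level (zero or negative)."""
    return static_curve_db(level_db, threshold_db, ratio) - level_db
```

    With the default values, an input at 0 dB is 20 dB over the threshold and comes out at -15 dB, i.e. with 15 dB of gain reduction.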

  17. MODIS: an audio motif discovery software Laurence Catanese1

    E-print Network

    Paris-Sud XI, Université de

    MODIS: an audio motif discovery software Laurence Catanese1 , Nathan Souviraà-Labastie2 , Bingqing will surely happen a third time." Paulo Coelho, The Alchemist (after an Arab proverb). Abstract MODIS1 material. MODIS is based on a generic approach to mine repeating audio sequences, with tolerance to motif

  18. Machine Learning

    NASA Astrophysics Data System (ADS)

    Hoffmann, Achim; Mahidadia, Ashesh

    The purpose of this chapter is to present fundamental ideas and techniques of machine learning suitable for the field of this book, i.e., for automated scientific discovery. The chapter focuses on those symbolic machine learning methods that produce results suitable to be interpreted and understood by humans. This is particularly important in the context of automated scientific discovery, as the scientific theories to be produced by machines are usually meant to be interpreted by humans. This chapter contains some of the most influential ideas and concepts in machine learning research to give the reader a basic insight into the field. After the introduction in Sect. 1, general ideas of how learning problems can be framed are given in Sect. 2. The section provides useful perspectives to better understand what learning algorithms actually do. Section 3 presents the Version space model, which is an early learning algorithm as well as a conceptual framework that provides important insight into the general mechanisms behind most learning algorithms. In Sect. 4, a family of learning algorithms, the AQ family for learning classification rules, is presented. The AQ family belongs to the early approaches in machine learning. Next, Sect. 5 presents the basic principles of decision tree learners. Decision tree learners belong to the most influential class of inductive learning algorithms today. A more recent group of learning systems, which learn relational concepts within the framework of logic programming, is presented in Sect. 6. This is a particularly interesting group of learning systems since the framework also allows background knowledge to be incorporated, which may assist in generalisation. Section 7 discusses Association Rules, a technique that comes from the related field of Data mining. Section 8 presents the basic idea of the Naive Bayesian Classifier. While this is a very popular learning technique, the learning result is not well suited for human comprehension as it is essentially a large collection of probability values. In Sect. 9, we present a generic method for improving the accuracy of a given learner by generating multiple classifiers using variations of the training data. While this works well in most cases, the resulting classifiers have significantly increased complexity and, hence, tend to destroy the human readability of the learning result that a single learner may produce. Section 10 contains a summary, briefly mentions other techniques not discussed in this chapter, and presents an outlook on the potential of machine learning in the future.

  19. Audio and Podcasts: The Poetry Foundation

    NSDL National Science Digital Library

    The Poetry Foundation has a myriad of wonderful resources for the lover of quatrains, hyperbole, or iambic pentameter. This corner of its site houses audio and podcasts in one convenient locale. The Poetry Off the Shelf section contains recent conversations with poets Edward Hirsch, Nathaniel Mackey, Robert Duncan, and others. Moving on, the Poem of the Day features a number of lovely works, such as "Horseflies" and "I go back to May 1937." There are six other sections here, including the Poetry Radio Project and Avant-garde All the Time. Additionally, users can sign up to receive updates when new works are added to the site.

  20. ABC News: Video and Audio Newsclips

    NSDL National Science Digital Library

    ABC News has added a section of video and audio newsclips to its news service at the GO Network, InfoSeek Corporation's Internet portal. Users can see and listen to national headline news, such as a clip from Warren Beatty's speech at an awards dinner Wednesday night (sounding rather presidential). They can also search for additional video files using Videosearch, by Virage. Beatty as a search term turned up a clip about the Clinton family's summer vacation on Martha's Vineyard that included a mention of Beatty's presidential aspirations and opinions on the Democratic Party, but no additional pictures of Beatty.

  1. On Steganography in Lost Audio Packets

    E-print Network

    Mazurczyk, Wojciech; Szczypiorski, Krzysztof

    2011-01-01

    The paper presents a new hidden data insertion procedure, based on the estimated probability of the remaining time of the call, for the steganographic method called LACK (Lost Audio PaCKets steganography). LACK provides hidden communication for real-time services like Voice over IP. The analytical results presented in this paper concern the influence of LACK's hidden data insertion procedures on the method's impact on quality of voice transmission and its resistance to steganalysis. The proposed hidden data insertion procedure is also compared to the previous steganogram insertion approach based on estimated remaining average call duration.

  2. Layered indexing of home video based on audio signals

    NASA Astrophysics Data System (ADS)

    Ogawa, Tomomi; Aizawa, Kiyoharu

    2003-12-01

    In this paper, we propose home video indexing that uses audio information to detect events, with both a rules-based method and a GMM-based method. Although exclusive audio segmentation and classification is usually used, various sounds overlap in practice; in that case, audio in which various sounds overlap is expressed by a layered index labeling. With the rules-based method, low-level audio features are used to determine indexes, which are classified as speech, silence, music, and EVN (Environment Noise). The GMM-based method, which uses the same features as the rules-based method, also classifies audio into the four classes. Smoothing is applied in order to determine the index. We show experiments on a few home video data.

  3. Systematic acquisition of audio classes for elevator surveillance

    NASA Astrophysics Data System (ADS)

    Radhakrishnan, Regunathan; Divakaran, Ajay

    2005-03-01

    We present a systematic framework for arriving at audio classes for detection of crimes in elevators. We use a time series analysis framework to analyze the low-level features extracted from the audio of elevator surveillance content to perform an inlier/outlier based temporal segmentation. Since suspicious events in elevators are outliers in a background of usual events, such a segmentation helps bring out such events without any a priori knowledge. Then, by performing an automatic clustering on the detected outliers, we identify consistent patterns for which we can train supervised detectors. We apply the proposed framework to a collection of elevator surveillance audio data to systematically acquire audio classes such as banging, footsteps, non-neutral speech and normal speech. Based on the observation that the banging audio class and non-neutral speech class are indicative of suspicious events in the elevator data set, we are able to detect all of the suspicious activities without any misses.

  4. Simple Machines

    NSDL National Science Digital Library

    Wakild, Terri

    The goals for this introductory activity on Simple Machines are: - Generate scientific questions about the world based on observation - Design and conduct scientific investigations - Use tools and equipment appropriate to scientific investigations - Use sources of information in support of scientific investigation - Write and follow procedures in the form of step-by-step instructions, formulas, flow diagrams, and sketches - Show how common themes of science, mathematics, and technology apply in real-world contexts - Recognize the contributions made in science by cultures and individuals of diverse backgrounds - Design strategies for moving objects by application of forces, including the use of simple machines MERC Online Reviewer Comments: Good computer activities for under-represented students who want to pursue manufacturing education. Distance Learning is a plus.

  5. Simple Machines

    NSDL National Science Digital Library

    Miss Stewart

    2010-03-24

    Can you identify the six types of simple machines? 1. What do you know about Inclined Planes? Draw an example on your graphic organizer and state one fact. Inclined Plane 2. What do you know about levers? Draw an example on your graphic organizer and state one fact. Lever. 3. What do you know about pulleys? Draw an example on your graphic organizer and ...

  6. Mining machine

    SciTech Connect

    Mendola, C.F.

    1981-11-03

    A mining machine is disclosed in which a cutting drum undercuts a vein of coal and side relief cutters make vertical kerfs in the vein upwardly from the undercut. A chisel plate is forced into the coal vein and breaks loose the material above the undercut and between the side relief cuts. The coal falls into conveyors and is loaded into mine shuttle cars for removal from the mine. The side relief cutters and chisel assembly are progressively raised to extract higher levels of coal from the vein until the desired roof height has been reached. The tramming track assembly, which propels the machine, may be rotated 90° to permit extraction from the vein immediately adjacent the initial extraction. All power supplied near the working face of the vein is hydraulic to minimize the risk of fire or explosion, and a water spray system minimizes dust circulation. Hydraulic roof and floor jacks are provided to increase the stability of the mining machine when exceptionally hard material is encountered in the coal vein.

  7. COMPARISON OF MPEG-7 AUDIO SPECTRUM PROJECTION FEATURES AND MFCC APPLIED TO SPEAKER RECOGNITION, SOUND CLASSIFICATION AND AUDIO

    E-print Network

    Wichmann, Felix

    COMPARISON OF MPEG-7 AUDIO SPECTRUM PROJECTION FEATURES AND MFCC APPLIED TO SPEAKER RECOGNITION Audio Spectrum Projection (ASP) features for general sound recognition performance vs. well established adopted a feature extraction method based on the projection of a spectrum onto a low

  8. Modeling Perceptual Similarity of Audio Signals for Blind Source Separation Evaluation

    Microsoft Academic Search

    Brendan Fox; Andrew Sabin; Bryan Pardo; Alec Zopf

    2007-01-01

    Existing perceptual models of audio quality, such as PEAQ, were designed to measure audio codec performance and are not well suited to evaluation of audio source separation algorithms. The relationship of many other signal quality measures to human perception is not well established. We collected subjective human assessments of distortions encountered when separating audio sources from mixtures of

  9. A Classifier-Based Approach to Score-Guided Music Audio Source Christopher Raphael

    E-print Network

    Raphael, Christopher

    decomposition problem are deemed blind source separation, meaning that the audio is decomposed without explicit ... A Classifier-Based Approach to Score-Guided Music Audio Source Separation Christopher Raphael June on data from a commercial compact disc. 1 Introduction Audio source separation seeks to decompose an audio

  10. Scrolling Through Time: Improving Interfaces for Searching and Navigating Continuous Audio Timelines

    Microsoft Academic Search

    Eric Lee; Henning Kiel; Jan Borchers

    2006-01-01

    Abstract. Existing work has produced a variety of techniques to improve interfaces for navigating an audio timeline. These interfaces typically map user input to either a change in play rate or playback position. Audio feedback while scrolling at arbitrary rates can be provided by: skipping immediately to the new position in the audio; resampling the audio, which introduces pitch-shifts; time-

  11. Communicative Competence in Audio Classrooms: A Position Paper for the CADE 1991 Conference.

    ERIC Educational Resources Information Center

    Burge, Liz

    Classroom practitioners need to move their attention away from the technological and logistical competencies required for audio conferencing (AC) to the required communicative competencies in order to advance their skills in handling the psychodynamics of audio virtual classrooms which include audio alone and audio with graphics. While the…

  12. Digital audio forensics: a first practical evaluation on microphone and environment classification

    Microsoft Academic Search

    Christian Kraetzer; Andrea Oermann; Jana Dittmann; Andreas Lang

    2007-01-01

    In this paper a first approach for digital media forensics is presented to determine the used microphones and the environments of recorded digital audio samples by using known audio steganalysis features. Our first evaluation is based on a limited exemplary test set of 10 different audio reference signals recorded as mono audio data by four microphones in 10 different

  13. Learning about Simple Machines

    NSDL National Science Digital Library

    Mrs. Keller

    2010-01-17

    This activity is designed to help you learn about simple machines and to have fun doing so! First, use this website to learn background information on the basics of simple machines. Try the quiz! Simple Machines Learning Site Next, play a game that tests your ability to identify simple machines.... Edheads: Simple Machines Finally, view this video to see how students your age applied simple machines to do a cool task... Building Simple Machines: A Glass of Milk, Please ...

  14. Machine Design

    NSDL National Science Digital Library

    This website, the homepage of Machine Design.com, contains a variety of resources for engineers and technicians related to devices, components, design applications, products, and systems in the manufacturing technology sector. The site also features a CAD library, eBooks, audiovisual aids, webinars, whitepapers and a reference center. Some of the resources require a free login. The page offers an RSS feed to keep users up to date on new resources.

  15. Survey of compressed domain audio features and their expressiveness

    NASA Astrophysics Data System (ADS)

    Pfeiffer, Silvia; Vincent, Thomas

    2003-01-01

    We give an overview of existing audio analysis approaches in the compressed domain and incorporate them into a coherent formal structure. After examining the kinds of information accessible in an MPEG-1 compressed audio stream, we describe a coherent approach to determine features from them and report on a number of applications they enable. Most of them aim at creating an index to the audio stream by segmenting the stream into temporally coherent regions, which may be classified into pre-specified types of sounds such as music, speech, speakers, animal sounds, sound effects, or silence. Other applications centre around sound recognition such as gender, beat or speech recognition.

  16. Musical examination to bridge audio data and sheet music

    NASA Astrophysics Data System (ADS)

    Pan, Xunyu; Cross, Timothy J.; Xiao, Liangliang; Hei, Xiali

    2015-03-01

    The digitalization of audio is commonly implemented for the purpose of convenient storage and transmission of music and songs in today's digital age. Analyzing digital audio for an insightful look at a specific musical characteristic, however, can be quite challenging for various types of applications. Many existing musical analysis techniques can examine a particular piece of audio data. For example, the frequency of digital sound can be easily read and identified at a specific section in an audio file. Based on this information, we could determine the musical note being played at that instant, but what if you want to see a list of all the notes played in a song? While most existing methods help to provide information about a single piece of the audio data at a time, few of them can analyze the available audio file on a larger scale. The research conducted in this work considers how to further utilize the examination of audio data by storing more information from the original audio file. In practice, we develop a novel musical analysis system Musicians Aid to process musical representation and examination of audio data. Musicians Aid solves the previous problem by storing and analyzing the audio information as it reads it rather than tossing it aside. The system can provide professional musicians with an insightful look at the music they created and advance their understanding of their work. Amateur musicians could also benefit from using it solely for the purpose of obtaining feedback about a song they were attempting to play. By comparing our system's interpretation of traditional sheet music with their own playing, a musician could ensure what they played was correct. More specifically, the system could show them exactly where they went wrong and how to correct their mistakes. In addition, the application could be extended over the Internet to allow users to play music with one another and then review the audio data they produced. This would be particularly useful for teaching music lessons on the web. The developed system is evaluated with songs played with guitar, keyboard, violin, and other popular musical instruments (primarily electronic or stringed instruments). The Musicians Aid system is successful at both representing and analyzing audio data and it is also powerful in assisting individuals interested in learning and understanding music.
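
    The step the abstract mentions, going from a measured frequency to the note being played, can be sketched as follows. This is a minimal illustration and not the Musicians Aid implementation: it assumes twelve-tone equal temperament with A4 tuned to 440 Hz, and the function names are hypothetical.

```python
import math

# Pitch-class names within one octave, starting at C (assumed naming with sharps).
NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]


def frequency_to_note(freq_hz, a4_hz=440.0):
    """Return the nearest equal-tempered note name (e.g. 'A4') for a frequency."""
    if freq_hz <= 0:
        raise ValueError("frequency must be positive")
    # MIDI note number: A4 is 69, and each semitone is a factor of 2**(1/12).
    midi = round(69 + 12 * math.log2(freq_hz / a4_hz))
    octave = midi // 12 - 1
    return f"{NOTE_NAMES[midi % 12]}{octave}"


def notes_in_song(frequencies_hz):
    """Map a sequence of detected frequencies to a list of note names."""
    return [frequency_to_note(f) for f in frequencies_hz]
```

    Keeping the whole list returned by notes_in_song, rather than one note at a time, corresponds to the abstract's idea of storing the analysis for the full recording.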

  17. Course info Machine Learning

    E-print Network

    Shi, Qinfeng "Javen"

    Course info Machine Learning Real life problems Lecture 1: Machine Learning Problem Qinfeng (Javen) Shi 28 July 2014 Intro. to Stats. Machine Learning COMP SCI 4401/7401 Qinfeng (Javen) Shi Lecture 1: Machine Learning Problem Course info Machine Learning Real life problems Table of Contents I 1 Course

  18. Illinois State Museum: Audio-Video Barn

    NSDL National Science Digital Library

    If you want to explore the world of agriculture in Illinois, you should make a beeline for this interesting and thoughtful website. The Audio-Video Barn is a collaborative project designed by the Illinois State Museum, working in partnership with other local institutions and with funding from the Institute of Museum and Library Services. The project is a logical outgrowth of the Museum's "longstanding interest in human interactions with the natural world." So step right into the "barn" and listen to oral history interviews from the 1950s to the 1990s, locate interviews from a state-wide map, or look over the "User's Guide" for navigation tips. Visitors shouldn't miss the "Sit-Down Interviews" area, as they can just scan through photos and select an interviewee who looks interesting. To get started, visitors should check out some of the "Stories from the Barn", such as "My Father the Great Reader" and "Making Rails".

  19. Frequency dependent squeezed light at audio frequencies

    NASA Astrophysics Data System (ADS)

    Miller, John

    2015-04-01

    Following successful implementation in the previous generation of instruments, squeezed states of light represent a proven technology for the reduction of quantum noise in ground-based interferometric gravitational-wave detectors. As a result of lower noise and increased circulating power, the current generation of detectors places one further demand on this technique - that the orientation of the squeezed ellipse be rotated as a function of frequency. This extension allows previously negligible quantum radiation pressure noise to be mitigated in addition to quantum shot noise. I will present the results of an experiment which performs the appropriate rotation by reflecting the squeezed state from a detuned high-finesse optical cavity, demonstrating frequency dependent squeezing at audio frequencies for the first time and paving the way for broadband quantum noise reduction in Advanced LIGO. Further, I will indicate how a realistic implementation of this approach will impact Advanced LIGO both alone and in combination with other potential upgrades.

  20. A direct broadcast satellite-audio experiment

    NASA Technical Reports Server (NTRS)

    Vaisnys, Arvydas; Abbe, Brian; Motamedi, Masoud

    1992-01-01

    System studies have been carried out over the past three years at the Jet Propulsion Laboratory (JPL) on digital audio broadcasting (DAB) via satellite. The thrust of the work to date has been on designing power and bandwidth efficient systems capable of providing reliable service to fixed, mobile, and portable radios. It is very difficult to predict performance in an environment which produces random periods of signal blockage, such as encountered in mobile reception where a vehicle can quickly move from one type of terrain to another. For this reason, some signal blockage mitigation techniques were built into an experimental DAB system and a satellite experiment was conducted to obtain both qualitative and quantitative measures of performance in a range of reception environments. This paper presents results from the experiment and some conclusions on the effectiveness of these blockage mitigation techniques.

  1. The Fields Institute: Lecture Audio and Slides

    NSDL National Science Digital Library

    The Fields Institute for Research in Mathematical Sciences aims to "enhance mathematical activity in Canada by bringing together mathematicians from Canada and abroad, and by promoting contact and collaboration between professional mathematicians and the increasing numbers of users of mathematics." They support research in pure and applied mathematics, statistics and computer science, as well as collaborative projects between mathematicians and those applying mathematics in areas such as engineering, the physical and biological sciences, medicine, economics and finance, telecommunications and information systems. They offer this website with audio files and slides from events and lectures at the Fields Institute. The lectures, given by scientists from around the world, address such topics as Quantitative Finance, String Theory, Homological Algebra, Combinatorics, and much more. The files are organized by academic year and series title. In cases where the files are not available to download, they provide information on how to obtain the files.

  2. Philadelphia Museum of Art: Audio Tours

    NSDL National Science Digital Library

    Going to the Philadelphia Museum of Art and wandering around can be a great experience. But what if there were also some audio podcasts to enhance this experience? This site provides visitors access to short podcasts that can be used while in the museum, or just while sitting in front of one's computer screen. The podcasts are organized into thematic categories that include "Arms and Armor", "Modern and Contemporary Art", and "Constantine Tapestries". Many of the podcasts include digitized images of the object in question, along with information about its provenance and country of origin. It's easy to see how an assemblage of these podcasts could be organized for use by an art history class or someone who's just developing an interest about a certain aspect of art.

  3. Applying Spatial Audio to Human Interfaces: 25 Years of NASA Experience

    NASA Technical Reports Server (NTRS)

    Begault, Durand R.; Wenzel, Elizabeth M.; Godfrey, Martine; Miller, Joel D.; Anderson, Mark R.

    2010-01-01

    From the perspective of human factors engineering, the inclusion of spatial audio within a human-machine interface is advantageous from several perspectives. Demonstrated benefits include the ability to monitor multiple streams of speech and non-speech warning tones using a cocktail party advantage, and for aurally-guided visual search. Other potential benefits include the spatial coordination and interaction of multimodal events, and evaluation of new communication technologies and alerting systems using virtual simulation. Many of these technologies were developed at NASA Ames Research Center, beginning in 1985. This paper reviews examples and describes the advantages of spatial sound in NASA-related technologies, including space operations, aeronautics, and search and rescue. The work has involved hardware and software development as well as basic and applied research.

  4. Noise-Canceling Helmet Audio System

    NASA Technical Reports Server (NTRS)

    Seibert, Marc A.; Culotta, Anthony J.

    2007-01-01

    A prototype helmet audio system has been developed to improve voice communication for the wearer in a noisy environment. The system was originally intended to be used in a space suit, wherein noise generated by airflow of the spacesuit life-support system can make it difficult for remote listeners to understand the astronaut's speech and can interfere with the astronaut's attempt to issue vocal commands to a voice-controlled robot. The system could be adapted to terrestrial use in helmets of protective suits that are typically worn in noisy settings: examples include biohazard, fire, rescue, and diving suits. The system (see figure) includes an array of microphones and small loudspeakers mounted at fixed positions in a helmet, amplifiers and signal-routing circuitry, and a commercial digital signal processor (DSP). Notwithstanding the fixed positions of the microphones and loudspeakers, the system can accommodate itself to any normal motion of the wearer's head within the helmet. The system operates in conjunction with a radio transceiver. An audio signal arriving via the transceiver intended to be heard by the wearer is adjusted in volume and otherwise conditioned and sent to the loudspeakers. The wearer's speech is collected by the microphones, the outputs of which are logically combined (phased) so as to form a microphone-array directional sensitivity pattern that discriminates in favor of sounds coming from the vicinity of the wearer's mouth and against sounds coming from elsewhere. In the DSP, digitized samples of the microphone outputs are processed to filter out airflow noise and to eliminate feedback from the loudspeakers to the microphones. The resulting conditioned version of the wearer's speech signal is sent to the transceiver.
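
    One common way to "phase" microphone outputs into a directional pattern is delay-and-sum beamforming. The sketch below is an assumption about the general technique, not the helmet system's actual DSP code: it uses integer sample delays, and the function name and arguments are hypothetical.

```python
def delay_and_sum(channels, delays):
    """Align and average microphone channels (a delay-and-sum sketch).

    channels: list of equal-length sample lists, one per microphone.
    delays:   arrival delay of the target source at each microphone,
              in whole samples (assumed known from the fixed geometry).
    """
    if len(channels) != len(delays):
        raise ValueError("need exactly one delay per channel")
    n = len(channels[0])
    out = []
    for t in range(n):
        total = 0.0
        for samples, d in zip(channels, delays):
            idx = t + d  # advance each channel by its delay to align the target
            if 0 <= idx < n:
                total += samples[idx]
        out.append(total / len(channels))
    return out
```

    Sound arriving with the expected delays adds coherently, while sound from other directions is misaligned and partly averages out, which is the kind of discrimination the abstract describes.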

  5. Sony's Data Discman: A Look at These New Portable Information Machines and What They Mean for CD-ROM Developers.

    ERIC Educational Resources Information Center

    Bonime, Andrew

    1992-01-01

    Describes a portable CD-ROM machine intended for the mass market that provides access to searchable text, graphics, and audio through a user-friendly interface. Six search modes and other system features are reviewed, and electronic texts for the unit are introduced. A table compares features of the two available models. (NRP)

  6. 2002 SPS MACHINE STATISTICS

    E-print Network

    Desforges, B; CERN. Geneva. SPS and LHC Division

    2002-01-01

    2002 SPS MACHINE STATISTICS Fixed Target Periods with Protons (comments on machine operation, tables and diagrams, comparative tables and diagrams) Fixed Target Periods with Ions (comments on machine operation, tables and diagrams, comparative tables and diagrams)

  7. Mind & Machine

    NSDL National Science Digital Library

    Dunn, Ashley.

    Mind & Machine is a weekly column provided by Ashley Dunn for the New York Times Cybertimes that discusses topics related to computing, technology, and the Internet. Recent columns have addressed the topics of the development of Internet telephony, possible futures of user interfaces, the history of technology and standards, and the Internet as a vehicle for community. Articles are well written, opinionated, and thought provoking. Mr. Dunn is a freelance writer who has written for such papers as the New York Times, the Los Angeles Times, the Seattle Post-Intelligencer, and the South China Morning Post. Note that the site is available only upon registration and is free of charge only in the US.

  8. Development and Exploration of a Timbre Space Representation of Audio

    E-print Network

    Brewster, Stephen

    Development and Exploration of a Timbre Space Representation of Audio Craig Andrew Nicol Submitted Craig Andrew Nicol, 2005 Abstract Sound is an important part of the human experience and provides

  9. Direct broadcast satellite-audio, portable and mobile reception tradeoffs

    NASA Technical Reports Server (NTRS)

    Golshan, Nasser

    1992-01-01

    This paper reports on the findings of a systems tradeoffs study on direct broadcast satellite-radio (DBS-R). Based on emerging advanced subband and transform audio coding systems, four ranges of bit rates: 16-32 kbps, 48-64 kbps, 96-128 kbps and 196-256 kbps are identified for DBS-R. The corresponding grades of audio quality will be subjectively comparable to AM broadcasting, monophonic FM, stereophonic FM, and CD quality audio, respectively. The satellite EIRP's needed for mobile DBS-R reception in suburban areas are sufficient for portable reception in most single family houses when allowance is made for the higher G/T of portable table-top receivers. As an example, the variation of the space segment cost as a function of frequency, audio quality, coverage capacity, and beam size is explored for a typical DBS-R system.
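
    The correspondence between bit-rate range and audio grade given in this abstract can be written as a small lookup. The ranges and grade names are copied from the abstract as listed; the table and function names are hypothetical, and rates that fall between the stated ranges are deliberately left unclassified.

```python
# (low kbps, high kbps) -> subjectively comparable grade, as listed in the abstract.
DBS_R_GRADES = [
    ((16, 32), "AM broadcasting"),
    ((48, 64), "monophonic FM"),
    ((96, 128), "stereophonic FM"),
    ((196, 256), "CD quality"),
]


def audio_grade(bit_rate_kbps):
    """Return the grade for a bit rate, or None if it is outside every range."""
    for (low, high), grade in DBS_R_GRADES:
        if low <= bit_rate_kbps <= high:
            return grade
    return None
```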

  10. HGS Schedulers for Digital Audio Workstation like Applications

    E-print Network

    Poduval, Karthik Venugopal

    2014-08-31

    Digital Audio Workstation (DAW) applications are real-time applications that have special timing constraints. Hierarchical Group Scheduling (HGS) is a real-time scheduling framework that allows developers to implement custom ...

  11. An Audio-Tutorial Course for Nonmajor Biology Students

    ERIC Educational Resources Information Center

    Husband, David D.

    1973-01-01

    A partial solution to the problem of choosing what to teach from the vast amount of subject matter was achieved by combining the philosophies of the audio-tutorial approach and the concept-pak approach. (DF)

  12. AN ONLINE SYSTEM FOR AUTOMATIC ANNOTATION OF AUDIO DOCUMENTS

    Microsoft Academic Search

    Iain McCowan; Jitendra Ajmera; Darren Moore

    This article presents a system for automatic transcription of audio documents. The system includes online implementations of recent algorithms for audio segmentation, speech/non-speech classification, and speaker clustering, and integrates them with large vocabulary speech recognition systems for both English and French. We also propose a segment-based speech confidence score, and demonstrate that this correlates well with the correctness

  13. Sampling Function of Degree 2 for DVD-Audio

    Microsoft Academic Search

    Kazuo Toraichi; Koji Nakamura

    2003-01-01

    The authors have been studying Fluency Information Theory, which generalizes Shannon's sampling theorem, and its applications. Among the practical applications of this research, the Fluency DAC, developed as a digital-to-analog converter for CD audio, has received objective recognition, including the Golden Sound Award in 1988. Recently, DVD-Audio, which supports a maximum sampling rate of 192 kHz, has

  14. Spread-spectrum audio watermarking: requirements, applications, and limitations

    Microsoft Academic Search

    Darko Kirovski; Henrique Malvar

    2001-01-01

    Watermarking has been adopted as a technology of choice for many applications related to e-commerce of audio content. We present a brief summary of a set of spread-spectrum watermarking techniques for effective covert communication over an audio signal carrier. Watermark robustness is enabled using redundant spread-spectrum for prevention against de-synchronization attacks. We improve watermark inaudibility by detecting and not watermarking

  15. Blind Audio Source Separation Based on Independent Component Analysis

    Microsoft Academic Search

    Shoji Makino; Hiroshi Sawada; Shoko Araki

    2007-01-01

    This keynote talk describes a state-of-the-art method for the blind source separation (BSS) of convolutive mixtures of audio signals. Independent component analysis (ICA) is used as a major statistical tool for separating the mixtures. We provide examples to show how ICA criteria change as the number of audio sources increases. We then discuss a frequency-domain approach where simple instantaneous ICA

  16. A General Modular Framework for Audio Source Separation

    Microsoft Academic Search

    Alexey Ozerov; Emmanuel Vincent; Frédéric Bimbot

    2010-01-01

    Most audio source separation methods are developed for a particular scenario characterized by the number of sources and channels and the characteristics of the sources and the mixing process. In this paper we introduce a general modular audio source separation framework based on a library of flexible source models that enable the incorporation of prior knowledge about the characteristics

  17. Auditory-inspired sparse representation of audio signals

    Microsoft Academic Search

    Ramin Pichevar; Hossein Najaf-Zadeh; Louis Thibault; Hassan Lahdili

    2011-01-01

    This article deals with the generation of auditory-inspired spectro-temporal features aimed at audio coding. To do so, we first generate sparse audio representations we call spikegrams, using projections on gammatone/gammachirp kernels that generate neural spikes. Unlike Fourier-based representations, these representations are powerful at identifying auditory events, such as onsets, offsets, transients, and harmonic structures. We show that the introduction of

  18. The Future of Audio Reproduction - Technology - Formats - Applications

    Microsoft Academic Search

    Matthias Geier; Sascha Spors; Stefan Weinzierl

    2008-01-01

    The introduction of new techniques for audio reproduction such as binaural technology, Wave Field Synthesis and Higher Order Ambisonics is accompanied by a paradigm shift from channel-based to object-based transmission and storage of spatial audio. The separate coding of source signal and source location is not only more efficient considering the number of channels used for reproduction by large loudspeaker

  19. Audio Features Selection for Automatic Height Estimation from Speech

    Microsoft Academic Search

    Todor Ganchev; Iosif Mporas; Nikos Fakotakis

    2010-01-01

    Aiming at the automatic estimation of the height of a person from speech, we investigate the applicability of various subsets of speech features, which were formed on the basis of ranking the relevance and the individual quality of numerous audio features. Specifically, based on the relevance ranking of the large set of openSMILE audio descriptors, we performed selection of subsets

  20. Frequency of plumbing fixture use through audio sampling

    E-print Network

    Shea, Kevin Bruce

    2013-02-22

    FREQUENCY OF PLUMBING FIXTURE USE THROUGH AUDIO SAMPLING A Senior Honors Thesis By KEVIN BRUCE SHEA Submitted to the Office of Honors Programs & Academic Scholarships Texas A&M University In partial fulfillment of the requirements... of the UNIVERSITY UNDERGRADUATE RESEARCH FELLOWS April 2000 Group. Physical Sciences FREQUENCY OF PLUMBING FIXTURE USE THROUGH AUDIO SAMPLING A Senior Honors Thesis By KEVIN BRUCE SHEA Submitted to the Office of Honors Programs & Academic Scholarships...

  1. Frank Baumgarte Application of a physiological ear model to irrelevance reduction in audio coding AES 17th International conference on High Quality Audio Coding 1

    E-print Network

    Frank Baumgarte Application of a physiological ear model to irrelevance reduction in audio coding AES 17th International conference on High Quality Audio Coding 1 APPLICATION OF A PHYSIOLOGICAL EAR@tnt.uni-hannover.de A previously published physiological ear model is applied as perceptual model to an audio coder complying

  2. Machine musicianship

    NASA Astrophysics Data System (ADS)

    Rowe, Robert

    2002-05-01

    The training of musicians begins by teaching basic musical concepts, a collection of knowledge commonly known as musicianship. Computer programs designed to implement musical skills (e.g., to make sense of what they hear, perform music expressively, or compose convincing pieces) can similarly benefit from access to a fundamental level of musicianship. Recent research in music cognition, artificial intelligence, and music theory has produced a repertoire of techniques that can make the behavior of computer programs more musical. Many of these were presented in a recently published book/CD-ROM entitled Machine Musicianship. For use in interactive music systems, we are interested in those which are fast enough to run in real time and that need only make reference to the material as it appears in sequence. This talk will review several applications that are able to identify the tonal center of musical material during performance. Beyond this specific task, the design of real-time algorithmic listening through the concurrent operation of several connected analyzers is examined. The presentation includes discussion of a library of C++ objects that can be combined to perform interactive listening and a demonstration of their capability.

  3. Machine Shop Lathes.

    ERIC Educational Resources Information Center

    Dunn, James

    This guide, the second in a series of five machine shop curriculum manuals, was designed for use in machine shop courses in Oklahoma. The purpose of the manual is to equip students with basic knowledge and skills that will enable them to enter the machine trade at the machine-operator level. The curriculum is designed so that it can be used in…

  4. Minimally radiating sources for personal audio.

    PubMed

    Elliott, Stephen J; Cheer, Jordan; Murfet, Harry; Holland, Keith R

    2010-10-01

    In order to reduce annoyance from the audio output of personal devices, it is necessary to maintain the sound level at the user position while minimizing the levels elsewhere. If the dark zone, within which the sound is to be minimized, extends over the whole far field of the source, the problem reduces to that of minimizing the radiated sound power while maintaining the pressure level at the user position. It is shown analytically that the optimum two-source array then has a hypercardioid directivity and gives about 7 dB reduction in radiated sound power, compared with a monopole producing the same on-axis pressure. The performance of other linear arrays is studied using monopole simulations for the motivating example of a mobile phone. The trade-off is investigated between the performance in reducing radiated noise, and the electrical power required to drive the array for different numbers of elements. It is shown for both simulations and experiments conducted on a small array of loudspeakers under anechoic conditions, that both two and three element arrays provide a reasonable compromise between these competing requirements. The implementation of the two-source array in a coupled enclosure is also shown to reduce the electrical power requirements. PMID:20968345

  5. Personal audio with a planar bright zone.

    PubMed

    Coleman, Philip; Jackson, Philip J B; Olik, Marek; Pedersen, Jan Abildgaard

    2014-10-01

    Reproduction of multiple sound zones, in which personal audio programs may be consumed without the need for headphones, is an active topic in acoustical signal processing. Many approaches to sound zone reproduction do not consider control of the bright zone phase, which may lead to self-cancellation problems if the loudspeakers surround the zones. Conversely, control of the phase in a least-squares sense comes at a cost of decreased level difference between the zones and frequency range of cancellation. Single-zone approaches have considered plane wave reproduction by focusing the sound energy into a point in the wavenumber domain. In this article, a planar bright zone is reproduced via planarity control, which constrains the bright zone energy to impinge from a narrow range of angles via projection into a spatial domain. Simulation results using a circular array surrounding two zones show the method to produce superior contrast to the least-squares approach, and superior planarity to the contrast maximization approach. Practical performance measurements obtained in an acoustically treated room verify the conclusions drawn under free-field conditions. PMID:25324075

  6. Robustness evaluation of transactional audio watermarking systems

    NASA Astrophysics Data System (ADS)

    Neubauer, Christian; Steinebach, Martin; Siebenhaar, Frank; Pickel, Joerg

    2003-06-01

    Distribution via Internet is of increasing importance. Easy access, transmission and consumption of digitally represented music is very attractive to the consumer but led also directly to an increasing problem of illegal copying. To cope with this problem watermarking is a promising concept since it provides a useful mechanism to track illicit copies by persistently attaching property rights information to the material. Especially for online music distribution the use of so-called transaction watermarking, also denoted with the term bitstream watermarking, is beneficial since it offers the opportunity to embed watermarks directly into perceptually encoded material without the need of full decompression/compression. Besides the concept of bitstream watermarking, former publications presented the complexity, the audio quality and the detection performance. These results are now extended by an assessment of the robustness of such schemes. The detection performance before and after applying selected attacks is presented for MPEG-1/2 Layer 3 (MP3) and MPEG-2/4 AAC bitstream watermarking, contrasted to the performance of PCM spread spectrum watermarking.

  7. Role of Pausing in Text-to-Speech Synthesis for Simultaneous Interpretation Vivek Kumar Rangarajan Sridhar, John Chen, Srinivas Bangalore, Alistair Conkie

    E-print Network

    Fisher, Kathleen

    the scalability as well as eliminate the fatigue associated with prolonged human interpretation. However latency. Such an approach is often used by human interpreters in simultaneous interpretation translation technology has been gradually trying to reduce the dependence on human interpreters to improve

  8. Audio-guided audiovisual data segmentation, indexing, and retrieval

    NASA Astrophysics Data System (ADS)

    Zhang, Tong; Kuo, C.-C. Jay

    1998-12-01

    While current approaches for video segmentation and indexing are mostly focused on visual information, audio signals may actually play a primary role in video content parsing. In this paper, we present an approach for automatic segmentation, indexing, and retrieval of audiovisual data, based on audio content analysis. The accompanying audio signal of audiovisual data is first segmented and classified into basic types, i.e., speech, music, environmental sound, and silence. This coarse-level segmentation and indexing step is based upon morphological and statistical analysis of several short-term features of the audio signals. Then, environmental sounds are classified into finer classes, such as applause, explosions, bird sounds, etc. This fine-level classification and indexing step is based upon time-frequency analysis of audio signals and the use of the hidden Markov model as the classifier. On top of this archiving scheme, an audiovisual data retrieval system is proposed. Experimental results show that the proposed approach has an accuracy rate higher than 90 percent for the coarse-level classification, and higher than 85 percent for the fine-level classification. Examples of audiovisual data segmentation and retrieval are also provided.
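
    The abstract does not name the short-term features it uses. As an illustration only, the sketch below computes two features that are commonly used for this kind of coarse segmentation, short-time energy and zero-crossing rate, and applies a simple energy threshold to flag silence. The feature choice, frame sizes, threshold, and function names are all assumptions, not the authors' method.

```python
def frames(signal, frame_len, hop):
    """Split a sample list into fixed-length frames taken every `hop` samples."""
    return [signal[i:i + frame_len]
            for i in range(0, len(signal) - frame_len + 1, hop)]


def short_time_energy(frame):
    """Mean squared amplitude of one frame."""
    return sum(x * x for x in frame) / len(frame)


def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)


def label_silence(signal, frame_len=4, hop=4, energy_threshold=1e-4):
    """Label each frame 'silence' or 'sound' using an assumed energy threshold."""
    return ["silence" if short_time_energy(f) < energy_threshold else "sound"
            for f in frames(signal, frame_len, hop)]
```

    A fuller classifier would combine several such per-frame features before deciding between speech, music, and environmental sound; this sketch stops at the silence decision.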

  9. Audio-video feature correlation: faces and speech

    NASA Astrophysics Data System (ADS)

    Durand, Gwenael; Montacie, Claude; Caraty, Marie-Jose; Faudemay, Pascal

    1999-08-01

    This paper presents a study of the correlation of features automatically extracted from the audio stream and the video stream of audiovisual documents. In particular, we were interested in finding out whether speech analysis tools could be combined with face detection methods, and to what extent they should be combined. A generic audio signal partitioning algorithm was first used to detect Silence/Noise/Music/Speech segments in a full length movie. A generic object detection method was applied to the keyframes extracted from the movie in order to detect the presence or absence of faces. The correlation between the presence of a face in the keyframes and of the corresponding voice in the audio stream was studied. A third stream, which is the script of the movie, is warped on the speech channel in order to automatically label faces appearing in the keyframes with the name of the corresponding character. We naturally found that extracted audio and video features were related in many cases, and that significant benefits can be obtained from the joint use of audio and video analysis methods.

  10. Talker variability in audio-visual speech perception.

    PubMed

    Heald, Shannon L M; Nusbaum, Howard C

    2014-01-01

    A change in talker is a change in the context for the phonetic interpretation of acoustic patterns of speech. Different talkers have different mappings between acoustic patterns and phonetic categories and listeners need to adapt to these differences. Despite this complexity, listeners are adept at comprehending speech in multiple-talker contexts, albeit at a slight but measurable performance cost (e.g., slower recognition). So far, this talker variability cost has been demonstrated only in audio-only speech. Other research in single-talker contexts has shown, however, that when listeners are able to see a talker's face, speech recognition is improved under adverse listening (e.g., noise or distortion) conditions that can increase uncertainty in the mapping between acoustic patterns and phonetic categories. Does seeing a talker's face reduce the cost of word recognition in multiple-talker contexts? We used a speeded word-monitoring task in which listeners make quick judgments about target word recognition in single- and multiple-talker contexts. Results show faster recognition performance in single-talker conditions compared to multiple-talker conditions for both audio-only and audio-visual speech. However, recognition time in a multiple-talker context was slower in the audio-visual condition compared to the audio-only condition. These results suggest that seeing a talker's face during speech perception may slow recognition by increasing the importance of talker identification, signaling to the listener a change in talker has occurred. PMID:25076919

  11. 36 CFR 5.5 - Commercial filming, still photography, and audio recording.

    Code of Federal Regulations, 2014 CFR

    2014-07-01

    ...false Commercial filming, still photography, and audio recording. 5.5...5 Commercial filming, still photography, and audio recording. (a) Commercial filming and still photography activities are subject to...

  12. 16 CFR 307.8 - Requirements for disclosure in audiovisual and audio advertising.

    Code of Federal Regulations, 2010 CFR

    2010-01-01

    ...disclosure in audiovisual and audio advertising. 307.8 Section 307...HEALTH EDUCATION ACT OF 1986 Advertising Disclosures § 307.8 Requirements...disclosure in audiovisual and audio advertising. In the case of...

  13. Hybrid Quantum Cloning Machine

    E-print Network

    Satyabrata Adhikari; A. K. Pati; Indranil Chakrabarty; B. S. Choudhury

    2007-06-14

    In this work, we introduce a special kind of quantum cloning machine called Hybrid quantum cloning machine. The introduced Hybrid quantum cloning machine or transformation is nothing but a combination of pre-existing quantum cloning transformations. In this sense it creates its own identity in the field of quantum cloners. Hybrid quantum cloning machine can be of two types: (i) State dependent and (ii) State independent or Universal. We study here the above two types of Hybrid quantum cloning machines. Later we will show that the state dependent hybrid quantum-cloning machine can be applied on only four input states. We will also find in this paper another asymmetric universal quantum cloning machine constructed from the combination of optimal universal B-H quantum cloning machine and universal anti-cloning machine. The fidelities of the two outputs are different and their values lie in the neighborhood of $5/6$

  14. Efficient Query-by-Content Audio Retrieval by Locality Sensitive Hashing and Partial Sequence Comparison

    Microsoft Academic Search

    Yi Yu; Kazuki Joe; J. Stephen Downie

    2008-01-01

    This paper investigates suitable indexing techniques to enable efficient content-based audio retrieval in large acoustic databases. To make an index-based retrieval mechanism applicable to audio content, we investigate the design of Locality Sensitive Hashing (LSH) and the partial sequence comparison. We propose a fast and efficient audio retrieval framework of query-by-content and develop an audio retrieval system. Based on this
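
    The indexing idea named in this abstract, Locality Sensitive Hashing, can be illustrated with a minimal sketch. The abstract does not say which LSH family the authors use, so the random-hyperplane (sign of random projection) variant below is an assumption chosen for brevity, and the names make_hyperplanes, lsh_key, build_index, and query are hypothetical. Feature vectors that point in similar directions tend to receive the same bit string and therefore land in the same bucket.

```python
import random


def make_hyperplanes(num_bits, dim, seed=0):
    """Draw num_bits random hyperplane normals of dimension dim (assumed LSH family)."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(num_bits)]


def lsh_key(vector, hyperplanes):
    """Hash a feature vector to a bit string: one bit per hyperplane side."""
    bits = []
    for plane in hyperplanes:
        dot = sum(p * x for p, x in zip(plane, vector))
        bits.append("1" if dot >= 0 else "0")
    return "".join(bits)


def build_index(features, hyperplanes):
    """Group item names by hash key; features maps name -> feature vector."""
    index = {}
    for name, vec in features.items():
        index.setdefault(lsh_key(vec, hyperplanes), []).append(name)
    return index


def query(index, vector, hyperplanes):
    """Return the candidate items stored in the query vector's bucket."""
    return index.get(lsh_key(vector, hyperplanes), [])
```

    In a retrieval system of this kind, the bucket contents would only be candidates; a finer comparison (the abstract mentions partial sequence comparison) would then rank them.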

  15. Say What? The Role of Audio in Multimedia Video

    NASA Astrophysics Data System (ADS)

    Linder, C. A.; Holmes, R. M.

    2011-12-01

    Audio, including interviews, ambient sounds, and music, is a critical, yet often overlooked, part of an effective multimedia video. In February 2010, Linder joined scientists working on the Global Rivers Observatory Project for two weeks of intensive fieldwork in the Congo River watershed. The team's goal was to learn more about how climate change and deforestation are impacting the river system and coastal ocean. Using stills and video shot with a lightweight digital SLR outfit and audio recorded with a pocket-sized sound recorder, Linder documented the trials and triumphs of working in the heart of Africa. Using excerpts from the six-minute Congo multimedia video, this presentation will illustrate how to record and edit an engaging audio track. Topics include interview technique, collecting ambient sounds, choosing and using music, and editing it all together to educate and entertain the viewer.

  16. Virtual environment display for a 3D audio room simulation

    NASA Technical Reports Server (NTRS)

    Chapin, William L.; Foster, Scott H.

    1992-01-01

    The development of a virtual environment simulation system integrating a 3D acoustic audio model with an immersive 3D visual scene is discussed. The system complements the acoustic model and is specified to: allow the listener to freely move about the space, a room of manipulable size, shape, and audio character, while interactively relocating the sound sources; reinforce the listener's feeling of telepresence in the acoustical environment with visual and proprioceptive sensations; enhance the audio with the graphic and interactive components, rather than overwhelm or reduce it; and serve as a research testbed and technology transfer demonstration. The hardware/software design of two demonstration systems, one installed and one portable, is discussed through the development of four iterative configurations.

  17. Multi-channel spatialization systems for audio signals

    NASA Technical Reports Server (NTRS)

    Begault, Durand R. (inventor)

    1993-01-01

    Synthetic head related transfer functions (HRTF's) for imposing reprogrammable spatial cues to a plurality of audio input signals included, for example, in multiple narrow-band audio communications signals received simultaneously are generated and stored in interchangeable programmable read only memories (PROM's) which store both head related transfer function impulse response data and source positional information for a plurality of desired virtual source locations. The analog inputs of the audio signals are filtered and converted to digital signals from which synthetic head related transfer functions are generated in the form of linear phase finite impulse response filters. The outputs of the impulse response filters are subsequently reconverted to analog signals, filtered, mixed, and fed to a pair of headphones.
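
    The signal path described in this abstract (each input filtered by a pair of finite impulse response filters, then mixed to headphones) can be sketched as follows. This is a minimal illustration, not the patented system: the function names fir_filter and spatialize are hypothetical, the tap values passed in would have to come from measured or synthetic HRTF data, and the analog conversion and filtering stages are omitted.

```python
def fir_filter(signal, taps):
    """Direct-form FIR filtering: full linear convolution of signal with taps."""
    out = [0.0] * (len(signal) + len(taps) - 1)
    for n, x in enumerate(signal):
        for k, h in enumerate(taps):
            out[n + k] += x * h
    return out


def spatialize(sources):
    """Mix several sources into a left/right pair.

    sources is a list of (signal, left_taps, right_taps) tuples, where the
    two tap lists stand in for one virtual position's impulse responses.
    """
    length = max(len(s) + max(len(l), len(r)) - 1 for s, l, r in sources)
    left = [0.0] * length
    right = [0.0] * length
    for signal, left_taps, right_taps in sources:
        for i, v in enumerate(fir_filter(signal, left_taps)):
            left[i] += v
        for i, v in enumerate(fir_filter(signal, right_taps)):
            right[i] += v
    return left, right
```

    Swapping the tap pair for a source corresponds to moving it to a different virtual location, which is the role the abstract assigns to the interchangeable PROMs.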

  18. Generative and Discriminative Modeling toward Semantic Context Detection in Audio Tracks

    Microsoft Academic Search

    Wei-ta Chu; Wen-huang Cheng; Ja-ling Wu

    2005-01-01

    Semantic-level content analysis is a crucial issue to achieve efficient content retrieval and management. We propose a hierarchical approach that models the statistical characteristics of several audio events over a time series to accomplish semantic context detection. Two stages, including audio event and semantic context modeling/testing, are devised to bridge the semantic gap between physical audio features and semantic concepts.

  19. TWO ARE BETTER THAN ONE: WHEN AUDIO COMES TO THE RESCUE OF VIDEO Alberto Albiol

    E-print Network

    Torres, Luis

    in the video sequence, the identity m will be proposed and the recognition system will verify or deny [Fig. 1. System overview: video shot; image and audio streams; face detection & recognition; speaker recognition; person m face model; person m audio model; face confidence; audio confidence; accept/reject.] The rest of the paper

  20. Non-Uniform Speech/Audio Coding Exploiting Predictability of Temporal Evolution of Spectral

    E-print Network

    signals (Hilbert carriers) are demodulated and thresholding functions are applied in spectral domain/audio coding has been focused on high quality/low latency compression of wide-band audio signals. However, new, our approach uses relatively long temporal segments of audio signal in critical-band-sized sub

  1. Real-Time MCLT Audio Watermarking and Comparison of Several Whitening Methods in Receptor Side

    Microsoft Academic Search

    Jose Juan Garcia Hernandez; Mariko Nakano-miyatake; Héctor M. Pérez Meana

    2006-01-01

    A real-time audio watermarking scheme is proposed, where the strength of audio signal modifications is limited by the requirement of producing an output audio signal that is perceptually equal to the original one. The watermark embedding stage, based on a spread spectrum algorithm operating in the Modulated Complex Lapped Transform (MCLT) domain, inserts a watermark that is generated using a private

  2. The SPECIAL System: Self-Paced Education with Compressed Interactive Audio Learning.

    ERIC Educational Resources Information Center

    Harrigan, Kevin

    1995-01-01

    Describes a computer system (SPECIAL) that allows for the capture and playback of audio and slides from a lecture. SPECIAL provides random access to both audio and slides, and variable speed control of audio. Testing shows that learners prefer using faster speeds and that grades are higher for participants using this system rather than a textbook…

  3. Lossless and Perceptual Coding of Digital Audio Peter Noll, Tilman Liebchen

    E-print Network

    Wichmann, Felix

    for professional and customer applications. This paper will explain approaches to lossless and lossy compression coding will have to be lossless, with compression factors around two as will be shown shortly. For other for various digital audio schemes (stereophonic signals). 2 Lossless audio coding Lossless audio coding

  4. Software Architecture for Audio and Haptic Rendering Based on a Physical Model

    Microsoft Academic Search

    Hiroaki Yano; Hiroo Iwata

    This paper describes a software architecture solution for rendering audio and haptic sensation. With most existing 3D sound systems, it is difficult to generate appropriate sounds from virtual objects, depending on the material used and the location of the impact. We developed a method that we named AudioHaptics to generate audio and haptic sensation based on a physical model

  5. Generalities about NMF NMF for audio NDS Dynamical nonnegative matrix factorisation for

    E-print Network

    Combettes, Patrick Louis

    Generalities about NMF NMF for audio NDS Dynamical nonnegative matrix factorisation for audio´evotte (CNRS) Dynamical NMF Generalities about NMF NMF for audio NDS Concept of NMF Algorithms Outline NDS Concept of NMF Algorithms Nonnegative matrix factorization (NMF) Given a nonnegative matrix V

  6. A TENTATIVE TYPOLOGY OF AUDIO SOURCE SEPARATION TASKS Emmanuel Vincent Cédric Févotte Rémi Gribonval

    E-print Network

    Boyer, Edmond

    classification scheme. 1. INTRODUCTION Blind Audio Source Separation (BASS) has been a subject of intense work A TENTATIVE TYPOLOGY OF AUDIO SOURCE SEPARATION TASKS Emmanuel Vincent Cédric Févotte Rémi a preliminary step towards the construction of a global evaluation framework for Blind Audio Source Separation

  7. Blind Audio-Visual Source Separation based on Sparse Redundant Representations

    E-print Network

    Boyer, Edmond

    Blind Audio-Visual Source Separation based on Sparse Redundant Representations Anna Llagostera a novel method which is able to detect and separate audio-visual sources present in a scene. Our method to separate the audio signal in periods during which several sources are mixed. The proposed approach has been

  8. Adaptively robust blind audio signals separation by the minimum β-divergence method

    Microsoft Academic Search

    M. N. H. Mollah; S. Eguchi

    2007-01-01

    Recently, independent component analysis (ICA) is the most popular and promising statistical technique for blind audio source separation. This paper proposes the minimum beta-divergence based ICA as an adaptive robust audio source separation algorithm. This algorithm explores local structures of audio source signals in which the observed signals follow a mixture of several ICA models. The performance of this algorithm

  9. Single mixture audio sources separation using ISA technique in EMD domain

    Microsoft Academic Search

    Nawal El Hamdouni; Abdellah Adib

    2010-01-01

    This paper introduces a novel technique that is developed to separate the audio sources from a single mixture. Indeed, audio signals and, in particular, musical signals can be well approximated by a sum of damped sinusoidal (modal) components. Based on this representation, Empirical Mode Decomposition (EMD) is employed to extract Intrinsic Mode Functions (IMFs) for audio mixture signal. By applying

  10. A robust method to count and locate audio sources in a multichannel underdetermined mixture

    E-print Network

    Boyer, Edmond

    . Index Terms--Blind source separation, multichannel audio, delay estimation, sparse component analysis1 A robust method to count and locate audio sources in a multichannel underdetermined mixture Simon together, and what are the original source signals. In the context of audio sources, the measured signals

  11. Parametric Packet-Layer Model for Evaluation Audio Quality in Multimedia Streaming Services

    NASA Astrophysics Data System (ADS)

    Egi, Noritsugu; Hayashi, Takanori; Takahashi, Akira

    We propose a parametric packet-layer model for monitoring audio quality in multimedia streaming services such as Internet protocol television (IPTV). This model estimates audio quality of experience (QoE) on the basis of quality degradation due to coding and packet loss of an audio sequence. The input parameters of this model are audio bit rate, sampling rate, frame length, packet-loss frequency, and average burst length. Audio bit rate, packet-loss frequency, and average burst length are calculated from header information in received IP packets. For sampling rate, frame length, and audio codec type, the values or the names used in monitored services are input into this model directly. We performed a subjective listening test to examine the relationships between these input parameters and perceived audio quality. The codec used in this test was Advanced Audio Coding-Low Complexity (AAC-LC), which is one of the international standards for audio coding. On the basis of the test results, we developed an audio quality evaluation model. The verification results indicate that audio quality estimated by the proposed model has a high correlation with perceived audio quality.
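
    Two of the inputs listed in this abstract, packet-loss frequency and average burst length, are derived from packet headers. The sketch below shows one plausible way to compute such statistics from received sequence numbers. The abstract does not give the exact definitions, so treating "loss frequency" as the number of separate loss events, and the function name loss_statistics itself, are assumptions made for illustration.

```python
def loss_statistics(received_seq, first_seq, last_seq):
    """Summarize packet loss over the sequence-number range [first_seq, last_seq].

    Returns the fraction of packets lost, the number of loss events
    (runs of consecutive missing packets), and the mean run length.
    """
    received = set(received_seq)
    bursts = []  # length of each run of consecutive losses
    run = 0
    for seq in range(first_seq, last_seq + 1):
        if seq in received:
            if run:
                bursts.append(run)
                run = 0
        else:
            run += 1
    if run:
        bursts.append(run)

    total = last_seq - first_seq + 1
    return {
        "loss_ratio": sum(bursts) / total,
        "loss_events": len(bursts),
        "avg_burst_length": sum(bursts) / len(bursts) if bursts else 0.0,
    }
```

    For example, if packets 2, 3, 7, and 8 are missing out of 0 through 9, there are two loss events with an average burst length of two packets.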

  12. Development and Assessment of Web Courses That Use Streaming Audio and Video Technologies.

    ERIC Educational Resources Information Center

    Ingebritsen, Thomas S.; Flickinger, Kathleen

    Iowa State University, through a program called Project BIO (Biology Instructional Outreach), has been using RealAudio technology for about 2 years in college biology courses that are offered entirely via the World Wide Web. RealAudio is a type of streaming media technology that can be used to deliver audio content and a variety of other media…

  13. Responding Effectively to Composition Students: Comparing Student Perceptions of Written and Audio Feedback

    ERIC Educational Resources Information Center

    Bilbro, J.; Iluzada, C.; Clark, D. E.

    2013-01-01

    The authors compared student perceptions of audio and written feedback in order to assess what types of students may benefit from receiving audio feedback on their essays rather than written feedback. Many instructors previously have reported the advantages they see in audio feedback, but little quantitative research has been done on how the…

  14. Design and Usability Testing of an Audio Platform Game for Players with Visual Impairments

    ERIC Educational Resources Information Center

    Oren, Michael; Harding, Chris; Bonebright, Terri L.

    2008-01-01

    This article reports on the evaluation of a novel audio platform game that creates a spatial, interactive experience via audio cues. A pilot study with players with visual impairments, and usability testing comparing the visual and audio game versions using both sighted players and players with visual impairments, revealed that all the…

  15. Audio Use in E-Learning: What, Why, When, and How?

    ERIC Educational Resources Information Center

    Calandra, Brendan; Barron, Ann E.; Thompson-Sellers, Ingrid

    2008-01-01

    Decisions related to the implementation of audio in e-learning are perplexing for many instructional designers, and deciphering theory and principles related to audio use can be difficult for practitioners. Yet, as bandwidth on the Internet increases, digital audio is becoming more common in online courses. This article provides a review of…

  16. Hearing You Loud and Clear: Student Perspectives of Audio Feedback in Higher Education

    ERIC Educational Resources Information Center

    Gould, Jill; Day, Pat

    2013-01-01

    The use of audio feedback for students in a full-time community nursing degree course is appraised. The aim of this mixed methods study was to examine student views on audio feedback for written assignments. Questionnaires and a focus group were used to capture student opinion of this pilot project. The majority of students valued audio feedback…

  17. A Collaborative Interface for Multimodal Ink and Audio Documents Amit Regmi and Stephen M. Watt

    E-print Network

    Watt, Stephen M.

    A Collaborative Interface for Multimodal Ink and Audio Documents Amit Regmi and Stephen M. Watt and to archive multi-party communication sessions that involve audio and digital ink on a shared canvas be used to represent, to transmit, to record and to synchronize ink and audio channels. We find Ink

  18. INCORPORATING PRIOR KNOWLEDGE ON THE DIGITAL MEDIA CREATION PROCESS INTO AUDIO CLASSIFIERS

    E-print Network

    Richard, Gaël

    INCORPORATING PRIOR KNOWLEDGE ON THE DIGITAL MEDIA CREATION PROCESS INTO AUDIO CLASSIFIERS M an automatic audio classifier. In this paper, it is shown that the incorporation of prior knowledge of the digital media creation chain can clearly improve the robustness of the audio classifiers, which

  19. An Improved Psychoacoustic Model for Audio Coding Based on Wavelet Packet

    Microsoft Academic Search

    Samar Krimi; Kaïs Ouni; Noureddine Ellouze

    2007-01-01

    This paper describes a new design of a psychoacoustic model for audio coding following the model used in the standard MPEG-1 audio layer 3 using an appropriate wavelet packet decomposition of the speech/audio signal. The design of a psychoacoustic model is achieved by wavelet packet decomposition whose connections are selected in such a way that sub bands correspond to the

  20. A Gammatone-based Psychoacoustical Modeling Approach for Speech and Audio Coding

    Microsoft Academic Search

    Ghassan Charestan; Richard Heusdens; Steven van de Par

    2001-01-01

    We propose a new approach for modeling auditory masking based on gammatone filters for application areas including speech/audio coding and audio watermarking. Besides the use of gammatone filters, this model differs from existing audio coding psychoacoustical models (e.g., the ones used in MPEG), in taking into account the contribution of a range of filters in computing the distor-

  1. The Case for FEC-based Error Control for Packet Audio in the Internet

    Microsoft Academic Search

    Andrés Vega-García; Jean-Chrysostome Bolot

    1997-01-01

    We consider the problem of distributing real-time packet audio over networks such as the Internet which do not provide support for real-time applications. Experiments with such networks indicate that audio quality is mediocre in large part because of excessive audio packet losses. In this paper, we show using measurements over the Internet as well as analytic modeling that most loss periods involve a small number
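
    The general idea of forward error correction for packet audio can be shown with a minimal sketch. The abstract is truncated before it describes the authors' mechanism, so the scheme below is a generic XOR parity code, not the paper's method: one parity packet is sent per group of equal-length audio packets, and any single lost packet in the group can be rebuilt at the receiver. The names xor_bytes, make_parity, and recover are hypothetical.

```python
def xor_bytes(a, b):
    """Byte-wise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def make_parity(packets):
    """Build the parity packet for a group of equal-length packets."""
    parity = bytes(len(packets[0]))  # all zeros
    for packet in packets:
        parity = xor_bytes(parity, packet)
    return parity


def recover(packets, parity):
    """Rebuild a group in which at most one packet (marked None) was lost.

    If zero or more than one packet is missing, the group is returned
    unchanged, since single parity cannot repair multiple losses.
    """
    missing = [i for i, p in enumerate(packets) if p is None]
    if len(missing) != 1:
        return list(packets)
    rebuilt = parity
    for packet in packets:
        if packet is not None:
            rebuilt = xor_bytes(rebuilt, packet)
    result = list(packets)
    result[missing[0]] = rebuilt
    return result
```

    The cost of this design is one extra packet per group plus the delay of waiting for the group to arrive, which is the usual trade-off when repairing loss without retransmission.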

  2. Building ensembles of audio and lyrics features to improve musical genre classification

    Microsoft Academic Search

    Rudolf Mayer; Andreas Rauber

    2010-01-01

    Digital audio has become an almost ubiquitously spread medium, and for many consumers, digital audio is the major distribution and storage form of music. Numerous on-line music stores account for a growing share of record sales. The widespread adoption of digital audio on home computers and especially mobile devices, and numerous on-line music stores show the size of this

  3. Frequency-Based Coloring of the Audio Waveform Display Stephen V. Rice

    E-print Network

    Rice, Stephen V.

    Frequency-Based Coloring of the Audio Waveform Display Stephen V. Rice The University to represent the frequency content to make sounds more visible. Audio-editing systems are enhanced-track audio-editing system presents one waveform display for each track. However, frequency information

  4. Video-assisted segmentation of speech and audio track

    NASA Astrophysics Data System (ADS)

    Pandit, Medha; Yusoff, Yusseri; Kittler, Josef; Christmas, William J.; Chilton, E. H. S.

    1999-08-01

    Video database research is commonly concerned with the storage and retrieval of visual information involving sequence segmentation, shot representation and video clip retrieval. In multimedia applications, video sequences are usually accompanied by a sound track. The sound track contains potential cues to aid shot segmentation such as different speakers, background music, singing and distinctive sounds. These different acoustic categories can be modeled to allow for an effective database retrieval. In this paper, we address the problem of automatic segmentation of the audio track of multimedia material. This audio-based segmentation can be combined with video scene shot detection in order to achieve partitioning of the multimedia material into semantically significant segments.

  5. Three dimensional audio versus head down TCAS displays

    NASA Technical Reports Server (NTRS)

    Begault, Durand R.; Pittman, Marc T.

    1994-01-01

    The advantage of a head up auditory display was evaluated in an experiment designed to measure and compare the acquisition time for capturing visual targets under two conditions: Standard head down traffic collision avoidance system (TCAS) display, and three-dimensional (3-D) audio TCAS presentation. Ten commercial airline crews were tested under full mission simulation conditions at the NASA Ames Crew-Vehicle Systems Research Facility Advanced Concepts Flight Simulator. Scenario software generated targets corresponding to aircraft which activated a 3-D aural advisory or a TCAS advisory. Results showed a significant difference in target acquisition time between the two conditions, favoring the 3-D audio TCAS condition by 500 ms.

  6. LISP Machine Progress Report

    E-print Network

    Bawden, Alan

    1977-08-01

    This informal paper introduces the LISP Machine, describes the goals and current status of the project, and explicates some of the key ideas. It covers the LISP machine implementation, LISP as a system language, ...

  7. Infinite Time Turing Machines

    Microsoft Academic Search

    Joel David Hamkins

    2002-01-01

    Infinite time Turing machines extend the operation of ordinary Turing machines into transfinite ordinal time. By doing so, they provide a natural model of infinitary computability, a theoretical setting for the analysis of the power and limitations of supertask algorithms.

  8. What is Machine Learning? About the Course Example Machine Learning

    E-print Network

    Kjellström, Hedvig

    What is Machine Learning? About the Course Example Machine Learning DD2431 Örjan Ekeberg Oct-Dec, 2007 What is Machine Learning? About the Course Example 1 What is Machine Learning? Definition A Hypothetical Project What is Machine Learning? About the Course Example 1 What is Machine Learning

  9. What is Machine Learning? About the Course Example Machine Learning

    E-print Network

    Kjellström, Hedvig

    What is Machine Learning? About the Course Example Machine Learning DD2431 Örjan Ekeberg Oct-Dec, 2008 What is Machine Learning? About the Course Example 1 What is Machine Learning? Definition A Hypothetical Project What is Machine Learning? About the Course Example 1 What is Machine Learning? Definition

  10. What is Machine Learning? About the Course Example Machine Learning

    E-print Network

    Kjellström, Hedvig

    What is Machine Learning? About the Course Example Machine Learning DD2431 Örjan Ekeberg Oct-Dec, 2008 What is Machine Learning? About the Course Example 1 What is Machine Learning? Definition A Hypothetical Project What is Machine Learning? About the Course Example 1 What is Machine Learning

  11. What is Machine Learning? About the Course Example Machine Learning

    E-print Network

    Kjellström, Hedvig

    What is Machine Learning? About the Course Example Machine Learning DD2431 Örjan Ekeberg Oct-Dec, 2007 What is Machine Learning? About the Course Example 1 What is Machine Learning? Definition A Hypothetical Project What is Machine Learning? About the Course Example 1 What is Machine Learning? Definition

  12. Primary Masters in Machine Learning

    E-print Network

    Primary Masters in Machine Learning Student Handbook [...] Masters in Machine Learning [...] Machine Learning Journal Club [...] Introduction The field of machine learning is concerned with the question of how

  13. Stochastic Optimization for Machine Learning

    E-print Network

    Powell, Warren B.

    Stochastic Optimization for Machine Learning ICML 2010, Haifa, Israel Tutorial by Nati Srebro Descent: formulation, analysis and use in machine learning · Learn about extensions and generalizations, and their Machine Learning counterparts Main Goal: Machine Learning is Stochastic Optimization Outline

  14. Covering selfish machines

    Microsoft Academic Search

    Leah Epstein; Rob Van Stee

    2006-01-01

    We consider the machine covering problem for selfish related machines. For a constant number of machines, m, we show a monotone polynomial time approximation scheme (PTAS) with running time that is linear in the number of jobs. It uses a new technique for reducing the number of jobs while remaining close to the optimal solution. We also present an FPTAS

  15. Automated Slide Staining Machine

    PubMed Central

    Drew, W. Lawrence; Pedersen, Anders N.; Roy, Jacques J.

    1972-01-01

    A machine is described which can perform the Gram stain. Comparison of slides stained by machine versus hand revealed no difference in reproducibility or accuracy. In addition to providing clean, dry, uniformly stained slides, the machine saves 24 sec per slide when compared with a hand staining technique. PMID:4110426

  16. Review on ultrasonic machining

    Microsoft Academic Search

    T. B. Thoe; D. K. Aspinwall; M. L. H. Wise

    1998-01-01

    Ultrasonic machining is of particular interest for the cutting of non-conductive, brittle workpiece materials such as engineering ceramics. Unlike other non-traditional processes such as laser beam, and electrical discharge machining, etc., ultrasonic machining does not thermally damage the workpiece or appear to introduce significant levels of residual stress, which is important for the survival of brittle materials in service. The

  17. Find the Simple Machines

    NSDL National Science Digital Library

    2012-07-17

    This is a web activity about simple machines. Learners will explore a lawn mower and identify six different simple machines which work together to help make our lives easier. This is an excellent activity for exploring how simple machines, and science in general, apply to learners' everyday lives.

  18. Semantic context detection based on hierarchical audio models

    Microsoft Academic Search

    Wen-Huang Cheng; Wei-Ta Chu; Ja-Ling Wu

    2003-01-01

    Semantic context detection is one of the key techniques to facilitate efficient multimedia retrieval. Semantic context is a scene that completely represents a meaningful information segment to human beings. In this paper, we propose a novel hierarchical approach that models the statistical characteristics of several audio events, over a time series, to accomplish semantic context detection. The approach consists of

  19. Statistical audio watermarking algorithm based on perceptual analysis

    Microsoft Academic Search

    Xiaomei Quan; Hongbin Zhang

    2005-01-01

    In this paper, we describe a novel statistical audio watermarking scheme. Under the control of the masking thresholds, watermark is embedded adaptively and transparently in the perceptual significant portions in wavelet packet domain by a statistical method. Watermark detection can be done without access to the original signal. Experimental results show the proposed scheme can survive common signal manipulations and

  20. MIXPLORATION: Rethinking the Audio Mixer Interface Mark Cartwright

    E-print Network

    Pardo, Bryan

    for exploring the space of possible mixes of four audio tracks. In a user study with 24 participants, we compared the effectiveness of this interface to the traditional paradigm for exploring alternative mixes

  1. Auto-summarization of audio-video presentations

    Microsoft Academic Search

    Li-wei He; Elizabeth Sanocki; Anoop Gupta; Jonathan Grudin

    1999-01-01

    As streaming audio-video technology becomes widespread, there is a dramatic increase in the amount of multimedia content available on the net. Users face a new challenge: How to examine large amounts of multimedia content quickly. One technique that can enable quick overview of multimedia is video summaries; that is, a shorter version assembled by picking important segments from the original.

  2. Investigations of Noise in Audio Frequency Amplifiers Using Junction Transistors

    Microsoft Academic Search

    P. M. Bargellini; M. B. Herscher

    1955-01-01

    An investigation of noise from modern junction transistors in audio frequency amplifiers is presented. Different circuit configurations are examined and the effects on noise factor of the input termination and operating point are discussed. At least three distinct sources of noise corresponding to different physical phenomena contributing to total noise are identified. In modern junction transistors shot noise and thermal

  3. Audio and Video Extensions to Graphical User Interface Toolkits

    Microsoft Academic Search

    Rei Hamakawa; Hidekazu Sakagami; Jun Rekimoto

    1992-01-01

    This paper describes audio and video extensions to graphical user interface (GUI) toolkits for multimedia systems. These extensions are based on a new object-oriented model for handling multimedia data. The introduction of temporal glue, and a mechanism for constructing composite multimedia data hierarchically in the model, make it quite easy to edit and reuse composite multimedia data. Programs created on

  4. Adaptive delay estimation for low jitter audio over Internet

    Microsoft Academic Search

    Aman Kansal; Abhay Karandikar

    2001-01-01

    Real time voice applications typically produce uniformly spaced voice packets and faithful reconstruction demands that these be played out at the same intervals. Best effort packet networks, however, produce variable delays on different packets and the receiver is required to buffer the received packets before playout. Excessive buffering delays deteriorate the system performance for interactive audio and so intelligent algorithms
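
    The buffering problem described in this abstract is commonly handled by tracking a running estimate of network delay and its variation, then scheduling playout a safety margin above the estimate. The sketch below uses an exponentially weighted moving average of that kind. It is a generic estimator offered for illustration, not the algorithm proposed in the paper (the abstract is truncated before describing it), and the function name playout_delays and the default values of alpha and safety are assumptions.

```python
def playout_delays(network_delays, alpha=0.9, safety=4.0):
    """Suggest a playout delay for each packet from observed network delays.

    d tracks the smoothed delay, v tracks the smoothed absolute deviation,
    and the suggested playout delay is d + safety * v.
    """
    d = network_delays[0]  # initialize the estimate with the first sample
    v = 0.0
    suggested = []
    for n in network_delays:
        d = alpha * d + (1 - alpha) * n
        v = alpha * v + (1 - alpha) * abs(d - n)
        suggested.append(d + safety * v)
    return suggested
```

    A value of alpha close to 1 gives a smooth but slowly adapting estimate, while a smaller value reacts faster to delay spikes at the cost of more jitter in the playout point.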

  5. Distributed audio feature extraction for music Stuart Bray

    E-print Network

    Tzanetakis, George

    -frequency analysis techniques such as the short time Fourier transform, wavelets and auditory filterbanks. MoreoverDistributed audio feature extraction for music Stuart Bray Computer Science Department University of Victoria 3800 Finnerty Rd Victoria BC, Canada sbray@csc.uvic.ca George Tzanetakis Computer Science

  6. Reading's SLiCK with New Audio Texts and Strategies.

    ERIC Educational Resources Information Center

    Boyle, Elizabeth A.; Washburn, Shari Gallin; Rosenberg, Michael S.; Connelly, Vincent J.; Brinckerhoff, Loring C.; Banerjee, Manju

    2002-01-01

    This article discusses challenges for secondary students with disabilities and alternative instructional methods that teachers of students with poor reading skills can use to convey content information effectively and efficiently. The use of audio textbooks on CD-ROMs is emphasized and the SLiCK strategy is explained as a support for the CD-ROM.…

  7. Enhancing the performance of subband audio coders for speech signals

    Microsoft Academic Search

    Henrique S. Malvar

    1998-01-01

    Transform or subband audio coders can deliver high quality reconstruction at rates around two bits per sample. Most quantization strategies take into account masking properties of the human ear to make the quantization noise less noticeable. In this paper we describe a new coder in which we extend such quantization strategies by incorporating run-length and arithmetic encoders. They lead to

  8. Incorporation of biorthogonality into lapped transforms for audio compression

    Microsoft Academic Search

    Shiufun Cheung; Jae S. Lim

    1995-01-01

    Acoustic signal representations used in current audio coding algorithms can be improved by the incorporation of biorthogonality into Malvar's extended lapped transform (ELT). Biorthogonality allows more flexibility in the design of the analysis and synthesis windows by increasing the number of degrees of freedom. This paper examines this increase for two special cases and demonstrates the importance of the additional

  9. Scalable audio coding using the nonuniform modulated complex lapped transform

    Microsoft Academic Search

    A. S. Scheuble; Zixiang Xiong

    2001-01-01

    This paper introduces a scalable audio coder using the nonuniform modulated complex lapped transform (NMCLT), which is a new nonuniform oversampled filter bank with a better combination of time- and frequency-domain localization than previous designs. Masking functions for different critical Bark bands are first calculated directly from the NMCLT coefficients as perceptual weights and arithmetic coding is then used to

  10. Effective browsing of long audio recordings Camille Goudeseune

    E-print Network

    Hasegawa-Johnson, Mark

    .5 [Information Interfaces and Presentation]: Sound and Music Computing--signal analysis, synthesis, and process signal-based, like spectrograms, or model-based, like categorical classifiers. Unlike conventional audio.00. spectrograms transformed to reduce the visual salience of non-anomalous events [12], and output log likelihoods

  11. Characteristics of Streaming Audio and Video Stored on the Internet

    Microsoft Academic Search

    Mingzhe Li; Mark Claypool; Robert Kinicki; James Nichols

    The increasing power and connectivity of today's computers have spurred the growth in streaming audio and video available on the Internet through the Web. While there is substantial research characterizing the performance of streaming media and characterizing documents stored on the Internet, there have been few studies characterizing streaming audio and video stored on the Web.

  12. Subword-based spoken term detection in audio course lectures

    Microsoft Academic Search

    Richard C. Rose; Atta Norouzian; Aarthi Reddy; André Coy; Vishwa Gupta; Martin Karafiát

    2010-01-01

    This paper investigates spoken term detection (STD) from audio recordings of course lectures obtained from an existing media repository. STD is performed from word lattices generated offline using an automatic speech recognition (ASR) system configured from a meetings domain. An efficient STD approach is presented where lattice paths which are likely to contain search terms are identified and an efficient

  13. NWX NASA GSFC AUDIO CORE Moderator: Nancy Jones

    E-print Network

    NWX NASA GSFC AUDIO CORE Moderator: Nancy Jones 08-19-10/3:10 pm CT Confirmation # 4174779 Page 1 NWX NASA GSFC AUDI CORE Moderator: Nancy Jones August 19, 2010 3:10 pm CT Coordinator: Welcome the meeting over to Ms. Nancy Jones. Ma'am, you may begin. Nancy Jones: Thank you. Good afternoon. My name

  14. Low jitter audio range PLL with ultra low power dissipation

    Microsoft Academic Search

    Fu Luo; Godi Fischer

    2011-01-01

    This paper presents the design of an ultra low power Phase-Locked Loop (PLL) intended for applications in the extended audio range. The PLL is well suited for battery operated systems, where small size and low power operation are crucially important. The presented implementation is based on a current controlled relaxation oscillator, which creates a sawtooth output with a frequency range

  15. Permanent Collection Audio Tour Plants of the World

    E-print Network

    Westneat, Mark W.

    Messages From the Wilderness Nature Walk Lions of Tsavo Africa Inside Ancient Egypt Stanley Field Hall of Tsavo 8 9 10 11 12 13 14 Africa: Benin Bronzes Inside Ancient Egypt: Harwa* Elizabeth Hubert Malott Hall1010 Permanent Collection Audio Tour Plants of the World The Ancient Americas Northwest Coast

  16. Recognition of Instrument Timbres in Real Polytimbral Audio Recordings

    E-print Network

    Ras, Zbigniew W.

    , the proposed systems were validated mainly on audio data obtained through mixing of isolated sounds of musical to the computer's microphone; also, the user might be in mood to listen to jazz with solo trumpet, or classic music with sweet violin sound. More advanced person (a musician) might need scores for the piece

  17. Deep networks for audio event classification in soccer videos

    Microsoft Academic Search

    Lamberto Ballan; Alessio Bazzica; Marco Bertini; Alberto Del Bimbo; Giuseppe Serra

    2009-01-01

    This work presents a novel approach for the classification of audio concepts in broadcast soccer videos using a deep belief network (DBN), a probabilistic neural network with several hidden layers. A comparison with support vector machine (SVM) classifiers has been carried out, showing that our preliminary results are promisingly comparable to the state of the art.

  18. What Makes Preschoolers Listen to Narrative Audio Tapes?

    Microsoft Academic Search

    Peter Vorderer; Saskia Böcking; Christoph Klimmt; Ute Ritterfeld

    2006-01-01

    Most communication studies on children and media have focused solely on television. Other popular media products such as narrative audio tapes have been neglected. The present article addresses factors that influence preschoolers' selective exposure to these tapes. In line with past research, the emotional attractiveness of a story's protagonist and some formal design elements of the product are regarded as

  19. Low bit rate transparent audio compression using adapted wavelets

    Microsoft Academic Search

    Deepen Sinha; Ahmed H. Tewfik

    1993-01-01

    Describes a novel wavelet based audio synthesis and coding method. The method uses optimal adaptive wavelet selection and wavelet coefficients quantization procedures together with a dynamic dictionary approach. The adaptive wavelet transform selection and transform coefficient bit allocation procedures are designed to take advantage of the masking effect in human hearing. They minimize the number of bits required to represent

  20. Virtualized Audio as a Distributed Interactive Application Peter A. Dinda

    E-print Network

    Dinda, Peter A.

    participants, singers, musical instruments) from their native acoustical spaces, and insert them into a virtual acoustical space that is shared by a number of listeners. One example of virtualized audio would have become largely deaf to their shortcomings. However, as we attempt to add audio to new modes

  1. A survey of packet loss recovery techniques for streaming audio

    Microsoft Academic Search

    Colin Perkins; Orion Hodson; Vicky Hardman

    1998-01-01

    We survey a number of packet loss recovery techniques for streaming audio applications operating using IP multicast. We begin with a discussion of the loss and delay characteristics of an IP multicast channel, and from this show the need for packet loss recovery. Recovery techniques may be divided into two classes: sender- and receiver-based. We compare and contrast several sender-based
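
    To make the sender-based class concrete, here is a minimal sketch of one common forward error correction scheme, a single XOR parity packet per group. It is an illustration only and is not taken from the survey. The helper names `xor_parity` and `recover_missing` are hypothetical, and equal-length packets are assumed.

```python
def xor_parity(packets):
    """Build one parity packet as the byte-wise XOR of equal-length packets."""
    parity = bytearray(len(packets[0]))
    for packet in packets:
        for i, byte in enumerate(packet):
            parity[i] ^= byte
    return bytes(parity)


def recover_missing(received, parity):
    """Rebuild the single lost packet (marked None) from the parity packet."""
    missing = [i for i, packet in enumerate(received) if packet is None]
    if len(missing) != 1:
        raise ValueError("XOR parity can repair exactly one loss per group")
    rebuilt = bytearray(parity)
    for packet in received:
        if packet is not None:
            for i, byte in enumerate(packet):
                rebuilt[i] ^= byte
    repaired = list(received)
    repaired[missing[0]] = bytes(rebuilt)
    return repaired
```

    The sender transmits the parity packet alongside the group; the receiver can then repair any one loss in that group without a retransmission, which is why schemes of this kind suit multicast audio.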

  2. AUDIO-DRIVEN HUMAN BODY MOTION ANALYSIS AND SYNTHESIS

    Microsoft Academic Search

    F. Ofli; C. Canton-Ferrer; J. Tilmanne; Y. Demir; E. Bozkurt; Y. Yemez; E. Erzin; A. M. Tekalp

    This paper presents a framework for audio-driven human body motion analysis and synthesis. We address the problem in the context of a dance performance, where gestures and movements of the dancer are mainly driven by a musical piece and characterized by the repetition of a set of dance figures. The system is trained in a supervised manner using

  3. Audio-driven human body motion analysis and synthesis

    Microsoft Academic Search

    Ferda Ofli; Cristian Canton-Ferrer; Joelle Tilmanne; Yasemin Demir; Elif Bozkurt; Yucel Yemez; Engin Erzin; A. Murat Tekalp

    2008-01-01

    This paper presents a framework for audio-driven human body motion analysis and synthesis. We address the problem in the context of a dance performance, where gestures and movements of the dancer are mainly driven by a musical piece and characterized by the repetition of a set of dance figures. The system is trained in a supervised manner using the

  4. Secure spread spectrum watermarking for images, audio and video

    Microsoft Academic Search

    I. J. Cox; J. Kilian; T. Leighton; T. Shamoon

    1996-01-01

    We describe a digital watermarking method for use in audio, image, video and multimedia data. We argue that a watermark must be placed in perceptually significant components of a signal if it is to be robust to common signal distortions and malicious attack. However, it is well known that modification of these components can lead to perceptual degradation of the

  5. Listening and Learning: Audio Cassettes at Deakin University.

    ERIC Educational Resources Information Center

    Gough, J. E.

    Student attitudes about using audio cassettes in the course "Images of Man" at Deakin University, Australia, were evaluated in 1979. A total of 192 off-campus and 39 on-campus students responded to a mail questionnaire. Responses indicate the following: students stopped cassettes to take a break and to replay sections; 70 percent listened to…

  6. AUDIO INFORMATION RETRIEVAL USING SEMANTIC SIMILARITY Luke Barrington1

    E-print Network

    Poon, Chung Keung

    with both semantic- and acoustic- based retrieval systems on a sound effects database and show query-by-example for content-based audio information retrieval by ranking items in a database based on semantic similarity, rather than acoustic similarity, to a query example. The retrieval system is based

  7. The Audio-Visual Marketing Handbook for Independent Schools.

    ERIC Educational Resources Information Center

    Griffith, Tom

    This how-to booklet offers specific advice on producing video or slide/tape programs for marketing independent schools. Five chapters present guidelines for various stages in the process: (1) Audio-Visual Marketing in Context (aesthetics and economics of audiovisual marketing); (2) A Question of Identity (identifying the audience and deciding on…

  8. Audio-Visual Communications, A Tool for the Professional

    ERIC Educational Resources Information Center

    Journal of Environmental Health, 1976

    1976-01-01

    The manner in which the Cuyahoga County, Ohio Department of Environmental Health utilizes audio-visual presentations for communication with business and industry, professional public health agencies and the general public is presented. Subjects including food sanitation, radiation protection and safety are described. (BT)

  9. Iterative monaural audio source separation for subspace grouping

    Microsoft Academic Search

    Martin Spiertz; Volker Gnann

    2009-01-01

    Monaural blind audio source separation usually separates a mixture into more signals than active sources. Therefore, a clustering of the separated signals is needed to reconstruct the sources. We propose a new iterative clustering and show that this approach outperforms classical clustering approaches which use features of the separated signals for clustering. The iterative clustering starts with the separation into

  10. A TENTATIVE TYPOLOGY OF AUDIO SOURCE SEPARATION TASKS

    Microsoft Academic Search

    Emmanuel Vincent; Cédric Févotte; Rémi Gribonval; Xavier Rodet; Éric Le Carpentier; Laurent Benaroya; Axel Röbel; Frédéric Bimbot

    2003-01-01

    We propose a preliminary step towards the construction of a global evaluation framework for Blind Audio Source Separation (BASS) algorithms. BASS covers many potential applications that involve a more restricted number of tasks. An algorithm may perform well on some tasks and poorly on others. Various factors affect the difficulty of each task and the criteria that should be

  11. Underdetermined audio source separation using fast parametric decomposition

    Microsoft Academic Search

    A. Aissa-El-Bey; K. Abed-Meraim; Y. Grenier

    2007-01-01

    In this paper, we consider the problem of underdetermined blind source separation using modal decomposition. Indeed, audio signals and, in particular, musical signals can be well approximated by a sum of damped sinusoidal (modal) components. Based on this representation, we propose a two steps approach consisting of a signal analysis (extraction of the modal components) followed by a signal synthesis

  12. UNDERDETERMINED BLIND SEPARATION OF AUDIO SOURCES IN TIME FREQUENCY DOMAIN

    Microsoft Academic Search

    K. Abed-Meraim; Y. Grenier

    This paper considers the blind separation of audio sources in the underdetermined case, where we have more sources than sensors. A recent algorithm applies time-frequency distributions (TFDs) to this problem and gives good separation performance in the case where sources are disjoint in the time-frequency (TF) plane. However, in the non-disjoint case, the reconstruction of the

  13. THE EFFECT OF SPEECH AND AUDIO COMPRESSION ON SPEECH RECOGNITION

    E-print Network

    Boyer, Edmond

    THE EFFECT OF SPEECH AND AUDIO COMPRESSION ON SPEECH RECOGNITION PERFORMANCE L. Besacier, C on the performance of our continuous speech recognition engine. GSM full rate, G711, G723.1 and MPEG coders are investigated. It is shown that MPEG transcoding degrades the speech recognition performance for low bitrates

  14. AUDIO SOURCE SEPARATION WITH ONE SENSOR FOR ROBUST SPEECH RECOGNITION

    E-print Network

    Paris-Sud XI, Université de

    AUDIO SOURCE SEPARATION WITH ONE SENSOR FOR ROBUST SPEECH RECOGNITION L. Benaroya, F. Bimbot, G of noise compensation in speech signals for robust speech recognition. Several classical denoising- perimposed to the voice of the speaker(s). While automatic speech recognition is a rather mature technology

  15. Perceptual Coding of High-Quality Digital Audio

    E-print Network

    Allen, Jont

    -bit-rate, high-quality audio coding formats number in many billions, built into portable media players, mobile psychoacoustic models. This technology is now abundant, with gadgets named after a standard (mp3 players for these systems is based on filterbanks, followed by quantization and coding, controlled by a model of human

  16. Social audio features for advanced music retrieval interfaces

    Microsoft Academic Search

    Michael Kuhn; Roger Wattenhofer; Samuel Welten

    2010-01-01

    The size of personal music collections has constantly increased over the past years. As a result, the traditional metadata based lists to browse these collections have reached their limits. Interfaces that are based on music similarity offer an alternative and thus are increasingly gaining attention. Music similarity is typically either derived from audio-features (objective approach) or from user driven information

  17. GpsTunes: controlling navigation via audio feedback

    Microsoft Academic Search

    Steven Strachan; Parisa Eslambolchilar; Roderick Murray-Smith; Stephen Hughes; Sile O'Modhrain

    2005-01-01

    We combine the functionality of a mobile Global Positioning System (GPS) with that of an MP3 player, implemented on a PocketPC, to produce a handheld system capable of guiding a user to their desired target location via continuously adapted music feedback. We illustrate how the approach to presentation of the audio display can benefit from insights from control theory, such

  18. A Robust Algorithm for Binaural Audio Reproduction Using Loudspeakers

    Microsoft Academic Search

    Wang Jie; Ye Qing-hua; Zheng Cheng-shi; Li Xiao-dong

    2010-01-01

    A crosstalk cancellation system (CCS) is a technique for spatial sound reproduction that uses two loudspeakers to deliver binaural audio signals to a listener's ears. The problem with this system is that it is quite sensitive to the listener's head movement. In this paper, we propose a multi-position weighted method, which results in an optimal design for the overall CCS and

  19. Mobile Audio - from MP3 to AAC and further

    Microsoft Academic Search

    Henri Autti; Johnny Biström

    The purpose of this paper is to evaluate the advanced audio codecs and reflect on their suitability for the mobile needs of today and tomorrow. The historical development of different codecs for different purposes is analyzed. The features of the most common codecs are discussed in parallel with performance and other criteria. The capabilities of mobile devices and the telecommunication possibilities

  20. Social Audio Features for Advanced Music Retrieval Michael Kuhn

    E-print Network

    Social Audio Features for Advanced Music Retrieval Interfaces Michael Kuhn Computer Engineering and Networks Laboratory ETH Zurich, Switzerland swelten@tik.ee.ethz.ch ABSTRACT The size of personal music acceptance. Categories and Subject Descriptors H.5.1 [Information Systems]: Multimedia Information Systems

  1. Audio-Visual Multimedia Retrieval on Mobile Iftikhar Ahmad1

    E-print Network

    Gabbouj, Moncef

    devices is a challenge. 8.1 Introduction The amount of personal digital information is increasing8 Audio-Visual Multimedia Retrieval on Mobile Devices Iftikhar Ahmad1 and Moncef Gabbouj2 1 Nokia of devices (hand held phones to personal computers). Mobile devices are not only limited in size, shape

  2. Audio-Visual Perception System for a Humanoid Robotic Head

    PubMed Central

    Viciana-Abad, Raquel; Marfil, Rebeca; Perez-Lorenzo, Jose M.; Bandera, Juan P.; Romero-Garces, Adrian; Reche-Lopez, Pedro

    2014-01-01

    One of the main issues within the field of social robotics is to endow robots with the ability to direct attention to people with whom they are interacting. Different approaches follow bio-inspired mechanisms, merging audio and visual cues to localize a person using multiple sensors. However, most of these fusion mechanisms have been used in fixed systems, such as those used in video-conference rooms, and thus, they may incur difficulties when constrained to the sensors with which a robot can be equipped. Besides, within the scope of interactive autonomous robots, there is a lack in terms of evaluating the benefits of audio-visual attention mechanisms, compared to only audio or visual approaches, in real scenarios. Most of the tests conducted have been within controlled environments, at short distances and/or with off-line performance measurements. With the goal of demonstrating the benefit of fusing sensory information with a Bayes inference for interactive robotics, this paper presents a system for localizing a person by processing visual and audio data. Moreover, the performance of this system is evaluated and compared via considering the technical limitations of unimodal systems. The experiments show the promise of the proposed approach for the proactive detection and tracking of speakers in a human-robot interactive framework. PMID:24878593

  3. Audio-Visual Affective Expression Recognition Through Multistream Fused HMM

    Microsoft Academic Search

    Zhihong Zeng; Jilin Tu; Brian M. Pianfetti Jr.

    2008-01-01

    Advances in computer processing power and emerging algorithms are allowing new ways of envisioning human-computer interaction. Although the benefit of audio-visual fusion is expected for affect recognition from the psychological and engineering perspectives, most existing approaches to automatic human affect analysis are unimodal: the information processed by the computer system is limited to either face images or speech signals. This

  4. SNR Based Audio Watermarking Scheme for Blind Detection

    Microsoft Academic Search

    M. R. Patil; S. D. Apte

    2008-01-01

    The audio watermarking method presented in this paper embeds the watermark data in the wavelet domain and performs blind detection of the watermark. To embed the watermark, an additive embedding approach is used in which the scaling parameter is computed based on the SNR (signal-to-noise ratio). The technique is implemented in the DWT domain. 32 × 32 binary image
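
    One way to read "scaling parameter computed based on the SNR" is to choose the scale so that an additive watermark yields a target SNR. The sketch below works under that assumption. The paper's exact formula is not given in this record, the function names are hypothetical, and the lists stand in for whatever coefficients are being marked (DWT coefficients in the paper).

```python
import math


def scale_for_target_snr(host, watermark, target_snr_db):
    """Scale alpha such that host + alpha * watermark has the target SNR (dB)."""
    host_energy = sum(x * x for x in host)
    mark_energy = sum(w * w for w in watermark)
    if mark_energy == 0:
        raise ValueError("watermark has no energy")
    # SNR = 10 log10(host_energy / (alpha^2 * mark_energy)), solved for alpha.
    return math.sqrt(host_energy / (mark_energy * 10 ** (target_snr_db / 10)))


def embed_additive(host, watermark, alpha):
    """Additive embedding: each coefficient gets alpha times the mark value."""
    return [x + alpha * w for x, w in zip(host, watermark)]


def measured_snr_db(host, marked):
    """SNR of the marked signal, treating the embedded mark as noise."""
    noise_energy = sum((m - x) ** 2 for x, m in zip(host, marked))
    return 10 * math.log10(sum(x * x for x in host) / noise_energy)
```

    A higher target SNR gives a smaller scale and a less audible but more fragile mark; a lower target does the opposite.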

  5. Adaptive audio watermarking for Indian musical signals by GOS modification

    Microsoft Academic Search

    Meenakshi R. Patil; S. D. Apte

    2009-01-01

    A spread spectrum audio watermarking technique is developed by modifying the group of samples (GOS) in the DCT domain. To check the robustness of the technique, the watermarked signal is passed through different signal processing attacks, and it is observed that the watermark remains present after all of them. The system is able to hide 1024 data bits in a clip of

  6. Integrated Spacesuit Audio System Enhances Speech Quality and Reduces Noise

    NASA Technical Reports Server (NTRS)

    Huang, Yiteng Arden; Chen, Jingdong; Chen, Shaoyan Sharyl

    2009-01-01

    A new approach has been proposed for increasing astronaut comfort and speech capture. Currently, the special design of a spacesuit forms an extreme acoustic environment making it difficult to capture clear speech without compromising comfort. The proposed Integrated Spacesuit Audio (ISA) system is to incorporate the microphones into the helmet and use software to extract voice signals from background noise.

  7. Audio-Described Educational Materials: Ugandan Teachers' Experiences

    ERIC Educational Resources Information Center

    Wormnaes, Siri; Sellaeg, Nina

    2013-01-01

    This article describes and discusses a qualitative, descriptive, and exploratory study of how 12 visually impaired teachers in Uganda experienced audio-described educational video material for teachers and student teachers. The study is based upon interviews with these teachers and observations while they were using the material either…

  8. Learning from Animated Concept Maps with Concurrent Audio Narration

    ERIC Educational Resources Information Center

    Nesbit, John C.; Adesope, Olusola O.

    2011-01-01

    An animated concept map is a presentation of a network diagram in which nodes and links are sequentially added or modified. An experiment compared learning from animated concept maps and text by randomly assigning 133 undergraduates to study 1 of 4 narrated animations presenting semantically equivalent information accompanied by identical audio

  9. Exploratory Evaluation of Audio Email Technology in Formative Assessment Feedback

    ERIC Educational Resources Information Center

    Macgregor, George; Spiers, Alex; Taylor, Chris

    2011-01-01

    Formative assessment generates feedback on students' performance, thereby accelerating and improving student learning. Anecdotal evidence gathered by a number of evaluations has hypothesised that audio feedback may be capable of enhancing student learning more than other approaches. In this paper we report on the preliminary findings of a…

  10. Audio and Video Reflections to Promote Social Justice

    ERIC Educational Resources Information Center

    Boske, Christa

    2011-01-01

    Purpose: The purpose of this paper is to examine how 15 graduate students enrolled in a US school leadership preparation program understand issues of social justice and equity through a reflective process utilizing audio and/or video software. Design/methodology/approach: The study is based on the tradition of grounded theory. The researcher…

  11. Digital Audio Broadcasting in the Short Wave Bands

    NASA Technical Reports Server (NTRS)

    Vaisnys, Arvydas

    1998-01-01

    For many decades the Short Wave broadcasting service has used high power, double-sideband AM signals to reach audiences far and wide. While audio quality was usually not very high, inexpensive receivers could be used to tune into broadcasts from distant countries.

  12. Developing a Framework for Effective Audio Feedback: A Case Study

    ERIC Educational Resources Information Center

    Hennessy, Claire; Forrester, Gillian

    2014-01-01

    The increase in the use of technology-enhanced learning in higher education has included a growing interest in new approaches to enhance the quality of feedback given to students. Audio feedback is one method that has become more popular, yet evaluating its role in feedback delivery is still an emerging area for research. This paper is based on a…

  13. An Audio Watermarking Method Based On Molecular Matching Pursuit

    E-print Network

    Paris-Sud XI, Université de

    An Audio Watermarking Method Based On Molecular Matching Pursuit Mathieu Parvaix1 , Sridhar (Sri introduce a new watermarking model combining a joint time frequency (TF) representation using the molecular a psychoacoustic model we can embed a watermark efficiently on the signal. By selecting atoms of TF components

  14. Recognition of blue movies by fusion of audio and video

    Microsoft Academic Search

    Haiqiang Zuo; Ou Wu; Weiming Hu; Bo Xu

    2008-01-01

    Along with the explosive growth of the Internet, comes the proliferation of pornography. Compared with the pornographic texts and images, blue movies can do much harm to children, due to the greater realism and voyeurism of blue movies. In this paper, a framework for recognizing blue movies by fusing the audio and video information is described. A one-class Gaussian mixture

  15. Dynomite: a dynamically organized ink and audio notebook

    Microsoft Academic Search

    Lynn D. Wilcox; Bill N. Schilit

    1997-01-01

    Dynomite is a portable electronic notebook for the capture and retrieval of handwritten and audio notes. The goal of Dynomite is to merge the organization, search, and data acquisition capabilities of a computer with the benefits of a paper-based notebook. Dynomite provides novel solutions in four key problem areas. First, Dynomite uses a casual, low cognitive overhead interface. Second, for

  16. Increasing the capacity of LSB-based audio steganography

    Microsoft Academic Search

    Nedeljko Cvejic; Tapio Seppänen

    2002-01-01

    Conventionally, a perceptual limit of three bits per sample is imposed on the basic LSB audio steganography method. In this paper, we present a novel modification to the standard LSB algorithm that is able to embed four bits per sample, thus improving the capacity of the data hiding channel by 33%. The proposed algorithm makes use of minimum error replacement method for
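
    The 33% figure follows from (4 - 3) / 3 ≈ 0.33; at an assumed 44.1 kHz mono sampling rate that is 176,400 instead of 132,300 hidden bits per second. The sketch below shows plain LSB replacement with k bits per sample for orientation only. It does not implement the minimum error replacement step of the paper, and `embed_lsb` and `extract_lsb` are hypothetical names.

```python
def embed_lsb(samples, bits, k):
    """Overwrite the k least significant bits of successive integer samples."""
    needed = -(-len(bits) // k)  # ceiling division
    if needed > len(samples):
        raise ValueError("not enough samples for the payload")
    mask = (1 << k) - 1
    stego = list(samples)
    for idx in range(needed):
        chunk = bits[idx * k:(idx + 1) * k]
        chunk = chunk + [0] * (k - len(chunk))  # zero-pad the final chunk
        value = 0
        for bit in chunk:
            value = (value << 1) | bit
        stego[idx] = (stego[idx] & ~mask) | value
    return stego


def extract_lsb(stego, n_bits, k):
    """Read back the first n_bits payload bits, k bits per sample."""
    mask = (1 << k) - 1
    bits = []
    for sample in stego:
        value = sample & mask
        bits.extend((value >> (k - 1 - j)) & 1 for j in range(k))
    return bits[:n_bits]
```

    With k = 4 each sample changes by at most 15 quantization steps, which is why going beyond three bits normally needs an extra error-reducing step such as the one the paper proposes.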

  17. A SPATIAL AUDIO USER INTERFACE FOR GENERATING MUSIC PLAYLISTS

    Microsoft Academic Search

    Jarmo Hiipakka; Gaëtan Lorho

    2003-01-01

    This paper presents a user interface (UI) designed for non-visual interaction with a music collection. The UI system can be utilized to navigate a large list of musical items organized in a hierarchical structure, and to generate personal playlists. The interface only relies on stereo audio output and tactile input using a limited number of keys. This interaction scheme enables

  18. A Probabilistic Model for Music Recommendation Considering Audio Features

    Microsoft Academic Search

    Qing Li; Sung-hyon Myaeng; Byeong Man Kim

    2005-01-01

    In order to make personalized recommendations, many collaborative music recommender systems (CMRS) focused on capturing precise similarities among users or items based on user historical ratings. Despite the valuable information from audio features of music itself, however, few studies have investigated how to directly extract and utilize information from music for personalized recommendation in CMRS. In this

  19. PATTERN EXTRACTION IN SPARSE REPRESENTATIONS WITH APPLICATION TO AUDIO CODING

    Microsoft Academic Search

    Ramin Pichevar; Hossein Najaf-Zadeh

    2009-01-01

    This article deals with the extraction of frequency-domain auditory objects in sparse representations. To do so, we first generate sparse audio representations we call spikegrams, based on neural spikes using gammatone/gammachirp kernels and matching pursuit. We then propose a method to extract frequent auditory objects (patterns) in the aforementioned sparse representations. The extracted frequency-domain patterns

  20. Enactive Mandala: Audio-visualizing Brain Waves Tomohiro Tokunaga

    E-print Network

    Lyons, Michael J.

    and animated visual music. Transparent real-time audio-visual feedback of brainwave qualities supports, this has largely reduced technical and financial barriers to getting involved in research on brainwaves a constructive approach to understanding brainwave data in the context of musical/artistic expression. Real

  1. On-demand soundscape generation using spatial audio mixing

    Microsoft Academic Search

    Satoshi Innami; Hiroyuki Kasai

    2011-01-01

    This paper proposes on-demand soundscape generation and provisioning that lets a user experience the real world of a requested remote place. Generation is achieved by spatial audio mixing that considers real-world conditions such as geographical features or townscapes, as well as dynamic situations such as town events or weather. The proposed velocity vector-based clustering can reduce the cost

  2. RESERVOIR COMPUTING: A POWERFUL BLACKBOX FRAMEWORK FOR NONLINEAR AUDIO PROCESSING

    Microsoft Academic Search

    Georg Holzmann

    2009-01-01

    This paper proposes reservoir computing as a general framework for nonlinear audio processing. Reservoir computing is a novel approach to recurrent neural network training with the advantage of a very simple and linear learning algorithm. It can in theory approximate arbitrary nonlinear dynamical systems with arbitrary precision, has an inherent temporal processing capability and is therefore well suited for many

  3. Investigating Electrophysiology for Measuring Emotions Triggered by Audio Stimuli

    E-print Network

    Paris-Sud XI, Université de

    Investigating Electrophysiology for Measuring Emotions Triggered by Audio Stimuli F. Mazza LUNAM.perreiradasilva,patrick.lecallet}@univ-nantes.fr Abstract--Multimedia quality evaluation recently started to take into account also analysis of emotional-assessed affective reports are commonly used for this purpose. Nevertheless, measuring emotions via physiological

  4. MULTIOBJECTIVES GENETIC SNAKES: APPLICATION ON AUDIO-VISUAL SPEECH RECOGNITION

    E-print Network

    Coello, Carlos A. Coello

    MULTIOBJECTIVES GENETIC SNAKES: APPLICATION ON AUDIO-VISUAL SPEECH RECOGNITION Renaud Séguier.Seguier@supelec.fr, Nicolas.Cladel@supelec.fr Abstract: We propose in this article a new optimization of Genetic Snakes (GS): Multiobjectives Genetics Snakes (MGS) faster and simpler to implement. They enable us to make converge two snakes

  5. Audio compression with non-uniform modulated complex lapped transform 

    E-print Network

    Scheuble, Anne-Sophie Maud

    2000-01-01

    that satisfy low bit rates as well as high fidelity reproduction have recently emerged. However, pre-echoes subsist when processing transient sounds. This thesis describes a new audio coder, whose principal objective is to alleviate the pre-echo problem...

  6. Evaluating preservation strategies for audio and video files

    E-print Network

    Rauber, Andreas

    Evaluating preservation strategies for audio and video files Carl Rauch1 , Franz Pavuza2 , Stephan and the wide range of preservation strategies make the choice for an optimal preservation solution a highly and on the evaluation of different input formats for long-term preservation. 1 Introduction The long-term preservation

  7. NAVA: Tying In to the Information Machine

    ERIC Educational Resources Information Center

    McIntyre, Joe

    1975-01-01

    The article describes the types of memberships and the services (conventions, publications, and workshops) of the National Audio-Visual Association (NAVA), a dealer organization, emphasizing their availability and importance to manufacturers and users of audio-visual equipment. (MS)

  8. Drilling for Weird Life

    NSDL National Science Digital Library

    Henry Bortman

    This magazine article introduces the Mars Analog Research and Technology Experiment (MARTE). Featuring an interview with NASA scientist Carol Stoker, the article describes Rio Tinto, a river in Spain with highly acidic water the color of red wine, and explains why scientists are looking to the subsurface pyrite deposits near this river's edge for signs of microbial life. Stoker describes the field site and discusses some of the research team's early results. This is the first of a four-part interview series. The resource includes images from Rio Tinto and the Mars project, links to related web sites, and an MP3 Audio Machine text-to-speech option.

  9. Coping with Contamination

    NSDL National Science Digital Library

    Bortman, Henry

    This magazine article features an interview with Mars Analog Research and Technology Experiment (MARTE) scientist Carol Stoker. In this third session of the four-part series, Stoker describes how the MARTE team avoids contaminating their drill-core samples. Her team is drilling into the pyrite subsurface of Spain's Rio Tinto in search of microbes existing in an iron-sulfur-based energy system, similar to that of Mars. She describes the technical challenges to be faced in the waterless environment of other-world drilling. The resource includes images from the Mars rover project, links to related web sites, and an MP3 Audio Machine text-to-speech option.

  10. Deutsch Durch Audio-Visuelle Methode: An Audio-Lingual-Oral Approach to the Teaching of German.

    ERIC Educational Resources Information Center

    Dickinson Public Schools, ND. Instructional Media Center.

    This teaching guide, designed to accompany Chilton's "Deutsch Durch Audio-Visuelle Methode" for German 1 and 2 in a three-year secondary school program, focuses major attention on the operational plan of the program and a student orientation unit. A section on teaching a unit discusses four phases: (1) presentation, (2) explanation, (3)…

  11. Transcript of Audio Narrative Portion of: Scandinavian Heritage. A Set of Five Audio-Visual Film Strip/Cassette Presentations.

    ERIC Educational Resources Information Center

    Anderson, Gerald D.; Olson, David B.

    The document presents the transcript of the audio narrative portion of approximately 100 interviews with first and second generation Scandinavian immigrants to the United States. The document is intended for use by secondary school classroom teachers as they develop and implement educational programs related to the Scandinavian heritage in…

  12. Audio-Visual Temporal Recalibration Can be Constrained by Content Cues Regardless of Spatial Overlap

    PubMed Central

    Roseboom, Warrick; Kawabe, Takahiro; Nishida, Shin’Ya

    2013-01-01

    It has now been well established that the point of subjective synchrony for audio and visual events can be shifted following exposure to asynchronous audio-visual presentations, an effect often referred to as temporal recalibration. Recently it was further demonstrated that it is possible to concurrently maintain two such recalibrated estimates of audio-visual temporal synchrony. However, it remains unclear precisely what defines a given audio-visual pair such that it is possible to maintain a temporal relationship distinct from other pairs. It has been suggested that spatial separation of the different audio-visual pairs is necessary to achieve multiple distinct audio-visual synchrony estimates. Here we investigated if this is necessarily true. Specifically, we examined whether it is possible to obtain two distinct temporal recalibrations for stimuli that differed only in featural content. Using both complex (audio visual speech; see Experiment 1) and simple stimuli (high and low pitch audio matched with either vertically or horizontally oriented Gabors; see Experiment 2) we found concurrent, and opposite, recalibrations despite there being no spatial difference in presentation location at any point throughout the experiment. This result supports the notion that the content of an audio-visual pair alone can be used to constrain distinct audio-visual synchrony estimates regardless of spatial overlap. PMID:23658549

  13. IEEE TRANSACTIONS ON SPEECH AND AUDIO PROCESSING, SPECIAL ISSUE ON DATA MINING OF SPEECH, AUDIO AND DIALOG 1 Mining Customer Care Dialogs for "Daily News"

    E-print Network

    Volinsky, Chris

    IEEE TRANSACTIONS ON SPEECH AND AUDIO PROCESSING, SPECIAL ISSUE ON DATA MINING OF SPEECH, AUDIO describes the "VoiceTone Daily News" data mining tool for analyzing this information and presenting turns. Index Terms: speech mining, data mining, spoken dialog systems, dialog success, business

  14. IEEE TRANSACTIONS ON SPEECH AND AUDIO PROCESSING, SPECIAL ISSUE ON DATA MINING OF SPEECH, AUDIO AND DIALOG 1 Mining Customer Care Dialogs for "Daily News"

    E-print Network

    Volinsky, Chris

    IEEE TRANSACTIONS ON SPEECH AND AUDIO PROCESSING, SPECIAL ISSUE ON DATA MINING OF SPEECH, AUDIO describes the "VoiceTone Daily News" data mining tool for analyzing this information and presenting turns. Index Terms: speech mining, data mining, spoken dialog systems, dialog success, business

  15. Machine tool locator

    DOEpatents

    Hanlon, John A. (Los Alamos, NM); Gill, Timothy J. (Stanley, NM)

    2001-01-01

    Machine tools can be accurately measured and positioned on manufacturing machines within very small tolerances by use of an autocollimator on a 3-axis mount on a manufacturing machine and positioned so as to focus on a reference tooling ball or a machine tool, a digital camera connected to the viewing end of the autocollimator, and a marker and measure generator for receiving digital images from the camera, then displaying or measuring distances between the projection reticle and the reference reticle on the monitoring screen, and relating the distances to the actual position of the autocollimator relative to the reference tooling ball. The images and measurements are used to set the position of the machine tool and to measure the size and shape of the machine tool tip, and examine cutting edge wear.

  16. Fault Tolerant State Machines

    NASA Technical Reports Server (NTRS)

    Burke, Gary R.; Taft, Stephanie

    2004-01-01

    State machines are commonly used to control sequential logic in FPGAs and ASICs. An errant state machine can cause considerable damage to the device it is controlling. For example, in space applications, the FPGA might be controlling Pyros, which when fired at the wrong time will cause a mission failure. Even a well designed state machine can be subject to random errors as a result of SEUs from the radiation environment in space. There are various ways to encode the states of a state machine, and the type of encoding makes a large difference in the susceptibility of the state machine to radiation. In this paper we compare 4 methods of state machine encoding and find which method gives the best fault tolerance, as well as determining the resources needed for each method.
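
    The effect of state encoding on upset susceptibility can be illustrated with a minimal sketch. The abstract does not name the four encodings it compares, so the two used here (plain binary and one-hot) are assumptions chosen for illustration, as are all function names. The sketch counts how many single-bit flips turn one valid state code into another valid code, which is the case a simple validity check cannot detect.

```python
def binary_codes(n_states):
    """Hypothetical helper: plain binary encoding, states numbered 0..n-1."""
    width = max(1, (n_states - 1).bit_length())
    return width, set(range(n_states))


def one_hot_codes(n_states):
    """Hypothetical helper: one-hot encoding, one register bit per state."""
    return n_states, {1 << s for s in range(n_states)}


def undetected_upset_fraction(width, codes):
    """Fraction of single-bit flips that land on another valid state code."""
    total = 0
    undetected = 0
    for code in codes:
        for bit in range(width):
            total += 1
            if (code ^ (1 << bit)) in codes:
                undetected += 1
    return undetected / total


if __name__ == "__main__":
    for name, make in (("binary", binary_codes), ("one-hot", one_hot_codes)):
        width, codes = make(4)
        print(name, width, undetected_upset_fraction(width, codes))
```

    Under these assumptions, every single-bit flip in a fully used binary encoding yields another valid state, while every single-bit flip in a one-hot encoding yields an invalid code, at the cost of a wider state register. This is only one simplified view of fault tolerance and is not the paper's own evaluation method.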

  17. Quantum Learning Machine

    E-print Network

    Jeongho Bang; James Lim; M. S. Kim; Jinhyoung Lee

    2008-03-31

    We propose a novel notion of a quantum learning machine for automatically controlling quantum coherence and for developing quantum algorithms. A quantum learning machine can be trained to learn a certain task with no a priori knowledge on its algorithm. As an example, it is demonstrated that the quantum learning machine learns Deutsch's task and finds itself a quantum algorithm, that is different from but equivalent to the original one.

  18. Bike machine energy education

    Microsoft Academic Search

    Sanford Jay Rotter; James Lee Ravenscroft; Raul Gonzalez

    2010-01-01

    A bike-powered machine offers physical kinesthetic learning and play. This is equivalent to past generation's playing with Erector Sets to building Heathkits [1], [2]. The excitement of riding a bike machine offers a different path to learning that turns a short-term memory into a long-term memory. Bike machines provide energy education to the public in order for them to understand

  19. Machine Learning and Agents

    Microsoft Academic Search

    Piotr Jędrzejowicz

    The paper reviews current research results integrating machine learning and agent technologies. Although complementary solutions from both fields are discussed, the focus is on using agent technology in the field of machine learning with a particular interest on applying agent-based solutions to supervised learning. The paper contains a short review of applications, in which machine learning methods have been used

  20. Perspex machine II: visualization

    NASA Astrophysics Data System (ADS)

    Anderson, James A. D. W.

    2004-12-01

    We review the perspex machine and improve it by reducing its halting conditions to one condition. We also introduce a data structure, called the "access column," that can accelerate a wide class of perspex programs. We show how the perspex can be visualised as a tetrahedron, artificial neuron, computer program, and as a geometrical transformation. We discuss the temporal properties of the perspex machine, dissolve the famous time travel paradox, and present a hypothetical time machine. Finally, we discuss some mental properties and show how the perspex machine solves the mind-body problem and, specifically, how it provides one physical explanation for the occurrence of paradigm shifts.

  1. Perspex machine II: visualization

    NASA Astrophysics Data System (ADS)

    Anderson, James A. D. W.

    2005-01-01

    We review the perspex machine and improve it by reducing its halting conditions to one condition. We also introduce a data structure, called the "access column," that can accelerate a wide class of perspex programs. We show how the perspex can be visualised as a tetrahedron, artificial neuron, computer program, and as a geometrical transformation. We discuss the temporal properties of the perspex machine, dissolve the famous time travel paradox, and present a hypothetical time machine. Finally, we discuss some mental properties and show how the perspex machine solves the mind-body problem and, specifically, how it provides one physical explanation for the occurrence of paradigm shifts.

  2. Chaotic Boltzmann machines.

    PubMed

    Suzuki, Hideyuki; Imura, Jun-ichi; Horio, Yoshihiko; Aihara, Kazuyuki

    2013-01-01

    The chaotic Boltzmann machine proposed in this paper is a chaotic pseudo-billiard system that works as a Boltzmann machine. Chaotic Boltzmann machines are shown numerically to have computing abilities comparable to conventional (stochastic) Boltzmann machines. Since no randomness is required, efficient hardware implementation is expected. Moreover, the ferromagnetic phase transition of the Ising model is shown to be characterised by the largest Lyapunov exponent of the proposed system. In general, a method to relate probabilistic models to nonlinear dynamics by derandomising Gibbs sampling is presented. PMID:23558425

  3. Parallel Kinematic Machines (PKM)

    SciTech Connect

    Henry, R.S.

    2000-03-17

    The purpose of this 3-year cooperative research project was to develop a parallel kinematic machining (PKM) capability for complex parts that normally require expensive multiple setups on conventional orthogonal machine tools. This non-conventional, non-orthogonal machining approach is based on a 6-axis positioning system commonly referred to as a hexapod. Sandia National Laboratories/New Mexico (SNL/NM) was the lead site responsible for a multitude of projects that defined the machining parameters and detailed the metrology of the hexapod. The role of the Kansas City Plant (KCP) in this project was limited to evaluating the application of this unique technology to production applications.

  4. On-Machine Acceptance

    SciTech Connect

    Arnold, K.F.

    2000-02-14

    Probing processes are used intermittently and not effectively as an on-line measurement device. This project was needed to evolve machine probing from merely a setup aid to an on-the-machine inspection system. Use of probing for on-machine inspection would significantly decrease cycle time by elimination of the need for first-piece inspection (at a remote location). Federal Manufacturing and Technologies (FM and T) had the manufacturing facility and the ability to integrate the system into production. The Contractor had a system that could optimize the machine tool to compensate for thermal growth and related error.

  5. Characterization of HF Propagation for Digital Audio Broadcasting

    NASA Technical Reports Server (NTRS)

    Vaisnys, Arvydas

    1997-01-01

    The purpose of this presentation is to give a brief overview of some propagation measurements in the Short Wave (3-30 MHz) bands, made in support of a digital audio transmission system design for the Voice of America. This task is a follow on to the Digital Broadcast Satellite Radio task, during which several mitigation techniques would be applicable to digital audio in the Short Wave bands as well, in spite of the differences in propagation impairments in these two bands. Two series of propagation measurements were made to quantify the range of impairments that could be expected. An assessment of the performance of a prototype version of the receiver was also made.

  6. Quantization Audio Watermarking with Optimal Scaling on Wavelet Coefficients

    E-print Network

    Chen, S.-T.; Tu, S.-Y.

    2011-01-01

    In recent years, the discrete wavelet transform (DWT) has provided a useful platform for digital information hiding and copyright protection. Many DWT-based algorithms for this aim have been proposed. The performance of these algorithms is assessed in terms of signal-to-noise ratio (SNR) and bit error rate (BER), which are used to measure the quality and the robustness of the embedded audio. However, there is a tradeoff between embedded-audio quality and robustness. This tradeoff is a signal processing problem in the wavelet domain. To solve this problem, this study presents an optimization-based scaling scheme using optimal multi-coefficients quantization in the wavelet domain. Firstly, the multi-coefficients quantization technique is rewritten as an equation with arbitrary scaling on DWT coefficients and set SNR to be a performance index. Then, a functional connecting the equation and the performance index is derived. Secondly, Lagrange Principle is used to obtain the optimal solution. Thirdly, the scal...

  7. Web-enabled 3D talking avatars based on WebGL and HTML5

    E-print Network

    Beskow, Jonas

    Web-enabled 3D talking avatars based on WebGL and HTML5 Jonas Beskow and Kalin Stefanov KTH Speech, in synchrony with text-to-speech synthesis, played back using HTML5 audio functionality. The implementation in the browser. Keywords: talking avatar, webGL, html5, text-to-speech 1 Introduction The web as a platform

  8. Hybrid machining of Inconel 718

    Microsoft Academic Search

    Z. Y. Wang; K. P. Rajurkar; J. Fan; S. Lei; Y. C. Shin; G. Petrescu

    2003-01-01

    A new approach for machining of Inconel 718 is presented in this paper. It combines traditional turning with cryogenically enhanced machining and plasma enhanced machining. Cryogenically enhanced machining is used to reduce the temperatures in the cutting tool, and thus reduces temperature-dependent tool wear to prolong tool life, whereas plasma enhanced machining is used to increase the temperatures in the

  9. High quality audio transform coding at 64 kbit/s

    Microsoft Academic Search

    Yannick Mahieux

    1992-01-01

    This paper presents a transform coding algorithm designed for audio coding at a bit rate of 64 kbit/s. It enables the transmission of a high quality stereo sound through the 2B channels of ISDN. Although a complete system including framing, synchronization and error correction has been developed, only the bit rate compression algorithm is described here. A detailed analysis of

  10. Coding Overcomplete Representations of Audio Using the MCLT

    Microsoft Academic Search

    Byung-jun Yoon; Henrique S. Malvar

    2008-01-01

    We propose a system for audio coding using the modulated complex lapped transform (MCLT). In general, it is difficult to encode signals using overcomplete representations without incurring a penalty in rate-distortion performance. We show that the penalty can be significantly reduced for MCLT-based representations, without the need for iterative methods of sparsity reduction. We achieve that via a magnitude-phase polar

  11. An active development environment for structured audio performance and composition

    Microsoft Academic Search

    M. Alchin

    2000-01-01

    Catnip Audio is a small group of musicians and coders from the UK, US, and who have joined to work on a new music program project titled ReTrack. ReTrack is a music entry system based initially on the trackers of the Internet music scene, but expanded to include a comprehensive set of MIDI functions as well as incorporate MPEG-4 Structured

  12. Audio-Assisted Memory Training with Early Alzheimer's Patients

    Microsoft Academic Search

    Sharon M. Arkin

    1993-01-01

    Two consecutive memory training interventions using an audio cassette recorder to present personally significant narrative information and interactive quizzes were administered to two early stage Alzheimer's patients (Folstein Mini-Mental State scores: 23). Training sessions were held 2-4 times a day for six consecutive days. Each 3-day cycle was preceded by a pre-test and followed by a post-test. The pre-test, post-test,

  13. A Distributed Real-Time MPEG Video Audio Player

    Microsoft Academic Search

    Shanwei Cen; Calton Pu; Richard Staehli; Crispin Cowan; Jonathan Walpole

    1995-01-01

    This paper presents the design, implementation and experimental analysis of a distributed, real-time MPEG video and audio player. The player is designed for use across the Internet, a shared environment with variable traffic and with great diversity in network bandwidth and host processing speed. We use a novel toolkit approach to build software feedback mechanisms for client/server synchronization, dynamic Quality-of-Service control, and system adaptiveness. Our

  14. Audio classification using acoustic images for retrieval from multimedia databases

    Microsoft Academic Search

    Ioannis Paraskevas; Edward Chilton

    2003-01-01

    With the increasing use of audio-visual databases, the need for automatic content-based classification has grown in importance. In this paper, a novel method for the automatic recognition of acoustic utterances is presented using acoustic images as the basis for the feature extraction. This method effectively employs the spectrogram, the Wigner-Ville distribution and co-occurrence matrices. The images are then compressed, using

  15. An Improved Technique for Blind Audio Source Separation

    Microsoft Academic Search

    Namgook Cho; Yu Shiu; C.-C. Jay Kuo

    2006-01-01

    A blind audio source separation technique with an ill-posed mixing matrix and additive noise is proposed in this work. With this technique, we divide the solution into two steps. The first step is to estimate the ill-posed mixing matrix and the second step is to separate original sources. To estimate the ill-posed mixing matrix, an enhanced soft-assignment method is used

  16. An Improved Technique for Blind Audio Source Separation

    Microsoft Academic Search

    Namgook Cho; Yu Shiu; C. C. Jay Kuo

    2006-01-01

    A blind audio source separation technique with an ill-posed mixing matrix and additive noise is proposed in this work. With this technique, we divide the solution into two steps. The first step is to estimate the ill-posed mixing matrix and the second step is to separate original sources. To estimate the ill-posed mixing matrix, an enhanced soft-assignment method is

  17. Open loop rate-distortion optimized audio coding

    Microsoft Academic Search

    Fredrik Nordén; Mads Græsbøll Christensen; S. H. Jensen

    2005-01-01

    The paper addresses complexity reduced rate-distortion optimized audio coding under rate constraint. A technique where distortion minimizing coding templates, chosen from a set of templates, are jointly selected for a set of segments. This optimization requires knowledge of rate-distortion pairs for all segments, and for each coding template, which is often costly to obtain. The proposed framework exchanges true rate-distortion

  18. An audio/video surveillance system for wildlife

    Microsoft Academic Search

    Roman Gula; Jörn Theuerkauf; Sophie Rouys; Andrew Legault

    2010-01-01

    We report 7 years of experience with an inexpensive and reliable continuous audio/video recording system. The main components of the system are commercial, infrared illuminator surveillance cameras, mini microphones and portable digital video recorders, powered by deep cycle lead-acid batteries. We used the system for monitoring 41 broods of four endemic bird species in tropical rainforests of New Caledonia. We recorded

  19. Fault-tolerant Ethernet networks with Audio and Video Bridging

    Microsoft Academic Search

    Oliver Kleineberg; Peter Frohlich; Donal Heffernan

    2011-01-01

    Industrial Ethernet networks are now a common feature on today’s factory floors. Vendor-specific technologies, such as Profinet IRT, have demonstrated Ethernet networks with hard real-time (RT) properties. Specified by the IEEE, the Audio and Video Bridging (AVB) technology promises a standardized approach to RT Ethernet. However, AVB has been conceived for other application fields, e.g. home entertainment systems. Several aspects

  20. A new bandwidth scalable wideband speech/audio coder

    Microsoft Academic Search

    Kyung Tae Kim; Sung Kyo Jung; Young Cheol Park; Dae Hee Youn

    2002-01-01

    In this paper, we present a new bandwidth-scalable coder for wide band speech and audio signals. The proposed coder splits 8 kHz signal bandwidth into two narrow bands, and different coding schemes are applied to each band. The lower-band speech is coded with ITU-T G.729 Annex E, and the higher-band signal is compressed using a new algorithm based on the

  1. Automatic Sports Video Analysis using Audio Clues and Context Knowledge

    Microsoft Academic Search

    Weilun Lao; Jungong Han

    2006-01-01

    Sports analysis has recently become popular in research and professional applications. This paper presents a scheme for automatic sports video analysis based on audio clues and specific game context knowledge. We propose a simple, two-step racket-hit detection for achieving accurate event classification for tennis video. To implement the mapping between the sample-level feature space and the semantic-level space, we employ

  2. SWAicons: spoken web audio icons - design, implications and evaluation

    Microsoft Academic Search

    Saurabh Srivastava; Nitendra Rajput; Gururaj Mahajan

    2012-01-01

    In this paper, we extend the concepts of audio-icons to provide auditory cues, for improving navigation in Spoken Web for low-literate population in rural India. The SWAicons are a suite of contextual auditory elements, which act as acoustic clues to a user. The SWAicons are expected to increase the user perception for the content and the structure of VoiceSite.

  3. Efficient Query-by-Content Audio Retrieval by Locality Sensitive Hashing and Partial Sequence Comparison

    NASA Astrophysics Data System (ADS)

    Yu, Yi; Joe, Kazuki; Downie, J. Stephen

    This paper investigates suitable indexing techniques to enable efficient content-based audio retrieval in large acoustic databases. To make an index-based retrieval mechanism applicable to audio content, we investigate the design of Locality Sensitive Hashing (LSH) and the partial sequence comparison. We propose a fast and efficient audio retrieval framework of query-by-content and develop an audio retrieval system. Based on this framework, four different audio retrieval schemes, LSH-Dynamic Programming (DP), LSH-Sparse DP (SDP), Exact Euclidean LSH (E2LSH)-DP, E2LSH-SDP, are introduced and evaluated in order to better understand the performance of audio retrieval algorithms. The experimental results indicate that compared with the traditional DP and the other three competitive schemes, E2LSH-SDP exhibits the best tradeoff in terms of the response time, retrieval accuracy and computation cost.

  4. High capacity reversible watermarking for audio by histogram shifting and predicted error expansion.

    PubMed

    Wang, Fei; Xie, Zhaoxin; Chen, Zuo

    2014-01-01

    Being reversible, the watermarking information embedded in audio signals can be extracted while the original audio data can achieve lossless recovery. Currently, the few reversible audio watermarking algorithms are confronted with the following problems: relatively low SNR (signal-to-noise ratio) of embedded audio; a large amount of auxiliary embedded location information; and the absence of accurate capacity control capability. In this paper, we present a novel reversible audio watermarking scheme based on improved prediction error expansion and histogram shifting. First, we use a differential evolution algorithm to optimize prediction coefficients and then apply prediction error expansion to output stego data. Second, in order to reduce location map bits length, we introduced a histogram shifting scheme. Meanwhile, the prediction error modification threshold according to a given embedding capacity can be computed by our proposed scheme. Experiments show that this algorithm improves the SNR of embedded audio signals and embedding capacity, drastically reduces location map bits length, and enhances capacity control capability. PMID:25097883

  5. High Capacity Reversible Watermarking for Audio by Histogram Shifting and Predicted Error Expansion

    PubMed Central

    Wang, Fei; Chen, Zuo

    2014-01-01

    Being reversible, the watermarking information embedded in audio signals can be extracted while the original audio data can achieve lossless recovery. Currently, the few reversible audio watermarking algorithms are confronted with the following problems: relatively low SNR (signal-to-noise ratio) of embedded audio; a large amount of auxiliary embedded location information; and the absence of accurate capacity control capability. In this paper, we present a novel reversible audio watermarking scheme based on improved prediction error expansion and histogram shifting. First, we use a differential evolution algorithm to optimize prediction coefficients and then apply prediction error expansion to output stego data. Second, in order to reduce location map bits length, we introduced a histogram shifting scheme. Meanwhile, the prediction error modification threshold according to a given embedding capacity can be computed by our proposed scheme. Experiments show that this algorithm improves the SNR of embedded audio signals and embedding capacity, drastically reduces location map bits length, and enhances capacity control capability. PMID:25097883
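
    The reversibility property described in this abstract can be illustrated with the textbook form of histogram shifting. The sketch below is not the authors' scheme: it operates directly on integer sample values rather than on prediction errors, omits the differential evolution step and the capacity threshold, and ignores overflow at the sample range limits. The function names `hs_embed` and `hs_extract` are hypothetical.

```python
from collections import Counter


def hs_embed(samples, bits):
    """Embed bits by shifting the histogram between the peak and an empty bin."""
    hist = Counter(samples)
    peak = max(hist, key=hist.get)      # most frequent value carries the payload
    zero = peak + 1
    while hist.get(zero, 0) != 0:       # first empty bin above the peak
        zero += 1
    if len(bits) > hist[peak]:
        raise ValueError("payload exceeds embedding capacity")
    payload = iter(bits)
    marked = []
    for s in samples:
        if peak < s < zero:
            marked.append(s + 1)        # make room next to the peak
        elif s == peak:
            marked.append(s + next(payload, 0))  # unused slots carry 0
        else:
            marked.append(s)
    return marked, peak, zero


def hs_extract(marked, peak, zero, n_bits):
    """Recover the payload and restore the original samples exactly."""
    bits, restored = [], []
    for s in marked:
        if s == peak:
            bits.append(0)
            restored.append(peak)
        elif s == peak + 1:
            bits.append(1)
            restored.append(peak)
        elif peak + 1 < s <= zero:
            restored.append(s - 1)      # undo the shift
        else:
            restored.append(s)
    return bits[:n_bits], restored


if __name__ == "__main__":
    original = [0, 1, 0, 2, 0, 3, 0, -1]
    marked, peak, zero = hs_embed(original, [1, 0, 1])
    bits, restored = hs_extract(marked, peak, zero, 3)
    print(marked, bits, restored == original)
```

    In this simplified form the decoder needs only the peak value, the empty-bin value, and the payload length as side information, and each sample changes by at most one step. How the paper's scheme represents its side information and controls capacity is not shown here.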

  6. AudioSense: Enabling Real-time Evaluation of Hearing Aid Technology In-Situ

    PubMed Central

    Hasan, Syed Shabih; Lai, Farley; Chipara, Octav; Wu, Yu-Hsiang

    2014-01-01

    AudioSense integrates mobile phones and web technology to measure hearing aid performance in real-time and in-situ. Measuring the performance of hearing aids in the real world poses significant challenges as it depends on the patient's listening context. AudioSense uses Ecological Momentary Assessment methods to evaluate both the perceived hearing aid performance as well as to characterize the listening environment using electronic surveys. AudioSense further characterizes a patient's listening context by recording their GPS location and sound samples. By creating a time-synchronized record of listening performance and listening contexts, AudioSense will allow researchers to understand the relationship between listening context and hearing aid performance. Performance evaluation shows that AudioSense is reliable, energy-efficient, and can estimate Signal-to-Noise Ratio (SNR) levels from captured audio samples. PMID:25013874

  7. 14. Interior, Machine Shop, Roundhouse Machine Shop Extension, Southern Pacific ...

    Library of Congress Historic Buildings Survey, Historic Engineering Record, Historic Landscapes Survey

    14. Interior, Machine Shop, Roundhouse Machine Shop Extension, Southern Pacific Railroad Carlin Shops, view to north (90mm lens). - Southern Pacific Railroad, Carlin Shops, Roundhouse Machine Shop Extension, Foot of Sixth Street, Carlin, Elko County, NV

  8. BRITISH MOLDING MACHINE, PBQ AUTOMATIC COPE AND DRAG MOLDING MACHINE ...

    Library of Congress Historic Buildings Survey, Historic Engineering Record, Historic Landscapes Survey

    BRITISH MOLDING MACHINE, PBQ AUTOMATIC COPE AND DRAG MOLDING MACHINE MAKES BOTH MOLD HALVES INDIVIDUALLY WHICH ARE LATER ROTATED, ASSEMBLED, AND LOWERED TO POURING CONVEYORS BY ASSISTING MACHINES. - Southern Ductile Casting Company, Casting, 2217 Carolina Avenue, Bessemer, Jefferson County, AL

  9. NFL Films audio, video, and film production facilities

    NASA Astrophysics Data System (ADS)

    Berger, Russ; Schrag, Richard C.; Ridings, Jason J.

    2003-04-01

    The new NFL Films 200,000 sq. ft. headquarters is home for the critically acclaimed film production that preserves the NFL's visual legacy week-to-week during the football season, and is also the technical plant that processes and archives football footage from the earliest recorded media to the current network broadcasts. No other company in the country shoots more film than NFL Films, and the inclusion of cutting-edge video and audio formats demands that their technical spaces continually integrate the latest in the ever-changing world of technology. This facility houses a staggering array of acoustically sensitive spaces where music and sound are equal partners with the visual medium. Over 90,000 sq. ft. of sound critical technical space is comprised of an array of sound stages, music scoring stages, audio control rooms, music writing rooms, recording studios, mixing theaters, video production control rooms, editing suites, and a screening theater. Every production control space in the building is designed to monitor and produce multi channel surround sound audio. An overview of the architectural and acoustical design challenges encountered for each sophisticated listening, recording, viewing, editing, and sound critical environment will be discussed.

  10. Drum cutter mining machine

    SciTech Connect

    Oberste-Beulmann, K.; Schupphaus, H.

    1980-02-19

    A drum cutter mining machine includes a machine frame with a winch having a drive wheel to engage a rack or chain which extends along the path of travel by the mining machine to propel the machine along a mine face. The mining machine is made up of discrete units which include a machine body and machine housings joined to opposite sides of the machine body. The winch is either coupled through a drive train with a feed drive motor or coupled to the drive motor for cutter drums. The machine housings each support a pivot shaft coupled by an arm to a drum cutter. One of these housings includes a removable end cover and a recess adapted to receive a support housing for a spur gear system used to transmit torque from a feed drive motor to a reduction gear system which is, in turn, coupled to the drive wheel of the winch. In one embodiment, a removable end cover on the machine housing provides access to the feed drive motor. The feed drive motor is arranged so that the rotational axis of its drive output shaft extends transversely to the stow side of the machine frame. In another embodiment, the reduction gear system is arranged at one side of the pivot shaft for the cutter drum while the drive motor therefor is arranged at the other side of the pivot shaft and coupled thereto through the spur gear system. In a further embodiment, the reduction gear system is disposed between the feed motor and the pivot shaft.

  11. Design and Recording of Czech Audio-Visual Database with Impaired Conditions for Continuous Speech Recognition

    Microsoft Academic Search

    Jana Trojanová; Marek Hrúz; Pavel Campr; Milos Zelezný

    2008-01-01

    In this paper we discuss the design, acquisition and preprocessing of a Czech audio-visual speech corpus. The corpus is intended for training and testing of existing audio-visual speech recognition systems. The name of the database is UWB-07-ICAVR, where ICAVR stands for Impaired Condition Audio Visual speech Recognition. The corpus consists of 10000 utterances of continuous speech obtained from 50 speakers.

  12. Spatial Audio on the Web: Or Why Can't I hear Anything Over There?

    NASA Technical Reports Server (NTRS)

    Wenzel, Elizabeth M.; Schlickenmaier, Herbert (Technical Monitor); Johnson, Gerald (Technical Monitor); Frey, Mary Anne (Technical Monitor); Schneider, Victor S. (Technical Monitor); Ahunada, Albert J. (Technical Monitor)

    1997-01-01

    Auditory complexity, freedom of movement and interactivity are not always possible in a "true" virtual environment, much less in web-based audio. However, many of the perceptual and engineering constraints (and frustrations) that researchers, engineers and listeners have experienced in virtual audio are relevant to spatial audio on the web. My talk will discuss some of these engineering constraints and their perceptual consequences, and attempt to relate these issues to implementation on the web.

  13. Audio object individual operation and its application to earphone leakage noise reduction

    Microsoft Academic Search

    Shota Suzuki; Shigeki Miyabe; Noriyoshi Kamado; Hiroshi Saruwatari; Kiyohiro Shikano; Toshiyuki Nomura

    2010-01-01

    In this paper, we propose a new extension framework of multichannel audio coding based on temporal quantization of spatial information. In our previous study, a multiple-audio-object signal could be encoded/decoded via prototypes of directional clustering for each audio object. This paper first notes that the quantized information corresponds to the spatial image of each sound object, and is

  14. ViSQOLAudio: An objective audio quality metric for low bitrate codecs.

    PubMed

    Hines, Andrew; Gillen, Eoin; Kelly, Damien; Skoglund, Jan; Kokaram, Anil; Harte, Naomi

    2015-06-01

    Streaming services seek to optimise their use of bandwidth across audio and visual channels to maximise the quality of experience for users. This letter evaluates whether objective quality metrics can predict the audio quality for music encoded at low bitrates by comparing objective predictions with results from listener tests. Three objective metrics were benchmarked: PEAQ, POLQA, and ViSQOLAudio. The results demonstrate that objective metrics designed for speech quality assessment have a strong potential for quality assessment of low bitrate audio codecs. PMID:26093454
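
    The comparison described above, objective predictions against listener-test results, is commonly summarised with correlation coefficients. The abstract does not say which statistics the letter reports, so the following is only a minimal sketch under that assumption. The function names and all scores are hypothetical and are not taken from the letter.

```python
import math
from typing import List, Sequence


def pearson(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson linear correlation between two equally long score lists."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


def ranks(values: Sequence[float]) -> List[float]:
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def spearman(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(xs), ranks(ys))


# Hypothetical data: one metric score and one mean listener score per clip.
metric_scores = [0.42, 0.55, 0.61, 0.78, 0.90]
listener_scores = [1.8, 2.6, 2.4, 3.9, 4.5]

linear_agreement = pearson(metric_scores, listener_scores)
rank_agreement = spearman(metric_scores, listener_scores)
```

    Pearson correlation rewards a linear relationship between the metric and the listener scores, while Spearman correlation only asks whether the metric ranks the clips in the same order as the listeners did.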

  15. Diamond machine tool face lapping machine

    DOEpatents

    Yetter, H.H.

    1985-05-06

    An apparatus for shaping, sharpening and polishing diamond-tipped single-point machine tools. The isolation of a rotating grinding wheel from its driving apparatus using an air bearing and causing the tool to be shaped, polished or sharpened to be moved across the surface of the grinding wheel so that it does not remain at one radius for more than a single rotation of the grinding wheel has been found to readily result in machine tools of a quality which can only be obtained by the most tedious and costly processing procedures, and previously unattainable by simple lapping techniques.

  16. Millikelvin Lab Machine Shop

    E-print Network

    McQuade, D. Tyler

    Millikelvin Lab OP105-112 Machine Shop OP132 Resistive Magnet Shop CICC Winding Area Transformers Transformers Part Shop OP128 Dock Control Room 45 T Physical Plant Helium Recovery System Magnet Cells Shipping This building is home to the Millikelvin lab, the control room, the resistive magnet and machine shops, the CICC

  17. Precision laser machining program

    Microsoft Academic Search

    L. N. Durvasula

    2001-01-01

    High-brightness, diode-pumped solid-state laser technology has now progressed to a point that it is envisioned as the next-generation industrial laser. In response, the Precision Laser Machining (PLM) Consortium was formed as part of the US Defense Advanced Research Projects Agency (DARPA) Technology Reinvestment Project. The goal of PLM is to develop a new generation of laser machine

  18. Knowledge Based Machine Translation

    Microsoft Academic Search

    Ghulam Rasool Tahir; Sohail Asghar; Nayyer Masood

    2010-01-01

    Machine translation, a part of computational linguistics, belongs to Natural Language Processing (NLP) and is a hot issue in the computational community. The gap between the linguist and the computer programmer gives rise to many problems, such as lexical ambiguity, syntactic and structural ambiguity, polysemy, induction, discourse, anaphoric ambiguity and different shades of meaning. Mostly English-to-Urdu machine translation systems were developed

  19. Machine Translation Project

    NASA Technical Reports Server (NTRS)

    Bajis, Katie

    1993-01-01

    The characteristics and capabilities of existing machine translation systems were examined and procurement recommendations were developed. Four systems, SYSTRAN, GLOBALINK, PC TRANSLATOR, and STYLUS, were determined to meet the NASA requirements for a machine translation system. Initially, four language pairs were selected for implementation. These are Russian-English, French-English, German-English, and Japanese-English.

  20. The Set Covering Machine

    Microsoft Academic Search

    Mario Marchand; John Shawe-Taylor

    2002-01-01

    We extend the classical algorithms of Valiant and Haussler for learning compact conjunctions and disjunctions of Boolean attributes to allow features that are constructed from the data and to allow a trade-off between accuracy and complexity. The result is a general-purpose learning machine, suitable for practical learning tasks, that we call the set covering machine. We present a
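
    The idea of learning a compact conjunction with a trade-off between accuracy and complexity can be illustrated with a greedy set-cover loop. This is a minimal sketch based only on the description in the abstract, not the authors' implementation. The `penalty` and `max_features` parameters, the scoring rule, and the toy threshold features are all assumptions made for illustration.

```python
from typing import Callable, List, Optional, Sequence, Tuple

Point = Tuple[float, float]
Feature = Callable[[Point], bool]


def greedy_conjunction(
    features: Sequence[Feature],
    positives: Sequence[Point],
    negatives: Sequence[Point],
    penalty: float = 1.0,                # assumed cost of rejecting a positive
    max_features: Optional[int] = None,  # assumed cap on conjunction size
) -> List[int]:
    """Greedily pick features whose conjunction rejects the negative examples.

    A feature "covers" a negative example when it evaluates to False on it.
    Each round picks the feature covering the most remaining negatives,
    minus a penalty for every remaining positive it would also reject.
    """
    remaining_neg = list(negatives)
    remaining_pos = list(positives)
    chosen: List[int] = []
    limit = len(features) if max_features is None else max_features

    while remaining_neg and len(chosen) < limit:
        best_index, best_score = None, 0.0
        for index, feature in enumerate(features):
            if index in chosen:
                continue
            covered = sum(1 for x in remaining_neg if not feature(x))
            rejected = sum(1 for x in remaining_pos if not feature(x))
            score = covered - penalty * rejected
            if score > best_score:
                best_index, best_score = index, score
        if best_index is None:
            break  # no remaining feature gives a positive net benefit
        best = features[best_index]
        chosen.append(best_index)
        # Keep only the examples the chosen feature still accepts.
        remaining_neg = [x for x in remaining_neg if best(x)]
        remaining_pos = [x for x in remaining_pos if best(x)]
    return chosen


def predict(features: Sequence[Feature], chosen: Sequence[int], x: Point) -> bool:
    """A conjunction outputs True only if every chosen feature is True."""
    return all(features[i](x) for i in chosen)


# Hypothetical toy problem: positives lie in the open upper-right quadrant.
toy_features: List[Feature] = [
    lambda p: p[0] > 0,           # feature 0
    lambda p: p[1] > 0,           # feature 1
    lambda p: p[0] + p[1] > -10,  # feature 2, true on every toy example
]
toy_positives = [(1, 1), (2, 3), (3, 1)]
toy_negatives = [(-1, 1), (1, -1), (-2, -2), (-1, 2)]

selected = greedy_conjunction(toy_features, toy_positives, toy_negatives)
```

    On this toy data the loop selects features 0 and 1 and skips the uninformative feature 2. Raising `penalty` makes the learner less willing to sacrifice positive examples, and lowering `max_features` forces a shorter conjunction.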