Sample records for audio steganography

  1. A multi-layer steganographic method based on audio time domain segmented and network steganography

    NASA Astrophysics Data System (ADS)

    Xue, Pengfei; Liu, Hanlin; Hu, Jingsong; Hu, Ronggui

    2018-05-01

    Both audio steganography and network steganography belong to modern steganography. Audio steganography offers a large capacity, while network steganography is difficult to detect or track. In this paper, a multi-layer steganographic method based on the collaboration of the two (MLS-ATDSS&NS) is proposed. MLS-ATDSS&NS is realized in two covert layers (an audio steganography layer and a network steganography layer) in two steps: a new audio time domain segmented steganography (ATDSS) method is proposed in step 1, and the collaboration method of ATDSS and NS is proposed in step 2. The experimental results showed that the advantage of MLS-ATDSS&NS over other methods is a better trade-off between capacity, anti-detectability and robustness, i.e. higher steganographic capacity, better anti-detectability and stronger robustness.
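
    The abstract above does not spell out the ATDSS embedding rule, so the following is only a generic sketch of time-domain segmented LSB hiding in 16-bit PCM samples: the cover is cut into fixed-length segments and one message bit replaces the LSB of the first sample of each segment (the segment length, bit placement, and function names are illustrative assumptions, not the paper's algorithm).

```python
def embed(samples, bits, seg_len=8):
    """Embed message bits into 16-bit PCM samples, one bit per segment."""
    out = list(samples)
    for i, bit in enumerate(bits):
        pos = i * seg_len                  # first sample of segment i
        out[pos] = (out[pos] & ~1) | bit   # overwrite that sample's LSB
    return out

def extract(samples, n_bits, seg_len=8):
    """Read the LSB of the first sample of each segment."""
    return [samples[i * seg_len] & 1 for i in range(n_bits)]
```

Because only one LSB per segment changes, each stego sample differs from the cover by at most 1, which keeps the distortion far below the audible threshold for typical speech or music.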

  2. Audio Steganography with Embedded Text

    NASA Astrophysics Data System (ADS)

    Teck Jian, Chua; Chai Wen, Chuah; Rahman, Nurul Hidayah Binti Ab.; Hamid, Isredza Rahmi Binti A.

    2017-08-01

    Audio steganography is about hiding a secret message inside audio. It is a technique used to secure the transmission of secret information or hide its existence, and it can also provide confidentiality to the secret message if the message is encrypted. To date, most steganography software, such as Mp3Stego and DeepSound, uses a block cipher such as the Advanced Encryption Standard or the Data Encryption Standard to encrypt the secret message. This is good security practice. However, the encrypted message may become too long to embed in the audio and can distort the cover audio if the secret message is long. Hence, there is a need to encrypt the message with a stream cipher before embedding it into the audio: a stream cipher encrypts bit by bit, whereas a block cipher encrypts fixed-length blocks, which results in a longer output. Hence, an audio steganography scheme embedding text encrypted with the Rivest Cipher 4 (RC4) stream cipher is designed, developed and tested in this project.
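
    RC4 itself is a well-documented stream cipher (a key-scheduling pass followed by a pseudo-random generation loop whose output is XORed with the plaintext), so the encryption step described above can be sketched directly. Note that RC4 is considered cryptographically broken today and appears here only because the abstract names it; the resulting ciphertext bits would then be embedded by LSB substitution as usual.

```python
def rc4_keystream(key: bytes):
    # Key-scheduling algorithm (KSA)
    S = list(range(256))
    j = 0
    for i in range(256):
        j = (j + S[i] + key[i % len(key)]) % 256
        S[i], S[j] = S[j], S[i]
    # Pseudo-random generation algorithm (PRGA)
    i = j = 0
    while True:
        i = (i + 1) % 256
        j = (j + S[i]) % 256
        S[i], S[j] = S[j], S[i]
        yield S[(S[i] + S[j]) % 256]

def rc4(key: bytes, data: bytes) -> bytes:
    """XOR data with the RC4 keystream; the same call decrypts."""
    ks = rc4_keystream(key)
    return bytes(b ^ next(ks) for b in data)
```

Because encryption is a plain XOR with the keystream, applying `rc4` twice with the same key recovers the plaintext, which is exactly the bit-by-bit property the abstract argues for.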

  3. Exploring the Implementation of Steganography Protocols on Quantum Audio Signals

    NASA Astrophysics Data System (ADS)

    Chen, Kehan; Yan, Fei; Iliyasu, Abdullah M.; Zhao, Jianping

    2018-02-01

    Two quantum audio steganography (QAS) protocols are proposed, each of which manipulates or modifies the least significant qubit (LSQb) of the host quantum audio signal that is encoded as an FRQA (flexible representation of quantum audio) audio content. The first protocol (i.e. the conventional LSQb QAS protocol, or simply the cLSQ stego protocol) is built on exchanges between the qubits encoding the quantum audio message and the LSQb of the amplitude information in the host quantum audio samples. In the second protocol, the embedding procedure implants information from a quantum audio message deep into the constraint-imposed most significant qubit (MSQb) of the host quantum audio samples; we refer to it as the pseudo MSQb QAS protocol, or simply the pMSQ stego protocol. The cLSQ stego protocol is designed to guarantee high imperceptibility between the host quantum audio and its stego version, whereas the pMSQ stego protocol ensures that the resulting stego quantum audio signal is better immune to illicit tampering and copyright violations (a.k.a. robustness). Built on the circuit model of quantum computation, the circuit networks to execute the embedding and extraction algorithms of both QAS protocols are determined, and simulation-based experiments are conducted to demonstrate their implementation. Outcomes attest that both protocols offer promising trade-offs in terms of imperceptibility and robustness.

  4. An Efficient Method for Image and Audio Steganography using Least Significant Bit (LSB) Substitution

    NASA Astrophysics Data System (ADS)

    Chadha, Ankit; Satam, Neha; Sood, Rakshak; Bade, Dattatray

    2013-09-01

    In order to improve data hiding in all types of multimedia formats, such as image and audio, and to make the hidden message imperceptible, a novel method for steganography is introduced in this paper. It is based on Least Significant Bit (LSB) manipulation and the inclusion of redundant noise as a secret key in the message. This method is applied to data hiding in images; for data hiding in audio, both the Discrete Cosine Transform (DCT) and the Discrete Wavelet Transform (DWT) are used. The results prove to be time-efficient and effective. The algorithm is also tested for various numbers of bits n; for those values, Mean Square Error (MSE) and Peak Signal-to-Noise Ratio (PSNR) are calculated and plotted. Experimental results show that the stego-image is visually indistinguishable from the original cover image when n<=4, because of the better PSNR achieved by this technique. The final results obtained after the steganography process do not reveal the presence of any hidden message, thus satisfying the criterion of an imperceptible message.
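
    The MSE/PSNR evaluation the authors describe is standard and can be sketched for n-bit LSB substitution as follows (flat pixel lists stand in for images; the function names are illustrative, not the paper's):

```python
import math

def embed_n_lsb(pixel, secret, n):
    """Replace the n least significant bits of an 8-bit pixel value."""
    mask = (1 << n) - 1
    return (pixel & ~mask) | (secret & mask)

def mse(cover, stego):
    """Mean Square Error between two equal-length pixel sequences."""
    return sum((c - s) ** 2 for c, s in zip(cover, stego)) / len(cover)

def psnr(cover, stego, peak=255):
    """Peak Signal-to-Noise Ratio in dB (infinite for identical images)."""
    m = mse(cover, stego)
    return float("inf") if m == 0 else 10 * math.log10(peak ** 2 / m)
```

Raising n increases capacity but lowers PSNR, which is the trade-off behind the paper's observation that n<=4 keeps the stego-image visually indistinguishable.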

  5. Steganalysis for Audio Data

    DTIC Science & Technology

    2006-03-31

    from existing image steganography and steganalysis techniques, the overall objective of Task (b) is to design and implement audio steganography in... general design of the VoIP steganography algorithm is based on known LSB hiding techniques (used for example in StegHide (http... system. Nasir Memon et al. described a steganalyzer based on image quality metrics [AMS03]. Basically, the main idea to detect steganography by

  6. Surmounting the Effects of Lossy Compression on Steganography

    DTIC Science & Technology

    1996-10-01

    and can be exploited to export sensitive information. Since images are frequently compressed for storage or transmission, effective steganography... steganography is that which is stored with an accuracy far greater than necessary for the data's use and display. Image, Postscript, and audio files are... information can be concealed in bitmapped image files with little or no visible degradation of the image [4]. This process, called steganography, is

  7. A novel quantum LSB-based steganography method using the Gray code for colored quantum images

    NASA Astrophysics Data System (ADS)

    Heidari, Shahrokh; Farzadnia, Ehsan

    2017-10-01

    As one of the prevalent data-hiding techniques, steganography is defined as the act of imperceptibly concealing secret information in a cover multimedia object encompassing text, image, video and audio, in order to perform interaction between the sender and the receiver in which nobody except the receiver can figure out the secret data. In this approach, a quantum LSB-based steganography method utilizing the Gray code for quantum RGB images is investigated. The method uses the Gray code to accommodate two secret qubits in 3 LSBs of each pixel simultaneously, according to reference tables. Experimental results, analyzed in the MATLAB environment, show that the present scheme performs well and is more secure and applicable than the previous one found in the literature.

  8. Blind Linguistic Steganalysis against Translation Based Steganography

    NASA Astrophysics Data System (ADS)

    Chen, Zhili; Huang, Liusheng; Meng, Peng; Yang, Wei; Miao, Haibo

    Translation based steganography (TBS) is a relatively new and secure kind of linguistic steganography. It takes advantage of the "noise" created by automatic translation of natural language text to encode the secret information. To date, there is little research on steganalysis against this kind of linguistic steganography. In this paper, a blind steganalytic method named natural frequency zoned word distribution analysis (NFZ-WDA) is presented. This method improves on a previously proposed linguistic steganalysis method based on word distribution, which targeted the detection of linguistic steganography tools like NiceText and Texto. The new method aims to detect the application of TBS and uses no TBS-specific information; its only resource is a word frequency dictionary obtained from a large corpus (a so-called natural frequency dictionary), so it is totally blind. To verify the effectiveness of NFZ-WDA, two experiments, with two-class and multi-class SVM classifiers respectively, are carried out. The experimental results show that the steganalytic method is quite promising.

  9. LSB Based Quantum Image Steganography Algorithm

    NASA Astrophysics Data System (ADS)

    Jiang, Nan; Zhao, Na; Wang, Luo

    2016-01-01

    Quantum steganography is the technique of hiding a secret message in quantum covers such as quantum images. In this paper, two blind LSB steganography algorithms in the form of quantum circuits are proposed, based on the novel enhanced quantum representation (NEQR) of quantum images. One algorithm is plain LSB, which uses the message bits to substitute directly for the pixels' LSBs. The other is block LSB, which embeds a message bit into a number of pixels belonging to one image block. The extracting circuits can regain the secret message using only the stego cover. Analysis and simulation-based experimental results demonstrate that the invisibility is good, and that the balance between capacity and robustness can be adjusted according to the needs of applications.
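
    A classical analogue of the two schemes can be sketched as follows; this is not the paper's quantum circuit, just an illustration of block LSB, where one message bit is replicated across every pixel of a block. The majority-vote extraction rule is an assumption made here to show why spending several pixels per bit buys robustness at the cost of capacity.

```python
def embed_block_lsb(pixels, bits, block=4):
    """Write each message bit into the LSB of every pixel of one block."""
    out = list(pixels)
    for i, bit in enumerate(bits):
        for j in range(i * block, (i + 1) * block):
            out[j] = (out[j] & ~1) | bit
    return out

def extract_block_lsb(pixels, n_bits, block=4):
    """Recover each bit by majority vote over its block's LSBs."""
    bits = []
    for i in range(n_bits):
        ones = sum(pixels[j] & 1 for j in range(i * block, (i + 1) * block))
        bits.append(1 if ones * 2 > block else 0)
    return bits
```

With a block of 4, a single flipped LSB (e.g. from noise) still leaves a 3-to-1 majority, so the message survives; plain LSB (block of 1) has no such margin.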

  10. Combination Base64 Algorithm and EOF Technique for Steganography

    NASA Astrophysics Data System (ADS)

    Rahim, Robbi; Nurdiyanto, Heri; Hidayat, Rahmat; Saleh Ahmar, Ansari; Siregar, Dodi; Putera Utama Siahaan, Andysah; Faisal, Ilham; Rahman, Sayuti; Suita, Diana; Zamsuri, Ahmad; Abdullah, Dahlan; Napitupulu, Darmawan; Ikhsan Setiawan, Muhammad; Sriadhi, S.

    2018-04-01

    The steganography process combines mathematics and computer science. Steganography consists of a set of methods and techniques for embedding data into another medium so that the contents are unreadable to anyone who lacks the authority to read them. The main objective of the Base64 method is to convert any file in order to achieve privacy. This paper discusses a steganography and encoding method using Base64, a set of encoding schemes that convert binary data into a series of ASCII characters. In addition, the EoF (end-of-file) technique is used to embed the Base64-encoded text. As an example of the mechanism, a file is used to represent the texts; using the two methods together increases the security level for protecting the data. This research aims to secure many types of files in a particular medium with good security, without damaging the stored files or the cover medium used.
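
    The abstract does not fix the exact EoF layout, so the following is a minimal sketch of the Base64-plus-EoF idea: the secret is Base64-encoded and appended past the cover file's logical end of data, behind a delimiter. The `--STEGO--` marker is a hypothetical choice made here, not from the paper.

```python
import base64

MARKER = b"--STEGO--"   # hypothetical delimiter; any byte string absent from the cover works

def embed_eof(cover: bytes, secret: bytes) -> bytes:
    """Append the Base64-encoded secret after the cover's logical EOF."""
    return cover + MARKER + base64.b64encode(secret)

def extract_eof(stego: bytes) -> bytes:
    """Split at the last marker and decode whatever follows it."""
    payload = stego.rsplit(MARKER, 1)[1]
    return base64.b64decode(payload)
```

Viewers for formats such as JPEG stop at the format's own end-of-image marker, so the appended payload does not affect how the cover is displayed; the cover bytes themselves are never modified.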

  11. LSB-based Steganography Using Reflected Gray Code for Color Quantum Images

    NASA Astrophysics Data System (ADS)

    Li, Panchi; Lu, Aiping

    2018-02-01

    At present, classical least-significant-bit (LSB) based image steganography has been extended to quantum image processing. For the existing LSB-based quantum image steganography schemes, the embedding capacity is no more than 3 bits per pixel. Therefore, it is meaningful to study how to improve the embedding capacity of quantum image steganography. This work presents a novel LSB-based steganography using reflected Gray code for colored quantum images, and the embedding capacity of this scheme is up to 4 bits per pixel. In the proposed scheme, the secret qubit sequence is considered as a sequence of 4-bit segments. For the four bits in each segment, the first bit is embedded in the second LSB of the B channel of the cover image, and the remaining three bits are embedded in the LSBs of the RGB channels of each color pixel simultaneously, using the reflected Gray code to determine the embedded bit from the secret information. Following the transforming rule, the LSBs of the stego-image are not always the same as the secret bits, and the differences are up to almost 50%. Experimental results confirm that the proposed scheme shows good performance and outperforms the previous ones found in the literature in terms of embedding capacity.

  12. Steganography: Past, Present, Future

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Judge, J C

    Steganography (from the Greek for "covered writing") has been used in various forms for 2500 years. It has found use variously in military, diplomatic, personal and intellectual property applications. Briefly stated, steganography is the term applied to any number of processes that hide a message within an object, where the hidden message will not be apparent to an observer. This paper explores steganography from its earliest instances through potential future applications.

  13. Improved Adaptive LSB Steganography Based on Chaos and Genetic Algorithm

    NASA Astrophysics Data System (ADS)

    Yu, Lifang; Zhao, Yao; Ni, Rongrong; Li, Ting

    2010-12-01

    We propose a novel steganographic method for JPEG images with high performance. Firstly, we propose an improved adaptive LSB steganography, which can achieve high capacity while preserving first-order statistics. Secondly, in order to minimize visual degradation of the stego image, we shuffle the bit order of the message based on chaos, whose parameters are selected by a genetic algorithm. Shuffling the message's bit order provides a new way to improve the performance of steganography. Experimental results show that our method outperforms classical steganographic methods in image quality, while preserving histogram characteristics and providing high capacity.
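
    The chaotic shuffling step can be sketched with a logistic map whose samples are rank-ordered into a permutation of the message bits. In the paper the map parameters act as a key and are chosen by the genetic algorithm; the values below are fixed, illustrative assumptions.

```python
def logistic_sequence(x0, r, n):
    """Iterate the logistic map x -> r*x*(1-x), collecting n samples."""
    xs, x = [], x0
    for _ in range(n):
        x = r * x * (1 - x)
        xs.append(x)
    return xs

def chaotic_permutation(n, x0=0.3, r=3.99):
    """Rank the chaotic samples to obtain a key-dependent permutation of 0..n-1."""
    xs = logistic_sequence(x0, r, n)
    return sorted(range(n), key=lambda i: xs[i])

def shuffle_bits(bits, perm):
    return [bits[i] for i in perm]

def unshuffle_bits(bits, perm):
    out = [0] * len(bits)
    for j, i in enumerate(perm):
        out[i] = bits[j]
    return out
```

A receiver holding the same (x0, r) regenerates the identical permutation and inverts the shuffle; without the parameters the bit order looks random.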

  14. Steganography Detection Using Entropy Measures

    DTIC Science & Technology

    2012-11-16

    latter leads to the level of compression of the image. 3.3. Least Significant Bit (LSB): The object of steganography is to prevent suspicion upon the... structured user interface developer tools. Steganography Detection Using Entropy Measures, Technical Report, by Eduardo Meléndez, Universidad Politécnica de... 2.3. Different kinds of steganography... II. Steganography... 3. Images and Significance of

  15. Image Steganography for Hidden Communication

    DTIC Science & Technology

    2000-04-01

    ARMY RESEARCH LABORATORY. Image Steganography for Hidden Communication, by Lisa M. Marvel... 2000. Image Steganography for Hidden Communication, Lisa M. Marvel, Information Science and Technology Directorate, ARL. Approved for public release... Capacity for Image Steganography... Summary... 4. Spread Spectrum Image Steganography (SSIS)... 4.1 Modulation... 4.1.1 Sign-Detector System

  16. Towards Statistically Undetectable Steganography

    DTIC Science & Technology

    2011-06-30

    payload size. Middle, payload proportional to √N. Right, proportional to N. LSB replacement steganography in never-compressed cover images, detected... Books: (1) J. Fridrich, Steganography in Digital Media: Principles, Algorithms, and Applications, Cambridge University Press, November 2009. Journal... "Images for Applications in Steganography," IEEE Trans. on Info. Forensics and Security, vol. 3(2), pp. 247-258, 2008. Conference papers: (1) T. Filler

  17. Researcher’s Perspective of Substitution Method on Text Steganography

    NASA Astrophysics Data System (ADS)

    Zamir Mansor, Fawwaz; Mustapha, Aida; Azah Samsudin, Noor

    2017-08-01

    Linguistic steganography studies are still at the stage of development and empowerment of practices. This paper presents several text steganography substitution methods from the researchers' perspective; all the scholarly papers are analysed and compared. The objective of this paper is to give basic information on the substitution methods of text-domain steganography that have been applied by previous researchers. The typical ways of applying this method are also identified, to reveal the most effective method in text-domain steganography. Finally, the advantages and drawbacks of the characteristics of these techniques in general are also presented.

  18. Fourier Phase Domain Steganography: Phase Bin Encoding Via Interpolation

    NASA Astrophysics Data System (ADS)

    Rivas, Edward

    2007-04-01

    In recent years there has been an increased interest in audio steganography and watermarking. This is due primarily to two reasons. First, an acute need to improve our national security capabilities in light of terrorist and criminal activity has driven new ideas and experimentation. Secondly, the explosive proliferation of digital media has forced the music industry to rethink how they will protect their intellectual property. Various techniques have been implemented but the phase domain remains a fertile ground for improvement due to the relative robustness to many types of distortion and immunity to the Human Auditory System. A new method for embedding data in the phase domain of the Discrete Fourier Transform of an audio signal is proposed. Focus is given to robustness and low perceptibility, while maintaining a relatively high capacity rate of up to 172 bits/s.
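
    A minimal sketch of phase-bin encoding (without the interpolation refinement the title refers to) forces the phase of selected DFT bins to ±π/2 while preserving their magnitudes, and fixes the conjugate bins so the time-domain signal stays real. The naive O(N²) DFT below is used only to keep the example self-contained; bin indices, the ±π/2 convention, and the magnitude clamp are illustrative assumptions.

```python
import cmath

def dft(x):
    N = len(x)
    return [sum(x[n] * cmath.exp(-2j * cmath.pi * k * n / N) for n in range(N))
            for k in range(N)]

def idft(X):
    N = len(X)
    return [sum(X[k] * cmath.exp(2j * cmath.pi * k * n / N) for k in range(N)).real / N
            for n in range(N)]

def embed_phase(signal, bits, bins):
    """Force the phase of each chosen bin to +pi/2 (bit 1) or -pi/2 (bit 0)."""
    N = len(signal)
    spec = dft(signal)
    for bit, k in zip(bits, bins):       # bins must avoid 0 and N/2
        mag = max(abs(spec[k]), 1.0)     # clamp tiny magnitudes so the phase survives
        phase = cmath.pi / 2 if bit else -cmath.pi / 2
        spec[k] = cmath.rect(mag, phase)
        spec[N - k] = cmath.rect(mag, -phase)   # conjugate bin keeps the signal real
    return idft(spec)

def extract_phase(stego, bins):
    """Read each bit back from the sign of the bin's phase."""
    spec = dft(stego)
    return [1 if cmath.phase(spec[k]) > 0 else 0 for k in bins]
```

Because the human auditory system is far less sensitive to absolute phase than to magnitude, the embedding is hard to hear, which is the property the abstract builds on.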

  19. Steganography and Steganalysis in Digital Images

    DTIC Science & Technology

    2012-01-01

    Nonetheless, to hide a message in a BMP using this algorithm would require a large image as a cover. STEGANOGRAPHY TOOLS: There were eight tools in... Steganography and Steganalysis in Digital Images... Steganography (from the Greek for "covered writing")... SUBJECT TERMS: Least Significant Bit (LSB), steganography, steganalysis, stegogramme. Dr. Jeff Duffany

  20. Steganography in arrhythmic electrocardiogram signal.

    PubMed

    Edward Jero, S; Ramu, Palaniappan; Ramakrishnan, S

    2015-08-01

    Security and privacy of patient data are vital requirements during the exchange/storage of medical information over a communication network. A steganography method hides patient data inside a cover signal to prevent unauthorized access during data transfer. This study evaluates the performance of ECG steganography, in which an abnormal ECG signal is used as the cover signal, to ensure secure transmission of patient data. The novelty of this work is to hide patient data in the two-dimensional matrix of an abnormal ECG signal using a Discrete Wavelet Transform and Singular Value Decomposition based steganography method. A 2D ECG is constructed according to the Tompkins QRS detection algorithm. Missed R peaks are computed using the RR interval during 2D conversion. The abnormal ECG signals are obtained from the MIT-BIH arrhythmia database. Metrics such as Peak Signal to Noise Ratio, Percentage Residual Difference, Kullback-Leibler distance and Bit Error Rate are used to evaluate the performance of the proposed approach.
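
    Two of the cited metrics, Percentage Residual Difference (PRD) and Bit Error Rate (BER), are easy to state; the definitions below are the commonly used ones and may differ in scaling or normalization from the paper's.

```python
import math

def prd(cover, stego):
    """Percentage Residual Difference between cover and stego signals."""
    num = sum((c - s) ** 2 for c, s in zip(cover, stego))
    den = sum(c ** 2 for c in cover)
    return 100 * math.sqrt(num / den)

def ber(sent_bits, recovered_bits):
    """Fraction of embedded bits that were recovered incorrectly."""
    errors = sum(a != b for a, b in zip(sent_bits, recovered_bits))
    return errors / len(sent_bits)
```

In ECG steganography, a low PRD certifies that the diagnostic waveform is barely distorted, while a low BER certifies that the hidden patient data survive extraction; the paper evaluates both sides of that trade-off.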

  1. Steganography: LSB Methodology

    DTIC Science & Technology

    2012-08-02

    images; LSB Embedding. Angel Sierra, Dr. Alfredo Cruz (Advisor), Polytechnic University of Puerto Rico... notepad document as the message input. - Reviewed the battlesteg algorithm Java code. POLYTECHNIC UNIVERSITY OF PUERTO RICO, Steganography: LSB... of LSB steganography in grayscale and color images. In J. Dittmann, K. Nahrstedt, and P. Wohlmacher, editors, Proceedings of the ACM, Special

  2. Quantum color image watermarking based on Arnold transformation and LSB steganography

    NASA Astrophysics Data System (ADS)

    Zhou, Ri-Gui; Hu, Wenwen; Fan, Ping; Luo, Gaofeng

    In this paper, a quantum color image watermarking scheme is proposed through twice-scrambling by Arnold transformations and least significant bit (LSB) steganography. Both the carrier image and the watermark image are represented by the novel quantum representation of color digital images model (NCQI). The image sizes for the carrier and watermark are assumed to be 2^n×2^n and 2^(n-1)×2^(n-1), respectively. At first, the watermark is scrambled into a disordered form through an image preprocessing technique that exchanges image pixel positions and alters the color information simultaneously, based on Arnold transforms. Then, the scrambled watermark, of size 2^(n-1)×2^(n-1) with 24-qubit grayscale, is further expanded to an image of size 2^n×2^n with 6-qubit grayscale using the nearest-neighbor interpolation method. Finally, the scrambled and expanded watermark is embedded into the carrier by the LSB steganography scheme, and a key image of size 2^n×2^n carrying 3 qubits of information is generated at the same time; only the key image can be used to retrieve the original watermark. The extraction of the watermark is the reverse of embedding, achieved by applying the sequence of operations in reverse order. Simulation-based experiments involving different carrier and watermark images (i.e. conventional, non-quantum images) are conducted in a classical computer's MATLAB 2014b software, illustrating that the present method performs well on three counts: visual quality, robustness and steganography capacity.

  3. Wavelet-based audio embedding and audio/video compression

    NASA Astrophysics Data System (ADS)

    Mendenhall, Michael J.; Claypoole, Roger L., Jr.

    2001-12-01

    Watermarking, traditionally used for copyright protection, is used in a new and exciting way. An efficient wavelet-based watermarking technique embeds audio information into a video signal. Several effective compression techniques are applied to compress the resulting audio/video signal in an embedded fashion. This wavelet-based compression algorithm incorporates bit-plane coding, index coding, and Huffman coding. To demonstrate the potential of this audio embedding and audio/video compression algorithm, we embed an audio signal into a video signal and then compress. Results show that overall compression rates of 15:1 can be achieved. The video signal is reconstructed with a median PSNR of nearly 33 dB. Finally, the audio signal is extracted from the compressed audio/video signal without error.

  4. Spread spectrum image steganography.

    PubMed

    Marvel, L M; Boncelet, C R; Retter, C T

    1999-01-01

    In this paper, we present a new method of digital steganography, entitled spread spectrum image steganography (SSIS). Steganography, which means "covered writing" in Greek, is the science of communicating in a hidden manner. Following a discussion of steganographic communication theory and review of existing techniques, the new method, SSIS, is introduced. This system hides and recovers a message of substantial length within digital imagery while maintaining the original image size and dynamic range. The hidden message can be recovered using appropriate keys without any knowledge of the original image. Image restoration, error-control coding, and techniques similar to spread spectrum are described, and the performance of the system is illustrated. A message embedded by this method can be in the form of text, imagery, or any other digital signal. Applications for such a data-hiding scheme include in-band captioning, covert communication, image tamperproofing, authentication, embedded control, and revision tracking.
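
    The core of spread spectrum embedding can be sketched as follows: each message bit is spread over many samples by a key-seeded pseudo-noise (±1) sequence and recovered by correlating the stego signal with the same sequence, with no need for the original cover. This is a bare-bones illustration of the principle, not the full SSIS system (which adds image restoration and error-control coding); the gain, chip count, and seed below are illustrative.

```python
import random

def pn_sequence(n, seed=42):
    """Key-seeded pseudo-noise sequence of +/-1 chips (the seed is the shared key)."""
    rng = random.Random(seed)
    return [rng.choice((-1, 1)) for _ in range(n)]

def ss_embed(cover, bits, chips_per_bit=64, gain=2.0, seed=42):
    """Spread each bit over chips_per_bit samples and add it to the cover."""
    pn = pn_sequence(len(bits) * chips_per_bit, seed)
    stego = list(cover)
    for i, bit in enumerate(bits):
        sign = 1 if bit else -1
        for j in range(chips_per_bit):
            k = i * chips_per_bit + j
            stego[k] += gain * sign * pn[k]
    return stego

def ss_extract(stego, n_bits, chips_per_bit=64, seed=42):
    """Despread by correlating with the same PN sequence; the sign gives the bit."""
    pn = pn_sequence(n_bits * chips_per_bit, seed)
    bits = []
    for i in range(n_bits):
        corr = sum(stego[i * chips_per_bit + j] * pn[i * chips_per_bit + j]
                   for j in range(chips_per_bit))
        bits.append(1 if corr > 0 else 0)
    return bits
```

The correlation concentrates the spread energy (gain × chips per bit) into a single number that dominates the cover's contribution, which is why the receiver needs only the PN seed and not the original image.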

  5. Detection of LSB+/-1 steganography based on co-occurrence matrix and bit plane clipping

    NASA Astrophysics Data System (ADS)

    Abolghasemi, Mojtaba; Aghaeinia, Hassan; Faez, Karim; Mehrabi, Mohammad Ali

    2010-01-01

    Spatial LSB±1 steganography changes the smooth characteristics between adjoining pixels of the raw image. We present a novel steganalysis method for LSB±1 steganography based on feature vectors derived from the co-occurrence matrix in the spatial domain. We investigate how LSB±1 steganography affects the bit planes of an image and show that it mainly changes the least significant bit (LSB) planes. The co-occurrence matrix is derived from an image in which some of the most significant bit planes are clipped. By this preprocessing, in addition to reducing the dimensions of the feature vector, the effects of embedding are also preserved. We compute the co-occurrence matrix in different directions and with different dependencies, and use the elements of the resulting co-occurrence matrix as features. This method is sensitive to the data embedding process. We use a Fisher linear discriminant (FLD) classifier and test our algorithm on different databases and embedding rates. We compare our scheme with current LSB±1 steganalysis methods. It is shown that the proposed scheme outperforms the state-of-the-art methods in detecting LSB±1 steganography for grayscale images.
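
    The two preprocessing ingredients, bit-plane clipping and co-occurrence counting, can be sketched as follows (nested lists stand in for a grayscale image; the direction and clipping depth are illustrative, and the paper combines several such matrices into its feature vector):

```python
def clip_msb_planes(pixels, keep_bits=3):
    """Keep only the keep_bits least significant bit planes of each pixel."""
    mask = (1 << keep_bits) - 1
    return [[p & mask for p in row] for row in pixels]

def cooccurrence(pixels, dx=1, dy=0, levels=8):
    """Count how often gray level a sits at offset (dx, dy) from gray level b."""
    C = [[0] * levels for _ in range(levels)]
    h, w = len(pixels), len(pixels[0])
    for y in range(h - dy):
        for x in range(w - dx):
            C[pixels[y][x]][pixels[y + dy][x + dx]] += 1
    return C
```

Clipping the MSB planes shrinks the matrix from 256×256 to, say, 8×8 while retaining the low-order planes where ±1 embedding leaves its traces; the matrix entries then serve directly as classifier features.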

  6. Wireless steganography

    NASA Astrophysics Data System (ADS)

    Agaian, Sos S.; Akopian, David; D'Souza, Sunil

    2006-02-01

    Modern mobile devices are some of the most technologically advanced devices that people use on a daily basis, and current trends in mobile phone technology indicate that the tasks achievable by mobile devices will soon exceed our imagination. This paper undertakes a case study of the development and implementation of one of the first known steganography (data hiding) applications on a mobile device. Steganography is traditionally accomplished using the high processing speeds of desktop or notebook computers. With the introduction of mobile platform operating systems, there arises an opportunity for users to develop and embed their own applications. We take advantage of this opportunity by introducing wireless steganographic algorithms. Thus we demonstrate that custom applications, popular with security establishments, can also be developed on mobile systems, independent of both the mobile device manufacturer and the mobile service provider. For example, this might be a very important feature if the communication is to be controlled exclusively by authorized personnel. The paper begins by reviewing the technological capabilities of modern mobile devices. Then we address a suitable development platform, which is based on the Symbian/Series60 architecture. Finally, two data hiding applications developed for Symbian/Series60 mobile phones are presented.

  7. Steganography Detection Using Entropy Measures

    DTIC Science & Technology

    2012-08-19

    embedded steganography. For this research, freely available software for embedding hidden messages will be used to create sample image files to... LSB) is a simple approach to modify an image while at the same time making the change imperceptible to the human eye. By considering the redundant... to 00001110, we have applied the least significant bit technique. 2.4 Significant Bit Image Depiction: Steganography fails to comply in its purpose, at

  8. Quantum red-green-blue image steganography

    NASA Astrophysics Data System (ADS)

    Heidari, Shahrokh; Pourarian, Mohammad Rasoul; Gheibi, Reza; Naseri, Mosayeb; Houshmand, Monireh

    One of the most important topics in the field of quantum information processing is quantum data hiding, including quantum steganography and quantum watermarking. This field provides an efficient tool for protecting any kind of digital data. In this paper, three quantum color image steganography algorithms based on the Least Significant Bit (LSB) are investigated. The first algorithm employs only one of the image's channels to cover the secret data. The second procedure is based on an LSB XORing technique, and the last algorithm utilizes two channels of the color cover image for hiding the secret quantum data. The performance of the proposed schemes is analyzed using software simulations in the MATLAB environment. The analysis of PSNR, BER and histogram graphs indicates that the presented schemes exhibit acceptable performance, and theoretical analysis demonstrates that the network complexity of the approaches scales quadratically.

  9. On LSB Spatial Domain Steganography and Channel Capacity

    DTIC Science & Technology

    2008-03-21

    reveal the hidden information should not be taken as proof that the image is now clean. The survivability of LSB-type spatial domain steganography... the mindset that JPEG compressing an image is sufficient to destroy the steganography for spatial-domain LSB-type stego. We agree that JPEGing... modeling of 2-bit LSB steganography shows that theoretically there is a non-zero stego payload possible even though the image has been JPEGed. We wish to

  10. Extreme learning machine based optimal embedding location finder for image steganography

    PubMed Central

    Aljeroudi, Yazan

    2017-01-01

    In image steganography, determining the optimum location for embedding the secret message precisely, with minimum distortion of the host medium, remains a challenging issue. Yet an effective approach for selecting the best embedding location with least deformation is far from being achieved. To attain this goal, we propose a novel high-performance approach for image steganography, in which the extreme learning machine (ELM) algorithm is modified to create a supervised mathematical model. The ELM is first trained on part of an image or any host medium before being tested in regression mode. This allows us to choose the optimal location for embedding the message, with the best values of the predicted evaluation metrics. Contrast, homogeneity, and other texture features are used for training on a new metric. Furthermore, the developed ELM is exploited to counter over-fitting during training. The performance of the proposed steganography approach is evaluated by computing the correlation, structural similarity (SSIM) index, fusion matrices, and mean square error (MSE). The modified ELM is found to outperform the existing approaches in terms of imperceptibility. The experimental results demonstrate that the proposed steganographic approach is highly proficient at preserving the visual information of an image. An improvement in imperceptibility of as much as 28% is achieved compared to the existing state-of-the-art methods. PMID:28196080

  11. LSB-Based Steganography Using Reflected Gray Code

    NASA Astrophysics Data System (ADS)

    Chen, Chang-Chu; Chang, Chin-Chen

    Steganography aims to hide secret data in an innocuous cover medium for transmission, so that an attacker cannot easily recognize the presence of the secret data. Even if the stego-medium is captured by an eavesdropper, the slight distortion is hard to detect. LSB-based data hiding is one of the steganographic methods, used to embed secret data into the least significant bits of the pixel values of a cover image. In this paper, we propose an LSB-based scheme using reflected Gray code, which is applied to determine the embedded bit from the secret information. Following the transforming rule, the LSBs of the stego-image are not always equal to the secret bits, and experiments show that the differences are up to almost 50%. According to mathematical deduction and experimental results, the proposed scheme has the same image quality and payload as the simple LSB substitution scheme. In fact, our proposed data hiding scheme in the case of the G1 (one-bit Gray code) system is equivalent to the simple LSB substitution scheme.
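
    The reflected (binary-reflected) Gray code transform at the heart of the scheme is standard and can be sketched directly; the embedding-rule tables that map secret bits to LSB decisions are in the paper and not reproduced here.

```python
def binary_to_gray(n: int) -> int:
    """Binary-reflected Gray code of n: adjacent codes differ in one bit."""
    return n ^ (n >> 1)

def gray_to_binary(g: int) -> int:
    """Invert the Gray code by folding the bits back down with XOR."""
    n = 0
    while g:
        n ^= g
        g >>= 1
    return n
```

The single-bit-change property between consecutive codes is what lets a Gray-code-driven rule decide whether the stego LSB should equal the secret bit or not, decorrelating the LSB plane from the raw message while keeping the distortion of plain LSB substitution.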

  12. Adaptive steganography

    NASA Astrophysics Data System (ADS)

    Chandramouli, Rajarathnam; Li, Grace; Memon, Nasir D.

    2002-04-01

    Steganalysis techniques attempt to differentiate between stego-objects and cover-objects. In recent work we developed an explicit analytic upper bound for the steganographic capacity of LSB based steganographic techniques for a given false probability of detection. In this paper we look at adaptive steganographic techniques. Adaptive steganographic techniques take explicit steps to escape detection. We explore different techniques that can be used to adapt message embedding to the image content or to a known steganalysis technique. We investigate the advantages of adaptive steganography within an analytical framework. We also give experimental results with a state-of-the-art steganalysis technique demonstrating that adaptive embedding results in a significant number of bits embedded without detection.
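
    One simple form of adapting the embedding to image content restricts LSB changes to high-activity regions. In the sketch below the per-block decision uses the variance of the pixels that embedding never touches, so the extractor can repeat exactly the same decision on the stego signal; the block size, threshold, and decision rule are illustrative assumptions, not the paper's capacity analysis.

```python
def local_variance(block):
    m = sum(block) / len(block)
    return sum((p - m) ** 2 for p in block) / len(block)

def adaptive_embed(pixels, bits, block=4, threshold=10.0):
    """Embed one bit per block, but only in blocks with enough texture."""
    out = list(pixels)
    it = iter(bits)
    for start in range(0, len(out) - block + 1, block):
        # Decide using the untouched pixels so the decision is repeatable.
        if local_variance(out[start + 1:start + block]) > threshold:
            try:
                out[start] = (out[start] & ~1) | next(it)
            except StopIteration:
                break
    return out

def adaptive_extract(stego, n_bits, block=4, threshold=10.0):
    """Repeat the same block decision and read back the embedded LSBs."""
    bits = []
    for start in range(0, len(stego) - block + 1, block):
        if local_variance(stego[start + 1:start + block]) > threshold:
            bits.append(stego[start] & 1)
            if len(bits) == n_bits:
                break
    return bits
```

Skipping smooth blocks is precisely what frustrates steganalyzers that model flat regions well, at the cost of a content-dependent (and therefore smaller) capacity.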

  13. A New Paradigm Hidden in Steganography

    DTIC Science & Technology

    2000-01-01

    In steganography, we do not make the "strong" assumption that Eve has knowledge of the steganographic algorithm. This is why there may, or may not be...the n least significant bits (LSB) of each pixel in the cover-image, with the n most significant bits (MSB) from the corresponding pixel of the image to...e.g., 2 LSB are (0,0)) to 3 (e.g., 2 LSB are (1,1)), it is visually impossible for Eve to detect the steganography. Of course, if Eve has knowl

  14. A comprehensive bibliography of linguistic steganography

    NASA Astrophysics Data System (ADS)

    Bergmair, Richard

    2007-02-01

    In this paper, we attempt to give a comprehensive bibliographic account of the work in linguistic steganography published to date. As the field is still in its infancy, there is no widely accepted publication venue. Relevant work on the subject is scattered throughout the literature on information security, information hiding, imaging and watermarking, cryptology, and natural language processing. Bibliographic references within the field are very sparse. This makes literature research on linguistic steganography a tedious task and a comprehensive bibliography a valuable aid to the researcher.

  15. Quantum Image Steganography and Steganalysis Based On LSQu-Blocks Image Information Concealing Algorithm

    NASA Astrophysics Data System (ADS)

    A. AL-Salhi, Yahya E.; Lu, Songfeng

    2016-08-01

    Quantum steganography can solve some problems that are considered inefficient in image information concealing, and quantum image information concealing has been widely explored in recent years. It can be categorized into quantum image digital blocking, quantum image steganography, anonymity, and other branches. Least significant bit (LSB) information concealing plays a vital role in the classical world, because many image information concealing algorithms are designed based on it. Firstly, based on the novel enhanced quantum representation (NEQR) and image uniform-block clustering, a least significant Qu-block (LSQB) information concealing algorithm for quantum image steganography is presented. Secondly, a clustering algorithm is proposed to optimize the concealment of important data. Thirdly, the Con-Steg algorithm is used to conceal the clustered image blocks. Since information concealing in the Fourier domain of an image can secure the image information, we further discuss a Fourier-domain LSQu-block information concealing algorithm for quantum images based on the Quantum Fourier Transform. In our algorithms, the corresponding unitary transformations are designed to conceal the secret information in the least significant Qu-block representing the color of the quantum cover image. Finally, the procedures for extracting the secret information are illustrated. The quantum image LSQu-block information concealing algorithm can be applied in many fields according to different needs.

  16. Secure steganography designed for mobile platforms

    NASA Astrophysics Data System (ADS)

    Agaian, Sos S.; Cherukuri, Ravindranath; Sifuentes, Ronnie R.

    2006-05-01

    Adaptive steganography, an intelligent approach to message hiding, integrated with matrix encoding and pn-sequences serves as a promising resolution to recent security assurance concerns. Incorporating the above data hiding concepts with established cryptographic protocols in wireless communication would greatly increase the security and privacy of transmitting sensitive information. We present an algorithm which will address the following problems: 1) low embedding capacity in mobile devices due to fixed image dimensions and memory constraints, 2) compatibility between mobile and land based desktop computers, and 3) detection of stego images by widely available steganalysis software [1-3]. Consistent with the smaller available memory, processor capabilities, and limited resolution associated with mobile devices, we propose a more magnified approach to steganography by focusing adaptive efforts at the pixel level. This deeper method, in comparison to the block processing techniques commonly found in existing adaptive methods, allows an increase in capacity while still offering a desired level of security. Based on computer simulations using high resolution, natural imagery and mobile device captured images, comparisons show that the proposed method securely allows an increased amount of embedding capacity but still avoids detection by varying steganalysis techniques.

  17. Steganography and Cryptography Inspired Enhancement of Introductory Programming Courses

    ERIC Educational Resources Information Center

    Kortsarts, Yana; Kempner, Yulia

    2015-01-01

    Steganography is the art and science of concealing communication. The goal of steganography is to hide the very existence of information exchange by embedding messages into unsuspicious digital media covers. Cryptography, or secret writing, is the study of the methods of encryption, decryption and their use in communications protocols.…

  18. Multi-Class Classification for Identifying JPEG Steganography Embedding Methods

    DTIC Science & Technology

    2008-09-01

    B.H. (2000). STEGANOGRAPHY: Hidden Images, A New Challenge in the Fight Against Child Porn. UPDATE, Volume 13, Number 2, pp. 1-4. Retrieved June 3...Other crimes involving the use of steganography include child pornography, where the stego files are used to hide a predator's location when posting

  19. Steganography based on pixel intensity value decomposition

    NASA Astrophysics Data System (ADS)

    Abdulla, Alan Anwar; Sellahewa, Harin; Jassim, Sabah A.

    2014-05-01

    This paper focuses on steganography based on pixel intensity value decomposition. A number of existing schemes such as binary, Fibonacci, Prime, Natural, Lucas, and Catalan-Fibonacci (CF) are evaluated in terms of payload capacity and stego quality. A new technique based on a specific representation is proposed to decompose pixel intensity values into 16 (virtual) bit-planes suitable for embedding purposes. The proposed decomposition has a desirable property whereby the sum of all bit-planes does not exceed the maximum pixel intensity value, i.e. 255. Experimental results demonstrate that the proposed technique offers an effective compromise between payload capacity and stego quality of existing embedding techniques based on pixel intensity value decomposition. Its capacity is equal to that of binary and Lucas, while it offers a higher capacity than Fibonacci, Prime, Natural, and CF when the secret bits are embedded in 1st Least Significant Bit (LSB). When the secret bits are embedded in higher bit-planes, i.e., 2nd LSB to 8th Most Significant Bit (MSB), the proposed scheme has more capacity than Natural numbers based embedding. However, from the 6th bit-plane onwards, the proposed scheme offers better stego quality. In general, the proposed decomposition scheme has less effect in terms of quality on pixel value when compared to most existing pixel intensity value decomposition techniques when embedding messages in higher bit-planes.
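
    The Fibonacci (Zeckendorf) decomposition that several of the evaluated schemes build on can be sketched as follows. This is a minimal illustration of decomposing a pixel value into virtual bit-planes and embedding in the 1st plane, not the authors' proposed 16-plane representation:

```python
FIB = [1, 2, 3, 5, 8, 13, 21, 34, 55, 89, 144, 233]  # Fibonacci weights covering 0..255

def to_fib(v):
    # Greedy Zeckendorf decomposition -> digits, least significant weight first
    digits = []
    for w in reversed(FIB):
        if w <= v:
            digits.append(1)
            v -= w
        else:
            digits.append(0)
    return digits[::-1]

def from_fib(digits):
    # Recompose the pixel value from its virtual bit-planes
    return sum(w for w, d in zip(FIB, digits) if d)

def embed_fib_lsb(v, s):
    # Nearest pixel value whose Zeckendorf LSB (weight-1 digit) equals secret bit s
    for delta in range(256):
        for cand in (v - delta, v + delta):
            if 0 <= cand <= 255 and to_fib(cand)[0] == s:
                return cand
```

Because the Fibonacci weights grow more slowly than powers of two, flipping a given virtual plane perturbs the pixel value less than flipping the corresponding binary plane, which is the capacity/quality trade-off the paper evaluates.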

  20. Application of Genetic Algorithm and Particle Swarm Optimization techniques for improved image steganography systems

    NASA Astrophysics Data System (ADS)

    Jude Hemanth, Duraisamy; Umamaheswari, Subramaniyan; Popescu, Daniela Elena; Naaji, Antoanela

    2016-01-01

    Image steganography is one of the ever-growing computational approaches and has found application in many fields. Frequency-domain techniques are highly preferred for image steganography applications. However, there are significant drawbacks associated with these techniques. In transform-based approaches, the secret data is embedded in a random manner in the transform coefficients of the cover image. These transform coefficients may not be optimal in terms of stego-image quality and embedding capacity. In this work, the application of the Genetic Algorithm (GA) and Particle Swarm Optimization (PSO) has been explored in the context of determining the optimal coefficients in these transforms. Frequency-domain transforms such as the Bandelet Transform (BT) and the Finite Ridgelet Transform (FRIT) are used in combination with GA and PSO to improve the efficiency of the image steganography system.
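
    As a toy illustration of how a GA can search for suitable embedding coefficients, one might evolve a bit mask over coefficients, with fitness counting how many LSB flips the chosen coefficients would need. This is our simplified sketch, not the paper's BT/FRIT pipeline:

```python
import random

def fitness(mask, coeffs, secret):
    # Coefficients selected by the mask host the secret bits;
    # cost = number of coefficients whose LSB must be flipped.
    chosen = [c for c, m in zip(coeffs, mask) if m]
    if len(chosen) < len(secret):
        return -1e9  # infeasible: not enough embedding positions
    return -sum(1 for c, s in zip(chosen, secret) if (int(c) & 1) != s)

def ga_select(coeffs, secret, pop=20, gens=50, seed=1):
    rng = random.Random(seed)
    n = len(coeffs)
    popu = [[rng.randint(0, 1) for _ in range(n)] for _ in range(pop)]
    for _ in range(gens):
        # Elitist selection: keep the best half, refill with crossover + mutation
        popu.sort(key=lambda m: fitness(m, coeffs, secret), reverse=True)
        elite = popu[:pop // 2]
        children = []
        while len(elite) + len(children) < pop:
            a, b = rng.sample(elite, 2)
            cut = rng.randrange(1, n)
            child = a[:cut] + b[cut:]
            child[rng.randrange(n)] ^= 1  # point mutation
            children.append(child)
        popu = elite + children
    return max(popu, key=lambda m: fitness(m, coeffs, secret))
```

A realistic fitness would score stego quality (e.g. a PSNR estimate) rather than raw flip counts, but the selection/crossover/mutation loop is the same.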

  1. Steganography on multiple MP3 files using spread spectrum and Shamir's secret sharing

    NASA Astrophysics Data System (ADS)

    Yoeseph, N. M.; Purnomo, F. A.; Riasti, B. K.; Safiie, M. A.; Hidayat, T. N.

    2016-11-01

    The purpose of steganography is to hide data inside another medium. To increase data security, steganography is often combined with cryptography; the weakness of this combined technique is that the data remain centralized. Therefore, we develop a steganography technique that combines spread spectrum with a secret sharing technique. In steganography with secret sharing, shares of the data are created and hidden in several media. The media used to conceal the shares were MP3 files, the hiding technique was spread spectrum, and the secret sharing scheme was Shamir's Secret Sharing. The results showed that steganography with spread spectrum combined with Shamir's Secret Sharing, using MP3 files as the medium, produces a technique that can hide data in several covers. To extract and reconstruct the hidden data, the number of stego objects must be greater than or equal to the threshold. Furthermore, the stego objects were imperceptible and robust.
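
    Shamir's (k, n) secret sharing, the scheme named in this record, can be sketched over a prime field as follows; the field modulus and function names are our choices:

```python
import random

P = 2**127 - 1  # a Mersenne prime used as the field modulus (our choice)

def make_shares(secret, k, n, seed=None):
    # Random polynomial of degree k-1 with constant term = secret;
    # each share is a point (x, f(x)) on the polynomial.
    rng = random.Random(seed)
    coeffs = [secret] + [rng.randrange(P) for _ in range(k - 1)]
    return [(x, sum(c * pow(x, i, P) for i, c in enumerate(coeffs)) % P)
            for x in range(1, n + 1)]

def reconstruct(shares):
    # Lagrange interpolation at x = 0 over GF(P); any k distinct shares suffice
    total = 0
    for xi, yi in shares:
        num = den = 1
        for xj, _ in shares:
            if xj != xi:
                num = num * (-xj) % P
                den = den * (xi - xj) % P
        total = (total + yi * num * pow(den, -1, P)) % P
    return total
```

In the paper's setting, each share would then be hidden in a different MP3 cover by spread spectrum, so an attacker recovering fewer than k stego files learns nothing about the secret.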

  2. A Novel Quantum Image Steganography Scheme Based on LSB

    NASA Astrophysics Data System (ADS)

    Zhou, Ri-Gui; Luo, Jia; Liu, XingAo; Zhu, Changming; Wei, Lai; Zhang, Xiafen

    2018-06-01

    Based on the NEQR representation of quantum images and the least significant bit (LSB) scheme, a novel quantum image steganography scheme is proposed. The sizes of the cover image and the original information image are assumed to be 4n × 4n and n × n, respectively. Firstly, the bit-plane scrambling method is used to scramble the original information image. Then the scrambled information image is expanded to the same size as the cover image by using a key known only to the operator. The expanded image is scrambled into a meaningless image with the Arnold scrambling. The embedding and extracting procedures are carried out by the keys K1 and K2, which are under the control of the operator. For validation of the presented scheme, the peak signal-to-noise ratio (PSNR), the capacity, the security of the images and the circuit complexity are analyzed.

  3. Fuzzy Logic-Based Audio Pattern Recognition

    NASA Astrophysics Data System (ADS)

    Malcangi, M.

    2008-11-01

    Audio and audio-pattern recognition is becoming one of the most important technologies for automatically controlling embedded systems. Fuzzy logic may be the most important enabling methodology due to its ability to model such applications rapidly and economically. An audio and audio-pattern recognition engine based on fuzzy logic has been developed for use in very low-cost and deeply embedded systems to automate human-to-machine and machine-to-machine interaction. This engine consists of simple digital signal-processing algorithms for feature extraction and normalization, and a set of pattern-recognition rules tuned manually or automatically by a self-learning process.

  4. The Quantum Steganography Protocol via Quantum Noisy Channels

    NASA Astrophysics Data System (ADS)

    Wei, Zhan-Hong; Chen, Xiu-Bo; Niu, Xin-Xin; Yang, Yi-Xian

    2015-08-01

    As a promising branch of quantum information hiding, quantum steganography aims to transmit secret messages covertly over public quantum channels. But due to environmental noise and decoherence, quantum states easily decay and change. Therefore, it is very meaningful to make a quantum information hiding protocol applicable to noisy quantum channels. In this paper, we further investigate a quantum steganography protocol for noisy quantum channels. We prove that the protocol can transmit secret messages covertly over noisy quantum channels, and we present the protocol explicitly. In the protocol, legal receivers can extract the secret message with a certain probability without the cover data being published, which gives the protocol good secrecy. Moreover, our protocol has independent security and can be used in general quantum communications. The communication in our protocol does not need entangled states, so the protocol can be used without the limitation of entanglement resources. More importantly, the protocol applies to noisy quantum channels and can be widely used in future quantum communication.

  5. Quantum watermarking scheme through Arnold scrambling and LSB steganography

    NASA Astrophysics Data System (ADS)

    Zhou, Ri-Gui; Hu, Wenwen; Fan, Ping

    2017-09-01

    Based on the NEQR of quantum images, a new quantum gray-scale image watermarking scheme is proposed through Arnold scrambling and least significant bit (LSB) steganography. The sizes of the carrier image and the watermark image are assumed to be 2n × 2n and n × n, respectively. Firstly, a classical n × n sized watermark image with 8-bit gray scale is expanded to a 2n × 2n sized image with 2-bit gray scale. Secondly, through the module of PA-MOD N, the expanded watermark image is scrambled into a meaningless image by the Arnold transform. Then, the expanded scrambled image is embedded into the carrier image by the LSB steganography method. Finally, a time complexity analysis is given. The simulation results show that our quantum circuit has lower time complexity, and the proposed watermarking scheme is superior to others.
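
    The Arnold transform used for scrambling can be sketched classically: the map (x, y) → (x + y, x + 2y) mod N is a bijection on an N × N grid, so scrambling is exactly invertible. A minimal sketch (function names are ours):

```python
def arnold(img, iters=1):
    # Arnold cat-map scrambling of an N x N image: (x, y) -> (x+y, x+2y) mod N
    n = len(img)
    for _ in range(iters):
        out = [[0] * n for _ in range(n)]
        for y in range(n):
            for x in range(n):
                out[(x + 2 * y) % n][(x + y) % n] = img[y][x]
        img = out
    return img

def arnold_inv(img, iters=1):
    # Inverse map: (x, y) -> (2x - y, -x + y) mod N
    n = len(img)
    for _ in range(iters):
        out = [[0] * n for _ in range(n)]
        for y in range(n):
            for x in range(n):
                out[(-x + y) % n][(2 * x - y) % n] = img[y][x]
        img = out
    return img
```

Because the map matrix [[1, 1], [1, 2]] has determinant 1, iterating it permutes the pixel positions, and the iteration count acts as part of the scrambling key.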

  6. Steganography -- The New Intelligence Threat

    DTIC Science & Technology

    2004-01-01

    Information can be embedded within text files, digital music and videos, and digital photographs by simply changing bits and bytes. HOW IT WORKS...International Airport could be embedded in Brittany Spears’ latest music release in MP3 format. The wide range of steganography capabilities has been

  7. Quantum steganography with large payload based on entanglement swapping of χ-type entangled states

    NASA Astrophysics Data System (ADS)

    Qu, Zhi-Guo; Chen, Xiu-Bo; Luo, Ming-Xing; Niu, Xin-Xin; Yang, Yi-Xian

    2011-04-01

    In this paper, we first propose a new simple method to calculate the entanglement swapping of χ-type entangled states, and then present a novel quantum steganography protocol with large payload. The new protocol adopts entanglement swapping to build a hidden channel within quantum secure direct communication with χ-type entangled states for securely transmitting secret messages. Compared with previous quantum steganography protocols, the capacity of the hidden channel is much higher, increased to eight bits. Meanwhile, due to the quantum uncertainty principle and the no-cloning theorem, its imperceptibility is shown in the analysis to be great, and its security is analyzed in detail; it is proved that intercept-resend attacks, measurement-resend attacks, ancilla attacks, man-in-the-middle attacks and even DoS (denial-of-service) attacks cannot threaten it. As a result, the protocol can be applied in various fields of quantum communication.

  8. A novel quantum steganography scheme for color images

    NASA Astrophysics Data System (ADS)

    Li, Panchi; Liu, Xiande

    In quantum image steganography, embedding capacity and security are two important issues. This paper presents a novel quantum steganography scheme using color images as cover images. First, the secret information is divided into 3-bit segments, and then each 3-bit segment is embedded into the LSB of one color pixel in the cover image according to its own value and using Gray code mapping rules. Extraction is the inverse of embedding. We designed the quantum circuits that implement the embedding and extracting process. The simulation results on a classical computer show that the proposed scheme outperforms several other existing schemes in terms of embedding capacity and security.
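
    The classical core of this embedding, 3-bit segments mapped through Gray code into the R, G, B least significant bits of one color pixel, can be sketched as follows (a classical illustration with our own helper names, not the paper's quantum circuit):

```python
def gray3(v):
    # 3-bit value -> reflected Gray code
    return v ^ (v >> 1)

def inv_gray3(g):
    # Inverse Gray code for a 3-bit value
    g ^= g >> 1
    g ^= g >> 2
    return g

def embed_pixel(rgb, seg):
    # Write the Gray-coded 3-bit segment into the R, G, B least significant bits
    g = gray3(seg)
    return tuple((c & ~1) | ((g >> i) & 1) for i, c in enumerate(rgb))

def extract_pixel(rgb):
    # Collect the channel LSBs and undo the Gray mapping
    g = sum((c & 1) << i for i, c in enumerate(rgb))
    return inv_gray3(g)
```

Each pixel carries three secret bits while each channel value changes by at most 1, which is the capacity/imperceptibility balance the abstract claims.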

  9. Anti-Noise Bidirectional Quantum Steganography Protocol with Large Payload

    NASA Astrophysics Data System (ADS)

    Qu, Zhiguo; Chen, Siyi; Ji, Sai; Ma, Songya; Wang, Xiaojun

    2018-06-01

    An anti-noise bidirectional quantum steganography protocol with large payload is proposed in this paper. In the new protocol, Alice and Bob are able to transmit classical information bits to each other while covertly teleporting secret quantum states. The new protocol introduces bidirectional quantum remote state preparation into bidirectional quantum secure communication, not only expanding the secret information from classical bits to quantum states, but also extracting the phase and amplitude values of the secret quantum state, greatly enlarging the capacity of the secret information. The new protocol also achieves better imperceptibility, since an eavesdropper can hardly detect the hidden channel or obtain effective secret quantum states. Compared with previous quantum steganography achievements, owing to its unique bidirectional design, the new protocol obtains higher transmission efficiency and better availability. Furthermore, theoretical analysis shows that the new algorithm can effectively resist quantum noise. Finally, the performance analysis proves that the new protocol not only has good imperceptibility and high security, but also a large payload.

  10. Anti-Noise Bidirectional Quantum Steganography Protocol with Large Payload

    NASA Astrophysics Data System (ADS)

    Qu, Zhiguo; Chen, Siyi; Ji, Sai; Ma, Songya; Wang, Xiaojun

    2018-03-01

    An anti-noise bidirectional quantum steganography protocol with large payload is proposed in this paper. In the new protocol, Alice and Bob are able to transmit classical information bits to each other while covertly teleporting secret quantum states. The new protocol introduces bidirectional quantum remote state preparation into bidirectional quantum secure communication, not only expanding the secret information from classical bits to quantum states, but also extracting the phase and amplitude values of the secret quantum state, greatly enlarging the capacity of the secret information. The new protocol also achieves better imperceptibility, since an eavesdropper can hardly detect the hidden channel or obtain effective secret quantum states. Compared with previous quantum steganography achievements, owing to its unique bidirectional design, the new protocol obtains higher transmission efficiency and better availability. Furthermore, theoretical analysis shows that the new algorithm can effectively resist quantum noise. Finally, the performance analysis proves that the new protocol not only has good imperceptibility and high security, but also a large payload.

  11. Audio CAPTCHA for SIP-Based VoIP

    NASA Astrophysics Data System (ADS)

    Soupionis, Yannis; Tountas, George; Gritzalis, Dimitris

    Voice over IP (VoIP) introduces new ways of communication, while utilizing existing data networks to provide inexpensive voice communications worldwide as a promising alternative to the traditional PSTN telephony. SPam over Internet Telephony (SPIT) is one potential source of future annoyance in VoIP. A common way to launch a SPIT attack is the use of an automated procedure (bot), which generates calls and produces audio advertisements. In this paper, our goal is to design appropriate CAPTCHA to fight such bots. We focus on and develop audio CAPTCHA, as the audio format is more suitable for VoIP environments and we implement it in a SIP-based VoIP environment. Furthermore, we suggest and evaluate the specific attributes that audio CAPTCHA should incorporate in order to be effective, and test it against an open source bot implementation.

  12. Subjective audio quality evaluation of embedded-optimization-based distortion precompensation algorithms.

    PubMed

    Defraene, Bruno; van Waterschoot, Toon; Diehl, Moritz; Moonen, Marc

    2016-07-01

    Subjective audio quality evaluation experiments have been conducted to assess the performance of embedded-optimization-based precompensation algorithms for mitigating perceptible linear and nonlinear distortion in audio signals. It is concluded with statistical significance that the perceived audio quality is improved by applying an embedded-optimization-based precompensation algorithm, both in the case where (i) nonlinear distortion is present and where (ii) a combination of linear and nonlinear distortion is present. Moreover, a significant positive correlation is reported between the collected subjective and objective PEAQ audio quality scores, supporting the validity of using PEAQ to predict the impact of linear and nonlinear distortion on perceived audio quality.

  13. TECHNICAL NOTE: Portable audio electronics for impedance-based measurements in microfluidics

    NASA Astrophysics Data System (ADS)

    Wood, Paul; Sinton, David

    2010-08-01

    We demonstrate the use of audio electronics-based signals to perform on-chip electrochemical measurements. Cell phones and portable music players are examples of consumer electronics that are easily operated and are ubiquitous worldwide. Audio output (play) and input (record) signals are voltage based and contain frequency and amplitude information. A cell phone, laptop soundcard and two compact audio players are compared with respect to frequency response; the laptop soundcard provides the most uniform frequency response, while the cell phone performance is found to be insufficient. The audio signals in the common portable music players and laptop soundcard operate in the range of 20 Hz to 20 kHz and are found to be applicable, as voltage input and output signals, to impedance-based electrochemical measurements in microfluidic systems. Validated impedance-based measurements of concentration (0.1-50 mM), flow rate (2-120 µL min-1) and particle detection (32 µm diameter) are demonstrated. The prevailing, lossless, wave audio file format is found to be suitable for data transmission to and from external sources, such as a centralized lab, and the cost of all hardware (in addition to audio devices) is ~10 USD. The utility demonstrated here, in combination with the ubiquitous nature of portable audio electronics, presents new opportunities for impedance-based measurements in portable microfluidic systems.

  14. Video content parsing based on combined audio and visual information

    NASA Astrophysics Data System (ADS)

    Zhang, Tong; Kuo, C.-C. Jay

    1999-08-01

    While previous research on audiovisual data segmentation and indexing has primarily focused on the pictorial part, significant clues contained in the accompanying audio flow are often ignored. A fully functional system for video content parsing can be achieved more successfully through a proper combination of audio and visual information. By investigating the data structure of different video types, we present tools for both audio and visual content analysis and a scheme for video segmentation and annotation. In the proposed system, video data are segmented into audio scenes and visual shots by detecting abrupt changes in audio and visual features, respectively. Then, each audio scene is categorized and indexed as one of the basic audio types, while a visual shot is represented by keyframes and associated image features. An index table is then generated automatically for each video clip based on the integration of outputs from the audio and visual analysis. It is shown that the proposed system provides satisfying video indexing results.

  15. pyAudioAnalysis: An Open-Source Python Library for Audio Signal Analysis.

    PubMed

    Giannakopoulos, Theodoros

    2015-01-01

    Audio information plays a rather important role in the increasing digital content that is available today, resulting in a need for methodologies that automatically analyze such content: audio event recognition for home automations and surveillance systems, speech recognition, music information retrieval, multimodal analysis (e.g. audio-visual analysis of online videos for content-based recommendation), etc. This paper presents pyAudioAnalysis, an open-source Python library that provides a wide range of audio analysis procedures including: feature extraction, classification of audio signals, supervised and unsupervised segmentation and content visualization. pyAudioAnalysis is licensed under the Apache License and is available at GitHub (https://github.com/tyiannak/pyAudioAnalysis/). Here we present the theoretical background behind the wide range of the implemented methodologies, along with evaluation metrics for some of the methods. pyAudioAnalysis has been already used in several audio analysis research applications: smart-home functionalities through audio event detection, speech emotion recognition, depression classification based on audio-visual features, music segmentation, multimodal content-based movie recommendation and health applications (e.g. monitoring eating habits). The feedback provided from all these particular audio applications has led to practical enhancement of the library.
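
    The short-term analysis that pyAudioAnalysis performs can be illustrated with a dependency-free sketch: split the signal into overlapping frames and compute per-frame energy and zero-crossing rate, two of the library's short-term features. This is our simplified illustration of the idea, not the library's actual API:

```python
def frame_features(signal, fs, win=0.050, step=0.025):
    # Overlapping short-term frames (50 ms window, 25 ms step);
    # per-frame mean energy and zero-crossing rate.
    w, s = int(win * fs), int(step * fs)
    feats = []
    for start in range(0, len(signal) - w + 1, s):
        frame = signal[start:start + w]
        energy = sum(x * x for x in frame) / len(frame)
        zcr = sum(1 for a, b in zip(frame, frame[1:]) if a * b < 0) / (len(frame) - 1)
        feats.append((energy, zcr))
    return feats
```

For a unit-amplitude 440 Hz sine at 8 kHz, per-frame energy comes out close to 0.5 and the zero-crossing rate close to 2 x 440 / 8000 ≈ 0.11; feature sequences like these feed the classification and segmentation stages described above.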

  16. pyAudioAnalysis: An Open-Source Python Library for Audio Signal Analysis

    PubMed Central

    Giannakopoulos, Theodoros

    2015-01-01

    Audio information plays a rather important role in the increasing digital content that is available today, resulting in a need for methodologies that automatically analyze such content: audio event recognition for home automations and surveillance systems, speech recognition, music information retrieval, multimodal analysis (e.g. audio-visual analysis of online videos for content-based recommendation), etc. This paper presents pyAudioAnalysis, an open-source Python library that provides a wide range of audio analysis procedures including: feature extraction, classification of audio signals, supervised and unsupervised segmentation and content visualization. pyAudioAnalysis is licensed under the Apache License and is available at GitHub (https://github.com/tyiannak/pyAudioAnalysis/). Here we present the theoretical background behind the wide range of the implemented methodologies, along with evaluation metrics for some of the methods. pyAudioAnalysis has been already used in several audio analysis research applications: smart-home functionalities through audio event detection, speech emotion recognition, depression classification based on audio-visual features, music segmentation, multimodal content-based movie recommendation and health applications (e.g. monitoring eating habits). The feedback provided from all these particular audio applications has led to practical enhancement of the library. PMID:26656189

  17. StirMark Benchmark: audio watermarking attacks based on lossy compression

    NASA Astrophysics Data System (ADS)

    Steinebach, Martin; Lang, Andreas; Dittmann, Jana

    2002-04-01

    StirMark Benchmark is a well-known evaluation tool for watermarking robustness. Additional attacks are added to it continuously. To enable application based evaluation, in our paper we address attacks against audio watermarks based on lossy audio compression algorithms to be included in the test environment. We discuss the effect of different lossy compression algorithms like MPEG-2 audio Layer 3, Ogg or VQF on a selection of audio test data. Our focus is on changes regarding the basic characteristics of the audio data like spectrum or average power and on removal of embedded watermarks. Furthermore we compare results of different watermarking algorithms and show that lossy compression is still a challenge for most of them. There are two strategies for adding evaluation of robustness against lossy compression to StirMark Benchmark: (a) use of existing free compression algorithms (b) implementation of a generic lossy compression simulation. We discuss how such a model can be implemented based on the results of our tests. This method is less complex, as no real psycho acoustic model has to be applied. Our model can be used for audio watermarking evaluation of numerous application fields. As an example, we describe its importance for e-commerce applications with watermarking security.

  18. Audio-based queries for video retrieval over Java enabled mobile devices

    NASA Astrophysics Data System (ADS)

    Ahmad, Iftikhar; Cheikh, Faouzi Alaya; Kiranyaz, Serkan; Gabbouj, Moncef

    2006-02-01

    In this paper we propose a generic framework for efficient retrieval of audiovisual media based on its audio content. This framework is implemented in a client-server architecture, where the client application is developed in Java to be platform independent, whereas the server application is implemented for the PC platform. The client application adapts to the characteristics of the mobile device where it runs, such as screen size and commands. The entire framework is designed to take advantage of high-level segmentation and classification of audio content to improve the speed and accuracy of audio-based media retrieval. Therefore, the primary objective of this framework is to provide an adaptive basis for performing efficient video retrieval operations based on audio content and types (i.e. speech, music, fuzzy and silence). Experimental results confirm that such an audio-based video retrieval scheme can be used from mobile devices to search and retrieve video clips efficiently over wireless networks.

  19. DWT-Based High Capacity Audio Watermarking

    NASA Astrophysics Data System (ADS)

    Fallahpour, Mehdi; Megías, David

    This letter suggests a novel high-capacity robust audio watermarking algorithm that uses the high-frequency band of the wavelet decomposition, for which the human auditory system (HAS) is not very sensitive to alteration. The main idea is to divide the high-frequency band into frames and then, for embedding, change the wavelet samples based on the average of the relevant frame. The experimental results show that the method has very high capacity (about 5.5 kbps) without significant perceptual distortion (ODG in [-1, 0] and SNR about 33 dB), and provides robustness against common audio signal processing such as added noise, filtering, echo and MPEG compression (MP3).
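
    The frame-average idea can be sketched with a one-level Haar DWT and a quantization-style rule on each high-band frame's mean. This is a simplified sketch under our own parameter choices; the letter's exact embedding rule and parameters differ:

```python
import math

R2 = math.sqrt(2.0)

def haar(x):
    # One-level Haar DWT: low band (averages) and high band (differences)
    lo = [(a + b) / R2 for a, b in zip(x[::2], x[1::2])]
    hi = [(a - b) / R2 for a, b in zip(x[::2], x[1::2])]
    return lo, hi

def ihaar(lo, hi):
    # Inverse one-level Haar DWT
    out = []
    for l, h in zip(lo, hi):
        out += [(l + h) / R2, (l - h) / R2]
    return out

def embed(audio, bits, frame=64, delta=0.1):
    # Shift each high-band frame so its mean quantizes to a value
    # whose parity encodes one watermark bit.
    lo, hi = haar(audio)
    for k, b in enumerate(bits):
        seg = hi[k * frame:(k + 1) * frame]
        m = sum(seg) / frame
        q = round(m / delta)
        if q % 2 != b:
            q += 1
        shift = q * delta - m
        hi[k * frame:(k + 1) * frame] = [v + shift for v in seg]
    return ihaar(lo, hi)

def extract(stego, nbits, frame=64, delta=0.1):
    # Recover each bit from the parity of the quantized frame mean
    _, hi = haar(stego)
    return [round(sum(hi[k * frame:(k + 1) * frame]) / frame / delta) % 2
            for k in range(nbits)]
```

Because only the average of each high-band frame is constrained, the per-sample change stays small, which is the transparency argument behind frame-average embedding.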

  20. Content-based audio authentication using a hierarchical patchwork watermark embedding

    NASA Astrophysics Data System (ADS)

    Gulbis, Michael; Müller, Erika

    2010-05-01

    Content-based audio authentication watermarking techniques extract perceptual relevant audio features, which are robustly embedded into the audio file to protect. Manipulations of the audio file are detected on the basis of changes between the original embedded feature information and the anew extracted features during verification. The main challenges of content-based watermarking are on the one hand the identification of a suitable audio feature to distinguish between content preserving and malicious manipulations. On the other hand the development of a watermark, which is robust against content preserving modifications and able to carry the whole authentication information. The payload requirements are significantly higher compared to transaction watermarking or copyright protection. Finally, the watermark embedding should not influence the feature extraction to avoid false alarms. Current systems still lack a sufficient alignment of watermarking algorithm and feature extraction. In previous work we developed a content-based audio authentication watermarking approach. The feature is based on changes in DCT domain over time. A patchwork algorithm based watermark was used to embed multiple one bit watermarks. The embedding process uses the feature domain without inflicting distortions to the feature. The watermark payload is limited by the feature extraction, more precisely the critical bands. The payload is inverse proportional to segment duration of the audio file segmentation. Transparency behavior was analyzed in dependence of segment size and thus the watermark payload. At a segment duration of about 20 ms the transparency shows an optimum (measured in units of Objective Difference Grade). Transparency and/or robustness are fast decreased for working points beyond this area. Therefore, these working points are unsuitable to gain further payload, needed for the embedding of the whole authentication information. 
In this paper we present a hierarchical extension
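
    The patchwork principle behind the embedded one-bit watermarks can be sketched in a few lines. The following is a minimal time-domain illustration, not the authors' DCT-domain variant; the function names, the key-seeded permutation, and the `strength` parameter are illustrative assumptions, and NumPy is assumed:

```python
import numpy as np

def patchwork_embed(samples, key, bit, strength=0.01):
    """Embed one bit: nudge key-selected set A up and set B down
    (or vice versa for a 0 bit). Illustrative time-domain sketch."""
    rng = np.random.default_rng(key)
    idx = rng.permutation(len(samples))
    half = len(samples) // 2
    a, b = idx[:half], idx[half:2 * half]
    d = strength if bit else -strength
    out = np.asarray(samples, dtype=float).copy()
    out[a] += d
    out[b] -= d
    return out

def patchwork_detect(samples, key):
    """Recover the bit from the sign of mean(A) - mean(B);
    without the key, this statistic is centred on zero."""
    rng = np.random.default_rng(key)
    idx = rng.permutation(len(samples))
    half = len(samples) // 2
    a, b = idx[:half], idx[half:2 * half]
    return int(samples[a].mean() - samples[b].mean() > 0)
```

    Detection relies on the key reproducing the same two sample sets; an observer without the key sees only a zero-mean difference statistic.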

  1. Paper-Based Textbooks with Audio Support for Print-Disabled Students.

    PubMed

    Fujiyoshi, Akio; Ohsawa, Akiko; Takaira, Takuya; Tani, Yoshiaki; Fujiyoshi, Mamoru; Ota, Yuko

    2015-01-01

    Utilizing invisible 2-dimensional codes and digital audio players with a 2-dimensional code scanner, we developed paper-based textbooks with audio support for students with print disabilities, called "multimodal textbooks." Multimodal textbooks can be read with the combination of the two modes: "reading printed text" and "listening to the speech of the text from a digital audio player with a 2-dimensional code scanner." Since multimodal textbooks look the same as regular textbooks and the price of a digital audio player is reasonable (about 30 euro), we think multimodal textbooks are suitable for students with print disabilities in ordinary classrooms.

  2. Design of batch audio/video conversion platform based on JavaEE

    NASA Astrophysics Data System (ADS)

    Cui, Yansong; Jiang, Lianpin

    2018-03-01

    With the rapid development of the digital publishing industry, audio/video publishing exhibits significant features such as a diversity of coding standards for audio and video files and massive data volumes. Faced with massive and diverse data, converting quickly and efficiently to a unified coding format poses great difficulty for digital publishing organizations. In view of this demand, and based on the Spring+SpringMVC+Mybatis development architecture combined with the open-source FFMPEG format conversion tool, a distributed online audio and video format conversion platform with a B/S structure is proposed. Based on the Java language, this paper analyzes the key technologies and strategies in the design of the platform architecture, and designs and develops an efficient audio and video format conversion system composed of a front-end display system, a core scheduling server, and a conversion server. The test results show that, compared with an ordinary audio and video conversion scheme, the batch conversion platform can effectively improve the conversion efficiency of audio and video files and reduce the complexity of the work. Practice has proved that the key technologies discussed in this paper can be applied in the field of large-batch file processing and have practical application value.
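
    The conversion server's core job, shelling out to FFMPEG with a target codec and bitrate, can be sketched as follows. This is an illustrative Python sketch, not the platform's Java code; only the standard FFMPEG `-i`, `-c:a`, and `-b:a` flags are assumed, and the function names are ours:

```python
import shlex
import subprocess

def build_ffmpeg_cmd(src, dst, acodec="aac", abitrate="128k"):
    """Assemble an FFMPEG argv that transcodes `src` into a unified
    audio codec/bitrate at `dst` (standard -i / -c:a / -b:a flags)."""
    return ["ffmpeg", "-y", "-i", src, "-c:a", acodec, "-b:a", abitrate, dst]

def convert(src, dst):
    """Run one conversion; a core scheduling server could fan such
    calls out across conversion servers for batch throughput."""
    cmd = build_ffmpeg_cmd(src, dst)
    print("running:", shlex.join(cmd))
    return subprocess.run(cmd, capture_output=True).returncode
```

    In a distributed setup, the scheduler only needs to dispatch (src, dst) pairs; each worker builds and runs its own command line.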

  3. The Use of Audio and Animation in Computer Based Instruction.

    ERIC Educational Resources Information Center

    Koroghlanian, Carol; Klein, James D.

    This study investigated the effects of audio, animation, and spatial ability in a computer-based instructional program for biology. The program presented instructional material via text or audio with lean text and included eight instructional sequences presented either via static illustrations or animations. High school students enrolled in a…

  4. Improved diagonal queue medical image steganography using Chaos theory, LFSR, and Rabin cryptosystem.

    PubMed

    Jain, Mamta; Kumar, Anil; Choudhary, Rishabh Charan

    2017-06-01

    In this article, we propose an improved diagonal queue medical image steganography for the transmission of patients' secret medical data, using a chaotic standard map, a linear feedback shift register, and the Rabin cryptosystem, improving on a previous technique (Jain and Lenka in Springer Brain Inform 3:39-51, 2016). The proposed algorithm comprises four stages: generation of pseudo-random sequences (by the linear feedback shift register and the standard chaotic map), permutation and XORing using the pseudo-random sequences, encryption using the Rabin cryptosystem, and steganography using the improved diagonal queues. Security analysis has been carried out, and performance is assessed using MSE, PSNR, and maximum embedding capacity, as well as histogram analysis between various brain-disease stego and cover images.
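
    One of the four stages, pseudo-random sequence generation by a linear feedback shift register, can be sketched as follows. This is a generic maximal-length Fibonacci LFSR; the paper's register width and taps are not given in the abstract, so the 4-bit polynomial x⁴+x³+1 below is an illustrative choice:

```python
def lfsr(seed, taps, nbits, width=16):
    """Fibonacci LFSR: emit the LSB, then shift right and feed back
    the XOR of the tapped bits (taps given as polynomial exponents)."""
    state = seed
    out = []
    for _ in range(nbits):
        out.append(state & 1)
        fb = 0
        for t in taps:
            fb ^= (state >> (width - t)) & 1
        state = (state >> 1) | (fb << (width - 1))
    return out
```

    With maximal-length taps, a width-w register cycles through all 2^w - 1 nonzero states before its output repeats, which is what makes the sequence useful for permutation and XORing.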

  5. A review on "A Novel Technique for Image Steganography Based on Block-DCT and Huffman Encoding"

    NASA Astrophysics Data System (ADS)

    Das, Rig; Tuithung, Themrichon

    2013-03-01

    This paper reviews the embedding and extraction algorithm proposed by A. Nag, S. Biswas, D. Sarkar and P. P. Sarkar in "A Novel Technique for Image Steganography Based on Block-DCT and Huffman Encoding" (International Journal of Computer Science and Information Technology, Volume 2, Number 3, June 2010) [3] and shows that extraction of the secret image is not possible for the algorithm proposed in [3]. An 8-bit cover image of size is divided into non-overlapping blocks and a two-dimensional Discrete Cosine Transformation (2-D DCT) is performed on each of the blocks. Huffman encoding is performed on an 8-bit secret image of size, and each bit of the Huffman-encoded bit stream is embedded in the frequency domain by altering the LSB of the DCT coefficients of the cover image blocks. The Huffman-encoded bit stream and Huffman table
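
    The Huffman-encoding stage of the reviewed scheme can be sketched with a generic byte-wise Huffman coder using only the Python standard library; the helper names are illustrative, not taken from [3]:

```python
import heapq
from collections import Counter

def huffman_code(data):
    """Build a Huffman code table {symbol: bitstring} from symbol counts."""
    heap = [[freq, i, {sym: ""}] for i, (sym, freq) in enumerate(Counter(data).items())]
    heapq.heapify(heap)
    if len(heap) == 1:  # degenerate single-symbol input
        return {next(iter(heap[0][2])): "0"}
    tie = len(heap)     # unique tiebreaker so dicts are never compared
    while len(heap) > 1:
        f1, _, t1 = heapq.heappop(heap)
        f2, _, t2 = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in t1.items()}
        merged.update({s: "1" + c for s, c in t2.items()})
        heapq.heappush(heap, [f1 + f2, tie, merged])
        tie += 1
    return heap[0][2]

def huffman_encode(data, table):
    return "".join(table[s] for s in data)

def huffman_decode(bits, table):
    """Greedy prefix decoding -- valid because Huffman codes are prefix-free."""
    inv = {c: s for s, c in table.items()}
    out, cur = [], ""
    for b in bits:
        cur += b
        if cur in inv:
            out.append(inv[cur])
            cur = ""
    return out
```

    In the reviewed scheme, the resulting bit stream (plus the table needed to invert it) is what gets embedded into the LSBs of the DCT coefficients.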

  6. Audio-visual imposture

    NASA Astrophysics Data System (ADS)

    Karam, Walid; Mokbel, Chafic; Greige, Hanna; Chollet, Gerard

    2006-05-01

    A GMM-based audio-visual speaker verification system is described, and an Active Appearance Model with a linear speaker transformation system is used to evaluate the robustness of the verification. An Active Appearance Model (AAM) is used to automatically locate and track a speaker's face in a video recording. A Gaussian Mixture Model (GMM) based classifier (BECARS) is used for face verification. GMM training and testing are accomplished on DCT-based features extracted from the detected faces. On the audio side, speech features are extracted and used for speaker verification with the GMM-based classifier. Fusion of the audio and video modalities for audio-visual speaker verification is compared with face verification and speaker verification systems. To improve the robustness of the multimodal biometric identity verification system, an audio-visual imposture system is envisioned. It consists of an automatic voice transformation technique that an impostor may use to assume the identity of an authorized client. Features of the transformed voice are then combined with the corresponding appearance features and fed into the GMM-based system BECARS for training. An attempt is made to increase the acceptance rate of the impostor and to analyze the robustness of the verification system. Experiments are being conducted on the BANCA database, with a prospect of experimenting on the newly developed PDAtabase created within the scope of the SecurePhone project.

  7. Audio-visual biofeedback for respiratory-gated radiotherapy: Impact of audio instruction and audio-visual biofeedback on respiratory-gated radiotherapy

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    George, Rohini; Department of Biomedical Engineering, Virginia Commonwealth University, Richmond, VA; Chung, Theodore D.

    2006-07-01

    Purpose: Respiratory gating is a commercially available technology for reducing the deleterious effects of motion during imaging and treatment. The efficacy of gating is dependent on the reproducibility within and between respiratory cycles during imaging and treatment. The aim of this study was to determine whether audio-visual biofeedback can improve respiratory reproducibility by decreasing residual motion and therefore increasing the accuracy of gated radiotherapy. Methods and Materials: A total of 331 respiratory traces were collected from 24 lung cancer patients. The protocol consisted of five breathing training sessions spaced about a week apart. Within each session the patients initially breathed without any instruction (free breathing), then with audio instructions, and then with audio-visual biofeedback. Residual motion was quantified by the standard deviation of the respiratory signal within the gating window. Results: Audio-visual biofeedback significantly reduced residual motion compared with free breathing and audio instruction. Displacement-based gating has lower residual motion than phase-based gating. Little reduction in residual motion was found for duty cycles less than 30%; for duty cycles above 50% there was a sharp increase in residual motion. Conclusions: The efficiency and reproducibility of gating can be improved by incorporating audio-visual biofeedback, using a 30-50% duty cycle, gating during exhalation, and using displacement-based gating.

  8. A novel fuzzy logic-based image steganography method to ensure medical data security.

    PubMed

    Karakış, R; Güler, I; Çapraz, I; Bilir, E

    2015-12-01

    This study aims to secure medical data by combining them into one file format using steganographic methods. The electroencephalogram (EEG) is selected as the hidden data, and magnetic resonance (MR) images are used as the cover images. In addition to the EEG, the message is composed of the doctor's comments and patient information in the file header of the images. Two new image steganography methods based on fuzzy logic and similarity are proposed to select the non-sequential least significant bits (LSB) of image pixels. The similarity values of the gray levels in the pixels are used to hide the message. The message is secured against attacks by using lossless compression and symmetric encryption algorithms. Stego image quality is measured by mean square error (MSE), peak signal-to-noise ratio (PSNR), structural similarity measure (SSIM), universal quality index (UQI), and correlation coefficient (R). According to the obtained results, the proposed method ensures the confidentiality of the patient information and increases the data repository and transmission capacity of both MR images and EEG signals.
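
    The non-sequential LSB idea can be illustrated with a key-seeded random pixel selection standing in for the paper's fuzzy-logic/similarity selection (which is not reproducible from the abstract); function names are illustrative and NumPy is assumed:

```python
import numpy as np

def lsb_embed(cover, bits, key):
    """Hide bits in the LSBs of key-selected, non-sequential pixels."""
    flat = cover.astype(np.uint8).flatten()          # copy of the cover
    idx = np.random.default_rng(key).permutation(flat.size)[:len(bits)]
    flat[idx] = (flat[idx] & 0xFE) | np.asarray(bits, dtype=np.uint8)
    return flat.reshape(cover.shape)

def lsb_extract(stego, nbits, key):
    """Re-derive the same pixel order from the key and read the LSBs."""
    flat = stego.flatten()
    idx = np.random.default_rng(key).permutation(flat.size)[:nbits]
    return (flat[idx] & 1).tolist()

def psnr(a, b):
    """Stego-quality metric used in the paper (8-bit images)."""
    mse = np.mean((a.astype(float) - b.astype(float)) ** 2)
    return float("inf") if mse == 0 else 10 * np.log10(255 ** 2 / mse)
```

    Because at most one bit per selected pixel changes, the stego PSNR stays very high, which is the property the paper's metrics quantify.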

  9. Audio-Visual Speaker Diarization Based on Spatiotemporal Bayesian Fusion.

    PubMed

    Gebru, Israel D; Ba, Sileye; Li, Xiaofei; Horaud, Radu

    2018-05-01

    Speaker diarization consists of assigning speech signals to people engaged in a dialogue. An audio-visual spatiotemporal diarization model is proposed. The model is well suited for challenging scenarios that consist of several participants engaged in multi-party interaction while they move around and turn their heads towards the other participants rather than facing the cameras and the microphones. Multiple-person visual tracking is combined with multiple speech-source localization in order to tackle the speech-to-person association problem. The latter is solved within a novel audio-visual fusion method on the following grounds: binaural spectral features are first extracted from a microphone pair, then a supervised audio-visual alignment technique maps these features onto an image, and finally a semi-supervised clustering method assigns binaural spectral features to visible persons. The main advantage of this method over previous work is that it processes in a principled way speech signals uttered simultaneously by multiple persons. The diarization itself is cast into a latent-variable temporal graphical model that infers speaker identities and speech turns, based on the output of an audio-visual association process, executed at each time slice, and on the dynamics of the diarization variable itself. The proposed formulation yields an efficient exact inference procedure. A novel dataset that contains audio-visual training data, as well as a number of scenarios involving several participants engaged in formal and informal dialogue, is introduced. The proposed method is thoroughly tested and benchmarked with respect to several state-of-the-art diarization algorithms.

  10. Hierarchical structure for audio-video based semantic classification of sports video sequences

    NASA Astrophysics Data System (ADS)

    Kolekar, M. H.; Sengupta, S.

    2005-07-01

    A hierarchical structure for sports event classification based on audio and video content analysis is proposed in this paper. Compared to event classification in other games, that of cricket is very challenging and as yet unexplored. We have successfully solved the cricket video classification problem using a six-level hierarchical structure. The first level performs event detection based on audio energy and the Zero Crossing Rate (ZCR) of the short-time audio signal. In the subsequent levels, we classify the events based on video features using a Hidden Markov Model implemented through Dynamic Programming (HMM-DP), with color or motion as a likelihood function. For some game-specific decisions, a rule-based classification is also performed. Our proposed hierarchical structure can easily be applied to other sports. Our results are very promising, and we have moved a step forward towards addressing semantic classification problems in general.
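
    The level-1 cues, short-time energy and zero-crossing rate, can be computed per frame as below; the frame and hop lengths are illustrative assumptions, and NumPy is assumed:

```python
import numpy as np

def short_time_features(signal, frame_len=400, hop=200):
    """Per-frame (energy, ZCR) pairs: the level-1 cues used to flag
    candidate audio events before video-based classification."""
    feats = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = signal[start:start + frame_len]
        energy = float(np.sum(frame ** 2))
        # fraction of sample-to-sample sign changes within the frame
        zcr = float(np.mean(np.abs(np.diff(np.sign(frame))) > 0))
        feats.append((energy, zcr))
    return feats
```

    High energy with low ZCR typically marks impact-like events (bat on ball, crowd roar), while high ZCR marks noisy or fricative content; thresholds on these pairs drive the first level of the hierarchy.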

  11. Website-based PNG image steganography using the modified Vigenere Cipher, least significant bit, and dictionary based compression methods

    NASA Astrophysics Data System (ADS)

    Rojali, Salman, Afan Galih; George

    2017-08-01

    Along with the development of information technology to meet various needs, adverse actions that are difficult to avoid are also emerging; one such action is data theft. Therefore, this study discusses cryptography and steganography, which aim to overcome these problems. This study uses the modified Vigenere cipher, least significant bit, and dictionary-based compression methods. To determine the performance of the study, the Peak Signal-to-Noise Ratio (PSNR) method is used for objective measurement and the Mean Opinion Score (MOS) method for subjective measurement; the performance is also compared to other methods such as spread spectrum and pixel value differencing. After comparison, it can be concluded that this study provides better performance than the other methods (spread spectrum and pixel value differencing), with a range of MSE values (0.0191622-0.05275) and PSNR (60.909 to 65.306) for a hidden file size of 18 kb, and a MOS value range (4.214 to 4.722), i.e., image quality approaching very good.
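
    The paper's modification of the Vigenere cipher is not specified in the abstract; as a baseline illustration of the cipher being modified, a classic byte-wise Vigenere over the full byte range (mod 256) looks like this:

```python
def vigenere(data: bytes, key: bytes, decrypt=False) -> bytes:
    """Classic Vigenere: add (or subtract) the repeating key, mod 256.
    Illustrative baseline, not the paper's modified variant."""
    sign = -1 if decrypt else 1
    return bytes((b + sign * key[i % len(key)]) % 256
                 for i, b in enumerate(data))
```

    In the paper's pipeline, the ciphertext produced at this stage would then be compressed and embedded into the PNG cover via LSB substitution.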

  12. Impact of audio narrated animation on students' understanding and learning environment based on gender

    NASA Astrophysics Data System (ADS)

    Nasrudin, Ajeng Ratih; Setiawan, Wawan; Sanjaya, Yayan

    2017-05-01

    This study examines the impact of audio-narrated animation on students' understanding of the human respiratory system, by gender. It was conducted in the eighth grade of a junior high school and aims to investigate differences in students' understanding and learning environment between boys' and girls' classes when learning the human respiratory system using audio-narrated animation. The research method used is a quasi-experiment with a matching pre-test post-test comparison group design. The procedures of the study are: (1) preliminary study and learning habituation using audio-narrated animation; (2) implementation of learning using audio-narrated animation and data collection; (3) analysis and discussion. The analysis shows that there is a significant difference in students' understanding and learning environment between boys' and girls' classes, both in general and specifically in achieving learning indicators. The discussion relates to the impact of audio-narrated animation, gender characteristics, and the constructivist learning environment. It can be concluded that there is a significant difference in students' understanding between boys' and girls' classes when learning the human respiratory system using audio-narrated animation. Additionally, based on interpretation of students' responses, there is a difference in the increase of agreement level regarding the learning environment.

  13. Three-Dimensional Audio Client Library

    NASA Technical Reports Server (NTRS)

    Rizzi, Stephen A.

    2005-01-01

    The Three-Dimensional Audio Client Library (3DAudio library) is a group of software routines written to facilitate development of both stand-alone (audio only) and immersive virtual-reality application programs that utilize three-dimensional audio displays. The library is intended to enable the development of three-dimensional audio client application programs by use of a code base common to multiple audio server computers. The 3DAudio library calls vendor-specific audio client libraries and currently supports the AuSIM Gold-Server and Lake Huron audio servers. 3DAudio library routines contain common functions for (1) initiation and termination of a client/audio server session, (2) configuration-file input, (3) positioning functions, (4) coordinate transformations, (5) audio transport functions, (6) rendering functions, (7) debugging functions, and (8) event-list-sequencing functions. The 3DAudio software is written in the C++ programming language and currently operates under the Linux, IRIX, and Windows operating systems.
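
    As an illustration of the positioning and coordinate-transformation functions such a client library provides (the actual 3DAudio API is not shown here; the function name and yaw-only rotation are our assumptions), a listener-relative transform might look like:

```python
import math

def source_to_listener(src, listener_pos, yaw_deg):
    """Transform a world-space source position into listener-relative
    azimuth/elevation/distance, assuming a yaw-only head rotation."""
    dx, dy, dz = (s - l for s, l in zip(src, listener_pos))
    yaw = math.radians(yaw_deg)
    # rotate the world-frame offset into the listener's heading frame
    x = dx * math.cos(yaw) + dy * math.sin(yaw)
    y = -dx * math.sin(yaw) + dy * math.cos(yaw)
    dist = math.sqrt(x * x + y * y + dz * dz)
    azimuth = math.degrees(math.atan2(y, x))
    elevation = math.degrees(math.asin(dz / dist)) if dist else 0.0
    return azimuth, elevation, dist
```

    A client library performs this kind of transform before handing source positions to the audio server, so that rendering stays correct as the listener moves or turns.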

  14. Estimation of inhalation flow profile using audio-based methods to assess inhaler medication adherence.

    PubMed

    Taylor, Terence E; Lacalle Muls, Helena; Costello, Richard W; Reilly, Richard B

    2018-01-01

    Asthma and chronic obstructive pulmonary disease (COPD) patients are required to inhale forcefully and deeply to receive medication when using a dry powder inhaler (DPI). There is a clinical need to objectively monitor the inhalation flow profile of DPIs in order to remotely monitor patient inhalation technique. Audio-based methods have been previously employed to accurately estimate flow parameters such as the peak inspiratory flow rate of inhalations; however, these methods required multiple calibration inhalation audio recordings. In this study, an audio-based method is presented that accurately estimates the inhalation flow profile using only one calibration inhalation audio recording. Twenty healthy participants were asked to perform 15 inhalations through a placebo Ellipta™ DPI at a range of inspiratory flow rates. Inhalation flow signals were recorded using a pneumotachograph spirometer while inhalation audio signals were recorded simultaneously using the Inhaler Compliance Assessment device attached to the inhaler. The acoustic (amplitude) envelope was estimated from each inhalation audio signal. Using only one recording, linear and power law regression models were employed to determine which model best described the relationship between the inhalation acoustic envelope and the flow signal. Each model was then employed to estimate the flow signals of the remaining 14 inhalation audio recordings. This process was repeated until each of the 15 recordings had been employed to calibrate single models while testing on the remaining 14 recordings. It was observed that power law models generated the highest average flow estimation accuracy across all participants (90.89±0.9% for power law models and 76.63±2.38% for linear models). The method also generated sufficient accuracy in estimating inhalation parameters such as peak inspiratory flow rate and inspiratory capacity in the presence of noise. Estimating inhaler inhalation flow profiles using audio based methods may be
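
    The power-law calibration step, fitting flow = a·envelope^b from a single recording by linear regression in log-log space, can be sketched as follows (NumPy assumed; the coefficient values in the example are synthetic, not the study's):

```python
import numpy as np

def fit_power_law(envelope, flow):
    """Fit flow = a * envelope**b from one calibration recording by
    ordinary least squares on log(flow) vs log(envelope)."""
    b, log_a = np.polyfit(np.log(envelope), np.log(flow), 1)
    return np.exp(log_a), b

def predict_flow(envelope, a, b):
    """Estimate the flow profile of a new inhalation from its envelope."""
    return a * envelope ** b
```

    Calibrating on one recording and predicting the other fourteen, as in the study's leave-one-in procedure, is then just `fit_power_law` on the calibration pair followed by `predict_flow` on each test envelope.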

  16. Understanding Cognitive Engagement in Online Discussion: Use of a Scaffolded, Audio-Based Argumentation Activity

    ERIC Educational Resources Information Center

    Oh, Eunjung Grace; Kim, Hyun Song

    2016-01-01

    The purpose of this paper is to explore how adult learners engage in asynchronous online discussion through the implementation of an audio-based argumentation activity. The study designed scaffolded audio-based argumentation activities to promote students' cognitive engagement. The research was conducted in an online graduate course at a liberal…

  17. Multi-bit wavelength coding phase-shift-keying optical steganography based on amplified spontaneous emission noise

    NASA Astrophysics Data System (ADS)

    Wang, Cheng; Wang, Hongxiang; Ji, Yuefeng

    2018-01-01

    In this paper, a multi-bit wavelength coding phase-shift-keying (PSK) optical steganography method is proposed, based on amplified spontaneous emission noise and a wavelength selective switch. In this scheme, the assignment codes and the delay length differences provide a large two-dimensional key space. A 2-bit wavelength coding PSK system is simulated to show the efficiency of the proposed method. The simulation results demonstrate that the stealth signal, after being encoded and modulated, is well hidden in both the time and spectral domains, under the public channel and the noise existing in the system. Moreover, even if the principle of this scheme and the existence of the stealth channel are known to an eavesdropper, the probability of recovering the stealth data is less than 0.02 if the key is unknown. Thus the scheme can protect the security of the stealth channel more effectively. Furthermore, the stealth channel results in a 0.48 dB power penalty to the public channel at a 1 × 10-9 bit error rate, and the public channel has no influence on the reception of the stealth channel.
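
    The spirit of the scheme, PSK data buried in a noise-like carrier that only a key holder can correlate against, can be illustrated with a toy baseband simulation. This is a numerical sketch, not the optical ASE/wavelength-switch system; the carrier, symbol length, and amplitude are all illustrative assumptions:

```python
import numpy as np

def psk_tx(bits, carrier, amp=0.1):
    """BPSK each bit onto a shared noise-like key carrier, then bury the
    result in strong Gaussian 'ASE-like' noise so the trace stays noisy."""
    rng = np.random.default_rng(1)
    sig = np.concatenate([(1 - 2 * b) * carrier for b in bits]) * amp
    return sig + rng.normal(0.0, 1.0, sig.size)

def psk_rx(rx, carrier):
    """Correlate each symbol window with the key carrier; without the
    carrier (the key), the correlation statistic is zero-mean noise."""
    n = carrier.size
    bits = []
    for i in range(0, rx.size, n):
        corr = np.dot(rx[i:i + n], carrier)
        bits.append(0 if corr > 0 else 1)
    return bits
```

    The stealth power here is 20 dB below the noise floor, yet the matched correlation recovers the bits; an eavesdropper without the carrier sees only noise, mirroring the paper's key-space argument.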

  18. Comparing Learning Gains: Audio Versus Text-based Instructor Communication in a Blended Online Learning Environment

    NASA Astrophysics Data System (ADS)

    Shimizu, Dominique

    Though blended-course audio feedback has been associated with several measures of course satisfaction at the postsecondary and graduate levels compared to text feedback, it may take longer to prepare, and positive results are largely unverified in the K-12 literature. The purpose of this quantitative study was to investigate the time investment and learning impact of audio communications with 228 secondary students in a blended online learning biology unit at a central Florida public high school. A short, individualized audio message regarding the student's progress was given to each student in the audio group; similar text-based messages were given to each student in the text-based group on the same schedule; a control group received no feedback. A pretest and posttest were employed to measure learning gains in the three groups. To compare the learning gains from the two types of feedback with each other and with no feedback, a controlled, randomized experimental design was implemented. In addition, the creation and posting of audio and text feedback communications were timed in order to assess whether audio feedback took longer to produce than text-only feedback. While audio feedback communications did take longer to create and post, there was no difference in learning gains as measured by posttest scores when students received audio, text-based, or no feedback. Future studies using a similar randomized, controlled experimental design are recommended to verify these results and to test whether the trend holds in a broader range of subjects, over different time frames, and using a variety of assessment types to measure student learning.

  19. Implementing Audio-CASI on Windows’ Platforms

    PubMed Central

    Cooley, Philip C.; Turner, Charles F.

    2011-01-01

    Audio computer-assisted self-interviewing (Audio-CASI) technologies have recently been shown to provide important and sometimes dramatic improvements in the quality of survey measurements. This is particularly true for measurements requiring respondents to divulge highly sensitive information such as their sexual, drug use, or other sensitive behaviors. However, DOS-based Audio-CASI systems that were designed and adopted in the early 1990s have important limitations. Most salient is the poor control they provide for manipulating the video presentation of survey questions. This article reports our experiences adapting Audio-CASI to Microsoft Windows 3.1 and Windows 95 platforms. Overall, our Windows-based system provided the desired control over video presentation and afforded other advantages, including compatibility with a much wider array of audio devices than our DOS-based Audio-CASI technologies. These advantages came at the cost of increased system requirements, including the need for both more RAM and larger hard disks. While these costs will be an issue for organizations converting large inventories of PCs to Windows Audio-CASI today, they will not be a serious constraint for organizations and individuals with small inventories of machines to upgrade or those purchasing new machines today. PMID:22081743

  20. Audio-based bolt-loosening detection technique of bolt joint

    NASA Astrophysics Data System (ADS)

    Zhang, Yang; Zhao, Xuefeng; Su, Wensheng; Xue, Zhigang

    2018-03-01

    The bolted joint, as one of the most common coupling structures, is widely used in electro-mechanical systems; however, it is often the weakest part of the whole system. Increasing the preload tension force raises the reliability and strength of the bolted joint, so the pretension force is one of the most important factors in ensuring its stability. Depending on how the pretension force is generated, it can be monitored through bolt torque, rotation angle, or elongation. Existing bolt-loosening monitoring methods all require expensive equipment, which greatly restricts their practicality. In this paper, a new audio-based bolt-loosening detection technique is proposed. The sound of the bolt being struck by a hammer is recorded on a smartphone, and the collected audio signal is classified and identified with a support vector machine algorithm. First, a verification test was designed, and the results show that the new method can accurately identify bolt-loosening damage. Second, various degrees of bolt-loosening were identified; the results indicate that the method achieves high accuracy in multiclass classification of bolt looseness. This audio-based bolt-loosening detection technique not only reduces the requirements for technical and professional experience, but also makes bolt-loosening monitoring simpler and easier.

  1. Robust High-Capacity Audio Watermarking Based on FFT Amplitude Modification

    NASA Astrophysics Data System (ADS)

    Fallahpour, Mehdi; Megías, David

    This paper proposes a novel robust audio watermarking algorithm to embed data and extract it in a bit-exact manner, based on changing the magnitudes of the FFT spectrum. The key point is selecting a frequency band for embedding based on a comparison between the original and the MP3 compressed/decompressed signal, and on a suitable scaling factor. The experimental results show that the method has a very high capacity (about 5 kbps), without significant perceptual distortion (ODG about -0.25), and provides robustness against common audio signal processing such as added noise, filtering, and MPEG compression (MP3). Furthermore, the proposed method has a larger capacity (ratio of embedded bits to host bits) than recent image data hiding methods.
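
    The abstract does not give the exact embedding rule, so as an illustration of how FFT magnitudes can be changed while still allowing bit-exact extraction, the sketch below uses quantization index modulation (QIM) on the magnitudes of a selected band; the band indices and step size `delta` are arbitrary choices, not the paper's:

```python
import numpy as np

def embed_fft(signal, bits, band=(1000, 1064), delta=0.5):
    """Quantize each selected bin's FFT magnitude to an even (bit 0) or
    odd (bit 1) multiple of delta; phase is left untouched."""
    spec = np.fft.rfft(signal)
    for k, bit in zip(range(*band), bits):
        q = int(round(abs(spec[k]) / delta))
        if q % 2 != bit:
            q += 1
        spec[k] = q * delta * np.exp(1j * np.angle(spec[k]))
    return np.fft.irfft(spec, len(signal))

def extract_fft(signal, nbits, band=(1000, 1064), delta=0.5):
    """Re-read the parity of each quantized magnitude."""
    spec = np.fft.rfft(signal)
    return [int(round(abs(spec[k]) / delta)) % 2 for k in range(*band)][:nbits]
```

    Because the watermarked magnitudes sit exactly on the quantization grid, extraction is bit-exact in the absence of attack, and moderate distortion (noise, compression) must move a magnitude by more than delta/2 before a bit flips.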

  2. Objective Assessment of Patient Inhaler User Technique Using an Audio-Based Classification Approach.

    PubMed

    Taylor, Terence E; Zigel, Yaniv; Egan, Clarice; Hughes, Fintan; Costello, Richard W; Reilly, Richard B

    2018-02-01

    Many patients make critical user technique errors when using pressurised metered dose inhalers (pMDIs) which reduce the clinical efficacy of respiratory medication. Such critical errors include poor actuation coordination (poor timing of medication release during inhalation) and inhaling too fast (peak inspiratory flow rate over 90 L/min). Here, we present a novel audio-based method that objectively assesses patient pMDI user technique. The Inhaler Compliance Assessment device was employed to record inhaler audio signals from 62 respiratory patients as they used a pMDI with an In-Check Flo-Tone device attached to the inhaler mouthpiece. Using a quadratic discriminant analysis approach, the audio-based method generated a total frame-by-frame accuracy of 88.2% in classifying sound events (actuation, inhalation and exhalation). The audio-based method estimated the peak inspiratory flow rate and volume of inhalations with an accuracy of 88.2% and 83.94% respectively. It was detected that 89% of patients made at least one critical user technique error even after tuition from an expert clinical reviewer. This method provides a more clinically accurate assessment of patient inhaler user technique than standard checklist methods.

  3. Improvement of information fusion-based audio steganalysis

    NASA Astrophysics Data System (ADS)

    Kraetzer, Christian; Dittmann, Jana

    2010-01-01

    In the paper we extend an existing information-fusion-based audio steganalysis approach with three different kinds of evaluations. The first addresses the so far neglected evaluation of sensor-level fusion. Our results show that this fusion removes content dependence while achieving similar classification rates (especially for the considered global features) compared to single classifiers on the three exemplarily tested audio data hiding algorithms. The second evaluation extends the observations on fusion from segmental features alone to combinations of segmental and global features, reducing the computational complexity required for testing by about two orders of magnitude while maintaining the same degree of accuracy. The third evaluation tries to build a basis for estimating the plausibility of the introduced steganalysis approach by measuring the sensitivity of the models used in supervised classification of steganographic material to typical signal modification operations such as de-noising or 128 kbit/s MP3 encoding. Our results show that for some of the tested classifiers the probability of false alarms rises dramatically after such modifications.

  4. Generation new MP3 data set after compression

    NASA Astrophysics Data System (ADS)

    Atoum, Mohammed Salem; Almahameed, Mohammad

    2016-02-01

The success of audio steganography techniques depends on ensuring the imperceptibility of the embedded secret message in the stego file and on withstanding any form of intentional or unintentional degradation of the secret message (robustness). Crucial to this is the choice of digital audio file, such as MP3, which comes at different compression rates; research studies have shown that performing steganography on the MP3 format after compression is the most suitable approach. Unfortunately, until now researchers have been unable to test and implement their algorithms because no standard data set of MP3 files after compression has been generated. This paper therefore focuses on generating a standard data set with different compression ratios and different genres to help researchers implement their algorithms.

  5. A secure steganography for privacy protection in healthcare system.

    PubMed

    Liu, Jing; Tang, Guangming; Sun, Yifeng

    2013-04-01

Private data in healthcare systems require confidentiality protection during transmission. Steganography is the art of concealing data in a cover medium to convey messages confidentially. In this paper, we propose a steganographic method which can provide private data in medical systems with very secure protection. In our method, a cover image is first mapped into a 1D pixel sequence by a Hilbert filling curve and then divided into non-overlapping embedding units of three consecutive pixels. We use the adaptive pixel pair matching (APPM) method to embed digits in the pixel value differences (PVD) of the three pixels, and the base of the embedded digits depends on the differences among the three pixels. By solving an optimization problem, minimal distortion of the pixel ternaries caused by data embedding can be obtained. The experimental results show our method is more suitable for privacy protection in healthcare systems than prior steganographic works.
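The pixel-pair idea behind APPM can be illustrated with a simplified sketch. This is a hypothetical toy variant, not the paper's exact extraction function or its three-pixel optimization: a base-B digit d is carried by a pixel pair (x, y) through f(x, y) = (x + c·y) mod B, and embedding nudges the pair to the nearest values satisfying f(x, y) = d.

```python
# Toy pixel-pair embedding (hypothetical simplification of APPM): search the
# neighborhood of (x, y) for the least-distortion pair whose f-value equals d.
def embed_digit(x, y, d, B=5, c=2):
    best = None
    for dx in range(-B, B + 1):
        for dy in range(-B, B + 1):
            nx, ny = x + dx, y + dy
            if not (0 <= nx <= 255 and 0 <= ny <= 255):
                continue
            if (nx + c * ny) % B == d:
                cost = dx * dx + dy * dy  # squared-error distortion
                if best is None or cost < best[0]:
                    best = (cost, nx, ny)
    return best[1], best[2]

def extract_digit(x, y, B=5, c=2):
    return (x + c * y) % B

x2, y2 = embed_digit(100, 50, 3)
print((x2, y2), extract_digit(x2, y2))  # digit 3 recovered, one pixel moved by 1
```

The same minimal-distortion search generalizes to the paper's three-pixel units, where the base B itself is chosen adaptively from the pixel differences.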

  6. Advances in Audio-Based Systems to Monitor Patient Adherence and Inhaler Drug Delivery.

    PubMed

    Taylor, Terence E; Zigel, Yaniv; De Looze, Céline; Sulaiman, Imran; Costello, Richard W; Reilly, Richard B

    2018-03-01

    Hundreds of millions of people worldwide have asthma and COPD. Current medications to control these chronic respiratory diseases can be administered using inhaler devices, such as the pressurized metered dose inhaler and the dry powder inhaler. Provided that they are used as prescribed, inhalers can improve patient clinical outcomes and quality of life. Poor patient inhaler adherence (both time of use and user technique) is, however, a major clinical concern and is associated with poor disease control, increased hospital admissions, and increased mortality rates, particularly in low- and middle-income countries. There are currently limited methods available to health-care professionals to objectively and remotely monitor patient inhaler adherence. This review describes recent sensor-based technologies that use audio-based approaches that show promising opportunities for monitoring inhaler adherence in clinical practice. This review discusses how one form of sensor-based technology, audio-based monitoring systems, can provide clinically pertinent information regarding patient inhaler use over the course of treatment. Audio-based monitoring can provide health-care professionals with quantitative measurements of the drug delivery of inhalers, signifying a clear clinical advantage over other methods of assessment. Furthermore, objective audio-based adherence measures can improve the predictability of patient outcomes to treatment compared with current standard methods of adherence assessment used in clinical practice. Objective feedback on patient inhaler adherence can be used to personalize treatment to the patient, which may enhance precision medicine in the treatment of chronic respiratory diseases. Copyright © 2017 American College of Chest Physicians. Published by Elsevier Inc. All rights reserved.

  7. Reduction in time-to-sleep through EEG based brain state detection and audio stimulation.

    PubMed

    Zhuo Zhang; Cuntai Guan; Ti Eu Chan; Juanhong Yu; Aung Aung Phyo Wai; Chuanchu Wang; Haihong Zhang

    2015-08-01

We developed an EEG- and audio-based sleep sensing and enhancing system, called iSleep (interactive Sleep enhancement apparatus). The system adopts a closed-loop approach which optimizes the audio recording selection based on the user's sleep status, detected through our online EEG computing algorithm. The iSleep prototype comprises two major parts: 1) a sleeping mask integrated with a single-channel EEG electrode and amplifier, a pair of stereo earphones, and a microcontroller with wireless circuit for control and data streaming; 2) a mobile app to receive EEG signals for online sleep monitoring and audio playback control. In this study we attempt to validate our hypothesis that appropriate audio stimulation in relation to brain state can induce faster onset of sleep and improve the quality of a nap. We conduct experiments on 28 healthy subjects, each undergoing two nap sessions - one with a quiet background and one with our audio stimulation. We compare the time-to-sleep in both sessions between two groups of subjects, i.e., fast and slow sleep onset groups. The p-value obtained from the Wilcoxon signed-rank test is 1.22e-04 for the slow onset group, which demonstrates that iSleep can significantly reduce the time-to-sleep for people with difficulty falling asleep.
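The statistical comparison used above is a paired Wilcoxon signed-rank test. A minimal sketch with SciPy, on made-up paired time-to-sleep measurements (minutes), shows the shape of the analysis:

```python
# Wilcoxon signed-rank test on paired nap measurements (illustrative data,
# not the study's): quiet-background vs audio-stimulation time-to-sleep.
from scipy.stats import wilcoxon

quiet = [38, 42, 35, 50, 44, 39, 47, 41, 36, 45]   # minutes, quiet session
audio = [25, 30, 28, 33, 29, 27, 35, 26, 31, 30]   # minutes, audio session

stat, p = wilcoxon(quiet, audio)
print(f"W = {stat}, p = {p:.4f}")
```

Because every subject in this toy data fell asleep faster with audio, the rank-sum statistic is 0 and the p-value is well below 0.01.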

  8. Steganography anomaly detection using simple one-class classification

    NASA Astrophysics Data System (ADS)

    Rodriguez, Benjamin M.; Peterson, Gilbert L.; Agaian, Sos S.

    2007-04-01

    There are several security issues tied to multimedia when implementing the various applications in the cellular phone and wireless industry. One primary concern is the potential ease of implementing a steganography system. Traditionally, the only mechanism to embed information into a media file has been with a desktop computer. However, as the cellular phone and wireless industry matures, it becomes much simpler for the same techniques to be performed using a cell phone. In this paper, two methods are compared that classify cell phone images as either an anomaly or clean, where a clean image is one in which no alterations have been made and an anomalous image is one in which information has been hidden within the image. An image in which information has been hidden is known as a stego image. The main concern in detecting steganographic content with machine learning using cell phone images is in training specific embedding procedures to determine if the method has been used to generate a stego image. This leads to a possible flaw in the system when the learned model of stego is faced with a new stego method which doesn't match the existing model. The proposed solution to this problem is to develop systems that detect steganography as anomalies, making the embedding method irrelevant in detection. Two applicable classification methods for solving the anomaly detection of steganographic content problem are single class support vector machines (SVM) and Parzen-window. Empirical comparison of the two approaches shows that Parzen-window outperforms the single class SVM most likely due to the fact that Parzen-window generalizes less.

  9. The use of ambient audio to increase safety and immersion in location-based games

    NASA Astrophysics Data System (ADS)

    Kurczak, John Jason

The purpose of this thesis is to propose an alternative type of interface for mobile software being used while walking or running. Our work addresses the problem of visual user interfaces for mobile software being potentially unsafe for pedestrians, and not being very immersive when used for location-based games. In addition, location-based games and applications can be difficult to develop when directly interfacing with the sensors used to track the user's location. These problems need to be addressed because portable computing devices are becoming a popular tool for navigation, playing games, and accessing the internet while walking. This poses a safety problem for mobile users, who may be paying too much attention to their device to notice and react to hazards in their environment. The difficulty of developing location-based games and other location-aware applications may significantly hinder the prevalence of applications that explore new interaction techniques for ubiquitous computing. We created the TREC toolkit to address the issues with tracking sensors while developing location-based games and applications. We have developed functional location-based applications with TREC to demonstrate the amount of work that can be saved by using this toolkit. In order to have a safer and more immersive alternative to visual interfaces, we have developed ambient audio interfaces for use with mobile applications. Ambient audio uses continuous streams of sound over headphones to present information to mobile users without distracting them from walking safely. In order to test the effectiveness of ambient audio, we ran a study to compare ambient audio with handheld visual interfaces in a location-based game. We compared players' ability to safely navigate the environment, their sense of immersion in the game, and their performance at the in-game tasks.
We found that ambient audio was able to significantly increase players' safety and sense of immersion compared to a

  10. Audio-visual interactions in environment assessment.

    PubMed

    Preis, Anna; Kociński, Jędrzej; Hafke-Dys, Honorata; Wrzosek, Małgorzata

    2015-08-01

The aim of the study was to examine how visual and audio information influences audio-visual environment assessment. Original audio-visual recordings were made at seven different places in the city of Poznań. Participants in the psychophysical experiments were asked to rate, on a numerical standardized scale, the degree of comfort they would feel if they were in such an environment. The assessments of audio-visual comfort were carried out in a laboratory in four different conditions: (a) audio samples only, (b) original audio-visual samples, (c) video samples only, and (d) mixed audio-visual samples. The general results of this experiment showed a significant difference between the investigated conditions, but not for all the investigated samples. There was a significant improvement in comfort assessment when visual information was added (in only three out of seven cases), when conditions (a) and (b) were compared. On the other hand, the results show that the comfort assessment of audio-visual samples could be changed by manipulating the audio rather than the video part of the audio-visual sample. Finally, it seems that people can differentiate audio-visual representations of a given place in the environment based on the sound sources' composition rather than on the sound level. Object identification is responsible for both landscape and soundscape grouping. Copyright © 2015. Published by Elsevier B.V.

  11. Description of an Audio-Based Paced Respiration Intervention for Vasomotor Symptoms

    PubMed Central

    Burns, Debra S.; Drews, Michael R.; Carpenter, Janet S.

    2013-01-01

Millions of women experience menopause-related hot flashes or flushes that may have a negative effect on their quality of life. Hormone therapy is an effective treatment; however, it may be contraindicated or unacceptable for some women based on previous health complications or an undesirable risk–benefit ratio. Side effects and the unacceptability of hormone therapy have created a need for behavioral interventions to reduce hot flashes. A variety of complex, multimodal, relaxation-based behavioral interventions have been studied with women (n = 88) and showed generally favorable results. However, the extensive resource commitments currently required limit the translation of these interventions into standard care. Slow, deep breathing is a common component in most interventions and may be the active ingredient leading to reduced hot flashes. This article describes the content of an audio-based program designed to teach paced breathing to reduce hot flashes. Intervention content was based on skills training theory and music entrainment. The audio intervention provides an efficient way to deliver a breathing intervention that may be beneficial to other clinical populations. PMID:23914283

  12. Internet Audio Products (3/3)

    ERIC Educational Resources Information Center

    Schwartz, Linda; de Schutter, Adrienne; Fahrni, Patricia; Rudolph, Jim

    2004-01-01

    Two contrasting additions to the online audio market are reviewed: "iVocalize", a browser-based audio-conferencing software, and "Skype", a PC-to-PC Internet telephone tool. These products are selected for review on the basis of their success in gaining rapid popular attention and usage during 2003-04. The "iVocalize" review emphasizes the…

  13. A randomized controlled trial of an audio-based treatment program for child anxiety disorders.

    PubMed

    Infantino, Alyssa; Donovan, Caroline L; March, Sonja

    2016-04-01

The aim of this study was to investigate the efficacy of an audio-based cognitive-behavioural therapy (CBT) program for child anxiety disorders. Twenty-four children aged 5-11 years were randomly allocated into either the audio-based CBT program condition (Audio, n = 12) or a waitlist control (WL; n = 12) group. Outcome measures included a clinical diagnostic interview, clinician-rated global assessment of functioning, and parent and child self-report ratings of anxiety and internalisation. Assessments were conducted prior to treatment, 12 weeks following treatment, and at 3-month follow-up. Results indicated that at post-assessment, 58.3% of children receiving treatment compared to 16.7% of waitlist children were free of their primary diagnosis, with this figure rising to 66.67% at the 3-month follow-up time point. Additionally, at post-assessment, 25.0% of children in the treatment condition compared to 0.0% of the waitlist condition were free of all anxiety diagnoses, with this figure rising to 41.67% for the treatment group at 3-month follow-up. Overall, the findings suggest that the audio program tested in this study has the potential to be an efficacious treatment alternative for anxious children. Copyright © 2016 Elsevier Ltd. All rights reserved.

  14. The priming function of in-car audio instruction.

    PubMed

    Keyes, Helen; Whitmore, Antony; Naneva, Stanislava; McDermott, Daragh

    2018-05-01

    Studies to date have focused on the priming power of visual road signs, but not the priming potential of audio road scene instruction. Here, the relative priming power of visual, audio, and multisensory road scene instructions was assessed. In a lab-based study, participants responded to target road scene turns following visual, audio, or multisensory road turn primes which were congruent or incongruent to the primes in direction, or control primes. All types of instruction (visual, audio, and multisensory) were successful in priming responses to a road scene. Responses to multisensory-primed targets (both audio and visual) were faster than responses to either audio or visual primes alone. Incongruent audio primes did not affect performance negatively in the manner of incongruent visual or multisensory primes. Results suggest that audio instructions have the potential to prime drivers to respond quickly and safely to their road environment. Peak performance will be observed if audio and visual road instruction primes can be timed to co-occur.

  15. 47 CFR 87.483 - Audio visual warning systems.

    Code of Federal Regulations, 2014 CFR

    2014-10-01

47 CFR Part 87 (Aviation Services), Stations in the Radiodetermination Service, § 87.483 Audio visual warning systems. An audio visual warning system (AVWS) is a radar-based obstacle avoidance system. AVWS activates...

  16. CERN automatic audio-conference service

    NASA Astrophysics Data System (ADS)

    Sierra Moral, Rodrigo

    2010-04-01

Scientists from all over the world need to collaborate with CERN on a daily basis. They must be able to communicate effectively on their joint projects at any time; as a result, telephone conferences have become indispensable and widely used. Managed by six operators, CERN already hosts more than 20,000 hours and 5,700 audio-conferences per year. However, the traditional telephone-based audio-conference system needed to be modernized in three ways: firstly, to provide the participants with more autonomy in the organization of their conferences; secondly, to eliminate the constraints of manual intervention by operators; and thirdly, to integrate the audio-conferences into a collaborative working framework. The large number, and hence cost, of the conferences prohibited externalization, and so the CERN telecommunications team drew up a specification to implement a new system. It was decided to use a new commercial collaborative audio-conference solution based on the SIP protocol. The system was tested as the first European pilot and several improvements (such as billing, security, redundancy...) were implemented based on CERN's recommendations. The new automatic conference system has been operational since the second half of 2006. It is very popular with users and has doubled the number of conferences in the past two years.

  17. Audio-visual affective expression recognition

    NASA Astrophysics Data System (ADS)

    Huang, Thomas S.; Zeng, Zhihong

    2007-11-01

    Automatic affective expression recognition has attracted more and more attention of researchers from different disciplines, which will significantly contribute to a new paradigm for human computer interaction (affect-sensitive interfaces, socially intelligent environments) and advance the research in the affect-related fields including psychology, psychiatry, and education. Multimodal information integration is a process that enables human to assess affective states robustly and flexibly. In order to understand the richness and subtleness of human emotion behavior, the computer should be able to integrate information from multiple sensors. We introduce in this paper our efforts toward machine understanding of audio-visual affective behavior, based on both deliberate and spontaneous displays. Some promising methods are presented to integrate information from both audio and visual modalities. Our experiments show the advantage of audio-visual fusion in affective expression recognition over audio-only or visual-only approaches.

  18. Perceptual Audio Hashing Functions

    NASA Astrophysics Data System (ADS)

    Özer, Hamza; Sankur, Bülent; Memon, Nasir; Anarım, Emin

    2005-12-01

    Perceptual hash functions provide a tool for fast and reliable identification of content. We present new audio hash functions based on summarization of the time-frequency spectral characteristics of an audio document. The proposed hash functions are based on the periodicity series of the fundamental frequency and on singular-value description of the cepstral frequencies. They are found, on one hand, to perform very satisfactorily in identification and verification tests, and on the other hand, to be very resilient to a large variety of attacks. Moreover, we address the issue of security of hashes and propose a keying technique, and thereby a key-dependent hash function.
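The spirit of such a spectral-summary hash can be shown with a toy sketch. This is not the paper's periodicity or cepstral functions: the hash bits below are simply the signs of band-energy changes across consecutive frames, a generic construction that is robust to uniform level changes.

```python
# Toy perceptual hash: bits are the signs of log band-energy differences
# between consecutive frames (band split and parameters are illustrative).
import numpy as np

def perceptual_hash(signal, frame=1024, bands=16):
    frames = [signal[i:i + frame] for i in range(0, len(signal) - frame, frame)]
    energies = []
    for f in frames:
        spec = np.abs(np.fft.rfft(f * np.hanning(frame)))
        # Crude interleaved band split, just for illustration.
        energies.append([spec[b::bands].sum() for b in range(bands)])
    e = np.log(np.asarray(energies) + 1e-9)
    bits = (np.diff(e, axis=0) > 0).astype(int)  # sign of frame-to-frame change
    return bits.ravel()

rng = np.random.default_rng(2)
x = rng.normal(size=16000)
h1 = perceptual_hash(x)
h2 = perceptual_hash(0.7 * x)     # quieter copy of the same audio
print((h1 == h2).mean())          # close to 1.0: scaling cancels in the log-diff
```

Taking differences of log energies is what buys the level-invariance; a keyed variant, as the paper proposes, would additionally parameterize the band split or frame selection with a secret key.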

  19. Web Audio/Video Streaming Tool

    NASA Technical Reports Server (NTRS)

    Guruvadoo, Eranna K.

    2003-01-01

In order to promote a NASA-wide educational outreach program to educate and inform the public about space exploration, NASA, at Kennedy Space Center, is seeking efficient ways to add more content to the web by streaming audio/video files. This project proposes a high-level overview of a framework for the creation, management, and scheduling of audio/video assets over the web. To support short-term goals, the prototype of a web-based tool is designed and demonstrated to automate the process of streaming audio/video files. The tool provides web-enabled user interfaces to manage video assets, create publishable schedules of video assets for streaming, and schedule the streaming events. These operations are performed on user-defined and system-derived metadata of audio/video assets stored in a relational database, while the assets reside in a separate repository. The prototype tool is designed using ColdFusion 5.0.

  20. Design of an audio advertisement dataset

    NASA Astrophysics Data System (ADS)

    Fu, Yutao; Liu, Jihong; Zhang, Qi; Geng, Yuting

    2015-12-01

Since more and more advertisements swarm into radio broadcasts, it is necessary to establish an audio advertising dataset which can be used to analyze and classify advertisements. A method for establishing a complete audio advertising dataset is presented in this paper. The dataset is divided into four different kinds of advertisements. Each advertisement sample is given in *.wav file format and annotated with a txt file which contains its file name, sampling frequency, channel number, broadcasting time, and its class. The rationality of the classification of the advertisements in this dataset is demonstrated by clustering the different advertisements using Principal Component Analysis (PCA). The experimental results show that this audio advertisement dataset offers a reliable set of samples for related experimental studies on audio advertisements.

  1. Multimodal audio guide for museums and exhibitions

    NASA Astrophysics Data System (ADS)

    Gebbensleben, Sandra; Dittmann, Jana; Vielhauer, Claus

    2006-02-01

In our paper we introduce a new Audio Guide concept for exploring buildings, realms and exhibitions. Currently proposed solutions work in most cases with pre-defined devices, which users have to buy or borrow. These systems often go along with complex technical installations and require a great degree of user training for device handling. Furthermore, the activation of audio commentary related to the exhibition objects is typically based on additional components like infrared, radio frequency or GPS technology. Besides the necessity of installing specific devices for user location, these approaches often only support automatic activation with no or limited user interaction. Therefore, elaboration of alternative concepts appears worthwhile. Motivated by these aspects, we introduce a new concept based on usage of the visitor's own mobile smart phone. The advantages of our approach are twofold: firstly, the Audio Guide can be used in various places without any purchase or extensive installation of additional components in or around the exhibition object. Secondly, visitors can experience the exhibition on individual tours simply by uploading the Audio Guide at a single point of entry, the Audio Guide Service Counter, and keeping it on their personal device. Furthermore, the user is usually quite familiar with the interface of his or her phone and can thus interact with the application easily. Our technical concept makes use of two general ideas for location detection and activation: firstly, we suggest an enhanced interactive number-based activation exploiting the visual capabilities of modern smart phones, and secondly, we outline an active digital audio watermarking approach, where information about objects is transmitted via an analog audio channel.

  2. Steganography algorithm multi pixel value differencing (MPVD) to increase message capacity and data security

    NASA Astrophysics Data System (ADS)

    Rojali, Siahaan, Ida Sri Rejeki; Soewito, Benfano

    2017-08-01

Steganography is the art and science of hiding secret messages so that the existence of the message cannot be detected by human senses. Data concealment uses the Multi Pixel Value Differencing (MPVD) algorithm, which exploits the difference between neighboring pixel values; the development was carried out using six interval tables. The objective of this algorithm is to enhance the message capacity and to maintain data security.
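MPVD extends classic pixel value differencing (PVD), where an interval table maps each pixel-pair difference to an embedding capacity. A minimal capacity sketch of the basic PVD idea follows; the six-range table here is the common textbook example, not necessarily the paper's tables:

```python
# Classic PVD capacity lookup (illustrative interval table): the difference
# d of a pixel pair selects a range [l, u), and log2(u - l) secret bits can
# replace d within that range.
import math

RANGES = [(0, 8), (8, 16), (16, 32), (32, 64), (64, 128), (128, 256)]

def capacity_bits(x, y):
    d = abs(x - y)
    l, u = next(r for r in RANGES if r[0] <= d < r[1])
    return int(math.log2(u - l))

# Smooth regions (small differences) carry fewer bits; busy regions more.
print(capacity_bits(100, 103))   # diff 3   -> range (0, 8)    -> 3 bits
print(capacity_bits(40, 140))    # diff 100 -> range (64, 128) -> 6 bits
```

Using multiple interval tables, as MPVD does, trades off between this capacity and the visual distortion introduced when the difference is rewritten.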

  3. Minimizing embedding impact in steganography using trellis-coded quantization

    NASA Astrophysics Data System (ADS)

    Filler, Tomáš; Judas, Jan; Fridrich, Jessica

    2010-01-01

    In this paper, we propose a practical approach to minimizing embedding impact in steganography based on syndrome coding and trellis-coded quantization and contrast its performance with bounds derived from appropriate rate-distortion bounds. We assume that each cover element can be assigned a positive scalar expressing the impact of making an embedding change at that element (single-letter distortion). The problem is to embed a given payload with minimal possible average embedding impact. This task, which can be viewed as a generalization of matrix embedding or writing on wet paper, has been approached using heuristic and suboptimal tools in the past. Here, we propose a fast and very versatile solution to this problem that can theoretically achieve performance arbitrarily close to the bound. It is based on syndrome coding using linear convolutional codes with the optimal binary quantizer implemented using the Viterbi algorithm run in the dual domain. The complexity and memory requirements of the embedding algorithm are linear w.r.t. the number of cover elements. For practitioners, we include detailed algorithms for finding good codes and their implementation. Finally, we report extensive experimental results for a large set of relative payloads and for different distortion profiles, including the wet paper channel.
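Matrix embedding, which the paper generalizes, can be shown concretely with the [7,4] Hamming code: 3 message bits are embedded into 7 cover bits by flipping at most one bit. This sketch is the standard construction, not the paper's trellis-coded scheme.

```python
# Matrix embedding with the [7,4] Hamming code.
import numpy as np

# Parity-check matrix H; column j (1-based) is the binary expansion of j,
# so flipping cover bit j shifts the syndrome by exactly j.
H = np.array([[0, 0, 0, 1, 1, 1, 1],
              [0, 1, 1, 0, 0, 1, 1],
              [1, 0, 1, 0, 1, 0, 1]])

def embed(cover, msg):
    """Force the syndrome H @ stego (mod 2) to equal the 3-bit message."""
    s = (H @ cover - msg) % 2
    stego = cover.copy()
    idx = int("".join(map(str, s)), 2)  # position of the single needed flip
    if idx:                             # idx == 0: cover already carries msg
        stego[idx - 1] ^= 1
    return stego

def extract(stego):
    return (H @ stego) % 2

cover = np.array([1, 0, 1, 1, 0, 0, 1])
msg = np.array([1, 1, 0])
stego = embed(cover, msg)
print(stego, extract(stego))
```

The paper's contribution replaces this fixed block code with convolutional codes and a Viterbi-based quantizer, which lets the embedding impact vary per cover element instead of counting every flip equally.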

  4. Evaluation of listener-based anuran surveys with automated audio recording devices

    USGS Publications Warehouse

    Shearin, A. F.; Calhoun, A.J.K.; Loftin, C.S.

    2012-01-01

    Volunteer-based audio surveys are used to document long-term trends in anuran community composition and abundance. Current sampling protocols, however, are not region- or species-specific and may not detect relatively rare or audibly cryptic species. We used automated audio recording devices to record calling anurans during 2006–2009 at wetlands in Maine, USA. We identified species calling, chorus intensity, time of day, and environmental variables when each species was calling and developed logistic and generalized mixed models to determine the time interval and environmental variables that optimize detection of each species during peak calling periods. We detected eight of nine anurans documented in Maine. Individual recordings selected from the sampling period (0.5 h past sunset to 0100 h) described in the North American Amphibian Monitoring Program (NAAMP) detected fewer species than were detected in recordings from 30 min past sunset until sunrise. Time of maximum detection of presence and full chorusing for three species (green frogs, mink frogs, pickerel frogs) occurred after the NAAMP sampling end time (0100 h). The NAAMP protocol’s sampling period may result in omissions and misclassifications of chorus sizes for certain species. These potential errors should be considered when interpreting trends generated from standardized anuran audio surveys.

  5. On Max-Plus Algebra and Its Application on Image Steganography.

    PubMed

    Santoso, Kiswara Agung; Fatmawati; Suprajitno, Herry

    2018-01-01

We propose a new steganography method to hide an image in another image using matrix multiplication operations in max-plus algebra. This is especially interesting because the matrices used in encoding or information disguise generally have inverses, whereas matrix multiplication operations in max-plus algebra do not. The advantage of this method is that the image that can be hidden in the cover image is larger than with the previous method. The proposed method has been tested on many secret images, and the results are satisfactory, showing a high level of strength and security; the method can be used in various operating systems.
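The max-plus matrix product underlying the scheme replaces sums with maxima and products with sums: (A ⊗ B)ij = max_k (A_ik + B_kj). A small sketch of the operation (illustrative values, not the paper's encoding matrices):

```python
# Max-plus matrix multiplication: (A ⊗ B)[i, j] = max over k of A[i,k] + B[k,j].
# Unlike the ordinary product, this operation is generally non-invertible,
# which is the property the hiding scheme exploits.
import numpy as np

def maxplus_mul(A, B):
    # Broadcast to shape (i, k, j), add, then take the max over k.
    return np.max(A[:, :, None] + B[None, :, :], axis=1)

A = np.array([[1, 5],
              [3, 2]])
B = np.array([[0, 4],
              [2, 1]])
print(maxplus_mul(A, B))   # [[7 6] [4 7]]
```

For example, the (0, 0) entry is max(1 + 0, 5 + 2) = 7; because only the maximum survives, the other operand generally cannot be recovered from the product.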

  6. Digital Advances in Contemporary Audio Production.

    ERIC Educational Resources Information Center

    Shields, Steven O.

    Noting that a revolution in sonic high fidelity occurred during the 1980s as digital-based audio production methods began to replace traditional analog modes, this paper offers both an overview of digital audio theory and descriptions of some of the related digital production technologies that have begun to emerge from the mating of the computer…

  7. Evaluation of MPEG-7-Based Audio Descriptors for Animal Voice Recognition over Wireless Acoustic Sensor Networks.

    PubMed

    Luque, Joaquín; Larios, Diego F; Personal, Enrique; Barbancho, Julio; León, Carlos

    2016-05-18

Environmental audio monitoring is a huge area of interest for biologists all over the world. This is why several audio monitoring systems have been proposed in the literature, which can be classified into two different approaches: acquisition and compression of all audio patterns in order to send them as raw data to a main server; or specific recognition systems based on audio patterns. The first approach presents the drawback of a high amount of information to be stored on a main server. Moreover, this information requires a considerable amount of effort to be analyzed. The second approach has the drawback of its lack of scalability when new patterns need to be detected. To overcome these limitations, this paper proposes an environmental Wireless Acoustic Sensor Network architecture focused on the use of generic descriptors based on the MPEG-7 standard. These descriptors are shown to be suitable for recognizing different patterns, allowing high scalability. The proposed parameters have been tested in recognizing different behaviors of two anuran species that live in Spanish natural parks, the Epidalea calamita and Alytes obstetricans toads, demonstrating high classification performance.
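One generic descriptor of the kind MPEG-7 defines is a per-frame spectral centroid. The sketch below computes a plain spectral centroid, in the spirit of the standard's AudioSpectrumCentroid but not its normative formula:

```python
# Per-frame spectral centroid (generic descriptor sketch, illustrative
# parameters): the magnitude-weighted mean frequency of each frame.
import numpy as np

def spectral_centroid(signal, sr, frame=512):
    cents = []
    for i in range(0, len(signal) - frame, frame):
        spec = np.abs(np.fft.rfft(signal[i:i + frame] * np.hanning(frame)))
        freqs = np.fft.rfftfreq(frame, d=1.0 / sr)
        cents.append((freqs * spec).sum() / (spec.sum() + 1e-12))
    return np.array(cents)

sr = 8000
t = np.arange(sr) / sr
tone = np.sin(2 * np.pi * 440 * t)   # a 440 Hz test tone
c = spectral_centroid(tone, sr)
print(c.mean())                      # close to 440 Hz
```

Such compact per-frame descriptors are what make the architecture scalable: sensor nodes transmit a few numbers per frame instead of raw audio.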

  9. Aeronautical audio broadcasting via satellite

    NASA Technical Reports Server (NTRS)

    Tzeng, Forrest F.

    1993-01-01

    A system design for aeronautical audio broadcasting, with C-band uplink and L-band downlink, via Inmarsat space segments is presented. Near-transparent-quality compression of 5-kHz bandwidth audio at 20.5 kbit/s is achieved based on a hybrid technique employing linear predictive modeling and transform-domain residual quantization. Concatenated Reed-Solomon/convolutional codes with quadrature phase shift keying are selected for bandwidth and power efficiency. RF bandwidth at 25 kHz per channel, and a decoded bit error rate at 10(exp -6) with E(sub b)/N(sub o) at 3.75 dB are obtained. An interleaver, scrambler, modem synchronization, and frame format were designed, and frequency-division multiple access was selected over code-division multiple access. A link budget computation based on a worst-case scenario indicates sufficient system power margins. Transponder occupancy analysis for 72 audio channels demonstrates ample remaining capacity to accommodate emerging aeronautical services.
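The link-budget arithmetic mentioned above can be illustrated with the abstract's own operating point; the computation below derives the required carrier-to-noise-density ratio and is a generic worked example, not the paper's full budget:

```python
# Required C/N0 from the Eb/N0 operating point and bit rate.
import math

eb_n0_db = 3.75      # Eb/N0 at decoded BER 1e-6 (figure from the abstract)
bit_rate = 20_500    # 20.5 kbit/s coded audio (figure from the abstract)

# C/N0 [dBHz] = Eb/N0 [dB] + 10*log10(bit rate)
c_n0_db = eb_n0_db + 10 * math.log10(bit_rate)
print(f"required C/N0 = {c_n0_db:.1f} dBHz")   # about 46.9 dBHz
```

A full budget would then compare this requirement against transmit EIRP, path loss, and receiver G/T to obtain the system margin the abstract refers to.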

  10. Building a Steganography Program Including How to Load, Process, and Save JPEG and PNG Files in Java

    ERIC Educational Resources Information Center

    Courtney, Mary F.; Stix, Allen

    2006-01-01

    Instructors teaching beginning programming classes are often interested in exercises that involve processing photographs (i.e., files stored as .jpeg). They may wish to offer activities such as color inversion, the color manipulation effects achieved with pixel thresholding, or steganography, all of which Stevenson et al. [4] assert are sought by…

  11. Audio-guided audiovisual data segmentation, indexing, and retrieval

    NASA Astrophysics Data System (ADS)

    Zhang, Tong; Kuo, C.-C. Jay

    1998-12-01

    While current approaches for video segmentation and indexing are mostly focused on visual information, audio signals may actually play a primary role in video content parsing. In this paper, we present an approach for automatic segmentation, indexing, and retrieval of audiovisual data, based on audio content analysis. The accompanying audio signal of audiovisual data is first segmented and classified into basic types, i.e., speech, music, environmental sound, and silence. This coarse-level segmentation and indexing step is based upon morphological and statistical analysis of several short-term features of the audio signals. Then, environmental sounds are classified into finer classes, such as applause, explosions, bird sounds, etc. This fine-level classification and indexing step is based upon time-frequency analysis of audio signals and the use of the hidden Markov model as the classifier. On top of this archiving scheme, an audiovisual data retrieval system is proposed. Experimental results show that the proposed approach has an accuracy rate higher than 90 percent for the coarse-level classification, and higher than 85 percent for the fine-level classification. Examples of audiovisual data segmentation and retrieval are also provided.

  12. Characteristics of audio and sub-audio telluric signals

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Telford, W.M.

    1977-06-01

    Telluric current measurements in the audio and sub-audio frequency range, made in various parts of Canada and South America over the past four years, indicate that the signal amplitude is relatively uniform over 6 to 8 midday hours (LMT) except in Chile and that the signal anisotropy is reasonably constant in azimuth.

  13. Predicting the Overall Spatial Quality of Automotive Audio Systems

    NASA Astrophysics Data System (ADS)

    Koya, Daisuke

    The spatial quality of automotive audio systems is often compromised due to their less-than-ideal listening environments, and industry demands require these systems to be developed quickly. A suitable perceptual model could evaluate the spatial quality of automotive audio systems with reliability similar to formal listening tests but in less time. Such a model is developed in this research project by adapting an existing model of spatial quality to automotive audio use. The requirements for the adaptation were investigated in a literature review. A perceptual model called QESTRAL was reviewed, which predicts the overall spatial quality of domestic multichannel audio systems. It was determined that automotive audio systems are likely to be impaired in terms of spatial attributes that were not considered in developing the QESTRAL model, but metrics are available that might predict these attributes. To establish whether the QESTRAL model in its current form can accurately predict the overall spatial quality of automotive audio systems, MUSHRA listening tests using headphone auralisation with head tracking were conducted, and the results were compared against predictions by the model. Based on guideline criteria, the model in its current form could not accurately predict the overall spatial quality of automotive audio systems. To improve prediction performance, the QESTRAL model was recalibrated and modified using existing metrics of the model, metrics proposed from the literature review, and newly developed metrics. The most important metrics for predicting the overall spatial quality of automotive audio systems included those that are interaural cross-correlation (IACC) based, relate to localisation of the frontal audio scene, and account for the perceived scene width in front of the listener. Modifying the model for automotive audio systems did not invalidate its use for domestic audio systems. 
The resulting model predicts the overall spatial

  14. Audio-Vision: Audio-Visual Interaction in Desktop Multimedia.

    ERIC Educational Resources Information Center

    Daniels, Lee

    Although sophisticated multimedia authoring applications are now available to amateur programmers, the use of audio in these programs has been inadequate. Due to the lack of research on the use of audio in instruction, there are few resources to assist the multimedia producer in using sound effectively and efficiently. This paper addresses the…

  15. On Max-Plus Algebra and Its Application on Image Steganography

    PubMed Central

    Santoso, Kiswara Agung

    2018-01-01

    We propose a new steganography method to hide an image inside another image using matrix multiplication over max-plus algebra. This is especially interesting because the matrices used in conventional encoding or information disguise generally have inverses, whereas matrix multiplication in max-plus algebra does not. The advantage of this method is that the image hidden in the cover image can be larger than with the previous method. The proposed method has been tested on many secret images, with satisfactory results: it offers high robustness and security and can be used on various operating systems. PMID:29887761
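
    For reference, the max-plus product replaces elementwise multiplication with addition and summation with max: (A ⊗ B)[i, j] = max over k of (A[i, k] + B[k, j]). A minimal sketch of this operation alone (not the authors' full hiding scheme) could look like:

```python
import numpy as np

def maxplus_matmul(A, B):
    """Max-plus product: (A ⊗ B)[i, j] = max_k (A[i, k] + B[k, j])."""
    # Broadcasting builds every A[i, k] + B[k, j] sum, then takes max over k.
    return np.max(A[:, :, None] + B[None, :, :], axis=1)
```

    Because max, unlike sum, is not invertible, knowing A ⊗ B and one factor does not in general recover the other factor, which is the non-invertibility the abstract highlights.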

  16. Military Review: The Professional Journal of the U.S. Army. January-February 2002

    DTIC Science & Technology

    2002-02-01

    Internet.”9 He accuses bin Laden of hiding maps and photos of targets and of posting instructions on sports chat rooms, pornographic bulletin boards...anything unusual. Messages can be hidden in audio, video, or still image files, with information stored in the least significant bits of a digitized file...steganography, embedding secret messages in other messages to prevent observers from suspecting anything unusual.

  17. Electrophysiological evidence for Audio-visuo-lingual speech integration.

    PubMed

    Treille, Avril; Vilain, Coriandre; Schwartz, Jean-Luc; Hueber, Thomas; Sato, Marc

    2018-01-31

    Recent neurophysiological studies demonstrate that audio-visual speech integration partly operates through temporal expectations and speech-specific predictions. From these results, one common view is that the binding of auditory and visual, lipread, speech cues relies on their joint probability and prior associative audio-visual experience. The present EEG study examined whether visual tongue movements integrate with relevant speech sounds, despite little associative audio-visual experience between the two modalities. A second objective was to determine possible similarities and differences of audio-visual speech integration between unusual audio-visuo-lingual and classical audio-visuo-labial modalities. To this aim, participants were presented with auditory, visual, and audio-visual isolated syllables, with the visual presentation related to either a sagittal view of the tongue movements or a facial view of the lip movements of a speaker, with lingual and facial movements previously recorded by an ultrasound imaging system and a video camera. In line with previous EEG studies, our results revealed an amplitude decrease and a latency facilitation of P2 auditory evoked potentials in both audio-visual-lingual and audio-visuo-labial conditions compared to the sum of unimodal conditions. These results argue against the view that auditory and visual speech cues solely integrate based on prior associative audio-visual perceptual experience. Rather, they suggest that dynamic and phonetic informational cues are sharable across sensory modalities, possibly through a cross-modal transfer of implicit articulatory motor knowledge. Copyright © 2017 Elsevier Ltd. All rights reserved.

  18. Video conference quality assessment based on cooperative sensing of video and audio

    NASA Astrophysics Data System (ADS)

    Wang, Junxi; Chen, Jialin; Tian, Xin; Zhou, Cheng; Zhou, Zheng; Ye, Lu

    2015-12-01

    This paper presents a method for video conference quality assessment based on cooperative sensing of video and audio. In this method, a proposed video quality evaluation method is used to assess video frame quality: each frame is divided into a noise image and a filtered image by bilateral filtering, which resembles the low-pass characteristic of the human visual system. The audio frames are evaluated with the PEAQ algorithm, and the two results are combined to evaluate overall video conference quality. A video conference database was built to test the performance of the proposed method; the objective results correlate well with MOS scores, so we conclude that the proposed method is effective for assessing video conference quality.

  19. Audio stream classification for multimedia database search

    NASA Astrophysics Data System (ADS)

    Artese, M.; Bianco, S.; Gagliardi, I.; Gasparini, F.

    2013-03-01

    Search and retrieval of huge archives of multimedia data is a challenging task. A classification step is often used to reduce the number of entries on which to perform the subsequent search. In particular, when new entries are continuously added to the database, a fast classification based on simple threshold evaluation is desirable. In this work we present a CART-based (Classification And Regression Tree [1]) classification framework for audio streams belonging to multimedia databases. The database considered is the Archive of Ethnography and Social History (AESS) [2], which is mainly composed of popular songs and other audio records describing popular traditions handed down generation by generation, such as traditional fairs and customs. The peculiarities of this database are that it is continuously updated, the audio recordings are acquired in unconstrained environments, and it is difficult for non-expert users to create ground-truth labels. In our experiments, half of all the available audio files were randomly extracted and used as the training set; the remaining ones were used as the test set. The classifier was trained to distinguish among three classes: speech, music, and song. All the audio files in the dataset were previously labeled manually into these three classes by domain experts.
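
    A fast threshold-based classification of the kind mentioned above can be illustrated with two classic short-term features. This is a hypothetical sketch: the thresholds, features, and class labels below are illustrative placeholders, not the CART rules actually learned from the AESS data.

```python
import numpy as np

def short_term_features(frame):
    """Zero-crossing rate and mean energy of one audio frame."""
    zcr = np.mean(np.abs(np.diff(np.sign(frame)))) / 2
    energy = np.mean(frame ** 2)
    return zcr, energy

def classify(frame, zcr_thresh=0.1, energy_thresh=1e-4):
    """Toy threshold cascade: low energy -> silence, high ZCR -> speech-like."""
    zcr, energy = short_term_features(frame)
    if energy < energy_thresh:
        return "silence"
    return "speech" if zcr > zcr_thresh else "music"
```

    A trained CART model is exactly such a cascade of threshold tests, with the split features and cut points chosen from the labeled training half of the archive rather than fixed by hand.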

  20. Investigating the impact of audio instruction and audio-visual biofeedback for lung cancer radiation therapy

    NASA Astrophysics Data System (ADS)

    George, Rohini

    Lung cancer accounts for 13% of all cancers in the United States and is the leading cause of cancer deaths among both men and women. The five-year survival for lung cancer patients is approximately 15% (ACS Facts & Figures). Respiratory motion decreases the accuracy of thoracic radiotherapy during imaging and delivery. To account for respiration, margins are generally added during radiation treatment planning, which may cause substantial dose delivery to normal tissues and increase normal tissue toxicity. To alleviate these effects of respiratory motion, several motion management techniques are available that can reduce the doses to normal tissues, thereby reducing treatment toxicity and allowing dose escalation to the tumor. This may increase the survival probability of lung cancer patients receiving radiation therapy. However, the accuracy of these motion management techniques is limited by respiratory irregularity. The rationale of this thesis was to study the improvement in the regularity of respiratory motion achieved by breathing coaching of lung cancer patients using audio instructions and audio-visual biofeedback. A total of 331 patient respiratory motion traces, each four minutes in length, were collected from 24 lung cancer patients enrolled in an IRB-approved breathing-training protocol. It was determined that audio-visual biofeedback significantly improved the regularity of respiratory motion compared to free breathing and audio instruction, thus improving the accuracy of respiratory-gated radiotherapy. It was also observed that duty cycles below 30% showed insignificant reduction in residual motion, while above 50% there was a sharp increase in residual motion. The reproducibility of exhale-based gating was higher than that of inhale-based gating. In modeling the respiratory cycles, it was found that cosine and cosine^4 models had the best correlation with individual respiratory cycles. 
The overall respiratory motion probability distribution

  1. The implementation of Project-Based Learning in courses Audio Video to Improve Employability Skills

    NASA Astrophysics Data System (ADS)

    Sulistiyo, Edy; Kustono, Djoko; Purnomo; Sutaji, Eddy

    2018-04-01

    This paper presents project-based learning (PjBL) in Audio Video subjects at the Electrical Engineering Study Programme, Universitas Negeri Surabaya, which consists of two parts: the design of an audio-video prototype, and project-based assessment activities tailored to 21st-century employability skills. The purpose of this learning innovation is to apply in lab work what is taught in theory classes. PjBL aims to motivate students by centering teaching on problems from the world of work. The learning steps include: determine the fundamental question, design, develop a schedule, monitor the learners and their progress, test the results, evaluate the experience, assess the project, and assess the product. The results of the research showed the following levels of mastery: task design (78.6%), technical planning (39.3%), creativity (42.9%), innovation (46.4%), problem-solving skills (57.1%), communication skills (75%), oral expression (75%), searching for and understanding information (64.3%), collaborative work skills (71.4%), and classroom conduct (78.6%). In conclusion, when applying project-based learning in audio-video courses, instructors should reflect on and improve the aspects with mastery levels below 60%.

  2. Enhancing Navigation Skills through Audio Gaming.

    PubMed

    Sánchez, Jaime; Sáenz, Mauricio; Pascual-Leone, Alvaro; Merabet, Lotfi

    2010-01-01

    We present the design, development and initial cognitive evaluation of an Audio-based Environment Simulator (AbES). This software allows a blind user to navigate through a virtual representation of a real space for the purposes of training orientation and mobility skills. Our findings indicate that users feel satisfied and self-confident when interacting with the audio-based interface, and the embedded sounds allow them to correctly orient themselves and navigate within the virtual world. Furthermore, users are able to transfer spatial information acquired through virtual interactions into real world navigation and problem solving tasks.

  3. Audio steganography by amplitude or phase modification

    NASA Astrophysics Data System (ADS)

    Gopalan, Kaliappan; Wenndt, Stanley J.; Adams, Scott F.; Haddad, Darren M.

    2003-06-01

    This paper presents the results of embedding short covert message utterances on a host, or cover, utterance by modifying the phase or amplitude of perceptually masked or significant regions of the host. In the first method, the absolute phase at selected, perceptually masked frequency indices was changed to fixed, covert data-dependent values. Embedded bits were retrieved at the receiver from the phase at the selected frequency indices. Tests on embedding a GSM-coded covert utterance on clean and noisy host utterances showed no noticeable difference in the stego compared to the hosts in speech quality or spectrogram. A bit error rate of 2 out of 2800 was observed for a clean host utterance while no error occurred for a noisy host. In the second method, the absolute phase of 10 or fewer perceptually significant points in the host was set in accordance with covert data. This resulted in a stego with successful data retrieval and a slightly noticeable degradation in speech quality. Modifying the amplitude of perceptually significant points caused perceptible differences in the stego even with small changes of amplitude made at five points per frame. Finally, the stego obtained by altering the amplitude at perceptually masked points showed barely noticeable differences and excellent data recovery.
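
    The first method, forcing the phase of selected frequency bins to fixed data-dependent values, can be sketched as a single-frame round trip. This is a minimal illustration, not the authors' implementation: the bin indices below are arbitrary placeholders rather than perceptually masked regions identified by an auditory model.

```python
import numpy as np

def embed_bits(frame, bits, bins):
    """Force the phase of selected FFT bins to +pi/2 (bit 1) or -pi/2 (bit 0)."""
    spec = np.fft.rfft(frame)
    for bit, k in zip(bits, bins):
        # Keep the magnitude, replace only the phase.
        spec[k] = np.abs(spec[k]) * np.exp(1j * (np.pi / 2 if bit else -np.pi / 2))
    return np.fft.irfft(spec, n=len(frame))  # stego frame (real signal)

def extract_bits(frame, bins):
    """Recover bits from the sign of the phase at the chosen bins."""
    spec = np.fft.rfft(frame)
    return [1 if np.angle(spec[k]) > 0 else 0 for k in bins]
```

    In the paper the bins are chosen where the host's masking threshold hides the distortion; for a round-trip demonstration any mid-band indices work, since irfft followed by rfft reproduces the modified spectrum up to floating-point error.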

  4. Fall Detection Using Smartphone Audio Features.

    PubMed

    Cheffena, Michael

    2016-07-01

    An automated fall detection system based on smartphone audio features is developed. The spectrogram, mel frequency cepstral coefficients (MFCCs), linear predictive coding (LPC), and matching pursuit (MP) features of different fall and no-fall sound events are extracted from experimental data. Based on the extracted audio features, four different machine learning classifiers: k-nearest neighbor classifier (k-NN), support vector machine (SVM), least squares method (LSM), and artificial neural network (ANN) are investigated for distinguishing between fall and no-fall events. For each audio feature, the performance of each classifier in terms of sensitivity, specificity, accuracy, and computational complexity is evaluated. The best performance is achieved using spectrogram features with the ANN classifier, with sensitivity, specificity, and accuracy all above 98%. The classifier also has acceptable computational requirements for training and testing. The system is applicable in home environments where the phone is placed in the vicinity of the user.
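
    A spectrogram feature matrix of the kind fed to such classifiers can be computed with a short-time Fourier transform. A minimal sketch follows; the frame length and hop size are illustrative choices, not the parameters used in the paper.

```python
import numpy as np

def spectrogram(signal, frame_len=256, hop=128):
    """Magnitude STFT: one Hann-windowed FFT frame per row."""
    window = np.hanning(frame_len)
    frames = [signal[i:i + frame_len] * window
              for i in range(0, len(signal) - frame_len + 1, hop)]
    return np.abs(np.fft.rfft(frames, axis=1))
```

    Each row is then a feature vector for one time slice; a classifier such as the ANN in the paper is trained on these vectors (or on the whole matrix flattened per event).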

  5. Instructors' Experiences of Web Based Synchronous Communication using Two Way Audio and Direct Messaging

    ERIC Educational Resources Information Center

    Murphy, Elizabeth; Ciszewska-Carr, Justyna

    2007-01-01

    This paper reports on an exploratory case study designed to gain insight into instructors' experiences with web-based synchronous communication using two-way audio and direct messaging. We conducted semi-structured interviews with eight instructors who used "Elluminate Live" in their web-based, asynchronous courses in Education, Nursing,…

  6. The power of digital audio in interactive instruction: An unexploited medium

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Pratt, J.; Trainor, M.

    1989-01-01

    Widespread use of audio in computer-based training (CBT) occurred with the advent of the interactive videodisc technology. This paper discusses the alternative of digital audio, which, unlike videodisc audio, enables one to rapidly revise the audio used in the CBT and which may be used in nonvideo CBT applications as well. We also discuss techniques used in audio script writing, editing, and production. Results from evaluations indicate a high degree of user satisfaction. 4 refs.

  7. Enhancing Navigation Skills through Audio Gaming

    PubMed Central

    Sánchez, Jaime; Sáenz, Mauricio; Pascual-Leone, Alvaro; Merabet, Lotfi

    2014-01-01

    We present the design, development and initial cognitive evaluation of an Audio-based Environment Simulator (AbES). This software allows a blind user to navigate through a virtual representation of a real space for the purposes of training orientation and mobility skills. Our findings indicate that users feel satisfied and self-confident when interacting with the audio-based interface, and the embedded sounds allow them to correctly orient themselves and navigate within the virtual world. Furthermore, users are able to transfer spatial information acquired through virtual interactions into real world navigation and problem solving tasks. PMID:25505796

  8. Fingerprinting with Wow

    NASA Astrophysics Data System (ADS)

    Yu, Eugene; Craver, Scott

    2006-02-01

    Wow, or time warping caused by speed fluctuations in analog audio equipment, provides a wealth of applications in watermarking. Very subtle temporal distortion has been used to defeat watermarks, and as components in watermarking systems. In the image domain, the analogous warping of an image's canvas has been used both to defeat watermarks and also proposed to prevent collusion attacks on fingerprinting systems. In this paper, we explore how subliminal levels of wow can be used for steganography and fingerprinting. We present both a low-bitrate robust solution and a higher-bitrate solution intended for steganographic communication. As already observed, such a fingerprinting algorithm naturally discourages collusion by averaging, owing to flanging effects when misaligned audio is averaged. Another advantage of warping is that even when imperceptible, it can be beyond the reach of compression algorithms. We use this opportunity to debunk the common misconception that steganography is impossible under "perfect compression."
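
    Subliminal wow amounts to resampling the signal along a slowly varying time axis. A minimal sketch with an assumed sinusoidal warp follows; the depth and rate values are illustrative, not taken from the paper.

```python
import numpy as np

def apply_wow(signal, depth_s=0.001, rate_hz=0.5, fs=8000):
    """Resample along a time axis warped by a slow sinusoid."""
    n = np.arange(len(signal), dtype=float)
    # Instantaneous read position drifts by up to depth_s seconds.
    warped = n + depth_s * fs * np.sin(2 * np.pi * rate_hz * n / fs)
    return np.interp(warped, n, signal)  # linear interpolation at warped instants
```

    A detector would estimate the warp (for example from pitch drift) and read the embedded pattern from its phase; averaging two copies carrying different warps produces the flanging effect the paper notes as a deterrent to collusion.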

  9. A high efficiency PWM CMOS class-D audio power amplifier

    NASA Astrophysics Data System (ADS)

    Zhangming, Zhu; Lianxi, Liu; Yintang, Yang; Han, Lei

    2009-02-01

    Based on a differential closed-loop feedback technique and a differential pre-amp, a high efficiency PWM CMOS class-D audio power amplifier is proposed. A rail-to-rail PWM comparator with a window function has been embedded in the class-D audio power amplifier. Design results based on the CSMC 0.5 μm CMOS process show that the maximum efficiency is 90%, the PSRR is -75 dB, the power supply voltage range is 2.5-5.5 V, the THD+N at 1 kHz input frequency is less than 0.20%, the quiescent current with no load is 2.8 mA, and the shutdown current is 0.5 μA. The active area of the class-D audio power amplifier is about 1.47 × 1.52 mm². With this performance, the class-D audio power amplifier can be applied to several audio power systems.

  10. Effective Use of Audio Media in Multimedia Presentations.

    ERIC Educational Resources Information Center

    Kerr, Brenda

    This paper emphasizes research-based reasons for adding audio to multimedia presentations. The first section summarizes suggestions from a review of research on the effectiveness of audio media when accompanied by other forms of media; types of research studies (e.g., evaluation, intra-medium, and aptitude treatment interaction studies) are also…

  11. WebGL and web audio software lightweight components for multimedia education

    NASA Astrophysics Data System (ADS)

    Chang, Xin; Yuksel, Kivanc; Skarbek, Władysław

    2017-08-01

    The paper presents the results of our recent work on the development of the contemporary computing platform DC2 for multimedia education using WebGL and Web Audio, the W3C standards. Using the literate programming paradigm, the WEBSA educational tools were developed. They offer the user (student) access to an expandable collection of WebGL shaders and Web Audio scripts. The unique feature of DC2 is the option of literate programming, offered to both the author and the reader, to improve the interactivity of lightweight WebGL and Web Audio components. For instance, users can define source audio nodes (including synthetic sources), destination audio nodes, and audio processing nodes for operations such as sound wave shaping, spectral band filtering, and convolution-based modification. For WebGL, besides classic graphics effects based on mesh and fractal definitions, novel image processing and analysis by shaders is offered, such as nonlinear filtering, histograms of gradients, and Bayesian classifiers.

  12. High-Fidelity Piezoelectric Audio Device

    NASA Technical Reports Server (NTRS)

    Woodward, Stanley E.; Fox, Robert L.; Bryant, Robert G.

    2003-01-01

    ModalMax is a very innovative means of harnessing the vibration of a piezoelectric actuator to produce an energy-efficient low-profile device with high-bandwidth, high-fidelity audio response. The piezoelectric audio device outperforms many commercially available speakers made using speaker cones. The piezoelectric device weighs substantially less (4 g) than speaker cones that use magnets (10 g). ModalMax devices are extremely simple to fabricate: the entire audio device is made by lamination, and the simplicity of the design lends itself to lower cost. The piezoelectric audio device can be used without its acoustic chambers, resulting in a very low thickness of 0.023 in. (0.58 mm). It can also be completely encapsulated, which makes it very attractive for use in wet environments; encapsulation does not significantly alter the audio response. Its small size (see Figure 1) is applicable to many consumer electronic products, such as pagers, portable radios, headphones, laptop computers, computer monitors, toys, and electronic games. The audio device can also be used in automobile or aircraft sound systems.

  13. Highlight summarization in golf videos using audio signals

    NASA Astrophysics Data System (ADS)

    Kim, Hyoung-Gook; Kim, Jin Young

    2008-01-01

    In this paper, we present an automatic summarization of highlights in golf videos based on audio information alone, without video information. The proposed highlight summarization system is based on semantic audio segmentation and detection of action units from audio signals. Studio speech, field speech, music, and applause are segmented by means of sound classification, and swings are detected by impulse onset detection. Sounds like swing and applause form a complete action unit, while studio speech and music parts are used to anchor the program structure. Taking advantage of highly precise detection of applause, highlights are extracted effectively. Our experimental results show high classification precision on 18 golf games, demonstrating that the proposed system is effective and computationally efficient enough to apply the technology to embedded consumer electronic devices.

  14. Audio in Courseware: Design Knowledge Issues.

    ERIC Educational Resources Information Center

    Aarntzen, Diana

    1993-01-01

    Considers issues that need to be addressed when incorporating audio in courseware design. Topics discussed include functions of audio in courseware; the relationship between auditive and visual information; learner characteristics in relation to audio; events of instruction; and audio characteristics, including interactivity and speech technology.…

  15. Audio-visual speech experience with age influences perceived audio-visual asynchrony in speech.

    PubMed

    Alm, Magnus; Behne, Dawn

    2013-10-01

    Previous research indicates that perception of audio-visual (AV) synchrony changes in adulthood. Possible explanations for these age differences include a decline in hearing acuity, a decline in cognitive processing speed, and increased experience with AV binding. The current study aims to isolate the effect of AV experience by comparing synchrony judgments from 20 young adults (20 to 30 yrs) and 20 normal-hearing middle-aged adults (50 to 60 yrs), an age range for which a decline of cognitive processing speed is expected to be minimal. When presented with AV stop consonant syllables with asynchronies ranging from 440 ms audio-lead to 440 ms visual-lead, middle-aged adults showed significantly less tolerance for audio-lead than young adults. Middle-aged adults also showed a greater shift in their point of subjective simultaneity than young adults. Natural audio-lead asynchronies are arguably more predictable than natural visual-lead asynchronies, and this predictability may render audio-lead thresholds more prone to experience-related fine-tuning.

  16. SCOPES: steganography with compression using permutation search

    NASA Astrophysics Data System (ADS)

    Boorboor, Sahar; Zolfaghari, Behrouz; Mozafari, Saadat Pour

    2011-10-01

    LSB (Least Significant Bit) is a widely used method for image steganography, which hides the secret message as a bit stream in the LSBs of pixel bytes in the cover image. This paper proposes a variant of LSB named SCOPES that encodes and compresses the secret message while hiding it, by storing addresses instead of message bytes. Reducing the length of the stored message improves the storage capacity and makes the stego image visually less suspicious to third parties. The main idea behind the SCOPES approach is dividing the message into 3-character segments, seeking each segment in the cover image, and storing the address of the position containing the segment instead of the segment itself. In this approach, every permutation of the 3 bytes (if found) can be stored along with some extra bits indicating the permutation. In some rare cases the segment may not be found in the image, which can cause the message to be expanded by some overhead bits instead of being compressed. But experimental results show that SCOPES performs overall better than traditional LSB even in the worst cases.
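
    The LSB baseline that SCOPES improves on can be sketched in a few lines. This is the generic bit-stream embedding only, not the SCOPES address-and-permutation encoding itself.

```python
import numpy as np

def lsb_embed(cover, bits):
    """Write one message bit into the least significant bit of each cover byte."""
    flat = cover.astype(np.uint8).flatten()  # flatten() returns a copy
    for i, bit in enumerate(bits):
        flat[i] = (flat[i] & 0xFE) | bit  # clear LSB, then set it to the bit
    return flat.reshape(cover.shape)

def lsb_extract(stego, n_bits):
    """Read the message back from the first n_bits bytes."""
    return [int(b & 1) for b in stego.flatten()[:n_bits]]
```

    Each byte changes by at most 1, which is what keeps the stego image visually close to the cover; SCOPES then shortens the embedded bit stream itself by storing segment addresses instead of raw message bytes.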

  17. Advances in audio source separation and multisource audio content retrieval

    NASA Astrophysics Data System (ADS)

    Vincent, Emmanuel

    2012-06-01

    Audio source separation aims to extract the signals of individual sound sources from a given recording. In this paper, we review three recent advances which improve the robustness of source separation in real-world challenging scenarios and enable its use for multisource content retrieval tasks, such as automatic speech recognition (ASR) or acoustic event detection (AED) in noisy environments. We present a Flexible Audio Source Separation Toolkit (FASST) and discuss its advantages compared to earlier approaches such as independent component analysis (ICA) and sparse component analysis (SCA). We explain how cues as diverse as harmonicity, spectral envelope, temporal fine structure or spatial location can be jointly exploited by this toolkit. We subsequently present the uncertainty decoding (UD) framework for the integration of audio source separation and audio content retrieval. We show how the uncertainty about the separated source signals can be accurately estimated and propagated to the features. Finally, we explain how this uncertainty can be efficiently exploited by a classifier, both at the training and the decoding stage. We illustrate the resulting performance improvements in terms of speech separation quality and speaker recognition accuracy.

  18. The Use of Audio in Computer-Based Instruction.

    ERIC Educational Resources Information Center

    Koroghlanian, Carol M.; Sullivan, Howard J.

    This study investigated the effects of audio and text density on the achievement, time-in-program, and attitudes of 134 undergraduates. Data concerning the subjects' preexisting computer skills and experience, as well as demographic information, were also collected. The instruction in visual design principles was delivered by computer and included…

  19. Efficient audio signal processing for embedded systems

    NASA Astrophysics Data System (ADS)

    Chiu, Leung Kin

    As mobile platforms continue to pack on more computational power, electronics manufacturers have started to differentiate their products by enhancing the audio features. However, consumers also demand smaller devices that can operate for a longer time, hence imposing design constraints. In this research, we investigate two design strategies that allow us to efficiently process audio signals on embedded systems such as mobile phones and portable electronics. In the first strategy, we exploit properties of the human auditory system to process audio signals. We designed a sound enhancement algorithm to make piezoelectric loudspeakers sound "richer" and "fuller." Piezoelectric speakers have a small form factor but exhibit poor response in the low-frequency region. In the algorithm, we combine psychoacoustic bass extension and dynamic range compression to improve the perceived bass coming out of the tiny speakers. We also developed an audio energy reduction algorithm for loudspeaker power management. The perceptually transparent algorithm extends the battery life of mobile devices and prevents thermal damage in speakers. This method is similar to audio compression algorithms, which encode audio signals in such a way that the compression artifacts are not easily perceivable. Instead of reducing the storage space, however, we suppress the audio content that is below the hearing threshold, thereby reducing the signal energy. In the second strategy, we use low-power analog circuits to process the signal before digitizing it. We designed an analog front-end for sound detection and implemented it on a field-programmable analog array (FPAA). The system is an example of an analog-to-information converter. The sound classifier front-end can be used in a wide range of applications because programmable floating-gate transistors are employed to store classifier weights. Moreover, we incorporated a feature selection algorithm to simplify the analog front-end. A machine

  20. Detection and characterization of lightning-based sources using continuous wavelet transform: application to audio-magnetotellurics

    NASA Astrophysics Data System (ADS)

    Larnier, H.; Sailhac, P.; Chambodut, A.

    2018-01-01

    Atmospheric electromagnetic waves created by global lightning activity contain information about electrical processes of the inner and the outer Earth. Large signal-to-noise ratio events are particularly interesting because they convey information about electromagnetic properties along their path. We introduce a new methodology to automatically detect and characterize lightning-based waves using a time-frequency decomposition obtained through the application of continuous wavelet transform. We focus specifically on three types of sources, namely atmospherics, slow tails and whistlers, which cover the frequency range 10 Hz to 10 kHz. Each wave has distinguishable characteristics in the time-frequency domain due to source shape and dispersion processes. Our methodology allows automatic detection of each type of event in the time-frequency decomposition thanks to their specific signature. Horizontal polarization attributes are also recovered in the time-frequency domain. This procedure is first applied to synthetic extremely low frequency time-series with different signal-to-noise ratios to test for robustness. We then apply it to real data: three stations of audio-magnetotelluric data acquired in Guadeloupe, an overseas French territory. Most of the analysed atmospherics and slow tails display linear polarization, whereas analysed whistlers are elliptically polarized. The diversity of lightning activity is finally analysed in an audio-magnetotelluric data processing framework, as used in subsurface prospecting, through estimation of the impedance response functions. We show that audio-magnetotelluric processing results depend mainly on the frequency content of electromagnetic waves observed in processed time-series, with an emphasis on the difference between morning and afternoon acquisition. Our new methodology based on the time-frequency signature of lightning-induced electromagnetic waves allows automatic detection and characterization of events in audio
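The core of the detection pipeline, a continuous wavelet transform whose modulus localizes impulsive events in time and frequency, can be sketched as follows; the Morlet wavelet, sampling rate, and test burst are illustrative stand-ins for the paper's actual setup:

```python
import numpy as np

def morlet_cwt(signal, scales, w0=6.0):
    """Continuous wavelet transform with a complex Morlet wavelet.

    Returns a (len(scales), len(signal)) time-frequency map whose
    modulus localizes transient events (atmospherics, slow tails,
    whistlers) in both time and frequency.
    """
    n = len(signal)
    t = np.arange(n) - n // 2                     # wavelet support (samples)
    out = np.empty((len(scales), n), dtype=complex)
    for i, s in enumerate(scales):
        wavelet = (np.exp(1j * w0 * t / s) * np.exp(-((t / s) ** 2) / 2)
                   / np.sqrt(s))
        out[i] = np.convolve(signal, wavelet, mode="same")
    return out

fs = 1000.0                                       # illustrative sampling rate, Hz
t = np.arange(0, 1, 1 / fs)
sig = np.sin(2 * np.pi * 50 * t) * (t > 0.5)      # 50 Hz burst in the second half
scale = 6.0 * fs / (2 * np.pi * 50.0)             # scale matched to 50 Hz
power = np.abs(morlet_cwt(sig, [scale]))
# the scalogram power concentrates over the burst, which is the signature
# that automatic detection thresholds in the time-frequency plane
```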

  1. ENERGY STAR Certified Audio Video

    EPA Pesticide Factsheets

    Certified models meet all ENERGY STAR requirements as listed in the Version 3.0 ENERGY STAR Program Requirements for Audio Video Equipment that are effective as of May 1, 2013. A detailed listing of key efficiency criteria is available at http://www.energystar.gov/index.cfm?c=audio_dvd.pr_crit_audio_dvd

  2. Kernel-Based Sensor Fusion With Application to Audio-Visual Voice Activity Detection

    NASA Astrophysics Data System (ADS)

    Dov, David; Talmon, Ronen; Cohen, Israel

    2016-12-01

    In this paper, we address the problem of multiple-view data fusion in the presence of noise and interferences. Recent studies have approached this problem using kernel methods, relying in particular on a product of kernels constructed separately for each view. From a graph theory point of view, we analyze this fusion approach in a discrete setting. More specifically, based on a statistical model for the connectivity between data points, we propose an algorithm for the selection of the kernel bandwidth, a parameter which, as we show, has important implications for the robustness of this fusion approach to interferences. Then, we consider the fusion of audio-visual speech signals measured by a single microphone and by a video camera pointed at the face of the speaker. Specifically, we address the task of voice activity detection, i.e., the detection of speech and non-speech segments, in the presence of structured interferences such as keyboard taps and office noise. We propose an algorithm for voice activity detection based on the audio-visual signal. Simulation results show that the proposed algorithm outperforms competing fusion and voice activity detection approaches. In addition, we demonstrate that a proper selection of the kernel bandwidth indeed leads to improved performance.
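The product-of-kernels fusion that the paper builds on can be sketched as below; the Gaussian kernel and the median-distance bandwidth heuristic are common stand-ins, not the paper's statistically derived bandwidth-selection rule:

```python
import numpy as np

def gaussian_kernel(X, sigma):
    """Affinity matrix K[i, j] = exp(-||x_i - x_j||^2 / (2 sigma^2))."""
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2 * sigma ** 2))

def median_bandwidth(X):
    """Median pairwise distance: a common heuristic standing in for the
    paper's statistically motivated bandwidth selection."""
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    return np.sqrt(np.median(d2[d2 > 0]))

def fused_kernel(X_audio, X_video):
    """Product-of-kernels fusion: two samples are strongly connected only
    if they are close in BOTH views, which suppresses interferences that
    corrupt a single view (e.g. keyboard taps in the audio)."""
    Ka = gaussian_kernel(X_audio, median_bandwidth(X_audio))
    Kv = gaussian_kernel(X_video, median_bandwidth(X_video))
    return Ka * Kv

rng = np.random.default_rng(0)
K = fused_kernel(rng.normal(size=(10, 3)), rng.normal(size=(10, 2)))
```

Because the kernels multiply, an overly small bandwidth in either view disconnects the graph while an overly large one blurs it, which is why the bandwidth choice matters so much for robustness.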

  3. A Novel Image Steganography Technique for Secured Online Transaction Using DWT and Visual Cryptography

    NASA Astrophysics Data System (ADS)

    Anitha Devi, M. D.; ShivaKumar, K. B.

    2017-08-01

    The online payment ecosystem is a prime target for cyber fraud, so end-to-end encryption is essential to maintain the integrity of secret information related to transactions carried out online. With access to payment-related sensitive information, which enables a large volume of money transfers every day, the payment infrastructure is a major target for hackers. The proposed system highlights an approach for secure online fund transfer that combines visual cryptography with a Haar-based discrete wavelet transform steganography technique. This combination of data-hiding techniques reduces the amount of information shared between the consumer and the online merchant for a successful online transaction, while providing enhanced security for the customer's account details, thereby increasing customer confidence and helping to prevent "identity theft" and "phishing". To evaluate the effectiveness of the proposed algorithm, root mean square error and peak signal-to-noise ratio are used as evaluation parameters.
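A hedged sketch of the wavelet-domain hiding step: a one-level Haar DWT, payload bits embedded in the parity of quantized diagonal-detail (HH) coefficients, and the PSNR fidelity check named in the abstract. The quantization step and subband choice are illustrative; the paper's exact embedding rule and its visual-cryptography layer are not reproduced here.

```python
import numpy as np

def haar2d(img):
    """One-level 2-D Haar DWT: returns (LL, LH, HL, HH) subbands."""
    a, d = (img[0::2] + img[1::2]) / 2, (img[0::2] - img[1::2]) / 2
    return ((a[:, 0::2] + a[:, 1::2]) / 2, (a[:, 0::2] - a[:, 1::2]) / 2,
            (d[:, 0::2] + d[:, 1::2]) / 2, (d[:, 0::2] - d[:, 1::2]) / 2)

def ihaar2d(ll, lh, hl, hh):
    """Inverse of haar2d (perfect reconstruction)."""
    h, w = ll.shape
    a, d = np.empty((h, 2 * w)), np.empty((h, 2 * w))
    a[:, 0::2], a[:, 1::2] = ll + lh, ll - lh
    d[:, 0::2], d[:, 1::2] = hl + hh, hl - hh
    out = np.empty((2 * h, 2 * w))
    out[0::2], out[1::2] = a + d, a - d
    return out

def embed(img, bits, q=4.0):
    """Hide bits in the parity of quantized HH coefficients."""
    ll, lh, hl, hh = haar2d(img.astype(float))
    flat = hh.reshape(-1)
    k = np.rint(flat[: len(bits)] / q).astype(int)
    k = k + (k % 2 != bits)           # adjust parity to encode each bit
    flat[: len(bits)] = k * q
    return ihaar2d(ll, lh, hl, hh)

def extract(stego, n, q=4.0):
    _, _, _, hh = haar2d(stego)
    return np.rint(hh.reshape(-1)[:n] / q).astype(int) % 2

def psnr(a, b):
    """Peak signal-to-noise ratio, the fidelity metric used in the paper."""
    return 10 * np.log10(255.0 ** 2 / np.mean((a - b) ** 2))

rng = np.random.default_rng(0)
img = rng.integers(0, 256, size=(8, 8)).astype(float)
bits = rng.integers(0, 2, size=16)
stego = embed(img, bits)
```

Parity embedding survives the inverse/forward transform round trip because the Haar pair above gives perfect reconstruction.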

  4. Audio-frequency analysis of inductive voltage dividers based on structural models

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Avramov, S.; Oldham, N.M.; Koffman, A.D.

    1994-12-31

    A Binary Inductive Voltage Divider (BIVD) is compared with a Decade Inductive Voltage Divider (DIVD) in an automatic IVD bridge. New detection and injection circuitry was designed and used to evaluate the IVDs with either the input or output tied to ground potential. In the audio frequency range the DIVD and BIVD error patterns are characterized for both in-phase and quadrature components. Differences between results obtained using a new error decomposition scheme based on structural modeling, and measurements using conventional IVD standards are reported.

  5. Audio-Visual, Visuo-Tactile and Audio-Tactile Correspondences in Preschoolers.

    PubMed

    Nava, Elena; Grassi, Massimo; Turati, Chiara

    2016-01-01

    Interest in crossmodal correspondences has recently seen a renaissance thanks to numerous studies in human adults. Yet, still very little is known about crossmodal correspondences in children, particularly in sensory pairings other than audition and vision. In the current study, we investigated whether 4-5-year-old children match auditory pitch to the spatial motion of visual objects (audio-visual condition). In addition, we investigated whether this correspondence extends to touch, i.e., whether children also match auditory pitch to the spatial motion of touch (audio-tactile condition) and the spatial motion of visual objects to touch (visuo-tactile condition). In two experiments, two different groups of children were asked to indicate which of two stimuli fitted best with a centrally located third stimulus (Experiment 1), or to report whether two presented stimuli fitted together well (Experiment 2). We found sensitivity to the congruency of all of the sensory pairings only in Experiment 2, suggesting that only under specific circumstances can these correspondences be observed. Our results suggest that pitch-height correspondences for audio-visual and audio-tactile combinations may still be weak in preschool children; we speculate that this could be due to immature linguistic and auditory cues that are still developing at age five.

  6. Age Matters: Student Experiences with Audio Learning Guides in University-Based Continuing Education

    ERIC Educational Resources Information Center

    Mercer, Lorraine; Pianosi, Birgit

    2012-01-01

    The primary objective of this research was to explore the experiences of undergraduate distance education students using sample audio versions (provided on compact disc) of the learning guides for their courses. The results of this study indicated that students responded positively to the opportunity to have word-for-word audio versions of their…

  7. Audio distribution and Monitoring Circuit

    NASA Technical Reports Server (NTRS)

    Kirkland, J. M.

    1983-01-01

    Versatile circuit accepts and distributes TV audio signals. Three-meter audio distribution and monitoring circuit provides flexibility in monitoring, mixing, and distributing audio inputs and outputs at various signal and impedance levels. Program material is simultaneously monitored on three channels, or single-channel version built to monitor transmitted or received signal levels, drive speakers, interface to building communications, and drive long-line circuits.

  8. Could Audio-Described Films Benefit from Audio Introductions? An Audience Response Study

    ERIC Educational Resources Information Center

    Romero-Fresco, Pablo; Fryer, Louise

    2013-01-01

    Introduction: Time constraints limit the quantity and type of information conveyed in audio description (AD) for films, in particular the cinematic aspects. Inspired by introductory notes for theatre AD, this study developed audio introductions (AIs) for "Slumdog Millionaire" and "Man on Wire." Each AI comprised 10 minutes of…

  9. Robot Command Interface Using an Audio-Visual Speech Recognition System

    NASA Astrophysics Data System (ADS)

    Ceballos, Alexánder; Gómez, Juan; Prieto, Flavio; Redarce, Tanneguy

    In recent years audio-visual speech recognition has emerged as an active field of research thanks to advances in pattern recognition, signal processing and machine vision. Its ultimate goal is to allow human-computer communication using voice, taking into account the visual information contained in the audio-visual speech signal. This document presents an automatic command recognition system using audio-visual information. The system is intended to control the da Vinci laparoscopic surgical robot. The audio signal is processed using the Mel Frequency Cepstral Coefficients parametrization method. In addition, features based on the points that define the mouth's outer contour according to the MPEG-4 standard are used to extract the visual speech information.

  10. Audio 2008: Audio Fixation

    ERIC Educational Resources Information Center

    Kaye, Alan L.

    2008-01-01

    Take a look around the bus or subway and see just how many people are bumping along to an iPod or an MP3 player. What they are listening to is their secret, but the many signature earbuds in sight should give one a real sense of just how pervasive digital audio has become. This article describes how that popularity is mirrored in library audio…

  11. Steganalysis of recorded speech

    NASA Astrophysics Data System (ADS)

    Johnson, Micah K.; Lyu, Siwei; Farid, Hany

    2005-03-01

    Digital audio provides a suitable cover for high-throughput steganography. At 16 bits per sample and sampled at a rate of 44,100 Hz, digital audio has the bit-rate to support large messages. In addition, audio is often transient and unpredictable, facilitating the hiding of messages. Using an approach similar to our universal image steganalysis, we show that hidden messages alter the underlying statistics of audio signals. Our statistical model begins by building a linear basis that captures certain statistical properties of audio signals. A low-dimensional statistical feature vector is extracted from this basis representation and used by a non-linear support vector machine for classification. We show the efficacy of this approach on LSB embedding and Hide4PGP. While no explicit assumptions about the content of the audio are made, our technique has been developed and tested on high-quality recorded speech.
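The LSB embedding evaluated in this steganalysis work is simple to state: overwrite the least-significant bit of each 16-bit sample with a payload bit. A minimal sketch (the cover signal and payload size here are illustrative):

```python
import numpy as np

def lsb_embed(samples, bits):
    """Replace the least-significant bit of the first len(bits) samples.

    At 16 bits per sample the change is at most 1/32768 of full scale,
    which is why LSB embedding is perceptually transparent.
    """
    out = samples.copy()
    out[: len(bits)] = (out[: len(bits)] & ~np.int16(1)) | bits
    return out

def lsb_extract(samples, n):
    """Read the payload back from the least-significant bits."""
    return (samples[:n] & 1).astype(np.int16)

rng = np.random.default_rng(0)
cover = rng.integers(-2000, 2000, size=1000).astype(np.int16)  # stand-in audio
payload = rng.integers(0, 2, size=256).astype(np.int16)
stego = lsb_embed(cover, payload)
```

The per-sample distortion never exceeds one quantization step, yet the operation disturbs the fine-grained statistics of the signal, which is the kind of trace that statistical steganalysis features are built to detect.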

  12. Direct broadcast satellite-audio, portable and mobile reception tradeoffs

    NASA Technical Reports Server (NTRS)

    Golshan, Nasser

    1992-01-01

    This paper reports on the findings of a system tradeoffs study on direct broadcast satellite-radio (DBS-R). Based on emerging advanced subband and transform audio coding systems, four ranges of bit rates: 16-32 kbps, 48-64 kbps, 96-128 kbps and 196-256 kbps are identified for DBS-R. The corresponding grades of audio quality will be subjectively comparable to AM broadcasting, monophonic FM, stereophonic FM, and CD-quality audio, respectively. The satellite EIRPs needed for mobile DBS-R reception in suburban areas are sufficient for portable reception in most single-family houses when allowance is made for the higher G/T of portable table-top receivers. As an example, the variation of the space segment cost as a function of frequency, audio quality, coverage capacity, and beam size is explored for a typical DBS-R system.

  13. Temporal phase mask encrypted optical steganography carried by amplified spontaneous emission noise.

    PubMed

    Wu, Ben; Wang, Zhenxing; Shastri, Bhavin J; Chang, Matthew P; Frost, Nicholas A; Prucnal, Paul R

    2014-01-13

    A temporal phase mask encryption method is proposed and experimentally demonstrated to improve the security of the stealth channel in an optical steganography system. The stealth channel is protected at two levels. At the first level, the data is carried by amplified spontaneous emission (ASE) noise, which cannot be detected in either the time domain or the spectral domain. At the second level, even if an eavesdropper suspects the existence of the stealth channel, each data bit is covered by a fast-changing phase mask. The phase mask code is always combined with the wideband noise from ASE. Without knowing the right phase mask code to recover the stealth data, the eavesdropper can only receive a noise-like signal.
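The principle of the second protection level, data recoverable only by re-applying the same key-driven mask, can be illustrated in baseband with a pseudorandom +/-1 mask standing in for the optical phase mask (the seed, chip count, and data below are all illustrative):

```python
import numpy as np

KEY = 2014  # shared secret seed, standing in for the phase mask code

def apply_mask(chips, seed=KEY):
    """Multiply data chips by a key-derived pseudorandom +/-1 mask.

    This mimics the role of the fast-changing phase mask: without the
    key the masked stream looks like noise, while applying the same
    mask twice (mask * mask = 1) recovers the data exactly.
    """
    mask = np.random.default_rng(seed).choice([1.0, -1.0], size=len(chips))
    return chips * mask

data = np.array([1.0, -1.0, 1.0, 1.0, -1.0, -1.0, 1.0, -1.0] * 8)
masked = apply_mask(data)                 # transmitted, noise-like
recovered = apply_mask(masked)            # same key: data restored
wrong = apply_mask(masked, seed=999)      # wrong key: still noise-like
```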

  14. Instrumental Landing Using Audio Indication

    NASA Astrophysics Data System (ADS)

    Burlak, E. A.; Nabatchikov, A. M.; Korsun, O. N.

    2018-02-01

    The paper proposes an audio indication method for presenting to a pilot information regarding the relative position of an aircraft in precision piloting tasks. The implementation of the method is presented, and the use of audio signal parameters such as loudness, frequency and modulation is discussed. To confirm the operability of the audio indication channel, experiments were carried out using a modern aircraft simulation facility. Operators performed instrument landings using the proposed audio method to indicate aircraft deviations relative to the glide path. The results proved comparable with simulated instrument landings using the traditional glideslope pointers. This encourages developing the method further to address other precision piloting tasks.

  15. A centralized audio presentation manager

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Papp, A.L. III; Blattner, M.M.

    1994-05-16

    The centralized audio presentation manager addresses the problems which occur when multiple programs running simultaneously attempt to use the audio output of a computer system. Time dependence of sound means that certain auditory messages must be scheduled simultaneously, which can lead to perceptual problems due to psychoacoustic phenomena. Furthermore, the combination of speech and nonspeech audio is examined; each presents its own problems of perceptibility in an acoustic environment composed of multiple auditory streams. The centralized audio presentation manager receives abstract parameterized message requests from the currently running programs, and attempts to create and present a sonic representation in the most perceptible manner through the use of a theoretically and empirically designed rule set.

  16. Realization of guitar audio effects using methods of digital signal processing

    NASA Astrophysics Data System (ADS)

    Buś, Szymon; Jedrzejewski, Konrad

    2015-09-01

    The paper studies the possibilities of realizing guitar audio effects by means of digital signal processing. As a result of this research, selected audio effects suited to the specifics of the guitar sound were realized in a real-time system called the Digital Guitar Multi-effect. Before implementation in the system, the selected effects were investigated using a dedicated application with a graphical user interface created in the Matlab environment. In the second stage, a real-time system based on a microcontroller and an audio codec was designed and realized. The system performs audio effects on the output signal of an electric guitar.
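Two staple guitar effects that such a multi-effect system typically implements can be sketched directly; the parameter values are illustrative, not those of the described system:

```python
import numpy as np

def overdrive(x, drive=5.0):
    """Soft-clipping tanh waveshaper: the classic overdrive/distortion."""
    return np.tanh(drive * x) / np.tanh(drive)

def echo(x, delay, feedback=0.4, mix=0.5):
    """Feedback delay line producing repeating, decaying echoes."""
    buf = np.zeros(delay)                 # circular delay buffer
    out = np.empty_like(x)
    for i, s in enumerate(x):
        d = buf[i % delay]                # sample delayed by `delay` steps
        out[i] = (1 - mix) * s + mix * d
        buf[i % delay] = s + feedback * d # feed the echo back into the line
    return out

t = np.arange(0, 0.1, 1 / 44100.0)
guitar = 0.8 * np.sin(2 * np.pi * 110 * t)   # idealized A2 string tone
processed = echo(overdrive(guitar), delay=441)
```

In a microcontroller implementation the same per-sample loop runs inside the audio codec's interrupt handler, which is why effects like these are natural candidates for a real-time embedded system.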

  17. 366-AAA_audio

    NASA Image and Video Library

    1969-11-17

    Apollo 12 Public Affairs Officer (PAO) Mission Commentary, November 17, 1969. This is an hour of audio covering communications occurring between 64 hours, 38 minutes into the mission, through 79 hours, 2 minutes which was on November 17, 1969, from 0300-17:09 CST. Transcript of attached audio is available at http://www.jsc.nasa.gov/history/mission_trans/AS12_PAO.PDF, on pages 207-224 of the 979-page document.

  18. MPEG-7 audio-visual indexing test-bed for video retrieval

    NASA Astrophysics Data System (ADS)

    Gagnon, Langis; Foucher, Samuel; Gouaillier, Valerie; Brun, Christelle; Brousseau, Julie; Boulianne, Gilles; Osterrath, Frederic; Chapdelaine, Claude; Dutrisac, Julie; St-Onge, Francis; Champagne, Benoit; Lu, Xiaojian

    2003-12-01

    This paper reports on the development status of a Multimedia Asset Management (MAM) test-bed for content-based indexing and retrieval of audio-visual documents within the MPEG-7 standard. The project, called "MPEG-7 Audio-Visual Document Indexing System" (MADIS), specifically targets the indexing and retrieval of video shots and key frames from documentary film archives, based on audio-visual content like face recognition, motion activity, speech recognition and semantic clustering. The MPEG-7/XML encoding of the film database is done off-line. The description decomposition is based on a temporal decomposition into visual segments (shots), key frames and audio/speech sub-segments. The visible outcome will be a web site that allows video retrieval using a proprietary XQuery-based search engine, accessible to members at the Canadian National Film Board (NFB) Cineroute site. For example, an end-user will be able to ask for movie shots in the database that were produced in a specific year, that contain the face of a specific actor saying a specific word, and in which there is no motion activity. Video streaming is performed over the high-bandwidth CA*net network deployed by CANARIE, a public Canadian Internet development organization.

  19. Musical examination to bridge audio data and sheet music

    NASA Astrophysics Data System (ADS)

    Pan, Xunyu; Cross, Timothy J.; Xiao, Liangliang; Hei, Xiali

    2015-03-01

    The digitalization of audio is commonly implemented for the purpose of convenient storage and transmission of music and songs in today's digital age. Analyzing digital audio for an insightful look at a specific musical characteristic, however, can be quite challenging for various types of applications. Many existing musical analysis techniques can examine a particular piece of audio data. For example, the frequency of digital sound can be easily read and identified at a specific section in an audio file. Based on this information, we could determine the musical note being played at that instant, but what if you want to see a list of all the notes played in a song? While most existing methods help to provide information about a single piece of the audio data at a time, few of them can analyze the available audio file on a larger scale. The research conducted in this work considers how to further utilize the examination of audio data by storing more information from the original audio file. In practice, we develop a novel musical analysis system Musicians Aid to process musical representation and examination of audio data. Musicians Aid solves the previous problem by storing and analyzing the audio information as it reads it rather than tossing it aside. The system can provide professional musicians with an insightful look at the music they created and advance their understanding of their work. Amateur musicians could also benefit from using it solely for the purpose of obtaining feedback about a song they were attempting to play. By comparing our system's interpretation of traditional sheet music with their own playing, a musician could ensure what they played was correct. More specifically, the system could show them exactly where they went wrong and how to adjust their mistakes. In addition, the application could be extended over the Internet to allow users to play music with one another and then review the audio data they produced. This would be particularly
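The note-identification step the abstract describes, mapping a detected frequency to a musical note, is a small computation over the equal-tempered scale; this sketch is the generic recipe, not Musicians Aid's actual code:

```python
import numpy as np

NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]

def freq_to_note(freq):
    """Map a frequency in Hz to the nearest equal-tempered note name,
    using A4 = 440 Hz (MIDI note 69) as the reference."""
    midi = int(round(69 + 12 * np.log2(freq / 440.0)))
    return NOTE_NAMES[midi % 12] + str(midi // 12 - 1)

print(freq_to_note(440.0))    # A4
print(freq_to_note(261.63))   # C4 (middle C)
```

Running this mapping over every frequency peak detected in successive analysis frames, rather than at a single instant, is what turns per-sample frequency readings into the full list of notes played in a song.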

  20. 36 CFR 1002.12 - Audio disturbances.

    Code of Federal Regulations, 2014 CFR

    2014-07-01

    ... 36 Parks, Forests, and Public Property 3 2014-07-01 2014-07-01 false Audio disturbances. 1002.12... RECREATION § 1002.12 Audio disturbances. (a) The following are prohibited: (1) Operating motorized equipment or machinery such as an electric generating plant, motor vehicle, motorized toy, or an audio device...

  1. 36 CFR 1002.12 - Audio disturbances.

    Code of Federal Regulations, 2012 CFR

    2012-07-01

    ... 36 Parks, Forests, and Public Property 3 2012-07-01 2012-07-01 false Audio disturbances. 1002.12... RECREATION § 1002.12 Audio disturbances. (a) The following are prohibited: (1) Operating motorized equipment or machinery such as an electric generating plant, motor vehicle, motorized toy, or an audio device...

  2. 50 CFR 27.72 - Audio equipment.

    Code of Federal Regulations, 2010 CFR

    2010-10-01

    ... 50 Wildlife and Fisheries 6 2010-10-01 2010-10-01 false Audio equipment. 27.72 Section 27.72 Wildlife and Fisheries UNITED STATES FISH AND WILDLIFE SERVICE, DEPARTMENT OF THE INTERIOR (CONTINUED) THE... Audio equipment. The operation or use of audio devices including radios, recording and playback devices...

  3. 36 CFR 1002.12 - Audio disturbances.

    Code of Federal Regulations, 2011 CFR

    2011-07-01

    ... 36 Parks, Forests, and Public Property 3 2011-07-01 2011-07-01 false Audio disturbances. 1002.12... RECREATION § 1002.12 Audio disturbances. (a) The following are prohibited: (1) Operating motorized equipment or machinery such as an electric generating plant, motor vehicle, motorized toy, or an audio device...

  4. 36 CFR 1002.12 - Audio disturbances.

    Code of Federal Regulations, 2010 CFR

    2010-07-01

    ... 36 Parks, Forests, and Public Property 3 2010-07-01 2010-07-01 false Audio disturbances. 1002.12... RECREATION § 1002.12 Audio disturbances. (a) The following are prohibited: (1) Operating motorized equipment or machinery such as an electric generating plant, motor vehicle, motorized toy, or an audio device...

  5. 50 CFR 27.72 - Audio equipment.

    Code of Federal Regulations, 2011 CFR

    2011-10-01

    ... 50 Wildlife and Fisheries 8 2011-10-01 2011-10-01 false Audio equipment. 27.72 Section 27.72 Wildlife and Fisheries UNITED STATES FISH AND WILDLIFE SERVICE, DEPARTMENT OF THE INTERIOR (CONTINUED) THE... Audio equipment. The operation or use of audio devices including radios, recording and playback devices...

  6. 50 CFR 27.72 - Audio equipment.

    Code of Federal Regulations, 2012 CFR

    2012-10-01

    ... 50 Wildlife and Fisheries 9 2012-10-01 2012-10-01 false Audio equipment. 27.72 Section 27.72 Wildlife and Fisheries UNITED STATES FISH AND WILDLIFE SERVICE, DEPARTMENT OF THE INTERIOR (CONTINUED) THE... Audio equipment. The operation or use of audio devices including radios, recording and playback devices...

  7. A new 4-D chaotic hyperjerk system, its synchronization, circuit design and applications in RNG, image encryption and chaos-based steganography

    NASA Astrophysics Data System (ADS)

    Vaidyanathan, S.; Akgul, A.; Kaçar, S.; Çavuşoğlu, U.

    2018-02-01

    Hyperjerk systems have received significant interest in the literature because of their simple structure and complex dynamical properties. This work presents a new chaotic hyperjerk system having two exponential nonlinearities. Dynamical properties of the chaotic hyperjerk system are discovered through equilibrium point analysis, bifurcation diagram, dissipativity and Lyapunov exponents. Moreover, an adaptive backstepping controller is designed for the synchronization of the chaotic hyperjerk system. Also, a real circuit of the chaotic hyperjerk system has been carried out to show the feasibility of the theoretical hyperjerk model. The chaotic hyperjerk system can also be useful in scientific fields such as Random Number Generators (RNGs), data security, data hiding, etc. In this work, three implementations of the chaotic hyperjerk system, viz. RNG, image encryption and sound steganography have been performed by using complex dynamics characteristics of the system.
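The RNG application follows the usual chaos-based recipe: iterate a chaotic system and harvest bits from its trajectory. As a stand-in for the paper's 4-D hyperjerk flow (whose equations are not reproduced here), the sketch below thresholds the logistic map:

```python
def chaos_bits(n, x0=0.123456, r=3.99):
    """Harvest bits by thresholding a chaotic trajectory.

    The logistic map x <- r*x*(1-x) stands in for the paper's hyperjerk
    system; the extraction step (threshold the state, keep one bit per
    iteration) is the generic chaos-based RNG idea. A deployable RNG
    would additionally post-process the bits and pass statistical tests.
    """
    x, bits = x0, []
    for _ in range(n):
        x = r * x * (1 - x)
        bits.append(1 if x > 0.5 else 0)
    return bits

bits = chaos_bits(1000)
```

Sensitivity to the initial condition x0 plays the role of the seed: nearby seeds diverge exponentially, so the bit streams quickly decorrelate.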

  8. 36 CFR 2.12 - Audio disturbances.

    Code of Federal Regulations, 2012 CFR

    2012-07-01

    ... 36 Parks, Forests, and Public Property 1 2012-07-01 2012-07-01 false Audio disturbances. 2.12... RESOURCE PROTECTION, PUBLIC USE AND RECREATION § 2.12 Audio disturbances. (a) The following are prohibited..., motorized toy, or an audio device, such as a radio, television set, tape deck or musical instrument, in a...

  9. 36 CFR 2.12 - Audio disturbances.

    Code of Federal Regulations, 2010 CFR

    2010-07-01

    ... 36 Parks, Forests, and Public Property 1 2010-07-01 2010-07-01 false Audio disturbances. 2.12... RESOURCE PROTECTION, PUBLIC USE AND RECREATION § 2.12 Audio disturbances. (a) The following are prohibited..., motorized toy, or an audio device, such as a radio, television set, tape deck or musical instrument, in a...

  10. 36 CFR 2.12 - Audio disturbances.

    Code of Federal Regulations, 2013 CFR

    2013-07-01

    ... 36 Parks, Forests, and Public Property 1 2013-07-01 2013-07-01 false Audio disturbances. 2.12... RESOURCE PROTECTION, PUBLIC USE AND RECREATION § 2.12 Audio disturbances. (a) The following are prohibited..., motorized toy, or an audio device, such as a radio, television set, tape deck or musical instrument, in a...

  11. 36 CFR 2.12 - Audio disturbances.

    Code of Federal Regulations, 2014 CFR

    2014-07-01

    ... 36 Parks, Forests, and Public Property 1 2014-07-01 2014-07-01 false Audio disturbances. 2.12... RESOURCE PROTECTION, PUBLIC USE AND RECREATION § 2.12 Audio disturbances. (a) The following are prohibited..., motorized toy, or an audio device, such as a radio, television set, tape deck or musical instrument, in a...

  12. 36 CFR 2.12 - Audio disturbances.

    Code of Federal Regulations, 2011 CFR

    2011-07-01

    ... 36 Parks, Forests, and Public Property 1 2011-07-01 2011-07-01 false Audio disturbances. 2.12... RESOURCE PROTECTION, PUBLIC USE AND RECREATION § 2.12 Audio disturbances. (a) The following are prohibited..., motorized toy, or an audio device, such as a radio, television set, tape deck or musical instrument, in a...

  13. Tensorial dynamic time warping with articulation index representation for efficient audio-template learning.

    PubMed

    Le, Long N; Jones, Douglas L

    2018-03-01

    Audio classification techniques often depend on the availability of a large labeled training dataset for successful performance. However, in many application domains of audio classification (e.g., wildlife monitoring), obtaining labeled data is still a costly and laborious process. Motivated by this observation, a technique is proposed to efficiently learn a clean template from a few labeled, but likely corrupted (by noise and interferences), data samples. This learning can be done efficiently via tensorial dynamic time warping on the articulation index-based time-frequency representations of audio data. The learned template can then be used in audio classification following the standard template-based approach. Experimental results show that the proposed approach outperforms both (1) the recurrent neural network approach and (2) the state-of-the-art in the template-based approach on a wildlife detection application with few training samples.
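The tensorial dynamic time warping used here generalizes scalar DTW to time-frequency representations; the scalar core is a short dynamic program. This sketch is the textbook recurrence, not the paper's tensorial variant:

```python
import numpy as np

def dtw(a, b):
    """Dynamic time warping distance between two 1-D sequences.

    D[i, j] holds the cheapest cumulative cost of aligning a[:i] with
    b[:j]; each cell extends the best of the three neighboring
    alignments (insertion, deletion, match).
    """
    n, m = len(a), len(b)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return D[n, m]

# identical content under time warping has distance 0
print(dtw([1, 2, 3], [1, 2, 2, 3]))   # 0.0
```

Template matching then labels a test clip by the template with the smallest warped distance, which is why a single clean template can substitute for a large training set.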

  14. Audio Frequency Analysis in Mobile Phones

    ERIC Educational Resources Information Center

    Aguilar, Horacio Munguía

    2016-01-01

    A new experiment using mobile phones is proposed in which its audio frequency response is analyzed using the audio port for inputting external signal and getting a measurable output. This experiment shows how the limited audio bandwidth used in mobile telephony is the main cause of the poor speech quality in this service. A brief discussion is…

  15. Quantum steganography and quantum error-correction

    NASA Astrophysics Data System (ADS)

    Shaw, Bilal A.

    corrects an arbitrary error on the receiver's half of the ebit as well. We prove that this code is the smallest code with a CSS structure that uses only one ebit and corrects an arbitrary single-qubit error on the sender's side. We discuss the advantages and disadvantages for each of the two codes. In the second half of this thesis we explore the yet uncharted and relatively undiscovered area of quantum steganography. Steganography is the process of hiding secret information by embedding it in an "innocent" message. We present protocols for hiding quantum information in a codeword of a quantum error-correcting code passing through a channel. Using either a shared classical secret key or shared entanglement Alice disguises her information as errors in the channel. Bob can retrieve the hidden information, but an eavesdropper (Eve) with the power to monitor the channel, but without the secret key, cannot distinguish the message from channel noise. We analyze how difficult it is for Eve to detect the presence of secret messages, and estimate rates of steganographic communication and secret key consumption for certain protocols. We also provide an example of how Alice hides quantum information in the perfect code when the underlying channel between Bob and her is the depolarizing channel. Using this scheme Alice can hide up to four stego-qubits.

  16. [Intermodal timing cues for audio-visual speech recognition].

    PubMed

    Hashimoto, Masahiro; Kumashiro, Masaharu

    2004-06-01

    The purpose of this study was to investigate the limitations of the lip-reading advantage for Japanese young adults by desynchronizing visual and auditory information in speech. In the experiment, audio-visual speech stimuli were presented under six test conditions: audio-alone, and audio-visual with either 0, 60, 120, 240 or 480 ms of audio delay. The stimuli were video recordings of the face of a female Japanese speaker producing long and short Japanese sentences. The intelligibility of the audio-visual stimuli was measured as a function of audio delay in sixteen untrained young subjects. Speech intelligibility under the audio-delay conditions of less than 120 ms was significantly better than that under the audio-alone condition. On the other hand, the delay of 120 ms corresponded to the mean mora duration measured for the audio stimuli. The results implied that audio delays of up to 120 ms would not disrupt the lip-reading advantage, because visual and auditory information in speech seem to be integrated on a syllabic time scale. Potential applications of this research include noisy workplaces in which a worker must extract relevant speech from all the other competing noises.

  17. Interactive MPEG-4 low-bit-rate speech/audio transmission over the Internet

    NASA Astrophysics Data System (ADS)

    Liu, Fang; Kim, JongWon; Kuo, C.-C. Jay

    1999-11-01

    The recently developed MPEG-4 technology enables the coding and transmission of natural and synthetic audio-visual data in the form of objects. In an effort to extend the object-based functionality of MPEG-4 to real-time Internet applications, architectural prototypes of the multiplex layer and transport layer tailored for transmission of MPEG-4 data over IP are under debate within the Internet Engineering Task Force (IETF) and the MPEG-4 Systems Ad Hoc group. In this paper, we present an architecture for an interactive MPEG-4 speech/audio transmission system over the Internet. It utilizes a framework of the Real Time Streaming Protocol (RTSP) over the Real-time Transport Protocol (RTP) to provide controlled, on-demand delivery of real-time speech/audio data. Based on a client-server model, a pair of low bit-rate bit streams (real-time speech/audio, pre-encoded speech/audio) are multiplexed and transmitted via a single RTP channel to the receiver. The MPEG-4 Scene Description (SD) and Object Descriptor (OD) bit streams are securely sent through the RTSP control channel. Upon reception, an initial MPEG-4 audio-visual scene is constructed after de-multiplexing, decoding of the bit streams, and scene composition. A receiver is allowed to manipulate the initial audio-visual scene presentation locally, or interactively arrange scene changes by sending requests to the server. A server may also choose to update the client with new streams and a list of contents for user selection.

  18. A microcomputer interface for a digital audio processor-based data recording system.

    PubMed

    Croxton, T L; Stump, S J; Armstrong, W M

    1987-10-01

    An inexpensive interface is described that performs direct transfer of digitized data from the digital audio processor and video cassette recorder based data acquisition system designed by Bezanilla (1985, Biophys. J., 47:437-441) to an IBM PC/XT microcomputer. The FORTRAN callable software that drives this interface is capable of controlling the video cassette recorder and starting data collection immediately after recognition of a segment of previously collected data. This permits piecewise analysis of long intervals of data that would otherwise exceed the memory capability of the microcomputer.

  19. A microcomputer interface for a digital audio processor-based data recording system.

    PubMed Central

    Croxton, T L; Stump, S J; Armstrong, W M

    1987-01-01

    An inexpensive interface is described that performs direct transfer of digitized data from the digital audio processor and video cassette recorder based data acquisition system designed by Bezanilla (1985, Biophys. J., 47:437-441) to an IBM PC/XT microcomputer. The FORTRAN callable software that drives this interface is capable of controlling the video cassette recorder and starting data collection immediately after recognition of a segment of previously collected data. This permits piecewise analysis of long intervals of data that would otherwise exceed the memory capability of the microcomputer. PMID:3676444

  20. Robust audio-visual speech recognition under noisy audio-video conditions.

    PubMed

    Stewart, Darryl; Seymour, Rowan; Pass, Adrian; Ming, Ji

    2014-02-01

    This paper presents the maximum weighted stream posterior (MWSP) model as a robust and efficient stream integration method for audio-visual speech recognition in environments where the audio or video streams may be subjected to unknown and time-varying corruption. A significant advantage of MWSP is that it does not require any specific measurements of the signal in either stream to calculate appropriate stream weights during recognition, and as such it is modality-independent. This also means that MWSP complements and can be used alongside many of the other approaches that have been proposed in the literature for this problem. For evaluation we used the large XM2VTS database for speaker-independent audio-visual speech recognition. The extensive tests include both clean and corrupted utterances, with corruption added to the video and/or audio streams using a variety of noise types (e.g., MPEG-4 video compression) and levels. The experiments show that this approach gives excellent performance in comparison to another well-known dynamic stream weighting approach, and also compared to any fixed-weight integration approach, both in clean conditions and when noise is added to either stream. Furthermore, our experiments show that the MWSP approach dynamically selects suitable integration weights on a frame-by-frame basis according to the level of noise in the streams, and according to the naturally fluctuating relative reliability of the modalities even in clean conditions. The MWSP approach is shown to maintain robust recognition performance in all tested conditions, while requiring no prior knowledge about the type or level of noise.
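
    The abstract gives no formulas, but the core per-frame idea of maximum weighted stream posterior selection can be sketched as follows (the weight grid and the log-likelihood inputs are illustrative assumptions, not the paper's actual model):

```python
def mwsp_score(audio_ll, video_ll, weights=(0.0, 0.25, 0.5, 0.75, 1.0)):
    """Per-frame combined score: try each candidate stream weight lam and
    keep the maximum of lam * audio + (1 - lam) * video (log-likelihoods)."""
    return max(lam * audio_ll + (1 - lam) * video_ll for lam in weights)

def score_utterance(frames):
    """Sum per-frame MWSP scores over an utterance.

    frames: list of (audio_log_likelihood, video_log_likelihood) pairs."""
    return sum(mwsp_score(a, v) for a, v in frames)
```

    When one stream is corrupted its log-likelihood drops, and the per-frame maximum automatically leans on the cleaner stream, which is why no explicit noise measurement is needed.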

  1. Digital Multicasting of Multiple Audio Streams

    NASA Technical Reports Server (NTRS)

    Macha, Mitchell; Bullock, John

    2007-01-01

    The Mission Control Center Voice Over Internet Protocol (MCC VOIP) system (see figure) comprises hardware and software that effect simultaneous, nearly real-time transmission of as many as 14 different audio streams to authorized listeners via the MCC intranet and/or the Internet. The original version of the MCC VOIP system was conceived to enable flight-support personnel located in offices outside a spacecraft mission control center to monitor audio loops within the mission control center. Different versions of the MCC VOIP system could be used for a variety of public and commercial purposes - for example, to enable members of the general public to monitor one or more NASA audio streams through their home computers, to enable air-traffic supervisors to monitor communication between airline pilots and air-traffic controllers in training, and to monitor conferences among brokers in a stock exchange. At the transmitting end, the audio-distribution process begins with feeding the audio signals to analog-to-digital converters. The resulting digital streams are sent through the MCC intranet, using a user datagram protocol (UDP), to a server that converts them to encrypted data packets. The encrypted data packets are then routed to the personal computers of authorized users by use of multicasting techniques. The total data-processing load on the portion of the system upstream of and including the encryption server is the total load imposed by all of the audio streams being encoded, regardless of the number of listeners or the number of streams being monitored concurrently by the listeners. The personal computer of a user authorized to listen is equipped with special-purpose MCC audio-player software. When the user launches the program, the user is prompted to provide identification and a password.
In one of two access-control provisions, the program is hard-coded to validate the user's identity and password against a list maintained on a domain-controller computer.
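
    The article does not include source code; a minimal sketch of the packet side of such a system — tagging each digitized audio chunk with a stream ID so that up to 14 streams can share one multicast channel — might look like this (the header layout and group address are invented for illustration):

```python
import socket
import struct

MCAST_GROUP, MCAST_PORT = "239.1.1.1", 5004   # hypothetical group/port

def frame_chunk(stream_id, seq, payload):
    """Prefix an audio chunk with a 4-byte header: stream ID, pad, sequence number."""
    return struct.pack("!BxH", stream_id, seq & 0xFFFF) + payload

def parse_chunk(packet):
    """Split a received packet back into (stream_id, seq, audio payload)."""
    stream_id, seq = struct.unpack("!BxH", packet[:4])
    return stream_id, seq, packet[4:]

def make_sender(ttl=1):
    """UDP socket configured for multicast; one send serves every listener."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_MULTICAST_TTL, ttl)
    return sock
```

    This illustrates the load property described above: the sender transmits each stream once, and the network (not the server) replicates packets toward however many listeners have joined the group.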

  2. News video story segmentation method using fusion of audio-visual features

    NASA Astrophysics Data System (ADS)

    Wen, Jun; Wu, Ling-da; Zeng, Pu; Luan, Xi-dao; Xie, Yu-xiang

    2007-11-01

    News story segmentation is an important aspect of news video analysis. This paper presents a method for news video story segmentation. Different from prior works, which are based on visual feature transforms, the proposed technique uses audio features as a baseline and fuses visual features with them to refine the results. First, it selects silence clips as audio-feature candidate points, and selects shot boundaries and anchor shots as two kinds of visual-feature candidate points. It then takes the audio candidates as cues and develops a fusion method that effectively uses the diverse types of visual candidates to refine the audio candidates into story boundaries. Experimental results show that the method has high efficiency and adaptability to different kinds of news video.

  3. Performance enhancement for audio-visual speaker identification using dynamic facial muscle model.

    PubMed

    Asadpour, Vahid; Towhidkhah, Farzad; Homayounpour, Mohammad Mehdi

    2006-10-01

    The science of human identification using physiological characteristics, or biometry, has been of great concern in security systems. However, robust multimodal identification systems based on audio-visual information have not yet been thoroughly investigated. The aim of this work is therefore to propose a model-based feature extraction method that employs the physiological characteristics of the facial muscles producing lip movements. This approach adopts intrinsic muscle properties such as viscosity, elasticity, and mass, which are extracted from the dynamic lip model. These parameters depend exclusively on the neuro-muscular properties of the speaker; consequently, imitation of valid speakers could be reduced to a large extent. These parameters are applied to a hidden Markov model (HMM) audio-visual identification system. In this work, a combination of audio and video features has been employed by adopting a multistream pseudo-synchronized HMM training method. Noise-robust audio features such as Mel-frequency cepstral coefficients (MFCC), spectral subtraction (SS), and relative spectra perceptual linear prediction (J-RASTA-PLP) were used to evaluate the performance of the multimodal system once efficient audio feature extraction methods had been utilized. The superior performance of the proposed system is demonstrated on a large multispeaker database of continuously spoken digits, along with a phonetically rich sentence. To evaluate the robustness of the algorithms, some experiments were performed on genetically identical twins. Furthermore, changes in speaker voice were simulated with drug inhalation tests. At a 3 dB signal-to-noise ratio (SNR), the dynamic muscle model improved the identification rate of the audio-visual system from 91 to 98%. 
Results on identical twins revealed that there was an apparent improvement on the performance for the dynamic muscle model-based system, in which the identification rate of the audio-visual system was enhanced from 87

  4. Real World Audio

    NASA Technical Reports Server (NTRS)

    1998-01-01

    Crystal River Engineering was originally featured in Spinoff 1992 with the Convolvotron, a high-speed digital audio processing system that delivers three-dimensional sound over headphones. The Convolvotron was developed for Ames' research on virtual acoustic displays. Crystal River is now a subsidiary of Aureal Semiconductor, Inc., and together they develop and market the technology, a 3-D (three-dimensional) audio technology known commercially today as Aureal 3D (A-3D). The technology has been incorporated into video games, surround sound systems, and sound cards.

  5. Semantic Context Detection Using Audio Event Fusion

    NASA Astrophysics Data System (ADS)

    Chu, Wei-Ta; Cheng, Wen-Huang; Wu, Ja-Ling

    2006-12-01

    Semantic-level content analysis is a crucial issue in achieving efficient content retrieval and management. We propose a hierarchical approach that models audio events over a time series in order to accomplish semantic context detection. Two levels of modeling, audio event and semantic context modeling, are devised to bridge the gap between physical audio features and semantic concepts. In this work, hidden Markov models (HMMs) are used to model four representative audio events, that is, gunshot, explosion, engine, and car braking, in action movies. At the semantic context level, generative (ergodic hidden Markov model) and discriminative (support vector machine (SVM)) approaches are investigated to fuse the characteristics and correlations among audio events, which provide cues for detecting gunplay and car-chasing scenes. The experimental results demonstrate the effectiveness of the proposed approaches and provide a preliminary framework for information mining by using audio characteristics.

  6. Animation, audio, and spatial ability: Optimizing multimedia for scientific explanations

    NASA Astrophysics Data System (ADS)

    Koroghlanian, Carol May

    This study investigated the effects of audio, animation, and spatial ability in a computer-based instructional program for biology. The program presented instructional material via text or audio with lean text, and included eight instructional sequences presented via either static illustrations or animations. High school students enrolled in a biology course were blocked by spatial ability and randomly assigned to one of four treatments (Text-Static Illustration, Audio-Static Illustration, Text-Animation, Audio-Animation). The study examined the effects of instructional mode (Text vs. Audio), illustration mode (Static Illustration vs. Animation), and spatial ability (Low vs. High) on practice and posttest achievement, attitude, and time. Results for practice achievement indicated that high spatial ability participants achieved more than low spatial ability participants; similar results for posttest achievement and spatial ability were not found. Participants in the Static Illustration treatments achieved the same as participants in the Animation treatments on both the practice and the posttest, and likewise participants in the Text treatments achieved the same as participants in the Audio treatments. In terms of attitude, participants responded favorably to the computer-based instructional program: they found the program interesting, felt the static illustrations or animations made the explanations easier to understand, and concentrated on learning the material. Furthermore, participants in the Animation treatments felt the information was easier to understand than did participants in the Static Illustration treatments; however, no difference on any attitude item was found between the Text and Audio treatments. Significant differences were found by spatial ability for three attitude items concerning concentration and interest. In all three items, the low spatial ability participants responded more positively

  7. Detecting double compression of audio signal

    NASA Astrophysics Data System (ADS)

    Yang, Rui; Shi, Yun Q.; Huang, Jiwu

    2010-01-01

    MP3 is the most popular audio format in daily life today; music downloaded from the Internet and files saved in digital recorders, for example, are often in MP3 format. However, low-bitrate MP3s are often transcoded to high bitrate, since high-bitrate files are of higher commercial value. Audio recordings made on digital recorders can also be doctored easily with pervasive audio-editing software. This paper presents two methods for the detection of double MP3 compression, which are essential for identifying fake-quality MP3s and for audio forensics. The proposed methods use support vector machine classifiers with feature vectors formed from the distributions of the first digits of the quantized MDCT (modified discrete cosine transform) coefficients. Extensive experiments demonstrate the effectiveness of the proposed methods. To the best of our knowledge, this work is the first to detect double compression of audio signals.
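
    The abstract does not spell out the feature computation; the following sketch shows the general idea of a first-significant-digit feature vector over a set of coefficients — the kind of Benford's-law statistic such detectors feed to an SVM (the function names are mine, not the paper's):

```python
import math

def first_digit(x):
    """First significant digit of a nonzero number, e.g. 0.045 -> 4."""
    x = abs(x)
    return int(x / 10 ** math.floor(math.log10(x)))

def first_digit_distribution(coeffs):
    """9-bin feature vector: relative frequency of first digits 1..9
    among the nonzero coefficients."""
    digits = [first_digit(c) for c in coeffs if c != 0]
    return [digits.count(d) / len(digits) for d in range(1, 10)]
```

    Coefficients of singly compressed audio tend to follow the logarithmic Benford distribution, log10(1 + 1/d); a second compression perturbs that shape, and the classifier keys on the deviation.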

  8. Nonlinear dynamic macromodeling techniques for audio systems

    NASA Astrophysics Data System (ADS)

    Ogrodzki, Jan; Bieńkowski, Piotr

    2015-09-01

    This paper develops a modelling method and a model identification technique for nonlinear dynamic audio systems. Identification is performed by means of a behavioral approach based on polynomial approximation, making use of the Discrete Fourier Transform and the Harmonic Balance Method. A model of an audio system is first created and identified, and then simulated in real time using an algorithm of low computational complexity. The algorithm consists of real-time emulation of the system response rather than simulation of the system itself. The proposed software is written in Python using object-oriented programming techniques, and the code is optimized for a multithreaded environment.

  9. The Audio Description as a Physics Teaching Tool

    ERIC Educational Resources Information Center

    Cozendey, Sabrina; Costa, Maria da Piedade

    2016-01-01

    This study analyses the use of audio description in teaching physics concepts, aiming to determine the variables that influence understanding of the concepts. One educational resource was audio described; to make the audio description, the screen was frozen. The video, with and without audio description, was to be presented to students, so that…

  10. Comparing Audio and Video Data for Rating Communication

    PubMed Central

    Williams, Kristine; Herman, Ruth; Bontempo, Daniel

    2013-01-01

    Video recording has become increasingly popular in nursing research, adding rich nonverbal, contextual, and behavioral information. However, the benefits of video over audio data have not been well established. We compared communication ratings of audio versus video data using the Emotional Tone Rating Scale. Twenty raters watched video clips of nursing care and rated staff communication on 12 descriptors that reflect dimensions of person-centered and controlling communication. Another group rated audio-only versions of the same clips. Interrater consistency was high within each group, with ICC (2,1) for audio = .91 and video = .94. Interrater consistency for both groups combined was also high, with ICC (2,1) for audio and video = .95. Communication ratings using audio and video data were highly correlated. Whether video is genuinely superior to audio-recorded data should be weighed when designing studies evaluating nursing care. PMID:23579475

  11. Digital Audio/Video for Computer- and Web-Based Instruction for Training Rural Special Education Personnel.

    ERIC Educational Resources Information Center

    Ludlow, Barbara L.; Foshay, John B.; Duff, Michael C.

    Video presentations of teaching episodes in home, school, and community settings and audio recordings of parents' and professionals' views can be important adjuncts to personnel preparation in special education. This paper describes instructional applications of digital media and outlines steps in producing audio and video segments. Digital audio…

  12. Transfer Learning for Improved Audio-Based Human Activity Recognition.

    PubMed

    Ntalampiras, Stavros; Potamitis, Ilyas

    2018-06-25

    Human activities are accompanied by characteristic sound events, the processing of which might provide valuable information for automated human activity recognition. This paper presents a novel approach addressing the case where one or more human activities are associated with limited audio data, resulting in a potentially highly imbalanced dataset. Data augmentation is based on transfer learning; more specifically, the proposed method: (a) identifies the classes which are statistically close to the ones associated with limited data; (b) learns a multiple input, multiple output transformation; and (c) transforms the data of the closest classes so that it can be used for modeling the ones associated with limited data. Furthermore, the proposed framework includes a feature set extracted out of signal representations of diverse domains, i.e., temporal, spectral, and wavelet. Extensive experiments demonstrate the relevance of the proposed data augmentation approach under a variety of generative recognition schemes.
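
    As a toy illustration of steps (a) and (c) above — the real method learns a multiple-input multiple-output transformation, which is simplified here to a mean shift on scalar features, an assumption of this sketch rather than the paper's procedure:

```python
import statistics

def closest_class(target_mean, donor_means):
    """(a) Pick the donor class whose feature mean is nearest the target's."""
    return min(donor_means, key=lambda name: abs(donor_means[name] - target_mean))

def augment(target_samples, donor_samples):
    """(c) Shift donor data so its mean matches the data-poor target class
    (a crude scalar stand-in for the learned MIMO transformation)."""
    shift = statistics.mean(target_samples) - statistics.mean(donor_samples)
    return target_samples + [x + shift for x in donor_samples]
```

    The transformed donor samples then pad out the under-represented class before the recognizer is trained, which is the imbalance fix the abstract describes.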

  13. Audio-Tutorial Instruction in Medicine.

    ERIC Educational Resources Information Center

    Boyle, Gloria J.; Herrick, Merlyn C.

    This progress report concerns an audio-tutorial approach used at the University of Missouri-Columbia School of Medicine. Instructional techniques such as slide-tape presentations, compressed speech audio tapes, computer-assisted instruction (CAI), motion pictures, television, microfiche, and graphic and printed materials have been implemented,…

  14. Method for reading sensors and controlling actuators using audio interfaces of mobile devices.

    PubMed

    Aroca, Rafael V; Burlamaqui, Aquiles F; Gonçalves, Luiz M G

    2012-01-01

    This article presents a novel closed loop control architecture based on audio channels of several types of computing devices, such as mobile phones and tablet computers, but not restricted to them. The communication is based on an audio interface that relies on the exchange of audio tones, allowing sensors to be read and actuators to be controlled. As an application example, the presented technique is used to build a low cost mobile robot, but the system can also be used in a variety of mechatronics applications and sensor networks, where smartphones are the basic building blocks.
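
    The paper's own tone protocol is not reproduced in the abstract; the sketch below shows the general principle — encoding a sensor reading as a tone frequency and decoding it back from the audio samples — with the frequency plan (1 kHz base, 50 Hz per step) chosen arbitrarily for illustration:

```python
import math

RATE = 44100  # assumed sample rate of the device's audio interface

def encode_value(value, duration=0.1, f0=1000.0, step=50.0):
    """Map a small integer sensor reading to a tone at f0 + value * step Hz."""
    f = f0 + value * step
    n = int(RATE * duration)
    return [math.sin(2 * math.pi * f * i / RATE) for i in range(n)]

def decode_value(samples, duration=0.1, f0=1000.0, step=50.0):
    """Estimate the tone frequency from rising zero crossings, then invert
    the mapping to recover the sensor reading."""
    crossings = sum(1 for a, b in zip(samples, samples[1:]) if a < 0 <= b)
    f = crossings / duration
    return round((f - f0) / step)
```

    The same channel works in reverse for actuators: the phone emits a tone and a simple tone-decoder circuit on the robot switches an output, which is what lets audio jacks stand in for a serial port.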

  15. Method for Reading Sensors and Controlling Actuators Using Audio Interfaces of Mobile Devices

    PubMed Central

    Aroca, Rafael V.; Burlamaqui, Aquiles F.; Gonçalves, Luiz M. G.

    2012-01-01

    This article presents a novel closed loop control architecture based on audio channels of several types of computing devices, such as mobile phones and tablet computers, but not restricted to them. The communication is based on an audio interface that relies on the exchange of audio tones, allowing sensors to be read and actuators to be controlled. As an application example, the presented technique is used to build a low cost mobile robot, but the system can also be used in a variety of mechatronics applications and sensor networks, where smartphones are the basic building blocks. PMID:22438726

  16. Diagnostic accuracy of sleep bruxism scoring in absence of audio-video recording: a pilot study.

    PubMed

    Carra, Maria Clotilde; Huynh, Nelly; Lavigne, Gilles J

    2015-03-01

    Based on the most recent polysomnographic (PSG) research diagnostic criteria, sleep bruxism is diagnosed when >2 rhythmic masticatory muscle activity (RMMA) episodes/h of sleep are scored on the masseter and/or temporalis muscles. These criteria have not yet been validated for portable PSG systems. This pilot study aimed to assess the diagnostic accuracy of scoring sleep bruxism in the absence of audio-video recordings. Ten subjects (mean age 24.7 ± 2.2) with a clinical diagnosis of sleep bruxism spent one night in the sleep laboratory. PSG was performed with a portable system (type 2) while audio-video was recorded. Sleep studies were scored by the same examiner three times: (1) without, (2) with, and (3) without audio-video, in order to test the intra-scoring and intra-examiner reliability of RMMA scoring. The RMMA event-by-event concordance rate between scoring without audio-video and with audio-video was 68.3 %. Overall, the RMMA index was overestimated by 23.8 % without audio-video. However, the intra-class correlation coefficient (ICC) between scorings with and without audio-video was good (ICC = 0.91; p < 0.001), and the intra-examiner reliability was high (ICC = 0.97; p < 0.001). The clinical diagnosis of sleep bruxism was confirmed in 8/10 subjects based on scoring without audio-video and in 6/10 subjects with audio-video. Despite the absence of audio-video recording, the diagnostic accuracy of assessing RMMA with portable PSG systems appeared to remain good, supporting their use for both research and clinical purposes. However, the risk of moderate overestimation in the absence of audio-video must be taken into account.

  17. Comparing audio and video data for rating communication.

    PubMed

    Williams, Kristine; Herman, Ruth; Bontempo, Daniel

    2013-09-01

    Video recording has become increasingly popular in nursing research, adding rich nonverbal, contextual, and behavioral information. However, the benefits of video over audio data have not been well established. We compared communication ratings of audio versus video data using the Emotional Tone Rating Scale. Twenty raters watched video clips of nursing care and rated staff communication on 12 descriptors that reflect dimensions of person-centered and controlling communication. Another group rated audio-only versions of the same clips. Interrater consistency was high within each group, with intraclass correlation coefficient (ICC) (2,1) for audio = .91 and video = .94. Interrater consistency for both groups combined was also high, with ICC (2,1) for audio and video = .95. Communication ratings using audio and video data were highly correlated. Whether video is genuinely superior to audio-recorded data should be weighed when designing studies evaluating nursing care.

  18. Inexpensive Audio Activities: Earbud-based Sound Experiments

    NASA Astrophysics Data System (ADS)

    Allen, Joshua; Boucher, Alex; Meggison, Dean; Hruby, Kate; Vesenka, James

    2016-11-01

    Inexpensive alternatives to a number of classic introductory physics sound laboratories are presented including interference phenomena, resonance conditions, and frequency shifts. These can be created using earbuds, economical supplies such as Giant Pixie Stix® wrappers, and free software available for PCs and mobile devices. We describe two interference laboratories (beat frequency and two-speaker interference) and two resonance laboratories (quarter- and half-wavelength). Lastly, a Doppler laboratory using rotating earbuds is explained. The audio signal captured by all experiments is analyzed on free spectral analysis software and many of the experiments incorporate the unifying theme of measuring the speed of sound in air.

  19. High capacity reversible watermarking for audio by histogram shifting and predicted error expansion.

    PubMed

    Wang, Fei; Xie, Zhaoxin; Chen, Zuo

    2014-01-01

    Being reversible, the watermarking information embedded in audio signals can be extracted while the original audio data achieve lossless recovery. Current reversible audio watermarking algorithms are confronted with the following problems: relatively low SNR (signal-to-noise ratio) of the embedded audio; a large amount of auxiliary embedded location information; and the absence of accurate capacity control. In this paper, we present a novel reversible audio watermarking scheme based on improved prediction error expansion and histogram shifting. First, we use a differential evolution algorithm to optimize the prediction coefficients and then apply prediction error expansion to output the stego data. Second, in order to reduce the length of the location map, we introduce a histogram shifting scheme. Moreover, the prediction-error modification threshold for a given embedding capacity can be computed by the proposed scheme. Experiments show that this algorithm improves the SNR of embedded audio signals and the embedding capacity, drastically reduces the location map length, and enhances capacity control capability.
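
    The paper's optimized predictor is beyond this summary, but the prediction-error-expansion-plus-shifting mechanics it refers to can be sketched on integer samples. This toy version uses a fixed left-neighbor predictor (not the paper's differential-evolution-optimized one), embedding into odd-indexed samples and leaving even-indexed ones intact as predictors:

```python
def pee_embed(samples, bits, T=4):
    """Embed bits reversibly: expand small prediction errors, shift large ones."""
    s, payload = list(samples), list(bits)
    for i in range(1, len(s), 2):
        p = s[i - 1]                       # predictor: untouched left neighbor
        e = s[i] - p
        if -T <= e < T:                    # expandable error: carries one bit
            b = payload.pop(0) if payload else 0
            s[i] = p + 2 * e + b
        else:                              # histogram shift keeps regions disjoint
            s[i] = p + e + (T if e >= T else -T)
    return s

def pee_extract(stego, T=4):
    """Recover the bits and restore the original samples exactly."""
    s, bits = list(stego), []
    for i in range(1, len(s), 2):
        p = s[i - 1]
        e = s[i] - p
        if -2 * T <= e < 2 * T:            # expanded region: read LSB, halve
            bits.append(e & 1)
            s[i] = p + (e >> 1)
        else:                              # undo the shift
            s[i] = p + e - (T if e >= 2 * T else -T)
    return s, bits
```

    Errors in [-T, T) map into [-2T, 2T) after expansion, while shifted errors land outside that band, so the decoder separates the two cases without a location map in this simplified setting.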

  20. 36 CFR § 1002.12 - Audio disturbances.

    Code of Federal Regulations, 2013 CFR

    2013-07-01

    ... 36 Parks, Forests, and Public Property 3 2013-07-01 2012-07-01 true Audio disturbances. § 1002.12... RECREATION § 1002.12 Audio disturbances. (a) The following are prohibited: (1) Operating motorized equipment or machinery such as an electric generating plant, motor vehicle, motorized toy, or an audio device...

  1. Laboratory and in-flight experiments to evaluate 3-D audio display technology

    NASA Technical Reports Server (NTRS)

    Ericson, Mark; Mckinley, Richard; Kibbe, Marion; Francis, Daniel

    1994-01-01

    Laboratory and in-flight experiments were conducted to evaluate 3-D audio display technology for cockpit applications. A 3-D audio display generator was developed which digitally encodes naturally occurring direction information onto any audio signal and presents the binaural sound over headphones. The acoustic image is stabilized for head movement by use of an electromagnetic head-tracking device. In the laboratory, a 3-D audio display generator was used to spatially separate competing speech messages to improve the intelligibility of each message. Up to a 25 percent improvement in intelligibility was measured for spatially separated speech at high ambient noise levels (115 dB SPL). During the in-flight experiments, pilots reported that spatial separation of speech communications provided a noticeable improvement in intelligibility. The use of 3-D audio for target acquisition was also investigated. In the laboratory, 3-D audio enabled the acquisition of visual targets in about two seconds average response time at 17 degrees accuracy. During the in-flight experiments, pilots correctly identified ground targets 50, 75, and 100 percent of the time at separation angles of 12, 20, and 35 degrees, respectively. In general, pilot performance in the field with the 3-D audio display generator was as expected, based on data from laboratory experiments.

  2. Using High-Dimensional Image Models to Perform Highly Undetectable Steganography

    NASA Astrophysics Data System (ADS)

    Pevný, Tomáš; Filler, Tomáš; Bas, Patrick

    This paper presents a complete methodology for designing practical and highly undetectable stegosystems for real digital media. The main design principle is to minimize a suitably defined distortion by means of an efficient coding algorithm. The distortion is defined as a weighted difference of extended state-of-the-art feature vectors already used in steganalysis. This allows us to "preserve" the model used by the steganalyst and thus remain undetectable even for large payloads. The framework can be efficiently implemented even when the dimensionality of the feature set used by the embedder is larger than 10^7. The high-dimensional model is necessary to avoid known security weaknesses; although high-dimensional models can be a problem in steganalysis, we explain why they are acceptable in steganography. As an example, we introduce HUGO, a new embedding algorithm for spatial-domain digital images, and contrast its performance with LSB matching. On the BOWS2 image database, and in contrast with LSB matching, HUGO allows the embedder to hide a message 7× longer at the same level of security.
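
    HUGO itself is too involved to reproduce from this summary, but the LSB matching baseline it is contrasted with is simple: when a cover element's least significant bit disagrees with the message bit, randomly add or subtract 1. A sketch (the 8-bit pixel range and seed are illustrative assumptions):

```python
import random

def lsb_match_embed(pixels, bits, seed=1):
    """±1 embedding into 8-bit samples; extraction reads plain LSBs."""
    rng = random.Random(seed)
    out = list(pixels)
    for i, b in enumerate(bits):
        if out[i] & 1 != b:
            delta = rng.choice((-1, 1))
            if out[i] == 0:                # clamp to stay within [0, 255]
                delta = 1
            elif out[i] == 255:
                delta = -1
            out[i] += delta
    return out

def lsb_extract(pixels, n):
    """Read the message back: the first n least significant bits."""
    return [p & 1 for p in pixels[:n]]
```

    Unlike plain LSB replacement, ±1 embedding avoids the pairs-of-values artifact; HUGO goes further by choosing which ±1 changes to make so as to minimize a feature-space distortion.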

  3. 43 CFR 8365.2-2 - Audio devices.

    Code of Federal Regulations, 2013 CFR

    2013-10-01

    ... 43 Public Lands: Interior 2 2013-10-01 2013-10-01 false Audio devices. 8365.2-2 Section 8365.2-2..., DEPARTMENT OF THE INTERIOR RECREATION PROGRAMS VISITOR SERVICES Rules of Conduct § 8365.2-2 Audio devices. On... audio device such as a radio, television, musical instrument, or other noise producing device or...

  4. 43 CFR 8365.2-2 - Audio devices.

    Code of Federal Regulations, 2012 CFR

    2012-10-01

    ... 43 Public Lands: Interior 2 2012-10-01 2012-10-01 false Audio devices. 8365.2-2 Section 8365.2-2..., DEPARTMENT OF THE INTERIOR RECREATION PROGRAMS VISITOR SERVICES Rules of Conduct § 8365.2-2 Audio devices. On... audio device such as a radio, television, musical instrument, or other noise producing device or...

  5. 43 CFR 8365.2-2 - Audio devices.

    Code of Federal Regulations, 2011 CFR

    2011-10-01

    ... 43 Public Lands: Interior 2 2011-10-01 2011-10-01 false Audio devices. 8365.2-2 Section 8365.2-2..., DEPARTMENT OF THE INTERIOR RECREATION PROGRAMS VISITOR SERVICES Rules of Conduct § 8365.2-2 Audio devices. On... audio device such as a radio, television, musical instrument, or other noise producing device or...

  6. 43 CFR 8365.2-2 - Audio devices.

    Code of Federal Regulations, 2014 CFR

    2014-10-01

    ... 43 Public Lands: Interior 2 2014-10-01 2014-10-01 false Audio devices. 8365.2-2 Section 8365.2-2..., DEPARTMENT OF THE INTERIOR RECREATION PROGRAMS VISITOR SERVICES Rules of Conduct § 8365.2-2 Audio devices. On... audio device such as a radio, television, musical instrument, or other noise producing device or...

  7. Design and implementation of an audio indicator

    NASA Astrophysics Data System (ADS)

    Zheng, Shiyong; Li, Zhao; Li, Biqing

    2017-04-01

    This paper proposes an audio level indicator built around a C9014 transistor amplifier stage, an operational-amplifier LED level display, and a CD4017 decimal counter/distributor, which can drive neon and holiday lighting in time with an audio signal. The input audio signal is amplified by the C9014 power-amplifier stage; an adjustment potentiometer sets the amplified signal voltage fed to the CD4017, which counts and distributes the drive signal, and the connected LEDs display the running state of the circuit. Using only a single counter IC, the indicator produces a two-color LED chase effect that follows the audio signal, so the display gives a general picture of the variation in the audio signal's level and frequency. The lights can jump, fade, flash, or stay lit, making the circuit suitable for homes, hotels, discos, theaters, advertising, and many other uses.

  8. Capacity-optimized mp2 audio watermarking

    NASA Astrophysics Data System (ADS)

    Steinebach, Martin; Dittmann, Jana

    2003-06-01

    Today a number of audio watermarking algorithms have been proposed, some of them at a quality making them suitable for commercial applications. The focus of most of these algorithms is copyright protection; therefore, transparency and robustness are the most discussed and optimised parameters. But other applications for audio watermarking can also be identified, stressing other parameters like complexity or payload. In this paper, we introduce a new mp2 audio watermarking algorithm optimised for high payload. Our algorithm uses the scale factors of an mp2 file for watermark embedding. They are grouped and masked based on a pseudo-random pattern generated from a secret key, and in each group we embed one bit. Depending on the bit to embed, we change scale factors by adding 1 where necessary until the group contains either more even or more odd scale factors: a group with an odd majority carries a 1, a group with an even majority a 0. The same rule is later applied to detect the watermark. The group size can be increased or decreased for a transparency/payload trade-off. We embed 160 bits or more per second in an mp2 file without reducing the perceived quality. As an application example, we introduce a prototype Karaoke system displaying song lyrics embedded as a watermark.
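
    Interpreting that description literally, the embedding rule can be sketched as follows (a seeded shuffle stands in for the paper's keyed pseudo-random grouping and masking, which it does not specify in detail):

```python
import random

def group_indices(n_factors, group_size, key):
    """Keyed pseudo-random grouping of scale-factor indices."""
    idx = list(range(n_factors))
    random.Random(key).shuffle(idx)
    return [idx[i:i + group_size]
            for i in range(0, n_factors - group_size + 1, group_size)]

def embed_bit(scale_factors, group, bit):
    """Add 1 to scale factors until the group's parity majority encodes the
    bit (odd majority -> 1, even majority -> 0)."""
    while True:
        odd = sum(scale_factors[i] & 1 for i in group)
        even = len(group) - odd
        if (odd > even) if bit else (even > odd):
            return
        wrong = 0 if bit else 1            # parity currently over-represented
        i = next(i for i in group if scale_factors[i] & 1 == wrong)
        scale_factors[i] += 1              # flips that factor's parity

def detect_bit(scale_factors, group):
    """Detection applies the same majority rule."""
    odd = sum(scale_factors[i] & 1 for i in group)
    return 1 if odd > len(group) - odd else 0
```

    Larger groups mean fewer +1 changes per embedded bit on average (better transparency) but fewer bits per file, which is the trade-off described above.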

  9. Music Identification System Using MPEG-7 Audio Signature Descriptors

    PubMed Central

    You, Shingchern D.; Chen, Wei-Hwa; Chen, Woei-Kae

    2013-01-01

    This paper describes a multiresolution system based on MPEG-7 audio signature descriptors for music identification. Such an identification system may be used to detect illegally copied music circulated over the Internet. In the proposed system, low-resolution descriptors are used to search likely candidates, and then full-resolution descriptors are used to identify the unknown (query) audio. With this arrangement, the proposed system achieves both high speed and high accuracy. To deal with the problem that a piece of query audio may not be inside the system's database, we suggest two different methods to find the decision threshold. Simulation results show that the proposed method II can achieve an accuracy of 99.4% for query inputs both inside and outside the database. Overall, it is highly possible to use the proposed system for copyright control. PMID:23533359
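
    The coarse-to-fine matching idea (low-resolution descriptors to shortlist candidates, full-resolution descriptors to confirm, a threshold to reject out-of-database queries) can be sketched as below; the average-pooled coarse descriptor, `top_k`, and the rejection threshold are illustrative assumptions rather than the MPEG-7 specification:

```python
import numpy as np

def coarse(fp, factor=8):
    """Low-resolution descriptor: average pooling of the full fingerprint."""
    return fp.reshape(-1, factor).mean(axis=1)

def identify(query, database, top_k=3, threshold=0.1):
    """Two-stage match: rank all entries by coarse-descriptor distance,
    verify the best few with full descriptors, and reject the query
    (return None) if even the best full-resolution distance is too large."""
    qc = coarse(query)
    ranked = sorted(database,
                    key=lambda item: np.abs(coarse(item[1]) - qc).mean())
    best_name, best_d = None, np.inf
    for name, fp in ranked[:top_k]:
        d = np.abs(fp - query).mean()  # full-resolution comparison
        if d < best_d:
            best_name, best_d = name, d
    return best_name if best_d <= threshold else None
```

    The coarse pass keeps the search fast, while the threshold implements the in/out-of-database decision the paper studies.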

  10. The Lowdown on Audio Downloads

    ERIC Educational Resources Information Center

    Farrell, Beth

    2010-01-01

    First offered to public libraries in 2004, downloadable audiobooks have grown by leaps and bounds. According to the Audio Publishers Association, their sales today account for 21% of the spoken-word audio market. It hasn't been easy, however. WMA. DRM. MP3. AAC. File extensions small on letters but very big on consequences for librarians,…

  11. Reconstruction of audio waveforms from spike trains of artificial cochlea models

    PubMed Central

    Zai, Anja T.; Bhargava, Saurabh; Mesgarani, Nima; Liu, Shih-Chii

    2015-01-01

    Spiking cochlea models describe the analog processing and spike generation process within the biological cochlea. Reconstructing the audio input from the artificial cochlea spikes is therefore useful for understanding the fidelity of the information preserved in the spikes. The reconstruction process is challenging particularly for spikes from the mixed signal (analog/digital) integrated circuit (IC) cochleas because of multiple non-linearities in the model and the additional variance caused by random transistor mismatch. This work proposes an offline method for reconstructing the audio input from spike responses of both a particular spike-based hardware model called the AEREAR2 cochlea and an equivalent software cochlea model. This method was previously used to reconstruct the auditory stimulus based on the peri-stimulus histogram of spike responses recorded in the ferret auditory cortex. The reconstructed audio from the hardware cochlea is evaluated against an analogous software model using objective measures of speech quality and intelligibility; and further tested in a word recognition task. The reconstructed audio under low signal-to-noise (SNR) conditions (SNR < –5 dB) gives a better classification performance than the original SNR input in this word recognition task. PMID:26528113

  12. 47 CFR 73.403 - Digital audio broadcasting service requirements.

    Code of Federal Regulations, 2012 CFR

    2012-10-01

    ... 47 Telecommunication 4 2012-10-01 2012-10-01 false Digital audio broadcasting service requirements... SERVICES RADIO BROADCAST SERVICES Digital Audio Broadcasting § 73.403 Digital audio broadcasting service requirements. (a) Broadcast radio stations using IBOC must transmit at least one over-the-air digital audio...

  13. 47 CFR 73.403 - Digital audio broadcasting service requirements.

    Code of Federal Regulations, 2011 CFR

    2011-10-01

    ... 47 Telecommunication 4 2011-10-01 2011-10-01 false Digital audio broadcasting service requirements... SERVICES RADIO BROADCAST SERVICES Digital Audio Broadcasting § 73.403 Digital audio broadcasting service requirements. (a) Broadcast radio stations using IBOC must transmit at least one over-the-air digital audio...

  14. 47 CFR 73.403 - Digital audio broadcasting service requirements.

    Code of Federal Regulations, 2014 CFR

    2014-10-01

    ... 47 Telecommunication 4 2014-10-01 2014-10-01 false Digital audio broadcasting service requirements... SERVICES RADIO BROADCAST SERVICES Digital Audio Broadcasting § 73.403 Digital audio broadcasting service requirements. (a) Broadcast radio stations using IBOC must transmit at least one over-the-air digital audio...

  15. 47 CFR 73.403 - Digital audio broadcasting service requirements.

    Code of Federal Regulations, 2013 CFR

    2013-10-01

    ... 47 Telecommunication 4 2013-10-01 2013-10-01 false Digital audio broadcasting service requirements... SERVICES RADIO BROADCAST SERVICES Digital Audio Broadcasting § 73.403 Digital audio broadcasting service requirements. (a) Broadcast radio stations using IBOC must transmit at least one over-the-air digital audio...

  16. Mining knowledge in noisy audio data

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Czyzewski, A.

    1996-12-31

    This paper demonstrates a KDD method applied to audio data analysis; in particular, it presents the possibilities that result from replacing traditional methods of analysis and acoustic signal processing with KDD algorithms when restoring audio recordings affected by strong noise.

  17. Perceptually controlled doping for audio source separation

    NASA Astrophysics Data System (ADS)

    Mahé, Gaël; Nadalin, Everton Z.; Suyama, Ricardo; Romano, João MT

    2014-12-01

    The separation of an underdetermined audio mixture can be performed through sparse component analysis (SCA), which relies, however, on the strong hypothesis that the source signals are sparse in some domain. To overcome this difficulty in the case where the original sources are available before the mixing process, informed source separation (ISS) embeds a watermark in the mixture, whose information can help a subsequent separation. Though powerful, this technique is generally specific to a particular mixing setup and may be compromised by an additional bitrate compression stage. Thus, instead of watermarking, we propose a 'doping' method that makes the time-frequency representation of each source sparser while preserving its audio quality. This method is based on an iterative decrease of the distance between the distribution of the signal and a target sparse distribution, under a perceptual constraint. We aim to show that the proposed approach is robust to audio coding and that the use of the sparsified signals improves the source separation, in comparison with the original sources. In this work, the analysis is restricted to instantaneous mixtures and focused on voice sources.

  18. Comparing the Effects of Classroom Audio-Recording and Video-Recording on Preservice Teachers' Reflection of Practice

    ERIC Educational Resources Information Center

    Bergman, Daniel

    2015-01-01

    This study examined the effects of audio and video self-recording on preservice teachers' written reflections. Participants (n = 201) came from a secondary teaching methods course and its school-based (clinical) fieldwork. The audio group (n[subscript A] = 106) used audio recorders to monitor their teaching in fieldwork placements; the video group…

  19. Spatial Audio on the Web: Or Why Can't I hear Anything Over There?

    NASA Technical Reports Server (NTRS)

    Wenzel, Elizabeth M.; Schlickenmaier, Herbert (Technical Monitor); Johnson, Gerald (Technical Monitor); Frey, Mary Anne (Technical Monitor); Schneider, Victor S. (Technical Monitor); Ahunada, Albert J. (Technical Monitor)

    1997-01-01

    Auditory complexity, freedom of movement, and interactivity are not always possible in a "true" virtual environment, much less in web-based audio. However, many of the perceptual and engineering constraints (and frustrations) that researchers, engineers, and listeners have experienced in virtual audio are relevant to spatial audio on the web. My talk will discuss some of these engineering constraints and their perceptual consequences, and attempt to relate these issues to implementation on the web.

  20. Video-assisted segmentation of speech and audio track

    NASA Astrophysics Data System (ADS)

    Pandit, Medha; Yusoff, Yusseri; Kittler, Josef; Christmas, William J.; Chilton, E. H. S.

    1999-08-01

    Video database research is commonly concerned with the storage and retrieval of visual information, involving sequence segmentation, shot representation, and video clip retrieval. In multimedia applications, video sequences are usually accompanied by a sound track. The sound track contains potential cues to aid shot segmentation, such as different speakers, background music, singing, and distinctive sounds. These different acoustic categories can be modeled to allow for effective database retrieval. In this paper, we address the problem of automatic segmentation of the audio track of multimedia material. This audio-based segmentation can be combined with video scene shot detection in order to partition the multimedia material into semantically significant segments.

  1. Audio fingerprint extraction for content identification

    NASA Astrophysics Data System (ADS)

    Shiu, Yu; Yeh, Chia-Hung; Kuo, C. C. J.

    2003-11-01

    In this work, we present an audio content identification system that identifies some unknown audio material by comparing its fingerprint with those extracted off-line and saved in the music database. We will describe in detail the procedure to extract audio fingerprints and demonstrate that they are robust to noise and content-preserving manipulations. The main feature in the proposed system is the zero-crossing rate extracted with the octave-band filter bank. The zero-crossing rate can be used to describe the dominant frequency in each subband with a very low computational cost. The size of audio fingerprint is small and can be efficiently stored along with the compressed files in the database. It is also robust to many modifications such as tempo change and time-alignment distortion. Besides, the octave-band filter bank is used to enhance the robustness to distortion, especially those localized on some frequency regions.
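
    As a rough sketch of the main feature, the per-band zero-crossing rate can be computed as below; a brick-wall FFT mask stands in for the octave-band filter bank, whose actual design the abstract does not specify, and the band edges are illustrative:

```python
import numpy as np

def octave_band_zcr(x, sr, n_bands=5, f_low=125.0):
    """Zero-crossing rate in each octave band of signal x.

    Each band [f, 2f) is isolated with an FFT brick-wall mask (a stand-in
    for the paper's octave-band filters), and the ZCR is the fraction of
    adjacent sample pairs whose signs differ."""
    X = np.fft.rfft(x)
    freqs = np.fft.rfftfreq(len(x), 1.0 / sr)
    zcrs = []
    lo = f_low
    for _ in range(n_bands):
        hi = 2.0 * lo
        band = np.fft.irfft(X * ((freqs >= lo) & (freqs < hi)), n=len(x))
        sign_changes = np.signbit(band[1:]) != np.signbit(band[:-1])
        zcrs.append(float(np.mean(sign_changes)))
        lo = hi
    return zcrs
```

    For a dominant tone at frequency f, the ZCR of its band is approximately 2f/sr, which is how the feature encodes the dominant frequency per subband at low cost.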

  2. Authenticity examination of compressed audio recordings using detection of multiple compression and encoders' identification.

    PubMed

    Korycki, Rafal

    2014-05-01

    Since the appearance of digital audio recordings, audio authentication has been becoming increasingly difficult. The currently available technologies and free editing software allow a forger to cut or paste any single word without audible artifacts. Nowadays, the only method for digital audio files commonly approved by forensic experts is the ENF criterion. It consists in fluctuation analysis of the mains frequency induced in the electronic circuits of recording devices; its effectiveness therefore depends strictly on the presence of the mains signal in the recording, which is a rare occurrence. Recently, much attention has been paid to authenticity analysis of compressed multimedia files, and several solutions have been proposed for detecting double compression in both digital video and digital audio. This paper addresses the problem of tampering detection in compressed audio files and discusses new methods that can be used for authenticity analysis of digital recordings. The presented approaches evaluate statistical features extracted from the MDCT coefficients, as well as other parameters that may be obtained from compressed audio files; the calculated feature vectors are used to train selected machine learning algorithms. Detecting multiple compression helps uncover tampering activity and identify traces of montage in digital audio recordings. To enhance the methods' robustness, an encoder identification algorithm was developed and applied, based on analysis of inherent compression parameters. The effectiveness of the tampering detection algorithms is tested on a predefined large music database consisting of nearly one million compressed audio files, and the influence of the compression algorithms' parameters on classification performance is discussed based on the results of the current study. Copyright © 2014 Elsevier Ireland Ltd. All rights reserved.

  3. Development of an audio-based virtual gaming environment to assist with navigation skills in the blind.

    PubMed

    Connors, Erin C; Yazzolino, Lindsay A; Sánchez, Jaime; Merabet, Lotfi B

    2013-03-27

    Audio-based Environment Simulator (AbES) is virtual environment software designed to improve real world navigation skills in the blind. Using only audio based cues and set within the context of a video game metaphor, users gather relevant spatial information regarding a building's layout. This allows the user to develop an accurate spatial cognitive map of a large-scale three-dimensional space that can be manipulated for the purposes of a real indoor navigation task. After game play, participants are then assessed on their ability to navigate within the target physical building represented in the game. Preliminary results suggest that early blind users were able to acquire relevant information regarding the spatial layout of a previously unfamiliar building as indexed by their performance on a series of navigation tasks. These tasks included path finding through the virtual and physical building, as well as a series of drop off tasks. We find that the immersive and highly interactive nature of the AbES software appears to greatly engage the blind user to actively explore the virtual environment. Applications of this approach may extend to larger populations of visually impaired individuals.

  4. Watermarking 3D Objects for Verification

    DTIC Science & Technology

    1999-01-01

    …signal (audio/image/video) processing and steganography fields, and even newer to the computer graphics community. Inherently, digital watermarking of… quality images, and digital video. The field of digital watermarking is relatively new, and many of its terms have not been well defined. Among the different media types, watermarking of 2D still images is comparatively better studied. Inherently, digital watermarking of 3D objects remains a…

  5. Audio-vocal interaction in single neurons of the monkey ventrolateral prefrontal cortex.

    PubMed

    Hage, Steffen R; Nieder, Andreas

    2015-05-06

    Complex audio-vocal integration systems depend on a strong interconnection between the auditory and the vocal motor system. To gain cognitive control over audio-vocal interaction during vocal motor control, the PFC needs to be involved. Neurons in the ventrolateral PFC (VLPFC) have been shown to separately encode the sensory perceptions and motor production of vocalizations. It is unknown, however, whether single neurons in the PFC reflect audio-vocal interactions. We therefore recorded single-unit activity in the VLPFC of rhesus monkeys (Macaca mulatta) while they produced vocalizations on command or passively listened to monkey calls. We found that 12% of randomly selected neurons in VLPFC modulated their discharge rate in response to acoustic stimulation with species-specific calls. Almost three-fourths of these auditory neurons showed an additional modulation of their discharge rates either before and/or during the monkeys' motor production of vocalization. Based on these audio-vocal interactions, the VLPFC might be well positioned to combine higher order auditory processing with cognitive control of the vocal motor output. Such audio-vocal integration processes in the VLPFC might constitute a precursor for the evolution of complex learned audio-vocal integration systems, ultimately giving rise to human speech. Copyright © 2015 the authors 0270-6474/15/357030-11$15.00/0.

  6. Reducing audio stimulus presentation latencies across studies, laboratories, and hardware and operating system configurations.

    PubMed

    Babjack, Destiny L; Cernicky, Brandon; Sobotka, Andrew J; Basler, Lee; Struthers, Devon; Kisic, Richard; Barone, Kimberly; Zuccolotto, Anthony P

    2015-09-01

    Using differing computer platforms and audio output devices to deliver audio stimuli often introduces (1) substantial variability across labs and (2) variable time between the intended and actual sound delivery (the sound onset latency). Fast, accurate audio onset latencies are particularly important when audio stimuli need to be delivered precisely as part of studies that depend on accurate timing (e.g., electroencephalographic, event-related potential, or multimodal studies), or in multisite studies in which standardization and strict control over the computer platforms used is not feasible. This research describes the variability introduced by using differing configurations and introduces a novel approach to minimizing audio sound latency and variability. A stimulus presentation and latency assessment approach is presented using E-Prime and Chronos (a new multifunction, USB-based data presentation and collection device). The present approach reliably delivers audio stimuli with low latencies that vary by ≤1 ms, independent of hardware and Windows operating system (OS)/driver combinations. The Chronos audio subsystem adopts a buffering, aborting, querying, and remixing approach to the delivery of audio, to achieve a consistent 1-ms sound onset latency for single-sound delivery, and precise delivery of multiple sounds that achieves standard deviations of 1/10th of a millisecond without the use of advanced scripting. Chronos's sound onset latencies are small, reliable, and consistent across systems. Testing of standard audio delivery devices and configurations highlights the need for careful attention to consistency between labs, experiments, and multiple study sites in their hardware choices, OS selections, and adoption of audio delivery systems designed to sidestep the audio latency variability issue.

  7. 47 CFR 10.520 - Common audio attention signal.

    Code of Federal Regulations, 2011 CFR

    2011-10-01

    ... 47 Telecommunication 1 2011-10-01 2011-10-01 false Common audio attention signal. 10.520 Section... Equipment Requirements § 10.520 Common audio attention signal. A Participating CMS Provider and equipment manufacturers may only market devices for public use under part 10 that include an audio attention signal that...

  8. 47 CFR 10.520 - Common audio attention signal.

    Code of Federal Regulations, 2013 CFR

    2013-10-01

    ... 47 Telecommunication 1 2013-10-01 2013-10-01 false Common audio attention signal. 10.520 Section... Equipment Requirements § 10.520 Common audio attention signal. A Participating CMS Provider and equipment manufacturers may only market devices for public use under part 10 that include an audio attention signal that...

  9. 47 CFR 10.520 - Common audio attention signal.

    Code of Federal Regulations, 2014 CFR

    2014-10-01

    ... 47 Telecommunication 1 2014-10-01 2014-10-01 false Common audio attention signal. 10.520 Section... Equipment Requirements § 10.520 Common audio attention signal. A Participating CMS Provider and equipment manufacturers may only market devices for public use under part 10 that include an audio attention signal that...

  10. 47 CFR 10.520 - Common audio attention signal.

    Code of Federal Regulations, 2012 CFR

    2012-10-01

    ... 47 Telecommunication 1 2012-10-01 2012-10-01 false Common audio attention signal. 10.520 Section... Equipment Requirements § 10.520 Common audio attention signal. A Participating CMS Provider and equipment manufacturers may only market devices for public use under part 10 that include an audio attention signal that...

  11. Audio visual speech source separation via improved context dependent association model

    NASA Astrophysics Data System (ADS)

    Kazemi, Alireza; Boostani, Reza; Sobhanmanesh, Fariborz

    2014-12-01

    In this paper, we exploit the non-linear relation between a speech source and its associated lip video as a source of extra information to propose an improved audio-visual speech source separation (AVSS) algorithm. The audio-visual association is modeled using a neural associator which estimates the visual lip parameters from a temporal context of acoustic observation frames. We define an objective function based on mean square error (MSE) measure between estimated and target visual parameters. This function is minimized for estimation of the de-mixing vector/filters to separate the relevant source from linear instantaneous or time-domain convolutive mixtures. We have also proposed a hybrid criterion which uses AV coherency together with kurtosis as a non-Gaussianity measure. Experimental results are presented and compared in terms of visually relevant speech detection accuracy and output signal-to-interference ratio (SIR) of source separation. The suggested audio-visual model significantly improves relevant speech classification accuracy compared to existing GMM-based model and the proposed AVSS algorithm improves the speech separation quality compared to reference ICA- and AVSS-based methods.

  12. Digital Audio: A Sound Design Element.

    ERIC Educational Resources Information Center

    Barron, Ann; Varnadoe, Susan

    1992-01-01

    Discussion of incorporating audio into videodiscs for multimedia educational applications highlights a project developed for the Navy that used digital audio in an interactive video delivery system (IVDS) for training sonar operators. Storage constraints with videodiscs are explained, design requirements for the IVDS are described, and production…

  13. Spatial domain entertainment audio decompression/compression

    NASA Astrophysics Data System (ADS)

    Chan, Y. K.; Tam, Ka Him K.

    2014-02-01

    The ARM7 NEON processor with a 128-bit SIMD hardware accelerator requires a peak performance of 13.99 mega cycles per second for MP3 stereo entertainment-quality decoding. For a similar compression bit rate, OGG and AAC are preferred over MP3. The Patent Cooperation Treaty application dated 28 August 2012 describes an audio decompression scheme producing a sequence of interleaving "min to Max" and "Max to min" rising and falling segments. The number of interior audio samples bound by "min to Max" or "Max to min" can be {0|1|…|N} audio samples. The magnitudes of the samples, including the bounding min and Max, are distributed as normalized constants between the bounding magnitudes 0 and 1. The decompressed audio is then a "sequence of static segments" on a frame-by-frame basis. Some of these frames need to be post-processed to elevate high frequencies. The post-processing is compression-efficiency neutral, and the additional decoding complexity is only a small fraction of the overall decoding complexity, without the need for extra hardware. Compression efficiency can be expected to be very high, as the source audio has been decimated and converted to a set of data with only "segment length and corresponding segment magnitude" attributes. The PCT describes how these two attributes are efficiently coded by its innovative coding scheme. The PCT decoding efficiency is very high and decoding latency is essentially zero. Both the hardware requirement and the run time are at least an order of magnitude better than MP3 variants, with the side benefit of ultra-low power consumption on mobile devices. As an acid test of whether such a simplistic waveform representation can reproduce authentic decompressed quality, the scheme is benchmarked against OGG (aoTuv Beta 6.03) using three pairs of stereo audio frames and one broadcast-like voice audio frame, each frame consisting of 2,028 samples at a 44,100 Hz sampling frequency.

  14. Effects of aging on audio-visual speech integration.

    PubMed

    Huyse, Aurélie; Leybaert, Jacqueline; Berthommier, Frédéric

    2014-10-01

    This study investigated the impact of aging on audio-visual speech integration. A syllable identification task was presented in auditory-only, visual-only, and audio-visual congruent and incongruent conditions. Visual cues were either degraded or unmodified. Stimuli were embedded in stationary noise alternating with modulated noise. Fifteen young adults and 15 older adults participated in this study. Results showed that older adults had preserved lipreading abilities when the visual input was clear but not when it was degraded. The impact of aging on audio-visual integration also depended on the quality of the visual cues. In the visual clear condition, the audio-visual gain was similar in both groups and analyses in the framework of the fuzzy-logical model of perception confirmed that older adults did not differ from younger adults in their audio-visual integration abilities. In the visual reduction condition, the audio-visual gain was reduced in the older group, but only when the noise was stationary, suggesting that older participants could compensate for the loss of lipreading abilities by using the auditory information available in the valleys of the noise. The fuzzy-logical model of perception confirmed the significant impact of aging on audio-visual integration by showing an increased weight of audition in the older group.

  15. A Graph Theory Practice on Transformed Image: A Random Image Steganography

    PubMed Central

    Thanikaiselvan, V.; Arulmozhivarman, P.; Subashanthini, S.; Amirtharajan, Rengarajan

    2013-01-01

    The modern information age is enriched with advanced network communication expertise but unfortunately at the same time encounters endless security issues when dealing with secret and/or private information. The storage and transmission of secret information have become highly essential and have led to a deluge of research in this field. In this paper, an optimistic effort has been made to combine graceful graphs with the integer wavelet transform (IWT) to implement random image steganography for secure communication. The implementation begins with the conversion of the cover image into wavelet coefficients through the IWT, followed by embedding the secret image in randomly selected coefficients through graph theory. Finally, the stego-image is obtained by applying the inverse IWT. This method provides a maximum peak signal-to-noise ratio (PSNR) of 44 dB for 266646 bits. Thus, the proposed method gives high imperceptibility through a high PSNR value, high embedding capacity in the cover image due to the adaptive embedding scheme, and high robustness against blind attacks through graph-theoretic random selection of coefficients. PMID:24453857
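
    A minimal sketch of the embed/extract pipeline: a 1-D integer Haar transform (S-transform) stands in for the 2-D IWT, and a plain key-seeded random pick replaces the graceful-graph coefficient selection; all function names are illustrative, not the authors' code:

```python
import random

def iwt_pairs(pixels):
    """1-D integer Haar (S-transform) over an even-length sequence.
    Exactly invertible on integers, which is why IWT suits steganography."""
    approx, detail = [], []
    for i in range(0, len(pixels), 2):
        d = pixels[i + 1] - pixels[i]
        a = pixels[i] + d // 2
        approx.append(a)
        detail.append(d)
    return approx, detail

def iiwt_pairs(approx, detail):
    """Inverse of iwt_pairs: recovers the original integer samples."""
    out = []
    for a, d in zip(approx, detail):
        x0 = a - d // 2
        out.extend([x0, x0 + d])
    return out

def embed(pixels, bits, key):
    """Hide bits in the LSBs of key-selected detail coefficients."""
    approx, detail = iwt_pairs(pixels)
    idx = random.Random(key).sample(range(len(detail)), len(bits))
    for i, b in zip(idx, bits):
        detail[i] = (detail[i] & ~1) | b  # overwrite the LSB
    return iiwt_pairs(approx, detail)     # stego signal

def extract(stego, n_bits, key):
    """Re-run the transform and read the LSBs at the same key-selected spots."""
    _, detail = iwt_pairs(stego)
    idx = random.Random(key).sample(range(len(detail)), n_bits)
    return [detail[i] & 1 for i in idx]
```

    Because the integer transform round-trips exactly, the detail coefficients recomputed from the stego signal carry the embedded LSBs unchanged.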

  16. Real-time implementation of second generation of audio multilevel information coding

    NASA Astrophysics Data System (ADS)

    Ali, Murtaza; Tewfik, Ahmed H.; Viswanathan, V.

    1994-03-01

    This paper describes a real-time implementation of a novel wavelet-based audio compression method. The method is based on the discrete wavelet transform (DWT) representation of signals. A bit allocation procedure is used to allocate bits to the transform coefficients in an adaptive fashion. The bit allocation procedure has been designed to take advantage of the masking effect in human hearing: it minimizes the number of bits required to represent each frame of audio at a fixed distortion level. The real-time implementation provides almost transparent compression of monophonic CD-quality audio signals (sampled at 44.1 kHz and quantized using 16 bits/sample) at bit rates of 64-78 kbits/s. Our implementation uses two ASPI Elf boards, each of which is built around a TI TMS320C31 DSP chip. The time required for encoding a mono CD signal is about 92 percent of real time, and that for decoding about 61 percent.
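
    The masking-driven bit allocation can be illustrated with a simple rule of thumb (an assumption for illustration, not the authors' exact procedure): a b-bit uniform quantizer over [-|c|, |c|] has worst-case error |c|/2^b, so keeping that error below a per-coefficient masking threshold T requires b >= log2(|c|/T):

```python
import numpy as np

def allocate_bits(coeffs, mask_threshold):
    """Per-coefficient bit allocation: just enough bits to push the
    worst-case quantization error below the masking threshold; zero
    bits where the coefficient is already below the threshold (inaudible)."""
    with np.errstate(divide="ignore"):  # log2(0) -> -inf, clipped to 0 bits
        bits = np.ceil(np.log2(np.abs(coeffs) / mask_threshold))
    return np.clip(bits, 0, None).astype(int)
```

    Summing the returned bits over a frame gives the frame's bit cost at the fixed (masked) distortion level, which is the quantity the paper's procedure minimizes.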

  17. Audio-video feature correlation: faces and speech

    NASA Astrophysics Data System (ADS)

    Durand, Gwenael; Montacie, Claude; Caraty, Marie-Jose; Faudemay, Pascal

    1999-08-01

    This paper presents a study of the correlation between features automatically extracted from the audio stream and the video stream of audiovisual documents. In particular, we were interested in finding out whether speech analysis tools could be combined with face detection methods, and to what extent they should be combined. A generic audio signal partitioning algorithm was first used to detect Silence/Noise/Music/Speech segments in a full-length movie. A generic object detection method was applied to the keyframes extracted from the movie in order to detect the presence or absence of faces. The correlation between the presence of a face in the keyframes and of the corresponding voice in the audio stream was studied. A third stream, the script of the movie, is warped onto the speech channel in order to automatically label faces appearing in the keyframes with the name of the corresponding character. We naturally found that the extracted audio and video features were related in many cases, and that significant benefits can be obtained from the joint use of audio and video analysis methods.

  18. Engaging Practical Students through Audio Feedback

    ERIC Educational Resources Information Center

    Pearson, John

    2018-01-01

    This paper uses an action research intervention in an attempt to improve student engagement with summative feedback. The intervention delivered summative module feedback to the students as audio recordings, replacing the written method employed in previous years. The project found that students are keen on audio as an alternative to written…

  19. SNR-adaptive stream weighting for audio-MES ASR.

    PubMed

    Lee, Ki-Seung

    2008-08-01

    Myoelectric signals (MESs) from the speaker's mouth region have been successfully shown to improve the noise robustness of automatic speech recognizers (ASRs), thus promising to extend their usability in implementing noise-robust ASR. In the recognition system presented herein, extracted audio and facial MES features were integrated by a decision fusion method, where the likelihood score of the audio-MES observation vector was given by a linear combination of the class-conditional observation log-likelihoods of the two classifiers, using appropriate weights. We developed a weighting process adaptive to SNR. The main objective of the paper is to determine the optimal SNR classification boundaries and to construct a set of optimum stream weights for each SNR class. These two parameters were determined by a method based on a maximum mutual information criterion. Acoustic and facial MES data were collected from five subjects, using a 60-word vocabulary. Four types of acoustic noise, including babble, car, aircraft, and white noise, were added to clean speech signals at SNRs ranging from -14 to 31 dB. The classification accuracy of the audio-only ASR was as low as 25.5%, whereas that of the MES ASR was 85.2%. The classification accuracy could be further improved by employing the proposed audio-MES weighting method, reaching as high as 89.4% in the case of babble noise. A similar result was also found for the other types of noise.
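
    The fusion rule, a linear combination of class-conditional log-likelihoods with an SNR-dependent weight, can be sketched as follows; the class boundaries and weights here are illustrative placeholders for the values the paper derives from the maximum mutual information criterion:

```python
def fuse_scores(ll_audio, ll_mes, snr_db, boundaries, weights):
    """Per-class fused score: s_c = w * log p(x_audio|c) + (1-w) * log p(x_mes|c),
    where w is chosen by the SNR class the measured SNR falls into."""
    k = sum(snr_db >= b for b in boundaries)  # index of the SNR class
    w = weights[k]
    return {c: w * ll_audio[c] + (1.0 - w) * ll_mes[c] for c in ll_audio}

def recognize(ll_audio, ll_mes, snr_db,
              boundaries=(-5.0, 10.0),      # illustrative SNR class edges (dB)
              weights=(0.2, 0.5, 0.8)):     # audio weight per SNR class
    """Pick the word class with the highest fused score."""
    scores = fuse_scores(ll_audio, ll_mes, snr_db, boundaries, weights)
    return max(scores, key=scores.get)
```

    At low SNR the small audio weight lets the noise-immune MES stream dominate; at high SNR the audio stream, which is the more accurate one in clean conditions, takes over.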

  20. Digital Audio Application to Short Wave Broadcasting

    NASA Technical Reports Server (NTRS)

    Chen, Edward Y.

    1997-01-01

    Digital audio is becoming prevalent not only in consumer electronics, but also in different broadcasting media. Terrestrial analog audio broadcasting in the AM and FM bands will eventually be replaced by digital systems.

  1. 3D Audio System

    NASA Technical Reports Server (NTRS)

    1992-01-01

    Ames Research Center research into virtual reality led to the development of the Convolvotron, a high speed digital audio processing system that delivers three-dimensional sound over headphones. It consists of a two-card set designed for use with a personal computer. The Convolvotron's primary application is presentation of 3D audio signals over headphones. Four independent sound sources are filtered with large time-varying filters that compensate for motion. The perceived location of the sound remains constant. Possible applications are in air traffic control towers or airplane cockpits, hearing and perception research and virtual reality development.

  2. ESA personal communications and digital audio broadcasting systems based on non-geostationary satellites

    NASA Technical Reports Server (NTRS)

    Logalbo, P.; Benedicto, J.; Viola, R.

    1993-01-01

    Personal Communications and Digital Audio Broadcasting are two new services that the European Space Agency (ESA) is investigating for future European and Global Mobile Satellite systems. ESA is active in promoting these services in their various mission options including non-geostationary and geostationary satellite systems. A Medium Altitude Global Satellite System (MAGSS) for global personal communications at L and S-band, and a Multiregional Highly inclined Elliptical Orbit (M-HEO) system for multiregional digital audio broadcasting at L-band are described. Both systems are being investigated by ESA in the context of future programs, such as Archimedes, which are intended to demonstrate the new services and to develop the technology for future non-geostationary mobile communication and broadcasting satellites.

  3. Audio signal processor

    NASA Technical Reports Server (NTRS)

    Hymer, R. L.

    1970-01-01

    System provides automatic volume control for an audio amplifier or a voice communication system without introducing noise surges during pauses in the input, and without losing the initial signal when the input resumes.
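    A gated automatic gain control of this general kind can be sketched as follows (parameter names and values are illustrative assumptions): the gain adapts only while signal is present and is held during pauses, so background noise is not pumped up between utterances.

```python
def agc(samples, target=0.5, threshold=0.05, attack=0.1):
    """Gated automatic gain control: the gain adapts toward the target
    level only while the input exceeds a noise threshold, and is held
    (never ramped up) during pauses, avoiding noise surges."""
    gain = 1.0
    out = []
    for s in samples:
        level = abs(s)
        if level > threshold:          # signal present: adapt toward target
            desired = target / level
            gain += attack * (desired - gain)
        out.append(s * gain)           # during pauses the gain is simply held
    return out
```

    The attack parameter smooths the gain change, so the initial signal after a pause is passed through at the held gain rather than lost.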

  4. Structuring Broadcast Audio for Information Access

    NASA Astrophysics Data System (ADS)

    Gauvain, Jean-Luc; Lamel, Lori

    2003-12-01

    One rapidly expanding application area for state-of-the-art speech recognition technology is the automatic processing of broadcast audiovisual data for information access. Since much of the linguistic information is found in the audio channel, speech recognition is a key enabling technology which, when combined with information retrieval techniques, can be used for searching large audiovisual document collections. Audio indexing must take into account the specificities of audio data such as needing to deal with the continuous data stream and an imperfect word transcription. Other important considerations are dealing with language specificities and facilitating language portability. At Laboratoire d'Informatique pour la Mécanique et les Sciences de l'Ingénieur (LIMSI), broadcast news transcription systems have been developed for seven languages: English, French, German, Mandarin, Portuguese, Spanish, and Arabic. The transcription systems have been integrated into prototype demonstrators for several application areas such as audio data mining, structuring audiovisual archives, selective dissemination of information, and topic tracking for media monitoring. As examples, this paper addresses the spoken document retrieval and topic tracking tasks.

  5. Automatic summarization of soccer highlights using audio-visual descriptors.

    PubMed

    Raventós, A; Quijada, R; Torres, Luis; Tarrés, Francesc

    2015-01-01

    Automatic summary generation for sports video content has been an object of great interest for many years. Although semantic description techniques have been proposed, many approaches still rely on low-level video descriptors that yield quite limited results, due to the complexity of the problem and the low capability of the descriptors to represent semantic content. In this paper, a new approach for automatic highlight summarization of soccer videos using audio-visual descriptors is presented. The approach is based on segmenting the video sequence into shots that are further analyzed to determine their relevance and interest. Of special interest in the approach is the use of audio information, which provides additional robustness to the overall performance of the summarization system. For every video shot, a set of low- and mid-level audio-visual descriptors is computed and then combined to obtain different relevance measures based on empirical knowledge rules. The final summary is generated by selecting the shots with the highest interest according to the user's specifications and the results of the relevance measures. A variety of results are presented with real soccer video sequences that prove the validity of the approach.

  6. Digital Watermarking: From Concepts to Real-Time Video Applications

    DTIC Science & Technology

    1999-01-01

    includes still-image, video, audio, and geometry data among others; the fundamental concept of steganography can be transferred from the field of... size of the message, which should be as small as possible. Some commercially available algorithms for image watermarking forego the secure-watermarking... image compression. The image's luminance component is divided into 8 x 8 pixel blocks. The algorithm selects a sequence of blocks and applies the

  7. Toward Personal and Emotional Connectivity in Mobile Higher Education through Asynchronous Formative Audio Feedback

    ERIC Educational Resources Information Center

    Rasi, Päivi; Vuojärvi, Hanna

    2018-01-01

    This study aims to develop asynchronous formative audio feedback practices for mobile learning in higher education settings. The development was conducted in keeping with the principles of design-based research. The research activities focused on an inter-university online course, within which the use of instructor audio feedback was tested,…

  8. 47 CFR 73.403 - Digital audio broadcasting service requirements.

    Code of Federal Regulations, 2010 CFR

    2010-10-01

    ... programming stream at no direct charge to listeners. In addition, a broadcast radio station must simulcast its analog audio programming on one of its digital audio programming streams. The DAB audio programming... analog programming service currently provided to listeners. (b) Emergency information. The emergency...

  9. A Virtual Audio Guidance and Alert System for Commercial Aircraft Operations

    NASA Technical Reports Server (NTRS)

    Begault, Durand R.; Wenzel, Elizabeth M.; Shrum, Richard; Miller, Joel; Null, Cynthia H. (Technical Monitor)

    1996-01-01

    Our work in virtual reality systems at NASA Ames Research Center includes the area of aurally-guided visual search, using specially-designed audio cues and spatial audio processing (also known as virtual or "3-D audio") techniques (Begault, 1994). Previous studies at Ames had revealed that use of 3-D audio for Traffic Collision Avoidance System (TCAS) advisories significantly reduced head-down time, compared to a head-down map display (0.5 sec advantage) or no display at all (2.2 sec advantage) (Begault, 1993, 1995; Begault & Pittman, 1994; see Wenzel, 1994, for an audio demo). Since the crew must keep their head up and looking out the window as much as possible when taxiing under low-visibility conditions, and the potential for "blunder" is increased under such conditions, it was sensible to evaluate the audio spatial cueing for a prototype audio ground collision avoidance warning (GCAW) system, and a 3-D audio guidance system. Results were favorable for GCAW, but not for the audio guidance system.

  10. Horatio Audio-Describes Shakespeare's "Hamlet": Blind and Low-Vision Theatre-Goers Evaluate an Unconventional Audio Description Strategy

    ERIC Educational Resources Information Center

    Udo, J. P.; Acevedo, B.; Fels, D. I.

    2010-01-01

    Audio description (AD) has been introduced as one solution for providing people who are blind or have low vision with access to live theatre, film and television content. However, there is little research to inform the process, user preferences and presentation style. We present a study of a single live audio-described performance of Hart House…

  11. A digital audio/video interleaving system. [for Shuttle Orbiter

    NASA Technical Reports Server (NTRS)

    Richards, R. W.

    1978-01-01

    A method of interleaving an audio signal with its associated video signal for simultaneous transmission or recording, and the subsequent separation of the two signals, is described. Comparisons are made between the new audio signal interleaving system and the Skylab PAM audio/video interleaving system, pointing out improvements gained by using the digital audio/video interleaving system. It was found that the digital technique is the simplest, most effective, and most reliable method for interleaving audio and/or other types of data into the video signal for the Shuttle Orbiter application. Details are given of the design of a multiplexer capable of accommodating two basic data channels, each consisting of a single 31.5-kb/s digital bit stream. An adaptive slope delta modulation system is introduced to digitize audio signals, producing high immunity of word intelligibility to channel errors, primarily due to the robust nature of the delta-modulation algorithm.
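    Adaptive slope delta modulation of the kind mentioned can be sketched as follows; the step-size bounds and adaptation factor below are illustrative, not the Orbiter system's actual parameters. The step size grows on runs of identical bits (slope overload) and shrinks otherwise, and because each bit adjusts the estimate only incrementally, an isolated channel bit error degrades the waveform only locally:

```python
def cvsd_encode(samples, min_step=0.01, max_step=0.5, factor=1.5):
    """One bit per sample: emit 1 if the input is above the running
    estimate, 0 otherwise, adapting the step size to the bit stream."""
    bits, est, step, last = [], 0.0, min_step, None
    for s in samples:
        bit = 1 if s >= est else 0
        bits.append(bit)
        # grow the step on repeated bits (slope overload), else shrink it
        step = min(step * factor, max_step) if bit == last else max(step / factor, min_step)
        est += step if bit else -step
        last = bit
    return bits

def cvsd_decode(bits, min_step=0.01, max_step=0.5, factor=1.5):
    """Mirror the encoder's step adaptation to rebuild the waveform."""
    out, est, step, last = [], 0.0, min_step, None
    for bit in bits:
        step = min(step * factor, max_step) if bit == last else max(step / factor, min_step)
        est += step if bit else -step
        out.append(est)
        last = bit
    return out
```
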

  12. AUDIO-CASI

    PubMed Central

    Cooley, Philip C.; Turner, Charles F.; O'Reilly, James M.; Allen, Danny R.; Hamill, David N.; Paddock, Richard E.

    2011-01-01

    This article reviews a multimedia application in the area of survey measurement research: adding audio capabilities to a computer-assisted interviewing system. Hardware and software issues are discussed, and potential hardware devices that operate from DOS platforms are reviewed. Three types of hardware devices are considered: PCMCIA devices, parallel port attachments, and laptops with built-in sound. PMID:22096271

  13. Power saver circuit for audio/visual signal unit

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Right, R. W.

    1985-02-12

    A combined audio and visual signal unit with the audio and visual components actuated alternately and powered over a single cable pair in such a manner that only one of the audio and visual components is drawing power from the power supply at any given instant. Thus, the power supply is never called upon to provide more energy than that drawn by the one of the components having the greater power requirement. This is particularly advantageous when several combined audio and visual signal units are coupled in parallel on one cable pair. Typically, the signal unit may comprise a horn and a strobe light for a fire alarm signalling system.

  14. Temporal Structure and Complexity Affect Audio-Visual Correspondence Detection

    PubMed Central

    Denison, Rachel N.; Driver, Jon; Ruff, Christian C.

    2013-01-01

    Synchrony between events in different senses has long been considered the critical temporal cue for multisensory integration. Here, using rapid streams of auditory and visual events, we demonstrate how humans can use temporal structure (rather than mere temporal coincidence) to detect multisensory relatedness. We find psychophysically that participants can detect matching auditory and visual streams via shared temporal structure for crossmodal lags of up to 200 ms. Performance on this task reproduced features of past findings based on explicit timing judgments but did not show any special advantage for perfectly synchronous streams. Importantly, the complexity of temporal patterns influences sensitivity to correspondence. Stochastic, irregular streams – with richer temporal pattern information – led to higher audio-visual matching sensitivity than predictable, rhythmic streams. Our results reveal that temporal structure and its complexity are key determinants for human detection of audio-visual correspondence. The distinctive emphasis of our new paradigms on temporal patterning could be useful for studying special populations with suspected abnormalities in audio-visual temporal perception and multisensory integration. PMID:23346067

  15. The Audio-Visual Equipment Directory. Seventeenth Edition.

    ERIC Educational Resources Information Center

    Herickes, Sally, Ed.

    The following types of audiovisual equipment are catalogued: 8 mm. and 16 mm. motion picture projectors, filmstrip and sound filmstrip projectors, slide projectors, random access projection equipment, opaque, overhead, and micro-projectors, record players, special purpose projection equipment, audio tape recorders and players, audio tape…

  16. How we give personalised audio feedback after summative OSCEs.

    PubMed

    Harrison, Christopher J; Molyneux, Adrian J; Blackwell, Sara; Wass, Valerie J

    2015-04-01

    Students often receive little feedback after summative objective structured clinical examinations (OSCEs) to enable them to improve their performance. Electronic audio feedback has shown promise in other educational areas. We investigated the feasibility of electronic audio feedback in OSCEs. An electronic OSCE system was designed, comprising (1) an application for iPads allowing examiners to mark in the key consultation skill domains, provide "tick-box" feedback identifying strengths and difficulties, and record voice feedback; (2) a feedback website giving students the opportunity to view/listen in multiple ways to the feedback. Acceptability of the audio feedback was investigated, using focus groups with students and questionnaires with both examiners and students. 87 (95%) students accessed the examiners' audio comments; 83 (90%) found the comments useful and 63 (68%) reported changing the way they perform a skill as a result of the audio feedback. They valued its highly personalised, relevant nature and found it much more useful than written feedback. Eighty-nine per cent of examiners gave audio feedback to all students on their stations. Although many found the method easy, lack of time was a factor. Electronic audio feedback provides timely, personalised feedback to students after a summative OSCE provided enough time is allocated to the process.

  17. Automatic Detection and Classification of Audio Events for Road Surveillance Applications.

    PubMed

    Almaadeed, Noor; Asim, Muhammad; Al-Maadeed, Somaya; Bouridane, Ahmed; Beghdadi, Azeddine

    2018-06-06

    This work investigates the problem of detecting hazardous events on roads by designing an audio surveillance system that automatically detects perilous situations such as car crashes and tire skidding. In recent years, research has produced several visual surveillance systems proposed for road monitoring to detect accidents, with the aim of improving safety procedures in emergency cases. However, visual information alone cannot detect certain events such as car crashes and tire skidding, especially under adverse and visually cluttered weather conditions such as snowfall, rain, and fog. Consequently, the incorporation of microphones and audio event detectors based on audio processing can significantly enhance the detection accuracy of such surveillance systems. This paper proposes to combine time-domain, frequency-domain, and joint time-frequency features extracted from a class of quadratic time-frequency distributions (QTFDs) to detect events on roads through audio analysis and processing. Experiments were carried out using a publicly available dataset. The experimental results confirm the effectiveness of the proposed approach for detecting hazardous events on roads, as demonstrated by a 7% improvement in accuracy over methods that use individual temporal and spectral features.
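    Typical time-domain and frequency-domain audio features of the kind combined in such detectors can be sketched as follows; this is a toy illustration, not the paper's QTFD feature set: short-time energy and zero-crossing rate in the time domain, and the spectral centroid from a plain DFT in the frequency domain.

```python
import math

def frame_features(frame, sr):
    """Per-frame features: short-time energy, zero-crossing rate, and
    spectral centroid (in Hz) computed from a direct DFT."""
    n = len(frame)
    energy = sum(s * s for s in frame) / n
    zcr = sum(frame[i] * frame[i - 1] < 0 for i in range(1, n)) / (n - 1)
    # magnitude spectrum over the first half of the DFT bins
    mags = []
    for k in range(n // 2):
        re = sum(frame[t] * math.cos(2 * math.pi * k * t / n) for t in range(n))
        im = -sum(frame[t] * math.sin(2 * math.pi * k * t / n) for t in range(n))
        mags.append(math.hypot(re, im))
    total = sum(mags) or 1.0
    centroid = sum(k * sr / n * m for k, m in enumerate(mags)) / total
    return energy, zcr, centroid
```

    A crash produces a broadband burst (high energy, high centroid), while tire skidding yields a sustained narrowband tone; a classifier is then trained on such feature vectors.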

  18. Tune in the Net with RealAudio.

    ERIC Educational Resources Information Center

    Buchanan, Larry

    1997-01-01

    Describes how to connect to the RealAudio Web site to download a player that delivers sound from Web pages to the computer through streaming technology. Explains hardware and software requirements and provides addresses for other RealAudio Web sites, including sources of weather information and current news. (LRW)

  19. Comparison of three orientation and mobility aids for individuals with blindness: Verbal description, audio-tactile map and audio-haptic map.

    PubMed

    Papadopoulos, Konstantinos; Koustriava, Eleni; Koukourikos, Panagiotis; Kartasidou, Lefkothea; Barouti, Marialena; Varveris, Asimis; Misiou, Marina; Zacharogeorga, Timoclia; Anastasiadis, Theocharis

    2017-01-01

    Disorientation and an inability to find one's way are frequent phenomena for individuals with visual impairments when travelling through novel environments. Orientation and mobility aids can serve as important tools in preparing for safer, cognitively mapped travel. The aim of the present study was to examine whether the spatial knowledge structured after an individual with blindness had studied the map of an urban area, delivered through a verbal description, an audio-tactile map or an audio-haptic map, could be used to locate specific points of interest in the area. The effectiveness of the three aids relative to each other was also examined. The results of the present study highlight the effectiveness of the audio-tactile and audio-haptic maps as orientation and mobility aids, especially when compared to verbal descriptions.

  20. Sounding ruins: reflections on the production of an 'audio drift'.

    PubMed

    Gallagher, Michael

    2015-07-01

    This article is about the use of audio media in researching places, which I term 'audio geography'. The article narrates some episodes from the production of an 'audio drift', an experimental environmental sound work designed to be listened to on a portable MP3 player whilst walking in a ruinous landscape. Reflecting on how this work functions, I argue that, as well as representing places, audio geography can shape listeners' attention and bodily movements, thereby reworking places, albeit temporarily. I suggest that audio geography is particularly apt for amplifying the haunted and uncanny qualities of places. I discuss some of the issues raised for research ethics, epistemology and spectral geographies.

  1. The sounds of handheld audio players.

    PubMed

    Rudy, Susan F

    2007-01-01

    Hearing experts and public health organizations have longstanding hearing safety concerns about personal handheld audio devices, which are growing in both number and popularity. This paper reviews the maximum sound levels of handheld compact disc players, MP3 players, and an iPod. It further reviews device factors that influence the sound levels produced by these audio devices and ways to reduce the risk to hearing during their use.

  2. Ontology-based structured cosine similarity in document summarization: with applications to mobile audio-based knowledge management.

    PubMed

    Yuan, Soe-Tsyr; Sun, Jerry

    2005-10-01

    Development of algorithms for automated text categorization in massive text document sets is an important research area of data mining and knowledge discovery. Most text-clustering methods are grounded in term-based measurement of distance or similarity, ignoring the structure of the documents. In this paper, we present a novel method named structured cosine similarity (SCS) that furnishes document clustering with a new way of modeling document summarization, taking the structure of the documents into account so as to improve the performance of document clustering in terms of quality, stability, and efficiency. This study was motivated by the problem of clustering speech documents (which lack rich document features) obtained from oral experience sharing conducted wirelessly by the mobile workforce of enterprises, fulfilling audio-based knowledge management. In other words, the aim is to facilitate knowledge acquisition and sharing by speech. The evaluations show fairly promising results for the structured cosine similarity method.
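    For context, the plain term-based cosine similarity that structured cosine similarity extends can be sketched as follows (a baseline illustration, not the SCS method itself):

```python
import math
from collections import Counter

def cosine_similarity(doc_a, doc_b):
    """Term-based cosine similarity between two token lists: the dot
    product of their term-frequency vectors over the product of the
    vector norms. Ignores document structure entirely."""
    va, vb = Counter(doc_a), Counter(doc_b)
    dot = sum(va[t] * vb[t] for t in va)
    na = math.sqrt(sum(c * c for c in va.values()))
    nb = math.sqrt(sum(c * c for c in vb.values()))
    return dot / (na * nb) if na and nb else 0.0
```

    SCS departs from this baseline by comparing documents section by section rather than as single flat term vectors.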

  3. Digital Audio Sampling for Film and Video.

    ERIC Educational Resources Information Center

    Stanton, Michael J.

    Digital audio sampling is explained, and some of its implications in digital sound applications are discussed. Digital sound equipment is rapidly replacing analog recording devices as the state-of-the-art in audio technology. The philosophy of digital recording involves doing away with the continuously variable analog waveforms and turning the…

  4. Audio and Video Reflections to Promote Social Justice

    ERIC Educational Resources Information Center

    Boske, Christa

    2011-01-01

    Purpose: The purpose of this paper is to examine how 15 graduate students enrolled in a US school leadership preparation program understand issues of social justice and equity through a reflective process utilizing audio and/or video software. Design/methodology/approach: The study is based on the tradition of grounded theory. The researcher…

  5. Modified DCTNet for audio signals classification

    NASA Astrophysics Data System (ADS)

    Xian, Yin; Pu, Yunchen; Gan, Zhe; Lu, Liang; Thompson, Andrew

    2016-10-01

    In this paper, we investigate DCTNet for audio signal classification. Its output feature is related to Cohen's class of time-frequency distributions. We introduce the use of an adaptive DCTNet (A-DCTNet) for audio signal feature extraction. The A-DCTNet applies the idea of the constant-Q transform, with the center frequencies of its filterbanks geometrically spaced. The A-DCTNet adapts to different acoustic scales, and it can better capture low-frequency acoustic information, to which human auditory perception is sensitive, than features such as Mel-frequency spectral coefficients (MFSC). We use features extracted by the A-DCTNet as input to classifiers. Experimental results show that the A-DCTNet and Recurrent Neural Networks (RNN) achieve state-of-the-art bird song classification rates and improve artist identification accuracy on music data. This demonstrates the A-DCTNet's applicability to signal processing problems.
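    Geometric spacing of center frequencies, the constant-Q idea mentioned above, can be sketched as follows (illustrative parameter values, not the A-DCTNet's actual filterbank design):

```python
def constant_q_centers(f_min, f_max, bins_per_octave=12):
    """Geometrically spaced filterbank center frequencies: each bin is
    a fixed ratio above the previous one, so the ratio of center
    frequency to bandwidth (the Q) stays constant across the bank."""
    ratio = 2.0 ** (1.0 / bins_per_octave)
    centers, f = [], f_min
    while f <= f_max:
        centers.append(f)
        f *= ratio
    return centers
```

    This spacing packs bins densely at low frequencies, which is why a constant-Q front end resolves low-frequency content better than a linearly spaced one.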

  6. Audio Motor Training at the Foot Level Improves Space Representation.

    PubMed

    Aggius-Vella, Elena; Campus, Claudio; Finocchietti, Sara; Gori, Monica

    2017-01-01

    Spatial representation is developed thanks to the integration of visual signals with the other senses. It has been shown that the lack of vision compromises the development of some spatial representations. In this study we tested the effect of a new rehabilitation device called ABBI (Audio Bracelet for Blind Interaction) to improve space representation. ABBI produces an audio feedback linked to body movement. Previous studies from our group showed that this device improves the spatial representation of space in early blind adults around the upper part of the body. Here we evaluate whether the audio motor feedback produced by ABBI can also improve audio spatial representation of sighted individuals in the space around the legs. Forty five blindfolded sighted subjects participated in the study, subdivided into three experimental groups. An audio space localization (front-back discrimination) task was performed twice by all groups of subjects before and after different kind of training conditions. A group (experimental) performed an audio-motor training with the ABBI device placed on their foot. Another group (control) performed a free motor activity without audio feedback associated with body movement. The other group (control) passively listened to the ABBI sound moved at foot level by the experimenter without producing any body movement. Results showed that only the experimental group, which performed the training with the audio-motor feedback, showed an improvement in accuracy for sound discrimination. No improvement was observed for the two control groups. These findings suggest that the audio-motor training with ABBI improves audio space perception also in the space around the legs in sighted individuals. This result provides important inputs for the rehabilitation of the space representations in the lower part of the body.

  8. DNA-based cryptographic methods for data hiding in DNA media.

    PubMed

    Marwan, Samiha; Shawish, Ahmed; Nagaty, Khaled

    2016-12-01

    Information security can be achieved using cryptography, steganography, or a combination of the two, where data is first encrypted using any of the available cryptography techniques and then hidden in a cover medium. Recently, genomic DNA has been introduced as a hiding medium, known as DNA steganography, due to its notable ability to hide huge data sets with a high level of randomness and hence security. Despite the numerous cryptography techniques, to our knowledge only the Vigenère cipher and the DNA-based Playfair cipher have been combined with DNA steganography, which leaves room for investigating other techniques and developing new improvements. This paper presents a comprehensive analysis of the DNA-based Playfair, Vigenère, RSA, and AES ciphers, each combined with a DNA hiding technique. The conducted analysis reports the performance diversity of each combined technique in terms of security, speed, and hiding capacity, in addition to both key size and data size. Moreover, this paper proposes a modification of the current combined DNA-based Playfair cipher technique, which is not only simple and fast but also provides a significantly higher hiding capacity and security. The conducted extensive experimental studies confirm such outstanding performance in comparison with all the discussed combined techniques. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.
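    The binary-to-nucleotide mapping that typically underlies such DNA hiding schemes can be sketched as follows; this shows only the encoding layer, assuming the payload has already been encrypted by one of the ciphers discussed:

```python
# conventional 2-bit-per-base coding used in DNA data hiding schemes
BASE_FOR_BITS = {"00": "A", "01": "C", "10": "G", "11": "T"}
BITS_FOR_BASE = {v: k for k, v in BASE_FOR_BITS.items()}

def encode_to_dna(data: bytes) -> str:
    """Map each 2-bit pair of the (already encrypted) payload to a
    nucleotide, producing a strand that can then be hidden inside a
    reference genomic sequence."""
    bits = "".join(f"{b:08b}" for b in data)
    return "".join(BASE_FOR_BITS[bits[i:i + 2]] for i in range(0, len(bits), 2))

def decode_from_dna(strand: str) -> bytes:
    """Invert the mapping to recover the encrypted payload."""
    bits = "".join(BITS_FOR_BASE[b] for b in strand)
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))
```
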

  9. Effect of audio in-vehicle red light-running warning message on driving behavior based on a driving simulator experiment.

    PubMed

    Yan, Xuedong; Liu, Yang; Xu, Yongcun

    2015-01-01

    Drivers' incorrect decisions of crossing signalized intersections at the onset of the yellow change may lead to red light running (RLR), and RLR crashes result in substantial numbers of severe injuries and property damage. In recent years, some Intelligent Transport System (ITS) concepts have focused on reducing RLR by alerting drivers that they are about to violate the signal. The objective of this study is to conduct an experimental investigation on the effectiveness of the red light violation warning system using a voice message. In this study, the prototype concept of the RLR audio warning system was modeled and tested in a high-fidelity driving simulator. According to the concept, when a vehicle is approaching an intersection at the onset of yellow and the time to the intersection is longer than the yellow interval, the in-vehicle warning system can activate the following audio message "The red light is impending. Please decelerate!" The intent of the warning design is to encourage drivers who cannot clear an intersection during the yellow change interval to stop at the intersection. The experimental results showed that the warning message could decrease red light running violations by 84.3 percent. Based on the logistic regression analyses, drivers without a warning were about 86 times more likely to make go decisions at the onset of yellow and about 15 times more likely to run red lights than those with a warning. Additionally, it was found that the audio warning message could significantly reduce RLR severity because the RLR drivers' red-entry times without a warning were longer than those with a warning. This driving simulator study showed a promising effect of the audio in-vehicle warning message on reducing RLR violations and crashes. It is worthwhile to further develop the proposed technology in field applications.
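    The warning logic described, triggering at the onset of yellow when the vehicle cannot clear the intersection before red, reduces to a simple time-to-intersection test; the function below is an illustrative sketch, not the simulator's implementation:

```python
def should_warn(distance_m, speed_mps, yellow_s):
    """At the onset of yellow, warn if the estimated time to reach the
    stop line exceeds the yellow interval, i.e. the vehicle cannot
    clear the intersection before the signal turns red."""
    if speed_mps <= 0:
        return False
    return distance_m / speed_mps > yellow_s
```

    A vehicle 60 m away at 15 m/s needs 4 s, so with a 3 s yellow it would trigger the "The red light is impending. Please decelerate!" message.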

  10. Space Shuttle Orbiter audio subsystem. [to communication and tracking system

    NASA Technical Reports Server (NTRS)

    Stewart, C. H.

    1978-01-01

    The selection of the audio multiplex control configuration for the Space Shuttle Orbiter audio subsystem is discussed, with special attention given to the evaluation criteria of cost, weight, and complexity. The specifications and design of the subsystem are described, with detail given to the configurations of the audio terminal unit (ATU) and audio central control unit (ACCU). The audio input from the ACCU, at a nominal signal level of -12.2 to 14.8 dBV at 1 kHz, was found to have a balanced source impedance and a balanced load impedance of 6000 ± 600 ohms at 1 kHz, DC isolated. The Lyndon B. Johnson Space Center (JSC) electroacoustic test laboratory, an audio engineering facility consisting of a collection of acoustic test chambers, analyzed problems of speaker and headset performance, multiplexed control data coupled with audio channels, and the effects of Orbiter cabin acoustics on the operational performance of voice communications. This work allowed technical management and project engineering to address key constraining issues affecting the subsystem development, such as identifying design deficiencies of the headset interface unit and assessing voice communication performance in the Orbiter cabin.

  11. Spatialized audio improves call sign recognition during multi-aircraft control.

    PubMed

    Kim, Sungbin; Miller, Michael E; Rusnock, Christina F; Elshaw, John J

    2018-07-01

    We investigated the impact of a spatialized audio display on response time, workload, and accuracy while monitoring auditory information for relevance. The human ability to differentiate sound direction implies that spatial audio may be used to encode information. Therefore, it is hypothesized that spatial audio cues can be applied to aid differentiation of critical versus noncritical verbal auditory information. We used a human performance model and a laboratory study involving 24 participants to examine the effect of applying a notional, automated parser to present audio in a particular ear depending on information relevance. Operator workload and performance were assessed while subjects listened for and responded to relevant audio cues associated with critical information among additional noncritical information. Encoding relevance through spatial location in a spatial audio display system--as opposed to monophonic, binaural presentation--significantly reduced response time and workload, particularly for noncritical information. Future auditory displays employing spatial cues to indicate relevance have the potential to reduce workload and improve operator performance in similar task domains. Furthermore, these displays have the potential to reduce the dependence of workload and performance on the number of audio cues. Published by Elsevier Ltd.
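    Encoding relevance by apparent location, as in the display studied here, amounts in the simplest case to routing each message wholly to one ear. A deliberately minimal sketch (a real spatialized display would use head-related filtering rather than hard panning, and the routing rule is an assumption for illustration):

```python
def route_message(samples, critical):
    """Pan a mono message entirely to one stereo channel: critical
    traffic to the left ear, routine traffic to the right, so that
    relevance is encoded by apparent location."""
    return [(s, 0.0) if critical else (0.0, s) for s in samples]
```
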

  12. Concurrent emotional pictures modulate temporal order judgments of spatially separated audio-tactile stimuli.

    PubMed

    Jia, Lina; Shi, Zhuanghua; Zang, Xuelian; Müller, Hermann J

    2013-11-06

    Although attention can be captured by high-arousal stimuli, little is known about how perceiving emotion in one modality influences the temporal processing of non-emotional stimuli in other modalities. We addressed this issue by presenting observers with spatially uninformative emotional pictures while they performed an audio-tactile temporal-order judgment (TOJ) task. In Experiment 1, audio-tactile stimuli were presented at the same location straight ahead of the participants, who had to judge "which modality came first?". In Experiments 2 and 3, the audio-tactile stimuli were delivered one to the left and the other to the right side, and participants had to judge "which side came first?". We found both negative and positive high-arousal pictures to significantly bias TOJs towards the tactile and away from the auditory event when the audio-tactile stimuli were spatially separated; by contrast, there was no such bias when the audio-tactile stimuli originated from the same location. To further examine whether this bias is attributable to the emotional meanings conveyed by the pictures or to their high arousal effect, we compared and contrasted the influences of near-body threat vs. remote threat (emotional) pictures on audio-tactile TOJs in Experiment 3. The bias manifested only in the near-body threat condition. Taken together, the findings indicate that visual stimuli conveying meanings of near-body interaction activate a sensorimotor functional link prioritizing the processing of tactile over auditory signals when these signals are spatially separated. In contrast, audio-tactile signals from the same location engender strong crossmodal integration, thus counteracting modality-based attentional shifts induced by the emotional pictures. © 2013 Published by Elsevier B.V.

  13. Cross-modal integration of polyphonic characters in Chinese audio-visual sentences: a MVPA study based on functional connectivity.

    PubMed

    Zhang, Zhengyi; Zhang, Gaoyan; Zhang, Yuanyuan; Liu, Hong; Xu, Junhai; Liu, Baolin

    2017-12-01

    This study aimed to investigate the functional connectivity in the brain during the cross-modal integration of polyphonic characters in Chinese audio-visual sentences. The visual sentences were all semantically reasonable, and the audible pronunciations of the polyphonic characters in the corresponding sentence contexts varied across four conditions. To measure the functional connectivity, correlation, coherence and phase synchronization index (PSI) were used, and then multivariate pattern analysis was performed to detect the consensus functional connectivity patterns. These analyses were confined to the time windows of three event-related potential components, P200, N400 and late positive shift (LPS), to investigate the dynamic changes of the connectivity patterns at different cognitive stages. We found that when differentiating the polyphonic characters with abnormal pronunciations from those with appropriate ones in audio-visual sentences, significant classification results were obtained based on the coherence in the time window of the P200 component, the correlation in the time window of the N400 component, and the coherence and PSI in the time window of the LPS component. Moreover, the spatial distributions in these time windows were also different, with the recruitment of frontal sites in the time window of the P200 component, the frontal-central-parietal regions in the time window of the N400 component, and the central-parietal sites in the time window of the LPS component. These findings demonstrate that the functional interaction mechanisms differ across the stages of audio-visual integration of polyphonic characters.

  14. Multi-channel spatialization systems for audio signals

    NASA Technical Reports Server (NTRS)

    Begault, Durand R. (Inventor)

    1993-01-01

    Synthetic head related transfer functions (HRTF's) for imposing reprogrammable spatial cues to a plurality of audio input signals included, for example, in multiple narrow-band audio communications signals received simultaneously are generated and stored in interchangeable programmable read only memories (PROM's) which store both head related transfer function impulse response data and source positional information for a plurality of desired virtual source locations. The analog inputs of the audio signals are filtered and converted to digital signals from which synthetic head related transfer functions are generated in the form of linear phase finite impulse response filters. The outputs of the impulse response filters are subsequently reconverted to analog signals, filtered, mixed, and fed to a pair of headphones.
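    The core signal path in this patent abstract (digitize the audio input, filter it for each ear with a linear-phase FIR head-related impulse response, then mix) can be sketched as follows. This is an illustrative sketch only: the toy impulse responses below are placeholders, not measured HRTFs, and a real system would load HRIR data from stored memory as described above.

```python
import numpy as np

def spatialize(mono, hrir_left, hrir_right):
    """Render a mono signal to two ear signals by FIR filtering with a
    head-related impulse response (HRIR) per ear."""
    left = np.convolve(mono, hrir_left)
    right = np.convolve(mono, hrir_right)
    return np.stack([left, right])

# Toy HRIRs for a source on the listener's left: the right ear receives
# a delayed, attenuated copy (interaural time and level differences).
hrir_l = np.array([1.0, 0.0, 0.0, 0.0])
hrir_r = np.array([0.0, 0.0, 0.6, 0.0])
mono = np.array([1.0, -0.5, 0.25])
stereo = spatialize(mono, hrir_l, hrir_r)   # shape (2, 6)
```

In a multi-channel system, each narrow-band input would be filtered with the HRIR pair for its assigned virtual source location, and the per-ear outputs summed before conversion back to analog.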

  15. Multi-channel spatialization system for audio signals

    NASA Technical Reports Server (NTRS)

    Begault, Durand R. (Inventor)

    1995-01-01

    Synthetic head related transfer functions (HRTF's) for imposing reprogrammable spatial cues to a plurality of audio input signals included, for example, in multiple narrow-band audio communications signals received simultaneously are generated and stored in interchangeable programmable read only memories (PROM's) which store both head related transfer function impulse response data and source positional information for a plurality of desired virtual source locations. The analog inputs of the audio signals are filtered and converted to digital signals from which synthetic head related transfer functions are generated in the form of linear phase finite impulse response filters. The outputs of the impulse response filters are subsequently reconverted to analog signals, filtered, mixed and fed to a pair of headphones.

  16. The Practical Audio-Visual Handbook for Teachers.

    ERIC Educational Resources Information Center

    Scuorzo, Herbert E.

    The use of audio/visual media as an aid to instruction is a common practice in today's classroom. Most teachers, however, have little or no formal training in this field and rarely a knowledgeable coordinator to help them. "The Practical Audio-Visual Handbook for Teachers" discusses the types and mechanics of many of these media forms and proposes…

  17. Hierarchical vs non-hierarchical audio indexation and classification for video genres

    NASA Astrophysics Data System (ADS)

    Dammak, Nouha; BenAyed, Yassine

    2018-04-01

    In this paper, Support Vector Machines (SVMs) are used for segmenting and indexing video genres based only on audio features extracted at the block level, which has the prominent asset of capturing local temporal information. The main contribution of our study is to show the substantial effect on classification accuracy of using a hierarchical categorization structure based on the Mel Frequency Cepstral Coefficients (MFCC) audio descriptor. The classification covers three common video genres: sports videos, music clips and news scenes. The sub-classification may divide each genre into several multi-speaker and multi-dialect sub-genres. The validation of this approach was carried out on over 360 minutes of video, yielding a classification accuracy of over 99%.

  18. Effect of using different cover image quality to obtain robust selective embedding in steganography

    NASA Astrophysics Data System (ADS)

    Abdullah, Karwan Asaad; Al-Jawad, Naseer; Abdulla, Alan Anwer

    2014-05-01

    One of the common types of steganography is to conceal an image as a secret message in another image, normally called a cover image; the resulting image is called a stego image. The aim of this paper is to investigate the effect of using different cover image qualities, and also to analyse the use of different bit-planes in terms of robustness against well-known active attacks such as gamma, statistical filters, and linear spatial filters. The secret messages are embedded in a higher bit-plane, i.e. other than the Least Significant Bit (LSB), in order to resist active attacks. The embedding process is performed in three major steps: first, the embedding algorithm selectively identifies useful areas (blocks) for embedding based on their lighting condition; second, the most useful blocks are nominated for embedding based on their entropy and average; third, the right bit-plane is selected for embedding. This kind of block selection makes the embedding process scatter the secret message(s) randomly around the cover image. Different tests have been performed for selecting a proper block size, which is related to the nature of the cover image used. Our proposed method suggests a suitable embedding bit-plane as well as the right blocks for the embedding. Experimental results demonstrate that the image quality used for the cover images has an effect when the stego image is attacked by different active attacks. Although the secret messages are embedded in a higher bit-plane, they cannot be recognised visually within the stego image.
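    The three-step selective embedding described above can be sketched in code. This is a minimal illustration, not the authors' exact method: the 8x8 block size, the use of entropy alone as the nomination criterion, the choice of bit-plane 2, and the one-bit-per-block capacity are all simplifying assumptions.

```python
import numpy as np

def block_entropy(block):
    """Shannon entropy of an 8-bit grayscale block."""
    hist = np.bincount(block.ravel(), minlength=256).astype(float)
    p = hist / hist.sum()
    p = p[p > 0]
    return -np.sum(p * np.log2(p))

def embed_bit(block, bit, plane=2):
    """Embed one secret bit into a chosen bit-plane (0 = LSB) of a block's first pixel."""
    out = block.copy()
    out[0, 0] = (out[0, 0] & ~np.uint8(1 << plane)) | np.uint8(bit << plane)
    return out

def extract_bit(block, plane=2):
    """Read the embedded bit back out of the chosen bit-plane."""
    return int((block[0, 0] >> plane) & 1)

# Toy run: rank 8x8 blocks of a cover image by entropy and embed in the best ones.
# (A real decoder must reproduce the same block ranking to stay synchronized.)
rng = np.random.default_rng(0)
cover = rng.integers(0, 256, size=(64, 64), dtype=np.uint8)
blocks = [(r, c) for r in range(0, 64, 8) for c in range(0, 64, 8)]
ranked = sorted(blocks, key=lambda rc: -block_entropy(cover[rc[0]:rc[0]+8, rc[1]:rc[1]+8]))
secret = [1, 0, 1, 1]
for (r, c), bit in zip(ranked, secret):
    cover[r:r+8, c:c+8] = embed_bit(cover[r:r+8, c:c+8], bit, plane=2)
recovered = [extract_bit(cover[r:r+8, c:c+8], plane=2) for (r, c) in ranked[:len(secret)]]
```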

  19. Talker variability in audio-visual speech perception

    PubMed Central

    Heald, Shannon L. M.; Nusbaum, Howard C.

    2014-01-01

    A change in talker is a change in the context for the phonetic interpretation of acoustic patterns of speech. Different talkers have different mappings between acoustic patterns and phonetic categories and listeners need to adapt to these differences. Despite this complexity, listeners are adept at comprehending speech in multiple-talker contexts, albeit at a slight but measurable performance cost (e.g., slower recognition). So far, this talker variability cost has been demonstrated only in audio-only speech. Other research in single-talker contexts has shown, however, that when listeners are able to see a talker’s face, speech recognition is improved under adverse listening (e.g., noise or distortion) conditions that can increase uncertainty in the mapping between acoustic patterns and phonetic categories. Does seeing a talker’s face reduce the cost of word recognition in multiple-talker contexts? We used a speeded word-monitoring task in which listeners make quick judgments about target word recognition in single- and multiple-talker contexts. Results show faster recognition performance in single-talker conditions compared to multiple-talker conditions for both audio-only and audio-visual speech. However, recognition time in a multiple-talker context was slower in the audio-visual condition compared to the audio-only condition. These results suggest that seeing a talker’s face during speech perception may slow recognition by increasing the importance of talker identification, signaling to the listener a change in talker has occurred. PMID:25076919

  20. Talker variability in audio-visual speech perception.

    PubMed

    Heald, Shannon L M; Nusbaum, Howard C

    2014-01-01

    A change in talker is a change in the context for the phonetic interpretation of acoustic patterns of speech. Different talkers have different mappings between acoustic patterns and phonetic categories and listeners need to adapt to these differences. Despite this complexity, listeners are adept at comprehending speech in multiple-talker contexts, albeit at a slight but measurable performance cost (e.g., slower recognition). So far, this talker variability cost has been demonstrated only in audio-only speech. Other research in single-talker contexts has shown, however, that when listeners are able to see a talker's face, speech recognition is improved under adverse listening (e.g., noise or distortion) conditions that can increase uncertainty in the mapping between acoustic patterns and phonetic categories. Does seeing a talker's face reduce the cost of word recognition in multiple-talker contexts? We used a speeded word-monitoring task in which listeners make quick judgments about target word recognition in single- and multiple-talker contexts. Results show faster recognition performance in single-talker conditions compared to multiple-talker conditions for both audio-only and audio-visual speech. However, recognition time in a multiple-talker context was slower in the audio-visual condition compared to the audio-only condition. These results suggest that seeing a talker's face during speech perception may slow recognition by increasing the importance of talker identification, signaling to the listener a change in talker has occurred.

  1. When patients take the initiative to audio-record a clinical consultation.

    PubMed

    van Bruinessen, Inge Renske; Leegwater, Brigit; van Dulmen, Sandra

    2017-08-01

    To gain insight into healthcare professionals' current experience with, and views on, consultation audio-recordings made on patients' initiative, 215 Dutch healthcare professionals (123 physicians and 92 nurses) working in oncology care completed a survey inquiring about their experiences and views. 71% of the respondents had experience with consultation audio-recordings. Healthcare professionals who are in favour of the use of audio-recordings seem to embrace the evidence-based benefits for patients of listening back to a consultation, and mention the positive influence on their patients. Opposing arguments relate to the belief that it is confusing for patients or that it increases the chance that information is misinterpreted. Respondents also mentioned the lack of control they have over the recording (fear of misuse), uncertainty about its medico-legal status, an inhibiting influence on the communication process, and feelings of distrust. For almost one quarter of respondents, these arguments and concerns were reason enough not to cooperate at all (9%), to cooperate only in certain cases (4%), or led to doubts about cooperation (9%). The many concerns that exist among healthcare professionals need to be tackled in order to increase transparency, as audio-recordings are expected to be used increasingly. Copyright © 2017 Elsevier B.V. All rights reserved.

  2. Neuromorphic audio-visual sensor fusion on a sound-localizing robot.

    PubMed

    Chan, Vincent Yue-Sek; Jin, Craig T; van Schaik, André

    2012-01-01

    This paper presents the first robotic system featuring audio-visual (AV) sensor fusion with neuromorphic sensors. We combine a pair of silicon cochleae and a silicon retina on a robotic platform to allow the robot to learn sound localization through self-motion and visual feedback, using an adaptive ITD-based sound localization algorithm. After training, the robot can localize sound sources (white or pink noise) in a reverberant environment with an RMS error of 4-5° in azimuth. We also investigate the AV source binding problem, and an experiment is conducted to test the effectiveness of matching an audio event with a corresponding visual event based on their onset time. Despite the simplicity of this method and a large number of false visual events in the background, a correct match can be made 75% of the time during the experiment.

  3. Audio-Tutorial Instruction: A Strategy For Teaching Introductory College Geology.

    ERIC Educational Resources Information Center

    Fenner, Peter; Andrews, Ted F.

    The rationale of audio-tutorial instruction is discussed, and the history and development of the audio-tutorial botany program at Purdue University is described. Audio-tutorial programs in geology at eleven colleges and one school are described, illustrating several ways in which programs have been developed and integrated into courses. Programs…

  4. 47 CFR 10.520 - Common audio attention signal.

    Code of Federal Regulations, 2010 CFR

    2010-10-01

    ... 47 Telecommunication 1 2010-10-01 2010-10-01 false Common audio attention signal. 10.520 Section 10.520 Telecommunication FEDERAL COMMUNICATIONS COMMISSION GENERAL COMMERCIAL MOBILE ALERT SYSTEM Equipment Requirements § 10.520 Common audio attention signal. A Participating CMS Provider and equipment...

  5. Synchronized and noise-robust audio recordings during realtime magnetic resonance imaging scans.

    PubMed

    Bresch, Erik; Nielsen, Jon; Nayak, Krishna; Narayanan, Shrikanth

    2006-10-01

    This letter describes a data acquisition setup for recording and processing running speech from a person in a magnetic resonance imaging (MRI) scanner. The main focus is on ensuring synchronicity between image and audio acquisition, and on obtaining a good signal-to-noise ratio to facilitate further speech analysis and modeling. A field-programmable gate array (FPGA) based hardware design for synchronizing the scanner image acquisition to other external data such as audio is described. The audio setup itself features two fiber-optical microphones and a noise-canceling filter. Two noise cancellation methods are described, including a novel approach using a pulse-sequence-specific model of the gradient noise of the MRI scanner. The setup is useful for scientific speech production studies. Sample results of speech and singing data acquired and processed using the proposed method are given.
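    As a rough illustration of the noise-cancellation idea (a generic LMS adaptive canceller, not the authors' pulse-sequence-specific gradient-noise model), the filter learns how reference noise leaks into the primary channel and subtracts its estimate:

```python
import numpy as np

def lms_cancel(primary, reference, taps=4, mu=0.05):
    """Least-mean-squares adaptive noise canceller: model the noise in the
    primary channel as an FIR-filtered version of the reference channel
    and output the prediction error (the cleaned signal)."""
    w = np.zeros(taps)
    out = np.zeros(len(primary))
    for n in range(taps - 1, len(primary)):
        x = reference[n - taps + 1:n + 1][::-1]  # ref[n], ref[n-1], ...
        e = primary[n] - w @ x                   # error = cleaned sample
        w += 2 * mu * e * x                      # stochastic gradient update
        out[n] = e
    return out

# Synthetic check: here the "primary" channel is pure filtered reference
# noise, so a converged canceller should drive the output towards zero.
rng = np.random.default_rng(1)
ref = rng.standard_normal(4000)
noisy = 0.5 * ref + 0.3 * np.concatenate(([0.0], ref[:-1]))
clean = lms_cancel(noisy, ref)
```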

  6. National Information Systems Security Conference (19th) held in Baltimore, Maryland on October 22-25, 1996. Volume 1

    DTIC Science & Technology

    1996-10-25

    been demonstrated that steganography is ineffective when images are stored using this compression algorithm [2]. Difficulty in designing a general... Despite the relative ease of employing steganography to covertly transport data in an uncompressed 24-bit image, lossy compression algorithms based on... image, the security threat that steganography poses cannot be completely eliminated by application of a transform-based lossy compression algorithm

  7. Improvements of ModalMax High-Fidelity Piezoelectric Audio Device

    NASA Technical Reports Server (NTRS)

    Woodard, Stanley E.

    2005-01-01

    ModalMax audio speakers have been enhanced by innovative means of tailoring the vibration response of thin piezoelectric plates to produce a high-fidelity audio response. The ModalMax audio speakers are 1 mm in thickness. The device completely supplants the need for a separate driver and speaker cone. ModalMax speakers can serve the same applications as cone speakers, but unlike cone speakers, they can function in harsh environments such as high humidity or extreme wetness. New design features allow the speakers to be completely submersed in salt water, making them well suited for maritime applications. The sound produced by the ModalMax audio speakers has spatial resolution that is readily discernible for headset users.

  8. Audio signal encryption using chaotic Hénon map and lifting wavelet transforms

    NASA Astrophysics Data System (ADS)

    Roy, Animesh; Misra, A. P.

    2017-12-01

    We propose an audio signal encryption scheme based on the chaotic Hénon map. The scheme mainly comprises two phases: a preprocessing stage, in which the audio signal is transformed into data by the lifting wavelet scheme, and an encryption stage, in which the transformed data are encrypted using a chaotic data set and hyperbolic functions. Furthermore, we use dynamic keys and consider the key space to be large enough to resist any kind of cryptographic attack. A statistical investigation is also made to test the security and efficiency of the proposed scheme.
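    The encryption phase can be illustrated with a toy stream cipher driven by the Hénon map. This is a simplified sketch only: the byte-extraction rule and plain XOR combination below are assumptions, and the paper's lifting-wavelet transform and hyperbolic functions are omitted.

```python
import numpy as np

def henon_keystream(n, x0=0.1, y0=0.3, a=1.4, b=0.3):
    """Iterate the chaotic Henon map and quantize each state to a byte."""
    x, y = x0, y0
    out = np.empty(n, dtype=np.uint8)
    for i in range(n):
        x, y = 1.0 - a * x * x + y, b * x
        out[i] = int(abs(x) * 1e6) % 256   # crude byte extraction (assumption)
    return out

def xor_crypt(data, key):
    """Symmetric XOR stream cipher; the initial state (x0, y0) is the key."""
    ks = henon_keystream(len(data), x0=key[0], y0=key[1])
    return np.bitwise_xor(data, ks)

samples = np.array([10, 200, 33, 97, 150], dtype=np.uint8)  # stand-in audio bytes
cipher = xor_crypt(samples, key=(0.12, 0.3))
plain = xor_crypt(cipher, key=(0.12, 0.3))   # same key decrypts
```

Because XOR is its own inverse, applying the same keystream twice recovers the original samples; sensitivity to the initial state is what makes the chaotic map attractive as a key generator.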

  9. Low-delay predictive audio coding for the HIVITS HDTV codec

    NASA Astrophysics Data System (ADS)

    McParland, A. K.; Gilchrist, N. H. C.

    1995-01-01

    The status of work relating to predictive audio coding, as part of the European project on High Quality Video Telephone and HD(TV) Systems (HIVITS), is reported. The predictive coding algorithm is developed, along with six-channel audio coding and decoding hardware. Demonstrations of the audio codec operating in conjunction with the video codec are given.

  10. Huffman coding in advanced audio coding standard

    NASA Astrophysics Data System (ADS)

    Brzuchalski, Grzegorz

    2012-05-01

    This article presents several hardware architectures of the Advanced Audio Coding (AAC) Huffman noiseless encoder, its optimisations and a working implementation. Much attention has been paid to optimising the demand on hardware resources, especially memory size. The aim of the design was to produce as short a binary stream as possible in this standard. The Huffman encoder, together with the whole audio-video system, has been implemented in FPGA devices.
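    For reference, the noiseless coding stage rests on Huffman's construction of a minimum-redundancy prefix code. The sketch below builds a code from symbol frequencies; note this is a generic illustration, since real AAC uses fixed, pre-defined spectral codebooks rather than codes derived from the signal.

```python
import heapq
from collections import Counter

def huffman_code(symbols):
    """Build a minimum-redundancy prefix code from symbol frequencies."""
    freq = Counter(symbols)
    # Heap entries: (frequency, tie-breaker, {symbol: partial codeword}).
    heap = [(f, i, {s: ""}) for i, (s, f) in enumerate(freq.items())]
    heapq.heapify(heap)
    tiebreak = len(heap)
    while len(heap) > 1:
        f1, _, c1 = heapq.heappop(heap)   # merge the two rarest subtrees
        f2, _, c2 = heapq.heappop(heap)
        merged = {s: "0" + w for s, w in c1.items()}
        merged.update({s: "1" + w for s, w in c2.items()})
        heapq.heappush(heap, (f1 + f2, tiebreak, merged))
        tiebreak += 1
    return heap[0][2]

quantized = [0, 0, 0, 1, 1, 2]            # stand-in for quantized spectral values
code = huffman_code(quantized)
bitstream = "".join(code[s] for s in quantized)
```

Frequent symbols get shorter codewords, which is exactly what shortens the binary stream the article's hardware encoder emits.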

  11. Collusion-Resistant Audio Fingerprinting System in the Modulated Complex Lapped Transform Domain

    PubMed Central

    Garcia-Hernandez, Jose Juan; Feregrino-Uribe, Claudia; Cumplido, Rene

    2013-01-01

    The collusion-resistant fingerprinting paradigm seems to be a practical solution to the piracy problem, as it allows media owners to detect any unauthorized copy and trace it back to the dishonest users. Despite the billion-dollar losses in the music industry, most collusion-resistant fingerprinting systems are devoted to digital images and very few to audio signals. In this paper, state-of-the-art collusion-resistant fingerprinting ideas are extended to audio signals and the corresponding parameters and operation conditions are proposed. Moreover, in order to carry out fingerprint detection using just a fraction of the pirate audio clip, block-based embedding and its corresponding detector are proposed. Extensive simulations show the robustness of the proposed system against the average collusion attack. Moreover, by using an efficient Fast Fourier Transform core and standard computers, it is shown that the proposed system is suitable for real-world scenarios. PMID:23762455

  12. Audre's daughter: Black lesbian steganography in Dee Rees' Pariah and Audre Lorde's Zami: A New Spelling of My Name.

    PubMed

    Kang, Nancy

    2016-01-01

    This article argues that African-American director Dee Rees' critically acclaimed debut Pariah (2011) is a rewriting of lesbian poet-activist Audre Lorde's iconic "bio-mythography" Zami: A New Spelling of My Name (1982). The article examines how Rees' work creatively and subtly re-envisions Lorde's Zami by way of deeply rooted and often cleverly camouflaged patterns, resonances, and contrasts. Shared topics include naming, mother-daughter bonds, the role of clothing in identity formation, domestic abuse, queer time, and lesbian, gay, bisexual, and transgender legacy discourse construction. What emerges between the visual and written texts is a hidden language of connection--what may be termed Black lesbian steganography--which proves thought-provoking to viewers and readers alike.

  13. Sounding ruins: reflections on the production of an ‘audio drift’

    PubMed Central

    Gallagher, Michael

    2014-01-01

    This article is about the use of audio media in researching places, which I term ‘audio geography’. The article narrates some episodes from the production of an ‘audio drift’, an experimental environmental sound work designed to be listened to on a portable MP3 player whilst walking in a ruinous landscape. Reflecting on how this work functions, I argue that, as well as representing places, audio geography can shape listeners’ attention and bodily movements, thereby reworking places, albeit temporarily. I suggest that audio geography is particularly apt for amplifying the haunted and uncanny qualities of places. I discuss some of the issues raised for research ethics, epistemology and spectral geographies. PMID:29708107

  14. 47 CFR 73.9005 - Compliance requirements for covered demodulator products: Audio.

    Code of Federal Regulations, 2010 CFR

    2010-10-01

    ... products: Audio. 73.9005 Section 73.9005 Telecommunication FEDERAL COMMUNICATIONS COMMISSION (CONTINUED....9005 Compliance requirements for covered demodulator products: Audio. Except as otherwise provided in §§ 73.9003(a) or 73.9004(a), covered demodulator products shall not output the audio portions of...

  15. Design and Implementation of a Video-Zoom Driven Digital Audio-Zoom System for Portable Digital Imaging Devices

    NASA Astrophysics Data System (ADS)

    Park, Nam In; Kim, Seon Man; Kim, Hong Kook; Kim, Ji Woon; Kim, Myeong Bo; Yun, Su Won

    In this paper, we propose a video-zoom driven audio-zoom algorithm in order to provide audio zooming effects in accordance with the degree of video zoom. The proposed algorithm is designed around a super-directive beamformer operating with a 4-channel microphone system, in conjunction with a soft masking process that considers the phase differences between microphones. The audio-zoom processed signal is then obtained by multiplying the masked signal by an audio gain derived from the video-zoom level. Finally, a real-time audio-zoom system is implemented on an ARM Cortex-A8 with a clock speed of 600 MHz, after optimizations at several levels (algorithmic, C-code, and memory) are performed. To evaluate the complexity of the proposed real-time audio-zoom system, 21.3 seconds of test data sampled at 48 kHz is used. The experiments show that the processing time for the proposed audio-zoom system occupies 14.6% or less of the ARM clock cycles. Experimental results obtained in a semi-anechoic chamber also show that the signal from the front direction can be amplified by approximately 10 dB relative to the other directions.
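    The final gain stage (an audio gain derived from the video-zoom level, applied to the beamformed and masked signal) might look like the following sketch. The zoom-to-gain mapping, the 4x maximum zoom, and the 10 dB gain ceiling are illustrative assumptions, not parameters from the paper.

```python
import numpy as np

def audio_zoom_gain(zoom_level, max_zoom=4.0, max_gain_db=10.0):
    """Map a video-zoom level (1.0 = widest) to a linear audio gain.
    The linear-in-dB mapping and the 10 dB ceiling are assumptions."""
    frac = np.clip((zoom_level - 1.0) / (max_zoom - 1.0), 0.0, 1.0)
    return 10.0 ** (frac * max_gain_db / 20.0)

# Apply the zoom-dependent gain to one (beamformed, masked) audio frame.
frame = np.array([0.10, -0.20, 0.05])
zoomed = frame * audio_zoom_gain(3.0)
```

At the widest zoom the gain is unity, and it rises smoothly toward the ceiling as the camera zooms in, so the perceived loudness of the front source tracks the picture.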

  16. Audio-visual integration through the parallel visual pathways.

    PubMed

    Kaposvári, Péter; Csete, Gergő; Bognár, Anna; Csibri, Péter; Tóth, Eszter; Szabó, Nikoletta; Vécsei, László; Sáry, Gyula; Tamás Kincses, Zsigmond

    2015-10-22

    Audio-visual integration has been shown to be present in a wide range of different conditions, some of which are processed through the dorsal, and others through the ventral visual pathway. Whereas neuroimaging studies have revealed integration-related activity in the brain, there has been no imaging study of the possible role of segregated visual streams in audio-visual integration. We set out to determine how the different visual pathways participate in this communication. We investigated how audio-visual integration can be supported through the dorsal and ventral visual pathways during the double flash illusion. Low-contrast and chromatic isoluminant stimuli were used to preferentially drive the dorsal and ventral pathways, respectively. In order to identify the anatomical substrates of the audio-visual interaction in the two conditions, the psychophysical results were correlated with the white matter integrity as measured by diffusion tensor imaging. The psychophysical data revealed a robust double flash illusion in both conditions. A correlation between the psychophysical results and local fractional anisotropy was found in the occipito-parietal white matter in the low-contrast condition, while a similar correlation was found in the infero-temporal white matter in the chromatic isoluminant condition. Our results indicate that both of the parallel visual pathways may play a role in the audio-visual interaction. Copyright © 2015. Published by Elsevier B.V.

  17. The Black Record: A Selective Discography of Afro-Americana on Audio Discs Held by the Audio/Visual Department, John M. Olin Library.

    ERIC Educational Resources Information Center

    Dain, Bernice, Comp.; Nevin, David, Comp.

    The present revised and expanded edition of this document is an inclusive cumulation. A few items have been included which are on order as new to the collection or as replacements. This discography is intended to serve primarily as a local user's guide. The call number preceding each entry is based on the Audio-Visual Department's own, unique…

  18. A review of lossless audio compression standards and algorithms

    NASA Astrophysics Data System (ADS)

    Muin, Fathiah Abdul; Gunawan, Teddy Surya; Kartiwi, Mira; Elsheikh, Elsheikh M. A.

    2017-09-01

    Over the years, lossless audio compression has gained popularity as researchers and businesses have become more aware of the need for better quality and the growth in storage demand. This paper analyses various lossless audio coding algorithms and standards that are used and available in the market, focusing on Linear Predictive Coding (LPC) due to its popularity and robustness in audio compression; nevertheless, other prediction methods are compared to verify this. Advanced representations of LPC, such as LSP decomposition techniques, are also discussed within this paper.
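    The principle behind LPC-based lossless coding can be shown with the simplest fixed predictor: transmit the prediction residual, which is exactly invertible. Real codecs fit higher-order coefficients per frame and entropy-code the residual (e.g., with Rice codes), both omitted in this sketch.

```python
import numpy as np

def encode_residual(samples):
    """Fixed first-order predictor: residual[n] = x[n] - x[n-1]."""
    res = np.empty_like(samples)
    res[0] = samples[0]
    res[1:] = samples[1:] - samples[:-1]
    return res

def decode_residual(res):
    """Exact inverse of the predictor, so the round trip is lossless."""
    return np.cumsum(res)

pcm = np.array([100, 102, 105, 104, 104, 101], dtype=np.int32)
res = encode_residual(pcm)      # small values, cheaper to entropy-code
out = decode_residual(res)      # bit-exact reconstruction
```

For correlated audio the residual values cluster near zero, so a subsequent entropy coder spends far fewer bits on them than on the raw samples.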

  19. Teaching Audio Playwriting: The Pedagogy of Drama Podcasting

    ERIC Educational Resources Information Center

    Eshelman, David J.

    2016-01-01

    This article suggests how teaching artists can develop practical coursework in audio playwriting. To prepare students to work in the reemergent audio drama medium, the author created a seminar course called Radio Theatre Writing, taught at Arkansas Tech University in the fall of 2014. The course had three sections. First, it focused on…

  20. Dynamic and scalable audio classification by collective network of binary classifiers framework: an evolutionary approach.

    PubMed

    Kiranyaz, Serkan; Mäkinen, Toni; Gabbouj, Moncef

    2012-10-01

    In this paper, we propose a novel framework based on a collective network of evolutionary binary classifiers (CNBC) to address the problems of feature and class scalability. The main goal of the proposed framework is to achieve a high classification performance over dynamic audio and video repositories. The proposed framework adopts a "Divide and Conquer" approach in which an individual network of binary classifiers (NBC) is allocated to discriminate each audio class. An evolutionary search is applied to find the best binary classifier in each NBC with respect to a given criterion. Through the incremental evolution sessions, the CNBC framework can dynamically adapt to each new incoming class or feature set without resorting to a full-scale re-training or re-configuration. Therefore, the CNBC framework is particularly designed for dynamically varying databases where no conventional static classifiers can adapt to such changes. In short, it is entirely a novel topology, an unprecedented approach for dynamic, content/data adaptive and scalable audio classification. A large set of audio features can be effectively used in the framework, where the CNBCs make appropriate selections and combinations so as to achieve the highest discrimination among individual audio classes. Experiments demonstrate a high classification accuracy (above 90%) and efficiency of the proposed framework over large and dynamic audio databases. Copyright © 2012 Elsevier Ltd. All rights reserved.

  1. Digital Audio Radio Broadcast Systems Laboratory Testing Nearly Complete

    NASA Technical Reports Server (NTRS)

    2005-01-01

    Radio history continues to be made at the NASA Lewis Research Center with the completion of phase one of the digital audio radio (DAR) testing conducted by the Consumer Electronics Group of the Electronic Industries Association. This satellite, satellite/terrestrial, and terrestrial digital technology will open up new audio broadcasting opportunities both domestically and worldwide. It will significantly improve the current quality of amplitude-modulated/frequency-modulated (AM/FM) radio with a new digitally modulated radio signal and will introduce true compact-disc-quality (CD-quality) sound for the first time. Lewis is hosting the laboratory testing of seven proposed digital audio radio systems and modes. Two of the proposed systems operate in two modes each, making a total of nine systems being tested. The nine systems are divided into the following types of transmission: in-band on-channel (IBOC), in-band adjacent-channel (IBAC), and new bands. The laboratory testing was conducted by the Consumer Electronics Group of the Electronic Industries Association. Subjective assessments of the audio recordings for each of the nine systems was conducted by the Communications Research Center in Ottawa, Canada, under contract to the Electronic Industries Association. The Communications Research Center has the only CCIR-qualified (Consultative Committee for International Radio) audio testing facility in North America. The main goals of the U.S. testing process are to (1) provide technical data to the Federal Communication Commission (FCC) so that it can establish a standard for digital audio receivers and transmitters and (2) provide the receiver and transmitter industries with the proper standards upon which to build their equipment. In addition, the data will be forwarded to the International Telecommunications Union to help in the establishment of international standards for digital audio receivers and transmitters, thus allowing U.S. manufacturers to compete in the

  2. Audio teleconferencing: creative use of a forgotten innovation.

    PubMed

    Mather, Carey; Marlow, Annette

    2012-06-01

    As part of a regional School of Nursing and Midwifery's commitment to addressing recruitment and retention issues, approximately 90% of second year undergraduate student nurses undertake clinical placements at multipurpose centres, regional or district hospitals, aged care facilities, or community centres based in rural and remote regions within the State. The remaining 10% undertake professional experience placement in urban areas only. The placement of this large cohort of students, in low numbers across a variety of clinical settings, initiated the need to provide consistent support to both students and staff at these facilities. Subsequently, an audio teleconferencing model of clinical facilitation was developed to guide student teaching and learning and to provide support to registered nurse preceptors in clinical practice. This paper draws on Weimer's 'Personal Accounts of Change' approach to describe, discuss and evaluate the modifications that have occurred since the inception of this audio teleconferencing model (Weimer, 2006).

  3. Readings on American Society. The Audio-Lingual Literary Series II.

    ERIC Educational Resources Information Center

    Imamura, Shigeo; Ney, James W.

    This text contains 11 lessons based on an adaptation of the 1964 essay "Automation: Road to Lifetime Jobs" by A.H. Raskin and 14 lessons based on an adaptation of John Fischer's 1948 essay "Unwritten Rules of American Politics." The format of the book and the lessons is the same as that of the other volumes of "The Audio-Lingual Literary Series."…

  4. Audio Recording for Independent Confirmation of Clinical Assessments in Generalized Anxiety Disorder.

    PubMed

    Targum, Steven D; Murphy, Christopher; Khan, Jibran; Zumpano, Laura; Whitlock, Mark; Simen, Arthur A; Binneman, Brendon

    2018-04-01

    Objective: The assessment of patients with generalized anxiety disorder (GAD) to determine whether a medication intervention is necessary is not always clear and might benefit from a second opinion. However, second opinions are time consuming, expensive, and not practical in most settings. We obtained independent, second opinion reviews of the primary clinician's assessment via audio-digital recording. Design: An audio-digital recording of key site-based assessments was used to generate site-independent "dual" reviews of the clinical presentation, symptom severity, and medication requirements of patients with GAD as part of the screening procedures for a clinical trial (ClinicalTrials.gov: NCT02310568). Results: Site-independent reviewers affirmed the diagnosis, symptom severity metrics, and treatment requirements of 90 moderately ill patients with GAD. The patients endorsed excessive worry that was hard to control and essentially all six of the associated DSM-IV-TR anxiety symptoms. The Hamilton Rating Scale for Anxiety scores revealed moderately severe anxiety with a high Pearson's correlation (r = 0.852) between site-based and independent raters and minimal scoring discordance on each scale item. Based upon their independent reviews, these "second" opinions confirmed that these GAD patients warranted a new medication intervention. Thirty patients (33.3%) reported a previous history of a major depressive episode (MDE) and had significantly more depressive symptoms than patients without a history of MDE. Conclusion: The audio-digital recording method provides a useful second opinion that can affirm the need for a different treatment intervention in these anxious patients. A second live assessment would have required additional clinic time and added patient burden. The audio-digital recording method is less burdensome than live second opinion assessments and might have utility in both research and clinical practice settings.

  5. Feature Representations for Neuromorphic Audio Spike Streams.

    PubMed

    Anumula, Jithendar; Neil, Daniel; Delbruck, Tobi; Liu, Shih-Chii

    2018-01-01

    Event-driven neuromorphic spiking sensors such as the silicon retina and the silicon cochlea encode the external sensory stimuli as asynchronous streams of spikes across different channels or pixels. Combining state-of-the-art deep neural networks with the asynchronous outputs of these sensors has produced encouraging results on some datasets but remains challenging. While the lack of effective spiking networks to process the spike streams is one reason, the other reason is that the pre-processing methods required to convert the spike streams to frame-based features needed for the deep networks still require further investigation. This work investigates the effectiveness of synchronous and asynchronous frame-based features generated using spike count and constant event binning in combination with the use of a recurrent neural network for solving a classification task using the N-TIDIGITS18 dataset. This spike-based dataset consists of recordings from the Dynamic Audio Sensor, a spiking silicon cochlea sensor, in response to the TIDIGITS audio dataset. We also propose a new pre-processing method which applies an exponential kernel on the output cochlea spikes so that the interspike timing information is better preserved. The results from the N-TIDIGITS18 dataset show that the exponential features perform better than the spike count features, with over 91% accuracy on the digit classification task. This accuracy corresponds to an improvement of at least 2.5% over the use of spike count features, establishing a new state of the art for this dataset.
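    The exponential-kernel pre-processing described in this abstract can be pictured as a per-channel, exponentially decaying trace that each spike bumps upward and that is sampled at fixed frame boundaries, so interspike intervals survive the framing. The sketch below is a generic illustration under assumed parameters (`frame_len`, `tau`, and the unit-increment kernel are illustrative choices, not the paper's exact normalization):

```python
import numpy as np

def exponential_features(spike_times, spike_channels, n_channels,
                         frame_len=0.005, tau=0.01, duration=None):
    """Per-channel exponentially decaying trace sampled at fixed frames.

    Each spike adds 1.0 to its channel's trace; between events the trace
    decays as exp(-dt / tau), so interspike timing is preserved in the
    resulting (n_frames, n_channels) feature matrix.
    """
    if duration is None:
        duration = float(spike_times.max()) + frame_len
    n_frames = int(np.ceil(duration / frame_len))
    feats = np.zeros((n_frames, n_channels))
    trace = np.zeros(n_channels)
    last_t = 0.0
    order = np.argsort(spike_times)          # process spikes in time order
    times, chans = spike_times[order], spike_channels[order]
    idx = 0
    for f in range(n_frames):
        t_end = (f + 1) * frame_len
        while idx < len(times) and times[idx] <= t_end:
            trace *= np.exp(-(times[idx] - last_t) / tau)  # decay to spike
            last_t = times[idx]
            trace[chans[idx]] += 1.0                        # kernel impulse
            idx += 1
        # decay the trace to the frame boundary and record it
        feats[f] = trace * np.exp(-(t_end - last_t) / tau)
    return feats
```

    A spike-count feature, by contrast, would simply histogram `spike_times` per frame and channel, discarding where inside the frame each spike fell.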

  6. Feature Representations for Neuromorphic Audio Spike Streams

    PubMed Central

    Anumula, Jithendar; Neil, Daniel; Delbruck, Tobi; Liu, Shih-Chii

    2018-01-01

    Event-driven neuromorphic spiking sensors such as the silicon retina and the silicon cochlea encode the external sensory stimuli as asynchronous streams of spikes across different channels or pixels. Combining state-of-the-art deep neural networks with the asynchronous outputs of these sensors has produced encouraging results on some datasets but remains challenging. While the lack of effective spiking networks to process the spike streams is one reason, the other reason is that the pre-processing methods required to convert the spike streams to frame-based features needed for the deep networks still require further investigation. This work investigates the effectiveness of synchronous and asynchronous frame-based features generated using spike count and constant event binning in combination with the use of a recurrent neural network for solving a classification task using the N-TIDIGITS18 dataset. This spike-based dataset consists of recordings from the Dynamic Audio Sensor, a spiking silicon cochlea sensor, in response to the TIDIGITS audio dataset. We also propose a new pre-processing method which applies an exponential kernel on the output cochlea spikes so that the interspike timing information is better preserved. The results from the N-TIDIGITS18 dataset show that the exponential features perform better than the spike count features, with over 91% accuracy on the digit classification task. This accuracy corresponds to an improvement of at least 2.5% over the use of spike count features, establishing a new state of the art for this dataset. PMID:29479300

  7. Synchronized and noise-robust audio recordings during realtime magnetic resonance imaging scans (L)

    PubMed Central

    Bresch, Erik; Nielsen, Jon; Nayak, Krishna; Narayanan, Shrikanth

    2007-01-01

    This letter describes a data acquisition setup for recording and processing running speech from a person in a magnetic resonance imaging (MRI) scanner. The main focus is on ensuring synchronicity between image and audio acquisition, and on obtaining a good signal-to-noise ratio to facilitate further speech analysis and modeling. A field-programmable-gate-array-based hardware design for synchronizing the scanner image acquisition to other external data such as audio is described. The audio setup itself features two fiber optical microphones and a noise-canceling filter. Two noise cancellation methods are described, including a novel approach using a pulse-sequence-specific model of the gradient noise of the MRI scanner. The setup is useful for scientific speech production studies. Sample results of speech and singing data acquired and processed using the proposed method are given. PMID:17069275

  8. Audio-Visual Stimulation in Conjunction with Functional Electrical Stimulation to Address Upper Limb and Lower Limb Movement Disorder.

    PubMed

    Kumar, Deepesh; Verma, Sunny; Bhattacharya, Sutapa; Lahiri, Uttama

    2016-06-13

    Neurological disorders often manifest themselves in the form of movement deficits on the part of the patient. Conventional rehabilitation exercises used to address these deficits, though powerful, are often monotonous in nature. Adequate audio-visual stimulation can prove to be motivational. In the research presented here we indicate the applicability of audio-visual stimulation to rehabilitation exercises to address at least some of the movement deficits for upper and lower limbs. In addition to the audio-visual stimulation, we also use Functional Electrical Stimulation (FES). We further show the applicability of FES in conjunction with audio-visual stimulation delivered through a VR-based platform for the grasping skills of patients with movement disorder.

  9. Economic evaluation of audio based resilience training for depression in primary care.

    PubMed

    Koeser, Leonardo; Dobbin, Alastair; Ross, Sheila; McCrone, Paul

    2013-07-01

    Although there is some evidence on the effectiveness and cost-effectiveness of computerised cognitive behavioural therapy (CCBT) for treating anxiety and depression in primary care, alternative low-cost psychosocial interventions have not been investigated. The cost-effectiveness of an audio based resilience training (Positive Mental Training, PosMT) was examined using a decision model. Patient level cost and effectiveness data from a trial comparing a CCBT treatment and usual care and effectiveness data from a study on PosMT were used to inform this. Net benefits of CCBT and PosMT were approximately equal in individuals with 'moderate' depression at baseline and markedly in favour of PosMT for the 'severe' depression subgroup. With only four observations in the 'mild' depression category for PosMT, the existing evidence base remains unaltered. Efficacy data for the PosMT arm was derived from a study using a partially randomised preference design and the model structure contains simplifications due to lack of data availability. PosMT may represent good value for money in treatment of depression for certain groups of patients. More research in this area may be warranted. Copyright © 2013 Elsevier B.V. All rights reserved.

  10. Bridging music and speech rhythm: rhythmic priming and audio-motor training affect speech perception.

    PubMed

    Cason, Nia; Astésano, Corine; Schön, Daniele

    2015-02-01

    Following findings that musical rhythmic priming enhances subsequent speech perception, we investigated whether rhythmic priming for spoken sentences can enhance phonological processing - the building blocks of speech - and whether audio-motor training enhances this effect. Participants heard a metrical prime followed by a sentence (with a matching/mismatching prosodic structure), for which they performed a phoneme detection task. Behavioural (RT) data was collected from two groups: one who received audio-motor training, and one who did not. We hypothesised that 1) phonological processing would be enhanced in matching conditions, and 2) audio-motor training with the musical rhythms would enhance this effect. Indeed, providing a matching rhythmic prime context resulted in faster phoneme detection, thus revealing a cross-domain effect of musical rhythm on phonological processing. In addition, our results indicate that rhythmic audio-motor training enhances this priming effect. These results have important implications for rhythm-based speech therapies, and suggest that metrical rhythm in music and speech may rely on shared temporal processing brain resources. Copyright © 2015 Elsevier B.V. All rights reserved.

  11. A scheme for racquet sports video analysis with the combination of audio-visual information

    NASA Astrophysics Data System (ADS)

    Xing, Liyuan; Ye, Qixiang; Zhang, Weigang; Huang, Qingming; Yu, Hua

    2005-07-01

    As a very important category of sports video, racquet sports video, e.g. table tennis, tennis and badminton, has received little attention in the past years. Considering the characteristics of this kind of sports video, we propose a new scheme for structure indexing and highlight generation based on the combination of audio and visual information. First, a supervised classification method is employed to detect important audio symbols including impacts (ball hits), audience cheers, commentator speech, etc.; meanwhile, an unsupervised algorithm is proposed to group video shots into various clusters. Second, by taking advantage of the temporal relationship between audio and visual signals, we specify the scene clusters with semantic labels including rally scenes and break scenes. Third, a refinement procedure is developed to reduce false rally scenes by further audio analysis. Finally, an excitement model is proposed to rank the detected rally scenes, from which many exciting video clips such as game (match) points can be correctly retrieved. Experiments on two types of representative racquet sports video, table tennis video and tennis video, demonstrate encouraging results.

  12. Communicative Competence in Audio Classrooms: A Position Paper for the CADE 1991 Conference.

    ERIC Educational Resources Information Center

    Burge, Liz

    Classroom practitioners need to move their attention away from the technological and logistical competencies required for audio conferencing (AC) to the required communicative competencies in order to advance their skills in handling the psychodynamics of audio virtual classrooms which include audio alone and audio with graphics. While the…

  13. The Use of Asynchronous Audio Feedback with Online RN-BSN Students

    ERIC Educational Resources Information Center

    London, Julie E.

    2013-01-01

    The use of audio technology by online nursing educators is a recent phenomenon. Research has been conducted in the area of audio technology in different domains and populations, but very few researchers have focused on nursing. Preliminary results have indicated that using audio in place of text can increase student cognition and socialization.…

  14. Effect of audio instruction on tracking errors using a four-dimensional image-guided radiotherapy system.

    PubMed

    Nakamura, Mitsuhiro; Sawada, Akira; Mukumoto, Nobutaka; Takahashi, Kunio; Mizowaki, Takashi; Kokubo, Masaki; Hiraoka, Masahiro

    2013-09-06

    The Vero4DRT (MHI-TM2000) is capable of performing X-ray image-based tracking (X-ray Tracking) that directly tracks the target or fiducial markers under continuous kV X-ray imaging. Previously, we have shown that irregular respiratory patterns increased X-ray Tracking errors. Thus, we assumed that audio instruction, which generally improves the periodicity of respiration, should reduce tracking errors. The purpose of this study was to assess the effect of audio instruction on X-ray Tracking errors. Anterior-posterior abdominal skin-surface displacements obtained from ten lung cancer patients under free breathing and simple audio instruction were used as an alternative to tumor motion in the superior-inferior direction. First, a sequential predictive model based on the Levinson-Durbin algorithm was created to estimate the future three-dimensional (3D) target position under continuous kV X-ray imaging while moving a steel ball target of 9.5 mm in diameter. After creating the predictive model, the future 3D target position was sequentially calculated from the current and past 3D target positions based on the predictive model every 70 ms under continuous kV X-ray imaging. Simultaneously, the system controller of the Vero4DRT calculated the corresponding pan and tilt rotational angles of the gimbaled X-ray head, which then adjusted its orientation to the target. The calculated and current rotational angles of the gimbaled X-ray head were recorded every 5 ms. The target position measured by the laser displacement gauge was synchronously recorded every 10 ms. Total tracking system errors (ET) were compared between free breathing and audio instruction. Audio instruction significantly improved breathing regularity (p < 0.01). The mean ± standard deviation of the 95th percentile of ET (E95T) was 1.7 ± 0.5 mm (range: 1.1-2.6 mm) under free breathing (E95T,FB) and 1.9 ± 0.5 mm (range: 1.2-2.7 mm) under audio instruction (E95T,AI). E95T,AI was larger than E95T,FB for
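    The Levinson-Durbin algorithm mentioned in this abstract solves the Toeplitz normal equations of linear prediction from a signal's autocorrelation sequence, yielding coefficients that extrapolate the next sample from past samples. The sketch below is a generic linear-prediction illustration, not the Vero4DRT's actual tracking code; the function names and the one-step predictor are illustrative:

```python
import numpy as np

def levinson_durbin(r, order):
    """Levinson-Durbin recursion.

    Given autocorrelation values r[0..order], solve the Toeplitz normal
    equations for linear-prediction coefficients a (a[0] == 1) and return
    them with the final prediction-error power e.
    """
    a = np.zeros(order + 1)
    a[0] = 1.0
    e = float(r[0])
    for i in range(1, order + 1):
        # reflection coefficient from the current residual correlation
        k = -(r[i] + np.dot(a[1:i], r[i - 1:0:-1])) / e
        a[1:i] = a[1:i] + k * a[i - 1:0:-1]   # update previous coefficients
        a[i] = k
        e *= (1.0 - k * k)                    # shrink the error power
    return a, e

def predict_next(x, a):
    """One-step-ahead prediction: x_hat[n] = -sum_j a[j] * x[n - j]."""
    p = len(a) - 1
    return -np.dot(a[1:], x[-1:-p - 1:-1])
```

    For an AR(1) process with pole 0.5 (autocorrelation r[k] proportional to 0.5**k), the recursion recovers a[1] = -0.5, i.e. the predictor x_hat[n] = 0.5 * x[n-1].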

  15. Developing a Consensus-Driven, Core Competency Model to Shape Future Audio Engineering Technology Curriculum: A Web-Based Modified Delphi Study

    ERIC Educational Resources Information Center

    Tough, David T.

    2009-01-01

    The purpose of this online study was to create a ranking of essential core competencies and technologies required by AET (audio engineering technology) programs 10 years in the future. The study was designed to facilitate curriculum development and improvement in the rapidly expanding number of small to medium sized audio engineering technology…

  16. Audio-visual temporal perception in children with restored hearing.

    PubMed

    Gori, Monica; Chilosi, Anna; Forli, Francesca; Burr, David

    2017-05-01

    It is not clear how audio-visual temporal perception develops in children with restored hearing. In this study we measured temporal discrimination thresholds with an audio-visual temporal bisection task in 9 deaf children with restored audition, and 22 typically hearing children. In typically hearing children, audition was more precise than vision, with no gain in multisensory conditions (as previously reported in Gori et al. (2012b)). However, deaf children with restored audition showed similar audio and visual thresholds and some evidence of gain in audio-visual temporal multisensory conditions. Interestingly, we found a strong correlation between auditory weighting of multisensory signals and quality of language: patients who gave more weight to audition had better language skills. Similarly, auditory thresholds for the temporal bisection task were also a good predictor of language skills. This result supports the idea that temporal auditory processing is associated with language development. Copyright © 2017. Published by Elsevier Ltd.

  17. Review of Audio Interfacing Literature for Computer-Assisted Music Instruction.

    ERIC Educational Resources Information Center

    Watanabe, Nan

    1980-01-01

    Presents a review of the literature dealing with audio devices used in computer assisted music instruction and discusses the need for research and development of reliable, cost-effective, random access audio hardware. (Author)

  18. Audio Haptic Videogaming for Developing Wayfinding Skills in Learners Who are Blind

    PubMed Central

    Sánchez, Jaime; de Borba Campos, Marcia; Espinoza, Matías; Merabet, Lotfi B.

    2014-01-01

    Interactive digital technologies are currently being developed as a novel tool for education and skill development. Audiopolis is an audio and haptic based videogame designed for developing orientation and mobility (O&M) skills in people who are blind. We have evaluated the cognitive impact of videogame play on O&M skills by assessing performance on a series of behavioral tasks carried out in both indoor and outdoor virtual spaces. Our results demonstrate that the use of Audiopolis had a positive impact on the development and use of O&M skills in school-aged learners who are blind. The impact of audio and haptic information on learning is also discussed. PMID:25485312

  19. A Case Study on Audio Feedback with Geography Undergraduates

    ERIC Educational Resources Information Center

    Rodway-Dyer, Sue; Knight, Jasper; Dunne, Elizabeth

    2011-01-01

    Several small-scale studies have suggested that audio feedback can help students to reflect on their learning and to develop deep learning approaches that are associated with higher attainment in assessments. For this case study, Geography undergraduates were given audio feedback on a written essay assignment, alongside traditional written…

  20. Audio Teleconferencing: Low Cost Technology for External Studies Networking.

    ERIC Educational Resources Information Center

    Robertson, Bill

    1987-01-01

    This discussion of the benefits of audio teleconferencing for distance education programs and for business and government applications focuses on the recent experience of Canadian educational users. Four successful operating models and their costs are reviewed, and it is concluded that audio teleconferencing is cost efficient and educationally…

  1. High performance MPEG-audio decoder IC

    NASA Technical Reports Server (NTRS)

    Thorn, M.; Benbassat, G.; Cyr, K.; Li, S.; Gill, M.; Kam, D.; Walker, K.; Look, P.; Eldridge, C.; Ng, P.

    1993-01-01

    The emerging digital audio and video compression technology brings both an opportunity and a new challenge to IC design. The pervasive application of compression technology to consumer electronics will require high volume, low cost IC's and fast time to market of the prototypes and production units. At the same time, the algorithms used in the compression technology result in complex VLSI IC's. The conflicting challenges of algorithm complexity, low cost, and fast time to market have an impact on device architecture and design methodology. The work presented in this paper is about the design of a dedicated, high precision, Motion Picture Expert Group (MPEG) audio decoder.

  2. The effect of audio tours on learning and social interaction: An evaluation at Carlsbad Caverns National Park

    NASA Astrophysics Data System (ADS)

    Novey, Levi T.; Hall, Troy E.

    2007-03-01

    Auditory forms of nonpersonal communication have rarely been evaluated in informal settings like parks and museums. This study evaluated the effect of an interpretive audio tour on visitor knowledge and social behavior at Carlsbad Caverns National Park. A cross-sectional pretest/posttest quasi-experimental design compared the responses of audio tour users (n = 123) and nonusers (n = 131) on several knowledge questions. Observations (n = 700) conducted at seven sites within the caverns documented sign reading, time spent listening to the audio, within group conversation, and other social behaviors for a different sample of visitors. Pretested tour users and nonusers did not differ in visitor characteristics, knowledge, or attitude variables, suggesting the two populations were similar. On a 12-item knowledge quiz, tour users' scores increased from 5.7 to 10.3, and nonusers' scores increased from 6.2 to 8.4. Most visitors were able to identify some of the park's major messages when presented with a multiple-choice question, but more audio users than nonusers identified resource preservation as a primary message in an open-ended question. Based on observations, audio tour users and nonusers did not differ substantially in their interactions with other members of their group or in their reading of interpretive signs in the cave. Audio tour users had positive reactions to the tour, and these reactions, coupled with the positive learning outcomes and negligible effects on social interaction, suggest that audio tours can be an effective communication medium in informal educational settings.

  3. Selected Audio-Visual Materials for Consumer Education. [New Version.

    ERIC Educational Resources Information Center

    Johnston, William L.

    Ninety-two films, filmstrips, multi-media kits, slides, and audio cassettes, produced between 1964 and 1974, are listed in this selective annotated bibliography on consumer education. The major portion of the bibliography is devoted to films and filmstrips. The main topics of the audio-visual materials include purchasing, advertising, money…

  4. The Effect of Audio and Animation in Multimedia Instruction

    ERIC Educational Resources Information Center

    Koroghlanian, Carol; Klein, James D.

    2004-01-01

    This study investigated the effects of audio, animation, and spatial ability in a multimedia computer program for high school biology. Participants completed a multimedia program that presented content by way of text or audio with lean text. In addition, several instructional sequences were presented either with static illustrations or animations.…

  5. Audio Design: Creating Multi-sensory Images for the Mind.

    ERIC Educational Resources Information Center

    Ferrington, Gary

    1994-01-01

    Explores the concept of "theater of the mind" and discusses design factors in creating audio works that effectively stimulate mental pictures, including: narrative format in audio scripting; qualities of voice; use of concrete language; music; noise versus silence; and the creation of the illusion of space using monaural, stereophonic,…

  6. The sweet-home project: audio technology in smart homes to improve well-being and reliance.

    PubMed

    Vacher, Michel; Istrate, Dan; Portet, François; Joubert, Thierry; Chevalier, Thierry; Smidtas, Serge; Meillon, Brigitte; Lecouteux, Benjamin; Sehili, Mohamed; Chahuara, Pedro; Méniard, Sylvain

    2011-01-01

    The Sweet-Home project aims at providing audio-based interaction technology that lets the user have full control over their home environment, at detecting distress situations and at easing the social inclusion of the elderly and frail population. This paper presents an overview of the project focusing on the multimodal sound corpus acquisition and labelling and on the investigated techniques for speech and sound recognition. The user study and the recognition performances show the interest of this audio technology.

  7. Audio-Visual Temporal Recalibration Can be Constrained by Content Cues Regardless of Spatial Overlap.

    PubMed

    Roseboom, Warrick; Kawabe, Takahiro; Nishida, Shin'ya

    2013-01-01

    It has now been well established that the point of subjective synchrony for audio and visual events can be shifted following exposure to asynchronous audio-visual presentations, an effect often referred to as temporal recalibration. Recently it was further demonstrated that it is possible to concurrently maintain two such recalibrated estimates of audio-visual temporal synchrony. However, it remains unclear precisely what defines a given audio-visual pair such that it is possible to maintain a temporal relationship distinct from other pairs. It has been suggested that spatial separation of the different audio-visual pairs is necessary to achieve multiple distinct audio-visual synchrony estimates. Here we investigated if this is necessarily true. Specifically, we examined whether it is possible to obtain two distinct temporal recalibrations for stimuli that differed only in featural content. Using both complex (audio visual speech; see Experiment 1) and simple stimuli (high and low pitch audio matched with either vertically or horizontally oriented Gabors; see Experiment 2) we found concurrent, and opposite, recalibrations despite there being no spatial difference in presentation location at any point throughout the experiment. This result supports the notion that the content of an audio-visual pair alone can be used to constrain distinct audio-visual synchrony estimates regardless of spatial overlap.

  8. Audio-Visual Temporal Recalibration Can be Constrained by Content Cues Regardless of Spatial Overlap

    PubMed Central

    Roseboom, Warrick; Kawabe, Takahiro; Nishida, Shin’Ya

    2013-01-01

    It has now been well established that the point of subjective synchrony for audio and visual events can be shifted following exposure to asynchronous audio-visual presentations, an effect often referred to as temporal recalibration. Recently it was further demonstrated that it is possible to concurrently maintain two such recalibrated estimates of audio-visual temporal synchrony. However, it remains unclear precisely what defines a given audio-visual pair such that it is possible to maintain a temporal relationship distinct from other pairs. It has been suggested that spatial separation of the different audio-visual pairs is necessary to achieve multiple distinct audio-visual synchrony estimates. Here we investigated if this is necessarily true. Specifically, we examined whether it is possible to obtain two distinct temporal recalibrations for stimuli that differed only in featural content. Using both complex (audio visual speech; see Experiment 1) and simple stimuli (high and low pitch audio matched with either vertically or horizontally oriented Gabors; see Experiment 2) we found concurrent, and opposite, recalibrations despite there being no spatial difference in presentation location at any point throughout the experiment. This result supports the notion that the content of an audio-visual pair alone can be used to constrain distinct audio-visual synchrony estimates regardless of spatial overlap. PMID:23658549

  9. A Longitudinal, Quantitative Study of Student Attitudes towards Audio Feedback for Assessment

    ERIC Educational Resources Information Center

    Parkes, Mitchell; Fletcher, Peter

    2017-01-01

    This paper reports on the findings of a three-year longitudinal study investigating the experiences of postgraduate level students who were provided with audio feedback for their assessment. Results indicated that students positively received audio feedback. Overall, students indicated a preference for audio feedback over written feedback. No…

  10. Effect of Audio Coaching on Correlation of Abdominal Displacement With Lung Tumor Motion

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Nakamura, Mitsuhiro; Narita, Yuichiro; Matsuo, Yukinori

    2009-10-01

    Purpose: To assess the effect of audio coaching on the time-dependent behavior of the correlation between abdominal motion and lung tumor motion and the corresponding lung tumor position mismatches. Methods and Materials: Six patients who had a lung tumor with a motion range >8 mm were enrolled in the present study. Breathing-synchronized fluoroscopy was performed initially without audio coaching, followed by fluoroscopy with recorded audio coaching for multiple days. Two different measurements, anteroposterior abdominal displacement using the real-time positioning management system and superoinferior (SI) lung tumor motion by X-ray fluoroscopy, were performed simultaneously. Their sequential images were recorded using one display system. The lung tumor position was automatically detected with a template matching technique. The relationship between the abdominal and lung tumor motion was analyzed with and without audio coaching. Results: The mean SI tumor displacement was 10.4 mm without audio coaching and increased to 23.0 mm with audio coaching (p < .01). The correlation coefficients ranged from 0.89 to 0.97 with free breathing. Applying audio coaching, the correlation coefficients improved significantly (range, 0.93-0.99; p < .01), and the SI lung tumor position mismatches became larger in 75% of all sessions. Conclusion: Audio coaching served to increase the degree of correlation and make it more reproducible. In addition, the phase shifts between tumor motion and abdominal displacement were improved; however, all patients breathed more deeply, and the SI lung tumor position mismatches became slightly larger with audio coaching than without audio coaching.

  11. Creating accessible science museums with user-activated environmental audio beacons (ping!).

    PubMed

    Landau, Steven; Wiener, William; Naghshineh, Koorosh; Giusti, Ellen

    2005-01-01

    In 2003, Touch Graphics Company carried out research on a new invention that promises to improve accessibility to science museums for visitors who are visually impaired. The system, nicknamed Ping!, allows users to navigate an exhibit area, listen to audio descriptions, and interact with exhibits using a cell phone-based interface. The system relies on computer telephony, and it incorporates a network of wireless environmental audio beacons that can be triggered by users wishing to travel to destinations they choose. User testing indicates that the system is effective, both as a way-finding tool and as a means of providing accessible information on museum content. Follow-up development projects will determine if this approach can be successfully implemented in other settings and for other user populations.

  12. Steganalysis based on JPEG compatibility

    NASA Astrophysics Data System (ADS)

    Fridrich, Jessica; Goljan, Miroslav; Du, Rui

    2001-11-01

    In this paper, we introduce a new forensic tool that can reliably detect modifications in digital images, such as distortion due to steganography and watermarking, in images that were originally stored in the JPEG format. JPEG compression leaves unique fingerprints and serves as a fragile watermark, enabling us to detect changes as small as modifying the LSB of one randomly chosen pixel. The detection of changes is based on investigating the compatibility of 8x8 blocks of pixels with JPEG compression under a given quantization matrix. The proposed steganalytic method is applicable to virtually all steganographic and watermarking algorithms with the exception of those that embed message bits into the quantized JPEG DCT coefficients. The method can also be used to estimate the size of the secret message and identify the pixels that carry message bits. As a consequence of our steganalysis, we strongly recommend against using images originally stored in the JPEG format as cover images for spatial-domain steganography.
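    The compatibility test can be sketched as a round trip: a block that genuinely came out of a JPEG decoder reproduces itself when re-quantized with the same table, while flipping even one pixel's LSB (spatial-domain embedding) breaks the round trip. A minimal sketch assuming the standard luminance quantization table; a practical detector must additionally search neighboring quantized coefficients, estimate an unknown table, and handle pixel clipping:

    ```python
    import numpy as np

    def dct_matrix(n=8):
        """Orthonormal DCT-II basis matrix."""
        T = np.zeros((n, n))
        for k in range(n):
            c = np.sqrt(1.0 / n) if k == 0 else np.sqrt(2.0 / n)
            T[k] = c * np.cos(np.pi * (2 * np.arange(n) + 1) * k / (2 * n))
        return T

    T = dct_matrix()

    # Standard JPEG luminance quantization table (quality 50).
    Q = np.array([
        [16, 11, 10, 16, 24, 40, 51, 61],
        [12, 12, 14, 19, 26, 58, 60, 55],
        [14, 13, 16, 24, 40, 57, 69, 56],
        [14, 17, 22, 29, 51, 87, 80, 62],
        [18, 22, 37, 56, 68, 109, 103, 77],
        [24, 35, 55, 64, 81, 104, 113, 92],
        [49, 64, 78, 87, 103, 121, 120, 101],
        [72, 92, 95, 98, 112, 100, 103, 99],
    ], dtype=float)

    def decode_block(q):
        """Spatial-domain block a JPEG decoder produces from quantized coeffs q."""
        return np.round(T.T @ (q * Q) @ T).astype(int)

    def is_jpeg_compatible(block):
        """True if re-quantizing the block and decoding it reproduces it exactly."""
        q = np.round((T @ block @ T.T) / Q)
        return np.array_equal(decode_block(q), block)

    # A genuinely decompressed block is compatible ...
    q = np.zeros((8, 8))
    q[0, 0], q[0, 1], q[1, 0] = 60, -3, 2
    block = decode_block(q)
    # ... but flipping one pixel's LSB (spatial-domain steganography) breaks it.
    stego = block.copy()
    stego[3, 4] ^= 1
    ```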

  13. Selective Attention Modulates the Direction of Audio-Visual Temporal Recalibration

    PubMed Central

    Ikumi, Nara; Soto-Faraco, Salvador

    2014-01-01

    Temporal recalibration of cross-modal synchrony has been proposed as a mechanism to compensate for timing differences between sensory modalities. However, far from the rich complexity of everyday sensory environments, most studies to date have examined recalibration on isolated cross-modal pairings. Here, we hypothesize that selective attention might provide an effective filter to help resolve which stimuli are selected when multiple events compete for recalibration. We addressed this question by testing audio-visual recalibration following an adaptation phase in which two opposing audio-visual asynchronies were present. The direction of voluntary visual attention, and therefore attention to one of the two possible asynchronies (flash leading or flash lagging), was manipulated using colour as a selection criterion. We found a shift in the point of subjective audio-visual simultaneity as a function of whether the observer had focused attention on audio-then-flash or on flash-then-audio groupings during the adaptation phase. A baseline adaptation condition revealed that this effect of endogenous attention was effective only toward the lagging flash. This hints at the role of exogenous capture and/or additional endogenous effects producing an asymmetry toward the leading flash. We conclude that selective attention helps promote selected audio-visual pairings to be combined and subsequently adjusted in time, but stimulus organization exerts a strong impact on recalibration. We tentatively hypothesize that the resolution of recalibration in complex scenarios involves the orchestration of top-down selection mechanisms and stimulus-driven processes. PMID:25004132

  14. Selective attention modulates the direction of audio-visual temporal recalibration.

    PubMed

    Ikumi, Nara; Soto-Faraco, Salvador

    2014-01-01

    Temporal recalibration of cross-modal synchrony has been proposed as a mechanism to compensate for timing differences between sensory modalities. However, far from the rich complexity of everyday sensory environments, most studies to date have examined recalibration on isolated cross-modal pairings. Here, we hypothesize that selective attention might provide an effective filter to help resolve which stimuli are selected when multiple events compete for recalibration. We addressed this question by testing audio-visual recalibration following an adaptation phase in which two opposing audio-visual asynchronies were present. The direction of voluntary visual attention, and therefore attention to one of the two possible asynchronies (flash leading or flash lagging), was manipulated using colour as a selection criterion. We found a shift in the point of subjective audio-visual simultaneity as a function of whether the observer had focused attention on audio-then-flash or on flash-then-audio groupings during the adaptation phase. A baseline adaptation condition revealed that this effect of endogenous attention was effective only toward the lagging flash. This hints at the role of exogenous capture and/or additional endogenous effects producing an asymmetry toward the leading flash. We conclude that selective attention helps promote selected audio-visual pairings to be combined and subsequently adjusted in time, but stimulus organization exerts a strong impact on recalibration. We tentatively hypothesize that the resolution of recalibration in complex scenarios involves the orchestration of top-down selection mechanisms and stimulus-driven processes.

  15. Audio-magnetotelluric data collected in the area of Beatty, Nevada

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Williams, J.M.

    1998-11-01

    In the summer of 1997, electrical geophysical data were collected north of Beatty, Nevada. Audio-magnetotellurics (AMT) was the geophysical method used, with 16 stations collected along two profiles. The purpose of this data collection was to determine the depth to the alluvial basement, based upon the needs of the geologists requesting the data.

  16. Audio aided electro-tactile perception training for finger posture biofeedback.

    PubMed

    Vargas, Jose Gonzalez; Yu, Wenwei

    2008-01-01

    Visual information is a prerequisite for most biofeedback studies. The aim of this study is to explore how audio-aided training helps in the learning of dynamic electro-tactile perception without any visual feedback. In this research, electrical stimulation patterns associated with the experimenter's finger postures and motions were presented to the subjects. Along with the electrical stimulation patterns, two different types of information on finger postures and motions, verbal and audio, were presented to the verbal training subject group (group 1) and the audio training subject group (group 2), respectively. The results showed an improvement in the ability to distinguish and memorize electrical stimulation patterns corresponding to finger postures and motions without visual feedback; with the aid of audio tones, learning was faster and perception became more precise after training. Thus, this study clarified that, as a substitute for visual presentation, auditory information can effectively aid the formation of electro-tactile perception. Further research is needed to clarify the difference between visually guided and audio-aided training in terms of information compilation, post-training effects, and robustness of the perception.

  17. The Sweet-Home project: audio processing and decision making in smart home to improve well-being and reliance.

    PubMed

    Vacher, Michel; Chahuara, Pedro; Lecouteux, Benjamin; Istrate, Dan; Portet, Francois; Joubert, Thierry; Sehili, Mohamed; Meillon, Brigitte; Bonnefond, Nicolas; Fabre, Sébastien; Roux, Camille; Caffiau, Sybille

    2013-01-01

    The Sweet-Home project aims at providing audio-based interaction technology that gives the user full control over their home environment, at detecting distress situations, and at easing the social inclusion of the elderly and frail population. This paper presents an overview of the project, focusing on the implemented techniques for speech and sound recognition as well as context-aware decision making under uncertainty. A user experiment in a smart home demonstrates the value of this audio-based technology.

  18. 78 FR 38093 - Seventh Meeting: RTCA Special Committee 226, Audio Systems and Equipment

    Federal Register 2010, 2011, 2012, 2013, 2014

    2013-06-25

    ... Committee 226, Audio Systems and Equipment AGENCY: Federal Aviation Administration (FAA), U.S. Department of Transportation (DOT). ACTION: Meeting Notice of RTCA Special Committee 226, Audio Systems and Equipment. SUMMARY... 226, Audio Systems and Equipment

  19. Audio Podcasting in a Tablet PC-Enhanced Biochemistry Course

    ERIC Educational Resources Information Center

    Lyles, Heather; Robertson, Brian; Mangino, Michael; Cox, James R.

    2007-01-01

    This report describes the effects of making audio podcasts of all lectures in a large, basic biochemistry course promptly available to students. The audio podcasts complement a previously described approach in which a tablet PC is used to annotate PowerPoint slides with digital ink to produce electronic notes that can be archived. The fundamentals…

  20. Effect of audio instruction on tracking errors using a four‐dimensional image‐guided radiotherapy system

    PubMed Central

    Sawada, Akira; Mukumoto, Nobutaka; Takahashi, Kunio; Mizowaki, Takashi; Kokubo, Masaki; Hiraoka, Masahiro

    2013-01-01

    The Vero4DRT (MHI-TM2000) is capable of performing X-ray image-based tracking (X-ray Tracking) that directly tracks the target or fiducial markers under continuous kV X-ray imaging. Previously, we have shown that irregular respiratory patterns increased X-ray Tracking errors. Thus, we assumed that audio instruction, which generally improves the periodicity of respiration, should reduce tracking errors. The purpose of this study was to assess the effect of audio instruction on X-ray Tracking errors. Anterior-posterior abdominal skin-surface displacements obtained from ten lung cancer patients under free breathing and simple audio instruction were used as an alternative to tumor motion in the superior-inferior direction. First, a sequential predictive model based on the Levinson-Durbin algorithm was created to estimate the future three-dimensional (3D) target position under continuous kV X-ray imaging while moving a steel ball target of 9.5 mm in diameter. After creating the predictive model, the future 3D target position was sequentially calculated from the current and past 3D target positions based on the predictive model every 70 ms under continuous kV X-ray imaging. Simultaneously, the system controller of the Vero4DRT calculated the corresponding pan and tilt rotational angles of the gimbaled X-ray head, which then adjusted its orientation to the target. The calculated and current rotational angles of the gimbaled X-ray head were recorded every 5 ms. The target position measured by the laser displacement gauge was synchronously recorded every 10 ms. Total tracking system errors (E_T) were compared between free breathing and audio instruction. Audio instruction significantly improved breathing regularity (p < 0.01). The mean ± standard deviation of the 95th percentile of E_T (E95_T) was 1.7 ± 0.5 mm (range: 1.1–2.6 mm) under free breathing (E95_T,FB) and 1.9 ± 0.5 mm (range: 1.2–2.7 mm) under audio
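    The sequential predictive model referenced above is built on the Levinson-Durbin recursion, which solves the Yule-Walker equations for linear-prediction coefficients in O(p^2) from the motion signal's autocorrelation sequence. A minimal sketch (the AR(1) autocorrelation used for the check is illustrative, not patient data):

    ```python
    import numpy as np

    def levinson_durbin(r, order):
        """Solve the Yule-Walker equations for linear-prediction coefficients.

        r: autocorrelation sequence r[0..order].
        Returns (a, e): a[0] == 1, the one-step prediction is
        x_hat[n] = -sum_{k=1}^{order} a[k] * x[n-k], and e is the final
        prediction-error power.
        """
        a = np.zeros(order + 1)
        a[0] = 1.0
        e = r[0]
        for i in range(1, order + 1):
            # Reflection coefficient from the current prediction residual.
            acc = np.dot(a[:i], r[i:0:-1])
            k = -acc / e
            a[:i + 1] = a[:i + 1] + k * a[i::-1]
            e *= 1.0 - k * k
        return a, e

    # An AR(1) process with coefficient 0.9 has autocorrelation r[k] = 0.9**k;
    # the recursion should recover the coefficient and leave a[2] at zero.
    r = 0.9 ** np.arange(4)
    a, e = levinson_durbin(r, order=2)
    ```

    Each new sample only updates the autocorrelation estimate and re-runs this short recursion, which is what makes the approach suitable for 70 ms prediction cycles.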

  1. Promoting Independence through Assistive Technology: Evaluating Audio Recorders to Support Grocery Shopping

    ERIC Educational Resources Information Center

    Bouck, Emily C.; Satsangi, Rajiv; Bartlett, Whitney; Weng, Pei-Lin

    2012-01-01

    In light of a positive research base regarding technology-based self-operating prompting systems (e.g., iPods), yet a concern about the sustainability of such technologies after a research project is completed, this study sought to explore the effectiveness and efficiency of an audio recorder, a low-cost, more commonly accessible technology to…

  2. On-line Tool Wear Detection on DCMT070204 Carbide Tool Tip Based on Noise Cutting Audio Signal using Artificial Neural Network

    NASA Astrophysics Data System (ADS)

    Prasetyo, T.; Amar, S.; Arendra, A.; Zam Zami, M. K.

    2018-01-01

    This study develops an on-line detection system to predict the wear of the DCMT070204 tool tip during cutting of the workpiece. The machine used in this research was a ProTurn 9000 CNC lathe cutting ST42 steel cylinders. The audio signal was captured using a microphone placed on the tool post and recorded in Matlab at a sampling rate of 44.1 kHz with a frame size of 1024. The recorded signal comprised 110 samples derived from audio captured while cutting with a normal tool and a worn tool. Signal features were then extracted in the frequency domain using the Fast Fourier Transform, and features were selected based on correlation analysis. Tool wear classification was performed using an artificial neural network with the 33 selected input features, trained with the back-propagation method. Classification performance testing yielded an accuracy of 74%.
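    The frequency-domain feature-extraction step, 1024-sample frames at 44.1 kHz passed through an FFT, can be sketched as follows (the 4 kHz test tone stands in for cutting noise and is purely illustrative):

    ```python
    import numpy as np

    FS = 44_100   # sampling rate (Hz), as in the paper
    N = 1024      # frame size, as in the paper

    def spectral_features(frame):
        """Magnitude spectrum of one Hann-windowed audio frame (513 bins for N=1024)."""
        return np.abs(np.fft.rfft(frame * np.hanning(len(frame))))

    # Illustrative frame: a 4 kHz tone standing in for cutting noise.
    t = np.arange(N) / FS
    frame = np.sin(2 * np.pi * 4000.0 * t)
    mags = spectral_features(frame)
    peak_hz = np.argmax(mags) * FS / N   # frequency of the strongest bin
    ```

    In the study, a subset of such spectral features (33 bins chosen by correlating each bin's magnitude with the tool condition) is fed to a back-propagation-trained neural network.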

  3. Developing a Framework for Effective Audio Feedback: A Case Study

    ERIC Educational Resources Information Center

    Hennessy, Claire; Forrester, Gillian

    2014-01-01

    The increase in the use of technology-enhanced learning in higher education has included a growing interest in new approaches to enhance the quality of feedback given to students. Audio feedback is one method that has become more popular, yet evaluating its role in feedback delivery is still an emerging area for research. This paper is based on a…

  4. Audio-visual speech cue combination.

    PubMed

    Arnold, Derek H; Tear, Morgan; Schindel, Ryan; Roseboom, Warrick

    2010-04-16

    Different sources of sensory information can interact, often shaping what we think we have seen or heard. This can enhance the precision of perceptual decisions relative to those made on the basis of a single source of information. From a computational perspective, there are multiple reasons why this might happen, and each predicts a different degree of enhanced precision. Relatively slight improvements can arise when perceptual decisions are made on the basis of multiple independent sensory estimates, as opposed to just one. These improvements can arise as a consequence of probability summation. Greater improvements can occur if two initially independent estimates are summated to form a single integrated code, especially if the summation is weighted in accordance with the variance associated with each independent estimate. This form of combination is often described as a Bayesian maximum likelihood estimate. Still greater improvements are possible if the two sources of information are encoded via a common physiological process. Here we show that the provision of simultaneous audio and visual speech cues can result in substantial sensitivity improvements, relative to single sensory modality based decisions. The magnitude of the improvements is greater than can be predicted on the basis of either a Bayesian maximum likelihood estimate or a probability summation. Our data suggest that primary estimates of speech content are determined by a physiological process that takes input from both visual and auditory processing, resulting in greater sensitivity than would be possible if initially independent audio and visual estimates were formed and then subsequently combined.
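    The Bayesian maximum-likelihood benchmark described above weights each unimodal estimate by its inverse variance, predicting a combined variance of var_a*var_v/(var_a+var_v), always below the better single cue. A minimal sketch (the example means and variances are illustrative):

    ```python
    def mle_combine(est_a, var_a, est_v, var_v):
        """Variance-weighted (maximum-likelihood) fusion of two independent estimates."""
        w_a = (1 / var_a) / (1 / var_a + 1 / var_v)
        combined = w_a * est_a + (1 - w_a) * est_v
        combined_var = (var_a * var_v) / (var_a + var_v)
        return combined, combined_var

    # Illustrative: the auditory estimate is noisier than the visual one,
    # so the fused estimate leans toward the visual value.
    est, var = mle_combine(est_a=0.8, var_a=4.0, est_v=0.2, var_v=1.0)
    ```

    The paper's point is that the observed audio-visual speech sensitivity exceeds even this bound, implying a shared encoding stage rather than late combination of independent estimates.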

  5. Audio frequency in vivo optical coherence elastography

    NASA Astrophysics Data System (ADS)

    Adie, Steven G.; Kennedy, Brendan F.; Armstrong, Julian J.; Alexandrov, Sergey A.; Sampson, David D.

    2009-05-01

    We present a new approach to optical coherence elastography (OCE), which probes the local elastic properties of tissue by using optical coherence tomography to measure the effect of an applied stimulus in the audio frequency range. We describe the approach, based on analysis of the Bessel frequency spectrum of the interferometric signal detected from scatterers undergoing periodic motion in response to an applied stimulus. We present quantitative results of sub-micron excitation at 820 Hz in a layered phantom and the first such measurements in human skin in vivo.
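    The Bessel-spectrum analysis can be sketched numerically: a scatterer oscillating sinusoidally phase-modulates the interferometric signal, and by the Jacobi-Anger expansion its harmonics are weighted by Bessel functions J_n of the modulation depth. A minimal check of one harmonic coefficient (the modulation depth is illustrative, and the carrier phase is set to zero for simplicity):

    ```python
    import numpy as np
    from math import factorial

    def bessel_j(n, x, terms=20):
        """J_n(x) via its power series (adequate for small arguments)."""
        return sum((-1) ** m / (factorial(m) * factorial(m + n)) * (x / 2) ** (2 * m + n)
                   for m in range(terms))

    A = 1.3  # phase-modulation depth in radians (illustrative assumption)
    theta = np.linspace(0, 2 * np.pi, 4096, endpoint=False)
    signal = np.cos(A * np.sin(theta))  # interferometric fringe, carrier phase omitted

    # Jacobi-Anger: cos(A sin θ) = J0(A) + 2 Σ_k J_{2k}(A) cos(2kθ),
    # so the second-harmonic Fourier coefficient equals J2(A).
    c2 = np.mean(signal * np.cos(2 * theta))
    ```

    Inverting measured harmonic ratios against J_n then yields the local oscillation amplitude, which is how sub-micron excitation can be quantified.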

  6. Audio Feedback -- Better Feedback?

    ERIC Educational Resources Information Center

    Voelkel, Susanne; Mello, Luciane V.

    2014-01-01

    National Student Survey (NSS) results show that many students are dissatisfied with the amount and quality of feedback they get for their work. This study reports on two case studies in which we tried to address these issues by introducing audio feedback to one undergraduate (UG) and one postgraduate (PG) class, respectively. In case study one…

  7. Holographic disk with high data transfer rate: its application to an audio response memory.

    PubMed

    Kubota, K; Ono, Y; Kondo, M; Sugama, S; Nishida, N; Sakaguchi, M

    1980-03-15

    This paper describes a memory realized with a high data transfer rate using the holographic parallel-processing function, and its application to an audio response system that supplies many audio messages to many terminals simultaneously. Digitized audio messages are recorded as tiny 1-D Fourier transform holograms on a holographic disk. A hologram recorder and a hologram reader were constructed to test and demonstrate the feasibility of the holographic audio response memory. Experimental results indicate the potential for an audio response system with a 2000-word vocabulary and a 250-Mbit/sec transfer rate.

  8. Towards an Effective Use of Audio Conferencing in Distance Language Courses

    ERIC Educational Resources Information Center

    Hampel, Regine; Hauck, Mirjam

    2004-01-01

    In order to respond to learners' need for more flexible speaking opportunities and to overcome the geographical challenge of students spread over the United Kingdom and continental Western Europe, the Open University recently introduced Internet-based, real-time audio conferencing, thus making a groundbreaking move in the distance learning and…

  9. Let Their Voices Be Heard! Building a Multicultural Audio Collection.

    ERIC Educational Resources Information Center

    Tucker, Judith Cook

    1992-01-01

    Discusses building a multicultural audio collection for a library. Gives some guidelines about selecting materials that really represent different cultures. Audio materials that are considered fall roughly into the categories of children's stories, didactic materials, oral histories, poetry and folktales, and music. The goal is an authentic…

  10. 47 CFR 73.4275 - Tone clusters; audio attention-getting devices.

    Code of Federal Regulations, 2013 CFR

    2013-10-01

    ... 47 Telecommunication 4 2013-10-01 2013-10-01 false Tone clusters; audio attention-getting devices. 73.4275 Section 73.4275 Telecommunication FEDERAL COMMUNICATIONS COMMISSION (CONTINUED) BROADCAST... clusters; audio attention-getting devices. See Public Notice, FCC 76-610, dated July 2, 1976. 60 FCC 2d 920...

  11. 47 CFR 73.4275 - Tone clusters; audio attention-getting devices.

    Code of Federal Regulations, 2012 CFR

    2012-10-01

    ... 47 Telecommunication 4 2012-10-01 2012-10-01 false Tone clusters; audio attention-getting devices. 73.4275 Section 73.4275 Telecommunication FEDERAL COMMUNICATIONS COMMISSION (CONTINUED) BROADCAST... clusters; audio attention-getting devices. See Public Notice, FCC 76-610, dated July 2, 1976. 60 FCC 2d 920...

  12. 47 CFR 73.4275 - Tone clusters; audio attention-getting devices.

    Code of Federal Regulations, 2014 CFR

    2014-10-01

    ... 47 Telecommunication 4 2014-10-01 2014-10-01 false Tone clusters; audio attention-getting devices. 73.4275 Section 73.4275 Telecommunication FEDERAL COMMUNICATIONS COMMISSION (CONTINUED) BROADCAST... clusters; audio attention-getting devices. See Public Notice, FCC 76-610, dated July 2, 1976. 60 FCC 2d 920...

  13. 47 CFR 73.4275 - Tone clusters; audio attention-getting devices.

    Code of Federal Regulations, 2011 CFR

    2011-10-01

    ... 47 Telecommunication 4 2011-10-01 2011-10-01 false Tone clusters; audio attention-getting devices. 73.4275 Section 73.4275 Telecommunication FEDERAL COMMUNICATIONS COMMISSION (CONTINUED) BROADCAST... clusters; audio attention-getting devices. See Public Notice, FCC 76-610, dated July 2, 1976. 60 FCC 2d 920...

  14. 47 CFR 73.4275 - Tone clusters; audio attention-getting devices.

    Code of Federal Regulations, 2010 CFR

    2010-10-01

    ... 47 Telecommunication 4 2010-10-01 2010-10-01 false Tone clusters; audio attention-getting devices. 73.4275 Section 73.4275 Telecommunication FEDERAL COMMUNICATIONS COMMISSION (CONTINUED) BROADCAST... clusters; audio attention-getting devices. See Public Notice, FCC 76-610, dated July 2, 1976. 60 FCC 2d 920...

  15. A Content-Adaptive Analysis and Representation Framework for Audio Event Discovery from "Unscripted" Multimedia

    NASA Astrophysics Data System (ADS)

    Radhakrishnan, Regunathan; Divakaran, Ajay; Xiong, Ziyou; Otsuka, Isao

    2006-12-01

    We propose a content-adaptive analysis and representation framework to discover events using audio features from "unscripted" multimedia such as sports and surveillance for summarization. The proposed analysis framework performs an inlier/outlier-based temporal segmentation of the content. It is motivated by the observation that "interesting" events in unscripted multimedia occur sparsely in a background of usual or "uninteresting" events. We treat the sequence of low/mid-level features extracted from the audio as a time series and identify subsequences that are outliers. The outlier detection is based on eigenvector analysis of the affinity matrix constructed from statistical models estimated from the subsequences of the time series. We define the confidence measure on each of the detected outliers as the probability that it is an outlier. Then, we establish a relationship between the parameters of the proposed framework and the confidence measure. Furthermore, we use the confidence measure to rank the detected outliers in terms of their departures from the background process. Our experimental results with sequences of low- and mid-level audio features extracted from sports video show that "highlight" events can be extracted effectively as outliers from a background process using the proposed framework. We proceed to show the effectiveness of the proposed framework in bringing out suspicious events from surveillance videos without any a priori knowledge. We show that such temporal segmentation into background and outliers, along with the ranking based on the departure from the background, can be used to generate content summaries of any desired length. Finally, we also show that the proposed framework can be used to systematically select "key audio classes" that are indicative of events of interest in the chosen domain.
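    The outlier-detection step, eigenvector analysis of an affinity matrix built over subsequence models, can be sketched with plain feature vectors standing in for the fitted statistical models (the Gaussian affinity kernel and the synthetic data are illustrative assumptions):

    ```python
    import numpy as np

    def outlier_scores(features, sigma=1.0):
        """Score subsequences as outliers via eigen-analysis of their affinity matrix.

        A low component in the principal eigenvector means low affinity to the
        dominant background process, i.e. a candidate "interesting" event.
        """
        d2 = np.sum((features[:, None, :] - features[None, :, :]) ** 2, axis=-1)
        A = np.exp(-d2 / (2 * sigma ** 2))     # Gaussian affinity matrix
        vals, vecs = np.linalg.eigh(A)         # symmetric matrix -> eigh
        v = vecs[:, -1]                        # principal eigenvector
        v = v if v.sum() >= 0 else -v          # fix sign (Perron vector is positive)
        return -v                              # higher score = more outlying

    # Illustrative feature vectors: nine "background" subsequences and one
    # sparse outlier standing in for a highlight/suspicious event.
    rng = np.random.default_rng(0)
    feats = rng.normal(0.0, 0.1, size=(10, 4))
    feats[7] += 5.0
    scores = outlier_scores(feats)
    ```

    Ranking the scores gives the confidence ordering the framework uses to trim a summary to any desired length.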

  16. Auditory and audio-vocal responses of single neurons in the monkey ventral premotor cortex.

    PubMed

    Hage, Steffen R

    2018-03-20

    Monkey vocalization is a complex behavioral pattern, which is flexibly used in audio-vocal communication. A recently proposed dual neural network model suggests that cognitive control might be involved in this behavior, originating from a frontal cortical network in the prefrontal cortex and mediated via projections from the rostral portion of the ventral premotor cortex (PMvr) and motor cortex to the primary vocal motor network in the brainstem. For the rapid adjustment of vocal output to external acoustic events, strong interconnections between vocal motor and auditory sites are needed, which are present at cortical and subcortical levels. However, the role of the PMvr in audio-vocal integration processes remains unclear. In the present study, single neurons in the PMvr were recorded in rhesus monkeys (Macaca mulatta) while volitionally producing vocalizations in a visual detection task or passively listening to monkey vocalizations. Ten percent of randomly selected neurons in the PMvr modulated their discharge rate in response to acoustic stimulation with species-specific calls. More than four-fifths of these auditory neurons showed an additional modulation of their discharge rates either before and/or during the monkeys' motor production of the vocalization. Based on these audio-vocal interactions, the PMvr might be well positioned to mediate higher order auditory processing with cognitive control of the vocal motor output to the primary vocal motor network. Such audio-vocal integration processes in the premotor cortex might constitute a precursor for the evolution of complex learned audio-vocal integration systems, ultimately giving rise to human speech. Copyright © 2018 Elsevier B.V. All rights reserved.

  17. LiveDescribe: Can Amateur Describers Create High-Quality Audio Description?

    ERIC Educational Resources Information Center

    Branje, Carmen J.; Fels, Deborah I.

    2012-01-01

    Introduction: The study presented here evaluated the usability of the audio description software LiveDescribe and explored the acceptance rates of audio description created by amateur describers who used LiveDescribe to facilitate the creation of their descriptions. Methods: Twelve amateur describers with little or no previous experience with…

  18. 17 CFR 232.304 - Graphic, image, audio and video material.

    Code of Federal Regulations, 2011 CFR

    2011-04-01

    ... video material. 232.304 Section 232.304 Commodity and Securities Exchanges SECURITIES AND EXCHANGE... Submissions § 232.304 Graphic, image, audio and video material. (a) If a filer includes graphic, image, audio or video material in a document delivered to investors and others that is not reproduced in an...

  19. 17 CFR 232.304 - Graphic, image, audio and video material.

    Code of Federal Regulations, 2012 CFR

    2012-04-01

    ... video material. 232.304 Section 232.304 Commodity and Securities Exchanges SECURITIES AND EXCHANGE... Submissions § 232.304 Graphic, image, audio and video material. (a) If a filer includes graphic, image, audio or video material in a document delivered to investors and others that is not reproduced in an...

  20. 17 CFR 232.304 - Graphic, image, audio and video material.

    Code of Federal Regulations, 2013 CFR

    2013-04-01

    ... video material. 232.304 Section 232.304 Commodity and Securities Exchanges SECURITIES AND EXCHANGE... Submissions § 232.304 Graphic, image, audio and video material. (a) If a filer includes graphic, image, audio or video material in a document delivered to investors and others that is not reproduced in an...

  1. 17 CFR 232.304 - Graphic, image, audio and video material.

    Code of Federal Regulations, 2010 CFR

    2010-04-01

    ... video material. 232.304 Section 232.304 Commodity and Securities Exchanges SECURITIES AND EXCHANGE... Submissions § 232.304 Graphic, image, audio and video material. (a) If a filer includes graphic, image, audio or video material in a document delivered to investors and others that is not reproduced in an...

  2. 17 CFR 232.304 - Graphic, image, audio and video material.

    Code of Federal Regulations, 2014 CFR

    2014-04-01

    ... video material. 232.304 Section 232.304 Commodity and Securities Exchanges SECURITIES AND EXCHANGE... Submissions § 232.304 Graphic, image, audio and video material. (a) If a filer includes graphic, image, audio or video material in a document delivered to investors and others that is not reproduced in an...

  3. Creating Accessible Science Museums with User-Activated Environmental Audio Beacons (Ping!)

    ERIC Educational Resources Information Center

    Landau, Steven; Wiener, William; Naghshineh, Koorosh; Giusti, Ellen

    2005-01-01

    In 2003, Touch Graphics Company carried out research on a new invention that promises to improve accessibility to science museums for visitors who are visually impaired. The system, nicknamed Ping!, allows users to navigate an exhibit area, listen to audio descriptions, and interact with exhibits using a cell phone-based interface. The system…

  4. [Ventriloquism and audio-visual integration of voice and face].

    PubMed

    Yokosawa, Kazuhiko; Kanaya, Shoko

    2012-07-01

    Presenting synchronous auditory and visual stimuli in separate locations creates the illusion that the sound originates from the direction of the visual stimulus. Participants' auditory localization bias, called the ventriloquism effect, has revealed factors affecting the perceptual integration of audio-visual stimuli. However, many studies on audio-visual processes have focused on performance in simplified experimental situations, with a single stimulus in each sensory modality. These results cannot necessarily explain our perceptual behavior in natural scenes, where various signals exist within a single sensory modality. In the present study we report the contributions of a cognitive factor, that is, the audio-visual congruency of speech, although this factor has often been underestimated in previous ventriloquism research. Thus, we investigated the contribution of speech congruency on the ventriloquism effect using a spoken utterance and two videos of a talking face. The salience of facial movements was also manipulated. As a result, when bilateral visual stimuli are presented in synchrony with a single voice, cross-modal speech congruency was found to have a significant impact on the ventriloquism effect. This result also indicated that more salient visual utterances attracted participants' auditory localization. The congruent pairing of audio-visual utterances elicited greater localization bias than did incongruent pairing, whereas previous studies have reported little dependency on the reality of stimuli in ventriloquism. Moreover, audio-visual illusory congruency, owing to the McGurk effect, caused substantial visual interference to auditory localization. This suggests that a greater flexibility in responding to multi-sensory environments exists than has been previously considered.

  5. A Unified Steganalysis Framework

    DTIC Science & Technology

    2013-04-01

    contains more than 1800 images of different scenes. In the experiments, we used four JPEG-based steganography techniques: Outguess [13], F5 [16], model... also compressed these images again since some of the steganography methods are double compressing the images. Stego-images are generated by embedding... randomly chosen messages (in bits) into 1600 grayscale images using each of the four steganography techniques. A random message length was determined

  6. 78 FR 18416 - Sixth Meeting: RTCA Special Committee 226, Audio Systems and Equipment

    Federal Register 2010, 2011, 2012, 2013, 2014

    2013-03-26

    ... 226, Audio Systems and Equipment AGENCY: Federal Aviation Administration (FAA), U.S. Department of Transportation (DOT). ACTION: Meeting Notice of RTCA Special Committee 226, Audio Systems and Equipment. SUMMARY... 226, Audio Systems and Equipment. DATES: The meeting will be held April 15-17, 2013 from 9:00 a.m.-5...

  7. 78 FR 57673 - Eighth Meeting: RTCA Special Committee 226, Audio Systems and Equipment

    Federal Register 2010, 2011, 2012, 2013, 2014

    2013-09-19

    ... Committee 226, Audio Systems and Equipment AGENCY: Federal Aviation Administration (FAA), U.S. Department of Transportation (DOT). ACTION: Meeting Notice of RTCA Special Committee 226, Audio Systems and Equipment. SUMMARY... Committee 226, Audio Systems and Equipment. DATES: The meeting will be held October 8-10, 2012 from 9:00 a.m...

  8. 77 FR 37732 - Fourteenth Meeting: RTCA Special Committee 224, Audio Systems and Equipment

    Federal Register 2010, 2011, 2012, 2013, 2014

    2012-06-22

    ... Committee 224, Audio Systems and Equipment AGENCY: Federal Aviation Administration (FAA), U.S. Department of Transportation (DOT). ACTION: Meeting Notice of RTCA Special Committee 224, Audio Systems and Equipment. SUMMARY... Committee 224, Audio Systems and Equipment. DATES: The meeting will be held July 11, 2012, from 10 a.m.-4 p...

  9. Audio Spatial Representation Around the Body

    PubMed Central

    Aggius-Vella, Elena; Campus, Claudio; Finocchietti, Sara; Gori, Monica

    2017-01-01

    Studies have found that portions of space around our body are coded differently by our brain. Numerous works have investigated visual and auditory spatial representation, focusing mostly on the spatial representation of stimuli presented at head level, especially in the frontal space. Only a few studies have investigated spatial representation around the entire body and its relationship with motor activity. Moreover, it is still not clear whether the space surrounding us is represented as a unitary dimension or whether it is split up into different portions, differently shaped by our senses and motor activity. To clarify these points, we investigated audio localization of dynamic and static sounds at different body levels. In order to understand the role of a motor action in auditory space representation, we asked subjects to localize sounds by pointing with the hand or the foot, or by giving a verbal answer. We found that audio localization differed depending on the body part considered. Moreover, a different pattern of response was observed when subjects were asked to make actions, with respect to the verbal responses. These results suggest that the audio space around our body is split into various spatial portions, which are perceived differently: front, back, around the chest, and around the foot, suggesting that these four areas could be differently modulated by our senses and our actions. PMID:29249999

  10. Digital Audio Radio Field Tests

    NASA Technical Reports Server (NTRS)

    Hollansworth, James E.

    1997-01-01

    Radio history continues to be made at the NASA Lewis Research Center with the beginning of phase two of Digital Audio Radio testing conducted by the Consumer Electronic Manufacturers Association (a sector of the Electronic Industries Association and the National Radio Systems Committee) and cosponsored by the Electronic Industries Association and the National Association of Broadcasters. The bulk of the field testing of the four systems should be complete by the end of October 1996, with results available soon thereafter. Lewis hosted phase one of the testing process, which included laboratory testing of seven proposed digital audio radio systems and modes (see the following table). Two of the proposed systems operate in two modes, thus making a total of nine systems for testing. These nine systems are divided into the following types of transmission: in-band on channel (IBOC), in-band adjacent channel (IBAC), and new bands - the L-band (1452 to 1492 MHz) and the S-band (2310 to 2360 MHz).

  11. ASTP video tape recorder ground support equipment (audio/CTE splitter/interleaver). Operations manual

    NASA Technical Reports Server (NTRS)

    1974-01-01

    A descriptive handbook for the audio/CTE splitter/interleaver (RCA part No. 8673734-502) was presented. This unit is designed to perform two major functions: extract audio and time data from an interleaved video/audio signal (splitter section), and provide a test interleaved video/audio/CTE signal for the system (interleaver section). It is a rack mounting unit 7 inches high, 19 inches wide, 20 inches deep, mounted on slides for retracting from the rack, and weighs approximately 40 pounds. The following information is provided: installation, operation, principles of operation, maintenance, schematics and parts lists.

  12. Overdrive and Edge as Refiners of "Belting"?: An Empirical Study Qualifying and Categorizing "Belting" Based on Audio Perception, Laryngostroboscopic Imaging, Acoustics, LTAS, and EGG.

    PubMed

    McGlashan, Julian; Thuesen, Mathias Aaen; Sadolin, Cathrine

    2017-05-01

    We aimed to study the categorizations "Overdrive" and "Edge" from the pedagogical method Complete Vocal Technique as refiners of the often ill-defined concept of "belting" by means of audio perception, laryngostroboscopic imaging, acoustics, long-term average spectrum (LTAS), and electroglottography (EGG). This is a case-control study. Twenty singers were recorded singing sustained vowels in a "belting" quality refined by audio perception as "Overdrive" and "Edge." Two studies were performed: (1) a laryngostroboscopic examination using a videonasoendoscopic camera system (Olympus) and the Laryngostrobe program (Laryngograph); (2) a simultaneous recording of the EGG and acoustic signals using Speech Studio (Laryngograph). The images were analyzed based on consensus agreement. Statistical analysis of the acoustic, LTAS, and EGG parameters was undertaken using the Student paired t test. The two modes of singing determined by audio perception have visibly different laryngeal gestures: Edge has a more constricted setting than that of Overdrive, where the ventricular folds seem to cover more of the vocal folds, the aryepiglottic folds show a sharper edge in Edge, and the cuneiform cartilages are rolled in anteromedially. LTAS analysis shows a statistical difference, particularly after the ninth harmonic, with a coinciding first formant. The combined group showed statistical differences in shimmer, harmonics-to-noise ratio, normalized noise energy, and mean sound pressure level (P ≤ 0.05). "Belting" sounds can be categorized using audio perception into two modes of singing: "Overdrive" and "Edge." This study demonstrates consistent visibly different laryngeal gestures between these modes and with some correspondingly significant differences in LTAS, EGG, and acoustic measures. Copyright © 2017 The Voice Foundation. Published by Elsevier Inc. All rights reserved.

  13. Responding Effectively to Composition Students: Comparing Student Perceptions of Written and Audio Feedback

    ERIC Educational Resources Information Center

    Bilbro, J.; Iluzada, C.; Clark, D. E.

    2013-01-01

    The authors compared student perceptions of audio and written feedback in order to assess what types of students may benefit from receiving audio feedback on their essays rather than written feedback. Many instructors previously have reported the advantages they see in audio feedback, but little quantitative research has been done on how the…

  14. 47 CFR Figure 2 to Subpart N of... - Typical Audio Wave

    Code of Federal Regulations, 2011 CFR

    2011-10-01

    ... 47 Telecommunication 1 2011-10-01 2011-10-01 false Typical Audio Wave 2 Figure 2 to Subpart N of Part 2 Telecommunication FEDERAL COMMUNICATIONS COMMISSION GENERAL FREQUENCY ALLOCATIONS AND RADIO... Audio Wave EC03JN91.006 ...

  15. Enhanced audio-visual interactions in the auditory cortex of elderly cochlear-implant users.

    PubMed

    Schierholz, Irina; Finke, Mareike; Schulte, Svenja; Hauthal, Nadine; Kantzke, Christoph; Rach, Stefan; Büchner, Andreas; Dengler, Reinhard; Sandmann, Pascale

    2015-10-01

    Auditory deprivation and the restoration of hearing via a cochlear implant (CI) can induce functional plasticity in auditory cortical areas. How these plastic changes affect the ability to integrate combined auditory (A) and visual (V) information is not yet well understood. In the present study, we used electroencephalography (EEG) to examine whether age, temporary deafness and altered sensory experience with a CI can affect audio-visual (AV) interactions in post-lingually deafened CI users. Young and elderly CI users and age-matched NH listeners performed a speeded response task on basic auditory, visual and audio-visual stimuli. Regarding the behavioral results, a redundant signals effect, that is, faster response times to cross-modal (AV) than to both of the two modality-specific stimuli (A, V), was revealed for all groups of participants. Moreover, in all four groups, we found evidence for audio-visual integration. Regarding event-related responses (ERPs), we observed a more pronounced visual modulation of the cortical auditory response at N1 latency (approximately 100 ms after stimulus onset) in the elderly CI users when compared with young CI users and elderly NH listeners. Thus, elderly CI users showed enhanced audio-visual binding which may be a consequence of compensatory strategies developed due to temporary deafness and/or degraded sensory input after implantation. These results indicate that the combination of aging, sensory deprivation and CI facilitates the coupling between the auditory and the visual modality. We suggest that this enhancement in multisensory interactions could be used to optimize auditory rehabilitation, especially in elderly CI users, by the application of strong audio-visually based rehabilitation strategies after implant switch-on. Copyright © 2015 Elsevier B.V. All rights reserved.

  16. Semantic Indexing of Multimedia Content Using Visual, Audio, and Text Cues

    NASA Astrophysics Data System (ADS)

    Adams, W. H.; Iyengar, Giridharan; Lin, Ching-Yung; Naphade, Milind Ramesh; Neti, Chalapathy; Nock, Harriet J.; Smith, John R.

    2003-12-01

    We present a learning-based approach to the semantic indexing of multimedia content using cues derived from audio, visual, and text features. We approach the problem by developing a set of statistical models for a predefined lexicon. Novel concepts are then mapped in terms of the concepts in the lexicon. To achieve robust detection of concepts, we exploit features from multiple modalities, namely, audio, video, and text. Concept representations are modeled using Gaussian mixture models (GMM), hidden Markov models (HMM), and support vector machines (SVM). Models such as Bayesian networks and SVMs are used in a late-fusion approach to model concepts that are not explicitly modeled in terms of features. Our experiments indicate promise in the proposed classification and fusion methodologies: our proposed fusion scheme achieves more than 10% relative improvement over the best unimodal concept detector.
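
    The late-fusion step described above can be illustrated with a minimal sketch, assuming per-modality concept detectors that already output probabilities. The weighted log-odds combiner here is a simple stand-in for the SVM and Bayesian-network fusers the record describes, and the scores and weights are invented for illustration.

```python
import numpy as np

def fuse_late(scores: dict, weights: dict) -> float:
    """Late fusion of per-modality concept probabilities via weighted
    log-odds averaging (a stand-in for learned SVM/Bayesian fusers)."""
    eps = 1e-9
    logit = sum(weights[m] * np.log((scores[m] + eps) / (1.0 - scores[m] + eps))
                for m in scores)
    return float(1.0 / (1.0 + np.exp(-logit)))

# Hypothetical unimodal detector outputs for one concept in one video shot.
scores = {"audio": 0.7, "visual": 0.9, "text": 0.6}
weights = {"audio": 0.3, "visual": 0.5, "text": 0.2}
fused = fuse_late(scores, weights)
```

    A learned fuser would fit the weights (or a nonlinear combiner) on held-out annotated shots instead of fixing them by hand.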

  17. Engaging Students with Audio Feedback

    ERIC Educational Resources Information Center

    Cann, Alan

    2014-01-01

    Students express widespread dissatisfaction with academic feedback. Teaching staff perceive a frequent lack of student engagement with written feedback, much of which goes uncollected or unread. Published evidence shows that audio feedback is highly acceptable to students but is underused. This paper explores methods to produce and deliver audio…

  18. Audio-vestibular signs and symptoms in Chiari malformation type I. Case series and literature review.

    PubMed

    Guerra Jiménez, Gloria; Mazón Gutiérrez, Ángel; Marco de Lucas, Enrique; Valle San Román, Natalia; Martín Laez, Rubén; Morales Angulo, Carmelo

    2015-01-01

    Chiari malformation is an alteration of the base of the skull with herniation through the foramen magnum of the brain stem and cerebellum. Although the most common presentation is occipital headache, the association of audio-vestibular symptoms is not rare. The aim of our study was to describe audio-vestibular signs and symptoms in Chiari malformation type I (CM-I). We performed a retrospective observational study of patients referred to our unit during the last 5 years. We also carried out a literature review of audio-vestibular signs and symptoms in this disease. There were 9 patients (2 males and 7 females), with an average age of 42.8 years. Five patients presented a Ménière-like syndrome; 2 cases, a recurrent vertigo with peripheral features; one patient showed a sudden hearing loss; and one case suffered a sensorineural hearing loss with early childhood onset. The most common audio-vestibular symptom indicated in the literature in patients with CM-I is unsteadiness (49%), followed by dizziness (18%), nystagmus (15%) and hearing loss (15%). Nystagmus is frequently horizontal (74%) or down-beating (18%). Other audio-vestibular signs and symptoms are tinnitus (11%), aural fullness (10%) and hyperacusis (1%). Occipital headache that increases with Valsalva manoeuvres and hand paresthesias are very suggestive symptoms. The appearance of audio-vestibular manifestations in CM-I makes it common to refer these patients to neurotologists. Unsteadiness, vertiginous syndromes and sensorineural hearing loss are frequent. Nystagmus, especially horizontal and down-beating, is not rare. It is important for neurotologists to familiarise themselves with CM-I symptoms to be able to consider it in differential diagnosis. Copyright © 2014 Elsevier España, S.L.U. y Sociedad Española de Otorrinolaringología y Patología Cérvico-Facial. All rights reserved.

  19. Instructional Audio Guidelines: Four Design Principles to Consider for Every Instructional Audio Design Effort

    ERIC Educational Resources Information Center

    Carter, Curtis W.

    2012-01-01

    This article contends that instructional designers and developers should attend to four particular design principles when creating instructional audio. Support for this view is presented by referencing the limited research that has been done in this area, and by indicating how and why each of the four principles is important to the design process.…

  20. Use of Video and Audio Texts in EFL Listening Test

    ERIC Educational Resources Information Center

    Basal, Ahmet; Gülözer, Kaine; Demir, Ibrahim

    2015-01-01

    The study aims to discover whether audio or video modality in a listening test is more beneficial to test takers. In this study, the posttest-only control group design was utilized and quantitative data were collected in order to measure participant performances concerning two types of modality (audio or video) in a listening test. The…

  1. 106-17 Telemetry Standards Digitized Audio Telemetry Standard Chapter 5

    DTIC Science & Technology

    2017-07-01

    Telemetry Standards, RCC Standard 106-17, Chapter 5: Digitized Audio Telemetry Standard (July 2017). Section 5.8, CVSD Bit Rate Determination, provides a procedure for determining the CVSD bit rate.

  2. Development and Assessment of Web Courses That Use Streaming Audio and Video Technologies.

    ERIC Educational Resources Information Center

    Ingebritsen, Thomas S.; Flickinger, Kathleen

    Iowa State University, through a program called Project BIO (Biology Instructional Outreach), has been using RealAudio technology for about 2 years in college biology courses that are offered entirely via the World Wide Web. RealAudio is a type of streaming media technology that can be used to deliver audio content and a variety of other media…

  3. Hearing You Loud and Clear: Student Perspectives of Audio Feedback in Higher Education

    ERIC Educational Resources Information Center

    Gould, Jill; Day, Pat

    2013-01-01

    The use of audio feedback for students in a full-time community nursing degree course is appraised. The aim of this mixed methods study was to examine student views on audio feedback for written assignments. Questionnaires and a focus group were used to capture student opinion of this pilot project. The majority of students valued audio feedback…

  4. Audio-Visual Perception System for a Humanoid Robotic Head

    PubMed Central

    Viciana-Abad, Raquel; Marfil, Rebeca; Perez-Lorenzo, Jose M.; Bandera, Juan P.; Romero-Garces, Adrian; Reche-Lopez, Pedro

    2014-01-01

    One of the main issues within the field of social robotics is to endow robots with the ability to direct attention to people with whom they are interacting. Different approaches follow bio-inspired mechanisms, merging audio and visual cues to localize a person using multiple sensors. However, most of these fusion mechanisms have been used in fixed systems, such as those used in video-conference rooms, and thus, they may incur difficulties when constrained to the sensors with which a robot can be equipped. Moreover, within the scope of interactive autonomous robots, the benefits of audio-visual attention mechanisms have rarely been evaluated against audio-only or visual-only approaches in real scenarios. Most of the tests conducted have been within controlled environments, at short distances and/or with off-line performance measurements. With the goal of demonstrating the benefit of fusing sensory information with Bayes inference for interactive robotics, this paper presents a system for localizing a person by processing visual and audio data. Moreover, the performance of this system is evaluated and compared with unimodal systems, considering their technical limitations. The experiments show the promise of the proposed approach for the proactive detection and tracking of speakers in a human-robot interactive framework. PMID:24878593
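
    The Bayes-style fusion of a noisy audio cue with a precise visual cue can be sketched on a one-dimensional azimuth grid. The Gaussian likelihoods, their centers, and their widths below are assumptions for illustration, not the paper's sensor models.

```python
import numpy as np

# Candidate azimuth angles (degrees) in front of the robot's head.
angles = np.linspace(-90.0, 90.0, 181)

def gaussian_like(center: float, sigma: float) -> np.ndarray:
    """Unnormalized Gaussian likelihood over the candidate angles."""
    return np.exp(-0.5 * ((angles - center) / sigma) ** 2)

# Hypothetical unimodal estimates of the same speaker: sound localization
# is noisy (wide likelihood), face detection is precise (narrow likelihood).
audio_like = gaussian_like(center=20.0, sigma=15.0)
visual_like = gaussian_like(center=12.0, sigma=4.0)

# Bayes fusion with a flat prior: multiply the likelihoods and normalize.
posterior = audio_like * visual_like
posterior /= posterior.sum()

# The fused estimate lies between the cues, pulled toward the reliable one.
fused_angle = float(angles[np.argmax(posterior)])
```

    Weighting each modality by its reliability in this way is what lets the fused estimate degrade gracefully when one sensor is poor.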

  5. Audio/ Videoconferencing Packages: Low Cost

    ERIC Educational Resources Information Center

    Treblay, Remy; Fyvie, Barb; Koritko, Brenda

    2005-01-01

    A comparison was conducted of "Voxwire MeetingRoom" and "iVocalize" v4.1.0.3, both Web-conferencing products using voice-over-Internet protocol (VoIP) to provide unlimited, inexpensive, international audio communication, and high-quality Web-conferencing fostering collaborative learning. The study used the evaluation criteria used in earlier…

  6. Radioactive Decay: Audio Data Collection

    ERIC Educational Resources Information Center

    Struthers, Allan

    2009-01-01

    Many phenomena generate interesting audible time series. This data can be collected and processed using audio software. The free software package "Audacity" is used to demonstrate the process by recording, processing, and extracting click times from an inexpensive radiation detector. The high quality of the data is demonstrated with a simple…

  7. Say What? The Role of Audio in Multimedia Video

    NASA Astrophysics Data System (ADS)

    Linder, C. A.; Holmes, R. M.

    2011-12-01

    Audio, including interviews, ambient sounds, and music, is a critical-yet often overlooked-part of an effective multimedia video. In February 2010, Linder joined scientists working on the Global Rivers Observatory Project for two weeks of intensive fieldwork in the Congo River watershed. The team's goal was to learn more about how climate change and deforestation are impacting the river system and coastal ocean. Using stills and video shot with a lightweight digital SLR outfit and audio recorded with a pocket-sized sound recorder, Linder documented the trials and triumphs of working in the heart of Africa. Using excerpts from the six-minute Congo multimedia video, this presentation will illustrate how to record and edit an engaging audio track. Topics include interview technique, collecting ambient sounds, choosing and using music, and editing it all together to educate and entertain the viewer.

  8. Effect of Audio vs. Video on Aural Discrimination of Vowels

    ERIC Educational Resources Information Center

    McCrocklin, Shannon

    2012-01-01

    Despite the growing use of media in the classroom, the effects of using of audio versus video in pronunciation teaching has been largely ignored. To analyze the impact of the use of audio or video training on aural discrimination of vowels, 61 participants (all students at a large American university) took a pre-test followed by two training…

  9. Making the Most of Audio. Technology in Language Learning Series.

    ERIC Educational Resources Information Center

    Barley, Anthony

    Prepared for practicing language teachers, this book's aim is to help them make the most of audio, a readily accessible resource. The book shows, with the help of numerous practical examples, how a range of language skills can be developed. Most examples are in French. Chapters cover the following information: (1) making the most of audio (e.g.,…

  10. 37 CFR 202.22 - Acquisition and deposit of unpublished audio and audiovisual transmission programs.

    Code of Federal Regulations, 2011 CFR

    2011-07-01

    ... unpublished audio and audiovisual transmission programs. 202.22 Section 202.22 Patents, Trademarks, and... REGISTRATION OF CLAIMS TO COPYRIGHT § 202.22 Acquisition and deposit of unpublished audio and audiovisual... and copies of unpublished audio and audiovisual transmission programs by the Library of Congress under...

  11. 37 CFR 202.22 - Acquisition and deposit of unpublished audio and audiovisual transmission programs.

    Code of Federal Regulations, 2012 CFR

    2012-07-01

    ... unpublished audio and audiovisual transmission programs. 202.22 Section 202.22 Patents, Trademarks, and... REGISTRATION OF CLAIMS TO COPYRIGHT § 202.22 Acquisition and deposit of unpublished audio and audiovisual... and copies of unpublished audio and audiovisual transmission programs by the Library of Congress under...

  12. 37 CFR 202.22 - Acquisition and deposit of unpublished audio and audiovisual transmission programs.

    Code of Federal Regulations, 2013 CFR

    2013-07-01

    ... unpublished audio and audiovisual transmission programs. 202.22 Section 202.22 Patents, Trademarks, and... REGISTRATION OF CLAIMS TO COPYRIGHT § 202.22 Acquisition and deposit of unpublished audio and audiovisual... and copies of unpublished audio and audiovisual transmission programs by the Library of Congress under...

  13. 37 CFR 202.22 - Acquisition and deposit of unpublished audio and audiovisual transmission programs.

    Code of Federal Regulations, 2014 CFR

    2014-07-01

    ... unpublished audio and audiovisual transmission programs. 202.22 Section 202.22 Patents, Trademarks, and... REGISTRATION OF CLAIMS TO COPYRIGHT § 202.22 Acquisition and deposit of unpublished audio and audiovisual... and copies of unpublished audio and audiovisual transmission programs by the Library of Congress under...

  14. Transitioning from Analog to Digital Audio Recording in Childhood Speech Sound Disorders

    ERIC Educational Resources Information Center

    Shriberg, Lawrence D.; Mcsweeny, Jane L.; Anderson, Bruce E.; Campbell, Thomas F.; Chial, Michael R.; Green, Jordan R.; Hauner, Katherina K.; Moore, Christopher A.; Rusiewicz, Heather L.; Wilson, David L.

    2005-01-01

    Few empirical findings or technical guidelines are available on the current transition from analog to digital audio recording in childhood speech sound disorders. Of particular concern in the present context was whether a transition from analog- to digital-based transcription and coding of prosody and voice features might require re-standardizing…

  15. Defining Audio/Video Redundancy from a Limited Capacity Information Processing Perspective.

    ERIC Educational Resources Information Center

    Lang, Annie

    1995-01-01

    Investigates whether audio/video redundancy improves memory for television messages. Suggests a theoretical framework for classifying previous work and reinterpreting the results. Suggests general support for the notion that redundancy levels affect the capacity requirements of the message, which impact differentially on audio or visual…

  16. A Context-Aware-Based Audio Guidance System for Blind People Using a Multimodal Profile Model

    PubMed Central

    Lin, Qing; Han, Youngjoon

    2014-01-01

    A wearable guidance system is designed to provide context-dependent guidance messages to blind people while they traverse local pathways. The system is composed of three parts: moving scene analysis, walking context estimation and audio message delivery. The combination of a downward-pointing laser scanner and a camera is used to solve the challenging problem of moving scene analysis. By integrating laser data profiles and image edge profiles, a multimodal profile model is constructed to estimate jointly the ground plane, object locations and object types, by using a Bayesian network. The outputs of the moving scene analysis are further employed to estimate the walking context, which is defined as a fuzzy safety level that is inferred through a fuzzy logic model. Depending on the estimated walking context, the audio messages that best suit the current context are delivered to the user in a flexible manner. The proposed system is tested under various local pathway scenes, and the results confirm its efficiency in assisting blind people to attain autonomous mobility. PMID:25302812
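
    The fuzzy safety-level inference can be illustrated with a toy single-input model. The membership functions, rule outputs, and defuzzification below are invented for illustration; the actual system infers safety from the full moving-scene analysis (ground plane, object locations and types), not from one distance value.

```python
def clamp01(x: float) -> float:
    return max(0.0, min(1.0, x))

def tri(x: float, a: float, b: float, c: float) -> float:
    """Triangular membership function: 0 outside (a, c), peak 1 at b."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x < b else (c - x) / (c - b)

def fuzzy_safety(obstacle_dist_m: float) -> float:
    """Map distance to the nearest obstacle to a fuzzy safety level in [0, 1]."""
    # Fuzzification with three membership sets over distance (meters).
    near = clamp01((2.0 - obstacle_dist_m) / 2.0)   # full at 0 m, gone at 2 m
    med = tri(obstacle_dist_m, 0.5, 2.0, 4.0)
    far = clamp01((obstacle_dist_m - 2.0) / 3.0)    # starts at 2 m, full at 5 m
    # Rules as output singletons (near -> 0.1, medium -> 0.5, far -> 0.9),
    # defuzzified by the weighted average of the fired rules.
    return (0.1 * near + 0.5 * med + 0.9 * far) / (near + med + far)
```

    The guidance layer could then pick shorter, more urgent audio messages as the safety level drops.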

  17. Securing Digital Audio using Complex Quadratic Map

    NASA Astrophysics Data System (ADS)

    Suryadi, MT; Satria Gunawan, Tjandra; Satria, Yudi

    2018-03-01

    In this digital era, exchanging data is common and easy, and data are therefore vulnerable to attack and manipulation by unauthorized parties. One data type that is vulnerable to attack is digital audio, so a data-securing method is needed that is both robust and fast. One method that meets these criteria is securing the data with a chaos function; the chaos function used in this research is the complex quadratic map (CQM). For certain parameter values, the key stream generated by the CQM function passes all 15 NIST tests, which means the key stream generated by this CQM is proven to be random. In addition, samples of encrypted digital audio tested with a goodness-of-fit test are proven to be uniform, so audio secured with this method is not vulnerable to frequency-analysis attack. The key space is very large, about 8.1×10^31 possible keys, and the key sensitivity is very small, about 10^-10, so this method is also not vulnerable to brute-force attack. Finally, the processing speed for both encryption and decryption is on average about 450 times faster than the digital audio's duration.
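
    The scheme described, a keystream derived from the complex quadratic map XORed with the audio bytes, might be sketched as below. The parameter values, the renormalization guard, and the byte-extraction rule are assumptions, since the abstract does not specify them.

```python
def cqm_keystream(n_bytes: int, c: complex, z0: complex) -> bytes:
    """Generate a keystream by iterating the complex quadratic map
    z_{k+1} = z_k^2 + c and quantizing the orbit (illustrative scheme)."""
    out = bytearray()
    z = z0
    for _ in range(n_bytes):
        z = z * z + c
        if abs(z) > 2.0:          # keep the orbit bounded
            z /= abs(z)
        # Quantize the real component of the orbit point to one byte.
        out.append(int((abs(z.real) * 1e6) % 256))
    return bytes(out)

def xor_cipher(data: bytes, key: bytes) -> bytes:
    return bytes(d ^ k for d, k in zip(data, key))

# Hypothetical secret key: the CQM parameter c and the seed z0.
c, z0 = complex(-0.75, 0.11), complex(0.1, 0.1)
audio = bytes(range(32))                      # stand-in for PCM audio samples
ks = cqm_keystream(len(audio), c, z0)
cipher = xor_cipher(audio, ks)
plain = xor_cipher(cipher, cqm_keystream(len(cipher), c, z0))
```

    Because XOR is its own inverse, decryption simply regenerates the same keystream from the shared (c, z0) key and applies it again.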

  18. Audio Tracking in Noisy Environments by Acoustic Map and Spectral Signature.

    PubMed

    Crocco, Marco; Martelli, Samuele; Trucco, Andrea; Zunino, Andrea; Murino, Vittorio

    2018-05-01

    A novel method is proposed for generic target tracking by audio measurements from a microphone array. To cope with noisy environments characterized by persistent and high energy interfering sources, a classification map (CM) based on spectral signatures is calculated by means of a machine learning algorithm. Next, the CM is combined with the acoustic map, describing the spatial distribution of sound energy, in order to obtain a cleaned joint map in which contributions from the disturbing sources are removed. A likelihood function is derived from this map and fed to a particle filter yielding the target location estimation on the acoustic image. The method is tested on two real environments, addressing both speaker and vehicle tracking. The comparison with a couple of trackers, relying on the acoustic map only, shows a sharp improvement in performance, paving the way to the application of audio tracking in real challenging environments.
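
    The map-combination step can be sketched on a toy 2-D grid: suppress the energy that the classification map attributes to the interfering source, then weight particles by the cleaned likelihood. The maps, source positions, and the suppression rule (multiplying by 1 − CM) are illustrative assumptions, and the classifier output is known by construction here rather than learned.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy 64x64 acoustic map: a target speaker at grid cell (12, 20) plus a
# louder persistent interferer at (40, 40); values are illustrative.
H = W = 64
yy, xx = np.mgrid[0:H, 0:W]
energy = (np.exp(-((yy - 12) ** 2 + (xx - 20) ** 2) / 50.0)
          + 3.0 * np.exp(-((yy - 40) ** 2 + (xx - 40) ** 2) / 50.0))

# Classification map (CM): probability that a cell's sound matches the
# interferer's spectral signature (from a learned classifier in the paper).
cm = np.exp(-((yy - 40) ** 2 + (xx - 40) ** 2) / 50.0)

# Cleaned joint map: suppress energy attributed to the interferer, then
# turn it into a likelihood over target positions.
joint = energy * (1.0 - cm)
likelihood = joint / joint.sum()

# One particle-filter measurement update against the cleaned likelihood.
particles = rng.uniform(0.0, H, size=(500, 2))
w = likelihood[particles[:, 0].astype(int), particles[:, 1].astype(int)]
w /= w.sum()
estimate = (particles * w[:, None]).sum(axis=0)   # weighted-mean position
```

    On the raw acoustic map the peak sits on the interferer; on the cleaned joint map it moves to the target, which is what lets the particle filter keep lock in a noisy scene.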

  19. Virtual environment display for a 3D audio room simulation

    NASA Technical Reports Server (NTRS)

    Chapin, William L.; Foster, Scott H.

    1992-01-01

    The development of a virtual environment simulation system integrating a 3D acoustic audio model with an immersive 3D visual scene is discussed. The system complements the acoustic model and is specified to: allow the listener to freely move about the space, a room of manipulable size, shape, and audio character, while interactively relocating the sound sources; reinforce the listener's feeling of telepresence in the acoustical environment with visual and proprioceptive sensations; enhance the audio with the graphic and interactive components, rather than overwhelm or reduce it; and serve as a research testbed and technology transfer demonstration. The hardware/software design of two demonstration systems, one installed and one portable, are discussed through the development of four iterative configurations.

  20. Effects of a theory-based audio HIV/AIDS intervention for illiterate rural females in Amhara, Ethiopia.

    PubMed

    Bogale, Gebeyehu W; Boer, Henk; Seydel, Erwin R

    2011-02-01

    In Ethiopia the level of illiteracy in rural areas is very high. In this study, we investigated the effects of an audio HIV/AIDS prevention intervention targeted at rural illiterate females. In the intervention we used social-oriented presentation formats, such as discussion between similar females and role-play. In a pretest and posttest experimental study with an intervention group (n = 210) and control group (n = 210), we investigated the effects on HIV/AIDS knowledge and social cognitions. The intervention led to significant and relevant increases in HIV/AIDS knowledge, self-efficacy, perceived vulnerability to HIV/AIDS infection, response efficacy of condoms and condom use intention. In the intervention group, self-efficacy at posttest was the main determinant of condom use intention, with also a significant contribution of vulnerability. We conclude that audio HIV/AIDS prevention interventions can play an important role in empowering rural illiterate females in the prevention of HIV/AIDS.

  1. Effects of Audio-Visual Information on the Intelligibility of Alaryngeal Speech

    ERIC Educational Resources Information Center

    Evitts, Paul M.; Portugal, Lindsay; Van Dine, Ami; Holler, Aline

    2010-01-01

    Background: There is minimal research on the contribution of visual information on speech intelligibility for individuals with a laryngectomy (IWL). Aims: The purpose of this project was to determine the effects of mode of presentation (audio-only, audio-visual) on alaryngeal speech intelligibility. Method: Twenty-three naive listeners were…

  2. How actions shape perception: learning action-outcome relations and predicting sensory outcomes promote audio-visual temporal binding

    PubMed Central

    Desantis, Andrea; Haggard, Patrick

    2016-01-01

    To maintain a temporally-unified representation of audio and visual features of objects in our environment, the brain recalibrates audio-visual simultaneity. This process allows adjustment for both differences in time of transmission and time for processing of audio and visual signals. In four experiments, we show that the cognitive processes for controlling instrumental actions also have strong influence on audio-visual recalibration. Participants learned that right and left hand button-presses each produced a specific audio-visual stimulus. Following one action the audio preceded the visual stimulus, while for the other action audio lagged vision. In a subsequent test phase, left and right button-press generated either the same audio-visual stimulus as learned initially, or the pair associated with the other action. We observed recalibration of simultaneity only for previously-learned audio-visual outcomes. Thus, learning an action-outcome relation promotes temporal grouping of the audio and visual events within the outcome pair, contributing to the creation of a temporally unified multisensory object. This suggests that learning action-outcome relations and the prediction of perceptual outcomes can provide an integrative temporal structure for our experiences of external events. PMID:27982063

  3. How actions shape perception: learning action-outcome relations and predicting sensory outcomes promote audio-visual temporal binding.

    PubMed

    Desantis, Andrea; Haggard, Patrick

    2016-12-16

    To maintain a temporally-unified representation of audio and visual features of objects in our environment, the brain recalibrates audio-visual simultaneity. This process allows adjustment for both differences in time of transmission and time for processing of audio and visual signals. In four experiments, we show that the cognitive processes for controlling instrumental actions also have strong influence on audio-visual recalibration. Participants learned that right and left hand button-presses each produced a specific audio-visual stimulus. Following one action the audio preceded the visual stimulus, while for the other action audio lagged vision. In a subsequent test phase, left and right button-press generated either the same audio-visual stimulus as learned initially, or the pair associated with the other action. We observed recalibration of simultaneity only for previously-learned audio-visual outcomes. Thus, learning an action-outcome relation promotes temporal grouping of the audio and visual events within the outcome pair, contributing to the creation of a temporally unified multisensory object. This suggests that learning action-outcome relations and the prediction of perceptual outcomes can provide an integrative temporal structure for our experiences of external events.

  4. MedlinePlus FAQ: Is audio description available for videos on MedlinePlus?

    MedlinePlus

    Question: Is audio description available for videos on MedlinePlus? Answer: Audio description of videos helps make the content of videos accessible to ...

  5. Design and Usability Testing of an Audio Platform Game for Players with Visual Impairments

    ERIC Educational Resources Information Center

    Oren, Michael; Harding, Chris; Bonebright, Terri L.

    2008-01-01

    This article reports on the evaluation of a novel audio platform game that creates a spatial, interactive experience via audio cues. A pilot study with players with visual impairments, and usability testing comparing the visual and audio game versions using both sighted players and players with visual impairments, revealed that all the…

  6. Revealing the ecological content of long-duration audio-recordings of the environment through clustering and visualisation.

    PubMed

    Phillips, Yvonne F; Towsey, Michael; Roe, Paul

    2018-01-01

    Audio recordings of the environment are an increasingly important technique to monitor biodiversity and ecosystem function. While the acquisition of long-duration recordings is becoming easier and cheaper, the analysis and interpretation of that audio remains a significant research area. The issue addressed in this paper is the automated reduction of environmental audio data to facilitate ecological investigations. We describe a method that first reduces environmental audio to vectors of acoustic indices, which are then clustered. This can reduce the audio data by six to eight orders of magnitude yet retain useful ecological information. We describe techniques to visualise sequences of cluster occurrence (using for example, diel plots, rose plots) that assist interpretation of environmental audio. Colour coding acoustic clusters allows months and years of audio data to be visualised in a single image. These techniques are useful in identifying and indexing the contents of long-duration audio recordings. They could also play an important role in monitoring long-term changes in species abundance brought about by habitat degradation and/or restoration.
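    The pipeline described above (long-duration audio reduced to per-segment vectors of acoustic indices, which are then clustered) can be sketched in miniature. The index set and the two synthetic "environments" below are illustrative stand-ins, not the paper's actual indices or data; a minimal k-means over the index vectors then separates tonal from broadband segments.

```python
import numpy as np

rng = np.random.default_rng(0)

def acoustic_indices(segment):
    """Reduce one audio segment to a short vector of acoustic indices.
    (Illustrative index set; the paper's exact indices are not reproduced here.)"""
    spec = np.abs(np.fft.rfft(segment))
    p = spec / spec.sum()
    entropy = -np.sum(p * np.log2(p + 1e-12)) / np.log2(p.size)  # spectral entropy, 0..1
    power = float(np.mean(segment ** 2))                          # signal power
    zcr = float(np.mean(np.abs(np.diff(np.sign(segment)))) / 2)   # zero-crossing rate
    return np.array([entropy, power, zcr])

sr = 8000
t = np.arange(sr) / sr
# Two synthetic sound environments: tonal "dawn chorus" vs broadband "wind" segments
tonal = [np.sin(2 * np.pi * 2000 * t) + 0.05 * rng.standard_normal(sr) for _ in range(10)]
windy = [0.5 * rng.standard_normal(sr) for _ in range(10)]
X = np.array([acoustic_indices(s) for s in tonal + windy])   # 20 segments -> 20 x 3 vectors

# Minimal k-means (k = 2) over the index vectors
cent = X[[0, -1]].copy()                                     # deterministic initialisation
for _ in range(20):
    labels = np.argmin(((X[:, None] - cent[None]) ** 2).sum(-1), axis=1)
    cent = np.array([X[labels == j].mean(axis=0) for j in range(2)])

print(labels)  # tonal and broadband segments fall into two distinct clusters
```

In the full method each cluster would be colour-coded so that months of recordings can be viewed in a single image; the reduction here is from 8000 samples per segment down to 3 index values, the same orders-of-magnitude idea as in the abstract.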

  7. Revealing the ecological content of long-duration audio-recordings of the environment through clustering and visualisation

    PubMed Central

    Towsey, Michael; Roe, Paul

    2018-01-01

    Audio recordings of the environment are an increasingly important technique to monitor biodiversity and ecosystem function. While the acquisition of long-duration recordings is becoming easier and cheaper, the analysis and interpretation of that audio remains a significant research area. The issue addressed in this paper is the automated reduction of environmental audio data to facilitate ecological investigations. We describe a method that first reduces environmental audio to vectors of acoustic indices, which are then clustered. This can reduce the audio data by six to eight orders of magnitude yet retain useful ecological information. We describe techniques to visualise sequences of cluster occurrence (using for example, diel plots, rose plots) that assist interpretation of environmental audio. Colour coding acoustic clusters allows months and years of audio data to be visualised in a single image. These techniques are useful in identifying and indexing the contents of long-duration audio recordings. They could also play an important role in monitoring long-term changes in species abundance brought about by habitat degradation and/or restoration. PMID:29494629

  8. Online Instructor's Use of Audio Feedback to Increase Social Presence and Student Satisfaction

    ERIC Educational Resources Information Center

    Portolese Dias, Laura; Trumpy, Robert

    2014-01-01

    This study investigates the impact of written group feedback, versus audio feedback, based upon four student satisfaction measures in the online classroom environment. Undergraduate students in the control group were provided both individual written feedback and group written feedback, while undergraduate students in the experimental treatment…

  9. Deutsch Durch Audio-Visuelle Methode: An Audio-Lingual-Oral Approach to the Teaching of German.

    ERIC Educational Resources Information Center

    Dickinson Public Schools, ND. Instructional Media Center.

    This teaching guide, designed to accompany Chilton's "Deutsch Durch Audio-Visuelle Methode" for German 1 and 2 in a three-year secondary school program, focuses major attention on the operational plan of the program and a student orientation unit. A section on teaching a unit discusses four phases: (1) presentation, (2) explanation, (3)…

  10. Audio Restoration

    NASA Astrophysics Data System (ADS)

    Esquef, Paulo A. A.

    The first reproducible recording of human voice was made in 1877 on a tinfoil cylinder phonograph devised by Thomas A. Edison. Since then, much effort has been expended to find better ways to record and reproduce sounds. By the mid-1920s, the first electrical recordings appeared and gradually took over purely acoustic recordings. The development of electronic computers, in conjunction with the ability to record data onto magnetic or optical media, culminated in the standardization of the compact disc format in 1980. Nowadays, digital technology is applied to several audio applications, not only to improve the quality of modern and old recording/reproduction techniques, but also to trade off sound quality for reduced storage space and transmission capacity requirements.

  11. Active Learning for Automatic Audio Processing of Unwritten Languages (ALAPUL)

    DTIC Science & Technology

    2016-07-01

    AFRL-RH-WP-TR-2016-0074: Active Learning for Automatic Audio Processing of Unwritten Languages (ALAPUL). Dimitra Vergyri; Andreas Kathol; Wen Wang… June 2015-July 2016… SUMMARY: The goal of the project was to investigate development of an automatic spoken language processing (ASLP) system

  12. Guidelines for the Production of Audio Materials for Print Handicapped Readers.

    ERIC Educational Resources Information Center

    National Library of Australia, Canberra.

    Procedural guidelines developed by the Audio Standards Committee of the National Library of Australia to help improve the overall quality of production of audio materials for visually handicapped readers are presented. This report covers the following areas: selection of narrators and the narration itself; copyright; recording of books, magazines,…

  13. Rethinking the Red Ink: Audio-Feedback in the ESL Writing Classroom.

    ERIC Educational Resources Information Center

    Johanson, Robert

    1999-01-01

    This paper describes audio-feedback as a teaching method for English-as-a-Second-Language (ESL) writing classes. Using this method, writing instructors respond to students' compositions by recording their comments onto an audiocassette, then returning the paper and cassette to the students. The first section describes audio-feedback and explains…

  14. The Introduction and Refinement of the Assessment of Digitally Recorded Audio Presentations

    ERIC Educational Resources Information Center

    Sinclair, Stefanie

    2016-01-01

    This case study critically evaluates benefits and challenges of a form of assessment included in a final year undergraduate Religious Studies Open University module, which combines a written essay task with a digital audio recording of a short oral presentation. Based on the analysis of student and tutor feedback and sample assignments, this study…

  15. Effectiveness and Comparison of Various Audio Distraction Aids in Management of Anxious Dental Paediatric Patients.

    PubMed

    Navit, Saumya; Johri, Nikita; Khan, Suleman Abbas; Singh, Rahul Kumar; Chadha, Dheera; Navit, Pragati; Sharma, Anshul; Bahuguna, Rachana

    2015-12-01

    Dental anxiety is a widespread phenomenon and a concern for paediatric dentistry. The inability of children to deal with threatening dental stimuli often manifests as behaviour management problems. Nowadays, the use of non-aversive behaviour management techniques is more advocated, as these are more acceptable to parents, patients and practitioners. The present study was therefore conducted to find out which audio aid was the most effective in managing anxious children. The aim was to compare the efficacy of audio-distraction aids in reducing the anxiety of paediatric patients undergoing various stressful and invasive dental procedures; the objectives were to ascertain whether audio distraction is an effective means of anxiety management and which type of audio aid is the most effective. A total of 150 children, aged 6 to 12 years, randomly selected from the patients who came for their first dental check-up, were placed in five groups of 30 each: a control group, an instrumental music group, a musical nursery rhymes group, a movie songs group and an audio stories group. The control group was treated under a normal set-up, while the audio groups listened to their respective audio presentations during treatment. Each child had four visits. In each visit, after the procedure was completed, the anxiety levels of the children were measured by Venham's Picture Test (VPT), Venham's Clinical Rating Scale (VCRS) and pulse rate measurement with a pulse oximeter. A significant difference was seen between all the groups for mean pulse rate, with an increase in subsequent visits; however, no significant difference was seen in the VPT and VCRS scores between the groups. Audio aids in general reduced anxiety in comparison to the control group, and the most significant reduction in anxiety level was observed in the audio stories group.
The conclusion derived from the present study was that audio distraction

  16. Musical stairs: the impact of audio feedback during stair-climbing physical therapies for children.

    PubMed

    Khan, Ajmal; Biddiss, Elaine

    2015-05-01

    Enhanced biofeedback during rehabilitation therapies has the potential to provide a therapeutic environment optimally designed for neuroplasticity. This study investigates the impact of audio feedback on the achievement of a targeted therapeutic goal, namely, use of reciprocal steps. Stair-climbing therapy sessions conducted with and without audio feedback were compared in a randomized AB/BA cross-over study design. Seventeen children, aged 4-7 years, with various diagnoses participated. Reports from the participants, therapists, and a blinded observer were collected to evaluate achievement of the therapeutic goal, motivation and enjoyment during the therapy sessions. Audio feedback resulted in a 5.7% increase (p = 0.007) in reciprocal steps. Levels of participant enjoyment increased significantly (p = 0.031) and motivation was reported by child participants and therapists to be greater when audio feedback was provided. These positive results indicate that audio feedback may influence the achievement of therapeutic goals and promote enjoyment and motivation in young patients engaged in rehabilitation therapies. This study lays the groundwork for future research to determine the long term effects of audio feedback on functional outcomes of therapy. Stair-climbing is an important mobility skill for promoting independence and activities of daily life and is a key component of rehabilitation therapies for physically disabled children. Provision of audio feedback during stair-climbing therapies for young children may increase their achievement of a targeted therapeutic goal (i.e., use of reciprocal steps). Children's motivation and enjoyment of the stair-climbing therapy was enhanced when audio feedback was provided.

  17. Quick Response (QR) Codes for Audio Support in Foreign Language Learning

    ERIC Educational Resources Information Center

    Vigil, Kathleen Murray

    2017-01-01

    This study explored the potential benefits and barriers of using quick response (QR) codes as a means by which to provide audio materials to middle-school students learning Spanish as a foreign language. Eleven teachers of Spanish to middle-school students created transmedia materials containing QR codes linking to audio resources. Students…

  18. Validation of a digital audio recording method for the objective assessment of cough in the horse.

    PubMed

    Duz, M; Whittaker, A G; Love, S; Parkin, T D H; Hughes, K J

    2010-10-01

    To validate the use of digital audio recording and analysis for quantification of coughing in horses. Part A: Nine simultaneous digital audio and video recordings were collected individually from seven stabled horses over a 1 h period using a digital audio recorder attached to the halter. Audio files were analysed using audio analysis software. Video and audio recordings were analysed for cough count and timing by two blinded operators on two occasions using a randomised study design for determination of intra-operator and inter-operator agreement. Part B: Seventy-eight hours of audio recordings obtained from nine horses were analysed once by two blinded operators to assess inter-operator repeatability on a larger sample. Part A: There was complete agreement between audio and video analyses and inter- and intra-operator analyses. Part B: There was >97% agreement between operators on number and timing of 727 coughs recorded over 78 h. The results of this study suggest that the cough monitor methodology used has excellent sensitivity and specificity for the objective assessment of cough in horses and intra- and inter-operator variability of recorded coughs is minimal. Crown Copyright 2010. Published by Elsevier India Pvt Ltd. All rights reserved.

  19. NFL Films audio, video, and film production facilities

    NASA Astrophysics Data System (ADS)

    Berger, Russ; Schrag, Richard C.; Ridings, Jason J.

    2003-04-01

    The new NFL Films 200,000 sq. ft. headquarters is home to the critically acclaimed film production that preserves the NFL's visual legacy week-to-week during the football season, and is also the technical plant that processes and archives football footage from the earliest recorded media to the current network broadcasts. No other company in the country shoots more film than NFL Films, and the inclusion of cutting-edge video and audio formats demands that their technical spaces continually integrate the latest in the ever-changing world of technology. This facility houses a staggering array of acoustically sensitive spaces where music and sound are equal partners with the visual medium. Over 90,000 sq. ft. of sound-critical technical space comprises an array of sound stages, music scoring stages, audio control rooms, music writing rooms, recording studios, mixing theaters, video production control rooms, editing suites, and a screening theater. Every production control space in the building is designed to monitor and produce multichannel surround-sound audio. An overview of the architectural and acoustical design challenges encountered for each sophisticated listening, recording, viewing, editing, and sound-critical environment will be discussed.

  20. Audio Control Handbook For Radio and Television Broadcasting. Third Revised Edition.

    ERIC Educational Resources Information Center

    Oringel, Robert S.

    Audio control is the operation of all the types of sound equipment found in the studios and control rooms of a radio or television station. Written in a nontechnical style for beginners, the book explains thoroughly the operation of all types of audio equipment. Diagrams and photographs of commercial consoles, microphones, turntables, and tape…

  1. Combining Video, Audio and Lexical Indicators of Affect in Spontaneous Conversation via Particle Filtering

    PubMed Central

    Savran, Arman; Cao, Houwei; Shah, Miraj; Nenkova, Ani; Verma, Ragini

    2013-01-01

    We present experiments on fusing facial video, audio and lexical indicators for affect estimation during dyadic conversations. We use temporal statistics of texture descriptors extracted from facial video, a combination of various acoustic features, and lexical features to create regression based affect estimators for each modality. The single modality regressors are then combined using particle filtering, by treating these independent regression outputs as measurements of the affect states in a Bayesian filtering framework, where previous observations provide prediction about the current state by means of learned affect dynamics. Tested on the Audio-visual Emotion Recognition Challenge dataset, our single modality estimators achieve substantially higher scores than the official baseline method for every dimension of affect. Our filtering-based multi-modality fusion achieves correlation performance of 0.344 (baseline: 0.136) and 0.280 (baseline: 0.096) for the fully continuous and word level sub challenges, respectively. PMID:25300451
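    The fusion scheme described here (independent per-modality regressors treated as noisy measurements of a latent affect state, combined by a particle filter with learned dynamics) can be sketched on synthetic data. The AR(1) dynamics, the noise levels and the two stand-in "modalities" below are assumptions chosen for illustration, not the paper's models or features.

```python
import numpy as np

rng = np.random.default_rng(3)

# Latent affect state with simple AR(1) dynamics (a stand-in for "learned affect dynamics")
a, q, r = 0.95, 0.1, 0.3          # state transition, process noise, measurement noise
T = 100
x = np.zeros(T)
for t in range(1, T):
    x[t] = a * x[t - 1] + q * rng.standard_normal()

# Two single-modality "regressors" = noisy measurements of the state (audio/video stand-ins)
z_audio = x + r * rng.standard_normal(T)
z_video = x + r * rng.standard_normal(T)

# Particle filter: predict with the dynamics, weight by both modality measurements, resample
P = 500
parts = rng.standard_normal(P)
est = np.zeros(T)
for t in range(T):
    parts = a * parts + q * rng.standard_normal(P)
    w = (np.exp(-0.5 * ((z_audio[t] - parts) / r) ** 2)
         * np.exp(-0.5 * ((z_video[t] - parts) / r) ** 2))
    w /= w.sum()
    est[t] = np.sum(w * parts)                 # posterior-mean affect estimate
    parts = parts[rng.choice(P, P, p=w)]       # multinomial resampling

rmse_fused = np.sqrt(np.mean((est - x) ** 2))
rmse_audio = np.sqrt(np.mean((z_audio - x) ** 2))
print(rmse_fused, rmse_audio)  # the fused estimate tracks the state more closely
```

The Bayesian filtering framework in the abstract works the same way: each modality's regression output enters as a measurement likelihood, and the learned dynamics supply the prediction step.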

  2. Combining Video, Audio and Lexical Indicators of Affect in Spontaneous Conversation via Particle Filtering.

    PubMed

    Savran, Arman; Cao, Houwei; Shah, Miraj; Nenkova, Ani; Verma, Ragini

    2012-01-01

    We present experiments on fusing facial video, audio and lexical indicators for affect estimation during dyadic conversations. We use temporal statistics of texture descriptors extracted from facial video, a combination of various acoustic features, and lexical features to create regression based affect estimators for each modality. The single modality regressors are then combined using particle filtering, by treating these independent regression outputs as measurements of the affect states in a Bayesian filtering framework, where previous observations provide prediction about the current state by means of learned affect dynamics. Tested on the Audio-visual Emotion Recognition Challenge dataset, our single modality estimators achieve substantially higher scores than the official baseline method for every dimension of affect. Our filtering-based multi-modality fusion achieves correlation performance of 0.344 (baseline: 0.136) and 0.280 (baseline: 0.096) for the fully continuous and word level sub challenges, respectively.

  3. Image Steganography In Securing Sound File Using Arithmetic Coding Algorithm, Triple Data Encryption Standard (3DES) and Modified Least Significant Bit (MLSB)

    NASA Astrophysics Data System (ADS)

    Nasution, A. B.; Efendi, S.; Suwilo, S.

    2018-04-01

    The amount of data inserted, in the form of audio samples embedded 8 bits at a time with the LSB algorithm, affects the PSNR value, resulting in changes to the quality of the image after insertion (fidelity). In this research, audio samples are therefore inserted 5 bits at a time using the MLSB algorithm to reduce the amount of inserted data; beforehand, the audio samples are compressed with the Arithmetic Coding algorithm to reduce file size. The audio samples are also encrypted with the Triple DES algorithm for better security. The result of this research is a PSNR value of more than 50 dB, from which it can be concluded that the image quality remains good, since the PSNR value exceeds 40 dB.
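    A minimal sketch of the plain-LSB baseline and the PSNR fidelity measure discussed in this abstract, using a random stand-in cover image and payload (the MLSB bit placement, Arithmetic Coding compression and 3DES encryption from the paper are not reproduced):

```python
import numpy as np

rng = np.random.default_rng(1)

def embed_lsb(cover, bits, n_lsb=1):
    """Replace the n_lsb least-significant bits of each pixel with payload bits.
    (Plain LSB shown for illustration; the paper's MLSB variant places bits differently.)"""
    flat = cover.flatten()                                    # copy of the pixel array
    per_pixel = bits[: flat.size * n_lsb].reshape(-1, n_lsb)
    vals = (per_pixel @ (1 << np.arange(n_lsb)[::-1])).astype(np.uint8)
    mask = np.uint8((0xFF >> n_lsb) << n_lsb)                 # clears the low bits
    flat[: vals.size] = (flat[: vals.size] & mask) | vals
    return flat.reshape(cover.shape)

def psnr(a, b):
    """Peak signal-to-noise ratio between two 8-bit images, in dB."""
    mse = np.mean((a.astype(float) - b.astype(float)) ** 2)
    return 10 * np.log10(255.0 ** 2 / mse)

cover = rng.integers(0, 256, size=(64, 64), dtype=np.uint8)   # stand-in cover image
payload = rng.integers(0, 2, size=64 * 64, dtype=np.uint8)    # payload bit-stream
stego = embed_lsb(cover, payload, n_lsb=1)

print(psnr(cover, stego))  # ≈ 51 dB, above the 40 dB fidelity threshold cited in the abstract
```

Embedding more bits per pixel raises capacity but lowers PSNR, which is the capacity/fidelity trade-off the paper addresses by reducing both the bit depth and the compressed payload size.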

  4. Three dimensional audio versus head down TCAS displays

    NASA Technical Reports Server (NTRS)

    Begault, Durand R.; Pittman, Marc T.

    1994-01-01

    The advantage of a head up auditory display was evaluated in an experiment designed to measure and compare the acquisition time for capturing visual targets under two conditions: Standard head down traffic collision avoidance system (TCAS) display, and three-dimensional (3-D) audio TCAS presentation. Ten commercial airline crews were tested under full mission simulation conditions at the NASA Ames Crew-Vehicle Systems Research Facility Advanced Concepts Flight Simulator. Scenario software generated targets corresponding to aircraft which activated a 3-D aural advisory or a TCAS advisory. Results showed a significant difference in target acquisition time between the two conditions, favoring the 3-D audio TCAS condition by 500 ms.

  5. Providing Students with Formative Audio Feedback

    ERIC Educational Resources Information Center

    Brearley, Francis Q.; Cullen, W. Rod

    2012-01-01

    The provision of timely and constructive feedback is increasingly challenging for busy academics. Ensuring effective student engagement with feedback is equally difficult. Increasingly, studies have explored provision of audio recorded feedback to enhance effectiveness and engagement with feedback. Few, if any, of these focus on purely formative…

  6. Writing on wet paper

    NASA Astrophysics Data System (ADS)

    Fridrich, Jessica; Goljan, Miroslav; Lisonek, Petr; Soukal, David

    2005-03-01

    In this paper, we show that the communication channel known as writing in memory with defective cells is a relevant information-theoretical model for a specific case of passive warden steganography when the sender embeds a secret message into a subset C of the cover object X without sharing the selection channel C with the recipient. The set C could be arbitrary, determined by the sender from the cover object using a deterministic, pseudo-random, or a truly random process. We call this steganography "writing on wet paper" and realize it using low-density random linear codes with the encoding step based on the LT process. The importance of writing on wet paper for covert communication is discussed within the context of adaptive steganography and perturbed quantization steganography. Heuristic arguments supported by tests using blind steganalysis indicate that the wet paper steganography provides improved steganographic security for embedding in JPEG images and is less vulnerable to attacks when compared to existing methods with shared selection channels.
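    The embedding principle can be illustrated with a toy syndrome-coding example: the sender may flip only the "dry" positions of the cover, yet the recipient recovers the message from the shared parity-check matrix alone, never learning which positions were changeable. The small dense matrix and the seeded identity block (which guarantees the toy system is solvable) are simplifications; the paper's actual construction uses low-density random linear codes with LT-process encoding.

```python
import numpy as np

rng = np.random.default_rng(2)

def solve_gf2(A, b):
    """Solve A v = b over GF(2) by Gauss-Jordan elimination (a solution must exist)."""
    A = A.copy().astype(np.uint8) % 2
    b = b.copy().astype(np.uint8) % 2
    m, n = A.shape
    v = np.zeros(n, dtype=np.uint8)
    row, pivots = 0, []
    for col in range(n):
        hits = np.nonzero(A[row:, col])[0]
        if hits.size == 0:
            continue
        r = row + hits[0]
        A[[row, r]] = A[[r, row]]                # bring a pivot into place
        b[[row, r]] = b[[r, row]]
        for r2 in range(m):                      # eliminate the column elsewhere
            if r2 != row and A[r2, col]:
                A[r2] ^= A[row]
                b[r2] ^= b[row]
        pivots.append((row, col))
        row += 1
        if row == m:
            break
    for r, c in pivots:                          # free variables stay zero
        v[c] = b[r]
    return v

n, k = 20, 8                                       # cover bits, message bits
H = rng.integers(0, 2, (k, n), dtype=np.uint8)     # parity-check matrix shared with recipient
x = rng.integers(0, 2, n, dtype=np.uint8)          # cover object's bit-plane
msg = rng.integers(0, 2, k, dtype=np.uint8)        # secret message

dry = np.sort(rng.choice(n, 12, replace=False))    # positions the sender may modify
H[:, dry[:k]] = np.eye(k, dtype=np.uint8)          # toy guarantee that a solution exists

# Sender: flip a subset of dry positions so that H y = msg (mod 2)
delta = solve_gf2(H[:, dry], msg ^ ((H @ x) % 2).astype(np.uint8))
y = x.copy()
y[dry] ^= delta

# Recipient: extracts msg from y and H alone
print(((H @ y) % 2 == msg).all())  # True
```

The essential asymmetry of writing on wet paper is visible here: the selection channel `dry` is used only on the sender's side, while extraction is a fixed matrix multiplication.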

  7. Audio-visual sensory deprivation degrades visuo-tactile peri-personal space.

    PubMed

    Noel, Jean-Paul; Park, Hyeong-Dong; Pasqualini, Isabella; Lissek, Herve; Wallace, Mark; Blanke, Olaf; Serino, Andrea

    2018-05-01

    Self-perception is scaffolded upon the integration of multisensory cues on the body, the space surrounding the body (i.e., the peri-personal space; PPS), and from within the body. We asked whether reducing information available from external space would change: PPS, interoceptive accuracy, and self-experience. Twenty participants were exposed to 15 min of audio-visual deprivation and performed: (i) a visuo-tactile interaction task measuring their PPS; (ii) a heartbeat perception task measuring interoceptive accuracy; and (iii) a series of questionnaires related to self-perception and mental illness. These tasks were carried out in two conditions: while exposed to a standard sensory environment and under a condition of audio-visual deprivation. Results suggest that while PPS becomes ill defined after audio-visual deprivation, interoceptive accuracy is unaltered at a group-level, with some participants improving and some worsening in interoceptive accuracy. Interestingly, correlational individual differences analyses revealed that changes in PPS after audio-visual deprivation were related to interoceptive accuracy and self-reports of "unusual experiences" on an individual subject basis. Taken together, the findings argue for a relationship between the malleability of PPS, interoceptive accuracy, and an inclination toward aberrant ideation often associated with mental illness. Copyright © 2018. Published by Elsevier Inc.

  8. The relationship between basic audio quality and overall listening experience.

    PubMed

    Schoeffler, Michael; Herre, Jürgen

    2016-09-01

    Basic audio quality (BAQ) is a well-known perceptual attribute, which is rated in various listening test methods to measure the performance of audio systems. Unfortunately, when it comes to purchasing audio systems, BAQ might not have a significant influence on the customers' buying decisions since other factors, like brand loyalty, might be more important. In contrast to BAQ, overall listening experience (OLE) is an affective attribute which incorporates all aspects that are important to an individual assessor, including his or her preference for music genre and audio quality. In this work, the relationship between BAQ and OLE is investigated in more detail. To this end, an experiment was carried out, in which participants rated the BAQ and the OLE of music excerpts with different timbral and spatial degradations. In a between-group-design procedure, participants were assigned into two groups, in each of which a different set of stimuli was rated. The results indicate that rating of both attributes, BAQ and OLE, leads to similar rankings, even if a different set of stimuli is rated. In contrast to the BAQ ratings, which were more influenced by timbral than spatial degradations, the OLE ratings were almost equally influenced by timbral and spatial degradations.

  9. Audio Feedback: Richer Language but No Measurable Impact on Student Performance

    ERIC Educational Resources Information Center

    Chalmers, Charlotte; MacCallum, Janis; Mowat, Elaine; Fulton, Norma

    2014-01-01

    Audio feedback has been shown to be popular and well received by students. However, there is little published work to indicate how effective audio feedback is in improving student performance. Sixty students from a first year science degree agreed to take part in the study; thirty were randomly assigned to receive written feedback on coursework,…

  10. Designing between Pedagogies and Cultures: Audio-Visual Chinese Language Resources for Australian Schools

    ERIC Educational Resources Information Center

    Yuan, Yifeng; Shen, Huizhong

    2016-01-01

    This design-based study examines the creation and development of audio-visual Chinese language teaching and learning materials for Australian schools by incorporating users' feedback and content writers' input that emerged in the designing process. Data were collected from workshop feedback of two groups of Chinese-language teachers from primary…

  11. Synchronized personalized music audio-playlists to improve adherence to physical activity among patients participating in a structured exercise program: a proof-of-principle feasibility study.

    PubMed

    Alter, David A; O'Sullivan, Mary; Oh, Paul I; Redelmeier, Donald A; Marzolini, Susan; Liu, Richard; Forhan, Mary; Silver, Michael; Goodman, Jack M; Bartel, Lee R

    2015-01-01

    Preference-based tempo-pace synchronized music has been shown to reduce perceived physical activity exertion and improve exercise performance. The extent to which such strategies can improve adherence to physical activity remains unknown. The objective of the study was to explore the feasibility and efficacy of tempo-pace synchronized preference-based music audio-playlists on adherence to physical activity among cardiovascular disease patients participating in cardiac rehabilitation. Thirty-four cardiac rehabilitation patients were randomly allocated to one of two strategies: (1) a no-music usual-care control and (2) tempo-pace synchronized audio-devices with personalized music playlists plus usual-care. All songs uploaded onto audio-playlist devices took into account patient personal music genre and artist preferences. However, actual song selection was restricted to music whose tempos approximated patients' prescribed exercise walking/running pace (steps per minute) to achieve tempo-pace synchrony. Patients allocated to audio-music playlists underwent further randomization in which half of the patients received songs that were sonically enhanced with rhythmic auditory stimulation (RAS) to accentuate tempo-pace synchrony, whereas the other half did not. RAS was achieved through blinded rhythmic sonic-enhancements applied manually to songs within individuals' music playlists. The primary outcome consisted of the weekly volume of physical activity undertaken over 3 months as determined by tri-axial accelerometers. Statistical methods employed an intention-to-treat and repeated-measures design. Patients randomized to personalized audio-playlists with tempo-pace synchrony achieved higher weekly volumes of physical activity than did their non-music usual-care comparators (475.6 min vs. 370.2 min, P < 0.001). Improvements in weekly physical activity volumes among audio-playlist recipients were driven by those randomized to the RAS group, which attained weekly
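    The song-selection constraint described in the abstract (restrict the playlist to music whose tempo approximates the prescribed step cadence) is easy to sketch. The catalogue, the ±5% tolerance and the half/double-tempo matching below are hypothetical illustrations, not the study's actual selection procedure.

```python
# Hypothetical song catalogue as (title, tempo in BPM); cadence in steps per minute
songs = [("Song A", 92), ("Song B", 118), ("Song C", 121), ("Song D", 140)]
cadence = 120

def tempo_matched(songs, cadence, tol=0.05):
    """Keep songs whose tempo lands within ±tol of the cadence,
    allowing half- or double-tempo matches (stepping on every other beat, etc.)."""
    out = []
    for title, bpm in songs:
        for m in (0.5, 1.0, 2.0):
            if abs(bpm * m - cadence) / cadence <= tol:
                out.append(title)
                break
    return out

print(tempo_matched(songs, cadence))  # ['Song B', 'Song C']
```

A filter like this preserves the patient's genre and artist preferences (the catalogue is built from them first) while enforcing tempo-pace synchrony at selection time.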

  12. 16 CFR 307.8 - Requirements for disclosure in audiovisual and audio advertising.

    Code of Federal Regulations, 2010 CFR

    2010-01-01

    ... 16 Commercial Practices 1 2010-01-01 2010-01-01 false Requirements for disclosure in audiovisual and audio advertising. 307.8 Section 307.8 Commercial Practices FEDERAL TRADE COMMISSION REGULATIONS... ACT OF 1986 Advertising Disclosures § 307.8 Requirements for disclosure in audiovisual and audio...

  13. [Development of Audio Indicator System for Respiratory Dynamic CT Imaging].

    PubMed

    Muramatsu, Shun; Moriya, Hiroshi; Tsukagoshi, Shinsuke; Yamada, Norikazu

    We created a device that can convey a radiological technologist's voice to a subject during CT scanning. Dynamic respiratory CT was performed for 149 lung cancer patients: 92 cases with this device and the remainder without it. The respiratory cycle and respiratory amplitude were analysed from the lung density. A stable respiratory cycle was obtained by using the audio indicator system. The audio indicator system is useful for respiratory dynamic CT.

  14. An Audio Jack-Based Electrochemical Impedance Spectroscopy Sensor for Point-of-Care Diagnostics.

    PubMed

    Jiang, Haowei; Sun, Alex; Venkatesh, A G; Hall, Drew A

    2017-02-01

    Portable and easy-to-use point-of-care (POC) diagnostic devices hold high promise for dramatically improving public health and wellness. In this paper, we present a mobile health (mHealth) immunoassay platform based on audio jack embedded devices, such as smartphones and laptops, that uses electrochemical impedance spectroscopy (EIS) to detect binding of target biomolecules. Compared to other biomolecular detection tools, this platform is intended to be used as a plug-and-play peripheral that reuses existing hardware in the mobile device and does not require an external battery, thereby improving upon its convenience and portability. Experimental data using a passive circuit network to mimic an electrochemical cell demonstrate that the device performs comparably to laboratory grade instrumentation with 0.3% and 0.5° magnitude and phase error, respectively, over a 17 Hz to 17 kHz frequency range. The measured power consumption is 2.5 mW with a dynamic range of 60 dB. This platform was verified by monitoring the real-time formation of a NeutrAvidin self-assembled monolayer (SAM) on a gold electrode demonstrating the potential for POC diagnostics.
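    The core EIS measurement (excite the cell with a test tone, then estimate impedance magnitude and phase from the voltage and current waveforms) can be sketched with synchronous detection on a simulated series R-C cell. The component values, the 1 kHz test frequency and the lock-in approach below are illustrative assumptions, not details taken from the paper.

```python
import numpy as np

sr = 44100                         # audio sample rate
f, N = 1000.0, 4410                # 1 kHz test tone, exactly 100 cycles
t = np.arange(N) / sr
w = 2 * np.pi * f

# Simulated electrochemical cell: series R-C network, Z(f) = R + 1/(jwC)
R, C = 1e3, 100e-9
Z_true = R + 1 / (1j * w * C)

v = np.sin(w * t)                                               # excitation (audio out)
i = np.abs(1 / Z_true) * np.sin(w * t + np.angle(1 / Z_true))   # current response (audio in)

# Lock-in style synchronous detection: correlate with a complex reference, then Z = V / I
ref = np.exp(-1j * w * t)
V = 2 / N * np.sum(v * ref)
I = 2 / N * np.sum(i * ref)
Z_est = V / I

print(abs(Z_est), np.degrees(np.angle(Z_est)))  # ≈ 1880 ohms at ≈ -57.9 degrees
```

Sweeping `f` across the audio band and repeating this estimate at each frequency yields the impedance spectrum; using an integer number of cycles per measurement keeps the correlation free of leakage.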

  15. Effects of a Theory-Based Audio HIV/AIDS Intervention for Illiterate Rural Females in Amhara, Ethiopia

    ERIC Educational Resources Information Center

    Bogale, Gebeyehu W.; Boer, Henk; Seydel, Erwin R.

    2011-01-01

    In Ethiopia the level of illiteracy in rural areas is very high. In this study, we investigated the effects of an audio HIV/AIDS prevention intervention targeted at rural illiterate females. In the intervention we used social-oriented presentation formats, such as discussion between similar females and role-play. In a pretest and posttest…

  16. Audio-video decision support for patients: the documentary genre as a basis for decision aids.

    PubMed

    Volandes, Angelo E; Barry, Michael J; Wood, Fiona; Elwyn, Glyn

    2013-09-01

    Decision support tools are increasingly using audio-visual materials. However, disagreement exists about the use of audio-visual materials as they may be subjective and biased. This is a literature review of the major texts for documentary film studies to extrapolate issues of objectivity and bias from film to decision support tools. The key features of documentary films are that they attempt to portray real events and that the attempted reality is always filtered through the lens of the filmmaker. The same key features can be said of decision support tools that use audio-visual materials. Three concerns arising from documentary film studies as they apply to the use of audio-visual materials in decision support tools include whose perspective matters (stakeholder bias), how to choose among audio-visual materials (selection bias) and how to ensure objectivity (editorial bias). Decision science needs to start a debate about how audio-visual materials are to be used in decision support tools. Simply because audio-visual materials may be subjective and open to bias does not mean that we should not use them. Methods need to be found to ensure consensus around balance and editorial control, such that audio-visual materials can be used. © 2011 John Wiley & Sons Ltd.

  17. Informed spectral analysis: audio signal parameter estimation using side information

    NASA Astrophysics Data System (ADS)

    Fourer, Dominique; Marchand, Sylvain

    2013-12-01

    Parametric models are of great interest for representing and manipulating sounds. However, the quality of the resulting signals depends on the precision of the parameters. When the signals are available, these parameters can be estimated, but the presence of noise decreases the resulting precision of the estimation. Furthermore, the Cramér-Rao bound gives the minimal error reachable with the best estimator, which can be insufficient for demanding applications. These limitations can be overcome by the coding approach, which consists of directly transmitting the parameters with the best precision using the minimal bitrate. However, this approach does not take advantage of the information provided by estimation from the signal, and it may require a larger bitrate and break compatibility with existing file formats. The purpose of this article is to propose a compromise, called the 'informed approach,' which combines analysis with (coded) side information in order to increase the precision of parameter estimation at a lower bitrate than pure coding approaches, the audio signal being known. Thus, the analysis problem is presented in a coder/decoder configuration: the side information is computed and inaudibly embedded into the mixture signal at the coder; at the decoder, the extra information is extracted and used to assist the analysis process. This study applies the approach to audio spectral analysis using sinusoidal modeling, a well-known model with practical applications for which theoretical bounds have been calculated. This work aims at uncovering new approaches for audio quality-based applications. It provides a solution for challenging problems like active listening of music, source separation, and realistic sound transformations.
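The core of the informed approach is that the coder runs the same estimator the decoder will run, and transmits only the quantized residual between the true parameter and that estimate, which needs far fewer bits than coding the parameter outright. A minimal sketch of that idea, with hypothetical names (the paper's actual sinusoidal model and inaudible embedding are far more elaborate):

```python
def quantize(x, step):
    """Uniform scalar quantizer; the step size sets the side-info bitrate."""
    return round(x / step) * step

def encode_side_info(true_param, decoder_estimate, step=0.1):
    """Coder side: transmit only the quantized estimation residual.
    Small residuals need fewer bits than the full parameter."""
    return quantize(true_param - decoder_estimate, step)

def informed_estimate(decoder_estimate, residual):
    """Decoder side: correct the blind (signal-only) estimate with the
    received residual."""
    return decoder_estimate + residual
```

For example, a blind frequency estimate of 440.7 Hz for a true 440.0 Hz partial is corrected to within half a quantizer step, beating the noise-limited (Cramér-Rao) precision at a fraction of the bitrate of coding the frequency itself.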

  18. Real-Time Transmission and Storage of Video, Audio, and Health Data in Emergency and Home Care Situations

    NASA Astrophysics Data System (ADS)

    Barbieri, Ivano; Lambruschini, Paolo; Raggio, Marco; Stagnaro, Riccardo

    2007-12-01

    The increasing availability of bandwidth on wireless links, network integration, and affordable computational power on fixed and mobile platforms now allows audio and video data to be handled at a quality suitable for medical applications. These information streams can support both continuous monitoring and emergency situations. According to this scenario, the authors have developed and implemented the mobile communication system described in this paper. The system is based on the ITU-T H.323 multimedia terminal recommendation, suitable for real-time data/video/audio and telemedical applications. The video and audio codecs, respectively H.264 and G.723.1, were implemented and optimized to obtain high performance on the system's target processors. Offline media streaming storage and retrieval functionalities were supported by integrating a relational database in the hospital central system. The system is based on low-cost consumer technologies such as general packet radio service (GPRS) and wireless local area network (WLAN or WiFi) for low-band data/video transmission. Implementation and testing were carried out for medical emergency and telemedicine applications. In this paper, the emergency case study is described.

  19. Stress Reduction through Audio Distraction in Anxious Pediatric Dental Patients: An Adjunctive Clinical Study.

    PubMed

    Singh, Divya; Samadi, Firoza; Jaiswal, Jn; Tripathi, Abhay Mani

    2014-01-01

    The purpose of the present study was to evaluate the efficacy of 'audio distraction' in anxious pediatric dental patients. Sixty children were randomly selected and equally divided into two groups of thirty each: a control group (group A) and a music group (group B). The dental procedure employed was extraction for both groups. The children in the music group were allowed to hear an audio presentation throughout the treatment procedure. Anxiety was measured using Venham's picture test, pulse rate, blood pressure and oxygen saturation. 'Audio distraction' was found efficacious, decreasing the anxiety of pediatric dental patients to a significant extent. How to cite this article: Singh D, Samadi F, Jaiswal JN, Tripathi AM. Stress Reduction through Audio Distraction in Anxious Pediatric Dental Patients: An Adjunctive Clinical Study. Int J Clin Pediatr Dent 2014;7(3):149-152.

  20. An ESL Audio-Script Writing Workshop

    ERIC Educational Resources Information Center

    Miller, Carla

    2012-01-01

    The roles of dialogue, collaborative writing, and authentic communication have been explored as effective strategies in second language writing classrooms. In this article, the stages of an innovative, multi-skill writing method, which embeds students' personal voices into the writing process, are explored. A 10-step ESL Audio Script Writing Model…

  1. Attention to and Memory for Audio and Video Information in Television Scenes.

    ERIC Educational Resources Information Center

    Basil, Michael D.

    A study investigated whether selective attention to a particular television modality resulted in different levels of attention to and memory for each modality. Two independent variables manipulated selective attention. These were the semantic channel (audio or video) and viewers' instructed focus (audio or video). These variables were fully…

  2. An Analysis of Certain Elements of an Audio-Tape Approach to Instruction.

    ERIC Educational Resources Information Center

    Bell, Ronald Ernest

    This study was designed to determine the association between selected variables and an audio-tape approach to instruction. Fifty sophomore students enrolled in a physical anthropology course at Shoreline Community College (Washington) participated in an experimental instructional program that consisted of thirty-two audio-tapes and three optional…

  3. Characterization of HF Propagation for Digital Audio Broadcasting

    NASA Technical Reports Server (NTRS)

    Vaisnys, Arvydas

    1997-01-01

    The purpose of this presentation is to give a brief overview of propagation measurements in the Short Wave (3-30 MHz) bands, made in support of a digital audio transmission system design for the Voice of America. This task is a follow-on to the Digital Broadcast Satellite Radio task; several of the mitigation techniques developed there are applicable to digital audio in the Short Wave bands as well, in spite of the differences in propagation impairments between the two bands. Two series of propagation measurements were made to quantify the range of impairments that could be expected. An assessment of the performance of a prototype version of the receiver was also made.

  4. One Message, Many Voices: Mobile Audio Counselling in Health Education.

    PubMed

    Pimmer, Christoph; Mbvundula, Francis

    2018-01-01

    Health workers' use of counselling information on their mobile phones for health education is a central but little understood phenomenon in numerous mobile health (mHealth) projects in Sub-Saharan Africa. Drawing on empirical data from an interpretive case study in the setting of the Millennium Villages Project in rural Malawi, this research investigates the ways in which community health workers (CHWs) perceive that audio-counselling messages support their health education practice. Three main themes emerged from the analysis: phone-aided audio counselling (1) legitimises the CHWs' use of mobile phones during household visits; (2) helps CHWs to deliver a comprehensive counselling message; (3) supports CHWs in persuading communities to change their health practices. The findings show the complexity and interplay of the multi-faceted, sociocultural, political, and socioemotional meanings associated with audio-counselling use. Practical implications and the demand for further research are discussed.

  5. Comparison between audio-only and audiovisual biofeedback for regulating patients' respiration during four-dimensional radiotherapy.

    PubMed

    Yu, Jesang; Choi, Ji Hoon; Ma, Sun Young; Jeung, Tae Sig; Lim, Sangwook

    2015-09-01

    To compare audio-only biofeedback to conventional audiovisual biofeedback for regulating patients' respiration during four-dimensional radiotherapy, limiting damage to healthy surrounding tissues caused by organ movement. Six healthy volunteers were assisted by audiovisual or audio-only biofeedback systems to regulate their respirations. Volunteers breathed through a mask developed for this study by following computer-generated guiding curves displayed on a screen, combined with instructional sounds. They then performed breathing following instructional sounds only. The guiding signals and the volunteers' respiratory signals were logged at 20 samples per second. The standard deviations between the guiding and respiratory curves for the audiovisual and audio-only biofeedback systems were 21.55% and 23.19%, respectively; the average correlation coefficients were 0.9778 and 0.9756, respectively. The regularities of the six volunteers' respirations under the two systems were statistically equivalent according to a paired t-test. The difference between the audiovisual and audio-only biofeedback methods was not significant. Audio-only biofeedback has many advantages, as patients do not require a mask and can quickly adapt to this method in the clinic.
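The figures this abstract reports (standard deviation between guiding and respiratory curves, and their Pearson correlation) can be reproduced from 20 Hz logged samples along the following lines. This is a generic sketch of the two metrics, not the study's analysis code:

```python
import math

FS = 20  # samples per second, as logged in the study

def pearson(a, b):
    """Pearson correlation coefficient between two equal-length signals."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    sa = math.sqrt(sum((x - ma) ** 2 for x in a))
    sb = math.sqrt(sum((y - mb) ** 2 for y in b))
    return cov / (sa * sb)

def tracking_sd(guide, resp):
    """Standard deviation of the guide-minus-response error, in the same
    (percentage) units as the reported 21.55% / 23.19% figures."""
    diffs = [g - r for g, r in zip(guide, resp)]
    m = sum(diffs) / len(diffs)
    return math.sqrt(sum((d - m) ** 2 for d in diffs) / len(diffs))
```

A perfectly tracked guide gives a correlation of 1 and a tracking SD of 0; lagging or irregular breathing lowers the former and raises the latter.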

  6. Impact of Audio-Coaching on the Position of Lung Tumors

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Haasbeek, Cornelis J.A.; Spoelstra, Femke; Lagerwaard, Frank J.

    2008-07-15

    Purpose: Respiration-induced organ motion is a major source of positional, or geometric, uncertainty in thoracic radiotherapy. Interventions to mitigate the impact of motion include audio-coached respiration-gated radiotherapy (RGRT). To assess the impact of coaching on average tumor position during gating, we analyzed four-dimensional computed tomography (4DCT) scans performed both with and without audio-coaching. Methods and Materials: Our RGRT protocol requires that an audio-coached 4DCT scan is performed when the initial free-breathing 4DCT indicates a potential benefit with gating. We retrospectively analyzed 22 such paired scans in patients with well-circumscribed tumors. Changes in lung volume and position of internal target volumes (ITV) generated in three consecutive respiratory phases at both end-inspiration and end-expiration were analyzed. Results: Audio-coaching increased end-inspiration lung volumes by a mean of 10.2% (range, -13% to +43%) when compared with free breathing (p = 0.001). The mean three-dimensional displacement of the center of ITV was 3.6 mm (SD, 2.5; range, 0.3-9.6 mm), mainly caused by displacement in the craniocaudal direction. Displacement of ITV caused by coaching was more than 5 mm in 5 patients, all of whom were in the subgroup of 9 patients showing total tumor motion of 10 mm or more during both coached and uncoached breathing. Comparable ITV displacements were observed at end-expiration phases of the 4DCT. Conclusions: Differences in ITV position exceeding 5 mm between coached and uncoached 4DCT scans were detected in up to 56% of mobile tumors. Both end-inspiration and end-expiration RGRT were susceptible to displacements. This indicates that the method of audio-coaching should remain unchanged throughout the course of treatment.
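The 3.6 mm mean displacement quoted above is the Euclidean distance between the ITV centre on the free-breathing and coached scans. A trivial illustration of that computation and of the 5 mm flagging threshold (coordinate convention and names are assumed, not from the paper):

```python
import math

def itv_displacement_mm(center_a, center_b):
    """3-D displacement between two ITV centres, given as (x, y, z) in mm."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(center_a, center_b)))

def exceeds_gating_tolerance(center_a, center_b, tol_mm=5.0):
    """Flag centre shifts larger than the 5 mm threshold discussed above."""
    return itv_displacement_mm(center_a, center_b) > tol_mm
```

A purely craniocaudal shift of 3.6 mm reproduces the reported mean, while a (3, 4, 1) mm shift (about 5.1 mm) would be flagged.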

  7. Audio Adapted Assessment Data: Does the Addition of Audio to Written Items Modify the Item Calibration?

    ERIC Educational Resources Information Center

    Snyder, James

    2010-01-01

    This dissertation research examined the changes in item RIT calibration that occurred when adding audio to a set of currently calibrated RIT items and then placing these new items as field test items in the modified assessments on the NWEA MAP test platform. The researcher used test results from over 600 students in the Poway School District in…

  8. Audio-visual speech processing in age-related hearing loss: Stronger integration and increased frontal lobe recruitment.

    PubMed

    Rosemann, Stephanie; Thiel, Christiane M

    2018-07-15

    Hearing loss is associated with difficulties in understanding speech, especially under adverse listening conditions. In these situations, seeing the speaker improves speech intelligibility in hearing-impaired participants. On the neuronal level, previous research has shown cross-modal plastic reorganization in the auditory cortex following hearing loss leading to altered processing of auditory, visual and audio-visual information. However, how reduced auditory input effects audio-visual speech perception in hearing-impaired subjects is largely unknown. We here investigated the impact of mild to moderate age-related hearing loss on processing audio-visual speech using functional magnetic resonance imaging. Normal-hearing and hearing-impaired participants performed two audio-visual speech integration tasks: a sentence detection task inside the scanner and the McGurk illusion outside the scanner. Both tasks consisted of congruent and incongruent audio-visual conditions, as well as auditory-only and visual-only conditions. We found a significantly stronger McGurk illusion in the hearing-impaired participants, which indicates stronger audio-visual integration. Neurally, hearing loss was associated with an increased recruitment of frontal brain areas when processing incongruent audio-visual, auditory and also visual speech stimuli, which may reflect the increased effort to perform the task. Hearing loss modulated both the audio-visual integration strength measured with the McGurk illusion and brain activation in frontal areas in the sentence task, showing stronger integration and higher brain activation with increasing hearing loss. Incongruent compared to congruent audio-visual speech revealed an opposite brain activation pattern in left ventral postcentral gyrus in both groups, with higher activation in hearing-impaired participants in the incongruent condition. Our results indicate that already mild to moderate hearing loss impacts audio-visual speech processing

  9. Focus on Hinduism: Audio-Visual Resources for Teaching Religion. Occasional Publication No. 23.

    ERIC Educational Resources Information Center

    Dell, David; And Others

    The guide presents annotated lists of audio and visual materials about the Hindu religion. The authors point out that Hinduism cannot be comprehended totally by reading books; thus the resources identified in this guide will enhance understanding based on reading. The guide is intended for use by high school and college students, teachers,…

  10. Students' Attitudes to and Usage of Academic Feedback Provided via Audio Files

    ERIC Educational Resources Information Center

    Merry, Stephen; Orsmond, Paul

    2008-01-01

    This study explores students' attitudes to the provision of formative feedback on academic work using audio files together with the ways in which students implement such feedback within their learning. Fifteen students received audio file feedback on written work and were subsequently interviewed regarding their utilisation of that feedback within…

  11. Reliability and validity of an audio signal modified shuttle walk test.

    PubMed

    Singla, Rupak; Rai, Richa; Faye, Abhishek Anil; Jain, Anil Kumar; Chowdhury, Ranadip; Bandyopadhyay, Debdutta

    2017-01-01

    The audio signal in the conventionally accepted protocol of the shuttle walk test (SWT) is not well understood by patients, and modification of the audio signal may improve test performance. The aim of this study is to assess the validity and reliability of an audio-signal-modified SWT, called the Singla-Richa modified SWT (SWT-SR), in healthy normal adults. In the SWT-SR, the audio signal was modified by adding reverse counting to it. A total of 54 healthy normal adults underwent the conventional SWT (CSWT) once and the SWT-SR twice on the same day. Validity was assessed by comparing outcomes of the SWT-SR to outcomes of the CSWT using the Pearson correlation coefficient and a Bland-Altman plot. Test-retest reliability of the SWT-SR was assessed using the intraclass correlation coefficient (ICC). The acceptability of the modified test in comparison to the conventional test was assessed using a Likert scale. The distance walked (mean ± standard deviation) in the CSWT and SWT-SR was 853.33 ± 217.33 m and 857.22 ± 219.56 m, respectively (Pearson correlation coefficient 0.98; P < 0.001), indicating the SWT-SR to be a valid test. The SWT-SR was found to be reliable, with an ICC of 0.98 (95% confidence interval: 0.97-0.99). The acceptability of the SWT-SR was significantly higher than that of the CSWT. The SWT-SR, with its audio signal modified by reverse counting, is a reliable as well as a valid test when compared with the CSWT in healthy normal adults, and it is better understood by subjects.

  12. Acceptance Inspection for Audio Cassette Recorders.

    ERIC Educational Resources Information Center

    Smith, Edgar A.

    A series of inspections for cassette recorders that can be performed to assure that the devices are acceptable is described. The inspections can be completed in 20 minutes and can be performed by instructional personnel. The series of inspection procedures includes tests of the intelligibility of audio, physical condition, tape speed, impulse…

  13. Apollo 11 Mission Audio - Day 1

    NASA Image and Video Library

    1969-07-16

    Audio from mission control during the launch of Apollo 11, which was the United States' first lunar landing mission. While astronauts Armstrong and Aldrin descended in the Lunar Module "Eagle" to explore the Sea of Tranquility region of the moon, astronaut Collins remained with the Command and Service Modules "Columbia" in lunar orbit.

  14. Comparing a number line and audio prompts in supporting price comparison by students with intellectual disability.

    PubMed

    Bouck, Emily C; Satsangi, Rajiv; Bartlett, Whitney

    2016-01-01

    Price comparison is an important and complex skill, but it has received insufficient research attention with respect to educating secondary students with intellectual disability and/or autism spectrum disorder. This alternating-treatment design study compared the use of a paper-based number line and audio prompts delivered via an audio recorder to support three secondary students with intellectual disability in independently and accurately comparing the prices of three separate grocery items. The study consisted of 22 sessions, spread across baseline, intervention, best treatment, and two different generalization phases. Data were collected on the percent of task analysis steps completed independently, the type of prompts needed, students' accuracy in selecting the lowest priced item, and task completion time. With both intervention conditions, students were able to independently complete the task analysis steps as well as accurately select the lowest priced item and decrease their task completion time. For two of the students, the audio recorder condition resulted in the greatest independence; for one, the number line did. For only one student was the condition with the greatest independence also the condition with the highest rate of accuracy. The results suggest both tools can support students with price comparison. Yet, audio recorders offer students and teachers an age-appropriate and setting-appropriate option. Copyright © 2016 Elsevier Ltd. All rights reserved.

  15. Influence of audio triggered emotional attention on video perception

    NASA Astrophysics Data System (ADS)

    Torres, Freddy; Kalva, Hari

    2014-02-01

    Perceptual video coding methods attempt to improve compression efficiency by discarding visual information not perceived by end users. Most current approaches to perceptual video coding use only visual features, ignoring the auditory component. Many psychophysical studies have demonstrated that auditory stimuli affect our visual perception. In this paper we present our study of audio-triggered emotional attention and its applicability to perceptual video coding. Experiments with movie clips show that the reaction time to detect video compression artifacts was longer when video was presented with the audio information. The results reported are statistically significant with p=0.024.

  16. Transitioning from analog to digital audio recording in childhood speech sound disorders.

    PubMed

    Shriberg, Lawrence D; McSweeny, Jane L; Anderson, Bruce E; Campbell, Thomas F; Chial, Michael R; Green, Jordan R; Hauner, Katherina K; Moore, Christopher A; Rusiewicz, Heather L; Wilson, David L

    2005-06-01

    Few empirical findings or technical guidelines are available on the current transition from analog to digital audio recording in childhood speech sound disorders. Of particular concern in the present context was whether a transition from analog- to digital-based transcription and coding of prosody and voice features might require re-standardizing a reference database for research in childhood speech sound disorders. Two research transcribers with different levels of experience glossed, transcribed, and prosody-voice coded conversational speech samples from eight children with mild to severe speech disorders of unknown origin. The samples were recorded, stored, and played back using representative analog and digital audio systems. Effect sizes calculated for an array of analog versus digital comparisons ranged from negligible to medium, with a trend for participants' speech competency scores to be slightly lower for samples obtained and transcribed using the digital system. We discuss the implications of these and other findings for research and clinical practise.

  17. Transitioning from analog to digital audio recording in childhood speech sound disorders

    PubMed Central

    Shriberg, Lawrence D.; McSweeny, Jane L.; Anderson, Bruce E.; Campbell, Thomas F.; Chial, Michael R.; Green, Jordan R.; Hauner, Katherina K.; Moore, Christopher A.; Rusiewicz, Heather L.; Wilson, David L.

    2014-01-01

    Few empirical findings or technical guidelines are available on the current transition from analog to digital audio recording in childhood speech sound disorders. Of particular concern in the present context was whether a transition from analog- to digital-based transcription and coding of prosody and voice features might require re-standardizing a reference database for research in childhood speech sound disorders. Two research transcribers with different levels of experience glossed, transcribed, and prosody-voice coded conversational speech samples from eight children with mild to severe speech disorders of unknown origin. The samples were recorded, stored, and played back using representative analog and digital audio systems. Effect sizes calculated for an array of analog versus digital comparisons ranged from negligible to medium, with a trend for participants’ speech competency scores to be slightly lower for samples obtained and transcribed using the digital system. We discuss the implications of these and other findings for research and clinical practise. PMID:16019779

  18. Audio-Visual Speech Perception Is Special

    ERIC Educational Resources Information Center

    Tuomainen, J.; Andersen, T.S.; Tiippana, K.; Sams, M.

    2005-01-01

    In face-to-face conversation speech is perceived by ear and eye. We studied the prerequisites of audio-visual speech perception by using perceptually ambiguous sine wave replicas of natural speech as auditory stimuli. When the subjects were not aware that the auditory stimuli were speech, they showed only negligible integration of auditory and…

  19. Computerized Audio-Visual Instructional Sequences (CAVIS): A Versatile System for Listening Comprehension in Foreign Language Teaching.

    ERIC Educational Resources Information Center

    Aleman-Centeno, Josefina R.

    1983-01-01

    Discusses the development and evaluation of CAVIS, which consists of an Apple microcomputer used with audiovisual dialogs. Includes research on the effects of three conditions: (1) computer with audio and visual, (2) computer with audio alone and (3) audio alone in short-term and long-term recall. (EKN)

  20. Audio-Visual and Meaningful Semantic Context Enhancements in Older and Younger Adults.

    PubMed

    Smayda, Kirsten E; Van Engen, Kristin J; Maddox, W Todd; Chandrasekaran, Bharath

    2016-01-01

    Speech perception is critical to everyday life. Oftentimes noise can degrade a speech signal; however, because of the cues available to the listener, such as visual and semantic cues, noise rarely prevents conversations from continuing. The interaction of visual and semantic cues in aiding speech perception has been studied in young adults, but the extent to which these two cues interact for older adults has not been studied. To investigate the effect of visual and semantic cues on speech perception in older and younger adults, we recruited forty-five young adults (ages 18-35) and thirty-three older adults (ages 60-90) to participate in a speech perception task. Participants were presented with semantically meaningful and anomalous sentences in audio-only and audio-visual conditions. We hypothesized that young adults would outperform older adults across SNRs, modalities, and semantic contexts. In addition, we hypothesized that both young and older adults would receive a greater benefit from a semantically meaningful context in the audio-visual relative to audio-only modality. We predicted that young adults would receive greater visual benefit in semantically meaningful contexts relative to anomalous contexts. However, we predicted that older adults could receive a greater visual benefit in either semantically meaningful or anomalous contexts. Results suggested that in the most supportive context, that is, semantically meaningful sentences presented in the audiovisual modality, older adults performed similarly to young adults. In addition, both groups received the same amount of visual and meaningful benefit. Lastly, across groups, a semantically meaningful context provided more benefit in the audio-visual modality relative to the audio-only modality, and the presence of visual cues provided more benefit in semantically meaningful contexts relative to anomalous contexts. 
These results suggest that older adults can perceive speech as well as younger adults when both

  1. Audio-Visual and Meaningful Semantic Context Enhancements in Older and Younger Adults

    PubMed Central

    Smayda, Kirsten E.; Van Engen, Kristin J.; Maddox, W. Todd; Chandrasekaran, Bharath

    2016-01-01

    Speech perception is critical to everyday life. Oftentimes noise can degrade a speech signal; however, because of the cues available to the listener, such as visual and semantic cues, noise rarely prevents conversations from continuing. The interaction of visual and semantic cues in aiding speech perception has been studied in young adults, but the extent to which these two cues interact for older adults has not been studied. To investigate the effect of visual and semantic cues on speech perception in older and younger adults, we recruited forty-five young adults (ages 18–35) and thirty-three older adults (ages 60–90) to participate in a speech perception task. Participants were presented with semantically meaningful and anomalous sentences in audio-only and audio-visual conditions. We hypothesized that young adults would outperform older adults across SNRs, modalities, and semantic contexts. In addition, we hypothesized that both young and older adults would receive a greater benefit from a semantically meaningful context in the audio-visual relative to audio-only modality. We predicted that young adults would receive greater visual benefit in semantically meaningful contexts relative to anomalous contexts. However, we predicted that older adults could receive a greater visual benefit in either semantically meaningful or anomalous contexts. Results suggested that in the most supportive context, that is, semantically meaningful sentences presented in the audiovisual modality, older adults performed similarly to young adults. In addition, both groups received the same amount of visual and meaningful benefit. Lastly, across groups, a semantically meaningful context provided more benefit in the audio-visual modality relative to the audio-only modality, and the presence of visual cues provided more benefit in semantically meaningful contexts relative to anomalous contexts. These results suggest that older adults can perceive speech as well as younger adults when

  2. Behavioral Science Design for Audio-Visual Software Development

    ERIC Educational Resources Information Center

    Foster, Dennis L.

    1974-01-01

    A discussion of the basic structure of the behavioral audio-visual production which consists of objectives analysis, approach determination, technical production, fulfillment evaluation, program refinement, implementation, and follow-up. (Author)

  3. Incorporating Auditory Models in Speech/Audio Applications

    NASA Astrophysics Data System (ADS)

    Krishnamoorthi, Harish

    2011-12-01

    Following the success of incorporating perceptual models in audio coding algorithms, their application in other speech/audio processing systems is expanding. In general, all perceptual speech/audio processing algorithms involve minimization of an objective function that directly/indirectly incorporates properties of human perception. This dissertation primarily investigates the problems associated with directly embedding an auditory model in the objective function formulation and proposes possible solutions to overcome high-complexity issues for use in real-time speech/audio algorithms. Specific problems addressed in this dissertation include: 1) the development of approximate but computationally efficient auditory model implementations that are consistent with the principles of psychoacoustics, 2) the development of a mapping scheme that allows synthesizing a time/frequency domain representation from its equivalent auditory model output. The first problem is aimed at addressing the high computational complexity involved in solving perceptual objective functions that require repeated application of the auditory model to evaluate different candidate solutions. In this dissertation, a frequency-pruning and a detector-pruning algorithm are developed that efficiently implement the various auditory model stages. The performance of the pruned model is compared to that of the original auditory model for different types of test signals in the SQAM database. Experimental results indicate only a 4-7% relative error in loudness while attaining up to 80-90% reduction in computational complexity. Similarly, a hybrid algorithm is developed specifically for use with sinusoidal signals; it employs the proposed auditory pattern combining technique together with a look-up table that stores representative auditory patterns.
For the second problem, an estimate of the auditory representation that minimizes a perceptual objective function is obtained, and the auditory pattern is transformed back to

  4. Applications of ENF criterion in forensic audio, video, computer and telecommunication analysis.

    PubMed

    Grigoras, Catalin

    2007-04-11

This article reports on the electric network frequency (ENF) criterion as a means of assessing the integrity of digital audio/video evidence in forensic IT and telecommunication analysis. A brief description is given of the different ENF types and the phenomena that determine ENF variations. In most situations, visual inspection of spectrograms and comparison with an ENF database are enough to reach a non-authenticity opinion. A more detailed investigation, in the time domain, requires short-time-window measurements and analyses. The stability of the ENF over geographical distances has been established by comparing synchronized recordings made at different locations on the same network. Real cases are presented in which the ENF criterion was used to investigate audio and video files created with covert surveillance systems, a digitized audio/video recording, and a TV broadcast report. By applying the ENF criterion in forensic audio/video analysis, one can determine whether and where a digital recording has been edited, establish whether it was made at the time claimed, and identify the time and date of the recording operation.
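The database-comparison step described above can be sketched in a few lines: given an ENF trace extracted from a questioned recording (one dominant-frequency estimate per analysis window), slide it along a reference trace logged from the network and keep the alignment with the smallest deviation. This is an illustrative sketch only, not the article's procedure; the toy 50 Hz traces and the RMS matching criterion are assumptions.

```python
def match_enf(questioned, reference):
    """Slide a questioned ENF trace (Hz per analysis window) along a
    longer reference trace and return (best_lag, rms_error) at the
    alignment with the smallest root-mean-square frequency deviation."""
    n = len(questioned)
    best_lag, best_rms = None, float("inf")
    for lag in range(len(reference) - n + 1):
        err = sum((q - r) ** 2
                  for q, r in zip(questioned, reference[lag:lag + n]))
        rms = (err / n) ** 0.5
        if rms < best_rms:
            best_lag, best_rms = lag, rms
    return best_lag, best_rms

# Hypothetical traces: a nominal 50 Hz network with small random drifts.
ref = [50.00, 50.01, 49.99, 50.02, 49.98, 50.03, 50.00, 49.97]
q = [50.02, 49.98, 50.03]           # excerpt starting at window 3
lag, rms = match_enf(q, ref)
print(lag, round(rms, 4))           # lag 3 aligns the excerpt exactly
```

A real implementation would first estimate the per-window dominant frequency near the nominal network frequency (e.g. by spectrogram peak tracking) before this matching stage.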

  5. Comparison between audio-only and audiovisual biofeedback for regulating patients' respiration during four-dimensional radiotherapy

    PubMed Central

    Yu, Jesang; Choi, Ji Hoon; Ma, Sun Young; Jeung, Tae Sig

    2015-01-01

Purpose: To compare audio-only biofeedback to conventional audiovisual biofeedback for regulating patients' respiration during four-dimensional radiotherapy, limiting damage to healthy surrounding tissues caused by organ movement. Materials and Methods: Six healthy volunteers were assisted by audiovisual or audio-only biofeedback systems to regulate their respiration. Volunteers breathed through a mask developed for this study by following computer-generated guiding curves displayed on a screen, combined with instructional sounds. They then performed breathing following instructional sounds only. The guiding signals and the volunteers' respiratory signals were logged at 20 samples per second. Results: The standard deviations between the guiding and respiratory curves for the audiovisual and audio-only biofeedback systems were 21.55% and 23.19%, respectively; the average correlation coefficients were 0.9778 and 0.9756, respectively. A paired t-test showed no statistically significant difference in breathing regularity between the audiovisual and audio-only systems across the six volunteers. Conclusion: The difference between the audiovisual and audio-only biofeedback methods was not significant. Audio-only biofeedback has many advantages, as patients do not require a mask and can quickly adapt to this method in the clinic. PMID:26484309

  6. Audio-visual perception of 3D cinematography: an fMRI study using condition-based and computation-based analyses.

    PubMed

    Ogawa, Akitoshi; Bordier, Cecile; Macaluso, Emiliano

    2013-01-01

    The use of naturalistic stimuli to probe sensory functions in the human brain is gaining increasing interest. Previous imaging studies examined brain activity associated with the processing of cinematographic material using both standard "condition-based" designs, as well as "computational" methods based on the extraction of time-varying features of the stimuli (e.g. motion). Here, we exploited both approaches to investigate the neural correlates of complex visual and auditory spatial signals in cinematography. In the first experiment, the participants watched a piece of a commercial movie presented in four blocked conditions: 3D vision with surround sounds (3D-Surround), 3D with monaural sound (3D-Mono), 2D-Surround, and 2D-Mono. In the second experiment, they watched two different segments of the movie both presented continuously in 3D-Surround. The blocked presentation served for standard condition-based analyses, while all datasets were submitted to computation-based analyses. The latter assessed where activity co-varied with visual disparity signals and the complexity of auditory multi-sources signals. The blocked analyses associated 3D viewing with the activation of the dorsal and lateral occipital cortex and superior parietal lobule, while the surround sounds activated the superior and middle temporal gyri (S/MTG). The computation-based analyses revealed the effects of absolute disparity in dorsal occipital and posterior parietal cortices and of disparity gradients in the posterior middle temporal gyrus plus the inferior frontal gyrus. The complexity of the surround sounds was associated with activity in specific sub-regions of S/MTG, even after accounting for changes of sound intensity. These results demonstrate that the processing of naturalistic audio-visual signals entails an extensive set of visual and auditory areas, and that computation-based analyses can track the contribution of complex spatial aspects characterizing such life-like stimuli.

  7. Audio-Enhanced Tablet Computers to Assess Children's Food Frequency From Migrant Farmworker Mothers.

    PubMed

    Kilanowski, Jill F; Trapl, Erika S; Kofron, Ryan M

    2013-06-01

This study sought to improve data collection in children's food frequency surveys for non-English-speaking immigrant/migrant farmworker mothers using audio-enhanced tablet computers (ATCs). We hypothesized that by using technological adaptations, we would be able to improve data capture and therefore reduce lost surveys. This Food Frequency Questionnaire (FFQ), a paper-based dietary assessment tool, was adapted for ATCs and assessed consumption of 66 food items, asking three questions for each food item: frequency of consumption, quantity, and serving size. The tablet-based survey was audio enhanced, with each question "read" to participants and accompanied by food item images, together with an embedded short instructional video. Results indicated that respondents were able to complete the 198 questions of the 66-item FFQ on ATCs in approximately 23 minutes. Compared with paper-based FFQs, ATC-based FFQs had less missing data. Despite the overall reduction in missing data through the use of ATCs, respondents still appeared to have difficulty with question 2 of the FFQ. The ability to score the FFQ depended on the sections in which missing data were located. Unlike the paper-based FFQs, no ATC-based FFQs were unscorable due to the amount or location of missing data. An ATC-based FFQ was feasible and increased the ability to score this survey of children's food patterns from migrant farmworker mothers. This adapted technology may serve as an exemplar for other non-English-speaking immigrant populations.

  8. Cross-Modal Matching of Audio-Visual German and French Fluent Speech in Infancy

    PubMed Central

    Kubicek, Claudia; Hillairet de Boisferon, Anne; Dupierrix, Eve; Pascalis, Olivier; Lœvenbruck, Hélène; Gervain, Judit; Schwarzer, Gudrun

    2014-01-01

The present study examined when and how the ability to cross-modally match audio-visual fluent speech develops in 4.5-, 6- and 12-month-old German-learning infants. In Experiment 1, 4.5- and 6-month-old infants' ability to match native (German) and non-native (French) audio-visual fluent speech was assessed by presenting auditory and visual speech information sequentially, that is, in the absence of temporal synchrony cues. The results showed that 4.5-month-old infants were capable of matching native as well as non-native audio and visual speech stimuli, whereas 6-month-olds perceived the audio-visual correspondence of native-language stimuli only. This suggests that intersensory matching narrows for fluent speech between 4.5 and 6 months of age. In Experiment 2, auditory and visual speech information was presented simultaneously, thereby providing temporal synchrony cues. Here, 6-month-olds were found to match native as well as non-native speech, indicating a facilitative effect of temporal synchrony cues on the intersensory perception of non-native fluent speech. Intriguingly, despite the fact that audio and visual stimuli cohered temporally, 12-month-olds matched the non-native language only. Results are discussed with regard to multisensory perceptual narrowing during the first year of life. PMID:24586651

  9. Transcript of Audio Narrative Portion of: Scandinavian Heritage. A Set of Five Audio-Visual Film Strip/Cassette Presentations.

    ERIC Educational Resources Information Center

    Anderson, Gerald D.; Olson, David B.

    The document presents the transcript of the audio narrative portion of approximately 100 interviews with first and second generation Scandinavian immigrants to the United States. The document is intended for use by secondary school classroom teachers as they develop and implement educational programs related to the Scandinavian heritage in…

  10. Audio Spectrogram Representations for Processing with Convolutional Neural Networks

    NASA Astrophysics Data System (ADS)

    Wyse, L.

    2017-05-01

    One of the decisions that arise when designing a neural network for any application is how the data should be represented in order to be presented to, and possibly generated by, a neural network. For audio, the choice is less obvious than it seems to be for visual images, and a variety of representations have been used for different applications including the raw digitized sample stream, hand-crafted features, machine discovered features, MFCCs and variants that include deltas, and a variety of spectral representations. This paper reviews some of these representations and issues that arise, focusing particularly on spectrograms for generating audio using neural networks for style transfer.
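As an illustration of the spectrogram representation discussed above, the sketch below converts a raw sample stream into a log-magnitude time-frequency grid of the kind typically presented to a CNN. It is a minimal pure-Python version (naive one-sided DFT, Hann window); the frame length, hop size, and log-compression constant are illustrative choices, not values from the paper.

```python
import cmath
import math

def spectrogram(signal, frame_len=64, hop=32):
    """Return a list of frames, each a list of log-magnitude DFT bins
    (0 .. frame_len//2): the 2-D time-frequency "image" that can be
    fed to a CNN in place of the raw sample stream."""
    window = [0.5 - 0.5 * math.cos(2 * math.pi * i / (frame_len - 1))
              for i in range(frame_len)]                     # Hann window
    frames = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        x = [s * w for s, w in zip(signal[start:start + frame_len], window)]
        bins = []
        for k in range(frame_len // 2 + 1):                  # one-sided DFT
            acc = sum(x[n] * cmath.exp(-2j * math.pi * k * n / frame_len)
                      for n in range(frame_len))
            bins.append(math.log(abs(acc) + 1e-10))          # log compression
        frames.append(bins)
    return frames

# A 256-sample test tone at DFT bin 8 produces a (frames x bins) grid.
tone = [math.sin(2 * math.pi * 8 * n / 64) for n in range(256)]
S = spectrogram(tone)
print(len(S), len(S[0]))  # 7 time frames x 33 frequency bins
```

In practice an FFT-based routine would replace the naive DFT, and the paper's broader point stands: this grid is only one of several possible input representations (raw samples, MFCCs, learned features).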

  11. Exploring Meaning Negotiation Patterns in Synchronous Audio and Video Conferencing English Classes in China

    ERIC Educational Resources Information Center

    Li, Chenxi; Wu, Ligao; Li, Chen; Tang, Jinlan

    2017-01-01

    This work-in-progress doctoral research project aims to identify meaning negotiation patterns in synchronous audio and video Computer-Mediated Communication (CMC) environments based on the model of CMC text chat proposed by Smith (2003). The study was conducted in the Institute of Online Education at Beijing Foreign Studies University. Four dyads…

  12. A first demonstration of audio-frequency optical coherence elastography of tissue

    NASA Astrophysics Data System (ADS)

    Adie, Steven G.; Alexandrov, Sergey A.; Armstrong, Julian J.; Kennedy, Brendan F.; Sampson, David D.

    2008-12-01

    Optical elastography is aimed at using the visco-elastic properties of soft tissue as a contrast mechanism, and could be particularly suitable for high-resolution differentiation of tumour from surrounding normal tissue. We present a new approach to measure the effect of an applied stimulus in the kilohertz frequency range that is based on optical coherence tomography. We describe the approach and present the first in vivo optical coherence elastography measurements in human skin at audio excitation frequencies.

  13. Auditory and audio-visual processing in patients with cochlear, auditory brainstem, and auditory midbrain implants: An EEG study.

    PubMed

    Schierholz, Irina; Finke, Mareike; Kral, Andrej; Büchner, Andreas; Rach, Stefan; Lenarz, Thomas; Dengler, Reinhard; Sandmann, Pascale

    2017-04-01

There is substantial variability in speech recognition ability across patients with cochlear implants (CIs), auditory brainstem implants (ABIs), and auditory midbrain implants (AMIs). To better understand how this variability is related to central processing differences, the current electroencephalography (EEG) study compared hearing abilities and auditory-cortex activation in patients with electrical stimulation at different sites of the auditory pathway. Three groups of patients with auditory implants (Hannover Medical School; ABI: n = 6, CI: n = 6; AMI: n = 2) performed a speeded response task and a speech recognition test with auditory, visual, and audio-visual stimuli. Behavioral performance and cortical processing of auditory and audio-visual stimuli were compared between groups. ABI and AMI patients showed prolonged response times to auditory and audio-visual stimuli compared with normal-hearing (NH) listeners and CI patients. This was confirmed by prolonged N1 latencies and reduced N1 amplitudes in ABI and AMI patients. However, patients with central auditory implants showed a remarkable gain in performance when visual and auditory input was combined, in both speech and non-speech conditions, which was reflected by a strong visual modulation of auditory-cortex activation in these individuals. In sum, the results suggest that the behavioral improvement for audio-visual conditions in central auditory implant patients is based on enhanced audio-visual interactions in the auditory cortex. These findings may provide important implications for the optimization of electrical stimulation and rehabilitation strategies in patients with central auditory prostheses. Hum Brain Mapp 38:2206-2225, 2017. © 2017 Wiley Periodicals, Inc.

  14. Audio-Visual Perception of 3D Cinematography: An fMRI Study Using Condition-Based and Computation-Based Analyses

    PubMed Central

    Ogawa, Akitoshi; Bordier, Cecile; Macaluso, Emiliano

    2013-01-01

The use of naturalistic stimuli to probe sensory functions in the human brain is gaining increasing interest. Previous imaging studies examined brain activity associated with the processing of cinematographic material using both standard "condition-based" designs, as well as "computational" methods based on the extraction of time-varying features of the stimuli (e.g. motion). Here, we exploited both approaches to investigate the neural correlates of complex visual and auditory spatial signals in cinematography. In the first experiment, the participants watched a piece of a commercial movie presented in four blocked conditions: 3D vision with surround sounds (3D-Surround), 3D with monaural sound (3D-Mono), 2D-Surround, and 2D-Mono. In the second experiment, they watched two different segments of the movie both presented continuously in 3D-Surround. The blocked presentation served for standard condition-based analyses, while all datasets were submitted to computation-based analyses. The latter assessed where activity co-varied with visual disparity signals and the complexity of auditory multi-sources signals. The blocked analyses associated 3D viewing with the activation of the dorsal and lateral occipital cortex and superior parietal lobule, while the surround sounds activated the superior and middle temporal gyri (S/MTG). The computation-based analyses revealed the effects of absolute disparity in dorsal occipital and posterior parietal cortices and of disparity gradients in the posterior middle temporal gyrus plus the inferior frontal gyrus. The complexity of the surround sounds was associated with activity in specific sub-regions of S/MTG, even after accounting for changes of sound intensity. These results demonstrate that the processing of naturalistic audio-visual signals entails an extensive set of visual and auditory areas, and that computation-based analyses can track the contribution of complex spatial aspects characterizing such life-like stimuli.

  15. Detection of emetic activity in the cat by monitoring venous pressure and audio signals

    NASA Technical Reports Server (NTRS)

    Nagahara, A.; Fox, Robert A.; Daunton, Nancy G.; Elfar, S.

    1991-01-01

To investigate the use of audio signals as a simple, noninvasive measure of emetic activity, the relationship between the somatic events and sounds associated with retching and vomiting was studied. Thoracic venous pressure obtained from an implanted external jugular catheter was shown to provide a precise measure of the somatic events associated with retching and vomiting. Changes in thoracic venous pressure monitored through an indwelling external jugular catheter were compared with audio signals obtained from a microphone located above the animal in a test chamber. In addition, two independent observers visually monitored emetic episodes. Retching and vomiting were induced by injection of xylazine (0.66 mg/kg s.c.) or by motion. A unique audio signal at a frequency of approximately 250 Hz is produced at the time of the negative thoracic venous pressure change associated with retching. Sounds with higher frequencies (around 2500 Hz) occur in conjunction with the positive pressure changes associated with vomiting. These specific signals could be discriminated reliably by individuals reviewing the audio recordings of the sessions. Retching and those emetic episodes associated with positive venous pressure changes were detected accurately by audio monitoring, with 90 percent of retches and 100 percent of emetic episodes correctly identified. Retching was detected more accurately (p < .05) by audio monitoring than by direct visual observation. However, with visual observation a few incidents were identified in which stomach contents were expelled in the absence of positive pressure changes or detectable sounds. These data suggest that in emetic situations the expulsion of stomach contents may be accomplished by more than one neuromuscular system and that audio signals can be used to detect emetic episodes associated with thoracic venous pressure changes.
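Detecting energy at a single known frequency, such as the ~250 Hz retching signature reported above, can be done cheaply with the Goertzel algorithm. The sketch below is illustrative only and is not the monitoring method used in the study; the sampling rate and the test tone are assumptions.

```python
import math

def goertzel_power(samples, target_hz, fs):
    """Goertzel algorithm: squared magnitude of `samples` at a single
    target frequency -- a cheap way to test for energy near 250 Hz
    (retching) versus 2500 Hz (vomiting) without a full spectrum."""
    n = len(samples)
    k = round(n * target_hz / fs)        # nearest DFT bin
    coeff = 2 * math.cos(2 * math.pi * k / n)
    s1 = s2 = 0.0
    for x in samples:
        s0 = x + coeff * s1 - s2
        s2, s1 = s1, s0
    return s1 * s1 + s2 * s2 - coeff * s1 * s2

# A pure 250 Hz tone sampled at an assumed 8 kHz for 100 ms.
fs = 8000
tone = [math.sin(2 * math.pi * 250 * n / fs) for n in range(800)]
low = goertzel_power(tone, 250, fs)
high = goertzel_power(tone, 2500, fs)
print(low > 10 * high)  # energy concentrates at 250 Hz -> True
```

Running both detectors over short windows of a recording and comparing their outputs gives a simple discriminator of the kind the reviewers applied by ear.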

  16. Agency Video, Audio and Imagery Library

    NASA Technical Reports Server (NTRS)

    Grubbs, Rodney

    2015-01-01

    The purpose of this presentation was to inform the ISS International Partners of the new NASA Agency Video, Audio and Imagery Library (AVAIL) website. AVAIL is a new resource for the public to search for and download NASA-related imagery, and is not intended to replace the current process by which the International Partners receive their Space Station imagery products.

  17. A Robust Zero-Watermarking Algorithm for Audio

    NASA Astrophysics Data System (ADS)

    Chen, Ning; Zhu, Jie

    2007-12-01

In traditional watermarking algorithms, the insertion of the watermark into the host signal inevitably introduces some perceptible quality degradation. Another problem is the inherent conflict between imperceptibility and robustness. Zero-watermarking techniques can solve these problems successfully. Instead of embedding a watermark, a zero-watermarking technique extracts some essential characteristics from the host signal and uses them for watermark detection. However, most of the available zero-watermarking schemes are designed for still images, and their robustness is not satisfactory. In this paper, an efficient and robust zero-watermarking technique for audio signals is presented. The multiresolution characteristic of the discrete wavelet transform (DWT), the energy compaction characteristic of the discrete cosine transform (DCT), and the Gaussian noise suppression property of the higher-order cumulant are combined to extract essential features from the host audio signal, which are then used for watermark recovery. Simulation results demonstrate the effectiveness of our scheme in terms of inaudibility, detection reliability, and robustness.
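A minimal sketch of the DWT-plus-DCT feature idea: keep the low-frequency (approximation) band of a Haar wavelet decomposition, apply a type-II DCT, and take the sign pattern of the leading coefficients as a binary feature vector. This is an illustrative reduction, not the paper's scheme; it omits the higher-order-cumulant stage described in the abstract, and the Haar basis, two decomposition levels, and four-bit feature length are assumptions.

```python
import math

def haar_approx(x, levels=2):
    """Multi-level Haar DWT, keeping only the approximation (low-pass)
    coefficients at each level."""
    for _ in range(levels):
        x = [(x[i] + x[i + 1]) / math.sqrt(2)
             for i in range(0, len(x) - 1, 2)]
    return x

def dct2(x):
    """Type-II DCT, which compacts the approximation-band energy into
    a few leading coefficients."""
    n = len(x)
    return [sum(x[i] * math.cos(math.pi * k * (2 * i + 1) / (2 * n))
                for i in range(n)) for k in range(n)]

def zero_watermark_features(audio, n_coeffs=4):
    """Binary feature vector: sign pattern of the leading DCT
    coefficients of the DWT approximation band (1 bit each)."""
    c = dct2(haar_approx(audio))
    return [1 if v >= 0 else 0 for v in c[:n_coeffs]]

# Toy host signal: a low-frequency tone with a positive DC offset.
audio = [math.sin(2 * math.pi * 3 * n / 64) + 0.3 for n in range(64)]
print(zero_watermark_features(audio))  # leading bit reflects the positive DC
```

The appeal of zero-watermarking is visible even in this toy: nothing is inserted into `audio`, so there is no quality degradation, and the sign pattern is compared at detection time instead.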

  18. Audio feature extraction using probability distribution function

    NASA Astrophysics Data System (ADS)

    Suhaib, A.; Wan, Khairunizam; Aziz, Azri A.; Hazry, D.; Razlan, Zuradzman M.; Shahriman A., B.

    2015-05-01

Voice recognition has been one of the popular applications in the robotics field. It has also recently been used in biometric and multimedia information retrieval systems. This technology is the product of continuing research on audio feature extraction. The probability distribution function (PDF) is a statistical tool that is usually used as one processing step within complex feature extraction methods such as GMM and PCA. In this paper, a new method for audio feature extraction is proposed that uses the PDF alone as the feature extraction method for speech analysis. Certain pre-processing techniques are performed prior to the proposed feature extraction. Subsequently, the PDF values for each frame of the sampled voice signals, obtained from a number of individuals, are plotted. From the experimental results, it can be seen visually from the plotted data that each individual's voice has comparable PDF values and shapes.
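The idea of using the PDF directly as a feature can be sketched with a normalized amplitude histogram per frame; the bin count and amplitude range below are illustrative assumptions, not the paper's parameters.

```python
import math

def frame_pdf(samples, bins=8, lo=-1.0, hi=1.0):
    """Estimate the amplitude PDF of one frame of audio samples as a
    normalized histogram over [lo, hi]: the per-frame feature vector."""
    counts = [0] * bins
    for s in samples:
        s = min(max(s, lo), hi - 1e-12)                 # clamp into range
        counts[int((s - lo) / (hi - lo) * bins)] += 1
    return [c / len(samples) for c in counts]

# One frame of a pure tone: its amplitude PDF piles up near +/-1
# (the arcsine distribution of a sinusoid's sample values).
frame = [math.sin(2 * math.pi * n / 100) for n in range(100)]
pdf = frame_pdf(frame)
print(round(sum(pdf), 6))   # a valid PDF: bin probabilities sum to 1.0
```

Plotting such per-frame histograms for different speakers is essentially the comparison the abstract describes; richer pipelines would feed them into a classifier rather than inspecting them visually.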

  19. Worldwide survey of direct-to-listener digital audio delivery systems development since WARC-1992

    NASA Technical Reports Server (NTRS)

    Messer, Dion D.

    1993-01-01

Each country was allocated frequency band(s) for direct-to-listener digital audio broadcasting at WARC-92. These allocations were near 1500, 2300, and 2600 MHz. In addition, some countries are encouraging the development of digital audio broadcasting services for terrestrial delivery only, in the VHF bands (roughly 50 to 300 MHz) and in the medium-wave (AM) broadcasting band (roughly 0.5 to 1.7 MHz). Development activity since then has increased explosively. Current development as of February 1993, as known to the author, is summarized. The information given includes the following characteristics, as appropriate, for each planned system: coverage areas, audio quality, number of audio channels, delivery via satellite, terrestrial transmission, or both, carrier frequency bands, modulation methods, source coding, and channel coding. Most proponents claim that they will be operational in 3 or 4 years.

  20. When the third party observer of a neuropsychological evaluation is an audio-recorder.

    PubMed

    Constantinou, Marios; Ashendorf, Lee; McCaffrey, Robert J

    2002-08-01

    The presence of third parties during neuropsychological evaluations is an issue of concern for contemporary neuropsychologists. Previous studies have reported that the presence of an observer during neuropsychological testing alters the performance of individuals under evaluation. The present study sought to investigate whether audio-recording affects the neuropsychological test performance of individuals in the same way that third party observation does. In the presence of an audio-recorder the performance of the participants on memory tests declined. Performance on motor tests, on the other hand, was not affected by the presence of an audio-recorder. The implications of these findings in forensic neuropsychological evaluations are discussed.

  1. Multidimensional QoE of Multiview Video and Selectable Audio IP Transmission

    PubMed Central

    Nunome, Toshiro; Ishida, Takuya

    2015-01-01

    We evaluate QoE of multiview video and selectable audio (MVV-SA), in which users can switch not only video but also audio according to a viewpoint change request, transmitted over IP networks by a subjective experiment. The evaluation is performed by the semantic differential (SD) method with 13 adjective pairs. In the subjective experiment, we ask assessors to evaluate 40 stimuli which consist of two kinds of UDP load traffic, two kinds of fixed additional delay, five kinds of playout buffering time, and selectable or unselectable audio (i.e., MVV-SA or the previous MVV-A). As a result, MVV-SA gives higher presence to the user than MVV-A and then enhances QoE. In addition, we employ factor analysis for subjective assessment results to clarify the component factors of QoE. We then find that three major factors affect QoE in MVV-SA. PMID:26106640

  2. Development and Evaluation of a Feedback Support System with Audio and Playback Strokes

    ERIC Educational Resources Information Center

    Li, Kai; Akahori, Kanji

    2008-01-01

    This paper describes the development and evaluation of a handwritten correction support system with audio and playback strokes used to teach Japanese writing. The study examined whether audio and playback strokes have a positive effect on students using honorific expressions in Japanese writing. The results showed that error feedback with audio…

  3. SPACE FOR AUDIO-VISUAL LARGE GROUP INSTRUCTION.

    ERIC Educational Resources Information Center

    GAUSEWITZ, CARL H.

    WITH AN INCREASING INTEREST IN AND UTILIZATION OF AUDIO-VISUAL MEDIA IN EDUCATION FACILITIES, IT IS IMPORTANT THAT STANDARDS ARE ESTABLISHED FOR ESTIMATING THE SPACE REQUIRED FOR VIEWING THESE VARIOUS MEDIA. THIS MONOGRAPH SUGGESTS SUCH STANDARDS FOR VIEWING AREAS, VIEWING ANGLES, SEATING PATTERNS, SCREEN CHARACTERISTICS AND EQUIPMENT PERFORMANCES…

  4. Improving Audio Quality in Distance Learning Applications.

    ERIC Educational Resources Information Center

    Richardson, Craig H.

    This paper discusses common causes of problems encountered with audio systems in distance learning networks and offers practical suggestions for correcting the problems. Problems and discussions are divided into nine categories: (1) acoustics, including reverberant classrooms leading to distorted or garbled voices, as well as one-dimensional audio…

  5. HomeBank: An Online Repository of Daylong Child-Centered Audio Recordings

    PubMed Central

    VanDam, Mark; Warlaumont, Anne S.; Bergelson, Elika; Cristia, Alejandrina; Soderstrom, Melanie; De Palma, Paul; MacWhinney, Brian

    2017-01-01

    HomeBank is introduced here. It is a public, permanent, extensible, online database of daylong audio recorded in naturalistic environments. HomeBank serves two primary purposes. First, it is a repository for raw audio and associated files: one database requires special permissions, and another redacted database allows unrestricted public access. Associated files include metadata such as participant demographics and clinical diagnostics, automated annotations, and human-generated transcriptions and annotations. Many recordings use the child-perspective LENA recorders (LENA Research Foundation, Boulder, Colorado, United States), but various recordings and metadata can be accommodated. The HomeBank database can have both vetted and unvetted recordings, with different levels of accessibility. Additionally, HomeBank is an open repository for processing and analysis tools for HomeBank or similar data sets. HomeBank is flexible for users and contributors, making primary data available to researchers, especially those in child development, linguistics, and audio engineering. HomeBank facilitates researchers’ access to large-scale data and tools, linking the acoustic, auditory, and linguistic characteristics of children’s environments with a variety of variables including socioeconomic status, family characteristics, language trajectories, and disorders. Automated processing applied to daylong home audio recordings is now becoming widely used in early intervention initiatives, helping parents to provide richer speech input to at-risk children. PMID:27111272

  6. Active Learning in the Online Environment: The Integration of Student-Generated Audio Files

    ERIC Educational Resources Information Center

    Bolliger, Doris U.; Armier, David Des, Jr.

    2013-01-01

    Educators have integrated instructor-produced audio files in a variety of settings and environments for purposes such as content presentation, lecture reviews, student feedback, and so forth. Few instructors, however, require students to produce audio files and share them with peers. The purpose of this study was to obtain empirical data on…

  7. Audio-visual speech intelligibility benefits with bilateral cochlear implants when talker location varies.

    PubMed

    van Hoesel, Richard J M

    2015-04-01

One of the key benefits of using cochlear implants (CIs) in both ears rather than just one is improved localization. It is likely that in complex listening scenes, improved localization allows bilateral CI users to orient toward talkers to improve signal-to-noise ratios and gain access to visual cues, but to date that conjecture had not been tested. To obtain an objective measure of that benefit, seven bilateral CI users were assessed for both auditory-only and audio-visual speech intelligibility in noise using a novel dynamic spatial audio-visual test paradigm. For each trial conducted in spatially distributed noise, an auditory-only cueing phrase spoken by one of four talkers was first selected and presented from one of four locations. Shortly afterward, a target sentence was presented that was either audio-visual or, in another test configuration, audio-only, and was spoken by the same talker from the same location as the cueing phrase. During the target presentation, visual distractors were added at other spatial locations. Results showed that in terms of speech reception thresholds (SRTs), the average improvement for bilateral listening over the better-performing ear alone was 9 dB for the audio-visual mode and 3 dB for audition alone. Comparison of bilateral performance for audio-visual and audition-alone conditions showed that inclusion of visual cues led to an average SRT improvement of 5 dB. For unilateral device use, no such benefit arose, presumably due to the greatly reduced ability to localize the target talker to acquire visual information. The bilateral CI speech intelligibility advantage over the better ear in the present study is much larger than that previously reported for static talker locations and indicates greater everyday speech benefits and a better cost-benefit ratio than estimated to date.

  8. Linguistic experience and audio-visual perception of non-native fricatives.

    PubMed

    Wang, Yue; Behne, Dawn M; Jiang, Haisheng

    2008-09-01

    This study examined the effects of linguistic experience on audio-visual (AV) perception of non-native (L2) speech. Canadian English natives and Mandarin Chinese natives differing in degree of English exposure [long and short length of residence (LOR) in Canada] were presented with English fricatives of three visually distinct places of articulation: interdentals nonexistent in Mandarin and labiodentals and alveolars common in both languages. Stimuli were presented in quiet and in a cafe-noise background in four ways: audio only (A), visual only (V), congruent AV (AVc), and incongruent AV (AVi). Identification results showed that overall performance was better in the AVc than in the A or V condition and better in quiet than in cafe noise. While the Mandarin long LOR group approximated the native English patterns, the short LOR group showed poorer interdental identification, more reliance on visual information, and greater AV-fusion with the AVi materials, indicating the failure of L2 visual speech category formation with the short LOR non-natives and the positive effects of linguistic experience with the long LOR non-natives. These results point to an integrated network in AV speech processing as a function of linguistic background and provide evidence to extend auditory-based L2 speech learning theories to the visual domain.

  9. Reasons to Rethink the Use of Audio and Video Lectures in Online Courses

    ERIC Educational Resources Information Center

    Stetz, Thomas A.; Bauman, Antonina A.

    2013-01-01

    Recent technological developments allow any instructor to create audio and video lectures for the use in online classes. However, it is questionable if it is worth the time and effort that faculty put into preparing those lectures. This paper presents thirteen factors that should be considered before preparing and using audio and video lectures in…

  10. A haptic-inspired audio approach for structural health monitoring decision-making

    NASA Astrophysics Data System (ADS)

    Mao, Zhu; Todd, Michael; Mascareñas, David

    2015-03-01

Haptics is the field at the interface of human touch (tactile sensation) and classification, whereby tactile feedback is used to train and inform a decision-making process. In structural health monitoring (SHM) applications, haptic devices have been introduced and applied in a simplified laboratory-scale scenario, in which nonlinearity, representing the presence of damage, was encoded into a vibratory manual interface. In this paper, the "spirit" of haptics is adopted, but here ultrasonic guided wave scattering information is transformed into audio (rather than tactile) range signals. After sufficient training, the structural damage condition, including occurrence and location, can be identified through the encoded audio waveforms. Different algorithms are employed in this paper to generate the transformed audio signals; the performance of each encoding algorithm is compared against the others and against standard machine learning classifiers. In the long run, this haptic decision-making approach aims to detect and classify structural damage in more rigorous environments, moving toward a baseline-free scheme with embedded temperature compensation.
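The abstract above describes mapping guided-wave damage features into audio-range signals. The paper's actual encoding algorithms are not given here; the following is a minimal, hypothetical sketch of the general idea, mapping a normalized damage-index scalar to the pitch of a short tone (all names and parameter values are illustrative assumptions, not the authors' method):

```python
import numpy as np

def sonify_feature(feature, fs=8000, dur=0.25, f_lo=300.0, f_hi=1500.0):
    """Map a normalized damage-index feature in [0, 1] to an audible tone.

    A toy stand-in for guided-wave-to-audio encoding: larger damage
    indices produce higher-pitched tones for a human listener to learn.
    Returns the chosen frequency and the synthesized waveform.
    """
    freq = f_lo + float(np.clip(feature, 0.0, 1.0)) * (f_hi - f_lo)
    t = np.arange(int(fs * dur)) / fs          # dur seconds at fs Hz
    return freq, np.sin(2 * np.pi * freq * t)  # pure sine at that pitch

# Example: an undamaged (0.0) vs. heavily damaged (1.0) condition.
f_healthy, _ = sonify_feature(0.0)
f_damaged, wave = sonify_feature(1.0)
```

A richer encoding could modulate timbre or rhythm to carry damage location as well as severity, but a single pitch mapping is enough to convey the concept.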

  11. 37 CFR 201.28 - Statements of Account for digital audio recording devices or media.

    Code of Federal Regulations, 2011 CFR

    2011-07-01

    ... digital audio recording devices or media. 201.28 Section 201.28 Patents, Trademarks, and Copyrights COPYRIGHT OFFICE, LIBRARY OF CONGRESS COPYRIGHT OFFICE AND PROCEDURES GENERAL PROVISIONS § 201.28 Statements of Account for digital audio recording devices or media. (a) General. This section prescribes rules...

  12. 37 CFR 201.28 - Statements of Account for digital audio recording devices or media.

    Code of Federal Regulations, 2012 CFR

    2012-07-01

    ... digital audio recording devices or media. 201.28 Section 201.28 Patents, Trademarks, and Copyrights COPYRIGHT OFFICE, LIBRARY OF CONGRESS COPYRIGHT OFFICE AND PROCEDURES GENERAL PROVISIONS § 201.28 Statements of Account for digital audio recording devices or media. (a) General. This section prescribes rules...

  13. 37 CFR 201.28 - Statements of Account for digital audio recording devices or media.

    Code of Federal Regulations, 2013 CFR

    2013-07-01

    ... digital audio recording devices or media. 201.28 Section 201.28 Patents, Trademarks, and Copyrights COPYRIGHT OFFICE, LIBRARY OF CONGRESS COPYRIGHT OFFICE AND PROCEDURES GENERAL PROVISIONS § 201.28 Statements of Account for digital audio recording devices or media. (a) General. This section prescribes rules...

  14. Audio-Enhanced Tablet Computers to Assess Children’s Food Frequency From Migrant Farmworker Mothers

    PubMed Central

    Kilanowski, Jill F.; Trapl, Erika S.; Kofron, Ryan M.

    2014-01-01

This study sought to improve data collection in children’s food frequency surveys for non-English speaking immigrant/migrant farmworker mothers using audio-enhanced tablet computers (ATCs). We hypothesized that by using technological adaptations, we would be able to improve data capture and therefore reduce lost surveys. This Food Frequency Questionnaire (FFQ), a paper-based dietary assessment tool, was adapted for ATCs and assessed consumption of 66 food items, asking 3 questions for each food item: frequency, quantity of consumption, and serving size. The tablet-based survey was audio enhanced with each question “read” to participants, accompanied by food item images, together with an embedded short instructional video. Results indicated that respondents were able to complete the 198 questions from the 66 food item FFQ on ATCs in approximately 23 minutes. Compared with paper-based FFQs, ATC-based FFQs had less missing data. Despite overall reductions in missing data by use of ATCs, respondents still appeared to have difficulty with question 2 of the FFQ. Ability to score the FFQ depended on which sections the missing data were located in. Unlike the paper-based FFQs, no ATC-based FFQs were unscored due to amount or location of missing data. An ATC-based FFQ was feasible and increased the ability to score this survey on children’s food patterns from migrant farmworker mothers. This adapted technology may serve as an exemplar for other non-English speaking immigrant populations. PMID:25343004

  15. Description of Audio-Visual Recording Equipment and Method of Installation for Pilot Training.

    ERIC Educational Resources Information Center

    Neese, James A.

    The Audio-Video Recorder System was developed to evaluate the effectiveness of in-flight audio/video recording as a pilot training technique for the U.S. Air Force Pilot Training Program. It will be used to gather background and performance data for an experimental program. A detailed description of the system is presented and construction and…

  16. Building Digital Audio Preservation Infrastructure and Workflows

    ERIC Educational Resources Information Center

    Young, Anjanette; Olivieri, Blynne; Eckler, Karl; Gerontakos, Theodore

    2010-01-01

    In 2009 the University of Washington (UW) Libraries special collections received funding for the digital preservation of its audio indigenous language holdings. The university libraries, where the authors work in various capacities, had begun digitizing image and text collections in 1997. Because of this, at the onset of the project, workflows (a…

  17. Recognition of Activities of Daily Living Based on Environmental Analyses Using Audio Fingerprinting Techniques: A Systematic Review

    PubMed Central

    Santos, Rui; Pombo, Nuno; Flórez-Revuelta, Francisco

    2018-01-01

An increase in the accuracy of identification of Activities of Daily Living (ADL) is very important for different goals of Enhanced Living Environments and for Ambient Assisted Living (AAL) tasks. This increase may be achieved through identification of the surrounding environment. Although this is usually used to identify the location, ADL recognition can be improved with the identification of the sound in that particular environment. This paper reviews audio fingerprinting techniques that can be used with the acoustic data acquired from mobile devices. A comprehensive literature search was conducted in order to identify relevant English language works aimed at the identification of the environment of ADLs using data acquired with mobile devices, published between 2002 and 2017. In total, 40 studies were analyzed and selected from 115 citations. The results highlight several audio fingerprinting techniques, including modified discrete cosine transform (MDCT), Mel-frequency cepstrum coefficients (MFCC), principal component analysis (PCA), fast Fourier transform (FFT), Gaussian mixture models (GMM), likelihood estimation, logarithmic modulated complex lapped transform (LMCLT), support vector machine (SVM), constant Q transform (CQT), symmetric pairwise boosting (SPB), Philips robust hash (PRH), linear discriminant analysis (LDA) and discrete cosine transform (DCT). PMID:29315232
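To make the fingerprinting idea surveyed above concrete, here is a much-simplified sketch in the spirit of Philips-style robust hashing: one bit per coarse frequency band per frame, set by whether that band's energy rose between consecutive frames. All frame sizes and band counts are illustrative assumptions, not taken from any of the reviewed papers:

```python
import numpy as np

def spectral_fingerprint(signal, frame_len=256, hop=128, n_bands=8):
    """Toy audio fingerprint: for each hop, window a frame, take the
    FFT power spectrum, sum it into a few coarse bands, then emit one
    bit per band indicating whether its energy rose since last frame."""
    frames = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = signal[start:start + frame_len] * np.hanning(frame_len)
        spectrum = np.abs(np.fft.rfft(frame)) ** 2
        bands = np.array_split(spectrum[1:], n_bands)  # drop DC bin
        frames.append([band.sum() for band in bands])
    energies = np.asarray(frames)
    # One bit per band: did its energy rise between consecutive frames?
    return (np.diff(energies, axis=0) > 0).astype(np.uint8)

fs = 8000
t = np.arange(fs) / fs
# A 1 s linear chirp (200 -> 2000 Hz): its band energies shift over
# time, so the fingerprint bits carry real temporal structure.
chirp = np.sin(2 * np.pi * (200 * t + 900 * t ** 2))
fp = spectral_fingerprint(chirp)
```

Real systems add perceptual band spacing, overlap-aware thresholds, and indexing for fast lookup; the bit matrix here is only the core idea.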

  18. 37 CFR 201.28 - Statements of Account for digital audio recording devices or media.

    Code of Federal Regulations, 2014 CFR

    2014-07-01

    ... digital audio recording devices or media. 201.28 Section 201.28 Patents, Trademarks, and Copyrights U.S. COPYRIGHT OFFICE, LIBRARY OF CONGRESS COPYRIGHT OFFICE AND PROCEDURES GENERAL PROVISIONS § 201.28 Statements of Account for digital audio recording devices or media. (a) General. This section prescribes rules...

  19. Effects of audio-visual presentation of target words in word translation training

    NASA Astrophysics Data System (ADS)

    Akahane-Yamada, Reiko; Komaki, Ryo; Kubo, Rieko

    2004-05-01

Komaki and Akahane-Yamada (Proc. ICA2004) used a 2AFC translation task in vocabulary training, in which the target word is presented visually in orthographic form in one language, and the appropriate meaning in another language has to be chosen between two choices. The present paper examined the effect of audio-visual presentation of target words when native speakers of Japanese learn to translate English words into Japanese. Pairs of English words contrasted in several phonemic distinctions (e.g., /r/-/l/, /b/-/v/, etc.) were used as word materials, and presented in three conditions: visual-only (V), audio-only (A), and audio-visual (AV) presentations. Identification accuracy of those words produced by two talkers was also assessed. During pretest, the accuracy for A stimuli was lowest, implying that insufficient translation ability and listening ability interact with each other when an aurally presented word has to be translated. However, there was no difference in accuracy between V and AV stimuli, suggesting that participants translate the words depending on visual information only. The effect of translation training using AV stimuli did not transfer to identification ability, showing that additional audio information during translation does not help improve speech perception. Further examination is necessary to determine the effective L2 training method. [Work supported by TAO, Japan.]

  20. Audio spectrum and sound pressure levels vary between pulse oximeters.

    PubMed

    Chandra, Deven; Tessler, Michael J; Usher, John

    2006-01-01

The variable-pitch pulse oximeter is an important intraoperative patient monitor. Our ability to hear its auditory signal depends on its acoustical properties and our hearing. This study quantitatively describes the audio spectrum and sound pressure levels of the monitoring tones produced by five variable-pitch pulse oximeters. We compared the Datex-Ohmeda Capnomac Ultima, Hewlett-Packard M1166A, Datex-Engstrom AS/3, Ohmeda Biox 3700, and Datex-Ohmeda 3800 oximeters. Three machines of each of the five models were assessed for sound pressure levels (using a precision sound level meter) and audio spectrum (using a hanning-windowed fast Fourier transform of three beats at saturations of 99%, 90%, and 85%). The widest range of sound pressure levels was produced by the Hewlett-Packard M1166A (46.5 +/- 1.74 dB to 76.9 +/- 2.77 dB). The loudest model was the Datex-Engstrom AS/3 (89.2 +/- 5.36 dB). Three oximeters, when set to the lower ranges of their volume settings, were indistinguishable from background operating room noise. Each model produced sounds with different audio spectra. Although each model produced a fundamental tone with multiple harmonic overtones, the number of harmonics varied with each model: from three harmonic tones on the Hewlett-Packard M1166A, to 12 on the Ohmeda Biox 3700. There were variations between models, and individual machines of the same model, with respect to the fundamental tone associated with a given saturation. There is considerable variance in the sound pressure and audio spectrum of commercially available pulse oximeters. Further studies are warranted in order to establish standards.
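The spectral analysis described above (a Hanning-windowed FFT locating a fundamental and its harmonic overtones) can be sketched generically as follows. This is not the study's analysis pipeline, just an illustration of the technique on a synthetic oximeter-like beep; the tone composition and thresholds are invented for the example:

```python
import numpy as np

def harmonic_peaks(tone, fs, n_harmonics=3, min_rel=0.01):
    """Estimate the fundamental and harmonic frequencies of a steady
    tone from a Hanning-windowed FFT magnitude spectrum.

    For each expected harmonic k*f0, search a small neighborhood of
    bins for the local peak and keep it if it rises above a relative
    magnitude floor. Illustrative only, not a clinical method.
    """
    windowed = tone * np.hanning(len(tone))
    mag = np.abs(np.fft.rfft(windowed))
    freqs = np.fft.rfftfreq(len(tone), d=1.0 / fs)
    fundamental = freqs[np.argmax(mag[1:]) + 1]  # skip the DC bin
    peaks = []
    for k in range(1, n_harmonics + 1):
        idx = np.argmin(np.abs(freqs - k * fundamental))
        lo, hi = max(idx - 3, 0), min(idx + 4, len(mag))
        j = lo + np.argmax(mag[lo:hi])           # local spectral peak
        if mag[j] >= min_rel * mag.max():
            peaks.append(freqs[j])
    return peaks

fs = 8000
t = np.arange(0, 0.5, 1 / fs)
# A 440 Hz fundamental with two weaker overtones, oximeter-beep style.
beep = (np.sin(2 * np.pi * 440 * t)
        + 0.3 * np.sin(2 * np.pi * 880 * t)
        + 0.1 * np.sin(2 * np.pi * 1320 * t))
peaks = harmonic_peaks(beep, fs)
```

With 0.5 s of signal at 8 kHz the FFT bin spacing is 2 Hz, so the recovered peaks land essentially on the true harmonic frequencies.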