NASA Astrophysics Data System (ADS)
Ng, Theam Foo; Pham, Tuan D.; Zhou, Xiaobo
2010-01-01
With the rapid development of multi-dimensional data compression and pattern classification techniques, vector quantization (VQ) has become a method that allows a large reduction in data storage and computational effort. One of the most recent VQ techniques that handles the poor estimation of vector centroids caused by biased data from undersampling is fuzzy declustering-based vector quantization (FDVQ). In this paper, we therefore examine an FDVQ-based hidden Markov model (HMM) to investigate its effectiveness and efficiency in the classification of genotype-image phenotypes. The recognition accuracy of the proposed FDVQ-based HMM (FDVQ-HMM) is evaluated and compared with that of the well-known LBG (Linde, Buzo, Gray) vector quantization based HMM (LBG-HMM). The experimental results show that the performances of FDVQ-HMM and LBG-HMM are nearly the same. Finally, we justify the competitiveness of FDVQ-HMM in the classification of a cellular phenotype image database using a hypothesis t-test. As a result, we have validated that the FDVQ algorithm is a robust and efficient classification technique for RNAi genome-wide screening image data.
Cough event classification by pretrained deep neural network.
Liu, Jia-Ming; You, Mingyu; Wang, Zheng; Li, Guo-Zheng; Xu, Xianghuai; Qiu, Zhongmin
2015-01-01
Cough is an essential symptom in respiratory diseases. For measuring cough severity, the respiratory disease community needs an accurate and objective cough monitor. This paper aims to introduce a better-performing algorithm, the pretrained deep neural network (DNN), to the cough classification problem, which is a key step in the cough monitor. The deep neural network models are built in two steps, pretraining and fine-tuning, followed by a Hidden Markov Model (HMM) decoder to capture temporal information of the audio signals. By unsupervised pretraining of a deep belief network, a good initialization for a deep neural network is learned. The fine-tuning step then uses backpropagation to tune the neural network so that it can predict the observation probability associated with each HMM state, where the HMM states are originally obtained by forced alignment with a Gaussian Mixture Model Hidden Markov Model (GMM-HMM) on the training samples. Three cough HMMs and one noncough HMM are employed to model coughs and noncoughs respectively. The final decision is based on the Viterbi decoding algorithm, which generates the most likely HMM sequence for each sample. A sample is labeled as cough if a cough HMM is found in the sequence. The experiments were conducted on a dataset collected from 22 patients with respiratory diseases. Patient dependent (PD) and patient independent (PI) experimental settings were used to evaluate the models. Five criteria, sensitivity, specificity, F1, macro average and micro average, are shown to depict different aspects of the models. Over the evaluation criteria as a whole, the DNN based methods are superior to the traditional GMM-HMM based method on F1 and micro average, with maximal 14% and 11% error reduction in PD and 7% and 10% in PI, while keeping similar performance on macro average. They also surpass the GMM-HMM model on specificity, with maximal 14% error reduction on both PD and PI.
In this paper, we applied a pretrained deep neural network to the cough classification problem. Our results showed that, compared with the conventional GMM-HMM framework, the HMM-DNN achieved better overall performance on the cough classification task.
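The Viterbi decision rule described in this abstract can be illustrated with a minimal sketch. It works at the level of individual states rather than whole cough and noncough HMMs, and the two state names, the "loud"/"quiet" observations and all probabilities are hypothetical values chosen for illustration; none of them come from the paper.

```python
import math

def viterbi(obs, states, start_p, trans_p, emit_p):
    """Return (most likely state path, its log-probability).

    Works in the log domain; all probabilities used must be nonzero.
    """
    log = math.log
    # best[t][s]: best log-probability of any path that ends in state s at time t
    best = [{s: log(start_p[s]) + log(emit_p[s][obs[0]]) for s in states}]
    back = [{}]
    for t in range(1, len(obs)):
        best.append({})
        back.append({})
        for s in states:
            prev = max(states, key=lambda p: best[t - 1][p] + log(trans_p[p][s]))
            best[t][s] = (best[t - 1][prev] + log(trans_p[prev][s])
                          + log(emit_p[s][obs[t]]))
            back[t][s] = prev
    # Backtrack from the best final state.
    last = max(states, key=lambda s: best[-1][s])
    path = [last]
    for t in range(len(obs) - 1, 0, -1):
        path.append(back[t][path[-1]])
    path.reverse()
    return path, best[-1][last]

# Hypothetical toy model (illustrative numbers only).
states = ["cough", "noncough"]
start_p = {"cough": 0.2, "noncough": 0.8}
trans_p = {"cough": {"cough": 0.6, "noncough": 0.4},
           "noncough": {"cough": 0.1, "noncough": 0.9}}
emit_p = {"cough": {"loud": 0.9, "quiet": 0.1},
          "noncough": {"loud": 0.05, "quiet": 0.95}}

path, logp = viterbi(["quiet", "loud", "loud", "quiet"],
                     states, start_p, trans_p, emit_p)
# Decision rule from the abstract: label the sample as cough
# if a cough unit appears anywhere in the decoded sequence.
is_cough = "cough" in path
```

In a DNN-HMM system of the kind described, the emission terms would come from the network's outputs rather than from a fixed table; the dynamic programming itself is unchanged.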
Hidden Semi-Markov Models and Their Application
NASA Astrophysics Data System (ADS)
Beyreuther, M.; Wassermann, J.
2008-12-01
In the framework of detection and classification of seismic signals there are several different approaches. Our choice for a more robust detection and classification algorithm is to adopt Hidden Markov Models (HMM), a technique showing major success in speech recognition. HMM provide a powerful tool to describe highly variable time series based on a doubly stochastic model and therefore allow for a broader class description than, e.g., template based pattern matching techniques. Being a fully probabilistic model, HMM directly provide a confidence measure of an estimated classification. Furthermore, and in contrast to classic artificial neural networks or support vector machines, HMM incorporate the time dependence explicitly in the models, thus providing an adequate representation of the seismic signal. Like the majority of detection algorithms, HMM are not based on the time and amplitude dependent seismogram itself but on features estimated from the seismogram which characterize the different classes. Features, or in other words characteristic functions, are e.g. the sonogram bands, instantaneous frequency, instantaneous bandwidth or centroid time. In this study we apply continuous Hidden Semi-Markov Models (HSMM), an extension of continuous HMM. The duration probability of an HMM is an exponentially decaying function of time, which is not a realistic representation of the duration of an earthquake. In contrast, HSMM use Gaussians as duration probabilities, which results in a more adequate model. The HSMM detection and classification system is running online as an EARTHWORM module at the Bavarian Earthquake Service. Here the signals that are to be classified simply differ in epicentral distance. This makes it easy to decide whether a classification is correct or wrong and thus allows a better evaluation of the advantages and disadvantages of the proposed algorithm.
The evaluation is based on several months of continuous data and the results are additionally compared to the previously published discrete HMM, continuous HMM and a classic STA/LTA. The intermediate evaluation results are very promising.
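The remark about exponentially decaying durations can be made explicit with a short derivation. The symbols below ($a_{ii}$ for the self-transition probability of state $i$, $d$ for the number of consecutive time steps spent in that state, $\mu_i$ and $\sigma_i$ for the Gaussian duration parameters) are notation assumed here, not taken from the abstract.

```latex
% Standard HMM: leaving state i is a Bernoulli trial at every step,
% so the state duration d is geometrically distributed.
p_i(d) = a_{ii}^{\,d-1}\,(1 - a_{ii}), \qquad d = 1, 2, \dots,
\qquad \mathbb{E}[d] = \frac{1}{1 - a_{ii}}

% HSMM: the implicit geometric law is replaced by an explicit duration
% density, here a Gaussian as the abstract describes.
p_i(d) \propto \exp\!\left( -\frac{(d - \mu_i)^2}{2\sigma_i^2} \right)
```

The geometric law always assigns its highest probability to the shortest duration, $d = 1$, whereas the Gaussian can peak at a typical event length $\mu_i$.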
Monthly streamflow forecasting based on hidden Markov model and Gaussian Mixture Regression
NASA Astrophysics Data System (ADS)
Liu, Yongqi; Ye, Lei; Qin, Hui; Hong, Xiaofeng; Ye, Jiajun; Yin, Xingli
2018-06-01
Reliable streamflow forecasts can be highly valuable for water resources planning and management. In this study, we combined a hidden Markov model (HMM) and Gaussian Mixture Regression (GMR) for probabilistic monthly streamflow forecasting. The HMM is initialized using a kernelized K-medoids clustering method, and the Baum-Welch algorithm is then executed to learn the model parameters. GMR derives a conditional probability distribution for the predictand given covariate information, including the antecedent flow at a local station and two surrounding stations. The performance of HMM-GMR was verified based on the mean square error and continuous ranked probability score skill scores. The reliability of the forecasts was assessed by examining the uniformity of the probability integral transform values. The results show that HMM-GMR obtained reasonably high skill scores and the uncertainty spread was appropriate. Different HMM states were assumed to be different climate conditions, which would lead to different types of observed values. We demonstrated that the HMM-GMR approach can handle multimodal and heteroscedastic data.
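For reference, the conditional distribution that Gaussian Mixture Regression derives has a standard closed form. The notation below is an assumption for illustration: $x$ is the covariate vector, $y$ the predictand, and $\pi_k$, $\mu_k$, $\Sigma_k$ are the weights, means and covariances of a joint Gaussian mixture over $(x, y)$. The abstract does not say how the HMM states enter these weights, so this is the generic GMR formulation only.

```latex
p(y \mid x) = \sum_{k=1}^{K} w_k(x)\, \mathcal{N}\!\left( y ;\, m_k(x),\, S_k \right)

w_k(x) = \frac{\pi_k\, \mathcal{N}\!\left( x ;\, \mu_k^{x}, \Sigma_k^{xx} \right)}
              {\sum_{j=1}^{K} \pi_j\, \mathcal{N}\!\left( x ;\, \mu_j^{x}, \Sigma_j^{xx} \right)}

m_k(x) = \mu_k^{y} + \Sigma_k^{yx} \left( \Sigma_k^{xx} \right)^{-1} \left( x - \mu_k^{x} \right)

S_k = \Sigma_k^{yy} - \Sigma_k^{yx} \left( \Sigma_k^{xx} \right)^{-1} \Sigma_k^{xy}
```

Because the mixing weights $w_k(x)$ depend on the covariates, the predictive density can be multimodal and its spread can vary with $x$, which is consistent with the multimodal and heteroscedastic behaviour mentioned in the abstract.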
NASA Astrophysics Data System (ADS)
Nishiura, Takanobu; Nakamura, Satoshi
2003-10-01
Humans communicate with each other through speech by focusing on the target speech among environmental sounds in real acoustic environments. We can easily identify the target sound from other environmental sounds. For hands-free speech recognition, the identification of the target speech from environmental sounds is imperative. This mechanism may also be important for a self-moving robot to sense the acoustic environments and communicate with humans. Therefore, this paper first proposes hidden Markov model (HMM)-based environmental sound source identification. Environmental sounds are modeled by three states of HMMs and evaluated using 92 kinds of environmental sounds. The identification accuracy was 95.4%. This paper also proposes a new HMM composition method that composes speech HMMs and an HMM of categorized environmental sounds for robust environmental sound-added speech recognition. As a result of the evaluation experiments, we confirmed that the proposed HMM composition outperforms the conventional HMM composition with speech HMMs and a noise (environmental sound) HMM trained using noise periods prior to the target speech in a captured signal. [Work supported by Ministry of Public Management, Home Affairs, Posts and Telecommunications of Japan.]
Efficient view based 3-D object retrieval using Hidden Markov Model
NASA Astrophysics Data System (ADS)
Jain, Yogendra Kumar; Singh, Roshan Kumar
2013-12-01
Recent research effort has been dedicated to view based 3-D object retrieval because of the highly discriminative properties of 3-D objects and their multi-view representation. State-of-the-art methods depend heavily on their own camera array settings for capturing views of a 3-D object and use complex Zernike descriptors and HAC for representative view selection, which limits their practical application and makes them inefficient for retrieval. Therefore, an efficient and effective algorithm is required for 3-D object retrieval. To move toward a general framework for efficient 3-D object retrieval that is independent of the camera array setting and avoids representative view selection, we propose an Efficient View Based 3-D Object Retrieval (EVBOR) method using a Hidden Markov Model (HMM). In this framework, each object is represented by an independent set of views, which means views are captured from any direction without any camera array restriction. The views (including the query view) are clustered to generate view clusters, which are then used to build the query model with the HMM. In our proposed method, the HMM is used in two ways: in training (i.e. HMM estimation) and in retrieval (i.e. HMM decoding). The query model is trained using these view clusters. The EVBOR query model thus works by combining the query model with the HMM. The proposed approach removes the static camera array setting for view capturing and can be applied to any 3-D object database to retrieve 3-D objects efficiently and effectively. Experimental results demonstrate that the proposed scheme performs better than existing methods.
Schulz, Vincent; Chen, Min; Tuck, David
2010-01-01
Background: Genotyping platforms such as single nucleotide polymorphism (SNP) arrays are powerful tools to study genomic aberrations in cancer samples. Allele specific information from SNP arrays provides valuable information for interpreting copy number variation (CNV) and allelic imbalance including loss-of-heterozygosity (LOH) beyond that obtained from the total DNA signal available from array comparative genomic hybridization (aCGH) platforms. Several algorithms based on hidden Markov models (HMMs) have been designed to detect copy number changes and copy-neutral LOH making use of the allele information on SNP arrays. However, heterogeneity in clinical samples, due to stromal contamination and somatic alterations, complicates analysis and interpretation of these data. Methods: We have developed MixHMM, a novel hidden Markov model using hidden states based on chromosomal structural aberrations. MixHMM allows CNV detection for copy numbers up to 7 and allows more complete and accurate description of other forms of allelic imbalance, such as increased copy number LOH or imbalanced amplifications. MixHMM also incorporates a novel sample mixing model that allows detection of tumor CNV events in heterogeneous tumor samples, where cancer cells are mixed with a proportion of stromal cells. Conclusions: We validate MixHMM and demonstrate its advantages with simulated samples, clinical tumor samples and a dilution series of mixed samples. We have shown that the CNVs of cancer cells in a tumor sample contaminated with up to 80% of stromal cells can be detected accurately using Illumina BeadChip and MixHMM. Availability: MixHMM is available as a Python package provided with some other useful tools at http://genecube.med.yale.edu:8080/MixHMM. PMID:20532221
NASA Astrophysics Data System (ADS)
Mukhopadhyay, Sabyasachi; Das, Nandan K.; Kurmi, Indrajit; Pradhan, Asima; Ghosh, Nirmalya; Panigrahi, Prasanta K.
2017-10-01
We report the application of a hidden Markov model (HMM) on multifractal tissue optical properties derived via the Born approximation-based inverse light scattering method for effective discrimination of precancerous human cervical tissue sites from the normal ones. Two global fractal parameters, generalized Hurst exponent and the corresponding singularity spectrum width, computed by multifractal detrended fluctuation analysis (MFDFA), are used here as potential biomarkers. We develop a methodology that makes use of these multifractal parameters by integrating with different statistical classifiers like the HMM and support vector machine (SVM). It is shown that the MFDFA-HMM integrated model achieves significantly better discrimination between normal and different grades of cancer as compared to the MFDFA-SVM integrated model.
Research on gait-based human identification
NASA Astrophysics Data System (ADS)
Li, Youguo
Gait recognition refers to the automatic identification of an individual based on his/her style of walking. This paper proposes a gait recognition method based on a Continuous Hidden Markov Model with a Mixture of Gaussians (G-CHMM). First, we initialize a Gaussian mixture model for the training image sequence with the K-means algorithm, then train the HMM parameters using the Baum-Welch algorithm. These gait feature sequences are used to train a continuous HMM for every person, so that the 7 key frames and the obtained HMM represent each person's gait sequence. Finally, recognition is achieved by the forward algorithm. Experiments on the CASIA gait databases obtain a comparatively high correct identification rate and comparatively strong robustness to variation in body angle.
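Assuming the recognition step is the standard HMM forward algorithm, a per-person classifier can be sketched as follows: score the test sequence under each person's HMM and pick the model with the highest likelihood. For brevity the sketch uses discrete emissions rather than the Gaussian mixtures described in the abstract, and the two models and their parameters are hypothetical.

```python
import math

def forward_loglik(obs, start, trans, emit):
    """Log-likelihood of a discrete observation sequence under an HMM.

    Uses the scaled forward recursion so long sequences do not underflow.
    """
    n = len(start)
    alpha = [start[i] * emit[i][obs[0]] for i in range(n)]
    loglik = 0.0
    for t in range(len(obs)):
        if t > 0:
            # Propagate through the transitions, then weight by the emission.
            alpha = [
                sum(alpha[i] * trans[i][j] for i in range(n)) * emit[j][obs[t]]
                for j in range(n)
            ]
        scale = sum(alpha)
        loglik += math.log(scale)
        alpha = [a / scale for a in alpha]
    return loglik

# Hypothetical per-person models: (start, transition, emission) probabilities.
models = {
    "person_a": ([0.6, 0.4], [[0.7, 0.3], [0.4, 0.6]], [[0.9, 0.1], [0.2, 0.8]]),
    "person_b": ([0.5, 0.5], [[0.5, 0.5], [0.5, 0.5]], [[0.2, 0.8], [0.3, 0.7]]),
}

sequence = [0, 0, 0, 0]  # quantized gait features (illustrative)
scores = {name: forward_loglik(sequence, *params) for name, params in models.items()}
best = max(scores, key=scores.get)
```

With continuous features, `emit[j][obs[t]]` would be replaced by a Gaussian mixture density evaluated at the feature vector; the recursion is otherwise the same.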
Efficient Learning of Continuous-Time Hidden Markov Models for Disease Progression
Liu, Yu-Ying; Li, Shuang; Li, Fuxin; Song, Le; Rehg, James M.
2016-01-01
The Continuous-Time Hidden Markov Model (CT-HMM) is an attractive approach to modeling disease progression due to its ability to describe noisy observations arriving irregularly in time. However, the lack of an efficient parameter learning algorithm for CT-HMM restricts its use to very small models or requires unrealistic constraints on the state transitions. In this paper, we present the first complete characterization of efficient EM-based learning methods for CT-HMM models. We demonstrate that the learning problem consists of two challenges: the estimation of posterior state probabilities and the computation of end-state conditioned statistics. We solve the first challenge by reformulating the estimation problem in terms of an equivalent discrete time-inhomogeneous hidden Markov model. The second challenge is addressed by adapting three approaches from the continuous time Markov chain literature to the CT-HMM domain. We demonstrate the use of CT-HMMs with more than 100 states to visualize and predict disease progression using a glaucoma dataset and an Alzheimer’s disease dataset. PMID:27019571
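One ingredient behind the ability to handle irregular observation times is a general property of continuous-time Markov chains: for a generator matrix Q, the state transition probabilities over a gap of length dt are P(dt) = exp(Q dt). The sketch below computes this in pure Python with a scaled Taylor series; the function names and the two-state generator are assumptions for illustration, and this is not the learning algorithm the paper proposes.

```python
import math

def mat_mul(a, b):
    """Multiply two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def transition_matrix(q, dt, terms=20, squarings=10):
    """Approximate P(dt) = exp(Q * dt) by scaling and squaring a Taylor series."""
    n = len(q)
    scale = dt / (2 ** squarings)
    step = [[q[i][j] * scale for j in range(n)] for i in range(n)]
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms + 1):
        # term becomes step^k / k!
        term = [[x / k for x in row] for row in mat_mul(term, step)]
        result = [[result[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    for _ in range(squarings):
        result = mat_mul(result, result)
    return result

# Hypothetical two-state generator: leave state 0 at rate 0.3, state 1 at rate 0.1.
q = [[-0.3, 0.3],
     [0.1, -0.1]]
p = transition_matrix(q, 2.5)

# Closed form for a two-state chain, used here as a sanity check.
expected_p00 = 0.25 + 0.75 * math.exp(-1.0)
```

For a patient whose visits are separated by different gaps, one such matrix would be computed per gap, which yields the kind of time-inhomogeneous discrete model the abstract mentions.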
A state-based probabilistic model for tumor respiratory motion prediction
NASA Astrophysics Data System (ADS)
Kalet, Alan; Sandison, George; Wu, Huanmei; Schmitz, Ruth
2010-12-01
This work proposes a new probabilistic mathematical model for predicting tumor motion and position based on a finite state representation using the natural breathing states of exhale, inhale and end of exhale. Tumor motion was broken down into linear breathing states and sequences of states. Breathing state sequences and the observables representing those sequences were analyzed using a hidden Markov model (HMM) to predict the future sequences and new observables. Velocities and other parameters were clustered using a k-means clustering algorithm to associate each state with a set of observables such that a prediction of state also enables a prediction of tumor velocity. A time average model with predictions based on average past state lengths was also computed. State sequences which are known a priori to fit the data were fed into the HMM algorithm to set a theoretical limit of the predictive power of the model. The effectiveness of the presented probabilistic model has been evaluated for gated radiation therapy based on previously tracked tumor motion in four lung cancer patients. Positional prediction accuracy is compared with actual position in terms of the overall RMS errors. Various system delays, ranging from 33 to 1000 ms, were tested. Previous studies have shown duty cycles for latencies of 33 and 200 ms at around 90% and 80%, respectively, for linear, no prediction, Kalman filter and ANN methods as averaged over multiple patients. At 1000 ms, the previously reported duty cycles range from approximately 62% (ANN) down to 34% (no prediction). Average duty cycle for the HMM method was found to be 100% and 91 ± 3% for 33 and 200 ms latency and around 40% for 1000 ms latency in three out of four breathing motion traces. RMS errors were found to be lower than linear and no prediction methods at latencies of 1000 ms. 
The results show that for system latencies longer than 400 ms, the time average HMM prediction outperforms linear, no prediction, and the more general HMM-type predictive models. RMS errors for the time average model approach the theoretical limit of the HMM, and predicted state sequences are well correlated with sequences known to fit the data.
An economic evaluation of home management of malaria in Uganda: an interactive Markov model.
Lubell, Yoel; Mills, Anne J; Whitty, Christopher J M; Staedke, Sarah G
2010-08-27
Home management of malaria (HMM), promoting presumptive treatment of febrile children in the community, is advocated to improve prompt appropriate treatment of malaria in Africa. The cost-effectiveness of HMM is likely to vary widely in different settings and with the antimalarial drugs used. However, no data on the cost-effectiveness of HMM programmes are available. A Markov model was constructed to estimate the cost-effectiveness of HMM as compared to conventional care for febrile illnesses in children without HMM. The model was populated with data from Uganda, but is designed to be interactive, allowing the user to adjust certain parameters, including the antimalarials distributed. The model calculates the cost per disability adjusted life year averted and presents the incremental cost-effectiveness ratio compared to a threshold value. Model output is stratified by level of malaria transmission and the probability that a child would receive appropriate care from a health facility, to indicate the circumstances in which HMM is likely to be cost-effective. The model output suggests that the cost-effectiveness of HMM varies with malaria transmission, the probability of appropriate care, and the drug distributed. Where transmission is high and the probability of appropriate care is limited, HMM is likely to be cost-effective from a provider perspective. Even with the most effective antimalarials, HMM remains an attractive intervention only in areas of high malaria transmission and in medium transmission areas with a lower probability of appropriate care. HMM is generally not cost-effective in low transmission areas, regardless of which antimalarial is distributed. Considering the analysis from the societal perspective decreases the attractiveness of HMM. Syndromic HMM for children with fever may be a useful strategy for higher transmission settings with limited health care and diagnosis, but is not appropriate for all settings. 
HMM may need to be tailored to specific settings, accounting for local malaria transmission intensity and availability of health services.
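The comparison of an incremental cost-effectiveness ratio (ICER) with a threshold, as described in this abstract, reduces to a small calculation. The sketch below is only an illustration: the function name, the cost and DALY figures and the threshold are hypothetical and are not taken from the study.

```python
def icer(cost_new, cost_old, dalys_averted_new, dalys_averted_old):
    """Incremental cost per additional DALY averted by the new strategy."""
    delta_cost = cost_new - cost_old
    delta_effect = dalys_averted_new - dalys_averted_old
    if delta_effect == 0:
        raise ValueError("no difference in effect; ICER is undefined")
    return delta_cost / delta_effect

# Hypothetical totals for a cohort of febrile children (illustrative only).
ratio = icer(cost_new=12000.0, cost_old=9000.0,
             dalys_averted_new=150.0, dalys_averted_old=100.0)

threshold = 100.0  # hypothetical willingness-to-pay per DALY averted
# The ratio is only meaningful as a cost per unit of benefit
# when the new strategy averts more DALYs than the comparator.
cost_effective = ratio <= threshold
```

In an interactive model of the kind described, the cost and effect totals would be outputs of the Markov model for the chosen transmission level, probability of appropriate care and drug.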
Zamunér, Antonio R.; Catai, Aparecida M.; Martins, Luiz E. B.; Sakabe, Daniel I.; Silva, Ester Da
2013-01-01
Background: The second heart rate (HR) turn point has been extensively studied; however, there are few studies determining the first HR turn point. Also, the use of mathematical and statistical models for determining changes in dynamic characteristics of physiological variables during an incremental cardiopulmonary test has been suggested. Objectives: To determine the first turn point by analysis of HR, surface electromyography (sEMG), and carbon dioxide output (VCO2) using two mathematical models and to compare the results to those of the visual method. Method: Ten sedentary middle-aged men (53.9±3.2 years old) were submitted to cardiopulmonary exercise testing on an electromagnetic cycle ergometer until exhaustion. Ventilatory variables, HR, and sEMG of the vastus lateralis were obtained in real time. Three methods were used to determine the first turn point: 1) visual analysis based on loss of parallelism between VCO2 and oxygen uptake (VO2); 2) the linear-linear model, based on fitting the curves to the set of VCO2 data (Lin-LinVCO2); 3) a bi-segmental linear regression of Hinkley's algorithm applied to HR (HMM-HR), VCO2 (HMM-VCO2), and sEMG data (HMM-RMS). Results: There were no differences between workload, HR, and ventilatory variable values at the first ventilatory turn point as determined by the five studied parameters (p>0.05). The Bland-Altman plot showed an even distribution of the visual analysis method with Lin-LinVCO2, HMM-HR, HMM-VCO2, and HMM-RMS. Conclusion: The proposed mathematical models were effective in determining the first turn point since they detected the linear pattern change and the deflection point of VCO2, HR responses, and sEMG. PMID:24346296
Zamunér, Antonio R; Catai, Aparecida M; Martins, Luiz E B; Sakabe, Daniel I; Da Silva, Ester
2013-01-01
The second heart rate (HR) turn point has been extensively studied, however there are few studies determining the first HR turn point. Also, the use of mathematical and statistical models for determining changes in dynamic characteristics of physiological variables during an incremental cardiopulmonary test has been suggested. To determine the first turn point by analysis of HR, surface electromyography (sEMG), and carbon dioxide output (VCO2) using two mathematical models and to compare the results to those of the visual method. Ten sedentary middle-aged men (53.9 ± 3.2 years old) were submitted to cardiopulmonary exercise testing on an electromagnetic cycle ergometer until exhaustion. Ventilatory variables, HR, and sEMG of the vastus lateralis were obtained in real time. Three methods were used to determine the first turn point: 1) visual analysis based on loss of parallelism between VCO2 and oxygen uptake (VO2); 2) the linear-linear model, based on fitting the curves to the set of VCO2 data (Lin-LinVCO2); 3) a bi-segmental linear regression of Hinkley's algorithm applied to HR (HMM-HR), VCO2 (HMM-VCO2), and sEMG data (HMM-RMS). There were no differences between workload, HR, and ventilatory variable values at the first ventilatory turn point as determined by the five studied parameters (p>0.05). The Bland-Altman plot showed an even distribution of the visual analysis method with Lin-LinVCO2, HMM-HR, HMM-VCO2, and HMM-RMS. The proposed mathematical models were effective in determining the first turn point since they detected the linear pattern change and the deflection point of VCO2, HR responses, and sEMG.
Analysis of swallowing sounds using hidden Markov models.
Aboofazeli, Mohammad; Moussavi, Zahra
2008-04-01
In recent years, acoustical analysis of the swallowing mechanism has received considerable attention due to its diagnostic potential. This paper presents a hidden Markov model (HMM) based method for swallowing sound segmentation and classification. Swallowing sound signals of 15 healthy and 11 dysphagic subjects were studied. The signals were divided into sequences of 25 ms segments, each of which was represented by seven features. The sequences of features were modeled by HMMs. Trained HMMs were used for segmentation of the swallowing sounds into three distinct phases, i.e., initial quiet period, initial discrete sounds (IDS) and bolus transit sounds (BTS). Among the seven features, accuracy of segmentation by the HMM based on multi-scale product of wavelet coefficients was higher than that of the other HMMs and the linear prediction coefficient (LPC)-based HMM showed the weakest performance. In addition, HMMs were used for classification of the swallowing sounds of healthy subjects and dysphagic patients. Classification accuracy of different HMM configurations was investigated. When we increased the number of states of the HMMs from 4 to 8, the classification error gradually decreased. In most cases, classification error for N=9 was higher than that of N=8. Among the seven features used, root mean square (RMS) and waveform fractal dimension (WFD) showed the best performance in the HMM-based classification of swallowing sounds. When the sequences of the features of the IDS segment were modeled separately, the accuracy reached up to 85.5%. As a second stage classification, a screening algorithm was used which correctly classified all the subjects but one healthy subject when RMS was used as the characteristic feature of the swallowing sounds and the number of states was set to N=8.
QRS complex detection based on continuous density hidden Markov models using univariate observations
NASA Astrophysics Data System (ADS)
Sotelo, S.; Arenas, W.; Altuve, M.
2018-04-01
In the electrocardiogram (ECG), the detection of QRS complexes is a fundamental step in the ECG signal processing chain since it allows the determination of other characteristic waves of the ECG and provides information about heart rate variability. In this work, an automatic QRS complex detector based on continuous density hidden Markov models (HMM) is proposed. HMM were trained using univariate observation sequences taken either from QRS complexes or their derivatives. The detection approach is based on comparing the log-likelihood of the observation sequence with a fixed threshold. A sliding window was used to obtain the observation sequence to be evaluated by the model. The threshold was optimized using receiver operating characteristic curves. Sensitivity (Sen), specificity (Spc) and F1 score were used to evaluate the detection performance. The approach was validated using ECG recordings from the MIT-BIH Arrhythmia database. A 6-fold cross-validation shows that the best detection performance was achieved with a 2-state HMM trained with QRS complex sequences (Sen = 0.668, Spc = 0.360 and F1 = 0.309). We concluded that these univariate sequences provide enough information to characterize the QRS complex dynamics with HMM. Future work is directed to the use of multivariate observations to increase the detection performance.
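The sliding-window thresholding scheme described in this abstract can be sketched generically. In the sketch below the `score` argument stands in for the HMM log-likelihood of a window; the stand-in scoring function, the signal values, the window length and the threshold are all hypothetical and serve only to show the control flow.

```python
def detect(signal, window, threshold, score):
    """Return the start indices of windows whose score exceeds the threshold."""
    hits = []
    for start in range(len(signal) - window + 1):
        segment = signal[start:start + window]
        if score(segment) > threshold:
            hits.append(start)
    return hits

def toy_score(segment):
    """Stand-in for a model log-likelihood: highest (zero) when every sample is 1.0."""
    return -sum((x - 1.0) ** 2 for x in segment)

signal = [0.0, 0.0, 1.0, 1.0, 1.0, 0.0, 0.0]
hits = detect(signal, window=3, threshold=-0.5, score=toy_score)
```

Sweeping `threshold` over a range and recording the resulting sensitivity and specificity produces the receiver operating characteristic curve from which an operating point can be chosen.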
DNA motif elucidation using belief propagation.
Wong, Ka-Chun; Chan, Tak-Ming; Peng, Chengbin; Li, Yue; Zhang, Zhaolei
2013-09-01
Protein-binding microarray (PBM) is a high-throughput platform that can measure the DNA-binding preference of a protein in a comprehensive and unbiased manner. A typical PBM experiment can measure binding signal intensities of a protein to all the possible DNA k-mers (k=8~10); such comprehensive binding affinity data usually need to be reduced and represented as motif models before they can be further analyzed and applied. Since proteins can often bind to DNA in multiple modes, one of the major challenges is to decompose the comprehensive affinity data into multimodal motif representations. Here, we describe a new algorithm that uses Hidden Markov Models (HMMs) and can derive precise and multimodal motifs using belief propagations. We describe an HMM-based approach using belief propagations (kmerHMM), which accepts and preprocesses PBM probe raw data into median-binding intensities of individual k-mers. The k-mers are ranked and aligned for training an HMM as the underlying motif representation. Multiple motifs are then extracted from the HMM using belief propagations. Comparisons of kmerHMM with other leading methods on several data sets demonstrated its effectiveness and uniqueness. In particular, it achieved the best performance on more than half of the data sets. In addition, the multiple binding modes derived by kmerHMM are biologically meaningful and will be useful in interpreting other genome-wide data such as those generated from ChIP-seq. The executables and source codes are available at the authors' websites: e.g. http://www.cs.toronto.edu/~wkc/kmerHMM.
An HMM model for coiled-coil domains and a comparison with PSSM-based predictions.
Delorenzi, Mauro; Speed, Terry
2002-04-01
Large-scale sequence data require methods for the automated annotation of protein domains. Many of the predictive methods are based either on a Position Specific Scoring Matrix (PSSM) of fixed length or on a window-less Hidden Markov Model (HMM). The performance of the two approaches is tested for Coiled-Coil Domains (CCDs). The prediction of CCDs is used frequently, and its optimization seems worthwhile. We have conceived MARCOIL, an HMM for the recognition of proteins with a CCD on a genomic scale. A cross-validated study suggests that MARCOIL improves predictions compared to the traditional PSSM algorithm, especially for some protein families and for short CCDs. The study was designed to reveal differences inherent in the two methods. Potential confounding factors such as differences in the dimension of parameter space and in the parameter values were avoided by using the same amino acid propensities and by keeping the transition probabilities of the HMM constant during cross-validation. The prediction program and the databases are available at http://www.wehi.edu.au/bioweb/Mauro/Marcoil
Huda, Shamsul; Yearwood, John; Togneri, Roberto
2009-02-01
This paper attempts to overcome the tendency of the expectation-maximization (EM) algorithm to locate a local rather than global maximum when applied to estimate the hidden Markov model (HMM) parameters in speech signal modeling. We propose a hybrid algorithm for estimation of the HMM in automatic speech recognition (ASR) using a constraint-based evolutionary algorithm (EA) and EM, the CEL-EM. The novelty of our hybrid algorithm (CEL-EM) is that it is applicable for estimation of the constraint-based models with many constraints and large numbers of parameters (which use EM) like HMM. Two constraint-based versions of the CEL-EM with different fusion strategies have been proposed using a constraint-based EA and the EM for better estimation of HMM in ASR. The first one uses a traditional constraint-handling mechanism of EA. The other version transforms a constrained optimization problem into an unconstrained problem using Lagrange multipliers. Fusion strategies for the CEL-EM use a staged-fusion approach where EM has been plugged with the EA periodically after the execution of EA for a specific period of time to maintain the global sampling capabilities of EA in the hybrid algorithm. A variable initialization approach (VIA) has been proposed using a variable segmentation to provide a better initialization for EA in the CEL-EM. Experimental results on the TIMIT speech corpus show that CEL-EM obtains higher recognition accuracies than the traditional EM algorithm as well as a top-standard EM (VIA-EM, constructed by applying the VIA to EM).
Khatun, Jainab; Hamlett, Eric; Giddings, Morgan C
2008-03-01
The identification of peptides by tandem mass spectrometry (MS/MS) is a central method of proteomics research, but due to the complexity of MS/MS data and the large databases searched, the accuracy of peptide identification algorithms remains limited. To improve the accuracy of identification we applied a machine-learning approach using a hidden Markov model (HMM) to capture the complex and often subtle links between a peptide sequence and its MS/MS spectrum. Our model, HMM_Score, represents ion types as HMM states and calculates the maximum joint probability for a peptide/spectrum pair using emission probabilities from three factors: the amino acids adjacent to each fragmentation site, the mass dependence of ion types and the intensity dependence of ion types. The Viterbi algorithm is used to calculate the most probable assignment between ion types in a spectrum and a peptide sequence, then a correction factor is added to account for the propensity of the model to favor longer peptides. An expectation value is calculated based on the model score to assess the significance of each peptide/spectrum match. We trained and tested HMM_Score on three data sets generated by two different mass spectrometer types. For a reference data set recently reported in the literature and validated using seven identification algorithms, HMM_Score produced 43% more positive identification results at a 1% false positive rate than the best of two other commonly used algorithms, Mascot and X!Tandem. HMM_Score is a highly accurate platform for peptide identification that works well for a variety of mass spectrometer and biological sample types. The program is freely available on ProteomeCommons via an OpenSource license. See http://bioinfo.unc.edu/downloads/ for the download link.
Conditional Density Estimation with HMM Based Support Vector Machines
NASA Astrophysics Data System (ADS)
Hu, Fasheng; Liu, Zhenqiu; Jia, Chunxin; Chen, Dechang
Conditional density estimation is very important in financial engineering, risk management, and other engineering computing problems. However, most regression models have a latent assumption that the probability density is a Gaussian distribution, which is not necessarily true in many real-life applications. In this paper, we give a framework to estimate or predict the conditional density mixture dynamically. By combining the Input-Output HMM with SVM regression and building an SVM model in each state of the HMM, we can estimate a conditional density mixture instead of a single Gaussian. With an SVM in each node, this model can be applied not only to regression but to classification as well. We applied this model to denoise ECG data. The proposed method has the potential to be applied to other time series, such as stock market return predictions.
Enhancing speech recognition using improved particle swarm optimization based hidden Markov model.
Selvaraj, Lokesh; Ganesan, Balakrishnan
2014-01-01
Enhancing speech recognition is the primary intention of this work. In this paper a novel speech recognition method based on vector quantization and improved particle swarm optimization (IPSO) is suggested. The suggested methodology contains four stages, namely, (i) denoising, (ii) feature mining, (iii) vector quantization, and (iv) IPSO based hidden Markov model (HMM) technique (IP-HMM). At first, the speech signals are denoised using a median filter. Next, characteristics such as peak, pitch spectrum, Mel frequency cepstral coefficients (MFCC), mean, standard deviation, and minimum and maximum of the signal are extracted from the denoised signal. Following that, to accomplish the training process, the extracted characteristics are given to genetic algorithm based codebook generation in vector quantization. The initial populations for the genetic algorithm process are created by selecting random code vectors from the training set for the codebooks, and IP-HMM performs the recognition. The novelty at this point lies in one of the genetic operations, crossover. The proposed speech recognition technique offers 97.14% accuracy.
An articulatorily constrained, maximum entropy approach to speech recognition and speech coding
DOE Office of Scientific and Technical Information (OSTI.GOV)
Hogden, J.
Hidden Markov models (HMMs) are among the most popular tools for performing computer speech recognition. One of the primary reasons that HMMs typically outperform other speech recognition techniques is that the parameters used for recognition are determined by the data, not by preconceived notions of what the parameters should be. This makes HMMs better able to deal with intra- and inter-speaker variability despite the limited knowledge of how speech signals vary and despite the often limited ability to correctly formulate rules describing variability and invariance in speech. In fact, it is often the case that when HMM parameter values are constrained using the limited knowledge of speech, recognition performance decreases. However, the structure of an HMM has little in common with the mechanisms underlying speech production. Here, the author argues that by using probabilistic models that more accurately embody the process of speech production, he can create models that have all the advantages of HMMs, but that should more accurately capture the statistical properties of real speech samples, presumably leading to more accurate speech recognition. The model he will discuss uses the fact that speech articulators move smoothly and continuously. Before discussing how to use articulatory constraints, he will give a brief description of HMMs. This will allow him to highlight the similarities and differences between HMMs and the proposed technique.
Wissel, Tobias; Pfeiffer, Tim; Frysch, Robert; Knight, Robert T.; Chang, Edward F.; Hinrichs, Hermann; Rieger, Jochem W.; Rose, Georg
2013-01-01
Objective: Support Vector Machines (SVM) have developed into a gold standard for accurate classification in Brain-Computer-Interfaces (BCI). The choice of the most appropriate classifier for a particular application depends on several characteristics in addition to decoding accuracy. Here we investigate the implementation of Hidden Markov Models (HMM) for online BCIs and discuss strategies to improve their performance. Approach: We compare the SVM, serving as a reference, and HMMs for classifying discrete finger movements obtained from the Electrocorticograms of four subjects doing a finger tapping experiment. The classifier decisions are based on a subset of low-frequency time domain and high gamma oscillation features. Main results: We show that decoding optimization between the two approaches is due to the way features are extracted and selected and less dependent on the classifier. An additional gain in HMM performance of up to 6% was obtained by introducing model constraints. Comparable accuracies of up to 90% were achieved with both SVM and HMM with the high gamma cortical response providing the most important decoding information for both techniques. Significance: We discuss technical HMM characteristics and adaptations in the context of the presented data as well as for general BCI applications. Our findings suggest that HMMs and their characteristics are promising for efficient online brain-computer interfaces. PMID:24045504
Accelerating Information Retrieval from Profile Hidden Markov Model Databases.
Tamimi, Ahmad; Ashhab, Yaqoub; Tamimi, Hashem
2016-01-01
Profile Hidden Markov Model (Profile-HMM) is an efficient statistical approach to represent protein families. Currently, several databases maintain valuable protein sequence information as profile-HMMs. There is an increasing interest to improve the efficiency of searching Profile-HMM databases to detect sequence-profile or profile-profile homology. However, most efforts to enhance searching efficiency have been focusing on improving the alignment algorithms. Although the performance of these algorithms is fairly acceptable, the growing size of these databases, as well as the increasing demand for using batch query searching approach, are strong motivations that call for further enhancement of information retrieval from profile-HMM databases. This work presents a heuristic method to accelerate the current profile-HMM homology searching approaches. The method works by cluster-based remodeling of the database to reduce the search space, rather than focusing on the alignment algorithms. Using different clustering techniques, 4284 TIGRFAMs profiles were clustered based on their similarities. A representative for each cluster was assigned. To enhance sensitivity, we proposed an extended step that allows overlapping among clusters. A validation benchmark of 6000 randomly selected protein sequences was used to query the clustered profiles. To evaluate the efficiency of our approach, speed and recall values were measured and compared with the sequential search approach. Using hierarchical, k-means, and connected component clustering techniques followed by the extended overlapping step, we obtained an average reduction in time of 41%, and an average recall of 96%. Our results demonstrate that representation of profile-HMMs using a clustering-based approach can significantly accelerate data retrieval from profile-HMM databases.
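The search-space reduction described in this abstract can be sketched generically: score the query against one representative per cluster first, then only against the members of the best-scoring clusters. The Python sketch below is an illustration under stated assumptions, not the authors' method. The function names, the `top_k` parameter, and the toy position-match score (standing in for a real sequence-profile alignment score) are all hypothetical.

```python
def clustered_search(query, clusters, score, top_k=1):
    """Two-stage search over clustered profiles.

    clusters maps a representative to the list of its member profiles;
    a profile may appear in several clusters (overlap is allowed).
    """
    # Stage 1: compare the query with the cluster representatives only.
    ranked = sorted(clusters, key=lambda rep: score(query, rep), reverse=True)
    # Stage 2: compare the query with members of the top_k clusters.
    candidates = []
    for rep in ranked[:top_k]:
        for member in clusters[rep]:
            if member not in candidates:
                candidates.append(member)
    return max(candidates, key=lambda member: score(query, member))


def toy_score(a, b):
    """Hypothetical similarity: number of positions with identical characters."""
    return sum(1 for x, y in zip(a, b) if x == y)


# Toy "database": strings stand in for profiles; "AATT" belongs to two clusters.
CLUSTERS = {
    "AAAA": ["AAAA", "AAAT", "AATT"],
    "GGGG": ["GGGG", "GGGC", "AATT"],
}

print(clustered_search("AATT", CLUSTERS, toy_score))
```

With two clusters of three members each, the query is scored against two representatives and three members instead of all profiles. Raising `top_k` trades speed for recall, which mirrors the speed/recall trade-off the abstract reports.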
ssHMM: extracting intuitive sequence-structure motifs from high-throughput RNA-binding protein data
Krestel, Ralf; Ohler, Uwe; Vingron, Martin; Marsico, Annalisa
2017-01-01
Abstract RNA-binding proteins (RBPs) play an important role in RNA post-transcriptional regulation and recognize target RNAs via sequence-structure motifs. The extent to which RNA structure influences protein binding in the presence or absence of a sequence motif is still poorly understood. Existing RNA motif finders either take the structure of the RNA only partially into account, or employ models which are not directly interpretable as sequence-structure motifs. We developed ssHMM, an RNA motif finder based on a hidden Markov model (HMM) and Gibbs sampling which fully captures the relationship between RNA sequence and secondary structure preference of a given RBP. Compared to previous methods which output separate logos for sequence and structure, it directly produces a combined sequence-structure motif when trained on a large set of sequences. ssHMM’s model is visualized intuitively as a graph and facilitates biological interpretation. ssHMM can be used to find novel bona fide sequence-structure motifs of uncharacterized RBPs, such as the one presented here for the YY1 protein. ssHMM reaches a high motif recovery rate on synthetic data, it recovers known RBP motifs from CLIP-Seq data, and scales linearly on the input size, being considerably faster than MEMERIS and RNAcontext on large datasets while being on par with GraphProt. It is freely available on Github and as a Docker image. PMID:28977546
ERIC Educational Resources Information Center
Xu, Jianzhong
2014-01-01
This study examines models of variables posited to predict students' homework motivation management (HMM), based on survey data from 866 8th graders (61 classes) and 745 11th graders (46 classes) in the south-eastern USA. Most of the variance in HMM occurred at the student level, with parent education as the only significant predictor at the class…
Ralph, Duncan K; Matsen, Frederick A
2016-01-01
VDJ rearrangement and somatic hypermutation work together to produce antibody-coding B cell receptor (BCR) sequences for a remarkable diversity of antigens. It is now possible to sequence these BCRs in high throughput; analysis of these sequences is bringing new insight into how antibodies develop, in particular for broadly-neutralizing antibodies against HIV and influenza. A fundamental step in such sequence analysis is to annotate each base as coming from a specific one of the V, D, or J genes, or from an N-addition (a.k.a. non-templated insertion). Previous work has used simple parametric distributions to model transitions from state to state in a hidden Markov model (HMM) of VDJ recombination, and assumed that mutations occur via the same process across sites. However, codon frame and other effects have been observed to violate these parametric assumptions for such coding sequences, suggesting that a non-parametric approach to modeling the recombination process could be useful. In our paper, we find that indeed large modern data sets suggest a model using parameter-rich per-allele categorical distributions for HMM transition probabilities and per-allele-per-position mutation probabilities, and that using such a model for inference leads to significantly improved results. We present an accurate and efficient BCR sequence annotation software package using a novel HMM "factorization" strategy. This package, called partis (https://github.com/psathyrella/partis/), is built on a new general-purpose HMM compiler that can perform efficient inference given a simple text description of an HMM.
Gacesa, Ranko; Zucko, Jurica; Petursdottir, Solveig K; Gudmundsdottir, Elisabet Eik; Fridjonsson, Olafur H; Diminic, Janko; Long, Paul F; Cullum, John; Hranueli, Daslav; Hreggvidsson, Gudmundur O; Starcevic, Antonio
2017-06-01
The MEGGASENSE platform constructs relational databases of DNA or protein sequences. The default functional analysis uses 14 106 hidden Markov model (HMM) profiles based on sequences in the KEGG database. The Solr search engine allows sophisticated queries and a BLAST search function is also incorporated. These standard capabilities were used to generate the SCATT database from the predicted proteome of Streptomyces cattleya. The implementation of a specialised metagenome database (AMYLOMICS) for bioprospecting of carbohydrate-modifying enzymes is described. In addition to standard assembly of reads, a novel 'functional' assembly was developed, in which screening of reads with the HMM profiles occurs before the assembly. The AMYLOMICS database incorporates additional HMM profiles for carbohydrate-modifying enzymes and it is illustrated how the combination of HMM and BLAST analyses helps identify interesting genes. A variety of different proteome and metagenome databases have been generated by MEGGASENSE.
Dynamic and Contextual Information in HMM Modeling for Handwritten Word Recognition.
Bianne-Bernard, Anne-Laure; Menasri, Farès; Al-Hajj Mohamad, Rami; Mokbel, Chafic; Kermorvant, Christopher; Likforman-Sulem, Laurence
2011-10-01
This study aims at building an efficient word recognition system resulting from the combination of three handwriting recognizers. The main component of this combined system is an HMM-based recognizer which considers dynamic and contextual information for a better modeling of writing units. For modeling the contextual units, a state-tying process based on decision tree clustering is introduced. Decision trees are built according to a set of expert-based questions on how characters are written. Questions are divided into global questions, yielding larger clusters, and precise questions, yielding smaller ones. Such clustering enables us to reduce the total number of models and Gaussian densities by 10. We then apply this modeling to the recognition of handwritten words. Experiments are conducted on three publicly available databases based on Latin or Arabic languages: Rimes, IAM, and OpenHart. The results obtained show that contextual information embedded with dynamic modeling significantly improves recognition.
Ma, Xiang; Schonfeld, Dan; Khokhar, Ashfaq A
2009-06-01
In this paper, we propose a novel solution to an arbitrary noncausal, multidimensional hidden Markov model (HMM) for image and video classification. First, we show that the noncausal model can be solved by splitting it into multiple causal HMMs and simultaneously solving each causal HMM using a fully synchronous distributed computing framework, therefore referred to as distributed HMMs. Next we present an approximate solution to the multiple causal HMMs that is based on an alternating updating scheme and assumes a realistic sequential computing framework. The parameters of the distributed causal HMMs are estimated by extending the classical 1-D training and classification algorithms to multiple dimensions. The proposed extension to arbitrary causal, multidimensional HMMs allows state transitions that are dependent on all causal neighbors. We, thus, extend three fundamental algorithms to multidimensional causal systems, i.e., 1) expectation-maximization (EM), 2) general forward-backward (GFB), and 3) Viterbi algorithms. In the simulations, we choose to limit ourselves to a noncausal 2-D model whose noncausality is along a single dimension, in order to significantly reduce the computational complexity. Simulation results demonstrate the superior performance, higher accuracy rate, and applicability of the proposed noncausal HMM framework to image and video classification.
Classification of Multiple Seizure-Like States in Three Different Rodent Models of Epileptogenesis.
Guirgis, Mirna; Serletis, Demitre; Zhang, Jane; Florez, Carlos; Dian, Joshua A; Carlen, Peter L; Bardakjian, Berj L
2014-01-01
Epilepsy is a dynamical disease and its effects are evident in over fifty million people worldwide. This study focused on objective classification of the multiple states involved in the brain's epileptiform activity. Four datasets from three different rodent hippocampal preparations were explored, wherein seizure-like-events (SLE) were induced by the perfusion of a low-Mg(2+)/high-K(+) solution or 4-Aminopyridine. Local field potentials were recorded from CA3 pyramidal neurons and interneurons and modeled as Markov processes. Specifically, hidden Markov models (HMM) were used to determine the nature of the states present. Properties of the Hilbert transform were used to construct the feature spaces for HMM training. By sequentially applying the HMM training algorithm, multiple states were identified both in episodes of SLE and nonSLE activity. Specifically, preSLE and postSLE states were differentiated and multiple inner SLE states were identified. This was accomplished using features extracted from the lower frequencies (1-4 Hz, 4-8 Hz) alongside those of both the low- (40-100 Hz) and high-gamma (100-200 Hz) of the recorded electrical activity. The learning paradigm of this HMM-based system eliminates the inherent bias associated with other learning algorithms that depend on predetermined state segmentation and renders it an appropriate candidate for SLE classification.
Improving a HMM-based off-line handwriting recognition system using MME-PSO optimization
NASA Astrophysics Data System (ADS)
Hamdani, Mahdi; El Abed, Haikal; Hamdani, Tarek M.; Märgner, Volker; Alimi, Adel M.
2011-01-01
One of the trivial steps in the development of a classifier is the design of its architecture. This paper presents a new algorithm, Multi Models Evolvement (MME) using Particle Swarm Optimization (PSO). This algorithm is a modified version of the basic PSO, which is used for the unsupervised design of Hidden Markov Model (HMM) based architectures. As an example, the proposed algorithm is applied to an Arabic handwriting recognizer based on discrete probability HMMs. After the optimization of their architectures, the HMMs are trained with the Baum-Welch algorithm. The validation of the system is based on the IfN/ENIT database. The performance of the developed approach is compared to that of the systems participating in the 2005 Arabic handwriting recognition competition organized at the International Conference on Document Analysis and Recognition (ICDAR). The final system is a combination of an optimized HMM with 6 other HMMs obtained by a simple variation of the number of states. An absolute improvement of 6% in word recognition rate, reaching about 81%, is presented. This improvement is measured relative to the basic system (ARAB-IfN). The proposed recognizer also outperforms most of the known state-of-the-art systems.
Ghouila, Amel; Florent, Isabelle; Guerfali, Fatma Zahra; Terrapon, Nicolas; Laouini, Dhafer; Yahia, Sadok Ben; Gascuel, Olivier; Bréhélin, Laurent
2014-01-01
Identification of protein domains is a key step for understanding protein function. Hidden Markov Models (HMMs) have proved to be a powerful tool for this task. The Pfam database notably provides a large collection of HMMs which are widely used for the annotation of proteins in sequenced organisms. This is done via sequence/HMM comparisons. However, this approach may lack sensitivity when searching for domains in divergent species. Recently, methods for HMM/HMM comparisons have been proposed and proved to be more sensitive than sequence/HMM approaches in certain cases. However, these approaches are usually not used for protein domain discovery at a genome scale, and the benefit that could be expected from their utilization for this problem has not been investigated. Using proteins of P. falciparum and L. major as examples, we investigate the extent to which HMM/HMM comparisons can identify new domain occurrences not already identified by sequence/HMM approaches. We show that although HMM/HMM comparisons are much more sensitive than sequence/HMM comparisons, they are not sufficiently accurate to be used as a standalone complement of sequence/HMM approaches at the genome scale. Hence, we propose to use domain co-occurrence — the general domain tendency to preferentially appear along with some favorite domains in the proteins — to improve the accuracy of the approach. We show that the combination of HMM/HMM comparisons and co-occurrence domain detection boosts protein annotations. At an estimated False Discovery Rate of 5%, it revealed 901 and 1098 new domains in Plasmodium and Leishmania proteins, respectively. Manual inspection of part of these predictions shows that it contains several domain families that were missing in the two organisms. All new domain occurrences have been integrated in the EuPathDomains database, along with the GO annotations that can be deduced. PMID:24901648
Near-Native Protein Loop Sampling Using Nonparametric Density Estimation Accommodating Sparcity
Day, Ryan; Lennox, Kristin P.; Sukhanov, Paul; Dahl, David B.; Vannucci, Marina; Tsai, Jerry
2011-01-01
Unlike the core structural elements of a protein like regular secondary structure, template based modeling (TBM) has difficulty with loop regions due to their variability in sequence and structure as well as the sparse sampling from a limited number of homologous templates. We present a novel, knowledge-based method for loop sampling that leverages homologous torsion angle information to estimate a continuous joint backbone dihedral angle density at each loop position. The φ,ψ distributions are estimated via a Dirichlet process mixture of hidden Markov models (DPM-HMM). Models are quickly generated based on samples from these distributions and were enriched using an end-to-end distance filter. The performance of the DPM-HMM method was evaluated against a diverse test set in a leave-one-out approach. Candidates as low as 0.45 Å RMSD and with a worst case of 3.66 Å were produced. For the canonical loops like the immunoglobulin complementarity-determining regions (mean RMSD <2.0 Å), the DPM-HMM method performs as well or better than the best templates, demonstrating that our automated method recaptures these canonical loops without inclusion of any IgG specific terms or manual intervention. In cases with poor or few good templates (mean RMSD >7.0 Å), this sampling method produces a population of loop structures to around 3.66 Å for loops up to 17 residues. In a direct test of sampling to the Loopy algorithm, our method demonstrates the ability to sample nearer native structures for both the canonical CDRH1 and non-canonical CDRH3 loops. Lastly, in the realistic test conditions of the CASP9 experiment, successful application of DPM-HMM for 90 loops from 45 TBM targets shows the general applicability of our sampling method in loop modeling problem. These results demonstrate that our DPM-HMM produces an advantage by consistently sampling near native loop structure. 
The software used in this analysis is available for download at http://www.stat.tamu.edu/~dahl/software/cortorgles/. PMID:22028638
Post processing of optically recognized text via second order hidden Markov model
NASA Astrophysics Data System (ADS)
Poudel, Srijana
In this thesis, we describe a postprocessing system for text generated by Optical Character Recognition (OCR). A second-order Hidden Markov Model (HMM) approach is used to detect and correct OCR-related errors. The reason for choosing the second-order HMM is to keep track of bigrams so that the model can represent the system more accurately. Based on experiments with training data of 159,733 characters and test data of 5,688 characters, the model was able to correct 43.38% of the errors with a precision of 75.34%. However, the precision value indicates that the model introduced some new errors, decreasing the correction percentage to 26.4%.
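To make the role of bigrams in a second-order model concrete: the next character is conditioned on the previous two characters rather than on one. The following Python sketch estimates such transition probabilities from trigram counts in a training string. It is a generic illustration; the thesis does not describe its estimation procedure here, so the function name, the unsmoothed maximum-likelihood estimate, and the toy string are assumptions.

```python
from collections import Counter, defaultdict


def second_order_transitions(text):
    """Estimate P(next | prev2, prev1) from character trigram counts.

    Returns a dict mapping a bigram context (prev2, prev1) to a dict of
    next-character probabilities. No smoothing is applied (assumption).
    """
    trigram_counts = Counter(zip(text, text[1:], text[2:]))
    context_totals = Counter()
    for (a, b, _), n in trigram_counts.items():
        context_totals[(a, b)] += n
    probs = defaultdict(dict)
    for (a, b, c), n in trigram_counts.items():
        probs[(a, b)][c] = n / context_totals[(a, b)]
    return dict(probs)


# Hypothetical toy training string.
table = second_order_transitions("abababac")
print(table[("b", "a")])
```

In the toy string, the context ("b", "a") is followed by "b" twice and by "c" once, so the estimated probabilities are 2/3 and 1/3. A first-order model would only see the single preceding character and could not separate contexts that share their last character.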
Mining adverse drug reactions from online healthcare forums using hidden Markov model.
Sampathkumar, Hariprasad; Chen, Xue-wen; Luo, Bo
2014-10-23
Adverse Drug Reactions are one of the leading causes of injury or death among patients undergoing medical treatments. Not all Adverse Drug Reactions are identified before a drug is made available in the market. Current post-marketing drug surveillance methods, which are based purely on voluntary spontaneous reports, are unable to provide the early indications necessary to prevent the occurrence of such injuries or fatalities. The objective of this research is to extract reports of adverse drug side-effects from messages in online healthcare forums and use them as early indicators to assist in post-marketing drug surveillance. We treat the task of extracting adverse side-effects of drugs from healthcare forum messages as a sequence labeling problem and present a Hidden Markov Model (HMM) based Text Mining system that can be used to classify a message as containing drug side-effect information and then extract the adverse side-effect mentions from it. A manually annotated dataset from http://www.medications.com is used in the training and validation of the HMM based Text Mining system. A 10-fold cross-validation on the manually annotated dataset yielded on average an F-Score of 0.76 from the HMM Classifier, in comparison to 0.575 from the Baseline classifier. Without the Plain Text Filter component as a part of the Text Processing module, the F-Score of the HMM Classifier was reduced to 0.378 on average, while absence of the HTML Filter component was found to have no impact. Reducing the Drug names dictionary size by half, on average reduced the F-Score of the HMM Classifier to 0.359, while a similar reduction to the side-effects dictionary yielded an F-Score of 0.651 on average. Adverse side-effects mined from http://www.medications.com and http://www.steadyhealth.com were found to match the Adverse Drug Reactions on the Drug Package Labels of several drugs.
In addition, some novel adverse side-effects, which can be potential Adverse Drug Reactions, were also identified. The results from the HMM based Text Miner are encouraging to pursue further enhancements to this approach. The mined novel side-effects can act as early indicators for health authorities to help focus their efforts in post-marketing drug surveillance.
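For readers unfamiliar with the F-Score values quoted in this abstract, the sketch below shows how a balanced F-score (F1) is computed from classification counts. It assumes that the reported F-Score is the balanced F1, which the abstract does not state explicitly, and the counts in the example are hypothetical rather than taken from the study.

```python
def f_score(true_pos, false_pos, false_neg):
    """Balanced F-score (F1): harmonic mean of precision and recall."""
    if true_pos == 0:
        return 0.0  # no correct positives, so precision and recall are zero
    precision = true_pos / (true_pos + false_pos)
    recall = true_pos / (true_pos + false_neg)
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts: 8 messages correctly flagged, 2 wrongly flagged, 2 missed.
print(round(f_score(8, 2, 2), 3))
```

Because F1 is a harmonic mean, it stays low whenever either precision or recall is low, so a drop such as the one reported after halving the drug-name dictionary could come from either quantity.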
Malekpour, Seyed Amir; Pezeshk, Hamid; Sadeghi, Mehdi
2016-11-03
Copy Number Variation (CNV) is envisaged to be a major source of large structural variations in the human genome. In recent years, many studies have applied Next Generation Sequencing (NGS) data to CNV detection. However, more accurate computational tools are still needed. In this study, mate pair NGS data are used for CNV detection in a Hidden Markov Model (HMM). The proposed HMM has position-specific emission probabilities, i.e. a Gaussian mixture distribution. Each component in the Gaussian mixture distribution captures a different type of aberration that is observed in the mate pairs after they are mapped to the reference genome. These aberrations may include any increase (decrease) in the insertion size or change in the direction of mate pairs that are mapped to the reference genome. This HMM with Position-Specific Emission probabilities (PSE-HMM) is utilized for the genome-wide detection of deletions and tandem duplications. The performance of PSE-HMM is evaluated on a simulated dataset and also on real data from a Yoruban HapMap individual, NA18507. PSE-HMM is effective in taking observation dependencies into account and reaches a high accuracy in detecting genome-wide CNVs. MATLAB programs are available at http://bs.ipm.ir/softwares/PSE-HMM/ .
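A position-specific Gaussian mixture emission, as described in this abstract, can be sketched as a weighted sum of Gaussian densities whose parameters are looked up per genomic position. The Python sketch below is an illustration only and is unrelated to the authors' MATLAB programs. The function names, the per-position table, and the insert-size values are hypothetical.

```python
import math


def gaussian_pdf(x, mean, std):
    """Density of a univariate normal distribution at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def mixture_emission(x, components):
    """Emission density for a list of (weight, mean, std) components.

    The weights are assumed to sum to 1.
    """
    return sum(w * gaussian_pdf(x, m, s) for w, m, s in components)


# Hypothetical per-position parameters: position -> mixture components.
# The first component models a normal insert size, the second an enlarged one.
POSITION_COMPONENTS = {
    0: [(0.9, 3000.0, 300.0), (0.1, 5000.0, 300.0)],
    1: [(0.5, 3000.0, 300.0), (0.5, 5000.0, 300.0)],
}

observed_insert_size = 5000.0
for pos, comps in POSITION_COMPONENTS.items():
    print(pos, mixture_emission(observed_insert_size, comps))
```

In this toy table, an observed insert size of 5000 receives a higher emission density at position 1, where the enlarged-insert component carries more weight, than at position 0.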
Stylistic gait synthesis based on hidden Markov models
NASA Astrophysics Data System (ADS)
Tilmanne, Joëlle; Moinet, Alexis; Dutoit, Thierry
2012-12-01
In this work we present an expressive gait synthesis system based on hidden Markov models (HMMs), following and modifying a procedure originally developed for speaking style adaptation, in speech synthesis. A large database of neutral motion capture walk sequences was used to train an HMM of average walk. The model was then used for automatic adaptation to a particular style of walk using only a small amount of training data from the target style. The open source toolkit that we adapted for motion modeling also enabled us to take into account the dynamics of the data and to model accurately the duration of each HMM state. We also address the assessment issue and propose a procedure for qualitative user evaluation of the synthesized sequences. Our tests show that the style of these sequences can easily be recognized and look natural to the evaluators.
Handwritten digits recognition using HMM and PSO based on storks
NASA Astrophysics Data System (ADS)
Yan, Liao; Jia, Zhenhong; Yang, Jie; Pang, Shaoning
2010-07-01
A new method for handwritten digit recognition based on the hidden Markov model (HMM) and particle swarm optimization (PSO) is proposed. The method defines 24 directional strokes, which compensates for the sensitivity of traditional methods to the choice of starting point and also reduces the ambiguity caused by shakes. It makes use of the excellent global convergence of PSO, improving the probability of finding the optimum and avoiding local minima. Experimental results demonstrate that, compared with traditional methods, the proposed method improves the recognition rate for most handwritten digits.
Bidargaddi, Niranjan P; Chetty, Madhu; Kamruzzaman, Joarder
2008-06-01
Profile hidden Markov models (HMMs) based on classical HMMs have been widely applied for protein sequence identification. The formulation of the forward and backward variables in profile HMMs is made under the statistical independence assumption of probability theory. We propose a fuzzy profile HMM to overcome the limitations of that assumption and to achieve an improved alignment for protein sequences belonging to a given family. The proposed model fuzzifies the forward and backward variables by incorporating Sugeno fuzzy measures and Choquet integrals, thus further extending the generalized HMM. Based on the fuzzified forward and backward variables, we propose a fuzzy Baum-Welch parameter estimation algorithm for profiles. The strong correlations and the sequence preference involved in the protein structures make this fuzzy architecture based model a suitable candidate for building profiles of a given family, since the fuzzy set can handle uncertainties better than classical methods.
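For reference, the classical forward variable that this abstract proposes to fuzzify is computed by a sum-of-products recursion. The Python sketch below shows that classical recursion for a plain discrete HMM, not the fuzzy variant and not a full profile HMM with match, insert, and delete states. The states, symbols, and probabilities are hypothetical toy values.

```python
# Hypothetical toy model: two hidden states, three observable symbols.
STATES = ["A", "B"]
START_P = {"A": 0.6, "B": 0.4}
TRANS_P = {"A": {"A": 0.7, "B": 0.3}, "B": {"A": 0.4, "B": 0.6}}
EMIT_P = {
    "A": {"x": 0.5, "y": 0.4, "z": 0.1},
    "B": {"x": 0.1, "y": 0.3, "z": 0.6},
}


def forward(obs, states, start_p, trans_p, emit_p):
    """Return P(obs) for a non-empty observation list via the forward recursion."""
    # alpha[s]: probability of the observations so far, ending in state s
    alpha = {s: start_p[s] * emit_p[s][obs[0]] for s in states}
    for symbol in obs[1:]:
        # Classical step: sum over predecessors of alpha * transition,
        # then multiply by the emission probability of the new symbol.
        alpha = {
            s: emit_p[s][symbol] * sum(alpha[p] * trans_p[p][s] for p in states)
            for s in states
        }
    return sum(alpha.values())


print(round(forward(["x", "y", "z"], STATES, START_P, TRANS_P, EMIT_P), 5))
```

The inner sum is where the statistical independence assumption enters. As the abstract describes it, the fuzzy formulation replaces this additive combination with one based on Sugeno fuzzy measures and Choquet integrals.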
Damage evaluation by a guided wave-hidden Markov model based method
NASA Astrophysics Data System (ADS)
Mei, Hanfei; Yuan, Shenfang; Qiu, Lei; Zhang, Jinjin
2016-02-01
Guided wave based structural health monitoring has shown great potential in aerospace applications. However, one of the key challenges of practical engineering applications is the accurate interpretation of the guided wave signals under time-varying environmental and operational conditions. This paper presents a guided wave-hidden Markov model based method to improve the damage evaluation reliability of real aircraft structures under time-varying conditions. In the proposed approach, an HMM based unweighted moving average trend estimation method, which can capture the trend of damage propagation from the posterior probability obtained by HMM modeling, is used to achieve a probabilistic evaluation of the structural damage. To validate the developed method, experiments are performed on a hole-edge crack specimen under fatigue loading conditions and on a real aircraft wing spar under changing structural boundary conditions. Experimental results show the advantage of the proposed method.
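The trend estimation step mentioned in this abstract can be illustrated with a minimal sketch. It assumes the HMM has already produced a sequence of posterior probabilities for a damage state; the function name `moving_average_trend`, the window length and the sample values are hypothetical and are not taken from the paper.

```python
import numpy as np

def moving_average_trend(posterior, window):
    """Unweighted (simple) moving average of a 1-D posterior sequence."""
    posterior = np.asarray(posterior, dtype=float)
    if window < 1 or window > posterior.size:
        raise ValueError("window must be between 1 and len(posterior)")
    # Every sample in the window gets the same weight 1/window.
    kernel = np.ones(window) / window
    # mode="valid" keeps only positions where the window fits entirely.
    return np.convolve(posterior, kernel, mode="valid")

# Hypothetical posterior probabilities of a "damaged" state over time.
posterior = [0.1, 0.3, 0.2, 0.6, 0.5, 0.9]
trend = moving_average_trend(posterior, window=3)
print(trend)
```

The smoothed sequence has `len(posterior) - window + 1` points and suppresses sample-to-sample fluctuation, which is the usual reason for applying a moving average before reading off a trend.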
Chan, Kuang-Lim; Rosli, Rozana; Tatarinova, Tatiana V; Hogan, Michael; Firdaus-Raih, Mohd; Low, Eng-Ti Leslie
2017-01-27
Gene prediction is one of the most important steps in the genome annotation process. A large number of software tools and pipelines developed by various computing techniques are available for gene prediction. However, these systems have yet to accurately predict all or even most of the protein-coding regions. Furthermore, none of the currently available gene-finders has a universal Hidden Markov Model (HMM) that can perform gene prediction for all organisms equally well in an automatic fashion. We present an automated gene prediction pipeline, Seqping that uses self-training HMM models and transcriptomic data. The pipeline processes the genome and transcriptome sequences of the target species using GlimmerHMM, SNAP, and AUGUSTUS pipelines, followed by MAKER2 program to combine predictions from the three tools in association with the transcriptomic evidence. Seqping generates species-specific HMMs that are able to offer unbiased gene predictions. The pipeline was evaluated using the Oryza sativa and Arabidopsis thaliana genomes. Benchmarking Universal Single-Copy Orthologs (BUSCO) analysis showed that the pipeline was able to identify at least 95% of BUSCO's plantae dataset. Our evaluation shows that Seqping was able to generate better gene predictions compared to three HMM-based programs (MAKER2, GlimmerHMM and AUGUSTUS) using their respective available HMMs. Seqping had the highest accuracy in rice (0.5648 for CDS, 0.4468 for exon, and 0.6695 nucleotide structure) and A. thaliana (0.5808 for CDS, 0.5955 for exon, and 0.8839 nucleotide structure). Seqping provides researchers a seamless pipeline to train species-specific HMMs and predict genes in newly sequenced or less-studied genomes. We conclude that the Seqping pipeline predictions are more accurate than gene predictions using the other three approaches with the default or available HMMs.
Hidden Markov models and neural networks for fault detection in dynamic systems
NASA Technical Reports Server (NTRS)
Smyth, Padhraic
1994-01-01
Neural networks plus hidden Markov models (HMM) can provide excellent detection and false alarm rate performance in fault detection applications, as shown in this viewgraph presentation. Modified models allow for novelty detection. Key contributions of neural network models are: (1) excellent nonparametric discrimination capability; (2) a good estimator of posterior state probabilities, even in high dimensions, and thus can be embedded within overall probabilistic model (HMM); and (3) simple to implement compared to other nonparametric models. Neural network/HMM monitoring model is currently being integrated with the new Deep Space Network (DSN) antenna controller software and will be on-line monitoring a new DSN 34-m antenna (DSS-24) by July, 1994.
Cluster-based adaptive power control protocol using Hidden Markov Model for Wireless Sensor Networks
NASA Astrophysics Data System (ADS)
Vinutha, C. B.; Nalini, N.; Nagaraja, M.
2017-06-01
This paper presents strategies for an efficient and dynamic transmission power control technique that reduces packet drops, and hence the energy consumption, of power-hungry sensor nodes operating in the highly non-linear channel conditions of Wireless Sensor Networks. We also aim to prolong network lifetime and improve scalability by designing a cluster-based network structure. Specifically, we consider a weight-based clustering approach in which the minimum significant node is chosen as Cluster Head (CH); the weight is computed from distance, residual battery power and received signal strength (RSS). Transmission power control schemes that adapt to dynamic channel conditions are implemented using a Hidden Markov Model (HMM) whose transition probability matrix is formulated from the observed RSS measurements. Typically, the CH estimates the initial transmission power of its cluster members (CMs) from RSS using the HMM and broadcasts this value to the CMs to initialise their power. If the CH then finds variations in the link quality and RSS of the CMs, it re-computes and optimises the transmission power level of the nodes using the HMM to avoid packet loss due to noise interference. Our simulation results show that the technique efficiently controls the power levels of sensing nodes and saves a significant amount of energy for networks of different sizes.
2011-01-01
Background Epilepsy is a common neurological disorder characterized by recurrent electrophysiological activities, known as seizures. Without appropriate detection strategies, these seizure episodes can dramatically affect the quality of life of those afflicted. The rationale of this study is to develop an unsupervised algorithm for the detection of seizure states so that it may be implemented along with potential intervention strategies. Methods A hidden Markov model (HMM) was developed to interpret the state transitions of in vitro rat hippocampal slice local field potentials (LFPs) during seizure episodes. It can be used to estimate the probability of state transitions and the corresponding characteristics of each state. Wavelet features were clustered and used to differentiate the electrophysiological characteristics of each corresponding HMM state. Using an unsupervised training method, the HMM and the clustering parameters were obtained simultaneously. The HMM states were then assigned to the electrophysiological data using an expert-guided technique. Minimum redundancy maximum relevance (mRMR) analysis and the corrected Akaike Information Criterion (AICc) were applied to reduce the effect of over-fitting. The sensitivity, specificity and optimality index of chronic seizure detection were compared for various HMM topologies. The ability to distinguish early and late tonic firing patterns prior to chronic seizures was also evaluated. Results A significant improvement in state detection performance was achieved when additional wavelet coefficient rate-of-change information was used as features. The final HMM topology obtained using mRMR and AICc was able to detect non-ictal (interictal), early and late tonic firing, chronic seizures and postictal activities. A mean sensitivity of 95.7%, mean specificity of 98.9% and optimality index of 0.995 in the detection of chronic seizures were achieved.
The detection of early and late tonic firing was validated with experimental intracellular electrical recordings of seizures. Conclusions The HMM implementation of a seizure dynamics detector is an improvement over existing approaches using visual detection and complexity measures. The subjectivity involved in partitioning the observed data prior to training can be eliminated. It can also decipher the probabilities of seizure state transitions using the magnitude and rate of change wavelet information of the LFPs. PMID:21504608
NASA Astrophysics Data System (ADS)
Rustamov, Samir; Mustafayev, Elshan; Clements, Mark A.
2018-04-01
The context analysis of customer requests in a natural language call routing problem is investigated in the paper. One of the most significant problems in natural language call routing is the comprehension of the client request. To address this issue, hybrid HMM and ANFIS models are examined. Combining different types of models (ANFIS and HMM) can prevent the system from misidentifying user intention in a dialogue system. Because the classification process uses no lexical or syntactic analysis, the hybrid system based on these models may be employed in various languages and call routing domains.
Ju, Lining; Wang, Yijie Dylan; Hung, Ying; Wu, Chien-Fu Jeff; Zhu, Cheng
2013-01-01
Motivation: Abrupt reduction/resumption of thermal fluctuations of a force probe has been used to identify association/dissociation events of protein–ligand bonds. We show that the off-rate of molecular dissociation can be estimated by analysis of the bond lifetime, while the on-rate of molecular association can be estimated by analysis of the waiting time between two neighboring bond events. However, the analysis relies heavily on subjective judgments and is time-consuming. To automate the process of mapping out bond events from thermal fluctuation data, we develop a hidden Markov model (HMM)-based method. Results: The HMM method represents the bond state by a hidden variable with two values: bound and unbound. The bond association/dissociation is visualized and pinpointed. We apply the method to analyze a key receptor–ligand interaction in the early stage of hemostasis and thrombosis: the von Willebrand factor (VWF) binding to platelet glycoprotein Ibα (GPIbα). The numbers of bond lifetime and waiting time events estimated by the HMM are much greater than those estimated by a descriptive statistical method from the same set of raw data. The kinetic parameters estimated by the HMM are in excellent agreement with those from a descriptive statistical analysis, but have much smaller errors for both wild-type and two mutant VWF-A1 domains. Thus, the computerized analysis allows us to speed up the analysis and improve the quality of estimates of receptor–ligand binding kinetics. Contact: jeffwu@isye.gatech.edu or cheng.zhu@bme.gatech.edu PMID:23599504
Subwavelength focusing of terahertz waves in silicon hyperbolic metamaterials.
Kannegulla, Akash; Cheng, Li-Jing
2016-08-01
We theoretically demonstrate the subwavelength focusing of terahertz (THz) waves in a hyperbolic metamaterial (HMM) based on a two-dimensional subwavelength silicon pillar array microstructure. The silicon microstructure with a doping concentration of at least 10^17 cm^-3 offers a hyperbolic dispersion at terahertz frequency range and promises the focusing of terahertz Gaussian beams. The results agree with the simulation based on effective medium theory. The focusing effect can be controlled by the doping concentration, which determines the real part of the out-of-plane permittivity and, therefore, the refraction angles in HMM. The focusing property in the HMM structure allows the propagation of terahertz wave through a subwavelength aperture. The silicon-based HMM structure can be realized using microfabrication technologies and has the potential to advance terahertz imaging with subwavelength resolution.
Steele, James S; Bush, Keith; Stowe, Zachary N; James, George A; Smitherman, Sonet; Kilts, Clint D; Cisler, Josh
2018-01-01
Numerous data demonstrate that distracting emotional stimuli cause behavioral slowing (i.e. emotional conflict) and that behavior dynamically adapts to such distractors. However, the cognitive and neural mechanisms that mediate these behavioral findings are poorly understood. Several theoretical models have been developed that attempt to explain these phenomena, but these models have not been directly tested on human behavior nor compared. A potential tool to overcome this limitation is Hidden Markov Modeling (HMM), which is a computational approach to modeling indirectly observed systems. Here, we administered an emotional Stroop task to a sample of healthy adolescent girls (N = 24) during fMRI and used HMM to implement theoretical behavioral models. We then compared the model fits and tested for neural representations of the hidden states of the most supported model. We found that a modified variant of the model posited by Mathews et al. (1998) was most concordant with observed behavior and that brain activity was related to the model-based hidden states. Particularly, while the valences of the stimuli themselves were encoded primarily in the ventral visual cortex, the model-based detection of threatening targets was associated with increased activity in the bilateral anterior insula, while task effort (i.e. adaptation) was associated with reduction in the activity of these areas. These findings suggest that emotional target detection and adaptation are accomplished partly through increases and decreases, respectively, in the perceived immediate relevance of threatening cues and also demonstrate the efficacy of using HMM to apply theoretical models to human behavior.
PMID:29489856
Offline handwritten word recognition using MQDF-HMMs
NASA Astrophysics Data System (ADS)
Ramachandrula, Sitaram; Hambarde, Mangesh; Patial, Ajay; Sahoo, Dushyant; Kochar, Shaivi
2015-01-01
We propose an improved HMM formulation for offline handwriting recognition (HWR). The main contribution of this work is the use of the modified quadratic discriminant function (MQDF) [1] within the HMM framework. In an MQDF-HMM, the state observation likelihood is calculated as a weighted combination of the MQDF likelihoods of the individual Gaussians of a GMM (Gaussian mixture model). The quadratic discriminant function (QDF) of a multivariate Gaussian can be rewritten to avoid inverting the covariance matrix by using its eigenvalues and eigenvectors. The MQDF is derived from the QDF by substituting an appropriate constant for a few of the badly estimated smallest eigenvalues. This approach controls the estimation errors of the non-dominant eigenvectors and eigenvalues of the covariance matrix, for which the training data is insufficient. MQDF has been shown to improve character recognition performance [1]. Using MQDF in an HMM improves the computation, storage and modeling power of the HMM when training data is limited. We have obtained encouraging results on offline handwritten character recognition (NIST database) and English word recognition using MQDF-HMMs.
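The abstract does not state the discriminant functions explicitly. As a hedged sketch, the code below uses a commonly cited form: the QDF written through the eigen-decomposition of the covariance matrix, and an MQDF in which every eigenvalue beyond the `k` largest is replaced by one constant `delta`. The function names, `k`, `delta` and the sample data are illustrative assumptions, not values from the paper.

```python
import numpy as np

def qdf(x, mean, cov):
    """QDF via eigen-decomposition: sum(p_i^2 / lam_i) + sum(log lam_i)."""
    diff = x - mean
    lam, phi = np.linalg.eigh(cov)      # eigenvalues in ascending order
    proj = phi.T @ diff                 # projections onto the eigenvectors
    return np.sum(proj ** 2 / lam) + np.sum(np.log(lam))

def mqdf(x, mean, cov, k, delta):
    """MQDF sketch: keep the k largest eigenvalues, use delta for the rest."""
    diff = x - mean
    lam, phi = np.linalg.eigh(cov)
    order = np.argsort(lam)[::-1]       # indices sorted by descending eigenvalue
    lam_k = lam[order][:k]
    phi_k = phi[:, order][:, :k]
    proj = phi_k.T @ diff
    # Energy of diff outside the k dominant directions.
    residual = diff @ diff - np.sum(proj ** 2)
    d = diff.size
    return (np.sum(proj ** 2 / lam_k) + residual / delta
            + np.sum(np.log(lam_k)) + (d - k) * np.log(delta))

# Hypothetical 3-D Gaussian and test point.
mean = np.array([0.0, 1.0, -1.0])
cov = np.array([[4.0, 1.0, 0.0],
                [1.0, 3.0, 0.0],
                [0.0, 0.0, 0.5]])
x = np.array([1.0, 0.5, 0.0])

print(qdf(x, mean, cov))
print(mqdf(x, mean, cov, k=2, delta=1.0))
```

Only the `k` dominant eigenvectors need to be stored and projected onto, which is where the computation and storage savings mentioned in the abstract would come from. With `k` equal to the full dimension the two functions agree.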
De novo identification of replication-timing domains in the human genome by deep learning.
Liu, Feng; Ren, Chao; Li, Hao; Zhou, Pingkun; Bo, Xiaochen; Shu, Wenjie
2016-03-01
The de novo identification of the initiation and termination zones (regions that replicate earlier or later than their upstream and downstream neighbours, respectively) remains a key challenge in DNA replication. Building on advances in deep learning, we developed a novel hybrid architecture combining a pre-trained deep neural network and a hidden Markov model (DNN-HMM) for the de novo identification of replication domains using replication timing profiles. Our results demonstrate that DNN-HMM can significantly outperform strong, discriminatively trained Gaussian mixture model-HMM (GMM-HMM) systems and six other reported methods that can be applied to this challenge. We applied our trained DNN-HMM to identify distinct replication domain types, namely the early replication domain (ERD), the down transition zone (DTZ), the late replication domain (LRD) and the up transition zone (UTZ), using newly replicated DNA sequencing (Repli-Seq) data across 15 human cells. A subsequent integrative analysis revealed that these replication domains harbour unique genomic and epigenetic patterns, transcriptional activity and higher-order chromosomal structure. Our findings support the 'replication-domain' model, which states (1) that ERDs and LRDs, connected by UTZs and DTZs, are spatially compartmentalized structural and functional units of higher-order chromosomal structure, (2) that the adjacent DTZ-UTZ pairs form chromatin loops and (3) that intra-interactions within ERDs and LRDs tend to be short-range and long-range, respectively. Our model reveals an important chromatin organizational principle of the human genome and represents a critical step towards understanding the mechanisms regulating replication timing. Our DNN-HMM method and three additional algorithms can be freely accessed at https://github.com/wenjiegroup/DNN-HMM. The replication domain regions identified in this study are available in GEO under the accession ID GSE53984.
Contact: shuwj@bmi.ac.cn or boxc@bmi.ac.cn. Supplementary data are available at Bioinformatics online. © The Author 2015. Published by Oxford University Press.
Using hidden Markov models to align multiple sequences.
Mount, David W
2009-07-01
A hidden Markov model (HMM) is a probabilistic model of a multiple sequence alignment (msa) of proteins. In the model, each column of symbols in the alignment is represented by a frequency distribution of the symbols (called a "state"), and insertions and deletions are represented by other states. One moves through the model along a particular path from state to state in a Markov chain (i.e., random choice of next move), trying to match a given sequence. The next matching symbol is chosen from each state, recording its probability (frequency) and also the probability of going to that state from a previous one (the transition probability). State and transition probabilities are multiplied to obtain a probability of the given sequence. The hidden nature of the HMM is due to the lack of information about the value of a specific state, which is instead represented by a probability distribution over all possible values. This article discusses the advantages and disadvantages of HMMs in msa and presents algorithms for calculating an HMM and the conditions for producing the best HMM.
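The multiplication of state (emission) and transition probabilities along one path can be made concrete with a toy example. The two match states, the two-letter alphabet and all probabilities below are hypothetical; a real profile HMM would also carry insert and delete states, which this sketch omits.

```python
# Hypothetical emission distributions for two match states (alignment columns).
emissions = {
    "M1": {"A": 0.8, "C": 0.2},
    "M2": {"A": 0.1, "C": 0.9},
}

# Hypothetical transition probabilities; the remaining mass out of M1 would
# go to insert/delete states that are not modeled here.
transitions = {
    ("START", "M1"): 1.0,
    ("M1", "M2"): 0.9,
    ("M2", "END"): 1.0,
}

def path_probability(sequence, path):
    """Probability of emitting `sequence` along the given state `path`."""
    if len(sequence) != len(path):
        raise ValueError("sequence and path must have the same length")
    prob = 1.0
    previous = "START"
    for symbol, state in zip(sequence, path):
        prob *= transitions[(previous, state)]   # probability of moving to this state
        prob *= emissions[state][symbol]         # probability of the matched symbol
        previous = state
    return prob * transitions[(previous, "END")]

# 1.0 * 0.8 (A in M1) * 0.9 (M1 -> M2) * 0.9 (C in M2) * 1.0
p = path_probability("AC", ["M1", "M2"])
print(p)
```

In practice these products are accumulated as sums of logarithms to avoid numerical underflow on long sequences.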
Segment-based acoustic models for continuous speech recognition
NASA Astrophysics Data System (ADS)
Ostendorf, Mari; Rohlicek, J. R.
1993-07-01
This research aims to develop new and more accurate stochastic models for speaker-independent continuous speech recognition, by extending previous work in segment-based modeling and by introducing a new hierarchical approach to representing intra-utterance statistical dependencies. These techniques, which are more costly than traditional approaches because of the large search space associated with higher order models, are made feasible through rescoring a set of HMM-generated N-best sentence hypotheses. We expect these different modeling techniques to result in improved recognition performance over that achieved by current systems, which handle only frame-based observations and assume that these observations are independent given an underlying state sequence. In the fourth quarter of the project, we have completed the following: (1) ported our recognition system to the Wall Street Journal task, a standard task in the ARPA community; (2) developed an initial dependency-tree model of intra-utterance observation correlation; and (3) implemented baseline language model estimation software. Our initial results on the Wall Street Journal task are quite good and represent significantly improved performance over most HMM systems reporting on the Nov. 1992 5k vocabulary test set.
Comparing Four Age Model Techniques using Nine Sediment Cores from the Iberian Margin
NASA Astrophysics Data System (ADS)
Lisiecki, L. E.; Jones, A. M.; Lawrence, C.
2017-12-01
Interpretations of paleoclimate records from ocean sediment cores rely on age models, which provide estimates of age as a function of core depth. Here we compare four methods used to generate age models for sediment cores for the past 140 kyr. The first method is based on radiocarbon dating using the Bayesian statistical software, Bacon [Blaauw and Christen, 2011]. The second method aligns benthic δ18O to a target core using the probabilistic alignment algorithm, HMM-Match, which also generates age uncertainty estimates [Lin et al., 2014]. The third and fourth methods are planktonic δ18O and sea surface temperature (SST) alignments to the same target core, using the alignment algorithm Match [Lisiecki and Lisiecki, 2002]. Unlike HMM-Match, Match requires parameter tuning and does not produce uncertainty estimates. The results of these four age model techniques are compared for nine high-resolution cores from the Iberian margin. The root mean square error between the individual age model results and each core's average estimated age is 1.4 kyr. Additionally, HMM-Match and Bacon age estimates agree to within uncertainty and have similar 95% confidence widths of 1-2 kyr for the highest resolution records. In one core, the planktonic and SST alignments did not fall within the 95% confidence intervals from HMM-Match. For this core, the surface proxy alignments likely produce more reliable results due to millennial-scale SST variability and the presence of several gaps in the benthic δ18O data. Similar studies of other oceanographic regions are needed to determine the spatial extents over which these climate proxies may be stratigraphically correlated.
A robust omnifont open-vocabulary Arabic OCR system using pseudo-2D-HMM
NASA Astrophysics Data System (ADS)
Rashwan, Abdullah M.; Rashwan, Mohsen A.; Abdel-Hameed, Ahmed; Abdou, Sherif; Khalil, A. H.
2012-01-01
Recognizing old documents is highly desirable since the demand for quickly searching millions of archived documents has recently increased. Hidden Markov Models (HMMs) have proven to be a good solution to the main problems of recognizing typewritten Arabic characters. Although these attempts achieved remarkable success for omnifont OCR under very favorable conditions, they did not achieve the same performance under practical conditions, i.e. on noisy documents. In this paper we present an omnifont, large-vocabulary Arabic OCR system using a Pseudo Two-Dimensional Hidden Markov Model (P2DHMM), which is a generalization of the HMM. The P2DHMM offers a more efficient way to model Arabic characters; such a model offers both minimal dependency on font size and style (omnifont) and a high level of robustness against noise. The evaluation results of this system are very promising compared to a baseline HMM system and the best OCRs available on the market (Sakhr and NovoDynamics). The recognition accuracy of the P2DHMM classifier is measured against the classic HMM classifier: the average word accuracy rates for the P2DHMM and HMM classifiers are 79% and 66% respectively. The overall system accuracy is measured against the Sakhr and NovoDynamics OCR systems: the average word accuracy rates for P2DHMM, NovoDynamics, and Sakhr are 74%, 71%, and 61% respectively.
Human gait recognition by pyramid of HOG feature on silhouette images
NASA Astrophysics Data System (ADS)
Yang, Guang; Yin, Yafeng; Park, Jeanrok; Man, Hong
2013-03-01
Although an uncommon biometric modality, human gait recognition has the great advantage of identifying people at a distance without high-resolution images. It has attracted much attention in recent years, especially in the fields of computer vision and remote sensing. In this paper, we propose a human gait recognition framework that consists of a reliable background subtraction method followed by pyramid of Histogram of Gradient (pHOG) feature extraction on the silhouette image, and a Hidden Markov Model (HMM) based classifier. Through background subtraction, the silhouette of the human gait in each frame is extracted and normalized from the raw video sequence. After the shadow and noise in each region of interest (ROI) are removed, the pHOG feature is computed on the silhouette images. The pHOG features of each gait class are then used to train a corresponding HMM. In the test stage, the pHOG feature is extracted from each test sequence and used to calculate the posterior probability with respect to each trained HMM. Experimental results on the CASIA Gait Dataset B1 demonstrate that our proposed method achieves a very competitive recognition rate.
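The one-HMM-per-class decision rule described here can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: observations are assumed to be discrete symbols (for example, vector-quantized features), the class labels and all model parameters are hypothetical, and classes are given equal priors so that the highest likelihood also gives the highest posterior.

```python
import numpy as np

def log_likelihood(obs, start, trans, emit):
    """Scaled forward algorithm: log P(obs | model) for a discrete HMM."""
    alpha = start * emit[:, obs[0]]
    log_prob = 0.0
    for t in range(len(obs)):
        if t > 0:
            # Propagate through the transitions, then weight by the emission.
            alpha = (alpha @ trans) * emit[:, obs[t]]
        scale = alpha.sum()
        log_prob += np.log(scale)
        alpha = alpha / scale           # rescale to avoid underflow
    return log_prob

def classify(obs, models):
    """Return the label whose HMM gives the highest log-likelihood."""
    scores = {label: log_likelihood(obs, *params)
              for label, params in models.items()}
    return max(scores, key=scores.get), scores

# Hypothetical 2-state, 2-symbol models for two gait classes.
start = np.array([0.5, 0.5])
trans = np.array([[0.7, 0.3],
                  [0.3, 0.7]])
models = {
    "subject_a": (start, trans, np.array([[0.9, 0.1], [0.8, 0.2]])),
    "subject_b": (start, trans, np.array([[0.1, 0.9], [0.2, 0.8]])),
}

label, scores = classify([0, 0, 1, 0], models)
print(label, scores)
```

With continuous feature vectors the emission table would be replaced by a density such as a Gaussian mixture, but the decision rule stays the same.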
Using hidden Markov models and observed evolution to annotate viral genomes.
McCauley, Stephen; Hein, Jotun
2006-06-01
ssRNA (single stranded) viral genomes are generally constrained in length and utilize overlapping reading frames to maximally exploit the coding potential within the genome length restrictions. This overlapping coding phenomenon leads to complex evolutionary constraints operating on the genome. In regions which code for more than one protein, silent mutations in one reading frame generally have a protein coding effect in another. To maximize coding flexibility in all reading frames, overlapping regions are often compositionally biased towards amino acids which are 6-fold degenerate with respect to the 64 codon alphabet. Previous methodologies have used this fact in an ad hoc manner to look for overlapping genes by motif matching. In this paper differentiated nucleotide compositional patterns in overlapping regions are incorporated into a probabilistic hidden Markov model (HMM) framework which is used to annotate ssRNA viral genomes. This work focuses on single sequence annotation and applies an HMM framework to ssRNA viral annotation. A description is given of how the HMM is parameterized whilst annotating within a missing data framework. A Phylogenetic HMM (Phylo-HMM) extension, as applied to 14 aligned HIV2 sequences, is also presented. This evolutionary extension serves as an illustration of the potential of the Phylo-HMM framework for ssRNA viral genomic annotation. The single sequence annotation procedure (SSA) is applied to 14 different strains of the HIV2 virus. Further results on alternative ssRNA viral genomes are presented to illustrate more generally the performance of the method. The results of the SSA method are encouraging; however, there is still room for improvement. Since there is overwhelming evidence to indicate that comparative methods can improve coding sequence (CDS) annotation, the SSA method is extended to a Phylo-HMM to incorporate evolutionary information.
The Phylo-HMM extension is applied to the same set of 14 HIV2 sequences which are pre-aligned. The performance improvement that results from including the evolutionary information in the analysis is illustrated.
Computer-Vision-Assisted Palm Rehabilitation With Supervised Learning.
Vamsikrishna, K M; Dogra, Debi Prosad; Desarkar, Maunendra Sankar
2016-05-01
Physical rehabilitation supported by a computer-assisted interface is gaining popularity among the health-care fraternity. In this paper, we propose a computer-vision-assisted contactless methodology to facilitate palm and finger rehabilitation. A Leap Motion controller has been interfaced with a computing device to record parameters describing 3-D movements of the palm of a user undergoing rehabilitation. We have proposed an interface using the Unity3D development platform. Our interface is capable of analyzing intermediate steps of rehabilitation without the help of an expert, and it can provide online feedback to the user. Isolated gestures are classified using linear discriminant analysis (DA) and support vector machines (SVM). Finally, a set of discrete hidden Markov models (HMM) has been used to classify the gesture sequences performed during rehabilitation. Experimental validation using a large number of samples collected from healthy volunteers reveals that DA and SVM perform similarly when applied to isolated gesture recognition. We have compared the results of HMM-based sequence classification with CRF-based techniques. Our results confirm that both HMM and CRF perform quite similarly when tested on gesture sequences. The proposed system can be used for home-based palm or finger rehabilitation in the absence of experts.
Hidden Markov models for evolution and comparative genomics analysis.
Bykova, Nadezda A; Favorov, Alexander V; Mironov, Andrey A
2013-01-01
The problem of reconstruction of ancestral states given a phylogeny and data from extant species arises in a wide range of biological studies. The continuous-time Markov model for the evolution of discrete states is generally used for the reconstruction of ancestral states. We modify this model to account for the case when the states of the extant species are uncertain. This situation arises, for example, if the states for extant species are predicted by some program and thus are known only with some level of reliability; this is common in bioinformatics. The main idea is to formulate the problem as a hidden Markov model on a tree (tree HMM, tHMM), where the basic continuous-time Markov model is expanded with the introduction of emission probabilities of observed data (e.g. prediction scores) for each underlying discrete state. Our tHMM decoding algorithm allows us to predict states at the ancestral nodes as well as to refine states at the leaves on the basis of quantitative comparative genomics. The test on simulated data shows that the tHMM approach applied to the continuous variable reflecting the probabilities of the states (i.e. prediction score) appears to be more accurate than the reconstruction from the discrete state assignment defined by the best score threshold. We provide examples of applying our model to the evolutionary analysis of N-terminal signal peptides and transcription factor binding sites in bacteria. The program is freely available at http://bioinf.fbb.msu.ru/~nadya/tHMM and via web-service at http://bioinf.fbb.msu.ru/treehmmweb.
Human Mars Mission: Weights and Mass Properties. Pt. 1
NASA Technical Reports Server (NTRS)
Brothers, Bobby
1999-01-01
This paper presents a final report on The Human Mars Mission Weights and Measures. The topics included in this report are: 1) Trans-Earth Injection Storage Human Mars Mission (HMM) Chemical Design Reference Mission (DRM) v4.0a Weight Breakout; 2) Ascent Stage HMM Chemical DRM v4.0a Weight Breakout; 3) Descent Stages HMM Chemical DRM v4.0a Weight Breakout; 4) Trans-Mars Injection Stages HMM Chemical DRM v4.0a Weight Breakout; 5) Trans-Earth Injection Stage HMM Solar Electric Propulsion (SEP) DRM v4.0a Weight Breakout; 6) Ascent Stage HMM SEP DRM v4.0a Weight Breakout; 7) Descent Stages HMM SEP DRM v4.0a Weight Breakout; 8) Trans-Mars Injection Stages HMM SEP DRM v4.0a Weight Breakout; 9) Crew Taxi Stage HMM SEP DRM v4.0 Weight Breakout; 10) Trans-Earth Injection Stage HMM Nuclear DRM v4.0a Weight Breakout; 11) Ascent Stage HMM Nuclear DRM v4.0a Weight Breakout; 12) Descent Stages HMM Nuclear DRM v4.0a Weight Breakout; 13) Trans-Mars Injection Stages HMM Nuclear DRM v4.0a Weight Breakout; and 14) HMM Mass Properties Coordinate System.
NASA Astrophysics Data System (ADS)
Carn, S. A.; Sutton, A. J.; Elias, T.; Patrick, M. R.; Owen, R. C.; Wu, S.
2009-12-01
Satellite remote sensing is providing unique constraints on sulfur dioxide (SO2) emissions associated with the ongoing eruption of Halema‘uma‘u (HMM), and daily observations of volcanic plume dispersion. We use synoptic SO2 measurements by the Ozone Monitoring Instrument (OMI) on NASA’s Aura satellite to chart the fluctuation in SO2 emissions and plume dispersion. Prior to the onset of degassing from HMM, OMI detected SO2 emissions from the east rift Pu‘u ‘O‘o vent; the average daily SO2 burden measured between Sept 6, 2004 and Feb 29, 2008 was 0.7 kilotons (kt) ±1 (1σ). The additional SO2 production from HMM caused total SO2 burdens in the composite Kilauea plume to increase notably in March-April 2008, and a daily average SO2 burden of ~4 kt ±4 (1σ) was measured by OMI between Mar 1, 2008 and Jul 31, 2009 (all burdens are preliminary and assume a SO2 plume altitude of 3 km). A total of ~2 Megatons of SO2 was measured by OMI in the Kilauea emissions between March 2008 and July 2009. The increased SO2 emissions provide an excellent opportunity to compare ground-based ultraviolet (UV) spectrometer and space-based UV OMI measurements of SO2 output, and test algorithms for derivation of emission rates from satellite data. Kilauea data analyzed to date show that trends in ground-based SO2 emission rates and OMI SO2 burdens are in qualitative agreement but differ in magnitude. Plume altitude is a critical factor in satellite SO2 retrievals, and interpretation of the Kilauea observations is complicated by the presence of two SO2 plumes (from HMM and Pu‘u ‘O‘o) within the OMI field-of-view. In order to constrain plume heights and SO2 lifetimes, we use plume simulations generated by the FLEXPART particle dispersion model and compare the model output with OMI SO2 observations. We validate the model-generated plume altitudes using vertical aerosol profiles derived from the CALIPSO space-borne lidar instrument. 
Gaussian plume models parameterized using visual observations of the HMM plume injection height further constrain near-source plume dispersion and downwind evolution. Refinement of SO2 altitude provides improved constraints on SO2 burdens in observed plumes. A more rigorous approach to deriving source emission strengths from satellite observations is an inverse modeling scheme incorporating measurements and models. Using Kilauea as a case study, we plan to develop such a scheme using OMI data, FLEXPART simulations and atmospheric chemistry and transport modeling using the GEOS-Chem model. Modeling of plume dispersion and chemistry will also provide estimates of SO2 and acid aerosol concentrations for potential use in air quality and health hazard assessments in Hawaii.
Passive Acoustic Leak Detection for Sodium Cooled Fast Reactors Using Hidden Markov Models
NASA Astrophysics Data System (ADS)
Marklund, A. Riber; Kishore, S.; Prakash, V.; Rajan, K. K.; Michel, F.
2016-06-01
Acoustic leak detection for steam generators of sodium fast reactors has been an active research topic since the early 1970s, and several methods have been tested over the years. Inspired by its success in the field of automatic speech recognition, we here apply hidden Markov models (HMM) in combination with Gaussian mixture models (GMM) to the problem. To achieve this, we propose a new feature calculation scheme, based on the temporal evolution of the power spectral density (PSD) of the signal. The proposed method is tested using acoustic signals recorded during steam/water injection experiments done at the Indira Gandhi Centre for Atomic Research (IGCAR). We perform parametric studies on the HMM+GMM model size and demonstrate that the proposed method a) performs well without a priori knowledge of injection noise, b) can incorporate several noise models and c) has an output distribution that simplifies false alarm rate control.
A Hybrid of Deep Network and Hidden Markov Model for MCI Identification with Resting-State fMRI.
Suk, Heung-Il; Lee, Seong-Whan; Shen, Dinggang
2015-10-01
In this paper, we propose a novel method for modelling functional dynamics in resting-state fMRI (rs-fMRI) for Mild Cognitive Impairment (MCI) identification. Specifically, we devise a hybrid architecture by combining Deep Auto-Encoder (DAE) and Hidden Markov Model (HMM). The roles of DAE and HMM are, respectively, to discover hierarchical non-linear relations among features, by which we transform the original features into a lower dimension space, and to model dynamic characteristics inherent in rs-fMRI, i.e., internal state changes. By building a generative model with HMMs for each class individually, we estimate the data likelihood of a test subject as MCI or normal healthy control, based on which we identify the clinical label. In our experiments, we achieved the maximal accuracy of 81.08% with the proposed method, outperforming state-of-the-art methods in the literature.
A Hybrid of Deep Network and Hidden Markov Model for MCI Identification with Resting-State fMRI
Suk, Heung-Il; Lee, Seong-Whan; Shen, Dinggang
2015-01-01
In this paper, we propose a novel method for modelling functional dynamics in resting-state fMRI (rs-fMRI) for Mild Cognitive Impairment (MCI) identification. Specifically, we devise a hybrid architecture by combining Deep Auto-Encoder (DAE) and Hidden Markov Model (HMM). The roles of DAE and HMM are, respectively, to discover hierarchical non-linear relations among features, by which we transform the original features into a lower dimension space, and to model dynamic characteristics inherent in rs-fMRI, i.e., internal state changes. By building a generative model with HMMs for each class individually, we estimate the data likelihood of a test subject as MCI or normal healthy control, based on which we identify the clinical label. In our experiments, we achieved the maximal accuracy of 81.08% with the proposed method, outperforming state-of-the-art methods in the literature. PMID:27054199
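The decision rule described above, one generative HMM per class and a label chosen by comparing data likelihoods, can be sketched as follows. This is a minimal illustration under strong assumptions: observations are discrete symbols rather than the DAE-embedded continuous features, and the two toy models and their parameters are hypothetical rather than trained on rs-fMRI data.

```python
import math

def forward_log_likelihood(obs, start, trans, emit):
    """Scaled forward algorithm: log P(obs | HMM) for discrete symbols."""
    n_states = len(start)
    alpha = [start[s] * emit[s][obs[0]] for s in range(n_states)]
    log_like = 0.0
    for t in range(len(obs)):
        if t > 0:
            # Propagate one step through the transition matrix, then emit.
            alpha = [
                sum(alpha[r] * trans[r][s] for r in range(n_states)) * emit[s][obs[t]]
                for s in range(n_states)
            ]
        scale = sum(alpha)
        if scale == 0.0:
            return float("-inf")  # sequence is impossible under this model
        log_like += math.log(scale)
        alpha = [a / scale for a in alpha]  # rescale to avoid underflow
    return log_like

def classify(obs, class_models):
    """Pick the class whose HMM assigns the highest likelihood to `obs`."""
    scores = {
        label: forward_log_likelihood(obs, *model)
        for label, model in class_models.items()
    }
    return max(scores, key=scores.get), scores

# Hypothetical toy models as (start, transition, emission) triples:
# one class with persistent hidden states, one with rapidly switching states.
EMIT = [[0.9, 0.1], [0.1, 0.9]]
CLASS_MODELS = {
    "MCI": ([0.5, 0.5], [[0.9, 0.1], [0.1, 0.9]], EMIT),
    "control": ([0.5, 0.5], [[0.1, 0.9], [0.9, 0.1]], EMIT),
}

label_persistent, _ = classify([0, 0, 0, 0, 1, 1, 1, 1], CLASS_MODELS)
label_switching, _ = classify([0, 1, 0, 1, 0, 1, 0, 1], CLASS_MODELS)
```

The two toy models share emission probabilities and differ only in their transition matrices, so the example isolates the point made in the abstract: the class label is driven by the dynamics of the internal states, not by the static distribution of the observations.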
State-space model with deep learning for functional dynamics estimation in resting-state fMRI.
Suk, Heung-Il; Wee, Chong-Yaw; Lee, Seong-Whan; Shen, Dinggang
2016-04-01
Studies on resting-state functional Magnetic Resonance Imaging (rs-fMRI) have shown that different brain regions still actively interact with each other while a subject is at rest, and such functional interaction is not stationary but changes over time. In terms of a large-scale brain network, in this paper, we focus on time-varying patterns of functional networks, i.e., functional dynamics, inherent in rs-fMRI, which is one of the emerging issues along with the network modelling. Specifically, we propose a novel methodological architecture that combines deep learning and state-space modelling, and apply it to rs-fMRI based Mild Cognitive Impairment (MCI) diagnosis. We first devise a Deep Auto-Encoder (DAE) to discover hierarchical non-linear functional relations among regions, by which we transform the regional features into an embedding space, whose bases are complex functional networks. Given the embedded functional features, we then use a Hidden Markov Model (HMM) to estimate dynamic characteristics of functional networks inherent in rs-fMRI via internal states, which are unobservable but can be inferred from observations statistically. By building a generative model with an HMM, we estimate the likelihood of the input features of rs-fMRI as belonging to the corresponding status, i.e., MCI or normal healthy control, based on which we identify the clinical label of a testing subject. In order to validate the effectiveness of the proposed method, we performed experiments on two different datasets and compared with state-of-the-art methods in the literature. We also analyzed the functional networks learned by DAE, estimated the functional connectivities by decoding hidden states in HMM, and investigated the estimated functional connectivities by means of a graph-theoretic approach. Copyright © 2016 Elsevier Inc. All rights reserved.
State-space model with deep learning for functional dynamics estimation in resting-state fMRI
Suk, Heung-Il; Wee, Chong-Yaw; Lee, Seong-Whan; Shen, Dinggang
2017-01-01
Studies on resting-state functional Magnetic Resonance Imaging (rs-fMRI) have shown that different brain regions still actively interact with each other while a subject is at rest, and such functional interaction is not stationary but changes over time. In terms of a large-scale brain network, in this paper, we focus on time-varying patterns of functional networks, i.e., functional dynamics, inherent in rs-fMRI, which is one of the emerging issues along with the network modelling. Specifically, we propose a novel methodological architecture that combines deep learning and state-space modelling, and apply it to rs-fMRI based Mild Cognitive Impairment (MCI) diagnosis. We first devise a Deep Auto-Encoder (DAE) to discover hierarchical non-linear functional relations among regions, by which we transform the regional features into an embedding space, whose bases are complex functional networks. Given the embedded functional features, we then use a Hidden Markov Model (HMM) to estimate dynamic characteristics of functional networks inherent in rs-fMRI via internal states, which are unobservable but can be inferred from observations statistically. By building a generative model with an HMM, we estimate the likelihood of the input features of rs-fMRI as belonging to the corresponding status, i.e., MCI or normal healthy control, based on which we identify the clinical label of a testing subject. In order to validate the effectiveness of the proposed method, we performed experiments on two different datasets and compared with state-of-the-art methods in the literature. We also analyzed the functional networks learned by DAE, estimated the functional connectivities by decoding hidden states in HMM, and investigated the estimated functional connectivities by means of a graph-theoretic approach. PMID:26774612
HMM based automated wheelchair navigation using EOG traces in EEG
NASA Astrophysics Data System (ADS)
Aziz, Fayeem; Arof, Hamzah; Mokhtar, Norrima; Mubin, Marizan
2014-10-01
This paper presents a wheelchair navigation system based on a hidden Markov model (HMM), which we developed to assist those with restricted mobility. The semi-autonomous system is equipped with obstacle/collision avoidance sensors and it takes the electrooculography (EOG) signal traces from the user as commands to maneuver the wheelchair. The EOG traces originate from eyeball and eyelid movements and they are embedded in EEG signals collected from the scalp of the user at three different locations. Features extracted from the EOG traces are used to determine whether the eyes are open or closed, and whether the eyes are gazing to the right, center, or left. These features are utilized as inputs to a few support vector machine (SVM) classifiers, whose outputs are regarded as observations to an HMM. The HMM determines the state of the system and generates commands for navigating the wheelchair accordingly. The use of simple features and the implementation of a sliding window that captures important signatures in the EOG traces result in a fast execution time and high classification rates. The wheelchair is equipped with a proximity sensor and it can move forward and backward in three directions. The asynchronous system achieved an average classification rate of 98% when tested with online data while its average execution time was less than 1 s. It was also tested in a navigation experiment where all of the participants managed to complete the tasks successfully without collisions.
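Treating classifier outputs as observations of an HMM means the command is inferred from the whole recent sequence rather than from a single, possibly wrong, classification. The sketch below illustrates this with Viterbi decoding. The state names, observation labels and probabilities are hypothetical, and the source does not say which decoding procedure the system uses, so this is an illustration of the general technique only.

```python
import math

def viterbi(obs, states, start, trans, emit):
    """Most likely hidden-state path for a discrete-observation HMM (log domain)."""
    def log(p):
        return math.log(p) if p > 0.0 else float("-inf")

    score = {s: log(start[s]) + log(emit[s][obs[0]]) for s in states}
    back = []
    for symbol in obs[1:]:
        prev, score, pointers = score, {}, {}
        for s in states:
            # Best predecessor of state s at this time step.
            best = max(states, key=lambda r: prev[r] + log(trans[r][s]))
            score[s] = prev[best] + log(trans[best][s]) + log(emit[s][symbol])
            pointers[s] = best
        back.append(pointers)
    # Trace back from the best final state.
    state = max(states, key=lambda s: score[s])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    return path[::-1]

# Hypothetical two-command model driven by classifier labels.
STATES = ("idle", "forward")
START = {"idle": 0.5, "forward": 0.5}
TRANS = {
    "idle": {"idle": 0.9, "forward": 0.1},
    "forward": {"idle": 0.1, "forward": 0.9},
}
EMIT = {
    "idle": {"closed": 0.8, "center": 0.2},
    "forward": {"closed": 0.2, "center": 0.8},
}

# One spurious "closed" label in the middle of a forward command.
noisy = ["center", "center", "closed", "center", "center"]
decoded = viterbi(noisy, STATES, START, TRANS, EMIT)
```

With these parameters the isolated "closed" label is absorbed by the persistent transition structure, so the decoded command does not flip for a single misclassified window.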
Spotting handwritten words and REGEX using a two stage BLSTM-HMM architecture
NASA Astrophysics Data System (ADS)
Bideault, Gautier; Mioulet, Luc; Chatelain, Clément; Paquet, Thierry
2015-01-01
In this article, we propose a hybrid model for spotting words and regular expressions (REGEX) in handwritten documents. The model is made of the state-of-the-art BLSTM (Bidirectional Long Short-Term Memory) neural network for recognizing and segmenting characters, coupled with an HMM to build line models able to spot the desired sequences. Experiments on the Rimes database show very promising results.
Multi-scale chromatin state annotation using a hierarchical hidden Markov model
NASA Astrophysics Data System (ADS)
Marco, Eugenio; Meuleman, Wouter; Huang, Jialiang; Glass, Kimberly; Pinello, Luca; Wang, Jianrong; Kellis, Manolis; Yuan, Guo-Cheng
2017-04-01
Chromatin-state analysis is widely applied in the studies of development and diseases. However, existing methods operate at a single length scale, and therefore cannot distinguish large domains from isolated elements of the same type. To overcome this limitation, we present a hierarchical hidden Markov model, diHMM, to systematically annotate chromatin states at multiple length scales. We apply diHMM to analyse a public ChIP-seq data set. diHMM not only accurately captures nucleosome-level information, but identifies domain-level states that vary in nucleosome-level state composition, spatial distribution and functionality. The domain-level states recapitulate known patterns such as super-enhancers, bivalent promoters and Polycomb repressed regions, and identify additional patterns whose biological functions are not yet characterized. By integrating chromatin-state information with gene expression and Hi-C data, we identify context-dependent functions of nucleosome-level states. Thus, diHMM provides a powerful tool for investigating the role of higher-order chromatin structure in gene regulation.
Integrating hidden Markov model and PRAAT: a toolbox for robust automatic speech transcription
NASA Astrophysics Data System (ADS)
Kabir, A.; Barker, J.; Giurgiu, M.
2010-09-01
An automatic time-aligned phone transcription toolbox for English speech corpora has been developed. The toolbox is especially useful for generating robust automatic transcriptions, and it can produce phone-level transcriptions using speaker-independent as well as speaker-dependent models without manual intervention. The system is based on the standard Hidden Markov Model (HMM) approach and was successfully tested on a large audiovisual speech corpus, the GRID corpus. One of the most powerful features of the toolbox is its increased flexibility in speech processing: the speech community can import the automatic transcription generated by the HMM Toolkit (HTK) into a popular transcription software, PRAAT, and vice versa. The toolbox has been evaluated through statistical analysis on GRID data, which shows that the automatic transcription deviates by an average of 20 ms from the manual transcription.
Understanding eye movements in face recognition using hidden Markov models.
Chuk, Tim; Chan, Antoni B; Hsiao, Janet H
2014-09-16
We use a hidden Markov model (HMM) based approach to analyze eye movement data in face recognition. HMMs are statistical models that are specialized in handling time-series data. We conducted a face recognition task with Asian participants, and modeled each participant's eye movement pattern with an HMM, which summarized the participant's scan paths in face recognition with both regions of interest and the transition probabilities among them. By clustering these HMMs, we showed that participants' eye movements could be categorized into holistic or analytic patterns, demonstrating significant individual differences even within the same culture. Participants with the analytic pattern had longer response times, but did not differ significantly in recognition accuracy from those with the holistic pattern. We also found that correct and wrong recognitions were associated with distinctive eye movement patterns; the difference between the two patterns lies in the transitions rather than locations of the fixations alone. © 2014 ARVO.
Prestat, Emmanuel; David, Maude M.; Hultman, Jenni; ...
2014-09-26
A new functional gene database, FOAM (Functional Ontology Assignments for Metagenomes), was developed to screen environmental metagenomic sequence datasets. FOAM provides a new functional ontology dedicated to classify gene functions relevant to environmental microorganisms based on Hidden Markov Models (HMMs). Sets of aligned protein sequences (i.e. ‘profiles’) were tailored to a large group of target KEGG Orthologs (KOs) from which HMMs were trained. The alignments were checked and curated to make them specific to the targeted KO. Within this process, sequence profiles were enriched with the most abundant sequences available to maximize the yield of accurate classifier models. An associated functional ontology was built to describe the functional groups and hierarchy. FOAM allows the user to select the target search space before HMM-based comparison steps and to easily organize the results into different functional categories and subcategories. FOAM is publicly available at http://portal.nersc.gov/project/m1317/FOAM/.
Chae, Minho; Danko, Charles G; Kraus, W Lee
2015-07-16
Global run-on coupled with deep sequencing (GRO-seq) provides extensive information on the location and function of coding and non-coding transcripts, including primary microRNAs (miRNAs), long non-coding RNAs (lncRNAs), and enhancer RNAs (eRNAs), as well as yet undiscovered classes of transcripts. However, few computational tools tailored toward this new type of sequencing data are available, limiting the applicability of GRO-seq data for identifying novel transcription units. Here, we present groHMM, a computational tool in R, which defines the boundaries of transcription units de novo using a two state hidden-Markov model (HMM). A systematic comparison of the performance between groHMM and two existing peak-calling methods tuned to identify broad regions (SICER and HOMER) favorably supports our approach on existing GRO-seq data from MCF-7 breast cancer cells. To demonstrate the broader utility of our approach, we have used groHMM to annotate a diverse array of transcription units (i.e., primary transcripts) from four GRO-seq data sets derived from cells representing a variety of different human tissue types, including non-transformed cells (cardiomyocytes and lung fibroblasts) and transformed cells (LNCaP and MCF-7 cancer cells), as well as non-mammalian cells (from flies and worms). As an example of the utility of groHMM and its application to questions about the transcriptome, we show how groHMM can be used to analyze cell type-specific enhancers as defined by newly annotated enhancer transcripts. Our results show that groHMM can reveal new insights into cell type-specific transcription by identifying novel transcription units, and serve as a complete and useful tool for evaluating functional genomic elements in cells.
Maina, Ndegwa Henry; Pitkänen, Leena; Heikkinen, Sami; Tuomainen, Päivi; Virkki, Liisa; Tenkanen, Maija
2014-01-01
Dilute solutions of various dextran standards, a high-molar mass (HMM) commercial dextran from Leuconostoc spp., and HMM dextrans isolated from Weissella confusa and Leuconostoc citreum were analyzed with high-performance size-exclusion chromatography (HPSEC), asymmetric flow field-flow fractionation (AsFlFFF), and diffusion-ordered NMR spectroscopy (DOSY). HPSEC analyses were performed in aqueous and dimethyl sulfoxide (DMSO) solutions, while only aqueous solutions were utilized in AsFlFFF and DOSY. The study showed that all methods were applicable to dextran analysis, but differences between the aqueous and DMSO-based solutions were obtained for HMM samples. These differences were attributed to the presence of aggregates in aqueous solution that were less prevalent in DMSO. The study showed that DOSY provides an estimate of the size of HMM dextrans, though calibration standards may be required for each experimental set-up. To our knowledge, this is the first study utilizing these three methods in analyzing HMM dextrans. Copyright © 2013 Elsevier Ltd. All rights reserved.
Hidden Markov Model-Based CNV Detection Algorithms for Illumina Genotyping Microarrays.
Seiser, Eric L; Innocenti, Federico
2014-01-01
Somatic alterations in DNA copy number have been well studied in numerous malignancies, yet the role of germline DNA copy number variation in cancer is still emerging. Genotyping microarrays generate allele-specific signal intensities to determine genotype, but may also be used to infer DNA copy number using additional computational approaches. Numerous tools have been developed to analyze Illumina genotype microarray data for copy number variant (CNV) discovery, although commonly utilized algorithms freely available to the public employ approaches based upon the use of hidden Markov models (HMMs). QuantiSNP, PennCNV, and GenoCN utilize HMMs with six copy number states but vary in how transition and emission probabilities are calculated. Performance of these CNV detection algorithms has been shown to be variable between both genotyping platforms and data sets, although HMM approaches generally outperform other current methods. Low sensitivity is prevalent with HMM-based algorithms, suggesting the need for continued improvement in CNV detection methodologies.
NASA Astrophysics Data System (ADS)
Mukhopadhyay, Sabyasachi; Kurmi, Indrajit; Pratiher, Sawon; Mukherjee, Sukanya; Barman, Ritwik; Ghosh, Nirmalya; Panigrahi, Prasanta K.
2018-02-01
In this paper, a comparative study between SVM and HMM has been carried out for multiclass classification of healthy and cancerous cervical tissues. In our study, the HMM methodology shows more promise for achieving higher classification accuracy.
Detecting seismic waves using a binary hidden Markov model classifier
NASA Astrophysics Data System (ADS)
Ray, J.; Lefantzi, S.; Brogan, R. A.; Forrest, R.; Hansen, C. W.; Young, C. J.
2016-12-01
We explore the use of Hidden Markov Models (HMM) to detect the arrival of seismic waves using data captured by a seismogram. HMMs define the state of a station as a binary variable based on whether the station is receiving a signal or not. HMMs are simple and fast, allowing them to monitor multiple datastreams arising from a large distributed network of seismographs. In this study we examine the efficacy of HMM-based detectors with respect to their false positive and negative rates as well as the accuracy of the signal onset time as compared to the value determined by an expert analyst. The study uses three-component International Monitoring System (IMS) data from a carefully analyzed two-week period in May 2010, for which our analyst tried to identify every signal. Part of this interval is used for training the HMM to recognize the transition between states from noise to signal, while the other part is used for evaluating the effectiveness of our new detection algorithm. We compare our results with the STA/LTA detection processing applied by the IDC to assess potential for operational use. Sandia National Laboratories is a multi-program laboratory managed and operated by Sandia Corporation, a wholly owned subsidiary of Lockheed Martin Corporation, for the U.S. Department of Energy's National Nuclear Security Administration under contract DE-AC04-94AL85000.
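For readers unfamiliar with the STA/LTA baseline mentioned above, the following sketch shows the general idea: the ratio of a short-term average to a long-term average of signal energy rises sharply when a signal arrives in background noise. It is a simplified illustration, not the IDC processing. The window lengths, threshold and synthetic trace are hypothetical, and the long window here includes the short one, which operational detectors may handle differently.

```python
def sta_lta(samples, short_len, long_len):
    """Ratio of short-term to long-term average energy at each sample.

    Assumes short_len <= long_len. Entries are None until a full long
    window is available.
    """
    energy = [x * x for x in samples]
    prefix = [0.0]  # prefix sums make each window average O(1)
    for e in energy:
        prefix.append(prefix[-1] + e)
    ratios = [None] * len(samples)
    for i in range(long_len - 1, len(samples)):
        sta = (prefix[i + 1] - prefix[i + 1 - short_len]) / short_len
        lta = (prefix[i + 1] - prefix[i + 1 - long_len]) / long_len
        ratios[i] = sta / lta if lta > 0.0 else 0.0
    return ratios

def first_trigger(ratios, threshold):
    """Index of the first sample whose ratio reaches the threshold, else None."""
    for i, r in enumerate(ratios):
        if r is not None and r >= threshold:
            return i
    return None

# Synthetic trace: 50 samples of low-amplitude noise, then a stronger signal.
trace = [0.1 * (-1) ** i for i in range(50)] + [1.0 * (-1) ** i for i in range(10)]
ratios = sta_lta(trace, short_len=3, long_len=20)
onset = first_trigger(ratios, threshold=3.0)
```

A detector of this kind reacts to energy changes only, whereas the binary HMM described in the abstract is trained on analyst-labelled data to recognize the transition from the noise state to the signal state.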
Chao, Michael C.; Pritchard, Justin R.; Zhang, Yanjia J.; Rubin, Eric J.; Livny, Jonathan; Davis, Brigid M.; Waldor, Matthew K.
2013-01-01
The coupling of high-density transposon mutagenesis to high-throughput DNA sequencing (transposon-insertion sequencing) enables simultaneous and genome-wide assessment of the contributions of individual loci to bacterial growth and survival. We have refined analysis of transposon-insertion sequencing data by normalizing for the effect of DNA replication on sequencing output and using a hidden Markov model (HMM)-based filter to exploit heretofore unappreciated information inherent in all transposon-insertion sequencing data sets. The HMM can smooth variations in read abundance and thereby reduce the effects of read noise, as well as permit fine scale mapping that is independent of genomic annotation and enable classification of loci into several functional categories (e.g. essential, domain essential or ‘sick’). We generated a high-resolution map of genomic loci (encompassing both intra- and intergenic sequences) that are required or beneficial for in vitro growth of the cholera pathogen, Vibrio cholerae. This work uncovered new metabolic and physiologic requirements for V. cholerae survival, and by combining transposon-insertion sequencing and transcriptomic data sets, we also identified several novel noncoding RNA species that contribute to V. cholerae growth. Our findings suggest that HMM-based approaches will enhance extraction of biological meaning from transposon-insertion sequencing genomic data. PMID:23901011
Hidden Markov models in automatic speech recognition
NASA Astrophysics Data System (ADS)
Wrzoskowicz, Adam
1993-11-01
This article describes a method for constructing an automatic speech recognition system based on hidden Markov models (HMMs). The author discusses the basic concepts of HMM theory and the application of these models to the analysis and recognition of speech signals. The author provides algorithms which make it possible to train the ASR system and recognize signals on the basis of distinct stochastic models of selected speech sound classes. The author describes the specific components of the system and the procedures used to model and recognize speech. The author discusses problems associated with the choice of optimal signal detection and parameterization characteristics and their effect on the performance of the system. The author presents different options for the choice of speech signal segments and their consequences for the ASR process. The author gives special attention to the use of lexical, syntactic, and semantic information for the purpose of improving the quality and efficiency of the system. The author also describes an ASR system developed by the Speech Acoustics Laboratory of the IBPT PAS. The author discusses the results of experiments on the effect of noise on the performance of the ASR system and describes methods of constructing HMMs designed to operate in a noisy environment. The author also describes a language for human-robot communications which was defined as a complex multilevel network from an HMM model of speech sounds geared towards Polish inflections. The author also added mandatory lexical and syntactic rules to the system for its communications vocabulary.
Popularity Modeling for Mobile Apps: A Sequential Approach.
Zhu, Hengshu; Liu, Chuanren; Ge, Yong; Xiong, Hui; Chen, Enhong
2015-07-01
The popularity information in App stores, such as chart rankings, user ratings, and user reviews, provides an unprecedented opportunity to understand user experiences with mobile Apps, learn the process of adoption of mobile Apps, and thus enables better mobile App services. While the importance of popularity information is well recognized in the literature, the use of the popularity information for mobile App services is still fragmented and under-explored. To this end, in this paper, we propose a sequential approach based on hidden Markov model (HMM) for modeling the popularity information of mobile Apps toward mobile App services. Specifically, we first propose a popularity based HMM (PHMM) to model the sequences of the heterogeneous popularity observations of mobile Apps. Then, we introduce a bipartite based method to precluster the popularity observations. This can help to learn the parameters and initial values of the PHMM efficiently. Furthermore, we demonstrate that the PHMM is a general model and can be applicable for various mobile App services, such as trend based App recommendation, rating and review spam detection, and ranking fraud detection. Finally, we validate our approach on two real-world data sets collected from the Apple Appstore. Experimental results clearly validate both the effectiveness and efficiency of the proposed popularity modeling approach.
Du, Tianchuan; Liao, Li; Wu, Cathy H
2016-12-01
Identifying the residues in a protein that are involved in protein-protein interaction and identifying the contact matrix for a pair of interacting proteins are two computational tasks at different levels of an in-depth analysis of protein-protein interaction. Various methods for solving these two problems have been reported in the literature. However, the interacting residue prediction and contact matrix prediction were handled by and large independently in those existing methods, though intuitively good prediction of interacting residues will help with predicting the contact matrix. In this work, we developed a novel protein interacting residue prediction system, contact matrix-interaction profile hidden Markov model (CM-ipHMM), with the integration of contact matrix prediction and the ipHMM interaction residue prediction. We propose to leverage what is learned from the contact matrix prediction and utilize the predicted contact matrix as "feedback" to enhance the interaction residue prediction. The CM-ipHMM model showed significant improvement over the previous method that uses the ipHMM for predicting interaction residues only. It indicates that the downstream contact matrix prediction could help the interaction site prediction.
Hopkins, Heidi; Talisuna, Ambrose; Whitty, Christopher Jm; Staedke, Sarah G
2007-10-08
Home-based management of malaria (HMM) is promoted as a major strategy to improve prompt delivery of effective malaria treatment in Africa. HMM involves presumptively treating febrile children with pre-packaged antimalarial drugs distributed by members of the community. HMM has been implemented in several African countries, and artemisinin-based combination therapies (ACTs) will likely be introduced into these programmes on a wide scale. The published literature was searched for studies that evaluated the health impact of community- and home-based treatment for malaria in Africa. Criteria for inclusion were: 1) the intervention consisted of antimalarial treatment administered presumptively for febrile illness; 2) the treatment was administered by local community members who had no formal education in health care; 3) measured outcomes included specific health indicators such as malaria morbidity (incidence, severity, parasite rates) and/or mortality; and 4) the study was conducted in Africa. Of 1,069 potentially relevant publications identified, only six studies, carried out over 18 years, were identified as meeting inclusion criteria. Heterogeneity of the evaluations, including variability in study design, precluded meta-analysis. All trials evaluated presumptive treatment with chloroquine and were conducted in rural areas, and most were done in settings with seasonal malaria transmission. Conclusions regarding the impact of HMM on morbidity and mortality endpoints were mixed. Two studies showed no health impact, while another showed a decrease in malaria prevalence and incidence, but no impact on mortality. One study in Burkina Faso suggested that HMM decreased the proportion of severe malaria cases, while another study from the same country showed a decrease in the risk of progression to severe malaria. 
Of the four studies with mortality endpoints only one from Ethiopia showed a positive impact, with a reduction in the under-5 mortality rate of 40.6% (95% CI 29.2 - 50.6). Currently the evidence base for HMM in Africa, particularly regarding use of ACTs, is narrow and priorities for further research are discussed. To optimize treatment and maximize health benefits, drug regimens and delivery strategies in HMM programmes may need to be tailored to local conditions. Additional research could help guide programme development, policy decision-making, and implementation.
Recognizing visual focus of attention from head pose in natural meetings.
Ba, Sileye O; Odobez, Jean-Marc
2009-02-01
We address the problem of recognizing the visual focus of attention (VFOA) of meeting participants based on their head pose. To this end, the head pose observations are modeled using a Gaussian mixture model (GMM) or a hidden Markov model (HMM) whose hidden states correspond to the VFOA. The novelties of this paper are threefold. First, contrary to previous studies on the topic, in our setup, the potential VFOA of a person is not restricted to other participants only. It includes environmental targets as well (a table and a projection screen), which increases the complexity of the task, with more VFOA targets spread in the pan as well as tilt gaze space. Second, we propose a geometric model to set the GMM or HMM parameters by exploiting results from cognitive science on saccadic eye motion, which allows the prediction of the head pose given a gaze target. Third, an unsupervised parameter adaptation step not using any labeled data is proposed, which accounts for the specific gazing behavior of each participant. Using a publicly available corpus of eight meetings featuring four persons, we analyze the above methods by evaluating, through objective performance measures, the recognition of the VFOA from head pose information obtained either using a magnetic sensor device or a vision-based tracking system. The results clearly show that in such complex but realistic situations, the VFOA recognition performance is highly dependent on how well the visual targets are separated for a given meeting participant. In addition, the results show that the use of a geometric model with unsupervised adaptation achieves better results than the use of training data to set the HMM parameters.
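The HMM formulation described above, with hidden states standing for VFOA targets and head pose as the observation, can be illustrated with a small Viterbi decoder. This is only a sketch under stated assumptions, not the authors' model: it uses a single pan angle per frame, one Gaussian per target with hypothetical means and a shared standard deviation, a uniform prior, and a hypothetical `stay_prob` for self-transitions. The paper's geometric model and unsupervised adaptation are not reproduced here.

```python
import math

def log_gauss(x, mu, sigma):
    """Log density of a 1-D Gaussian N(mu, sigma^2) at x."""
    return -0.5 * math.log(2 * math.pi * sigma ** 2) - (x - mu) ** 2 / (2 * sigma ** 2)

def viterbi(obs, means, sigma, stay_prob):
    """Most likely target index per frame; needs at least two targets (assumed setup)."""
    n = len(means)
    switch = (1.0 - stay_prob) / (n - 1)  # remaining mass shared by the other targets
    log_trans = [[math.log(stay_prob if r == s else switch) for s in range(n)]
                 for r in range(n)]
    # Uniform prior over targets for the first frame.
    score = [math.log(1.0 / n) + log_gauss(obs[0], means[s], sigma) for s in range(n)]
    back = []
    for x in obs[1:]:
        prev = score
        # Best predecessor for each current state.
        ptr = [max(range(n), key=lambda r: prev[r] + log_trans[r][s]) for s in range(n)]
        score = [prev[ptr[s]] + log_trans[ptr[s]][s] + log_gauss(x, means[s], sigma)
                 for s in range(n)]
        back.append(ptr)
    # Trace the best path backwards from the best final state.
    state = max(range(n), key=lambda s: score[s])
    path = [state]
    for ptr in reversed(back):
        state = ptr[state]
        path.append(state)
    return path[::-1]

# Hypothetical pan angles (degrees) for three targets and six frames.
path = viterbi([-28.0, -33.0, 2.0, -1.0, 38.0, 41.0], [-30.0, 0.0, 40.0], 10.0, 0.9)
```

The transition matrix is what separates the HMM from the frame-by-frame GMM: a high self-transition probability discourages switching targets on a single noisy frame.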
Passive acoustic leak detection for sodium cooled fast reactors using hidden Markov models
DOE Office of Scientific and Technical Information (OSTI.GOV)
Riber Marklund, A.; Kishore, S.; Prakash, V.
2015-07-01
Acoustic leak detection for steam generators of sodium fast reactors has been an active research topic since the early 1970s and several methods have been tested over the years. Inspired by its success in the field of automatic speech recognition, we here apply hidden Markov models (HMM) in combination with Gaussian mixture models (GMM) to the problem. To achieve this, we propose a new feature calculation scheme, based on the temporal evolution of the power spectral density (PSD) of the signal. Using acoustic signals recorded during steam/water injection experiments done at the Indira Gandhi Centre for Atomic Research (IGCAR), the proposed method is tested. We perform parametric studies on the HMM+GMM model size and demonstrate that the proposed method a) performs well without a priori knowledge of injection noise, b) can incorporate several noise models and c) has an output distribution that simplifies false alarm rate control. (authors)
ECG signal analysis through hidden Markov models.
Andreão, Rodrigo V; Dorizzi, Bernadette; Boudy, Jérôme
2006-08-01
This paper presents an original hidden Markov model (HMM) approach for online beat segmentation and classification of electrocardiograms. The HMM framework was chosen because it supports beat detection, segmentation and classification, which makes it highly suitable for the electrocardiogram (ECG) problem. Our approach addresses a wide range of topics, some of them never studied before in other HMM-related works: waveform modeling, multichannel beat segmentation and classification, and unsupervised adaptation to the patient's ECG. The performance was evaluated on the two-channel QT database in terms of waveform segmentation precision, beat detection and classification. Our waveform segmentation results compare favorably to other systems in the literature. We also obtained high beat detection performance with sensitivity of 99.79% and a positive predictivity of 99.96%, using a test set of 59 recordings. Moreover, premature ventricular contraction beats were detected using an original classification strategy. The results obtained validate our approach for real world application.
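For readers unfamiliar with the two beat detection figures quoted above, the sketch below shows how sensitivity and positive predictivity are conventionally computed from detection counts. The counts are hypothetical and chosen only to illustrate the arithmetic; they are not taken from the paper.

```python
def sensitivity(true_pos, false_neg):
    """Fraction of real beats that the detector found: TP / (TP + FN)."""
    return true_pos / (true_pos + false_neg)

def positive_predictivity(true_pos, false_pos):
    """Fraction of detections that were real beats: TP / (TP + FP)."""
    return true_pos / (true_pos + false_pos)

# Hypothetical counts for illustration only.
se = sensitivity(true_pos=9979, false_neg=21)
ppv = positive_predictivity(true_pos=9996, false_pos=4)
```

Neither measure uses true negatives, which suits beat detection, where "no beat" has no natural count.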
Prediction of lipoprotein signal peptides in Gram-negative bacteria.
Juncker, Agnieszka S; Willenbrock, Hanni; Von Heijne, Gunnar; Brunak, Søren; Nielsen, Henrik; Krogh, Anders
2003-08-01
A method to predict lipoprotein signal peptides in Gram-negative Eubacteria, LipoP, has been developed. The hidden Markov model (HMM) was able to distinguish between lipoproteins (SPaseII-cleaved proteins), SPaseI-cleaved proteins, cytoplasmic proteins, and transmembrane proteins. This predictor was able to predict 96.8% of the lipoproteins correctly with only 0.3% false positives in a set of SPaseI-cleaved, cytoplasmic, and transmembrane proteins. The results obtained were significantly better than those of previously developed methods. Even though Gram-positive lipoprotein signal peptides differ from Gram-negatives, the HMM was able to identify 92.9% of the lipoproteins included in a Gram-positive test set. A genome search was carried out for 12 Gram-negative genomes and one Gram-positive genome. The results for Escherichia coli K12 were compared with new experimental data, and the predictions by the HMM agree well with the experimentally verified lipoproteins. A neural network-based predictor was developed for comparison, and it gave very similar results. LipoP is available as a Web server at www.cbs.dtu.dk/services/LipoP/.
Prediction of lipoprotein signal peptides in Gram-negative bacteria
Juncker, Agnieszka S.; Willenbrock, Hanni; von Heijne, Gunnar; Brunak, Søren; Nielsen, Henrik; Krogh, Anders
2003-01-01
A method to predict lipoprotein signal peptides in Gram-negative Eubacteria, LipoP, has been developed. The hidden Markov model (HMM) was able to distinguish between lipoproteins (SPaseII-cleaved proteins), SPaseI-cleaved proteins, cytoplasmic proteins, and transmembrane proteins. This predictor was able to predict 96.8% of the lipoproteins correctly with only 0.3% false positives in a set of SPaseI-cleaved, cytoplasmic, and transmembrane proteins. The results obtained were significantly better than those of previously developed methods. Even though Gram-positive lipoprotein signal peptides differ from Gram-negatives, the HMM was able to identify 92.9% of the lipoproteins included in a Gram-positive test set. A genome search was carried out for 12 Gram-negative genomes and one Gram-positive genome. The results for Escherichia coli K12 were compared with new experimental data, and the predictions by the HMM agree well with the experimentally verified lipoproteins. A neural network-based predictor was developed for comparison, and it gave very similar results. LipoP is available as a Web server at www.cbs.dtu.dk/services/LipoP/. PMID:12876315
HMMBinder: DNA-Binding Protein Prediction Using HMM Profile Based Features.
Zaman, Rianon; Chowdhury, Shahana Yasmin; Rashid, Mahmood A; Sharma, Alok; Dehzangi, Abdollah; Shatabda, Swakkhar
2017-01-01
DNA-binding proteins often play an important role in various processes within the cell. Over the last decade, a wide range of classification algorithms and feature extraction techniques have been used to predict them. In this paper, we propose a novel DNA-binding protein prediction method called HMMBinder. HMMBinder uses monogram and bigram features extracted from the HMM profiles of the protein sequences. To the best of our knowledge, this is the first application of HMM profile based features for the DNA-binding protein prediction problem. We applied Support Vector Machines (SVM) as a classification technique in HMMBinder. Our method was tested on standard benchmark datasets. We experimentally show that our method outperforms the state-of-the-art methods found in the literature.
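The abstract does not define its monogram and bigram features, so the following is a sketch of one common formulation for profile matrices, offered as an assumption rather than as HMMBinder's exact recipe. The profile is taken to be a list of rows (one per sequence position) with one column per amino acid; a monogram averages each column over positions, and a bigram averages the product of column i at position k with column j at position k+1. The length normalization is also an assumption.

```python
def monogram(profile):
    """Per-column mean over all positions (one value per profile column)."""
    length, n_cols = len(profile), len(profile[0])
    return [sum(row[j] for row in profile) / length for j in range(n_cols)]

def bigram(profile):
    """Mean of profile[k][i] * profile[k+1][j] over consecutive positions k."""
    length, n_cols = len(profile), len(profile[0])
    feats = []
    for i in range(n_cols):
        for j in range(n_cols):
            total = sum(profile[k][i] * profile[k + 1][j] for k in range(length - 1))
            feats.append(total / (length - 1))
    return feats

# Toy 3-position, 2-column profile (hypothetical values).
toy = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]]
features = monogram(toy) + bigram(toy)
```

Under this formulation a 20-column profile would give 20 monogram and 400 bigram values, producing a fixed-length vector for the SVM regardless of sequence length.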
A Novel Approach to ECG Classification Based upon Two-Layered HMMs in Body Sensor Networks
Liang, Wei; Zhang, Yinlong; Tan, Jindong; Li, Yang
2014-01-01
This paper presents a novel approach to ECG signal filtering and classification. Unlike the traditional techniques which aim at collecting and processing the ECG signals with the patient being still, lying in bed in hospitals, our proposed algorithm is intentionally designed for monitoring and classifying the patient's ECG signals in the free-living environment. The patients are equipped with wearable ambulatory devices the whole day, which facilitates the real-time heart attack detection. In ECG preprocessing, an integral-coefficient-band-stop (ICBS) filter is applied, which omits time-consuming floating-point computations. In addition, two-layered Hidden Markov Models (HMMs) are applied to achieve ECG feature extraction and classification. The periodic ECG waveforms are segmented into ISO intervals, P subwave, QRS complex and T subwave respectively in the first HMM layer where expert-annotation assisted Baum-Welch algorithm is utilized in HMM modeling. Then the corresponding interval features are selected and applied to categorize the ECG into normal type or abnormal type (PVC, APC) in the second HMM layer. For verifying the effectiveness of our algorithm on abnormal signal detection, we have developed an ECG body sensor network (BSN) platform, whereby real-time ECG signals are collected, transmitted, displayed and the corresponding classification outcomes are deduced and shown on the BSN screen. PMID:24681668
Sarode, Ashish; Wang, Peng; Cote, Catherine; Worthen, David R
2013-03-01
Hydroxypropylcellulose (HPC)-SL and -SSL, low-viscosity hydroxypropylcellulose polymers, are versatile pharmaceutical excipients. The utility of HPC polymers was assessed for both dissolution enhancement and sustained release of pharmaceutical drugs using various processing techniques. The BCS class II drugs carbamazepine (CBZ), hydrochlorthiazide, and phenytoin (PHT) were hot melt mixed (HMM) with various polymers. PHT formulations produced by solvent evaporation (SE) and ball milling (BM) were prepared using HPC-SSL. HMM formulations of BCS class I chlorpheniramine maleate (CPM) were prepared using HPC-SL and -SSL. These solid dispersions (SDs) manufactured using different processes were evaluated for amorphous transformation and dissolution characteristics. Drug degradation because of HMM processing was also assessed. Amorphous conversion using HMM could be achieved only for relatively low-melting CBZ and CPM. SE and BM did not produce amorphous SDs of PHT using HPC-SSL. Chemical stability of all the drugs was maintained using HPC during the HMM process. Dissolution enhancement was observed in HPC-based HMMs and compared well to other polymers. The dissolution enhancement of PHT was in the order of SE>BM>HMM>physical mixtures, as compared to the pure drug, perhaps due to more intimate mixing that occurred during SE and BM than in HMM. Dissolution of CPM could be significantly sustained in simulated gastric and intestinal fluids using HPC polymers. These studies revealed that low-viscosity HPC-SL and -SSL can be employed to produce chemically stable SDs of poorly as well as highly water-soluble drugs using various pharmaceutical processes in order to control drug dissolution.
Uncovering the cognitive processes underlying mental rotation: an eye-movement study.
Xue, Jiguo; Li, Chunyong; Quan, Cheng; Lu, Yiming; Yue, Jingwei; Zhang, Chenggang
2017-08-30
Mental rotation is an important paradigm for spatial ability. Mental-rotation tasks are assumed to involve five or three sequential cognitive-processing states, though this has not been demonstrated experimentally. Here, we investigated how processing states alternate during mental-rotation tasks. Inference was carried out using an advanced statistical modelling and data-driven approach - a discriminative hidden Markov model (dHMM) trained using eye-movement data obtained from an experiment consisting of two different strategies: (I) mentally rotate the right-side figure to be aligned with the left-side figure and (II) mentally rotate the left-side figure to be aligned with the right-side figure. Eye movements were found to contain the necessary information for determining the processing strategy, and the dHMM that best fit our data segmented the mental-rotation process into three hidden states, which we termed encoding and searching, comparison, and searching on one-side pair. Additionally, we applied three classification methods, logistic regression, support vector model and dHMM, of which dHMM predicted the strategies with the highest accuracy (76.8%). Our study confirmed that there are differences in processing states between these two mental-rotation strategies, and the results were consistent with the previous suggestion that mental rotation is a discrete process that is accomplished in a piecemeal fashion.
A Robust Self-Alignment Method for Ship's Strapdown INS Under Mooring Conditions
Sun, Feng; Lan, Haiyu; Yu, Chunyang; El-Sheimy, Naser; Zhou, Guangtao; Cao, Tong; Liu, Hang
2013-01-01
Strapdown inertial navigation systems (INS) need an alignment process to determine the initial attitude matrix between the body frame and the navigation frame. The conventional alignment process is to compute the initial attitude matrix using the gravity and Earth rotational rate measurements. However, under mooring conditions, the inertial measurement unit (IMU) employed in a ship's strapdown INS often suffers from both the intrinsic sensor noise components and the external disturbance components caused by the motions of the sea waves and wind waves, so a rapid and precise alignment of a ship's strapdown INS without any auxiliary information is hard to achieve. A robust solution is given in this paper to solve this problem. The inertial frame based alignment method is utilized to adapt to the mooring condition; most of the periodic low-frequency external disturbance components can be removed by the mathematical integration and averaging characteristic of this method. A novel prefilter named hidden Markov model based Kalman filter (HMM-KF) is proposed to remove the relatively high-frequency error components. Unlike digital filters, the HMM-KF causes barely any time delay. The turntable, mooring and sea experiments favorably validate the rapidness and accuracy of the proposed self-alignment method and the good de-noising performance of HMM-KF. PMID:23799492
Using DEDICOM for completely unsupervised part-of-speech tagging.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Chew, Peter A.; Bader, Brett William; Rozovskaya, Alla
A standard and widespread approach to part-of-speech tagging is based on Hidden Markov Models (HMMs). An alternative approach, pioneered by Schuetze (1993), induces parts of speech from scratch using singular value decomposition (SVD). We introduce DEDICOM as an alternative to SVD for part-of-speech induction. DEDICOM retains the advantages of SVD in that it is completely unsupervised: no prior knowledge is required to induce either the tagset or the associations of terms with tags. However, unlike SVD, it is also fully compatible with the HMM framework, in that it can be used to estimate emission- and transition-probability matrices which can then be used as the input for an HMM. We apply the DEDICOM method to the CONLL corpus (CONLL 2000) and compare the output of DEDICOM to the part-of-speech tags given in the corpus, and find that the correlation (almost 0.5) is quite high. Using DEDICOM, we also estimate part-of-speech ambiguity for each term, and find that these estimates correlate highly with part-of-speech ambiguity as measured in the original corpus (around 0.88). Finally, we show how the output of DEDICOM can be evaluated and compared against the more familiar output of supervised HMM-based tagging.
End-to-End ASR-Free Keyword Search From Speech
NASA Astrophysics Data System (ADS)
Audhkhasi, Kartik; Rosenberg, Andrew; Sethy, Abhinav; Ramabhadran, Bhuvana; Kingsbury, Brian
2017-12-01
End-to-end (E2E) systems have achieved competitive results compared to conventional hybrid hidden Markov model (HMM)-deep neural network based automatic speech recognition (ASR) systems. Such E2E systems are attractive due to the lack of dependence on alignments between input acoustic and output grapheme or HMM state sequence during training. This paper explores the design of an ASR-free end-to-end system for text query-based keyword search (KWS) from speech trained with minimal supervision. Our E2E KWS system consists of three sub-systems. The first sub-system is a recurrent neural network (RNN)-based acoustic auto-encoder trained to reconstruct the audio through a finite-dimensional representation. The second sub-system is a character-level RNN language model using embeddings learned from a convolutional neural network. Since the acoustic and text query embeddings occupy different representation spaces, they are input to a third feed-forward neural network that predicts whether the query occurs in the acoustic utterance or not. This E2E ASR-free KWS system performs respectably despite lacking a conventional ASR system and trains much faster.
Self-Organizing Hidden Markov Model Map (SOHMMM).
Ferles, Christos; Stafylopatis, Andreas
2013-12-01
A hybrid approach combining the Self-Organizing Map (SOM) and the Hidden Markov Model (HMM) is presented. The Self-Organizing Hidden Markov Model Map (SOHMMM) establishes a cross-section between the theoretic foundations and algorithmic realizations of its constituents. The respective architectures and learning methodologies are fused in an attempt to meet the increasing requirements imposed by the properties of deoxyribonucleic acid (DNA), ribonucleic acid (RNA), and protein chain molecules. The fusion and synergy of the SOM unsupervised training and the HMM dynamic programming algorithms bring forth a novel on-line gradient descent unsupervised learning algorithm, which is fully integrated into the SOHMMM. Since the SOHMMM carries out probabilistic sequence analysis with little or no prior knowledge, it can have a variety of applications in clustering, dimensionality reduction and visualization of large-scale sequence spaces, and also, in sequence discrimination, search and classification. Two series of experiments based on artificial sequence data and splice junction gene sequences demonstrate the SOHMMM's characteristics and capabilities. Copyright © 2013 Elsevier Ltd. All rights reserved.
Hidden Markov models of biological primary sequence information.
Baldi, P; Chauvin, Y; Hunkapiller, T; McClure, M A
1994-01-01
Hidden Markov model (HMM) techniques are used to model families of biological sequences. A smooth and convergent algorithm is introduced to iteratively adapt the transition and emission parameters of the models from the examples in a given family. The HMM approach is applied to three protein families: globins, immunoglobulins, and kinases. In all cases, the models derived capture the important statistical characteristics of the family and can be used for a number of tasks, including multiple alignments, motif detection, and classification. For K sequences of average length N, this approach yields an effective multiple-alignment algorithm which requires O(KN^2) operations, linear in the number of sequences. PMID:8302831
Accelerated Profile HMM Searches
Eddy, Sean R.
2011-01-01
Profile hidden Markov models (profile HMMs) and probabilistic inference methods have made important contributions to the theory of sequence database homology search. However, practical use of profile HMM methods has been hindered by the computational expense of existing software implementations. Here I describe an acceleration heuristic for profile HMMs, the “multiple segment Viterbi” (MSV) algorithm. The MSV algorithm computes an optimal sum of multiple ungapped local alignment segments using a striped vector-parallel approach previously described for fast Smith/Waterman alignment. MSV scores follow the same statistical distribution as gapped optimal local alignment scores, allowing rapid evaluation of significance of an MSV score and thus facilitating its use as a heuristic filter. I also describe a 20-fold acceleration of the standard profile HMM Forward/Backward algorithms using a method I call “sparse rescaling”. These methods are assembled in a pipeline in which high-scoring MSV hits are passed on for reanalysis with the full HMM Forward/Backward algorithm. This accelerated pipeline is implemented in the freely available HMMER3 software package. Performance benchmarks show that the use of the heuristic MSV filter sacrifices negligible sensitivity compared to unaccelerated profile HMM searches. HMMER3 is substantially more sensitive and 100- to 1000-fold faster than HMMER2. HMMER3 is now about as fast as BLAST for protein searches. PMID:22039361
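As background for the Forward algorithm and rescaling mentioned above, here is a minimal sketch of the textbook Forward recursion for a plain discrete HMM, with the forward vector renormalized at every step to avoid numerical underflow. It is not HMMER3's "sparse rescaling" (which, as the name suggests, rescales less often) and it is not a profile HMM; the function name and the toy parameters are assumptions for illustration.

```python
import math

def forward_loglik(obs, start, trans, emit):
    """Log-likelihood of obs under a discrete HMM, rescaling alpha at each step.

    start[s]    : initial probability of state s
    trans[r][s] : probability of moving from state r to state s
    emit[s][o]  : probability that state s emits symbol o
    """
    n = len(start)
    alpha = [start[s] * emit[s][obs[0]] for s in range(n)]
    loglik = 0.0
    for t in range(len(obs)):
        if t > 0:
            alpha = [sum(alpha[r] * trans[r][s] for r in range(n)) * emit[s][obs[t]]
                     for s in range(n)]
        scale = sum(alpha)           # probability mass at this step
        loglik += math.log(scale)    # accumulate in log space
        alpha = [a / scale for a in alpha]
    return loglik

# Toy two-state model with two symbols (hypothetical numbers).
ll = forward_loglik([0, 1],
                    start=[0.6, 0.4],
                    trans=[[0.7, 0.3], [0.4, 0.6]],
                    emit=[[0.9, 0.1], [0.2, 0.8]])
```

Summing the logs of the per-step scale factors recovers the full sequence log-likelihood, so long sequences can be scored without the raw probabilities underflowing.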
Histogram equalization with Bayesian estimation for noise robust speech recognition.
Suh, Youngjoo; Kim, Hoirin
2018-02-01
The histogram equalization approach is an efficient feature normalization technique for noise robust automatic speech recognition. However, it suffers from performance degradation when some fundamental conditions are not satisfied in the test environment. To remedy these limitations of the original histogram equalization methods, class-based histogram equalization approach has been proposed. Although this approach showed substantial performance improvement under noise environments, it still suffers from performance degradation due to the overfitting problem when test data are insufficient. To address this issue, the proposed histogram equalization technique employs the Bayesian estimation method in the test cumulative distribution function estimation. It was reported in a previous study conducted on the Aurora-4 task that the proposed approach provided substantial performance gains in speech recognition systems based on the acoustic modeling of the Gaussian mixture model-hidden Markov model. In this work, the proposed approach was examined in speech recognition systems with deep neural network-hidden Markov model (DNN-HMM), the current mainstream speech recognition approach where it also showed meaningful performance improvement over the conventional maximum likelihood estimation-based method. The fusion of the proposed features with the mel-frequency cepstral coefficients provided additional performance gains in DNN-HMM systems, which otherwise suffer from performance degradation in the clean test condition.
Hu, Weiming; Tian, Guodong; Kang, Yongxin; Yuan, Chunfeng; Maybank, Stephen
2017-09-25
In this paper, a new nonparametric Bayesian model called the dual sticky hierarchical Dirichlet process hidden Markov model (HDP-HMM) is proposed for mining activities from a collection of time series data such as trajectories. All the time series data are clustered. Each cluster of time series data, corresponding to a motion pattern, is modeled by an HMM. Our model postulates a set of HMMs that share a common set of states (topics in an analogy with topic models for document processing), but have unique transition distributions. For the application to motion trajectory modeling, topics correspond to motion activities. The learnt topics are clustered into atomic activities which are assigned predicates. We propose a Bayesian inference method to decompose a given trajectory into a sequence of atomic activities. On combining the learnt sources and sinks, semantic motion regions, and the learnt sequence of atomic activities, the action represented by the trajectory can be described in natural language in as automatic a way as possible. The effectiveness of our dual sticky HDP-HMM is validated on several trajectory datasets. The effectiveness of the natural language descriptions for motions is demonstrated on the vehicle trajectories extracted from a traffic scene.
On the Tradeoff Between Altruism and Selfishness in MANET Trust Management
2016-04-07
to discourage selfish behaviors, using a hidden Markov model (HMM) to quantitatively measure the trustworthiness of nodes. Adams et al. [18...based reliability metric to predict trust-based system survivability. Section 4 analyzes numerical results obtained through the evaluation of our SPN...concepts in MANETs, trust management for MANETs should consider the following design features: trust metrics must be customizable, evaluation of
Effects of emotion on different phoneme classes
NASA Astrophysics Data System (ADS)
Lee, Chul Min; Yildirim, Serdar; Bulut, Murtaza; Busso, Carlos; Kazemzadeh, Abe; Lee, Sungbok; Narayanan, Shrikanth
2004-10-01
This study investigates the effects of emotion on different phoneme classes using short-term spectral features. In the research on emotion in speech, most studies have focused on prosodic features of speech. In this study, based on the hypothesis that different emotions have varying effects on the properties of the different speech sounds, we investigate the usefulness of phoneme-class level acoustic modeling for automatic emotion classification. Hidden Markov models (HMM) based on short-term spectral features for five broad phonetic classes are used for this purpose using data obtained from recordings of two actresses. Each speaker produces 211 sentences with four different emotions (neutral, sad, angry, happy). Using the speech material we trained and compared the performances of two sets of HMM classifiers: a generic set of "emotional speech" HMMs (one for each emotion) and a set of broad phonetic-class based HMMs (vowel, glide, nasal, stop, fricative) for each emotion type considered. Comparison of classification results indicates that different phoneme classes were affected differently by emotional change and that the vowel sounds are the most important indicator of emotions in speech. Detailed results and their implications on the underlying speech articulation will be discussed.
Achieving pattern uniformity in plasmonic lithography by spatial frequency selection
NASA Astrophysics Data System (ADS)
Liang, Gaofeng; Chen, Xi; Zhao, Qing; Guo, L. Jay
2018-01-01
The effects of the surface roughness of thin films and defects on photomasks are investigated in two representative plasmonic lithography systems: thin silver film-based superlens and multilayer-based hyperbolic metamaterial (HMM). Superlens can replicate arbitrary patterns because of its broad evanescent wave passband, which also makes it inherently vulnerable to the roughness of the thin film and imperfections of the mask. On the other hand, the HMM system has spatial frequency filtering characteristics and its pattern formation is based on interference, producing uniform and stable periodic patterns. In this work, we show that the HMM system is more immune to such imperfections due to its function of spatial frequency selection. The analyses are further verified by an interference lithography system incorporating the photoresist layer as an optical waveguide to improve the aspect ratio of the pattern. It is concluded that a system capable of spatial frequency selection is a powerful method to produce deep-subwavelength periodic patterns with high degree of uniformity and fidelity.
Lall, Rahul K; Syed, Deeba N; Khan, Mohammad Imran; Adhami, Vaqar M; Gong, Yuansheng; Lucey, John A; Mukhtar, Hasan
2016-09-01
We and others have shown previously that fisetin, a plant flavonoid, has therapeutic potential against many cancer types. Here, we examined the probable mechanism of its action in prostate cancer (PCa) using a global metabolomics approach. HPLC-ESI-MS analysis of tumor xenografts from fisetin-treated animals identified several metabolic targets with hyaluronan (HA) as the most affected. Efficacy of fisetin on HA was then evaluated in vitro and also in vivo in the transgenic TRAMP mouse model of PCa. Size exclusion chromatography-multiangle laser light scattering (SEC-MALS) was performed to analyze the molar mass (Mw) distribution of HA. Fisetin treatment downregulated intracellular and secreted HA levels both in vitro and in vivo. Fisetin inhibited HA synthesis and degradation enzymes, which led to cessation of HA synthesis and also repressed the degradation of the available high-molecular-mass (HMM)-HA. SEC-MALS analysis of intact HA fragment size revealed that cells and animals have more abundance of HMM-HA and less of low-molecular-mass (LMM)-HA upon fisetin treatment. Elevated HA levels have been shown to be associated with disease progression in certain cancer types. Biological responses triggered by HA mainly depend on the HA polymer length where HMM-HA represses mitogenic signaling and has anti-inflammatory properties whereas LMM-HA promotes proliferation and inflammation. Similarly, Mw analysis of secreted HA fragment size revealed less HMM-HA is secreted that allowed more HMM-HA to be retained within the cells and tissues. Our findings establish that fisetin is an effective, non-toxic, potent HA synthesis inhibitor, which increases abundance of antiangiogenic HMM-HA and could be used for the management of PCa. © The Author 2016. Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com.
Automated EEG sleep staging in the term-age baby using a generative modelling approach.
Pillay, Kirubin; Dereymaeker, Anneleen; Jansen, Katrien; Naulaers, Gunnar; Van Huffel, Sabine; De Vos, Maarten
2018-06-01
We develop a method for automated four-state sleep classification of preterm and term-born babies at term-age of 38-40 weeks postmenstrual age (the age since the last menstrual cycle of the mother) using multichannel electroencephalogram (EEG) recordings. At this critical age, EEG differentiates from broader quiet sleep (QS) and active sleep (AS) stages to four, more complex states, and the quality and timing of this differentiation is indicative of the level of brain development. However, existing methods for automated sleep classification remain focussed only on QS and AS sleep classification. EEG features were calculated from 16 EEG recordings, in 30 s epochs, and personalized feature scaling used to correct for some of the inter-recording variability, by standardizing each recording's feature data using its mean and standard deviation. Hidden Markov models (HMMs) and Gaussian mixture models (GMMs) were trained, with the HMM incorporating knowledge of the sleep state transition probabilities. Performance of the GMM and HMM (with and without scaling) were compared, and Cohen's kappa agreement calculated between the estimates and clinicians' visual labels. For four-state classification, the HMM proved superior to the GMM. With the inclusion of personalized feature scaling, mean kappa (±standard deviation) was 0.62 (±0.16) compared to the GMM value of 0.55 (±0.15). Without feature scaling, kappas for the HMM and GMM dropped to 0.56 (±0.18) and 0.51 (±0.15), respectively. This is the first study to present a successful method for the automated staging of four states in term-age sleep using multichannel EEG. Results suggested a benefit in incorporating transition information using an HMM, and correcting for inter-recording variability through personalized feature scaling. Determining the timing and quality of these states are indicative of developmental delays in both preterm and term-born babies that may lead to learning problems by school age.
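The personalized feature scaling described above, standardizing each recording's features with that recording's own mean and standard deviation, can be sketched as follows. This is a minimal illustration, assuming each recording is a list of per-epoch feature vectors; the use of the population standard deviation and the handling of constant features are assumptions, since the abstract does not specify them.

```python
from statistics import mean, pstdev

def scale_recording(epochs):
    """Z-score every feature using only this recording's own statistics.

    epochs: list of feature vectors, one per 30 s epoch (assumed layout).
    """
    n_feat = len(epochs[0])
    mus = [mean([e[j] for e in epochs]) for j in range(n_feat)]
    sds = [pstdev([e[j] for e in epochs]) for j in range(n_feat)]
    # A constant feature carries no information; map it to 0.0 (assumption).
    return [[(e[j] - mus[j]) / sds[j] if sds[j] > 0 else 0.0 for j in range(n_feat)]
            for e in epochs]

# Two hypothetical epochs with two features each.
scaled = scale_recording([[1.0, 10.0], [3.0, 10.0]])
```

Because the statistics come from each recording separately, offsets and gain differences between babies or sessions are removed before the features reach the HMM or GMM.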
Automated EEG sleep staging in the term-age baby using a generative modelling approach
NASA Astrophysics Data System (ADS)
Pillay, Kirubin; Dereymaeker, Anneleen; Jansen, Katrien; Naulaers, Gunnar; Van Huffel, Sabine; De Vos, Maarten
2018-06-01
Objective. We develop a method for automated four-state sleep classification of preterm and term-born babies at term-age of 38-40 weeks postmenstrual age (the age since the last menstrual cycle of the mother) using multichannel electroencephalogram (EEG) recordings. At this critical age, EEG differentiates from broader quiet sleep (QS) and active sleep (AS) stages to four, more complex states, and the quality and timing of this differentiation is indicative of the level of brain development. However, existing methods for automated sleep classification remain focussed only on QS and AS sleep classification. Approach. EEG features were calculated from 16 EEG recordings, in 30 s epochs, and personalized feature scaling used to correct for some of the inter-recording variability, by standardizing each recording’s feature data using its mean and standard deviation. Hidden Markov models (HMMs) and Gaussian mixture models (GMMs) were trained, with the HMM incorporating knowledge of the sleep state transition probabilities. Performance of the GMM and HMM (with and without scaling) were compared, and Cohen’s kappa agreement calculated between the estimates and clinicians’ visual labels. Main results. For four-state classification, the HMM proved superior to the GMM. With the inclusion of personalized feature scaling, mean kappa (±standard deviation) was 0.62 (±0.16) compared to the GMM value of 0.55 (±0.15). Without feature scaling, kappas for the HMM and GMM dropped to 0.56 (±0.18) and 0.51 (±0.15), respectively. Significance. This is the first study to present a successful method for the automated staging of four states in term-age sleep using multichannel EEG. Results suggested a benefit in incorporating transition information using an HMM, and correcting for inter-recording variability through personalized feature scaling. 
The timing and quality of these states are indicative of developmental delays in both preterm and term-born babies that may lead to learning problems by school age.
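The two evaluation ingredients named in this abstract, per-recording feature standardization and Cohen's kappa, can be sketched as follows. This is a minimal illustration and not the authors' code: the function names, the toy feature values, and the use of the population standard deviation are assumptions.

```python
from collections import Counter
from statistics import mean, pstdev

def scale_per_recording(features):
    """Z-score one feature within each recording separately.

    `features` maps a recording id to a list of per-epoch feature values.
    Assumption: population standard deviation; the abstract does not say which.
    """
    scaled = {}
    for rec_id, values in features.items():
        mu = mean(values)
        sigma = pstdev(values) or 1.0  # guard against a constant feature
        scaled[rec_id] = [(v - mu) / sigma for v in values]
    return scaled

def cohens_kappa(labels_a, labels_b):
    """Agreement between two label sequences, corrected for chance agreement."""
    n = len(labels_a)
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    count_a, count_b = Counter(labels_a), Counter(labels_b)
    expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    return (observed - expected) / (1.0 - expected)

# Hypothetical data: two recordings whose raw feature scales differ tenfold.
recordings = {"rec1": [1.0, 2.0, 3.0], "rec2": [10.0, 20.0, 30.0]}
scaled = scale_per_recording(recordings)

# Hypothetical labels for four epochs (QS = quiet sleep, AS = active sleep).
clinician = ["QS", "QS", "AS", "AS"]
model = ["QS", "AS", "AS", "AS"]
kappa = cohens_kappa(clinician, model)
```

After scaling, both toy recordings map to the same values, which is the inter-recording correction the abstract describes. Note that `cohens_kappa` is undefined when chance agreement equals 1 (both raters use a single label).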
Noda-García, Lianet; Juárez-Vázquez, Ana L; Ávila-Arcos, María C; Verduzco-Castro, Ernesto A; Montero-Morán, Gabriela; Gaytán, Paul; Carrillo-Tripp, Mauricio; Barona-Gómez, Francisco
2015-06-10
Current sequence-based approaches to identify enzyme functional shifts, such as enzyme promiscuity, have proven to be highly dependent on a priori functional knowledge, hampering our ability to reconstruct evolutionary history behind these mechanisms. Hidden Markov Model (HMM) profiles, broadly used to classify enzyme families, can be useful to distinguish between closely related enzyme families with different specificities. The (βα)8-isomerase HisA/PriA enzyme family, involved in L-histidine (HisA, mono-substrate) biosynthesis in most bacteria and plants, but also in L-tryptophan (HisA/TrpF or PriA, dual-substrate) biosynthesis in most Actinobacteria, has been used as model system to explore evolutionary hypotheses and therefore has a considerable amount of evolutionary, functional and structural knowledge available. We searched for functional evolutionary intermediates between the HisA and PriA enzyme families in order to understand the functional divergence between these families. We constructed a HMM profile that correctly classifies sequences of unknown function into the HisA and PriA enzyme sub-families. Using this HMM profile, we mined a large metagenome to identify plausible evolutionary intermediate sequences between HisA and PriA. These sequences were used to perform phylogenetic reconstructions and to identify functionally conserved amino acids. Biochemical characterization of one selected enzyme (CAM1) with a mutation within the functionally essential N-terminus phosphate-binding site, namely, an alanine instead of a glycine in HisA or a serine in PriA, showed that this evolutionary intermediate has dual-substrate specificity. Moreover, site-directed mutagenesis of this alanine residue, either backwards into a glycine or forward into a serine, revealed the robustness of this enzyme. None of these mutations, presumably upon functionally essential amino acids, significantly abolished its enzyme activities. 
A truncated version of this enzyme (CAM2) predicted to adopt a (βα)6-fold, and thus entirely lacking a C-terminus phosphate-binding site, was identified and shown to have HisA activity. As expected, reconstruction of the evolution of PriA from HisA with HMM profiles suggests that functional shifts involve mutations, in evolutionarily intermediate enzymes, of otherwise functionally essential residues or motifs. These results are in agreement with a link between promiscuous enzymes and intragenic epistasis. HMM provides a convenient approach for gaining insights into these evolutionary processes.
Fang, Hongqing; He, Lei; Si, Hao; Liu, Peng; Xie, Xiaolei
2014-09-01
In this paper, the back-propagation (BP) algorithm is used to train a feed-forward neural network for human activity recognition in smart home environments, and an inter-class distance method for feature selection of observed motion sensor events is discussed and tested. The human activity recognition performance of the neural network trained with BP is then evaluated and compared with that of two probabilistic algorithms: the Naïve Bayes (NB) classifier and the Hidden Markov Model (HMM). The results show that different feature datasets yield different activity recognition accuracy. Selecting unsuitable feature datasets increases the computational complexity and degrades the activity recognition accuracy. Furthermore, the neural network trained with BP performs relatively better at human activity recognition than the NB classifier and the HMM. Copyright © 2014 ISA. Published by Elsevier Ltd. All rights reserved.
Capturing the state transitions of seizure-like events using Hidden Markov models.
Guirgis, Mirna; Serletis, Demitre; Carlen, Peter L; Bardakjian, Berj L
2011-01-01
The purpose of this study was to investigate the number of states present in the progression of a seizure-like event (SLE). Of particular interest is to determine if there are more than two clearly defined states, as this would suggest that there is a distinct state preceding an SLE. Whole-intact hippocampus from C57/BL mice was used to model epileptiform activity induced by the perfusion of a low Mg²⁺/high K⁺ solution while extracellular field potentials were recorded from CA3 pyramidal neurons. Hidden Markov models (HMM) were used to model the state transitions of the recorded SLEs by incorporating various features of the Hilbert transform into the training algorithm; specifically, 2- and 3-state HMMs were explored. Although the 2-state model was able to distinguish between SLE and nonSLE behavior, it provided no improvements compared to visual inspection alone. However, the 3-state model was able to capture two distinct nonSLE states that visual inspection failed to discriminate. Moreover, because the system was HMM-based, a priori knowledge of the state transitions was not required, making this an ideal platform for seizure prediction algorithms.
NASA Astrophysics Data System (ADS)
Zhang, Wei; Jiang, Ling; Han, Lei
2018-04-01
Convective storm nowcasting refers to the prediction of the initiation, development, and decay of convective weather in the very short term (typically 0-2 h). Despite marked progress over the past years, severe convective storm nowcasting remains a challenge. With the boom of machine learning, such methods, especially the convolutional neural network (CNN), have been applied successfully in various fields. In this paper, we build a severe convective weather nowcasting system based on a CNN and a hidden Markov model (HMM) using reanalysis meteorological data. The goal of convective storm nowcasting is to predict whether there will be a convective storm in 30 min. We use the CNN to compress the VDRAS reanalysis data into low-dimensional data that serve as the observation vectors of the HMM, and then obtain the development trend of strong convective weather in the form of a time series. The results show that our method can extract robust features without any manual feature selection and can capture the development trend of strong convective storms.
A method of hidden Markov model optimization for use with geophysical data sets
NASA Technical Reports Server (NTRS)
Granat, R. A.
2003-01-01
Geophysics research has been faced with a growing need for automated techniques with which to process large quantities of data. A successful tool must meet a number of requirements: it should be consistent, require minimal parameter tuning, and produce scientifically meaningful results in reasonable time. We introduce a hidden Markov model (HMM)-based method for analysis of geophysical data sets that attempts to address these issues.
Dancker, P
1975-01-01
1. The dependence on ATP concentration of ATPase activity and light scattering decrease of acto-HMM could be described at very low ionic strength by one hyperbolic adsorption isotherm with a dissociation constant of 3 × 10⁻⁶ M. Hence the increase of ATPase activity was paralleled by a decrease in light scattering. At higher values of ionic strength ATPase activity stopped rising before HMM was completely saturated with ATP. Higher ionic strength prevented ATPase activity from further increasing when the rigor links (links between actin and nucleotide-free myosin), which have formerly protected the ATPase against the suppressing action of higher ionic strength, have fallen below a certain amount. This protecting influence of rigor links did not require tropomyosin-troponin. 2. For complete activation of ATPase activity by actin less actin was needed when HMM was incompletely saturated with ATP than when it was completely saturated with ATP. 3. The apparent affinity of ATP to regulated acto-HMM (which contained tropomyosin-troponin) was lower than to unregulated acto-HMM (which was devoid of tropomyosin-troponin). In the presence of rigor complexes (indicated by an incomplete decrease of light scattering) the ATPase activity of regulated acto-HMM was higher than that of unregulated acto-HMM. At increasing ATP concentrations the ATPase activity of regulated acto-HMM stopped rising at a similar degree of saturation with ATP as the ATPase activity of unregulated acto-HMM at the same ionic strength.
NASA Astrophysics Data System (ADS)
Suvorova, S.; Clearwater, P.; Melatos, A.; Sun, L.; Moran, W.; Evans, R. J.
2017-11-01
A hidden Markov model (HMM) scheme for tracking continuous-wave gravitational radiation from neutron stars in low-mass x-ray binaries (LMXBs) with wandering spin is extended by introducing a frequency-domain matched filter, called the J-statistic, which sums the signal power in orbital sidebands coherently. The J-statistic is similar but not identical to the binary-modulated F-statistic computed by demodulation or resampling. By injecting synthetic LMXB signals into Gaussian noise characteristic of the Advanced Laser Interferometer Gravitational-wave Observatory (Advanced LIGO), it is shown that the J-statistic HMM tracker detects signals with characteristic wave strain h₀ ≥ 2 × 10⁻²⁶ in 370 d of data from two interferometers, divided into 37 coherent blocks of equal length. When applied to data from Stage I of the Scorpius X-1 Mock Data Challenge organized by the LIGO Scientific Collaboration, the tracker detects all 50 closed injections (h₀ ≥ 6.84 × 10⁻²⁶), recovering the frequency with a root-mean-square accuracy of ≤ 1.95 × 10⁻⁵ Hz. Of the 50 injections, 43 (with h₀ ≥ 1.09 × 10⁻²⁵) are detected in a single, coherent 10 d block of data. The tracker employs an efficient, recursive HMM solver based on the Viterbi algorithm, which requires ~10⁵ CPU-hours for a typical broadband (0.5 kHz) LMXB search.
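The recursive Viterbi solver mentioned in this abstract can be illustrated with a generic log-domain implementation. The sketch below is not the J-statistic tracker: the three "frequency bins", the rule that the frequency wanders by at most one bin per step, and the per-block log-likelihoods are all hypothetical toy values.

```python
import math

def viterbi(log_init, log_trans, log_emit):
    """Most probable hidden-state path of an HMM, computed in the log domain.

    log_init[i]      log P(state_0 = i)
    log_trans[i][j]  log P(state_t = j | state_{t-1} = i)
    log_emit[t][j]   log-likelihood of observation t given state j
    """
    n_states = len(log_init)
    score = [log_init[j] + log_emit[0][j] for j in range(n_states)]
    backpointers = []
    for t in range(1, len(log_emit)):
        new_score, pointers = [], []
        for j in range(n_states):
            # Best predecessor of state j at time t.
            best_i = max(range(n_states), key=lambda i: score[i] + log_trans[i][j])
            pointers.append(best_i)
            new_score.append(score[best_i] + log_trans[best_i][j] + log_emit[t][j])
        score = new_score
        backpointers.append(pointers)
    # Backtrack from the best final state.
    state = max(range(n_states), key=lambda j: score[j])
    path = [state]
    for pointers in reversed(backpointers):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path, max(score)

NEG_INF = -math.inf
# Toy model: three frequency bins; the signal may stay or move to an adjacent bin.
log_init = [math.log(1 / 3)] * 3
log_trans = [
    [math.log(0.5), math.log(0.5), NEG_INF],
    [math.log(1 / 3), math.log(1 / 3), math.log(1 / 3)],
    [NEG_INF, math.log(0.5), math.log(0.5)],
]
# Hypothetical per-block log-likelihoods favoring bins 0, 1, 2 in turn.
log_emit = [[0.0, -2.0, -4.0], [-2.0, 0.0, -2.0], [-4.0, -2.0, 0.0]]
path, log_prob = viterbi(log_init, log_trans, log_emit)
```

The cost is proportional to the number of time steps times the square of the number of states; restricting transitions to neighboring bins, as a frequency tracker can, reduces the inner loop accordingly.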
Speech recognition for embedded automatic positioner for laparoscope
NASA Astrophysics Data System (ADS)
Chen, Xiaodong; Yin, Qingyun; Wang, Yi; Yu, Daoyin
2014-07-01
In this paper, a novel speech recognition methodology based on the Hidden Markov Model (HMM) is proposed for an embedded Automatic Positioner for Laparoscope (APL), which is built around a fixed-point ARM processor. The APL system is designed to assist the doctor in laparoscopic surgery by implementing voice control of the laparoscope for a specific doctor. Responding to voice commands in real time requires a more efficient speech recognition algorithm for the APL. To reduce computation cost without significant loss in recognition accuracy, both arithmetic and algorithmic optimizations are applied in the method presented. First, relying mainly on arithmetic optimizations, a fixed-point front end for speech feature analysis is built to suit the characteristics of the ARM processor. Then a fast likelihood computation algorithm is used to reduce the computational complexity of the HMM-based recognition algorithm. The experimental results show that the method shortens the recognition time to within 0.5 s while keeping the accuracy higher than 99%, demonstrating its ability to achieve real-time vocal control of the APL.
NASA Astrophysics Data System (ADS)
Khebbab, Mohamed; Feliachi, Mouloud; El Hadi Latreche, Mohamed
2018-03-01
In this paper, a simulation of eddy current non-destructive testing (EC NDT) on unidirectional carbon fiber reinforced polymer is performed; for this purpose, a magneto-dynamic formulation in terms of the magnetic vector potential is solved using the finite element heterogeneous multi-scale method (FE HMM). The goal of FE HMM is to compute the homogenized solution without calculating the homogenized tensor explicitly; the solution is based only on the physical characteristics known in the micro domain. This feature is well suited to EC NDT for evaluating defects in carbon composite material at the microscopic scale, where defect detection is performed by coil impedance measurement and the measured value is intimately linked to the material characteristics at the microscopic level. Based on this, our model can handle different defects such as cracks, inclusions, internal electrical conductivity changes, and heterogeneities. The simulation results were compared with the solution obtained for a homogenized material using the mixture law, and good agreement was found.
An online handwriting recognition system for Turkish
NASA Astrophysics Data System (ADS)
Vural, Esra; Erdogan, Hakan; Oflazer, Kemal; Yanikoglu, Berrin A.
2004-12-01
Despite recent developments in Tablet PC technology, there has not been any applications for recognizing handwritings in Turkish. In this paper, we present an online handwritten text recognition system for Turkish, developed using the Tablet PC interface. However, even though the system is developed for Turkish, the addressed issues are common to online handwriting recognition systems in general. Several dynamic features are extracted from the handwriting data for each recorded point and Hidden Markov Models (HMM) are used to train letter and word models. We experimented with using various features and HMM model topologies, and report on the effects of these experiments. We started with first and second derivatives of the x and y coordinates and relative change in the pen pressure as initial features. We found that using two more additional features, that is, number of neighboring points and relative heights of each point with respect to the base-line improve the recognition rate. In addition, extracting features within strokes and using a skipping state topology improve the system performance as well. The improved system performance is 94% in recognizing handwritten words from a 1000-word lexicon.
An online handwriting recognition system for Turkish
NASA Astrophysics Data System (ADS)
Vural, Esra; Erdogan, Hakan; Oflazer, Kemal; Yanikoglu, Berrin A.
2005-01-01
Despite recent developments in Tablet PC technology, there has not been any applications for recognizing handwritings in Turkish. In this paper, we present an online handwritten text recognition system for Turkish, developed using the Tablet PC interface. However, even though the system is developed for Turkish, the addressed issues are common to online handwriting recognition systems in general. Several dynamic features are extracted from the handwriting data for each recorded point and Hidden Markov Models (HMM) are used to train letter and word models. We experimented with using various features and HMM model topologies, and report on the effects of these experiments. We started with first and second derivatives of the x and y coordinates and relative change in the pen pressure as initial features. We found that using two more additional features, that is, number of neighboring points and relative heights of each point with respect to the base-line improve the recognition rate. In addition, extracting features within strokes and using a skipping state topology improve the system performance as well. The improved system performance is 94% in recognizing handwritten words from a 1000-word lexicon.
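A skipping-state topology, as mentioned in this abstract, is commonly expressed as a left-to-right transition matrix in which each state may stay, advance by one, or jump over one state. The sketch below shows one way to initialize such a matrix; the function name and the uniform starting probabilities are assumptions, and the abstract does not give the authors' actual topology details.

```python
def left_to_right_transitions(n_states, allow_skip=True):
    """Row-stochastic transition matrix for a left-to-right HMM.

    Each state may stay, advance by one, or (optionally) skip one state.
    Allowed moves start with equal probability; training would re-estimate them.
    """
    max_jump = 2 if allow_skip else 1
    matrix = []
    for i in range(n_states):
        targets = [j for j in range(i, i + max_jump + 1) if j < n_states]
        row = [0.0] * n_states
        for j in targets:
            row[j] = 1.0 / len(targets)
        matrix.append(row)
    return matrix

# Hypothetical 4-state letter model with skips enabled.
skip_model = left_to_right_transitions(4)
plain_model = left_to_right_transitions(4, allow_skip=False)
```

Entries initialized to zero stay zero under Baum-Welch re-estimation, so the initial matrix fixes the topology while training adjusts only the allowed transitions.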
Kamoun, Choumouss; Payen, Thibaut; Hua-Van, Aurélie; Filée, Jonathan
2013-10-11
Insertion Sequences (ISs) and their non-autonomous derivatives (MITEs) are important components of prokaryotic genomes inducing duplication, deletion, rearrangement or lateral gene transfers. Although ISs and MITEs are relatively simple and basic genetic elements, their detection remains a difficult task due to their remarkable sequence diversity. With the advent of high-throughput genome and metagenome sequencing technologies, the development of fast, reliable and sensitive methods of ISs and MITEs detection become an important challenge. So far, almost all studies dealing with prokaryotic transposons have used classical BLAST-based detection methods against reference libraries. Here we introduce alternative methods of detection either taking advantages of the structural properties of the elements (de novo methods) or using an additional library-based method using profile HMM searches. In this study, we have developed three different work flows dedicated to ISs and MITEs detection: the first two use de novo methods detecting either repeated sequences or presence of Inverted Repeats; the third one use 28 in-house transposase alignment profiles with HMM search methods. We have compared the respective performances of each method using a reference dataset of 30 archaeal and 30 bacterial genomes in addition to simulated and real metagenomes. Compared to a BLAST-based method using ISFinder as library, de novo methods significantly improve ISs and MITEs detection. For example, in the 30 archaeal genomes, we discovered 30 new elements (+20%) in addition to the 141 multi-copies elements already detected by the BLAST approach. Many of the new elements correspond to ISs belonging to unknown or highly divergent families. The total number of MITEs has even doubled with the discovery of elements displaying very limited sequence similarities with their respective autonomous partners (mainly in the Inverted Repeats of the elements). 
Concerning metagenomes, with the exception of short-read data (<300 bp), for which both techniques seem equally limited, profile HMM searches considerably improve the detection of transposase-encoding genes (up to +50%) while generating a low level of false positives compared to BLAST-based methods. Compared to classical BLAST-based methods, the sensitivity of the de novo and profile HMM methods developed in this study allows a better and more reliable detection of transposons in prokaryotic genomes and metagenomes. We believe that future studies involving IS and MITE identification in genomic data should combine at least one de novo and one library-based method, with optimal results obtained by running the two de novo methods in addition to a library-based search. For metagenomic data, profile HMM search should be favored; a BLAST-based step is only useful for the final annotation into groups and families.
Kim, Kye-Young; Kawamoto, Sachiyo; Bao, Jianjun; Sellers, James R.; Adelstein, Robert S.
2008-01-01
We report the initial biochemical characterization of an alternatively spliced isoform of nonmuscle heavy meromyosin (HMM) II-B2 and compare it with HMM II-B0, the non-spliced isoform. HMM II-B2 is the HMM derivative of an alternatively spliced isoform of endogenous nonmuscle myosin (NM) II-B, which has 21-amino acids inserted into loop 2, near the actin-binding region. NM II-B2 is expressed in the Purkinje cells of the cerebellum as well as in other neuronal cells (Ma et al., Mol. Biol. Cell 15 (2006) 2138-2149). In contrast to any of the previously described isoforms of NM II (II-A, II-B0, II-B1, II-C0 and II-C1) or to smooth muscle myosin, the actin-activated MgATPase activity of HMM II-B2 is not significantly increased from a low, basal level by phosphorylation of the 20 kDa myosin light chain (MLC-20). Moreover, although HMM II-B2 can bind to actin in the absence of ATP and is released in its presence, it cannot propel actin in the sliding actin filament assay following MLC-20 phosphorylation. Unlike HMM II-B2, the actin-activated MgATPase activity of a chimeric HMM with the 21-amino acids II-B2 sequence inserted into the homologous location in the heavy chain of HMM II-C is increased following MLC-20 phosphorylation. This indicates that the effect of the II-B2 insert is myosin heavy chain specific. PMID:18060863
Kao, Jonathan C; Nuyujukian, Paul; Ryu, Stephen I; Shenoy, Krishna V
2017-04-01
Communication neural prostheses aim to restore efficient communication to people with motor neurological injury or disease by decoding neural activity into control signals. These control signals are both analog (e.g., the velocity of a computer mouse) and discrete (e.g., clicking an icon with a computer mouse) in nature. Effective, high-performing, and intuitive-to-use communication prostheses should be capable of decoding both analog and discrete state variables seamlessly. However, to date, the highest-performing autonomous communication prostheses rely on precise analog decoding and typically do not incorporate high-performance discrete decoding. In this report, we incorporated a hidden Markov model (HMM) into an intracortical communication prosthesis to enable accurate and fast discrete state decoding in parallel with analog decoding. In closed-loop experiments with nonhuman primates implanted with multielectrode arrays, we demonstrate that incorporating an HMM into a neural prosthesis can increase state-of-the-art achieved bitrate by 13.9% and 4.2% in two monkeys. We found that the transition model of the HMM is critical to achieving this performance increase. Further, we found that using an HMM resulted in the highest achieved peak performance we have ever observed for these monkeys, achieving peak bitrates of 6.5, 5.7, and 4.7 bps in Monkeys J, R, and L, respectively. Finally, we found that this neural prosthesis was robustly controllable for the duration of entire experimental sessions. These results demonstrate that high-performance discrete decoding can be beneficially combined with analog decoding to achieve new state-of-the-art levels of performance.
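Why a transition model matters for discrete decoding can be seen in a single step of HMM forward filtering. The sketch below is a generic illustration and not the decoder used in the study; the two states ("move" and "click"), the transition probabilities, and the likelihood values are hypothetical.

```python
def forward_update(belief, transition, likelihood):
    """One step of HMM forward filtering.

    belief[i]         P(state_{t-1} = i | observations so far)
    transition[i][j]  P(state_t = j | state_{t-1} = i)
    likelihood[j]     P(observation_t | state_t = j)
    """
    n = len(belief)
    # Predict: push the previous belief through the transition model.
    predicted = [sum(belief[i] * transition[i][j] for i in range(n)) for j in range(n)]
    # Update: weight by the observation likelihood and renormalize.
    unnormalized = [predicted[j] * likelihood[j] for j in range(n)]
    total = sum(unnormalized)
    return [u / total for u in unnormalized]

# Hypothetical two-state example: state 0 = "move", state 1 = "click".
transition = [[0.95, 0.05],
              [0.10, 0.90]]
belief = [0.9, 0.1]
noisy_likelihood = [0.4, 0.6]  # one time bin that weakly favors "click"
belief = forward_update(belief, transition, noisy_likelihood)
```

With these toy numbers, a single ambiguous time bin does not flip the decoded state, because the sticky transition model keeps most of the probability on "move"; a classifier that used only `noisy_likelihood` would have reported a click.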
NASA Astrophysics Data System (ADS)
Frost, Andrew J.; Thyer, Mark A.; Srikanthan, R.; Kuczera, George
2007-07-01
Summary: Multi-site simulations of hydrological data are required for drought risk assessment of large multi-reservoir water supply systems. In this paper, a general Bayesian framework is presented for the calibration and evaluation of multi-site hydrological data at annual timescales. Models included within this framework are the hidden Markov model (HMM) and the widely used lag-1 autoregressive (AR(1)) model. These models are extended by the inclusion of a Box-Cox transformation and a spatial correlation function in a multi-site setting. Parameter uncertainty is evaluated using Markov chain Monte Carlo techniques. Models are evaluated by their ability to reproduce a range of important extreme statistics and compared using Bayesian model selection techniques which evaluate model probabilities. The case study, using multi-site annual rainfall data situated within catchments which contribute to Sydney's main water supply, provided the following results: Firstly, in terms of model probabilities and diagnostics, the inclusion of the Box-Cox transformation was preferred. Secondly, the AR(1) and HMM performed similarly, while some other proposed AR(1)/HMM models with regionally pooled parameters had greater posterior probability than these two models. The practical significance of parameter and model uncertainty was illustrated using a case study involving drought security analysis for urban water supply. It was shown that ignoring parameter uncertainty resulted in a significant overestimate of reservoir yield and an underestimation of system vulnerability to severe drought.
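The combination of a Box-Cox transformation with a lag-1 autoregressive model can be sketched for a single site as follows. This is a simplified illustration only: it omits the spatial correlation function, the HMM variant, and the Bayesian calibration described in the abstract, and every parameter value (the transformation exponent, mean, standard deviation, and lag-1 coefficient) is hypothetical.

```python
import math
import random

def box_cox(x, lam):
    """Box-Cox transform of a positive value x."""
    return math.log(x) if lam == 0 else (x ** lam - 1.0) / lam

def inverse_box_cox(z, lam):
    """Inverse of box_cox; assumes lam * z + 1 > 0 when lam != 0."""
    return math.exp(z) if lam == 0 else (lam * z + 1.0) ** (1.0 / lam)

def simulate_ar1(n_years, mean, sd, phi, rng):
    """Simulate a stationary AR(1) series in the transformed space."""
    innovation_sd = sd * math.sqrt(1.0 - phi ** 2)  # keeps the marginal sd equal to `sd`
    z = rng.gauss(mean, sd)
    series = []
    for _ in range(n_years):
        series.append(z)
        z = mean + phi * (z - mean) + rng.gauss(0.0, innovation_sd)
    return series

# Hypothetical single-site example: annual rainfall around 800 mm.
rng = random.Random(0)
lam = 0.5
z_mean = box_cox(800.0, lam)
transformed = simulate_ar1(n_years=50, mean=z_mean, sd=5.0, phi=0.3, rng=rng)
rainfall = [inverse_box_cox(z, lam) for z in transformed]
```

Simulating in the transformed space and back-transforming yields strictly positive, skewed annual totals, which is the usual motivation for applying the transformation before fitting a Gaussian model.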
Application of hidden Markov models to biological data mining: a case study
NASA Astrophysics Data System (ADS)
Yin, Michael M.; Wang, Jason T.
2000-04-01
In this paper we present an example of biological data mining: the detection of splicing junction acceptors in eukaryotic genes. Identification or prediction of transcribed sequences from within genomic DNA has been a major rate-limiting step in the pursuit of genes. Programs currently available are far from being powerful enough to elucidate the gene structure completely. Here we develop a hidden Markov model (HMM) to represent the degeneracy features of splicing junction acceptor sites in eukaryotic genes. The HMM system is fully trained using an expectation maximization (EM) algorithm and the system performance is evaluated using the 10-way cross-validation method. Experimental results show that our HMM system can correctly classify more than 94% of the candidate sequences (including true and false acceptor sites) into the right categories. About 90% of the true acceptor sites and 96% of the false acceptor sites in the test data are classified correctly. These results are very promising considering that only the local information in DNA is used. The proposed model will be a very important component of an effective and accurate gene structure detection system currently being developed in our lab.
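The evaluation protocol in this abstract, 10-way cross-validation with separate accuracy on true and false acceptor sites, can be sketched with two small helpers. The function names and the toy labels are assumptions; the classifier itself is not shown.

```python
def k_fold_indices(n_items, k=10):
    """Return k disjoint test folds that together cover range(n_items)."""
    return [list(range(start, n_items, k)) for start in range(k)]

def sensitivity_specificity(truth, predicted):
    """truth and predicted are lists of booleans (True = real acceptor site)."""
    tp = sum(t and p for t, p in zip(truth, predicted))
    tn = sum((not t) and (not p) for t, p in zip(truth, predicted))
    n_true = sum(truth)
    sensitivity = tp / n_true                       # fraction of true sites recovered
    specificity = tn / (len(truth) - n_true)        # fraction of false sites rejected
    return sensitivity, specificity

# Hypothetical data: 25 candidate sequences split into 10 folds.
folds = k_fold_indices(25)

# Hypothetical predictions for one test fold of 10 candidates.
truth = [True] * 4 + [False] * 6
predicted = [True, True, True, False, False, False, False, False, False, True]
sens, spec = sensitivity_specificity(truth, predicted)
```

In practice the candidate sequences would be shuffled before splitting, and the model would be retrained on the other nine folds before scoring each test fold.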
Optical character recognition of handwritten Arabic using hidden Markov models
NASA Astrophysics Data System (ADS)
Aulama, Mohannad M.; Natsheh, Asem M.; Abandah, Gheith A.; Olama, Mohammed M.
2011-04-01
The problem of optical character recognition (OCR) of handwritten Arabic has not received a satisfactory solution yet. In this paper, an Arabic OCR algorithm is developed based on Hidden Markov Models (HMMs) combined with the Viterbi algorithm, which results in an improved and more robust recognition of characters at the sub-word level. Integrating the HMMs represents another step of the overall OCR trends being currently researched in the literature. The proposed approach exploits the structure of characters in the Arabic language in addition to their extracted features to achieve improved recognition rates. Useful statistical information of the Arabic language is initially extracted and then used to estimate the probabilistic parameters of the mathematical HMM. A new custom implementation of the HMM is developed in this study, where the transition matrix is built based on the collected large corpus, and the emission matrix is built based on the results obtained via the extracted character features. The recognition process is triggered using the Viterbi algorithm which employs the most probable sequence of sub-words. The model was implemented to recognize the sub-word unit of Arabic text raising the recognition rate from being linked to the worst recognition rate for any character to the overall structure of the Arabic language. Numerical results show that there is a potentially large recognition improvement by using the proposed algorithms.
Optical character recognition of handwritten Arabic using hidden Markov models
DOE Office of Scientific and Technical Information (OSTI.GOV)
Aulama, Mohannad M.; Natsheh, Asem M.; Abandah, Gheith A.
2011-01-01
The problem of optical character recognition (OCR) of handwritten Arabic has not received a satisfactory solution yet. In this paper, an Arabic OCR algorithm is developed based on Hidden Markov Models (HMMs) combined with the Viterbi algorithm, which results in an improved and more robust recognition of characters at the sub-word level. Integrating the HMMs represents another step of the overall OCR trends being currently researched in the literature. The proposed approach exploits the structure of characters in the Arabic language in addition to their extracted features to achieve improved recognition rates. Useful statistical information of the Arabic language is initially extracted and then used to estimate the probabilistic parameters of the mathematical HMM. A new custom implementation of the HMM is developed in this study, where the transition matrix is built based on the collected large corpus, and the emission matrix is built based on the results obtained via the extracted character features. The recognition process is triggered using the Viterbi algorithm which employs the most probable sequence of sub-words. The model was implemented to recognize the sub-word unit of Arabic text raising the recognition rate from being linked to the worst recognition rate for any character to the overall structure of the Arabic language. Numerical results show that there is a potentially large recognition improvement by using the proposed algorithms.
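Building a transition matrix from a text corpus, as this abstract describes, amounts to counting which character follows which and normalizing each row. The sketch below is a minimal illustration with a hypothetical two-word corpus of placeholder Latin letters; it is not the authors' implementation and applies no smoothing.

```python
from collections import Counter, defaultdict

def estimate_transitions(corpus):
    """Estimate P(next character | current character) from a list of words."""
    counts = defaultdict(Counter)
    for word in corpus:
        for prev, nxt in zip(word, word[1:]):
            counts[prev][nxt] += 1
    # Normalize each row of counts into probabilities.
    return {
        prev: {nxt: c / sum(following.values()) for nxt, c in following.items()}
        for prev, following in counts.items()
    }

# Hypothetical toy corpus; a real system would use a large Arabic text corpus.
corpus = ["aba", "abb"]
transitions = estimate_transitions(corpus)
```

Character pairs never seen in the corpus get no entry here, which is equivalent to probability zero; a real recognizer would typically smooth these counts so that unseen pairs are not ruled out entirely.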
Shinozaki, Takahiro
2018-01-01
Human-computer interface systems whose input is based on eye movements can serve as a means of communication for patients with locked-in syndrome. Eye-writing is one such system; users can input characters by moving their eyes to follow the lines of the strokes corresponding to characters. Although this input method makes it easy for patients to get started because of their familiarity with handwriting, existing eye-writing systems suffer from slow input rates because they require a pause between input characters to simplify the automatic recognition process. In this paper, we propose a continuous eye-writing recognition system that achieves a rapid input rate because it accepts characters eye-written continuously, with no pauses. For recognition purposes, the proposed system first detects eye movements using electrooculography (EOG), and then a hidden Markov model (HMM) is applied to model the EOG signals and recognize the eye-written characters. Additionally, this paper investigates an EOG adaptation that uses a deep neural network (DNN)-based HMM. Experiments with six participants showed an average input speed of 27.9 character/min using Japanese Katakana as the input target characters. A Katakana character-recognition error rate of only 5.0% was achieved using 13.8 minutes of adaptation data. PMID:29425248
HMM-ModE: implementation, benchmarking and validation with HMMER3
2014-01-01
Background HMM-ModE is a computational method that generates family specific profile HMMs using negative training sequences. The method optimizes the discrimination threshold using 10 fold cross validation and modifies the emission probabilities of profiles to reduce common fold based signals shared with other sub-families. The protocol depends on the program HMMER for HMM profile building and sequence database searching. The recent release of HMMER3 has improved database search speed by several orders of magnitude, allowing for the large scale deployment of the method in sequence annotation projects. We have rewritten our existing scripts both at the level of parsing the HMM profiles and modifying emission probabilities to upgrade HMM-ModE using HMMER3 that takes advantage of its probabilistic inference with high computational speed. The method is benchmarked and tested on GPCR dataset as an accurate and fast method for functional annotation. Results The implementation of this method, which now works with HMMER3, is benchmarked with the earlier version of HMMER, to show that the effect of local-local alignments is marked only in the case of profiles containing a large number of discontinuous match states. The method is tested on a gold standard set of families and we have reported a significant reduction in the number of false positive hits over the default HMM profiles. When implemented on GPCR sequences, the results showed an improvement in the accuracy of classification compared with other methods used to classify the family at different levels of their classification hierarchy. Conclusions The present findings show that the new version of HMM-ModE is a highly specific method used to differentiate between fold (superfamily) and function (family) specific signals, which helps in the functional annotation of protein sequences. 
The use of modified profile HMMs of GPCR sequences provides a simple yet highly specific method for classification of the family, being able to predict the sub-family specific sequences with high accuracy even though sequences share common physicochemical characteristics between sub-families. PMID:25073805
Detecting Seismic Events Using a Supervised Hidden Markov Model
NASA Astrophysics Data System (ADS)
Burks, L.; Forrest, R.; Ray, J.; Young, C.
2017-12-01
We explore the use of supervised hidden Markov models (HMMs) to detect seismic events in streaming seismogram data. Current methods for seismic event detection include simple triggering algorithms, such as STA/LTA and the Z-statistic, which can lead to large numbers of false positives that must be investigated by an analyst. The hypothesis of this study is that more advanced detection methods, such as HMMs, may decrease false positives while maintaining accuracy similar to current methods. We train a binary HMM classifier using 2 weeks of 3-component waveform data from the International Monitoring System (IMS) that was carefully reviewed by an expert analyst to pick all seismic events. Using an ensemble of simple and discrete features, such as the triggering of STA/LTA, the HMM predicts the time at which transition occurs from noise to signal. Compared to the STA/LTA detection algorithm, the HMM detects more true events, but the false positive rate remains unacceptably high. Future work to potentially decrease the false positive rate may include using continuous features, a Gaussian HMM, and multi-class HMMs to distinguish between types of seismic waves (e.g., P-waves and S-waves). Acknowledgement: Sandia National Laboratories is a multi-mission laboratory managed and operated by National Technology and Engineering Solutions of Sandia, LLC., a wholly owned subsidiary of Honeywell International, Inc., for the U.S. Department of Energy's National Nuclear Security Administration under contract DE-NA-0003525. SAND No: SAND2017-8154 A
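The STA/LTA trigger mentioned in this abstract can be illustrated with a minimal sketch. It assumes a plain energy-based ratio of a short-term average to a long-term average over trailing windows; the window lengths, the threshold, and the synthetic trace are hypothetical and are not taken from the study.

```python
def sta_lta(samples, n_sta, n_lta):
    """Return STA/LTA energy ratios for every window end where both windows fit.

    ratios[k] corresponds to trailing windows that end just before index k + n_lta.
    """
    energy = [x * x for x in samples]
    ratios = []
    for end in range(n_lta, len(energy) + 1):
        lta = sum(energy[end - n_lta:end]) / n_lta  # long-term average
        sta = sum(energy[end - n_sta:end]) / n_sta  # short-term average
        ratios.append(sta / lta if lta > 0 else 0.0)
    return ratios


def trigger_indices(ratios, threshold):
    """Indices of the ratio series at or above the (assumed) threshold."""
    return [i for i, r in enumerate(ratios) if r >= threshold]


# Hypothetical trace: 20 low-amplitude noise samples followed by a short burst.
trace = [0.1, -0.1] * 10 + [2.0, -2.0, 2.0, -2.0]
ratios = sta_lta(trace, n_sta=2, n_lta=10)
triggers = trigger_indices(ratios, threshold=3.0)
```

The resulting on/off trigger series is the kind of simple discrete feature that could be passed to an HMM as an observation symbol.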
Development of a Fault Monitoring Technique for Wind Turbines Using a Hidden Markov Model.
Shin, Sung-Hwan; Kim, SangRyul; Seo, Yun-Ho
2018-06-02
Regular inspection for the maintenance of the wind turbines is difficult because of their remote locations. For this reason, condition monitoring systems (CMSs) are typically installed to monitor their health condition. The purpose of this study is to propose a fault detection algorithm for the mechanical parts of the wind turbine. To this end, long-term vibration data were collected over two years by a CMS installed on a 3 MW wind turbine. The vibration distribution at a specific rotating speed of main shaft is approximated by the Weibull distribution and its cumulative distribution function is utilized for determining the threshold levels that indicate impending failure of mechanical parts. A Hidden Markov model (HMM) is employed to propose the statistical fault detection algorithm in the time domain and the method whereby the input sequence for HMM is extracted is also introduced by considering the threshold levels and the correlation between the signals. Finally, it was demonstrated that the proposed HMM algorithm achieved a greater than 95% detection success rate by using the long-term signals.
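As a rough illustration of how threshold levels can be read off a Weibull cumulative distribution function and turned into discrete HMM input symbols, consider the sketch below. The shape and scale values and the 90%/99% probability levels are assumptions for illustration only; the abstract does not state them.

```python
import bisect
import math


def weibull_cdf(x, shape, scale):
    """Weibull CDF: F(x) = 1 - exp(-(x / scale) ** shape) for x > 0."""
    if x <= 0:
        return 0.0
    return 1.0 - math.exp(-((x / scale) ** shape))


def weibull_threshold(p, shape, scale):
    """Inverse CDF: the vibration level not exceeded with probability p."""
    return scale * (-math.log(1.0 - p)) ** (1.0 / shape)


def encode(value, thresholds):
    """Map a vibration level to a symbol: the number of thresholds it reaches."""
    return bisect.bisect_right(thresholds, value)


# Hypothetical fitted parameters and alarm levels.
SHAPE, SCALE = 2.0, 1.5
levels = [weibull_threshold(p, SHAPE, SCALE) for p in (0.90, 0.99)]
symbols = [encode(v, levels) for v in (0.5, 2.5, 4.0)]
```

A sequence of such symbols (0 = normal, 1 = warning, 2 = alarm in this sketch) is one possible form of the discrete input sequence the abstract describes.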
Cao, Beiming; Kim, Myungjong; Mau, Ted; Wang, Jun
2017-01-01
Individuals with an impaired larynx (vocal folds) have problems in controlling their glottal vibration, producing whispered speech with extreme hoarseness. Standard automatic speech recognition using only acoustic cues is typically ineffective for whispered speech because the corresponding spectral characteristics are distorted. Articulatory cues such as the tongue and lip motion may help in recognizing whispered speech since articulatory motion patterns are generally not affected. In this paper, we investigated whispered speech recognition for patients with reconstructed larynx using articulatory movement data. A data set with both acoustic and articulatory motion data was collected from a patient with a surgically reconstructed larynx using an electromagnetic articulograph. Two speech recognition systems, Gaussian mixture model-hidden Markov model (GMM-HMM) and deep neural network-HMM (DNN-HMM), were used in the experiments. Experimental results showed that adding either tongue or lip motion data to acoustic features such as mel-frequency cepstral coefficients (MFCCs) significantly reduced the phone error rates on both speech recognition systems. Adding both tongue and lip data achieved the best performance. PMID:29423453
Incorporating advanced language models into the P300 speller using particle filtering
NASA Astrophysics Data System (ADS)
Speier, W.; Arnold, C. W.; Deshpande, A.; Knall, J.; Pouratian, N.
2015-08-01
Objective. The P300 speller is a common brain-computer interface (BCI) application designed to communicate language by detecting event related potentials in a subject’s electroencephalogram signal. Information about the structure of natural language can be valuable for BCI communication, but attempts to use this information have thus far been limited to rudimentary n-gram models. While more sophisticated language models are prevalent in natural language processing literature, current BCI analysis methods based on dynamic programming cannot handle their complexity. Approach. Sampling methods can overcome this complexity by estimating the posterior distribution without searching the entire state space of the model. In this study, we implement sequential importance resampling, a commonly used particle filtering (PF) algorithm, to integrate a probabilistic automaton language model. Main result. This method was first evaluated offline on a dataset of 15 healthy subjects, which showed significant increases in speed and accuracy when compared to standard classification methods as well as a recently published approach using a hidden Markov model (HMM). An online pilot study verified these results as the average speed and accuracy achieved using the PF method was significantly higher than that using the HMM method. Significance. These findings strongly support the integration of domain-specific knowledge into BCI classification to improve system performance.
Adolescents and Heavy Metal Music: From the Mouths of Metalheads.
ERIC Educational Resources Information Center
Arnett, Jeffrey
1991-01-01
Attitudes and characteristics of adolescents who like heavy metal music (HMM) were explored in a study of 52 adolescents (largely White males) who liked HMM and 123 who did not in suburban Atlanta (Georgia). HMM is discussed as a reflection of, rather than a cause of, adolescent alienation. (SLD)
Daily Rainfall Simulation Using Climate Variables and Nonhomogeneous Hidden Markov Model
NASA Astrophysics Data System (ADS)
Jung, J.; Kim, H. S.; Joo, H. J.; Han, D.
2017-12-01
A Markov chain is easier to handle than other methods for rainfall simulation. However, it has limitations in reflecting the seasonal variability of rainfall or changes in rainfall patterns caused by climate change. This study applied a Nonhomogeneous Hidden Markov Model (NHMM) to address these problems. The NHMM was compared with a Hidden Markov Model (HMM) to evaluate the goodness of the model. First, we chose the Gum river basin in Korea to apply the models and collected daily rainfall data from the stations. Also, the climate variables of geopotential height, temperature, zonal wind, and meridional wind data were collected from NCEP/NCAR reanalysis data to consider external factors affecting the rainfall event. We conducted a correlation analysis between rainfall and climate variables and then developed a linear regression equation using the climate variables that are highly correlated with rainfall. The monthly rainfall was obtained from the regression equation and became the input data of the NHMM. Finally, the daily rainfall was simulated by the NHMM, and we evaluated the goodness of fit and prediction capability of the NHMM by comparing them with those of the HMM. For the HMM simulation, the correlation coefficient was 0.2076 and the root mean square errors of daily/monthly rainfall were 10.8243/131.1304 mm. For the NHMM, the correlation coefficient was 0.6652 and the root mean square errors of daily/monthly rainfall were 10.5112/100.9865 mm. We could verify that the error of daily and monthly rainfall simulated by the NHMM was improved by 2.89% and 22.99% compared with the HMM. Therefore, it is expected that the results of the study could provide more accurate data for hydrologic analysis. Acknowledgements This research was supported by the Basic Science Research Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Science, ICT & Future Planning (2017R1A2B3005695)
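The reported improvement percentages can be checked directly from the RMSE values in the abstract. The sketch below assumes the improvement is the relative RMSE reduction with the HMM as baseline; the function name is illustrative.

```python
def relative_improvement(baseline_rmse, new_rmse):
    """Percentage reduction in RMSE relative to the baseline model."""
    return 100.0 * (baseline_rmse - new_rmse) / baseline_rmse


# RMSE values (mm) quoted in the abstract: HMM first, NHMM second.
daily = relative_improvement(10.8243, 10.5112)
monthly = relative_improvement(131.1304, 100.9865)
print(f"daily: {daily:.2f}%  monthly: {monthly:.2f}%")
```

Under this assumption the daily and monthly figures round to 2.89% and 22.99%, which matches the values stated in the abstract.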
Smart Sensor-Based Motion Detection System for Hand Movement Training in Open Surgery.
Sun, Xinyao; Byrns, Simon; Cheng, Irene; Zheng, Bin; Basu, Anup
2017-02-01
We introduce a smart sensor-based motion detection technique for objective measurement and assessment of surgical dexterity among users at different experience levels. The goal is to allow trainees to evaluate their performance based on a reference model shared through communication technology, e.g., the Internet, without the physical presence of an evaluating surgeon. While in the current implementation we used a Leap Motion Controller to obtain motion data for analysis, our technique can be applied to motion data captured by other smart sensors, e.g., OptiTrack. To differentiate motions captured from different participants, measurement and assessment in our approach are achieved using two strategies: (1) low level descriptive statistical analysis, and (2) Hidden Markov Model (HMM) classification. Based on our surgical knot tying task experiment, we can conclude that finger motions generated from users with different surgical dexterity, e.g., expert and novice performers, display differences in path length, number of movements and task completion time. In order to validate the discriminatory ability of HMM for classifying different movement patterns, a non-surgical task was included in our analysis. Experimental results demonstrate that our approach had 100% accuracy in discriminating between expert and novice performances. Our proposed motion analysis technique applied to open surgical procedures is a promising step towards the development of objective computer-assisted assessment and training systems.
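One common way to use HMMs as a classifier, sketched here under stated assumptions, is to train one model per class and assign a sequence to the class whose model gives it the highest likelihood. The scaled forward algorithm below is standard; the two toy models, their parameters, and the symbol coding (0 = smooth motion, 1 = abrupt motion) are hypothetical and do not come from the study.

```python
import math


def log_likelihood(obs, start, trans, emit):
    """Scaled forward algorithm: log P(obs | model) for a discrete HMM."""
    n = len(start)
    alpha = [start[s] * emit[s][obs[0]] for s in range(n)]
    log_p = 0.0
    for t in range(len(obs)):
        if t > 0:
            alpha = [
                sum(alpha[r] * trans[r][s] for r in range(n)) * emit[s][obs[t]]
                for s in range(n)
            ]
        scale = sum(alpha)
        if scale == 0.0:
            return float("-inf")  # sequence impossible under this model
        log_p += math.log(scale)
        alpha = [a / scale for a in alpha]  # rescale to avoid underflow
    return log_p


def classify(obs, models):
    """Return the name of the model with the highest log-likelihood."""
    return max(models, key=lambda name: log_likelihood(obs, *models[name]))


# Hypothetical per-class models: (start, transition, emission) probabilities.
MODELS = {
    "expert": ([0.5, 0.5], [[0.9, 0.1], [0.1, 0.9]], [[0.9, 0.1], [0.8, 0.2]]),
    "novice": ([0.5, 0.5], [[0.9, 0.1], [0.1, 0.9]], [[0.3, 0.7], [0.2, 0.8]]),
}
label = classify([0, 0, 0, 1, 0, 0], MODELS)
```

In practice the per-class parameters would be estimated from training sequences (for example with Baum-Welch) rather than written by hand.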
Query-seeded iterative sequence similarity searching improves selectivity 5–20-fold
Li, Weizhong; Lopez, Rodrigo
2017-01-01
Abstract Iterative similarity search programs, like psiblast, jackhmmer, and psisearch, are much more sensitive than pairwise similarity search methods like blast and ssearch because they build a position specific scoring model (a PSSM or HMM) that captures the pattern of sequence conservation characteristic to a protein family. But models are subject to contamination; once an unrelated sequence has been added to the model, homologs of the unrelated sequence will also produce high scores, and the model can diverge from the original protein family. Examination of alignment errors during psiblast PSSM contamination suggested a simple strategy for dramatically reducing PSSM contamination. psiblast PSSMs are built from the query-based multiple sequence alignment (MSA) implied by the pairwise alignments between the query model (PSSM, HMM) and the subject sequences in the library. When the original query sequence residues are inserted into gapped positions in the aligned subject sequence, the resulting PSSM rarely produces alignment over-extensions or alignments to unrelated sequences. This simple step, which tends to anchor the PSSM to the original query sequence and slightly increase target percent identity, can reduce the frequency of false-positive alignments more than 20-fold compared with psiblast and jackhmmer, with little loss in search sensitivity. PMID:27923999
Social networks, cooperative breeding, and the human milk microbiome.
Meehan, Courtney L; Lackey, Kimberly A; Hagen, Edward H; Williams, Janet E; Roulette, Jennifer; Helfrecht, Courtney; McGuire, Mark A; McGuire, Michelle K
2018-04-26
We present the first available data on the human milk microbiome (HMM) from small-scale societies (hunter-gatherers and horticulturalists in the Central African Republic [CAR]) and explore relationships among subsistence type and seasonality on HMM diversity and composition. Additionally, as humans are cooperative breeders and, throughout our evolutionary history and today, we rear offspring within social networks, we examine associations between the social environment and the HMM. Childrearing and breastfeeding exist in a biosocial nexus, which we hypothesize influences the HMM. Milk samples from hunter-gatherer and horticultural mothers (n = 41) collected over two seasons, were analyzed for their microbial composition. A subsample of these women's infants (n = 33) also participated in detailed naturalistic behavioral observations which identified the breadth of infants' social and caregiving networks and the frequency of contact they had with caregivers. Analyses of milk produced by CAR women indicated that HMM diversity and community composition were related to the size of the mother-infant dyad's social network and frequency of care that infants receive. The abundance of some microbial taxa also varied significantly across populations and seasons. Alpha diversity, however, was not related to subsistence type or seasonality. While the origins of the HMM are not fully understood, our results provide evidence regarding possible feedback loops among the infant, the mother, and the mother's social network that might influence HMM composition. © 2018 Wiley Periodicals, Inc.
Chen, Wenxi; Kitazawa, Masumi; Togawa, Tatsuo
2009-09-01
This paper proposes a method to estimate a woman's menstrual cycle based on the hidden Markov model (HMM). A tiny device was developed that attaches around the abdominal region to measure cutaneous temperature at 10-min intervals during sleep. The measured temperature data were encoded as a two-dimensional image (QR code, i.e., quick response code) and displayed in the LCD window of the device. A mobile phone captured the QR code image, decoded the information and transmitted the data to a database server. The collected data were analyzed by three steps to estimate the biphasic temperature property in a menstrual cycle. The key step was an HMM-based step between preprocessing and postprocessing. A discrete Markov model, with two hidden phases, was assumed to represent higher- and lower-temperature phases during a menstrual cycle. The proposed method was verified by the data collected from 30 female participants, aged from 14 to 46, over six consecutive months. By comparing the estimated results with individual records from the participants, 71.6% of 190 menstrual cycles were correctly estimated. The sensitivity and positive predictability were 91.8 and 96.6%, respectively. This objective evaluation provides a promising approach for managing premenstrual syndrome and birth control.
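To make the two-phase idea concrete, the following is a minimal Viterbi decoding sketch for a two-state discrete HMM. It assumes nightly temperatures have already been binarized (0 = below a personal baseline, 1 = above); the probabilities and the observation sequence are invented for illustration and are not the authors' values.

```python
import math


def viterbi(obs, start, trans, emit):
    """Most likely hidden state path for a discrete HMM (log domain)."""
    n = len(start)
    score = [math.log(start[s]) + math.log(emit[s][obs[0]]) for s in range(n)]
    back = []
    for o in obs[1:]:
        prev = score
        pointers, score = [], []
        for s in range(n):
            best = max(range(n), key=lambda r: prev[r] + math.log(trans[r][s]))
            pointers.append(best)
            score.append(prev[best] + math.log(trans[best][s]) + math.log(emit[s][o]))
        back.append(pointers)
    state = max(range(n), key=lambda s: score[s])
    path = [state]
    for pointers in reversed(back):  # follow back-pointers to the start
        state = pointers[state]
        path.append(state)
    return path[::-1]


# Hypothetical parameters: state 0 = lower-temperature phase, 1 = higher.
START = [0.5, 0.5]
TRANS = [[0.9, 0.1], [0.1, 0.9]]  # phases tend to persist night to night
EMIT = [[0.8, 0.2], [0.2, 0.8]]   # noisy link between phase and reading
readings = [0, 0, 1, 0, 0, 1, 1, 1, 0, 1, 1]
phases = viterbi(readings, START, TRANS, EMIT)
```

With these assumed parameters the isolated outlier readings are smoothed away and a single phase change is recovered, which is the kind of biphasic pattern the abstract describes.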
HMM-based lexicon-driven and lexicon-free word recognition for online handwritten Indic scripts.
Bharath, A; Madhvanath, Sriganesh
2012-04-01
Research for recognizing online handwritten words in Indic scripts is at its early stages when compared to Latin and Oriental scripts. In this paper, we address this problem specifically for two major Indic scripts--Devanagari and Tamil. In contrast to previous approaches, the techniques we propose are largely data driven and script independent. We propose two different techniques for word recognition based on Hidden Markov Models (HMM): lexicon driven and lexicon free. The lexicon-driven technique models each word in the lexicon as a sequence of symbol HMMs according to a standard symbol writing order derived from the phonetic representation. The lexicon-free technique uses a novel Bag-of-Symbols representation of the handwritten word that is independent of symbol order and allows rapid pruning of the lexicon. On handwritten Devanagari word samples featuring both standard and nonstandard symbol writing orders, a combination of lexicon-driven and lexicon-free recognizers significantly outperforms either of them used in isolation. In contrast, most Tamil word samples feature the standard symbol order, and the lexicon-driven recognizer outperforms the lexicon free one as well as their combination. The best recognition accuracies obtained for 20,000 word lexicons are 87.13 percent for Devanagari when the two recognizers are combined, and 91.8 percent for Tamil using the lexicon-driven technique.
Long-range propagation of plasmon and phonon polaritons in hyperbolic-metamaterial waveguides
NASA Astrophysics Data System (ADS)
Babicheva, Viktoriia E.
2017-12-01
We study photonic multilayer waveguides that include layers of materials and metamaterials with a hyperbolic dispersion (HMM). We consider the long-range propagation of plasmon and phonon polaritons at the dielectric-HMM interface in different waveguide geometries (single boundary or different layers of symmetric cladding). In contrast to the traditional analysis of geometrical parameters, we make an emphasis on the optical properties of constituent materials: solving dispersion equations, we analyze how dielectric and HMM permittivities affect propagation length and mode size of waveguide eigenmodes. We derive figures of merit that should be used for each waveguide in a broad range of permittivity values as well as compare them with plasmonic waveguides. We show that the conventional plasmonic quality factor, which is the ratio of real to imaginary parts of permittivity, is not applicable to the case of waveguides with complex structure. Both telecommunication wavelengths and mid-infrared spectral ranges are of interest considering recent advances in van der Waals materials, such as hexagonal boron nitride. We evaluate the performance of the waveguides with hexagonal boron nitride in the range where it possesses hyperbolic dispersion (wavelength 6.3-7.3 μm), and we show that these waveguides with natural hyperbolic properties have higher propagation lengths than metal-based HMM waveguides.
Golla, Gowtham Kumar; Carlson, Jordan A; Huan, Jun; Kerr, Jacqueline; Mitchell, Tarrah; Borner, Kelsey
2016-10-01
Sedentary behavior of youth is an important determinant of health. However, better measures are needed to improve understanding of this relationship and the mechanisms at play, as well as to evaluate health promotion interventions. Wearable accelerometers are considered the standard for assessing physical activity in research, but do not perform well for assessing posture (i.e., sitting vs. standing), a critical component of sedentary behavior. The machine learning algorithms that we propose for assessing sedentary behavior will allow us to re-examine existing accelerometer data to better understand the association between sedentary time and health in various populations. We collected two datasets, a laboratory-controlled dataset and a free-living dataset. We trained machine learning classifiers separately on each dataset and compared performance across datasets. The classifiers predict five postures: sit, stand, sit-stand, stand-sit, and stand/walk. We compared a manually constructed Hidden Markov model (HMM) with an automated HMM from existing software. The manually constructed HMM achieved a higher macro-F1 score on both datasets.
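For readers unfamiliar with the macro-averaged F1 score used to compare the two HMMs, a small sketch follows. It computes the per-class F1 and takes the unweighted mean over classes; the label sequences are made up for illustration and use only two of the five posture classes.

```python
def macro_f1(y_true, y_pred, labels):
    """Unweighted mean of per-class F1 scores."""
    scores = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never appears.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Hypothetical ground truth and predictions.
truth = ["sit", "sit", "stand", "stand", "sit"]
predicted = ["sit", "stand", "stand", "stand", "sit"]
score = macro_f1(truth, predicted, labels=["sit", "stand"])
```

Because every class contributes equally to the average, macro-F1 does not let a frequent posture such as sitting dominate the score.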
Daniels, Noah M; Hosur, Raghavendra; Berger, Bonnie; Cowen, Lenore J
2012-05-01
One of the most successful methods to date for recognizing protein sequences that are evolutionarily related has been profile hidden Markov models (HMMs). However, these models do not capture pairwise statistical preferences of residues that are hydrogen bonded in beta sheets. These dependencies have been partially captured in the HMM setting by simulated evolution in the training phase and can be fully captured by Markov random fields (MRFs). However, the MRFs can be computationally prohibitive when beta strands are interleaved in complex topologies. We introduce SMURFLite, a method that combines both simplified MRFs and simulated evolution to substantially improve remote homology detection for beta structures. Unlike previous MRF-based methods, SMURFLite is computationally feasible on any beta-structural motif. We test SMURFLite on all propeller and barrel folds in the mainly-beta class of the SCOP hierarchy in stringent cross-validation experiments. We show a mean 26% (median 16%) improvement in area under curve (AUC) for beta-structural motif recognition as compared with HMMER (a well-known HMM method) and a mean 33% (median 19%) improvement as compared with RAPTOR (a well-known threading method) and even a mean 18% (median 10%) improvement in AUC over HHpred (a profile-profile HMM method), despite HHpred's use of extensive additional training data. We demonstrate SMURFLite's ability to scale to whole genomes by running a SMURFLite library of 207 beta-structural SCOP superfamilies against the entire genome of Thermotoga maritima, and make over 100 new fold predictions. Availability and implementation: A webserver that runs SMURFLite is available at: http://smurf.cs.tufts.edu/smurflite/
Zoledronic acid overcomes chemoresistance and immunosuppression of malignant mesothelioma
Kopecka, Joanna; Gazzano, Elena; Sara, Orecchia; Ghigo, Dario; Riganti, Chiara
2015-01-01
The human malignant mesothelioma (HMM) is characterized by a chemoresistant and immunosuppressive phenotype. An effective strategy to restore chemosensitivity and immune reactivity against HMM is lacking. We investigated whether the use of zoledronic acid is an effective chemo-immunosensitizing strategy. We compared primary HMM samples with non-transformed mesothelial cells. HMM cells had higher rate of cholesterol and isoprenoid synthesis, constitutive activation of Ras/extracellular signal-regulated kinase1/2 (ERK1/2)/hypoxia inducible factor-1α (HIF-1α) pathway and up-regulation of the drug efflux transporter P-glycoprotein (Pgp). By decreasing the isoprenoid supply, zoledronic acid down-regulated the Ras/ERK1/2/HIF-1α/Pgp axis and chemosensitized the HMM cells to Pgp substrates. The HMM cells also produced higher amounts of kynurenine, decreased the proliferation of T-lymphocytes and expanded the number of T-regulatory (Treg) cells. Kynurenine synthesis was due to the transcription of the indoleamine 1,2 dioxygenase (IDO) enzyme, consequent to the activation of the signal transducer and activator of transcription-3 (STAT3). By reducing the activity of the Ras/ERK1/2/STAT3/IDO axis, zoledronic acid lowered the kynurenine synthesis and the expansion of Treg cells, and increased the proliferation of T-lymphocytes. Thanks to its ability to decrease Ras/ERK1/2 activity, which is responsible for both Pgp-mediated chemoresistance and IDO-mediated immunosuppression, zoledronic acid is an effective chemo-immunosensitizing agent in HMM cells. PMID:25544757
NASA Astrophysics Data System (ADS)
Juesas, P.; Ramasso, E.
2016-12-01
Condition monitoring aims at ensuring system safety, which is a fundamental requirement for industrial applications and has become an inescapable social demand. This objective is attained by instrumenting the system and developing data analytics methods such as statistical models able to turn data into relevant knowledge. One difficulty is to correctly estimate the parameters of those methods based on time-series data. This paper suggests the use of the Weighted Distribution Theory together with the Expectation-Maximization algorithm to improve parameter estimation in statistical models with latent variables, with an application to health monitoring under uncertainty. The improvement of estimates is made possible by incorporating uncertain and possibly noisy prior knowledge on latent variables in a sound manner. The latent variables are exploited to build a degradation model of a dynamical system represented as a sequence of discrete states. Examples on Gaussian Mixture Models and Hidden Markov Models (HMM) with discrete and continuous outputs are presented on both simulated data and benchmarks using the turbofan engine datasets. A focus on the application of a discrete HMM to health monitoring under uncertainty highlights the interest of the proposed approach in the presence of different operating conditions and fault modes. It is shown that the proposed model is highly robust in the presence of noisy and uncertain priors.
Performance enhancement for audio-visual speaker identification using dynamic facial muscle model.
Asadpour, Vahid; Towhidkhah, Farzad; Homayounpour, Mohammad Mehdi
2006-10-01
The science of human identification using physiological characteristics, or biometry, has been of great concern in security systems. However, robust multimodal identification systems based on audio-visual information have not been thoroughly investigated yet. Therefore, the aim of this work is to propose a model-based feature extraction method which employs physiological characteristics of facial muscles producing lip movements. This approach adopts the intrinsic properties of muscles such as viscosity, elasticity, and mass which are extracted from the dynamic lip model. These parameters are exclusively dependent on the neuro-muscular properties of the speaker; consequently, imitation of valid speakers could be reduced to a large extent. These parameters are applied to a hidden Markov model (HMM) audio-visual identification system. In this work, a combination of audio and video features has been employed by adopting a multistream pseudo-synchronized HMM training method. Noise robust audio features such as Mel-frequency cepstral coefficients (MFCC), spectral subtraction (SS), and relative spectra perceptual linear prediction (J-RASTA-PLP) have been used to evaluate the performance of the multimodal system once efficient audio feature extraction methods have been utilized. The superior performance of the proposed system is demonstrated on a large multispeaker database of continuously spoken digits, along with a sentence that is phonetically rich. To evaluate the robustness of algorithms, some experiments were performed on genetically identical twins. Furthermore, changes in speaker voice were simulated with drug inhalation tests. At a 3 dB signal to noise ratio (SNR), the dynamic muscle model improved the identification rate of the audio-visual system from 91 to 98%. 
Results on identical twins revealed that there was an apparent improvement on the performance for the dynamic muscle model-based system, in which the identification rate of the audio-visual system was enhanced from 87 to 96%.
Time-lapse microscopy and image processing for stem cell research: modeling cell migration
NASA Astrophysics Data System (ADS)
Gustavsson, Tomas; Althoff, Karin; Degerman, Johan; Olsson, Torsten; Thoreson, Ann-Catrin; Thorlin, Thorleif; Eriksson, Peter
2003-05-01
This paper presents hardware and software procedures for automated cell tracking and migration modeling. A time-lapse microscopy system equipped with a computer controllable motorized stage was developed. The performance of this stage was improved by incorporating software algorithms for stage motion displacement compensation and auto focus. The microscope is suitable for in-vitro stem cell studies and allows for multiple cell culture image sequence acquisition. This enables comparative studies concerning rate of cell splits, average cell motion velocity, cell motion as a function of cell sample density and many more. Several cell segmentation procedures are described as well as a cell tracking algorithm. Statistical methods for describing cell migration patterns are presented. In particular, the Hidden Markov Model (HMM) was investigated. Results indicate that if the cell motion can be described as a non-stationary stochastic process, then the HMM can adequately model aspects of its dynamic behavior.
Rosen, J; Solazzo, M; Hannaford, B; Sinanan, M
2001-01-01
Laparoscopic surgical skills evaluation of surgery residents is usually a subjective process, carried out in the operating room by senior surgeons. By its nature, this process is performed using fuzzy criteria. The objective of the current study was to develop and assess an objective laparoscopic surgical skill scale using Hidden Markov Models (HMM) based on haptic information, tool/tissue interactions and visual task decomposition. Eight subjects (six surgical trainees: first year surgical residents 2 x R1, third year surgical residents 2 x R3, fifth year surgical residents 2 x R5; and two expert laparoscopic surgeons: 2 x ES) performed laparoscopic cholecystectomy following a specific 7-step protocol on a pig. An instrumented laparoscopic grasper equipped with a three-axis force/torque sensor located at the proximal end, with an additional force sensor located on the handle, was used to measure the forces and torques. The hand/tool interface force/torque data was synchronized with a video of the tool operative maneuvers. A synthesis of frame-by-frame video analysis was used to define 14 different types of tool/tissue interactions, each one associated with unique force/torque (F/T) signatures. HMMs were developed for each subject representing the surgical skills by defining the various tool/tissue interactions as states and the associated F/T signatures as observations. The statistical distance between the HMMs representing residents at different levels of their training and the HMMs of expert surgeons was calculated in order to generate a learning curve of selected steps during laparoscopic cholecystectomy. Comparison of HMMs between groups showed significant differences between all skill levels, supporting the objective definition of a learning curve. 
The major differences between skill levels were: (i) magnitudes of F/T applied, (ii) types of tool/tissue interactions used and the transitions between them, and (iii) time intervals spent in each tool/tissue interaction and the overall completion time. The objective HMM analysis showed that the greatest difference in performance was between R1 and R3 groups and then decreased as the level of expertise increased, suggesting that significant laparoscopic surgical capability develops between the first and the third years of their residency training. The power of the methodology using HMM for objective surgical skill assessment arises from the fact that it compiles an enormous amount of data regarding different aspects of surgical skill into a very compact model that can be translated into a single number representing the distance from expert performance. Moreover, the methodology is not limited to in-vivo conditions as demonstrated in the current study. It can be extended to other modalities such as measuring performance in surgical simulators and robotic systems.
NASA Astrophysics Data System (ADS)
Biswas, Subir; Quwaider, Muhannad
2008-04-01
The physical safety and well-being of the soldiers in a battlefield is the highest priority of Incident Commanders. Currently, the ability to track and monitor soldiers relies on visual and verbal communication, which can be somewhat limited in scenarios where the soldiers are deployed inside buildings and enclosed areas that are out of visual range of the commanders. Also, the need for stealth can often prevent a soldier in combat from sending verbal cues to a commander about his or her physical well-being. Sensor technologies can remotely provide various data about the soldiers including physiological monitoring and personal alert safety system functionality. This paper presents a networked sensing solution in which a body area wireless network of multi-modal sensors can monitor the body movement and other physiological parameters for statistical identification of a soldier's body posture, which can then be indicative of the physical conditions and safety alerts of the soldier in question. The specific concept is to leverage on-body proximity sensing and a Hidden Markov Model (HMM) based mechanism that can be applied for stochastic identification of human body postures using a wearable sensor network. The key idea is to collect relative proximity information between wireless sensors that are strategically placed over a subject's body to monitor the relative movements of the body segments, and then to process that using HMM in order to identify the subject's body postures. The key novelty of this approach is a departure from the traditional accelerometry based approaches in which the individual body segment movements, rather than their relative proximity, are used for activity monitoring and posture detection. 
Through experiments with body mounted sensors we demonstrate that while the accelerometry based approaches can be used for differentiating activity intensive postures such as walking and running, they are not very effective for identification and differentiation between low activity postures such as sitting and standing. We develop a wearable sensor network that monitors relative proximity using Radio Signal Strength indication (RSSI), and then construct an HMM system for posture identification in the presence of sensing errors. Controlled experiments using human subjects were carried out for evaluating the accuracy of the HMM identified postures compared to a naïve threshold based mechanism, and its variations over different human subjects. A large spectrum of target human postures, including lie down, sit (straight and reclined), stand, walk, run, sprint and stair climbing, are used for validating the proposed system.
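The HMM-based posture identification described above can be illustrated with a minimal Viterbi decoding sketch. This is not the authors' system: the two postures, the quantized "near"/"far" proximity observations, and every probability below are hypothetical values chosen only to show how a most-likely posture sequence is recovered from noisy proximity readings.

```python
import math

def viterbi(obs, states, start_p, trans_p, emit_p):
    """Return the most likely hidden-state sequence for discrete observations."""
    # Log probabilities avoid numerical underflow on long sequences.
    score = {s: math.log(start_p[s]) + math.log(emit_p[s][obs[0]]) for s in states}
    back = []  # back[t][s] = best predecessor of state s at step t + 1
    for o in obs[1:]:
        prev = score
        score, pointers = {}, {}
        for s in states:
            best = max(states, key=lambda r: prev[r] + math.log(trans_p[r][s]))
            score[s] = prev[best] + math.log(trans_p[best][s]) + math.log(emit_p[s][o])
            pointers[s] = best
        back.append(pointers)
    # Trace the best path backwards from the best final state.
    path = [max(states, key=lambda s: score[s])]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    return path[::-1]

# Hypothetical two-posture model; all numbers are illustrative assumptions.
states = ["sit", "stand"]
start_p = {"sit": 0.5, "stand": 0.5}
trans_p = {"sit": {"sit": 0.9, "stand": 0.1},
           "stand": {"sit": 0.1, "stand": 0.9}}
# Observations: quantized proximity between two on-body sensors.
emit_p = {"sit": {"near": 0.8, "far": 0.2},
          "stand": {"near": 0.3, "far": 0.7}}

observations = ["near", "near", "near", "far", "far", "far", "far"]
postures = viterbi(observations, states, start_p, trans_p, emit_p)
print(postures)
```

The "sticky" transition matrix is what lets the decoder tolerate isolated sensing errors, which a per-sample threshold cannot do.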
Construction of language models for an handwritten mail reading system
NASA Astrophysics Data System (ADS)
Morillot, Olivier; Likforman-Sulem, Laurence; Grosicki, Emmanuèle
2012-01-01
This paper presents a system for the recognition of unconstrained handwritten mails. The main part of this system is an HMM recognizer which uses trigraphs to model contextual information. This recognition system does not require any segmentation into words or characters and works directly at line level. To take into account linguistic information and enhance performance, a language model is introduced. This language model is based on bigrams and built from training document transcriptions only. Different experiments with various vocabulary sizes and language models have been conducted. Word Error Rate and Perplexity values are compared to show the benefit of specific language models suited to the handwritten mail recognition task.
Polur, Prasad D; Miller, Gerald E
2005-01-01
Computer speech recognition of individuals with dysarthria, such as cerebral palsy patients, requires a robust technique that can handle conditions of very high variability and limited training data. In this study, a hidden Markov model (HMM) was constructed and conditions investigated that would provide improved performance for a dysarthric speech (isolated word) recognition system intended to act as an assistive/control tool. In particular, we investigated the effect of high-frequency spectral components on the recognition rate of the system to determine if they contributed useful additional information to the system. A small-size vocabulary spoken by three cerebral palsy subjects was chosen. Mel-frequency cepstral coefficients extracted with the use of 15 ms frames served as training input to an ergodic HMM setup. Subsequent results demonstrated that no significant useful information was available to the system for enhancing its ability to discriminate dysarthric speech above 5.5 kHz in the current set of dysarthric data. The level of variability in input dysarthric speech patterns limits the reliability of the system. However, its application as a rehabilitation/control tool to assist dysarthric motor-impaired individuals such as cerebral palsy subjects holds sufficient promise.
Koua, Dominique; Kuhn-Nentwig, Lucia
2017-01-01
Spider venoms are rich cocktails of bioactive peptides, proteins, and enzymes that have been intensively investigated over the years. In order to provide a better comprehension of that richness, we propose a three-level family classification system for spider venom components. This classification is supported by an exhaustive set of 219 new profile hidden Markov models (HMMs) able to attribute a given peptide to its precise peptide type, family, and group. The proposed classification has the advantages of being totally independent from variable spider taxonomic names and can easily evolve. In addition to the new classifiers, we introduce and demonstrate the efficiency of hmmcompete, a new standalone tool that monitors HMM-based family classification and, after post-processing the result, reports the best classifier when multiple models produce significant scores towards given peptide queries. The combined use of hmmcompete and the new spider venom component-specific classifiers demonstrated 96% sensitivity to properly classify all known spider toxins from the UniProtKB database. These tools are timely regarding the important classification needs caused by the increasing number of peptides and proteins generated by transcriptomic projects. PMID:28786958
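The idea of reporting the best classifier when several profile HMMs score significantly against one query can be sketched as follows. This is a generic illustration, not the hmmcompete implementation: the function name, the tuple layout, the E-value cutoff, and the example hits are all assumptions made for the sketch.

```python
def best_classifier(hits, max_evalue=1e-3):
    """Pick the highest-scoring significant model for each query.

    hits: iterable of (query_id, model_name, bit_score, e_value) tuples.
    The 1e-3 cutoff is an arbitrary illustrative threshold.
    """
    best = {}
    for query, model, score, evalue in hits:
        if evalue > max_evalue:
            continue  # skip hits that are not significant
        if query not in best or score > best[query][1]:
            best[query] = (model, score)
    return {query: model for query, (model, _) in best.items()}

# Hypothetical search results for two peptide queries.
hits = [
    ("pep1", "familyA", 55.2, 1e-12),
    ("pep1", "familyB", 71.9, 1e-18),
    ("pep2", "familyA", 12.0, 0.5),    # not significant, ignored
    ("pep2", "familyC", 40.3, 1e-8),
]
assignments = best_classifier(hits)
print(assignments)
```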
Hidden Markov Models and Neural Networks for Fault Detection in Dynamic Systems
NASA Technical Reports Server (NTRS)
Smyth, Padhraic
1994-01-01
None given. (From conclusion): Neural networks plus Hidden Markov Models (HMM) can provide excellent detection and false alarm rate performance in fault detection applications. Modified models allow for novelty detection. Also covers some key contributions of the neural network model, and application status.
Linear Arrangement of Motor Protein on a Mechanically Deposited Fluoropolymer Thin Film
NASA Astrophysics Data System (ADS)
Suzuki, Hitoshi; Oiwa, Kazuhiro; Yamada, Akira; Sakakibara, Hitoshi; Nakayama, Haruto; Mashiko, Shinro
1995-07-01
Motor protein molecules such as heavy meromyosin (HMM), one of the major components of skeletal muscle, were arranged linearly on a mechanically deposited fluoropolymer thin film substrate in order to regulate the direction of movement generated by the motor protein. The fluoropolymer film consisted of many linear parallel ridges whose heights and widths were 10 to 20 nm and 10 to 100 nm, respectively. The fluoropolymer ridges adsorbed HMM molecules that were applied onto the film. Actin filaments labeled with rhodamine-phalloidin were observed under a fluorescence microscope moving linearly on the HMM-coated ridges. The observation indicates that HMM molecules were aligned on the fluoropolymer ridges while retaining their function. The velocity of actin movement was measured in this system.
2012-09-01
regulated by miR-99a/let7c/125b-2 cluster. Using bioinformatic prediction algorithm TargetScan, we identified 7 genes that are commonly targeted by miR-99a...HPeak, a Hidden Markov Model (HMM)-based peak identifying algorithm (http://www.sph.umich.edu/csg/qin/HPeak/). Seven AR binding sites were reported by...and ARBS2 by ALGGEN- PROMO, a matrix algorithm for predicting transcription factor binding sites based on TRANSFAC (http://alggen.lsi.upc.es/cgi- bin
Physical Human Activity Recognition Using Wearable Sensors.
Attal, Ferhat; Mohammed, Samer; Dedabrishvili, Mariam; Chamroukhi, Faicel; Oukhellou, Latifa; Amirat, Yacine
2015-12-11
This paper presents a review of different classification techniques used to recognize human activities from wearable inertial sensor data. Three inertial sensor units were used in this study and were worn by healthy subjects at key points of upper/lower body limbs (chest, right thigh and left ankle). Three main steps describe the activity recognition process: sensors' placement, data pre-processing and data classification. Four supervised classification techniques namely, k-Nearest Neighbor (k-NN), Support Vector Machines (SVM), Gaussian Mixture Models (GMM), and Random Forest (RF) as well as three unsupervised classification techniques namely, k-Means, Gaussian mixture models (GMM) and Hidden Markov Model (HMM), are compared in terms of correct classification rate, F-measure, recall, precision, and specificity. Raw data and extracted features are used separately as inputs of each classifier. The feature selection is performed using a wrapper approach based on the RF algorithm. Based on our experiments, the results obtained show that the k-NN classifier provides the best performance compared to other supervised classification algorithms, whereas the HMM classifier is the one that gives the best results among unsupervised classification algorithms. This comparison highlights which approach gives better performance in both supervised and unsupervised contexts. It should be noted that the obtained results are limited to the context of this study, which concerns the classification of the main daily living human activities using three wearable accelerometers placed at the chest, right shank and left ankle of the subject.
Physical Human Activity Recognition Using Wearable Sensors
Attal, Ferhat; Mohammed, Samer; Dedabrishvili, Mariam; Chamroukhi, Faicel; Oukhellou, Latifa; Amirat, Yacine
2015-01-01
This paper presents a review of different classification techniques used to recognize human activities from wearable inertial sensor data. Three inertial sensor units were used in this study and were worn by healthy subjects at key points of upper/lower body limbs (chest, right thigh and left ankle). Three main steps describe the activity recognition process: sensors’ placement, data pre-processing and data classification. Four supervised classification techniques namely, k-Nearest Neighbor (k-NN), Support Vector Machines (SVM), Gaussian Mixture Models (GMM), and Random Forest (RF) as well as three unsupervised classification techniques namely, k-Means, Gaussian mixture models (GMM) and Hidden Markov Model (HMM), are compared in terms of correct classification rate, F-measure, recall, precision, and specificity. Raw data and extracted features are used separately as inputs of each classifier. The feature selection is performed using a wrapper approach based on the RF algorithm. Based on our experiments, the results obtained show that the k-NN classifier provides the best performance compared to other supervised classification algorithms, whereas the HMM classifier is the one that gives the best results among unsupervised classification algorithms. This comparison highlights which approach gives better performance in both supervised and unsupervised contexts. It should be noted that the obtained results are limited to the context of this study, which concerns the classification of the main daily living human activities using three wearable accelerometers placed at the chest, right shank and left ankle of the subject. PMID:26690450
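The evaluation criteria named in this abstract (correct classification rate, precision, recall, F-measure, specificity) can be computed from predicted and true labels as in the sketch below. The labels are invented for illustration and the one-vs-rest treatment of a single "positive" activity is an assumption, not the study's evaluation protocol.

```python
def one_vs_rest_metrics(y_true, y_pred, positive):
    """Compute standard metrics treating `positive` as the class of interest."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    specificity = tn / (tn + fp) if tn + fp else 0.0
    f_measure = (2 * precision * recall / (precision + recall)
                 if precision + recall else 0.0)
    # Correct classification rate over all classes, not just `positive`.
    rate = sum(1 for t, p in zip(y_true, y_pred) if t == p) / len(y_true)
    return {"precision": precision, "recall": recall,
            "specificity": specificity, "f_measure": f_measure,
            "correct_rate": rate}

# Hypothetical activity labels, for illustration only.
y_true = ["walk", "walk", "sit", "sit", "walk", "sit", "stand", "stand"]
y_pred = ["walk", "sit", "sit", "sit", "walk", "walk", "stand", "stand"]
metrics = one_vs_rest_metrics(y_true, y_pred, positive="walk")
print(metrics)
```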
Multi-category micro-milling tool wear monitoring with continuous hidden Markov models
NASA Astrophysics Data System (ADS)
Zhu, Kunpeng; Wong, Yoke San; Hong, Geok Soon
2009-02-01
In-process monitoring of tool conditions is important in micro-machining due to the high precision requirement and high tool wear rate. Tool condition monitoring in micro-machining poses new challenges compared to conventional machining. In this paper, a multi-category classification approach is proposed for tool flank wear state identification in micro-milling. Continuous Hidden Markov models (HMMs) are adapted for modeling of the tool wear process in micro-milling, and estimation of the tool wear state given the cutting force features. For a noise-robust approach, the HMM outputs are connected via a medium filter to minimize the tool state before entry into the next state due to high noise level. A detailed study on the selection of HMM structures for tool condition monitoring (TCM) is presented. Case studies on the tool state estimation in the micro-milling of pure copper and steel demonstrate the effectiveness and potential of these methods.
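The noise-robust step above can be pictured as smoothing the sequence of estimated wear states. The sketch below assumes that the "medium filter" in the abstract refers to a sliding median filter, which the abstract does not confirm; the integer state coding (0, 1, 2 for increasing wear) and the window length are likewise illustrative assumptions.

```python
from statistics import median_low

def median_filter(states, window=3):
    """Smooth an integer-coded state sequence with a sliding median.

    median_low always returns an element of the window, so the output
    stays a valid state code even for the shorter windows at the edges.
    """
    half = window // 2
    smoothed = []
    for i in range(len(states)):
        lo = max(0, i - half)
        hi = min(len(states), i + half + 1)
        smoothed.append(median_low(states[lo:hi]))
    return smoothed

# Hypothetical raw HMM state estimates with spurious early jumps.
raw = [0, 0, 1, 0, 0, 1, 1, 2, 1, 1, 2, 2, 2]
smooth = median_filter(raw, window=3)
print(smooth)
```

With this filter, isolated one-sample jumps to the next wear state are suppressed, so a state change is reported only once it persists.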
Plasmonic Lithography Utilizing Epsilon Near Zero Hyperbolic Metamaterial.
Chen, Xi; Zhang, Cheng; Yang, Fan; Liang, Gaofeng; Li, Qiaochu; Guo, L Jay
2017-10-24
In this work, a special hyperbolic metamaterial (HMM) is investigated for plasmonic lithography of period reduction patterns. It is a type II HMM (ϵ ∥ < 0 and ϵ ⊥ > 0) whose tangential component of the permittivity ϵ ∥ is close to zero. Due to the high anisotropy of the type II epsilon-near-zero (ENZ) HMM, only one plasmonic mode can propagate horizontally with low loss in a waveguide system with ENZ HMM as its core. This work takes advantage of a type II ENZ HMM composed of aluminum/aluminum oxide films and the associated unusual mode to expose a photoresist layer in a specially designed lithography system. Periodic patterns with a half pitch of 58.3 nm were achieved due to the interference of third-order diffracted light of the grating. The lines were 1/6 of the mask with a period of 700 nm and ∼1/7 of the wavelength of the incident light. Moreover, the theoretical analyses performed are widely applicable to structures made of different materials such as silver as well as systems working at deep ultraviolet wavelengths including 193, 248, and 365 nm.
Regime switching model for financial data: Empirical risk analysis
NASA Astrophysics Data System (ADS)
Salhi, Khaled; Deaconu, Madalina; Lejay, Antoine; Champagnat, Nicolas; Navet, Nicolas
2016-11-01
This paper constructs a regime switching model for univariate Value-at-Risk (VaR) estimation. Extreme value theory (EVT) and hidden Markov models (HMM) are combined to estimate a hybrid model that takes volatility clustering into account. In the first stage, HMM is used to classify data into crisis and steady periods, while in the second stage, EVT is applied to the previously classified data to eliminate the delay between regime switches and their detection. This new model is applied to prices of numerous stocks exchanged on NYSE Euronext Paris over the period 2001-2011. We focus on daily returns for which calibration has to be done on a small dataset. The relative performance of the regime switching model is benchmarked against other well-known modeling techniques, such as stable, power laws and GARCH models. The empirical results show that the regime switching model increases predictive performance of financial forecasting according to the number of violations and tail-loss tests. This suggests that the regime switching model is a robust forecasting variant of the power law model while remaining practical to implement for VaR measurement.
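The "number of violations" criterion mentioned above can be illustrated with a generic backtest count: a violation is a day on which the realized loss exceeds the VaR forecast. This is a sketch under stated assumptions, not the paper's test; the return series, the constant VaR forecast, and the function names are hypothetical.

```python
def var_violations(returns, var_forecasts):
    """Count days on which the realized loss exceeded the VaR forecast.

    returns: daily returns (negative values are losses).
    var_forecasts: VaR for the same days, expressed as positive losses.
    """
    return sum(1 for r, v in zip(returns, var_forecasts) if -r > v)

def violation_rate(returns, var_forecasts):
    """Observed violation frequency, to compare with the nominal VaR level."""
    return var_violations(returns, var_forecasts) / len(returns)

# Hypothetical daily returns and a flat 2% VaR forecast.
returns = [0.01, -0.03, 0.002, -0.012, -0.05, 0.004, -0.001, 0.02, -0.025, 0.003]
var_forecasts = [0.02] * len(returns)

count = var_violations(returns, var_forecasts)
rate = violation_rate(returns, var_forecasts)
print(count, rate)
```

A well-calibrated 95% VaR should produce a violation rate close to 5% over a long sample; a much higher rate indicates that risk is underestimated.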
Diagnosis of the OCD Patients using Drawing Features of the Bender Gestalt Shapes
Boostani, R.; Asadi, F.; Mohammadi, N.
2017-01-01
Background: Since psychological tests such as questionnaires or drawing tests are largely qualitative, their results carry a degree of uncertainty and sometimes subjectivity. The deficiency of all drawing tests is that the assessment is carried out after the objects have been drawn, so information such as pen angle, speed, curvature and pressure is lost during the test. In other words, psychologists cannot assess their patients while the tests are running. One of the well-known drawing tests to measure the degree of Obsessive-Compulsive Disorder (OCD) is the Bender Gestalt, though its reliability is not promising. Objective: The main objective of this study is to make the Bender Gestalt test quantitative; therefore, an optical pen along with a digital tablet is utilized to preserve the key drawing features of OCD patients during the test. Materials and Methods: Among a large population of patients referred to a special OCD clinic, 50 subjects under therapy voluntarily took part in this study. In contrast, 50 subjects with no sign of OCD performed the test as a control group. This test contains 9 shapes and the participants were not constrained to draw the shapes within a fixed time interval; consequently, to classify the stream of feature vectors (samples through drawing), a Hidden Markov Model (HMM) is employed and its flexibility is increased by incorporating the fuzzy technique into its learning scheme. Results: Applying the fuzzy HMM classifier to the data stream of subjects classified the two groups with up to 95.2% accuracy, whereas the standard HMM resulted in 94.5%. In addition, a multi-layer perceptron (MLP), as a strong static classifier, was applied to the features and resulted in 86.6% accuracy. Conclusion: Applying a paired t-test to the results indicates that the fuzzy HMM is significantly superior to the standard HMM and MLP classifiers. PMID:28462208
Diagnosis of the OCD Patients using Drawing Features of the Bender Gestalt Shapes.
Boostani, R; Asadi, F; Mohammadi, N
2017-03-01
Since psychological tests such as questionnaires or drawing tests are largely qualitative, their results carry a degree of uncertainty and sometimes subjectivity. The deficiency of all drawing tests is that the assessment is carried out after the objects have been drawn, so information such as pen angle, speed, curvature and pressure is lost during the test. In other words, psychologists cannot assess their patients while the tests are running. One of the well-known drawing tests to measure the degree of Obsessive-Compulsive Disorder (OCD) is the Bender Gestalt, though its reliability is not promising. The main objective of this study is to make the Bender Gestalt test quantitative; therefore, an optical pen along with a digital tablet is utilized to preserve the key drawing features of OCD patients during the test. Among a large population of patients referred to a special OCD clinic, 50 subjects under therapy voluntarily took part in this study. In contrast, 50 subjects with no sign of OCD performed the test as a control group. This test contains 9 shapes and the participants were not constrained to draw the shapes within a fixed time interval; consequently, to classify the stream of feature vectors (samples through drawing), a Hidden Markov Model (HMM) is employed and its flexibility is increased by incorporating the fuzzy technique into its learning scheme. Applying the fuzzy HMM classifier to the data stream of subjects classified the two groups with up to 95.2% accuracy, whereas the standard HMM resulted in 94.5%. In addition, a multi-layer perceptron (MLP), as a strong static classifier, was applied to the features and resulted in 86.6% accuracy. Applying a paired t-test to the results indicates that the fuzzy HMM is significantly superior to the standard HMM and MLP classifiers.
Hyperbolic metamaterial nanostructures to tune charge-transfer dynamics (Conference Presentation)
NASA Astrophysics Data System (ADS)
Lee, Kwang Jin; Xiao, Yiming; Woo, Jae Heun; Kim, Eun Sun; Kreher, David; Attias, André-Jean; Mathevet, Fabrice; Ribierre, Jean-Charles; Wu, Jeong Weon; André, Pascal
2016-09-01
Charge transfer (CT) is an essential phenomenon relevant to numerous fields including biology, physics and chemistry [1-5]. Here, we demonstrate that multi-layered hyperbolic metamaterial (HMM) substrates alter organic semiconductor CT dynamics [6]. With triphenylene:perylene diimide dyad supramolecular self-assemblies prepared on HMM substrates, we show that both charge separation (CS) and charge recombination (CR) characteristic times are increased by factors of 2.5 and 1.6, respectively, resulting in longer-lived CT states. We successfully rationalize the experimental data by extending Marcus theory framework with dipole image interactions tuning the driving force. The number of metal-dielectric pairs alters the HMM interfacial effective dielectric constant and becomes a solid analogue to solvent polarizability. Based on the experimental results and extended Marcus theory framework, we find that CS and CR processes are located in normal and inverted regions on Marcus parabola diagram, respectively. The model and further PH3T:PCBM data show that the phenomenon is general and that molecular and substrate engineering offer a wide range of kinetic tailoring opportunities. This work opens the path toward novel artificial substrates designed to control CT dynamics with potential applications in fields including optoelectronics, organic solar cells and chemistry. References: [1] Marcus, Rev. Mod. Phys., 1993, 65, 599. [2] Marcus, Phys. Chem. Chem. Phys., 2012, 14, 13729. [3] Lambert, et al., Nat. Phys., 2012, 9, 10. [4] C. Clavero, Nat. Photon., 2014, 8, 95. [5] A. Canaguier-Durand, et al., Angew. Chem. Int. Ed., 2013, 52, 10533. [6] K. J. Lee, et al., Submitted, 2015, arxiv.org/abs/1510.08574.
Vocal classification of vocalizations of a pair of Asian small-clawed otters to determine stress.
Scheifele, Peter M; Johnson, Michael T; Fry, Michelle; Hamel, Benjamin; Laclede, Kathryn
2015-07-01
Asian Small-Clawed Otters (Aonyx cinerea) are a small, protected but threatened species living in freshwater. They are gregarious and live in monogamous pairs for their lifetimes, communicating via scent and acoustic vocalizations. This study utilized a hidden Markov model (HMM) to classify stress versus non-stress calls from a sibling pair under professional care. Vocalizations were expertly annotated by keepers into seven contextual categories. Four of these (aggression, separation anxiety, pain, and prefeeding) were identified as stressful contexts, and three (feeding, training, and play) were identified as non-stressful contexts. The vocalizations were segmented, manually categorized into broad vocal call types, and analyzed to determine signal to noise ratios. From this information, vocalizations from the most common contextual categories were used to implement HMM-based automatic classification experiments, which included individual identification, stress vs non-stress, and individual context classification. Results indicate that both individual identity and stress vs non-stress were distinguishable, with accuracies above 90%, but that individual contexts within the stress category were not easily separable.
Hidden Markov model for dependent mark loss and survival estimation
Laake, Jeffrey L.; Johnson, Devin S.; Diefenbach, Duane R.; Ternent, Mark A.
2014-01-01
Mark-recapture estimators assume no loss of marks to provide unbiased estimates of population parameters. We describe a hidden Markov model (HMM) framework that integrates a mark loss model with a Cormack–Jolly–Seber model for survival estimation. Mark loss can be estimated with single-marked animals as long as a sub-sample of animals has a permanent mark. Double-marking provides an estimate of mark loss assuming independence but dependence can be modeled with a permanently marked sub-sample. We use a log-linear approach to include covariates for mark loss and dependence which is more flexible than existing published methods for integrated models. The HMM approach is demonstrated with a dataset of black bears (Ursus americanus) with two ear tags and a subset of which were permanently marked with tattoos. The data were analyzed with and without the tattoo. Dropping the tattoos resulted in estimates of survival that were reduced by 0.005–0.035 due to tag loss dependence that could not be modeled. We also analyzed the data with and without the tattoo using a single tag. By not using.
A New Algorithm for Identifying Cis-Regulatory Modules Based on Hidden Markov Model
2017-01-01
The discovery of cis-regulatory modules (CRMs) is the key to understanding mechanisms of transcription regulation. Since CRMs have specific regulatory structures that are the basis for the regulation of gene expression, how to model the regulatory structure of CRMs has a considerable impact on the performance of CRM identification. The paper proposes a CRM discovery algorithm called ComSPS. ComSPS builds a regulatory structure model of CRMs based on HMM by exploring the rules of CRM transcriptional grammar that governs the internal motif site arrangement of CRMs. We test ComSPS on three benchmark datasets and compare it with five existing methods. Experimental results show that ComSPS performs better than them. PMID:28497059
Combination of dynamic Bayesian network classifiers for the recognition of degraded characters
NASA Astrophysics Data System (ADS)
Likforman-Sulem, Laurence; Sigelle, Marc
2009-01-01
We investigate in this paper the combination of DBN (Dynamic Bayesian Network) classifiers, either independent or coupled, for the recognition of degraded characters. The independent classifiers are a vertical HMM and a horizontal HMM whose observable outputs are the image columns and the image rows respectively. The coupled classifiers, presented in a previous study, associate the vertical and horizontal observation streams into single DBNs. The scores of the independent and coupled classifiers are then combined linearly at the decision level. We compare the different classifiers (independent, coupled or linearly combined) on two tasks: the recognition of artificially degraded handwritten digits and the recognition of real degraded old printed characters. Our results show that coupled DBNs perform better on degraded characters than the linear combination of independent HMM scores. Our results also show that the best classifier is obtained by linearly combining the scores of the best coupled DBN and the best independent HMM.
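Linear combination of classifier scores at the decision level can be sketched as a weighted sum of per-class scores followed by an argmax. The weight, the class labels, and the log-likelihood values below are hypothetical; the abstract does not state how the combination weights were chosen.

```python
def combine_scores(scores_a, scores_b, weight=0.5):
    """Weighted sum of per-class scores from two classifiers.

    weight applies to scores_a and (1 - weight) to scores_b.
    """
    return {c: weight * scores_a[c] + (1 - weight) * scores_b[c]
            for c in scores_a}

def decide(scores):
    """Return the class with the highest score."""
    return max(scores, key=scores.get)

# Hypothetical log-likelihoods for one degraded character image.
coupled_dbn = {"3": -120.0, "8": -118.0}
independent_hmm = {"3": -110.0, "8": -119.0}

combined = combine_scores(coupled_dbn, independent_hmm, weight=0.5)
label = decide(combined)
print(combined, label)
```

In this toy case the coupled model alone would pick "8", while the combined score picks "3", which shows how a second classifier can overturn a narrow decision.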
UUCD: a family-based database of ubiquitin and ubiquitin-like conjugation.
Gao, Tianshun; Liu, Zexian; Wang, Yongbo; Cheng, Han; Yang, Qing; Guo, Anyuan; Ren, Jian; Xue, Yu
2013-01-01
In this work, we developed UUCD (http://uucd.biocuckoo.org), a family-based database for ubiquitin and ubiquitin-like conjugation, which is one of the most important post-translational modifications responsible for regulating a variety of cellular processes, through a similar E1 (ubiquitin-activating enzyme)-E2 (ubiquitin-conjugating enzyme)-E3 (ubiquitin-protein ligase) enzyme thioester cascade. Although extensive experimental efforts have been made, an integrative data resource is still not available. From the scientific literature, 26 E1s, 105 E2s, 1003 E3s and 148 deubiquitination enzymes (DUBs) were collected and classified into 1, 3, 19 and 7 families, respectively. To computationally characterize potential enzymes in eukaryotes, we constructed 1, 1, 15 and 6 hidden Markov model (HMM) profiles for E1s, E2s, E3s and DUBs at the family level, respectively. Moreover, ortholog searches were conducted for E3 and DUB families without HMM profiles. The UUCD database was then developed with 738 E1s, 2937 E2s, 46 631 E3s and 6647 DUBs of 70 eukaryotic species. Detailed annotations and classifications are also provided. The online service of UUCD was implemented in PHP + MySQL + JavaScript + Perl.
Lami, Mariam; Singh, Harsimrat; Dilley, James H; Ashraf, Hajra; Edmondon, Matthew; Orihuela-Espina, Felipe; Hoare, Jonathan; Darzi, Ara; Sodergren, Mikael H
2018-02-07
The adenoma detection rate (ADR) is an important quality indicator in colonoscopy. The aim of this study was to evaluate the changes in visual gaze patterns (VGPs) with increasing polyp detection rate (PDR), a surrogate marker of ADR. 18 endoscopists participated in the study. VGPs were measured using eye-tracking technology during the withdrawal phase of colonoscopy. VGPs were characterized using two analyses - screen and anatomy. Eye-tracking parameters were used to characterize performance, which was further substantiated using hidden Markov model (HMM) analysis. Subjects with higher PDRs spent more time viewing the outer ring of the 3 × 3 grid for both analyses (screen-based: r = 0.56, P = 0.02; anatomy: r = 0.62, P < 0.01). Fixation distribution to the "bottom U" of the screen in screen-based analysis was positively correlated with PDR (r = 0.62, P = 0.01). HMM demarcated the VGPs into three PDR groups. This study defined distinct VGPs that are associated with expert behavior. These data may allow introduction of visual gaze training within structured training programs, and have implications for adoption in higher-level assessment. © Georg Thieme Verlag KG Stuttgart · New York.
NASA Astrophysics Data System (ADS)
Johanson, I. A.; Miklius, A.; Poland, M. P.
2016-12-01
A sequence of magmatic events in April-May 2015 at Kīlauea Volcano produced a complex deformation pattern that can be described by multiple deforming sources, active simultaneously. The 2015 intrusive sequence began with inflation in the volcano's summit caldera near Halema`uma`u (HMM) Crater, which continued over a few weeks, followed by rapid deflation of the HMM source and inflation of a source in the south caldera region during the next few days. In Kīlauea Volcano's summit area, multiple deformation centers are active at varying times, and all contribute to the overall pattern observed with GPS, tiltmeters, and InSAR. Isolating the contribution of different signals related to each source is a challenge and complicates the determination of optimal source geometry for the underlying magma bodies. We used principal component analysis of continuous GPS time series from the 2015 intrusion sequence to determine three basis vectors which together account for 83% of the variance in the data set. The three basis vectors are non-orthogonal and not strictly the principal components of the data set. In addition to separating deformation sources in the continuous GPS data, the basis vectors provide a means to scale the contribution of each source in a given interferogram. This provides an additional constraint in a joint model of GPS and InSAR data (COSMO-SkyMed and Sentinel-1A) to determine source geometry. The first basis vector corresponds with inflation in the south caldera region, an area long recognized as the location of a long-term storage reservoir. The second vector represents deformation of the HMM source, which is in the same location as a previously modeled shallow reservoir; however, InSAR data suggest a more complicated source. Preliminary modeling of the deformation attributed to the third basis vector shows that it is consistent with inflation of a steeply dipping ellipsoid centered below Keanakāko`i crater, southeast of HMM. 
Keanakāko`i crater is the locus of a known, intermittently active deformation source, which was not previously recognized to have been active during the 2015 event.
Yi, Li-Tao; Xu, Qun; Li, Yu-Cheng; Yang, Lei; Kong, Ling-Dong
2009-06-15
Magnolia bark and ginger rhizome form a drug pair in many prescriptions for treatment of mental disorders in traditional Chinese medicine (TCM). However, the compatibility and synergism mechanism of the two herbs in antidepressant actions have not been reported. The aim of this study was to explore the rationale of the drug pair in TCM. We evaluated the antidepressant-like effects of a mixture of honokiol and magnolol (HMM) and polysaccharides (PMB) from magnolia bark, and of essential oil (OGR) and polysaccharides (PGR) from ginger rhizome, each alone, and the possibility of synergistic interactions in their combinations in the mouse forced swimming test (FST) and tail suspension test (TST). Serotonin (5-HT) and noradrenaline (NE) levels in prefrontal cortex, hippocampus and striatum were also examined. 30 mg/kg HMM decreased immobility in the FST and TST in mice after one- and two-week treatment. OGR (19.5 or 39 mg/kg) alone was ineffective. The combination of an ineffective dose of 39 mg/kg OGR with 15 mg/kg HMM was the most effective and produced a synergistic action on behaviors after two-week treatment. A significant increase in 5-HT and a synergistic increase in NE in prefrontal cortex were observed after co-administration of HMM with OGR. These results demonstrated that HMM was the principal component of this drug pair, whereas OGR served as the adjuvant fraction. Compatibility of HMM with OGR was suggested to exert synergistic antidepressant actions by attenuating abnormalities in serotonergic and noradrenergic system functions. Therefore, we confirmed the rationality of the drug pair in clinical application and provided a novel perspective for drug pair research in TCM.
Genotype calling from next-generation sequencing data using haplotype information of reads
Zhi, Degui; Wu, Jihua; Liu, Nianjun; Zhang, Kui
2012-01-01
Motivation: Low coverage sequencing provides an economic strategy for whole genome sequencing. When sequencing a set of individuals, genotype calling can be challenging due to low sequencing coverage. Linkage disequilibrium (LD) based refinement of genotype calling is essential to improve the accuracy. Current LD-based methods use read counts or genotype likelihoods at individual potential polymorphic sites (PPSs). Reads that span multiple PPSs (jumping reads) can provide additional haplotype information overlooked by current methods. Results: In this article, we introduce a new Hidden Markov Model (HMM)-based method that can take into account jumping reads information across adjacent PPSs and implement it in the HapSeq program. Our method extends the HMM in Thunder and explicitly models jumping reads information as emission probabilities conditional on the states of adjacent PPSs. Our simulation results show that, compared to Thunder, HapSeq reduces the genotyping error rate by 30%, from 0.86% to 0.60%. The results from the 1000 Genomes Project show that HapSeq reduces the genotyping error rate by 12% and 9%, from 2.24% and 2.76% to 1.97% and 2.50% for individuals with European and African ancestry, respectively. We expect our program can improve genotyping qualities of the large number of ongoing and planned whole genome sequencing projects. Contact: dzhi@ms.soph.uab.edu; kzhang@ms.soph.uab.edu Availability: The software package HapSeq and its manual can be found and downloaded at www.ssg.uab.edu/hapseq/. Supplementary information: Supplementary data are available at Bioinformatics online. PMID:22285565
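The quoted reductions are relative, not absolute, changes in error rate. A short arithmetic check using only the error rates given in the abstract makes this explicit; the function name and dictionary keys are illustrative.

```python
def relative_reduction(before, after):
    """Relative reduction of an error rate, as a fraction of the original."""
    return (before - after) / before

# Error rates in percent, taken from the abstract (Thunder, HapSeq).
error_rates = {
    "simulation": (0.86, 0.60),
    "European": (2.24, 1.97),
    "African": (2.76, 2.50),
}
reductions = {name: round(100 * relative_reduction(before, after), 1)
              for name, (before, after) in error_rates.items()}
print(reductions)  # {'simulation': 30.2, 'European': 12.1, 'African': 9.4}
```

These values agree with the rounded figures of 30%, 12% and 9% reported above.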
Korucu, M Kemal; Kaplan, Özgür; Büyük, Osman; Güllü, M Kemal
2016-10-01
In this study, we investigate the usability of sound recognition for source separation of packaging wastes in reverse vending machines (RVMs). For this purpose, an experimental setup equipped with a sound recording mechanism was prepared. Packaging waste sounds generated by three physical impact types (free falling, pneumatic hitting and hydraulic crushing) were separately recorded using two different microphones. To classify the waste types and sizes based on sound features of the wastes, support vector machine (SVM) and hidden Markov model (HMM) based sound classification systems were developed. In the basic experimental setup, in which only the free falling impact type was considered, the SVM and HMM systems provided 100% classification accuracy for both microphones. In the expanded experimental setup, which includes all three impact types, material type classification accuracies were 96.5% for the dynamic microphone and 97.7% for the condenser microphone. When both the material type and the size of the wastes were classified, the accuracy was 88.6% for the microphones. The modeling studies indicated that hydraulic crushing impact type recordings were too noisy for an effective sound recognition application. In the detailed analysis of the recognition errors, it was observed that most of the errors occurred in the hitting impact type. According to the experimental results, the proposed novel approach for the separation of packaging wastes could provide a high classification performance for RVMs. Copyright © 2016 Elsevier Ltd. All rights reserved.
Classification of a set of vectors using self-organizing map- and rule-based technique
NASA Astrophysics Data System (ADS)
Ae, Tadashi; Okaniwa, Kaishirou; Nosaka, Kenzaburou
2005-02-01
Various objects, such as pictures, music and texts, exist in our environment. We form a view of these objects by looking, reading or listening. This view is deeply connected to our behavior and is very important for understanding it. We form a view of an object and decide the next action (data selection, etc.) based on that view. Such a series of actions constitutes a sequence. Therefore, we propose a method which acquires a view as a vector from several words describing the view, and we apply the vector to sequence generation. We focus on sequences of data that a user selects from a multimedia database containing pictures, music, movies, etc. These data cannot be stereotyped because the view of them differs from user to user. Therefore, we represent the structure of the multimedia database as the vector representing the user's view together with the stereotyped vector, and acquire sequences containing the structure as elements. Such a vector can be classified by a SOM (Self-Organizing Map). The Hidden Markov Model (HMM) is a method for generating sequences. Therefore, we use an HMM in which a state corresponds to the representative vector of the user's view, and acquire sequences containing the change of the user's view. We call it the Vector-state Markov Model (VMM). We introduce rough set theory as a rule-based technique, which plays the role of classifying sets of data such as the sets of "Tour".
Smart Annotation of Cyclic Data Using Hierarchical Hidden Markov Models.
Martindale, Christine F; Hoenig, Florian; Strohrmann, Christina; Eskofier, Bjoern M
2017-10-13
Cyclic signals are an intrinsic part of daily life, such as human motion and heart activity. The detailed analysis of them is important for clinical applications such as pathological gait analysis and for sports applications such as performance analysis. Labeled training data for algorithms that analyze these cyclic data come at a high annotation cost due to only limited annotations available under laboratory conditions or requiring manual segmentation of the data under less restricted conditions. This paper presents a smart annotation method that reduces this cost of labeling for sensor-based data, which is applicable to data collected outside of strict laboratory conditions. The method uses semi-supervised learning of sections of cyclic data with a known cycle number. A hierarchical hidden Markov model (hHMM) is used, achieving a mean absolute error of 0.041 ± 0.020 s relative to a manually-annotated reference. The resulting model was also used to simultaneously segment and classify continuous, 'in the wild' data, demonstrating the applicability of using hHMM, trained on limited data sections, to label a complete dataset. This technique achieved comparable results to its fully-supervised equivalent. Our semi-supervised method has the significant advantage of reduced annotation cost. Furthermore, it reduces the opportunity for human error in the labeling process normally required for training of segmentation algorithms. It also lowers the annotation cost of training a model capable of continuous monitoring of cycle characteristics such as those employed to analyze the progress of movement disorders or analysis of running technique.
Recognition of degraded handwritten digits using dynamic Bayesian networks
NASA Astrophysics Data System (ADS)
Likforman-Sulem, Laurence; Sigelle, Marc
2007-01-01
We investigate in this paper the application of dynamic Bayesian networks (DBNs) to the recognition of handwritten digits. The main idea is to couple two separate HMMs into various architectures. First, a vertical HMM and a horizontal HMM are built observing the evolving streams of image columns and image rows respectively. Then, two coupled architectures are proposed to model interactions between these two streams and to capture the 2D nature of character images. Experiments performed on the MNIST handwritten digit database show that coupled architectures yield better recognition performance than non-coupled ones. Additional experiments conducted on artificially degraded (broken) characters demonstrate that coupled architectures cope better with such degradation than non-coupled ones and than discriminative methods such as SVMs.
Modeling carbachol-induced hippocampal network synchronization using hidden Markov models
NASA Astrophysics Data System (ADS)
Dragomir, Andrei; Akay, Yasemin M.; Akay, Metin
2010-10-01
In this work we studied the neural state transitions undergone by the hippocampal neural network using a hidden Markov model (HMM) framework. We first employed a measure based on the Lempel-Ziv (LZ) estimator to characterize the changes in the hippocampal oscillation patterns in terms of their complexity. These oscillations correspond to different modes of hippocampal network synchronization induced by the cholinergic agonist carbachol in the CA1 region of the mouse hippocampus. HMMs are then used to model the dynamics of the LZ-derived complexity signals as first-order Markov chains. Consequently, the signals corresponding to our oscillation recordings can be segmented into a sequence of statistically discriminated hidden states. The segmentation is used for detecting transitions in neural synchronization modes in data recorded from wild-type mice and triple transgenic mouse models (3xTG) of Alzheimer's disease (AD). Our data suggest that the transition from the low-frequency (delta range) continuous oscillation mode into high-frequency (theta range) oscillation, exhibiting repeated burst-type patterns, always occurs through a mode resembling a mixture of the two patterns, continuous with burst. The relatively random patterns of oscillation during this mode may reflect the fact that the neuronal network undergoes re-organization. Further insight into the time durations of these modes (retrieved via the HMM segmentation of the LZ-derived signals) reveals that the mixed mode lasts significantly longer (p < 10^-4) in 3xTG AD mice. These findings, coupled with the documented cholinergic neurotransmission deficits in the 3xTG mouse model, may be highly relevant for the case of AD.
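As a rough illustration of an LZ-based complexity measure, the sketch below counts phrases in a Lempel-Ziv (1976) style parsing of a binarized signal. It is one common variant, chosen here as an assumption; the abstract does not specify the exact estimator, the binarization rule, or any windowing, so `binarize` and `lz76_complexity` are hypothetical helper names.

```python
from statistics import median


def binarize(signal):
    """Map a real-valued signal to a 0/1 string by thresholding at its median."""
    m = median(signal)
    return "".join("1" if x > m else "0" for x in signal)


def lz76_complexity(s: str) -> int:
    """Number of phrases in an LZ76-style parsing of the string s.

    Each new phrase is the shortest substring starting at position i
    that has not yet occurred in the text preceding its last symbol.
    """
    n = len(s)
    i = 0
    phrases = 0
    while i < n:
        length = 1
        # Extend the candidate phrase while it can be copied from earlier text.
        while i + length <= n and s[i:i + length] in s[:i + length - 1]:
            length += 1
        phrases += 1
        i += length
    return phrases


example = "0001101001000101"
print(lz76_complexity(example))
```

A sliding window of such counts over a recording would give a complexity time series of the kind that could then be passed to an HMM for segmentation; regular signals yield few phrases and irregular ones yield many.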
Ong, Lee-Ling S; Xinghua Zhang; Kundukad, Binu; Dauwels, Justin; Doyle, Patrick; Asada, H Harry
2016-08-01
An approach to automatically detect bacteria division with temporal models is presented. To understand how bacteria migrate and proliferate to form complex multicellular behaviours such as biofilms, it is desirable to track individual bacteria and detect cell division events. Unlike eukaryotic cells, prokaryotic cells such as bacteria lack distinctive features, making bacteria division difficult to detect in a single image frame. Furthermore, bacteria may detach, migrate close to other bacteria and orientate themselves at an angle to the horizontal plane. Our system trains a hidden conditional random field (HCRF) model from tracked and aligned bacteria division sequences. The HCRF model classifies a set of image frames as division or otherwise. The performance of our HCRF model is compared with that of a Hidden Markov Model (HMM). The results show that an HCRF classifier outperforms an HMM classifier. From 2D bright field microscopy data, it is a challenge to separate individual bacteria and associate observations with tracks. Automatic detection of sequences with bacteria division will improve tracking accuracy.
Selective radiative heating of nanostructures using hyperbolic metamaterials
Ding, Ding; Minnich, Austin J
2015-01-01
Hyperbolic metamaterials (HMM) are of great interest due to their ability to break the diffraction limit for imaging and enhance near-field radiative heat transfer. Here we demonstrate that an annular, transparent HMM enables selective heating of a sub-wavelength plasmonic nanowire by controlling the angular mode number of a plasmonic resonance. A nanowire emitter, surrounded by an HMM, appears dark to incoming radiation from an adjacent nanowire emitter unless the second emitter is surrounded by an identical lens such that the wavelength and angular mode of the plasmonic resonance match. Our result can find applications in radiative thermal management.
Fast and accurate imputation of summary statistics enhances evidence of functional enrichment
Pasaniuc, Bogdan; Zaitlen, Noah; Shi, Huwenbo; Bhatia, Gaurav; Gusev, Alexander; Pickrell, Joseph; Hirschhorn, Joel; Strachan, David P.; Patterson, Nick; Price, Alkes L.
2014-01-01
Motivation: Imputation using external reference panels (e.g. 1000 Genomes) is a widely used approach for increasing power in genome-wide association studies and meta-analysis. Existing hidden Markov models (HMM)-based imputation approaches require individual-level genotypes. Here, we develop a new method for Gaussian imputation from summary association statistics, a type of data that is becoming widely available. Results: In simulations using 1000 Genomes (1000G) data, this method recovers 84% (54%) of the effective sample size for common (>5%) and low-frequency (1–5%) variants [increasing to 87% (60%) when summary linkage disequilibrium information is available from target samples] versus the gold standard of 89% (67%) for HMM-based imputation, which cannot be applied to summary statistics. Our approach accounts for the limited sample size of the reference panel, a crucial step to eliminate false-positive associations, and it is computationally very fast. As an empirical demonstration, we apply our method to seven case–control phenotypes from the Wellcome Trust Case Control Consortium (WTCCC) data and a study of height in the British 1958 birth cohort (1958BC). Gaussian imputation from summary statistics recovers 95% (105%) of the effective sample size (as quantified by the ratio of χ² association statistics) compared with HMM-based imputation from individual-level genotypes at the 227 (176) published single nucleotide polymorphisms (SNPs) in the WTCCC (1958BC height) data. In addition, for publicly available summary statistics from large meta-analyses of four lipid traits, we publicly release imputed summary statistics at 1000G SNPs, which could not have been obtained using previously published methods, and demonstrate their accuracy by masking subsets of the data.
We show that 1000G imputation using our approach increases the magnitude and statistical evidence of enrichment at genic versus non-genic loci for these traits, as compared with an analysis without 1000G imputation. Thus, imputation of summary statistics will be a valuable tool in future functional enrichment analyses. Availability and implementation: Publicly available software package available at http://bogdan.bioinformatics.ucla.edu/software/. Contact: bpasaniuc@mednet.ucla.edu or aprice@hsph.harvard.edu Supplementary information: Supplementary materials are available at Bioinformatics online. PMID:24990607
1997-09-01
first PC-based, very large vocabulary dictation system with a continuous natural language free flow approach to speech recognition. (This system allows...indicating the likelihood that a particular stored HMM reference model is the best match for the input. This approach is called the Baum-Welch...InfoCentral, and Envoy 1.0; and Lotus Development Corp.’s SmartSuite 3, Approach 3.0, and Organizer. 2. IBM At a press conference in New York in June 1997, IBM
Abby, Sophie S.; Néron, Bertrand; Ménager, Hervé; Touchon, Marie; Rocha, Eduardo P. C.
2014-01-01
Motivation: Biologists often wish to use their knowledge of a few experimental models of a given molecular system to identify homologs in genomic data. We developed a generic tool for this purpose. Results: Macromolecular System Finder (MacSyFinder) provides a flexible framework to model the properties of molecular systems (cellular machinery or pathway) including their components, evolutionary associations with other systems and genetic architecture. Modelled features also include functional analogs, and the multiple uses of the same component by different systems. Models are used to search for molecular systems in complete genomes or in unstructured data like metagenomes. The components of the systems are searched by sequence similarity using Hidden Markov model (HMM) protein profiles. The assignment of hits to a given system is decided based on compliance with the content and organization of the system model. A graphical interface, MacSyView, facilitates the analysis of the results by showing overviews of component content and genomic context. To exemplify the use of MacSyFinder we built models to detect and classify CRISPR-Cas systems following a previously established classification. We show that MacSyFinder makes it easy to define an accurate “Cas-finder” using publicly available protein profiles. Availability and Implementation: MacSyFinder is a standalone application implemented in Python. It requires Python 2.7, Hmmer and makeblastdb (version 2.2.28 or higher). It is freely available with its source code under a GPLv3 license at https://github.com/gem-pasteur/macsyfinder. It is compatible with all platforms supporting Python and Hmmer/makeblastdb. The “Cas-finder” (models and HMM profiles) is distributed as a compressed tarball archive as Supporting Information. PMID:25330359
Tine, Roger C K; Faye, Babacar; Ndour, Cheikh T; Ndiaye, Jean L; Ndiaye, Magatte; Bassene, Charlemagne; Magnussen, Pascal; Bygbjerg, Ib C; Sylla, Khadim; Ndour, Jacques D; Gaye, Oumar
2011-12-13
Current malaria control strategies recommend (i) early case detection using rapid diagnostic tests (RDT) and treatment with artemisinin combination therapy (ACT), (ii) pre-referral rectal artesunate, (iii) intermittent preventive treatment and (iv) impregnated bed nets. However, these individual malaria control interventions provide only partial protection in most epidemiological situations. Therefore, there is a need to investigate the potential benefits of integrating several malaria interventions to reduce malaria prevalence and morbidity. A randomized controlled trial was carried out to assess the impact of combining seasonal intermittent preventive treatment in children (IPTc) with home-based management of malaria (HMM) by community health workers (CHWs) in Senegal. Eight CHWs in eight villages covered by the Bonconto health post, (South Eastern part of Senegal) were trained to diagnose malaria using RDT, provide prompt treatment with artemether-lumefantrine for uncomplicated malaria cases and pre-referral rectal artesunate for complicated malaria occurring in children under 10 years. Four CHWs were randomized to also administer monthly IPTc as single dose of sulphadoxine-pyrimethamine (SP) plus three doses of amodiaquine (AQ) in the malaria transmission season, October and November 2010. Primary end point was incidence of single episode of malaria attacks over 8 weeks of follow up. Secondary end points included prevalence of malaria parasitaemia, and prevalence of anaemia at the end of the transmission season. Primary analysis was by intention to treat. The study protocol was approved by the Senegalese National Ethical Committee (approval 0027/MSP/DS/CNRS, 18/03/2010). A total of 1,000 children were enrolled. The incidence of malaria episodes was 7.1/100 child months at risk [95% CI (3.7-13.7)] in communities with IPTc + HMM compared to 35.6/100 child months at risk [95% CI (26.7-47.4)] in communities with only HMM (aOR = 0.20; 95% CI 0.09-0.41; p = 0.04). 
At the end of the transmission season, malaria parasitaemia prevalence was lower in communities with IPTc + HMM (2.05% versus 4.6% p = 0.03). Adjusted for age groups, sex, Plasmodium falciparum carriage and prevalence of malnutrition, IPTc + HMM showed a significant protective effect against anaemia (aOR = 0.59; 95% CI 0.42-0.82; p = 0.02). Combining IPTc and HMM can provide significant additional benefit in preventing clinical episodes of malaria as well as anaemia among children in Senegal.
Behavior analysis for elderly care using a network of low-resolution visual sensors
NASA Astrophysics Data System (ADS)
Eldib, Mohamed; Deboeverie, Francis; Philips, Wilfried; Aghajan, Hamid
2016-07-01
Recent advancements in visual sensor technologies have made behavior analysis practical for in-home monitoring systems. The current in-home monitoring systems face several challenges: (1) visual sensor calibration is a difficult task and not practical in real life because of the need for recalibration when the visual sensors are moved accidentally by a caregiver or the senior citizen, (2) privacy concerns, and (3) the high hardware installation cost. We propose to use a network of cheap low-resolution visual sensors (30×30 pixels) for long-term behavior analysis. The behavior analysis starts with visual feature selection based on foreground/background detection to track the motion level in each visual sensor. Then a hidden Markov model (HMM) is used to estimate the user's locations without calibration. Finally, an activity discovery approach is proposed using spatial and temporal contexts. We performed experiments on 10 months of real-life data. We show that the HMM approach outperforms the k-nearest neighbor classifier against ground truth for 30 days. Our framework is able to discover 13 activities of daily living (ADL parameters). More specifically, we analyze mobility patterns and some of the key ADL parameters to detect improving or deteriorating health conditions.
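The location-estimation step can be pictured as decoding the most likely room sequence from per-frame sensor observations. The sketch below is a generic log-space Viterbi decoder on a toy two-room model; the room names, the observation symbols (which sensor sees the most motion) and all probabilities are invented for illustration and are not taken from the paper.

```python
import math


def viterbi(obs, states, start_p, trans_p, emit_p):
    """Return the most likely hidden-state path for the observation sequence."""
    # scores[t][s]: best log-probability of any path ending in state s at time t
    scores = [{s: math.log(start_p[s]) + math.log(emit_p[s][obs[0]]) for s in states}]
    backpointers = []
    for o in obs[1:]:
        current, pointers = {}, {}
        for s in states:
            best_prev = max(states, key=lambda r: scores[-1][r] + math.log(trans_p[r][s]))
            current[s] = (scores[-1][best_prev] + math.log(trans_p[best_prev][s])
                          + math.log(emit_p[s][o]))
            pointers[s] = best_prev
        scores.append(current)
        backpointers.append(pointers)
    # Trace back from the best final state.
    path = [max(states, key=lambda s: scores[-1][s])]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    return path[::-1]


# Hypothetical toy model: two rooms, observation = sensor with the highest motion level.
states = ["kitchen", "living_room"]
start_p = {"kitchen": 0.5, "living_room": 0.5}
trans_p = {"kitchen": {"kitchen": 0.9, "living_room": 0.1},
           "living_room": {"kitchen": 0.1, "living_room": 0.9}}
emit_p = {"kitchen": {"cam_k": 0.8, "cam_l": 0.2},
          "living_room": {"cam_k": 0.2, "cam_l": 0.8}}

obs = ["cam_k", "cam_k", "cam_l", "cam_k", "cam_k", "cam_l", "cam_l", "cam_l"]
path = viterbi(obs, states, start_p, trans_p, emit_p)
print(path)
```

In this toy run the isolated `cam_l` reading at the third frame is smoothed over, while the sustained run at the end is decoded as a move to the living room. That temporal smoothing is one plausible reason an HMM can do better than a frame-by-frame classifier such as k-nearest neighbor.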
Optimizing Likelihood Models for Particle Trajectory Segmentation in Multi-State Systems.
Young, Dylan Christopher; Scrimgeour, Jan
2018-06-19
Particle tracking offers significant insight into the molecular mechanics that govern the behavior of living cells. The analysis of molecular trajectories that transition between different motive states, such as diffusive, driven and tethered modes, is of considerable importance, with even single trajectories containing significant amounts of information about a molecule's environment and its interactions with cellular structures. Hidden Markov models (HMM) have been widely adopted to perform the segmentation of such complex tracks. In this paper, we show that extensive analysis of hidden Markov model outputs using data derived from multi-state Brownian dynamics simulations can be used both for the optimization of the likelihood models used to describe the states of the system and for characterization of the technique's failure mechanisms. This analysis was made possible by the implementation of a parallelized adaptive direct search algorithm on an Nvidia graphics processing unit. This approach provides critical information for the visualization of HMM failure and the successful design of particle tracking experiments where trajectories contain multiple mobile states. © 2018 IOP Publishing Ltd.
Martínez-Castilla, León P.; Rodríguez-Sotres, Rogelio
2010-01-01
Background Despite the remarkable progress of bioinformatics, how the primary structure of a protein leads to a three-dimensional fold, and in turn determines its function remains an elusive question. Alignments of sequences with known function can be used to identify proteins with the same or similar function with high success. However, identification of function-related and structure-related amino acid positions is only possible after a detailed study of every protein. Folding pattern diversity seems to be much narrower than sequence diversity, and the amino acid sequences of natural proteins have evolved under a selective pressure comprising structural and functional requirements acting in parallel. Principal Findings The approach described in this work begins by generating a large number of amino acid sequences using ROSETTA [Dantas G et al. (2003) J Mol Biol 332:449–460], a program with notable robustness in the assignment of amino acids to a known three-dimensional structure. The resulting sequence-sets showed no conservation of amino acids at active sites, or protein-protein interfaces. Hidden Markov models built from the resulting sequence sets were used to search sequence databases. Surprisingly, the models retrieved from the database sequences belonged to proteins with the same or a very similar function. Given an appropriate cutoff, the rate of false positives was zero. According to our results, this protocol, here referred to as Rd.HMM, detects fine structural details on the folding patterns, that seem to be tightly linked to the fitness of a structural framework for a specific biological function. Conclusion Because the sequence of the native protein used to create the Rd.HMM model was always amongst the top hits, the procedure is a reliable tool to score, very accurately, the quality and appropriateness of computer-modeled 3D-structures, without the need for spectroscopy data. 
However, Rd.HMM is very sensitive to the conformational features of the models' backbone. PMID:20830209
Tsigelny, Igor; Sharikov, Yuriy; Ten Eyck, Lynn F
2002-05-01
HMMSPECTR is a tool for finding putative structural homologs for proteins with known primary sequences. HMMSPECTR contains four major components: a data warehouse with the hidden Markov models (HMM) and alignment libraries; a search program which compares the initial protein sequences with the libraries of HMMs; a secondary structure prediction and comparison program; and a dominant protein selection program that prepares the set of 10-15 "best" proteins from the chosen HMMs. The data warehouse contains four libraries of HMMs. The first two libraries were constructed using different HMM preparation options of the HAMMER program. The third library contains parts ("partial HMM") of initial alignments. The fourth library contains trained HMMs. We tested our program against all of the protein targets proposed in the CASP4 competition. The data warehouse included libraries of structural alignments and HMMs constructed on the basis of proteins publicly available in the Protein Data Bank before the CASP4 meeting. The newest fully automated versions of HMMSPECTR, 1.02 and 1.02ss, produced better results than the best result reported at CASP4 either by r.m.s.d. or by length (or both) in 64% (HMMSPECTR 1.02) and 79% (HMMSPECTR 1.02ss) of the cases. The improvement is most notable for the targets with complexity 4 (difficult fold recognition cases).
Li, Kan; Príncipe, José C.
2018-01-01
This paper presents a novel real-time dynamic framework for quantifying time-series structure in spoken words using spikes. Audio signals are converted into multi-channel spike trains using a biologically-inspired leaky integrate-and-fire (LIF) spike generator. These spike trains are mapped into a function space of infinite dimension, i.e., a Reproducing Kernel Hilbert Space (RKHS) using point-process kernels, where a state-space model learns the dynamics of the multidimensional spike input using gradient descent learning. This kernelized recurrent system is very parsimonious and achieves the necessary memory depth via feedback of its internal states when trained discriminatively, utilizing the full context of the phoneme sequence. A main advantage of modeling nonlinear dynamics using state-space trajectories in the RKHS is that it imposes no restriction on the relationship between the exogenous input and its internal state. We are free to choose the input representation with an appropriate kernel, and changing the kernel affects neither the system nor the learning algorithm. Moreover, we show that this novel framework can outperform both traditional hidden Markov model (HMM) speech processing and neuromorphic implementations based on spiking neural networks (SNNs), yielding accurate and ultra-low power word spotters. As a proof of concept, we demonstrate its capabilities using the benchmark TI-46 digit corpus for isolated-word automatic speech recognition (ASR) or keyword spotting. Compared to an HMM using a Mel-frequency cepstral coefficient (MFCC) front-end without time-derivatives, our MFCC-KAARMA offered improved performance. For the spike-train front-end, spike-KAARMA also outperformed state-of-the-art SNN solutions. Furthermore, compared to MFCCs, spike trains provided enhanced noise robustness in certain low signal-to-noise ratio (SNR) regimes. PMID:29666568
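To make the LIF front-end concrete, here is a minimal single-channel leaky integrate-and-fire spike generator using a forward-Euler update. It is a generic textbook form under assumed parameters (`dt`, `tau`, `threshold`, `v_reset` are all hypothetical values), not the authors' implementation; a multi-channel encoder would presumably run one such unit per filtered audio band.

```python
def lif_spike_times(signal, dt=1e-3, tau=0.02, threshold=1.0, v_reset=0.0):
    """Convert a sampled input signal into spike times (in seconds).

    Membrane update (forward Euler): v += dt * (-v / tau + x).
    A spike is emitted, and v is reset, whenever v reaches the threshold.
    """
    v = 0.0
    spike_times = []
    for k, x in enumerate(signal):
        v += dt * (-v / tau + x)   # leak toward zero, integrate the input
        if v >= threshold:
            spike_times.append(k * dt)
            v = v_reset
    return spike_times


# One second of constant input at three different strengths (arbitrary units).
silent = lif_spike_times([0.0] * 1000)
weak = lif_spike_times([100.0] * 1000)
strong = lif_spike_times([200.0] * 1000)
print(len(silent), len(weak), len(strong))
```

With these assumed constants the steady-state membrane value is `tau * x`, so inputs below `threshold / tau` never fire and stronger inputs fire more often. Input amplitude is therefore encoded in spike timing and rate.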
Classification of Anticipatory Signals for Grasp and Release from Surface Electromyography.
Siu, Ho Chit; Shah, Julie A; Stirling, Leia A
2016-10-25
Surface electromyography (sEMG) is a technique for recording natural muscle activation signals, which can serve as control inputs for exoskeletons and prosthetic devices. Previous experiments have incorporated these signals using both classical and pattern-recognition control methods in order to actuate such devices. We used the results of an experiment incorporating grasp and release actions with object contact to develop an intent-recognition system based on Gaussian mixture models (GMM) and continuous-emission hidden Markov models (HMM) of sEMG data. We tested this system with data collected from 16 individuals using a forearm band with distributed sEMG sensors. The data contain trials with shifted band alignments to assess robustness to sensor placement. This study evaluated and found that pattern-recognition-based methods could classify transient anticipatory sEMG signals in the presence of shifted sensor placement and object contact. With the best-performing classifier, the effect of label lengths in the training data was also examined. A mean classification accuracy of 75.96% was achieved through a unigram HMM method with five mixture components. Classification accuracy on different sub-movements was found to be limited by the length of the shortest sub-movement, which means that shorter sub-movements within dynamic sequences require larger training sets to be classified correctly. This classification of user intent is a potential control mechanism for a dynamic grasping task involving user contact with external objects and noise. Further work is required to test its performance as part of an exoskeleton controller, which involves contact with actuated external surfaces.
Classification of Anticipatory Signals for Grasp and Release from Surface Electromyography
Siu, Ho Chit; Shah, Julie A.; Stirling, Leia A.
2016-01-01
Surface electromyography (sEMG) is a technique for recording natural muscle activation signals, which can serve as control inputs for exoskeletons and prosthetic devices. Previous experiments have incorporated these signals using both classical and pattern-recognition control methods in order to actuate such devices. We used the results of an experiment incorporating grasp and release actions with object contact to develop an intent-recognition system based on Gaussian mixture models (GMM) and continuous-emission hidden Markov models (HMM) of sEMG data. We tested this system with data collected from 16 individuals using a forearm band with distributed sEMG sensors. The data contain trials with shifted band alignments to assess robustness to sensor placement. This study evaluated and found that pattern-recognition-based methods could classify transient anticipatory sEMG signals in the presence of shifted sensor placement and object contact. With the best-performing classifier, the effect of label lengths in the training data was also examined. A mean classification accuracy of 75.96% was achieved through a unigram HMM method with five mixture components. Classification accuracy on different sub-movements was found to be limited by the length of the shortest sub-movement, which means that shorter sub-movements within dynamic sequences require larger training sets to be classified correctly. This classification of user intent is a potential control mechanism for a dynamic grasping task involving user contact with external objects and noise. Further work is required to test its performance as part of an exoskeleton controller, which involves contact with actuated external surfaces. PMID:27792155
2014-01-01
Background Locating the protein-coding genes in novel genomes is essential to understanding and exploiting the genomic information but it is still difficult to accurately predict all the genes. The recent availability of detailed information about transcript structure from high-throughput sequencing of messenger RNA (RNA-Seq) delineates many expressed genes and promises increased accuracy in gene prediction. Computational gene predictors have been intensively developed for and tested in well-studied animal genomes. Hundreds of fungal genomes are now or will soon be sequenced. The differences of fungal genomes from animal genomes and the phylogenetic sparsity of well-studied fungi call for gene-prediction tools tailored to them. Results SnowyOwl is a new gene prediction pipeline that uses RNA-Seq data to train and provide hints for the generation of Hidden Markov Model (HMM)-based gene predictions and to evaluate the resulting models. The pipeline has been developed and streamlined by comparing its predictions to manually curated gene models in three fungal genomes and validated against the high-quality gene annotation of Neurospora crassa; SnowyOwl predicted N. crassa genes with 83% sensitivity and 65% specificity. SnowyOwl gains sensitivity by repeatedly running the HMM gene predictor Augustus with varied input parameters and selectivity by choosing the models with best homology to known proteins and best agreement with the RNA-Seq data. Conclusions SnowyOwl efficiently uses RNA-Seq data to produce accurate gene models in both well-studied and novel fungal genomes. The source code for the SnowyOwl pipeline (in Python) and a web interface (in PHP) is freely available from http://sourceforge.net/projects/snowyowl/. PMID:24980894
Bayesian structural inference for hidden processes.
Strelioff, Christopher C; Crutchfield, James P
2014-04-01
We introduce a Bayesian approach to discovering patterns in structurally complex processes. The proposed method of Bayesian structural inference (BSI) relies on a set of candidate unifilar hidden Markov model (uHMM) topologies for inference of process structure from a data series. We employ a recently developed exact enumeration of topological ε-machines. (A sequel then removes the topological restriction.) This subset of the uHMM topologies has the added benefit that inferred models are guaranteed to be ε-machines, irrespective of estimated transition probabilities. Properties of ε-machines and uHMMs allow for the derivation of analytic expressions for estimating transition probabilities, inferring start states, and comparing the posterior probability of candidate model topologies, despite process internal structure being only indirectly present in data. We demonstrate BSI's effectiveness in estimating a process's randomness, as reflected by the Shannon entropy rate, and its structure, as quantified by the statistical complexity. We also compare using the posterior distribution over candidate models and the single, maximum a posteriori model for point estimation and show that the former more accurately reflects uncertainty in estimated values. We apply BSI to in-class examples of finite- and infinite-order Markov processes, as well as to an out-of-class, infinite-state hidden process.
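The two quantities named in this abstract, the Shannon entropy rate and the statistical complexity, can be computed directly once a unifilar model is fixed. The sketch below is only an illustration and not the authors' inference code: the two-state "golden mean" machine, the dictionary layout, and all function names are assumptions introduced here. It also assumes the machine is minimal (an ε-machine), so that the entropy of the stationary state distribution equals the statistical complexity.

```python
import math

# Hypothetical example machine (not taken from the abstract): the "golden mean" process.
# Layout: machine[state][symbol] = (probability, next_state); unifilar means each
# (state, symbol) pair leads to exactly one next state.
GOLDEN_MEAN = {
    "A": {"1": (0.5, "A"), "0": (0.5, "B")},
    "B": {"1": (1.0, "A")},
}


def stationary_distribution(machine, iterations=1000):
    """Approximate the stationary state distribution by repeated propagation."""
    states = sorted(machine)
    pi = {s: 1.0 / len(states) for s in states}
    for _ in range(iterations):
        nxt = {s: 0.0 for s in states}
        for s in states:
            for prob, target in machine[s].values():
                nxt[target] += pi[s] * prob
        pi = nxt
    return pi


def entropy_rate(machine):
    """Entropy rate in bits per symbol: state-averaged entropy of the next symbol."""
    pi = stationary_distribution(machine)
    return -sum(
        pi[s] * prob * math.log2(prob)
        for s in machine
        for prob, _ in machine[s].values()
        if prob > 0
    )


def statistical_complexity(machine):
    """Entropy (bits) of the stationary state distribution; assumes a minimal machine."""
    pi = stationary_distribution(machine)
    return -sum(p * math.log2(p) for p in pi.values() if p > 0)


h = entropy_rate(GOLDEN_MEAN)            # 2/3 bit per symbol
c = statistical_complexity(GOLDEN_MEAN)  # about 0.918 bits
```

In a Bayesian setting such as the one described, these functions would be evaluated on transition probabilities sampled from the posterior rather than on a single fixed machine; that step is not shown here.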
Modeling strategic use of human computer interfaces with novel hidden Markov models
Mariano, Laura J.; Poore, Joshua C.; Krum, David M.; Schwartz, Jana L.; Coskren, William D.; Jones, Eric M.
2015-01-01
Immersive software tools are virtual environments designed to give their users an augmented view of real-world data and ways of manipulating that data. As virtual environments, every action users make while interacting with these tools can be carefully logged, as can the state of the software and the information it presents to the user, giving these actions context. This data provides a high-resolution lens through which dynamic cognitive and behavioral processes can be viewed. In this report, we describe new methods for the analysis and interpretation of such data, utilizing a novel implementation of the Beta Process Hidden Markov Model (BP-HMM) for analysis of software activity logs. We further report the results of a preliminary study designed to establish the validity of our modeling approach. A group of 20 participants were asked to play a simple computer game, instrumented to log every interaction with the interface. Participants had no previous experience with the game's functionality or rules, so the activity logs collected during their naïve interactions capture patterns of exploratory behavior and skill acquisition as they attempted to learn the rules of the game. Pre- and post-task questionnaires probed for self-reported styles of problem solving, as well as task engagement, difficulty, and workload. We jointly modeled the activity log sequences collected from all participants using the BP-HMM approach, identifying a global library of activity patterns representative of the collective behavior of all the participants. Analyses show systematic relationships between both pre- and post-task questionnaires, self-reported approaches to analytic problem solving, and metrics extracted from the BP-HMM decomposition. Overall, we find that this novel approach to decomposing unstructured behavioral data within software environments provides a sensible means for understanding how users learn to integrate software functionality for strategic task pursuit. 
PMID:26191026
Noury, N; Hadidi, T
2012-12-01
We propose a simulator of human activities collected with presence sensors in our experimental Health Smart Home "Habitat Intelligent pour la Sante (HIS)". We recorded 1492 days of data on several experimental HIS during the French national project "AILISA". On these real data, we built a mathematical model of the behavior of the data series, based on "Hidden Markov Models" (HMM). The model is then played on a computer to produce simulated data series with added flexibility to adjust the parameters in various scenarios. We also tested several methods to measure the similarity between our real and simulated data. Our simulator can produce a large database that can then be used to evaluate algorithms that raise an alarm in case of loss of autonomy. Copyright © 2012 Elsevier Ireland Ltd. All rights reserved.
NASA Astrophysics Data System (ADS)
Attaluri, Pavan K.; Chen, Zhengxin; Weerakoon, Aruna M.; Lu, Guoqing
Multiple criteria decision making (MCDM) has significant impact in bioinformatics. In the research reported here, we explore the integration of decision tree (DT) and Hidden Markov Model (HMM) for subtype prediction of human influenza A virus. Infection with influenza viruses continues to be an important public health problem. Viral strains of subtype H3N2 and H1N1 circulate in humans at least twice annually. The subtype detection depends mainly on the antigenic assay, which is time-consuming and not fully accurate. We have developed a Web system for accurate subtype detection of human influenza virus sequences. The preliminary experiment showed that this system is easy to use and powerful in identifying human influenza subtypes. Our next step is to examine the informative positions at the protein level and extend its current functionality to detect more subtypes. The web functions can be accessed at http://glee.ist.unomaha.edu/.
Hayakawa, Toru; Yoshida, Yuri; Yasui, Masanori; Ito, Toshiaki; Wakamatsu, Jun-ichi; Hattori, Akihito; Nishimura, Takanori
2015-08-01
The gelation of myosin has a very important role in meat products. We have already shown that myosin in low ionic strength solution containing L-histidine forms a transparent gel after heating. To clarify the mechanism of this unique gelation, we investigated the changes in the nature of myosin subfragments during heating in solutions with low and high ionic strengths with and without L-histidine. The hydrophobicity of myosin and heavy meromyosin (HMM) in low ionic strength solution containing L-histidine was lower than in high ionic strength solution. The SH contents of myosin and HMM in low ionic strength solution containing L-histidine did not change during the heating process, whereas in high ionic strength solution they decreased slightly. The heat-induced globular masses of HMM in low ionic strength solution containing L-histidine were smaller than those in high ionic strength solution. These findings suggested that the polymerization of HMM molecules by heating was suppressed in low ionic strength solution containing L-histidine, resulting in formation of the unique gel. © 2015 Institute of Food Technologists®
NASA Astrophysics Data System (ADS)
Kang, Seung-Ho; Lee, Sang-Hee; Chon, Tae-Soo
2012-02-01
In recent decades, the behavior of Caenorhabditis elegans (C. elegans) has been extensively studied to understand the respective roles of neural control and biomechanics. Thus far, however, only a few studies on the simulation modeling of C. elegans swimming behavior have been conducted because it is mathematically difficult to describe its complicated behavior. In this study, we built two hidden Markov models (HMMs), corresponding to the movements of C. elegans in a controlled environment with no chemical treatment and in a formaldehyde-treated environment (0.1 ppm), respectively. The movement was characterized by a series of shape patterns of the organism, taken every 0.25 s for 40 min. All shape patterns were quantified by branch length similarity (BLS) entropy and classified into seven patterns by using the self-organizing map (SOM) and the k-means clustering algorithm. The HMM coupled with the SOM was successful in accurately explaining the organism's behavior. In addition, we briefly discussed the possibility of using the HMM together with BLS entropy to develop bio-monitoring systems for real-time applications to determine water quality.
Comparison of RF spectrum prediction methods for dynamic spectrum access
NASA Astrophysics Data System (ADS)
Kovarskiy, Jacob A.; Martone, Anthony F.; Gallagher, Kyle A.; Sherbondy, Kelly D.; Narayanan, Ram M.
2017-05-01
Dynamic spectrum access (DSA) refers to the adaptive utilization of today's busy electromagnetic spectrum. Cognitive radio/radar technologies require DSA to intelligently transmit and receive information in changing environments. Predicting radio frequency (RF) activity reduces sensing time and energy consumption for identifying usable spectrum. Typical spectrum prediction methods involve modeling spectral statistics with Hidden Markov Models (HMM) or various neural network structures. HMMs describe the time-varying state probabilities of Markov processes as a dynamic Bayesian network. Neural Networks model biological brain neuron connections to perform a wide range of complex and often non-linear computations. This work compares HMM, Multilayer Perceptron (MLP), and Recurrent Neural Network (RNN) algorithms and their ability to perform RF channel state prediction. Monte Carlo simulations on both measured and simulated spectrum data evaluate the performance of these algorithms. Generalizing spectrum occupancy as an alternating renewal process allows Poisson random variables to generate simulated data while energy detection determines the occupancy state of measured RF spectrum data for testing. The results suggest that neural networks achieve better prediction accuracy and prove more adaptable to changing spectral statistics than HMMs given sufficient training data.
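The simulated-data step mentioned in this abstract, an alternating renewal process driven by Poisson random variables, can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' simulator: the function names, the mean durations, and the choice to add one slot to every Poisson draw (so that no period has zero length) are all hypothetical.

```python
import math
import random


def poisson_sample(mean, rng):
    """Draw one Poisson variate with Knuth's multiplication method (small means only)."""
    limit = math.exp(-mean)
    count = 0
    product = rng.random()
    while product > limit:
        count += 1
        product *= rng.random()
    return count


def simulate_occupancy(n_slots, mean_idle, mean_busy, rng):
    """Return a 0/1 occupancy sequence that alternates idle (0) and busy (1) periods.

    Assumption: each period lasts 1 + Poisson(mean) time slots.
    """
    states = []
    busy = False  # the channel starts idle in this sketch
    while len(states) < n_slots:
        mean = mean_busy if busy else mean_idle
        duration = 1 + poisson_sample(mean, rng)
        states.extend([int(busy)] * duration)
        busy = not busy
    return states[:n_slots]


rng = random.Random(0)  # fixed seed so a run is repeatable
occupancy = simulate_occupancy(200, mean_idle=6.0, mean_busy=3.0, rng=rng)
duty_cycle = sum(occupancy) / len(occupancy)
```

A sequence generated this way could serve as training or test input for any of the predictors compared in the abstract (HMM, MLP, RNN); those predictors are not reproduced here.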
Probabilistic Multi-Sensor Fusion Based Indoor Positioning System on a Mobile Device
He, Xiang; Aloi, Daniel N.; Li, Jia
2015-01-01
Nowadays, smart mobile devices include more and more sensors on board, such as motion sensors (accelerometer, gyroscope, magnetometer), wireless signal strength indicators (WiFi, Bluetooth, Zigbee), and visual sensors (LiDAR, camera). People have developed various indoor positioning techniques based on these sensors. In this paper, the probabilistic fusion of multiple sensors is investigated in a hidden Markov model (HMM) framework for mobile-device user-positioning. We propose a graph structure to store the model constructed by multiple sensors during the offline training phase, and a multimodal particle filter to seamlessly fuse the information during the online tracking phase. Based on our algorithm, we develop an indoor positioning system on the iOS platform. The experiments carried out in a typical indoor environment have shown promising results for our proposed algorithm and system design. PMID:26694387
Closed-loop control of a fragile network: application to seizure-like dynamics of an epilepsy model
Ehrens, Daniel; Sritharan, Duluxan; Sarma, Sridevi V.
2015-01-01
It has recently been proposed that the epileptic cortex is fragile in the sense that seizures manifest through small perturbations in the synaptic connections that render the entire cortical network unstable. Closed-loop therapy could therefore entail detecting when the network goes unstable, and then stimulating with an exogenous current to stabilize the network. In this study, a non-linear stochastic model of a neuronal network was used to simulate both seizure and non-seizure activity. In particular, synaptic weights between neurons were chosen such that the network's fixed point is stable during non-seizure periods, and a subset of these connections (the most fragile) were perturbed to make the same fixed point unstable to model seizure events. The model randomly transitions between these two modes. The goal of this study was to measure spike train observations from this epileptic network and then apply a feedback controller that (i) detects when the network goes unstable, and then (ii) applies a state-feedback gain control input to the network to stabilize it. The stability detector is based on a 2-state (stable, unstable) hidden Markov model (HMM) of the network, and detects the transition from the stable mode to the unstable mode using the firing rate of the most fragile node in the network (which is the output of the HMM). When the unstable mode is detected, a state-feedback gain is applied to generate a control input to the fragile node, bringing the network back to the stable mode. Finally, when the network is detected as stable again, the feedback control input is switched off. High performance was achieved for the stability detector, and feedback control suppressed seizures within 2 s after onset. PMID:25784851
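The detection step described above, a 2-state HMM that tracks whether the network is stable or unstable from the firing rate of one node, can be illustrated with a forward filter. The sketch below is a generic textbook filter and not the authors' detector: the Poisson emission model, the firing rates, the transition probability, the prior, and the 0.5 decision threshold are all assumptions chosen for illustration.

```python
import math


def poisson_pmf(count, rate):
    """Probability of observing `count` spikes in one bin at the given mean rate."""
    return math.exp(-rate) * rate ** count / math.factorial(count)


def filter_unstable_probability(counts, rates=(2.0, 10.0), stay=0.95, prior=(0.99, 0.01)):
    """Return P(unstable | counts seen so far) after each bin.

    State 0 is "stable", state 1 is "unstable". `rates`, `stay`, and `prior`
    are hypothetical values, not parameters reported in the abstract.
    """
    transition = ((stay, 1.0 - stay), (1.0 - stay, stay))
    belief = list(prior)
    history = []
    for count in counts:
        # Predict: push the current belief through the transition matrix.
        predicted = [
            sum(belief[i] * transition[i][j] for i in range(2)) for j in range(2)
        ]
        # Update: weight each state by how well it explains the observed count.
        weighted = [predicted[j] * poisson_pmf(count, rates[j]) for j in range(2)]
        total = sum(weighted)
        belief = [w / total for w in weighted]
        history.append(belief[1])
    return history


spike_counts = [2, 1, 3, 2, 11, 12, 9, 10]  # made-up counts: quiet, then a burst
p_unstable = filter_unstable_probability(spike_counts)
control_on = [p > 0.5 for p in p_unstable]  # when a controller might be switched on
```

With these made-up counts the filtered probability stays low during the first four bins and rises sharply once the burst starts, which is the behavior a closed-loop controller would key on.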
Semantic Indexing of Multimedia Content Using Visual, Audio, and Text Cues
NASA Astrophysics Data System (ADS)
Adams, W. H.; Iyengar, Giridharan; Lin, Ching-Yung; Naphade, Milind Ramesh; Neti, Chalapathy; Nock, Harriet J.; Smith, John R.
2003-12-01
We present a learning-based approach to the semantic indexing of multimedia content using cues derived from audio, visual, and text features. We approach the problem by developing a set of statistical models for a predefined lexicon. Novel concepts are then mapped in terms of the concepts in the lexicon. To achieve robust detection of concepts, we exploit features from multiple modalities, namely, audio, video, and text. Concept representations are modeled using Gaussian mixture models (GMM), hidden Markov models (HMM), and support vector machines (SVM). Models such as Bayesian networks and SVMs are used in a late-fusion approach to model concepts that are not explicitly modeled in terms of features. Our experiments indicate promise in the proposed classification and fusion methodologies: our proposed fusion scheme achieves more than 10% relative improvement over the best unimodal concept detector.
Family-Based Benchmarking of Copy Number Variation Detection Software.
Nutsua, Marcel Elie; Fischer, Annegret; Nebel, Almut; Hofmann, Sylvia; Schreiber, Stefan; Krawczak, Michael; Nothnagel, Michael
2015-01-01
The analysis of structural variants, in particular of copy-number variations (CNVs), has proven valuable in unraveling the genetic basis of human diseases. Hence, a large number of algorithms have been developed for the detection of CNVs in SNP array signal intensity data. Using the European and African HapMap trio data, we undertook a comparative evaluation of six commonly used CNV detection software tools, namely Affymetrix Power Tools (APT), QuantiSNP, PennCNV, GLAD, R-gada and VEGA, and assessed their level of pair-wise prediction concordance. The tool-specific CNV prediction accuracy was assessed in silico by way of intra-familial validation. Software tools differed greatly in terms of the number and length of the CNVs predicted as well as the number of markers included in a CNV. All software tools predicted substantially more deletions than duplications. Intra-familial validation revealed consistently low levels of prediction accuracy as measured by the proportion of validated CNVs (34-60%). Moreover, up to 20% of apparent family-based validations were found to be due to chance alone. Software using Hidden Markov models (HMM) showed a trend to predict fewer CNVs than segmentation-based algorithms albeit with greater validity. PennCNV yielded the highest prediction accuracy (60.9%). Finally, the pairwise concordance of CNV prediction was found to vary widely with the software tools involved. We recommend HMM-based software, in particular PennCNV, rather than segmentation-based algorithms when validity is the primary concern of CNV detection. QuantiSNP may be used as an additional tool to detect sets of CNVs not detectable by the other tools. Our study also reemphasizes the need for laboratory-based validation, such as qPCR, of CNVs predicted in silico.
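Intra-familial validation, as used in this abstract, amounts to asking whether a CNV called in a child is also called in at least one parent. The sketch below shows one way to compute the proportion of validated CNVs. It is an assumption-laden illustration: the abstract does not state the matching criterion, so the `CNV` record, the requirement of matching type, and the 50% overlap threshold are hypothetical choices.

```python
from typing import List, NamedTuple


class CNV(NamedTuple):
    """Hypothetical record for one predicted copy-number variant."""
    chrom: str
    start: int
    end: int
    kind: str  # "del" or "dup"


def same_event(child: CNV, parent: CNV, min_fraction: float = 0.5) -> bool:
    """True if the parental call covers at least `min_fraction` of the child call."""
    if child.chrom != parent.chrom or child.kind != parent.kind:
        return False
    shared = min(child.end, parent.end) - max(child.start, parent.start)
    if shared <= 0:
        return False
    return shared >= min_fraction * (child.end - child.start)


def validated_fraction(child_cnvs: List[CNV], parent_cnvs: List[CNV]) -> float:
    """Proportion of child CNVs matched by a CNV in either parent."""
    if not child_cnvs:
        return 0.0
    hits = sum(any(same_event(c, p) for p in parent_cnvs) for c in child_cnvs)
    return hits / len(child_cnvs)


child = [CNV("chr1", 100, 200, "del"), CNV("chr2", 500, 900, "dup"), CNV("chr3", 10, 50, "del")]
parents = [CNV("chr1", 120, 260, "del"), CNV("chr2", 850, 1000, "dup"), CNV("chr3", 10, 50, "dup")]
fraction = validated_fraction(child, parents)  # only the chr1 deletion is matched
```

As the abstract notes, some matches of this kind arise by chance alone, so a raw proportion like this one would need a chance correction before being read as accuracy.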
Structural features based genome-wide characterization and prediction of nucleosome organization
2012-01-01
Background Nucleosome distribution along chromatin dictates genomic DNA accessibility and thus profoundly influences gene expression. However, the underlying mechanism of nucleosome formation remains elusive. Here, taking a structural perspective, we systematically explored nucleosome formation potential of genomic sequences and the effect on chromatin organization and gene expression in S. cerevisiae. Results We analyzed twelve structural features related to flexibility, curvature and energy of DNA sequences. The results showed that some structural features such as DNA denaturation, DNA-bending stiffness, Stacking energy, Z-DNA, Propeller twist and free energy, were highly correlated with in vitro and in vivo nucleosome occupancy. Specifically, they can be classified into two classes, one positively and the other negatively correlated with nucleosome occupancy. These two kinds of structural features facilitated nucleosome binding in centromere regions and repressed nucleosome formation in the promoter regions of protein-coding genes to mediate transcriptional regulation. Based on these analyses, we integrated all twelve structural features in a model to predict more accurately nucleosome occupancy in vivo than the existing methods that mainly depend on sequence compositional features. Furthermore, we developed a novel approach, named DLaNe, that located nucleosomes by detecting peaks of structural profiles, and built a meta predictor to integrate information from different structural features. As a comparison, we also constructed a hidden Markov model (HMM) to locate nucleosomes based on the profiles of these structural features. The result showed that the meta DLaNe and HMM-based method performed better than the existing methods, demonstrating the power of these structural features in predicting nucleosome positions. 
Conclusions Our analysis revealed that DNA structures significantly contribute to nucleosome organization and influence chromatin structure and gene expression regulation. The results indicated that our proposed methods are effective in predicting nucleosome occupancy and positions and that these structural features are highly predictive of nucleosome organization. The implementation of our DLaNe method based on structural features is available online. PMID:22449207
NASA Astrophysics Data System (ADS)
Sidorovskaia, Natalia A.; Richard, Blake; Ioup, George E.; Ioup, Juliette W.
2005-09-01
The Littoral Acoustic Demonstration Center (LADC) made a series of passive broadband acoustic recordings in the Gulf of Mexico and Ligurian Sea to study noise and marine mammal phonations. The collected data contain a large number of sperm whale phonations of various types, such as isolated clicks and communication codas. It was previously reported that the spectrograms of the extracted clicks and codas contain well-defined null patterns that seem to be unique for individuals. The null pattern is formed due to individual features of the sound production organs of an animal. These observations motivated the present studies of adapting human speech identification techniques for deep-diving marine mammal phonations. A three-state trained hidden Markov model (HMM) was used with the phonation spectra of sperm whales. The HMM algorithm gave 75% accuracy in identifying individuals when it was initially tested on the acoustic data set correlated with visual observations of sperm whales. A comparison of the identification accuracy based on null-pattern similarity analysis and the HMM algorithm is presented. The results can establish the foundation for developing an acoustic identification database for sperm whales and possibly other deep-diving marine mammals that would be difficult to observe visually. [Research supported by ONR.]
EMG-based speech recognition using hidden markov models with global control variables.
Lee, Ki-Seung
2008-03-01
It is well known that a strong relationship exists between human voices and the movement of articulatory facial muscles. In this paper, we utilize this knowledge to implement an automatic speech recognition scheme which uses solely surface electromyogram (EMG) signals. The sequence of EMG signals for each word is modelled by a hidden Markov model (HMM) framework. The main objective of the work involves building a model for state observation density when multichannel observation sequences are given. The proposed model reflects the dependencies between each of the EMG signals, which are described by introducing a global control variable. We also develop an efficient model training method, based on a maximum likelihood criterion. In a preliminary study, 60 isolated words were used as recognition variables. EMG signals were acquired from three articulatory facial muscles. The findings indicate that such a system may have the capacity to recognize speech signals with an accuracy of up to 87.07%, which is superior to the independent probabilistic model.
Three Dimensional Object Recognition Using a Complex Autoregressive Model
1993-12-01
3.4.2 Template Matching Algorithm; 3.4.3 K-Nearest-Neighbor (KNN) Techniques; 3.4.4 Hidden Markov Model ... Neighbor (KNN) Test Results; 4.2.1 Single-Look 1-NN Testing; 4.2.2 Multiple-Look 1-NN Testing ... 4.2.3 Discussion of KNN Test Results; 4.3 Hidden Markov Model (HMM) Test Results
Performance testing and results of the first Etec CORE-2564
NASA Astrophysics Data System (ADS)
Franks, C. Edward; Shikata, Asao; Baker, Catherine A.
1993-03-01
In order to be able to write 64 megabit DRAM reticles, to prepare to write 256 megabit DRAM reticles and in general to meet the current and next generation mask and reticle quality requirements, Hoya Micro Mask (HMM) installed in 1991 the first CORE-2564 Laser Reticle Writer from Etec Systems, Inc. The system was delivered as a CORE-2500XP and was subsequently upgraded to a 2564. The CORE (Custom Optical Reticle Engraver) system produces photomasks with an exposure strategy similar to that employed by an electron beam system, but it uses a laser beam to deliver the photoresist exposure energy. Since then the 2564 has been tested by Etec's standard Acceptance Test Procedure and by several supplementary HMM techniques to ensure performance to all the Etec advertised specifications and certain additional HMM requirements that were more demanding and/or more thorough than the advertised specifications. The primary purpose of the HMM tests was to more closely duplicate mask usage. The performance aspects covered by the tests include registration accuracy and repeatability; linewidth accuracy, uniformity and linearity; stripe butting; stripe and scan linearity; edge quality; system cleanliness; minimum geometry resolution; minimum address size; and plate loading accuracy and repeatability.
NASA Astrophysics Data System (ADS)
Yuan, Y.; Meng, Y.; Chen, Y. X.; Jiang, C.; Yue, A. Z.
2018-04-01
In this study, we proposed a method to map urban encroachment onto farmland using satellite image time series (SITS) based on the hierarchical hidden Markov model (HHMM). In this method, the farmland change process is decomposed into three hierarchical levels, i.e., the land cover level, the vegetation phenology level, and the SITS level. Then a three-level HHMM is constructed to model the multi-level semantic structure of farmland change process. Once the HHMM is established, a change from farmland to built-up could be detected by inferring the underlying state sequence that is most likely to generate the input time series. The performance of the method is evaluated on MODIS time series in Beijing. Results on both simulated and real datasets demonstrate that our method improves the change detection accuracy compared with the HMM-based method.
Fast and accurate imputation of summary statistics enhances evidence of functional enrichment.
Pasaniuc, Bogdan; Zaitlen, Noah; Shi, Huwenbo; Bhatia, Gaurav; Gusev, Alexander; Pickrell, Joseph; Hirschhorn, Joel; Strachan, David P; Patterson, Nick; Price, Alkes L
2014-10-15
Imputation using external reference panels (e.g. 1000 Genomes) is a widely used approach for increasing power in genome-wide association studies and meta-analysis. Existing hidden Markov models (HMM)-based imputation approaches require individual-level genotypes. Here, we develop a new method for Gaussian imputation from summary association statistics, a type of data that is becoming widely available. In simulations using 1000 Genomes (1000G) data, this method recovers 84% (54%) of the effective sample size for common (>5%) and low-frequency (1-5%) variants [increasing to 87% (60%) when summary linkage disequilibrium information is available from target samples] versus the gold standard of 89% (67%) for HMM-based imputation, which cannot be applied to summary statistics. Our approach accounts for the limited sample size of the reference panel, a crucial step to eliminate false-positive associations, and it is computationally very fast. As an empirical demonstration, we apply our method to seven case-control phenotypes from the Wellcome Trust Case Control Consortium (WTCCC) data and a study of height in the British 1958 birth cohort (1958BC). Gaussian imputation from summary statistics recovers 95% (105%) of the effective sample size (as quantified by the ratio of [Formula: see text] association statistics) compared with HMM-based imputation from individual-level genotypes at the 227 (176) published single nucleotide polymorphisms (SNPs) in the WTCCC (1958BC height) data. In addition, for publicly available summary statistics from large meta-analyses of four lipid traits, we publicly release imputed summary statistics at 1000G SNPs, which could not have been obtained using previously published methods, and demonstrate their accuracy by masking subsets of the data. 
We show that 1000G imputation using our approach increases the magnitude and statistical evidence of enrichment at genic versus non-genic loci for these traits, as compared with an analysis without 1000G imputation. Thus, imputation of summary statistics will be a valuable tool in future functional enrichment analyses. Availability: a software package is publicly available at http://bogdan.bioinformatics.ucla.edu/software/. Contact: bpasaniuc@mednet.ucla.edu or aprice@hsph.harvard.edu. Supplementary materials are available at Bioinformatics online. © The Author 2014. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
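The idea of Gaussian imputation from summary statistics can be made concrete with the standard conditional-Gaussian identity. The block below is a sketch under stated assumptions rather than the authors' exact estimator: it assumes that association z-scores at typed SNPs ($t$) and untyped SNPs ($i$) are jointly Gaussian with mean zero under the null and covariance given by the linkage disequilibrium matrix $\Sigma$ estimated from a reference panel. The ridge term $\lambda I$ is one common way to account for the limited reference sample size; its use and value here are assumptions.

```latex
\begin{aligned}
\begin{pmatrix} z_i \\ z_t \end{pmatrix}
  &\sim \mathcal{N}\!\left( 0,\;
  \begin{pmatrix} \Sigma_{i,i} & \Sigma_{i,t} \\ \Sigma_{t,i} & \Sigma_{t,t} \end{pmatrix} \right), \\
\mathbb{E}\left[ z_i \mid z_t \right]
  &= \Sigma_{i,t} \left( \Sigma_{t,t} + \lambda I \right)^{-1} z_t, \\
\operatorname{Var}\left[ z_i \mid z_t \right]
  &= \Sigma_{i,i} - \Sigma_{i,t} \left( \Sigma_{t,t} + \lambda I \right)^{-1} \Sigma_{t,i}
  \qquad (\text{exact for } \lambda = 0).
\end{aligned}
```

Only summary z-scores and reference-panel LD enter these expressions, which is why an approach of this kind needs no individual-level genotypes.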
Kakai, Rose; Menya, Diana; Odero, Wilson
2009-08-30
Home management of malaria (HMM) has been shown to be an effective strategy for reducing childhood mortality from malaria. The direct and especially indirect costs of seeking health care from formal facilities may be substantial, providing a major barrier for many households. Further evaluations of HMM and community-based utilization of available options will help to optimize treatment strategies and maximize health benefits. The purpose of this study was to determine the effect of education, occupation, and family income on the choice of health care options for malaria. This was a cross-sectional, community-based study conducted between November 2007 and December 2007, using quantitative data collection methods. Mothers of children aged younger than five years were interviewed using a questionnaire to elicit responses on the mothers' level of education, occupation, income and malaria health care options. A total of 240 mothers of children aged younger than 5 years were interviewed between November and December, 2007. There was a direct relationship between formal education and occupation. The mean monthly family income was highest among those employed (KSh. 14,421) followed by businesswomen (KSh. 3,106) and farmers (KSh. 1,827) respectively (p<0.01). Those employed were more likely to take their ill children to a health facility (p = 0.05) or choose an antimalarial drug for home treatment. Supporting formal education may scale up the income of family health care providers and improve the quality of HMM among children living in rural communities.
2010-01-01
Background Protein-protein interaction (PPI) plays essential roles in cellular functions. The cost, time and other limitations associated with the current experimental methods have motivated the development of computational methods for predicting PPIs. As protein interactions generally occur via domains instead of the whole molecules, predicting domain-domain interaction (DDI) is an important step toward PPI prediction. Computational methods developed so far have utilized information from various sources at different levels, from primary sequences, to molecular structures, to evolutionary profiles. Results In this paper, we propose a computational method to predict DDI using support vector machines (SVMs), based on domains represented as interaction profile hidden Markov models (ipHMM) where interacting residues in domains are explicitly modeled according to the three dimensional structural information available at the Protein Data Bank (PDB). Features about the domains are extracted first as the Fisher scores derived from the ipHMM and then selected using singular value decomposition (SVD). Domain pairs are represented by concatenating their selected feature vectors, and classified by a support vector machine trained on these feature vectors. The method is tested by leave-one-out cross validation experiments with a set of interacting protein pairs adopted from the 3DID database. The prediction accuracy has shown significant improvement as compared to InterPreTS (Interaction Prediction through Tertiary Structure), an existing method for PPI prediction that also uses the sequences and complexes of known 3D structure. Conclusions We show that domain-domain interaction prediction can be significantly enhanced by exploiting information inherent in the domain profiles via feature selection based on Fisher scores, singular value decomposition and supervised learning based on support vector machines. 
Datasets and source code are freely available on the web at http://liao.cis.udel.edu/pub/svdsvm. Implemented in Matlab and supported on Linux and MS Windows. PMID:21034480
Algorithms for Hidden Markov Models Restricted to Occurrences of Regular Expressions
Tataru, Paula; Sand, Andreas; Hobolth, Asger; Mailund, Thomas; Pedersen, Christian N. S.
2013-01-01
Hidden Markov Models (HMMs) are widely used probabilistic models, particularly for annotating sequential data with an underlying hidden structure. Patterns in the annotation are often more relevant to study than the hidden structure itself. A typical HMM analysis consists of annotating the observed data using a decoding algorithm and analyzing the annotation to study patterns of interest. For example, given an HMM modeling genes in DNA sequences, the focus is on occurrences of genes in the annotation. In this paper, we define a pattern through a regular expression and present a restriction of three classical algorithms to take the number of occurrences of the pattern in the hidden sequence into account. We present a new algorithm to compute the distribution of the number of pattern occurrences, and we extend the two most widely used existing decoding algorithms to employ information from this distribution. We show experimentally that the expectation of the distribution of the number of pattern occurrences gives a highly accurate estimate, while the typical procedure can be biased in the sense that the identified number of pattern occurrences does not correspond to the true number. We furthermore show that using this distribution in the decoding algorithms improves the predictive power of the model. PMID:24833225
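The "typical procedure" mentioned above (decode first, then count pattern occurrences in the annotation) can be sketched as follows. This is a minimal illustration, not the authors' restricted algorithms: the two-state toy model, its probabilities, the state labels "N" and "G", and the pattern `G+` are all assumptions chosen for the example.

```python
import math
import re

def viterbi(obs, states, start_p, trans_p, emit_p):
    """Return the most likely hidden state path for obs (log-space Viterbi)."""
    # best[s] = log-probability of the best path ending in state s so far
    best = {s: math.log(start_p[s]) + math.log(emit_p[s][obs[0]]) for s in states}
    back = []
    for o in obs[1:]:
        new_best, pointers = {}, {}
        for s in states:
            prev = max(states, key=lambda r: best[r] + math.log(trans_p[r][s]))
            new_best[s] = best[prev] + math.log(trans_p[prev][s]) + math.log(emit_p[s][o])
            pointers[s] = prev
        best = new_best
        back.append(pointers)
    # Trace back from the best final state
    last = max(states, key=lambda s: best[s])
    path = [last]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    return "".join(reversed(path))

# Hypothetical toy model: "N" = non-gene, "G" = gene; symbols "a" and "b"
states = "NG"
start_p = {"N": 0.5, "G": 0.5}
trans_p = {"N": {"N": 0.8, "G": 0.2}, "G": {"N": 0.2, "G": 0.8}}
emit_p = {"N": {"a": 0.9, "b": 0.1}, "G": {"a": 0.1, "b": 0.9}}

path = viterbi("aabbbaabb", states, start_p, trans_p, emit_p)
# A pattern over the hidden sequence, expressed as a regular expression:
# here, one occurrence is a maximal run of "G" states
count = len(re.findall(r"G+", path))
print(path, count)
```

Counting on a single decoded path is exactly the step the abstract says can be biased; the paper's contribution is to work with the full distribution of the count instead.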
Hidden Markov model approach for identifying the modular framework of the protein backbone.
Camproux, A C; Tuffery, P; Chevrolat, J P; Boisvieux, J F; Hazout, S
1999-12-01
The hidden Markov model (HMM) was used to identify recurrent short 3D structural building blocks (SBBs) describing protein backbones, independently of any a priori knowledge. Polypeptide chains are decomposed into a series of short segments defined by their inter-alpha-carbon distances. Basically, the model takes into account the sequentiality of the observed segments and assumes that each one corresponds to one of several possible SBBs. Fitting the model to a database of non-redundant proteins allowed us to decode proteins in terms of 12 distinct SBBs with different roles in protein structure. Some SBBs correspond to classical regular secondary structures. Others correspond to a significant subdivision of their bounding regions previously considered to be a single pattern. The major contribution of the HMM is that this model implicitly takes into account the sequential connections between SBBs and thus describes the most probable pathways by which the blocks are connected to form the framework of the protein structures. Validation of the SBBs code was performed by extracting SBB series repeated in recoding proteins and examining their structural similarities. Preliminary results on the sequence specificity of SBBs suggest promising perspectives for the prediction of SBBs or series of SBBs from the protein sequences.
COACH: profile-profile alignment of protein families using hidden Markov models.
Edgar, Robert C; Sjölander, Kimmen
2004-05-22
Alignments of two multiple-sequence alignments, or statistical models of such alignments (profiles), have important applications in computational biology. The increased amount of information in a profile versus a single sequence can lead to more accurate alignments and more sensitive homolog detection in database searches. Several profile-profile alignment methods have been proposed and have been shown to improve sensitivity and alignment quality compared with sequence-sequence methods (such as BLAST) and profile-sequence methods (e.g. PSI-BLAST). Here we present a new approach to profile-profile alignment we call Comparison of Alignments by Constructing Hidden Markov Models (HMMs) (COACH). COACH aligns two multiple sequence alignments by constructing a profile HMM from one alignment and aligning the other to that HMM. We compare the alignment accuracy of COACH with two recently published methods: Yona and Levitt's prof_sim and Sadreyev and Grishin's COMPASS. On two sets of reference alignments selected from the FSSP database, we find that COACH is able, on average, to produce alignments giving the best coverage or the fewest errors, depending on the chosen parameter settings. COACH is freely available from www.drive5.com/lobster
A proposed OB-fold with a protein-interaction surface in Candida albicans telomerase protein Est3
Yu, Eun Young; Wang, Feng; Lei, Ming; Lue, Neal F
2008-01-01
Ever shorter telomeres 3 (Est3) is an essential telomerase regulatory subunit thought to be unique to budding yeasts. Here we use multiple sequence alignment and hidden Markov model–hidden Markov model (HMM-HMM) comparison to uncover potential similarities between Est3 and the mammalian telomeric protein Tpp1. Analysis of site-specific mutants of Candida albicans Est3 revealed functional distinctions between residues that are conserved between Est3 and Tpp1 and those that are unique to Est3. Although both types of residues are important for telomere maintenance in vivo, only the former contributes to telomerase activity in vitro and facilitates the association of Est3 with telomerase core components. Consistent with a function in protein-protein interaction, the residues common to Est3 and Tpp1 map to one face of an OB-fold model structure, away from the canonical nucleic acid binding surface. We propose that Est3 and the OB-fold domain of Tpp1 mediate a conserved function in telomerase regulation. PMID:19172753
Naive scoring of human sleep based on a hidden Markov model of the electroencephalogram.
Yaghouby, Farid; Modur, Pradeep; Sunderam, Sridhar
2014-01-01
Clinical sleep scoring involves tedious visual review of overnight polysomnograms by a human expert. Many attempts have been made to automate the process by training computer algorithms such as support vector machines and hidden Markov models (HMMs) to replicate human scoring. Such supervised classifiers are typically trained on scored data and then validated on scored out-of-sample data. Here we describe a methodology based on HMMs for scoring an overnight sleep recording without the benefit of a trained initial model. The number of states in the data is not known a priori and is optimized using a Bayes information criterion. When tested on a 22-subject database, this unsupervised classifier agreed well with human scores (mean of Cohen's kappa > 0.7). The HMM also outperformed other unsupervised classifiers (Gaussian mixture models, k-means, and linkage trees), that are capable of naive classification but do not model dynamics, by a significant margin (p < 0.05).
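Selecting the number of states with a Bayes information criterion can be sketched as below. The parameter count assumes diagonal-covariance Gaussian emissions, and the log-likelihood values are hypothetical placeholders rather than results from the study; in practice they would come from fitting an HMM with each candidate number of states.

```python
import math

def hmm_free_params(n_states, n_features):
    """Free parameters of an HMM with diagonal Gaussian emissions (an assumption)."""
    transitions = n_states * (n_states - 1)   # each row of the transition matrix sums to 1
    initial = n_states - 1                    # initial distribution sums to 1
    emissions = 2 * n_states * n_features     # one mean and one variance per state and feature
    return transitions + initial + emissions

def bic(log_likelihood, n_params, n_samples):
    """Bayes information criterion; lower is better."""
    return -2.0 * log_likelihood + n_params * math.log(n_samples)

def select_n_states(log_likelihoods, n_features, n_samples):
    """log_likelihoods maps a candidate state count to its fitted log-likelihood."""
    scores = {
        k: bic(ll, hmm_free_params(k, n_features), n_samples)
        for k, ll in log_likelihoods.items()
    }
    return min(scores, key=scores.get), scores

# Hypothetical log-likelihoods for 2 to 5 states on 1000 epochs of 2 features
candidate_loglik = {2: -5200.0, 3: -5000.0, 4: -4950.0, 5: -4940.0}
best_k, scores = select_n_states(candidate_loglik, n_features=2, n_samples=1000)
print(best_k)
```

The penalty term grows with the number of states, so a larger model is chosen only when its likelihood gain outweighs the extra parameters.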
Rafii-Tari, Hedyeh; Liu, Jindong; Payne, Christopher J; Bicknell, Colin; Yang, Guang-Zhong
2014-01-01
Despite increased use of remote-controlled steerable catheter navigation systems for endovascular intervention, most current designs are based on master configurations which tend to alter natural operator tool interactions. This introduces problems to both ergonomics and shared human-robot control. This paper proposes a novel cooperative robotic catheterization system based on learning-from-demonstration. By encoding the higher-level structure of a catheterization task as a sequence of primitive motions, we demonstrate how to achieve prospective learning for complex tasks whilst incorporating subject-specific variations. A hierarchical Hidden Markov Model is used to model each movement primitive as well as their sequential relationship. This model is applied to generation of motion sequences, recognition of operator input, and prediction of future movements for the robot. The framework is validated by comparing catheter tip motions against the manual approach, showing significant improvements in the quality of catheterization. The results motivate the design of collaborative robotic systems that are intuitive to use, while reducing the cognitive workload of the operator.
Soft context clustering for F0 modeling in HMM-based speech synthesis
NASA Astrophysics Data System (ADS)
Khorram, Soheil; Sameti, Hossein; King, Simon
2015-12-01
This paper proposes the use of a new binary decision tree, which we call a soft decision tree, to improve generalization performance compared to the conventional `hard' decision tree method that is used to cluster context-dependent model parameters in statistical parametric speech synthesis. We apply the method to improve the modeling of fundamental frequency, which is an important factor in synthesizing natural-sounding high-quality speech. Conventionally, hard decision tree-clustered hidden Markov models (HMMs) are used, in which each model parameter is assigned to a single leaf node. However, this `divide-and-conquer' approach leads to data sparsity, with the consequence that it suffers from poor generalization, meaning that it is unable to accurately predict parameters for models of unseen contexts: the hard decision tree is a weak function approximator. To alleviate this, we propose the soft decision tree, which is a binary decision tree with soft decisions at the internal nodes. In this soft clustering method, internal nodes select both their children with certain membership degrees; therefore, each node can be viewed as a fuzzy set with a context-dependent membership function. The soft decision tree improves model generalization and provides a superior function approximator because it is able to assign each context to several overlapped leaves. In order to use such a soft decision tree to predict the parameters of the HMM output probability distribution, we derive the smoothest (maximum entropy) distribution which captures all partial first-order moments and a global second-order moment of the training samples. Employing such a soft decision tree architecture with maximum entropy distributions, a novel speech synthesis system is trained using maximum likelihood (ML) parameter re-estimation and synthesis is achieved via maximum output probability parameter generation. 
In addition, a soft decision tree construction algorithm optimizing a log-likelihood measure is developed. Both subjective and objective evaluations were conducted and indicate a considerable improvement over the conventional method.
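The idea of internal nodes selecting both children with membership degrees can be sketched as follows. This is an illustrative simplification, not the paper's model: the sigmoid membership function, the tuple-based tree encoding, and the weighted-average prediction are assumptions made for the example (the paper derives a maximum entropy distribution instead).

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

# A tree is either ("leaf", value) or
# ("node", feature_index, threshold, sharpness, left_subtree, right_subtree).

def leaf_memberships(tree, context, weight=1.0):
    """Yield (leaf_value, membership) pairs; the memberships sum to `weight`."""
    if tree[0] == "leaf":
        yield tree[1], weight
        return
    _, feat, thresh, sharp, left, right = tree
    # Soft decision: both children receive a share of the membership
    go_right = sigmoid(sharp * (context[feat] - thresh))
    yield from leaf_memberships(left, context, weight * (1.0 - go_right))
    yield from leaf_memberships(right, context, weight * go_right)

def soft_predict(tree, context):
    """Membership-weighted average of the leaf values."""
    return sum(value * w for value, w in leaf_memberships(tree, context))

# Hypothetical tree predicting an F0 value (Hz) from two context features
tree = ("node", 0, 0.5, 4.0,
        ("leaf", 100.0),
        ("node", 1, 0.0, 2.0, ("leaf", 150.0), ("leaf", 200.0)))

memberships = list(leaf_memberships(tree, [0.9, -0.3]))
print(memberships, soft_predict(tree, [0.9, -0.3]))
```

A hard decision tree is the limiting case in which the sharpness grows without bound, so that each context falls into exactly one leaf.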
NASA Astrophysics Data System (ADS)
Lim, Mikyung; Song, Jaeman; Kim, Jihoon; Lee, Seung S.; Lee, Ikjin; Lee, Bong Jae
2018-05-01
The present work successfully achieves a strong enhancement in performance of a near-field thermophotovoltaic (TPV) system operating at low temperature and large-vacuum-gap width by introducing a hyperbolic-metamaterial (HMM) emitter, multilayered graphene, and an Au-backside reflector. Design variables for the HMM emitter and the multilayered-graphene-covered TPV cell are optimized for maximizing the power output of the near-field TPV system with the genetic algorithm. The near-field TPV system with the optimized configuration results in 24.2 times of enhancement in power output compared with that of the system with a bulk emitter and a bare TPV cell. Through the analysis of the radiative heat transfer together with surface-plasmon-polariton (SPP) dispersion curves, it is found that coupling of SPPs generated from both the HMM emitter and the multilayered-graphene-covered TPV cell plays a key role in a substantial increase in the heat transfer even at a 200-nm vacuum gap. Further, the backside reflector at the bottom of the TPV cell significantly increases not only the conversion efficiency, but also the power output by generating additional polariton modes which can be readily coupled with the existing SPPs of the HMM emitter and the multilayered-graphene-covered TPV cell.
Sarment: Python modules for HMM analysis and partitioning of sequences.
Guéguen, Laurent
2005-08-15
Sarment is a package of Python modules for easy building and manipulation of sequence segmentations. It provides efficient implementations of the usual algorithms for hidden Markov model computation, as well as for maximal predictive partitioning. Owing to its very large variety of criteria for computing segmentations, Sarment can handle many kinds of models. Because of object-oriented programming, the results of the segmentation are very easy to manipulate.
Polur, Prasad D; Miller, Gerald E
2006-10-01
Computer speech recognition of individuals with dysarthria, such as cerebral palsy patients, requires a robust technique that can handle conditions of very high variability and limited training data. In this study, application of a 10-state ergodic hidden Markov model (HMM)/artificial neural network (ANN) hybrid structure for a dysarthric speech (isolated word) recognition system, intended to act as an assistive tool, was investigated. A small vocabulary spoken by three cerebral palsy subjects was chosen. The effect of such a structure on the recognition rate of the system was investigated by comparing it with an ergodic hidden Markov model as a control tool. This was done in order to determine whether this modified technique contributed to enhanced recognition of dysarthric speech. The speech was sampled at 11 kHz. Mel frequency cepstral coefficients were extracted from the speech using 15 ms frames and served as training input to the hybrid model setup. The subsequent results demonstrated that the hybrid model structure was quite robust in its ability to handle the large variability and non-conformity of dysarthric speech. The level of variability in input dysarthric speech patterns sometimes limits the reliability of the system. However, its application as a rehabilitation/control tool to assist dysarthric motor impaired individuals holds sufficient promise.
Taghvaei, Sajjad; Jahanandish, Mohammad Hasan; Kosuge, Kazuhiro
2017-01-01
Population aging requires providing the elderly with safe and dependable assistive technologies for daily life activities. Improving fall detection algorithms can play a major role in achieving this goal. This article proposes a real-time fall prediction algorithm based on visual data acquired from a depth sensor for a user of a walking assistive system. In the absence of a coupled dynamic model of the human and the assistive walker, a hybrid "system identification-machine learning" approach is used. An autoregressive-moving-average (ARMA) model is fitted on the time-series walking data to forecast the upcoming states, and a hidden Markov model (HMM) based classifier is built on top of the ARMA model to predict falling in the upcoming time frames. The performance of the algorithm is evaluated through experiments with four subjects, including an experienced physiotherapist, using a walker robot in five different falling scenarios: fall forward, fall down, fall back, fall left, and fall right. The algorithm successfully predicts the fall with a rate of 84.72%.
HMM for hyperspectral spectrum representation and classification with endmember entropy vectors
NASA Astrophysics Data System (ADS)
Arabi, Samir Y. W.; Fernandes, David; Pizarro, Marco A.
2015-10-01
Hyperspectral images, owing to their good spectral resolution, are extensively used for classification, but their large number of bands requires higher bandwidth for data transmission, greater data storage capacity and greater computational capability in processing systems. This work presents a new methodology for hyperspectral data classification that can work with a reduced number of spectral bands and achieve good results, comparable with processing methods that require all hyperspectral bands. The proposed method for hyperspectral spectra classification is based on the Hidden Markov Model (HMM) associated with each endmember (EM) of a scene and the conditional probabilities that each EM belongs to each other EM. The EM conditional probabilities are transformed into EM entropy vectors, and those vectors are used as reference vectors for the classes in the scene. The conditional probabilities of a spectrum to be classified are likewise transformed into a spectrum entropy vector, which is assigned to a class by the minimum Euclidean distance (ED) between it and the EM entropy vectors. The methodology was tested with good results using AVIRIS spectra of a scene with 13 EMs, considering the full 209 bands and reduced sets of 128, 64 and 32 spectral bands. For the test area, it is shown that only 32 spectral bands can be used instead of the original 209 bands without significant loss in the classification process.
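The final classification step (nearest reference vector by Euclidean distance) might look like the sketch below. The abstract does not specify how probabilities are turned into entropy vectors, so the elementwise transform -p log p used here is an assumption, as are the class names and probability values.

```python
import math

def entropy_vector(probs):
    """Elementwise entropy terms -p*log(p); an assumed form of the transform."""
    return [-p * math.log(p) if p > 0 else 0.0 for p in probs]

def euclidean(u, v):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def classify(probs, reference_vectors):
    """Assign the class whose reference entropy vector is nearest."""
    vec = entropy_vector(probs)
    return min(reference_vectors, key=lambda name: euclidean(vec, reference_vectors[name]))

# Hypothetical conditional probabilities of each endmember under three EM models
references = {
    "soil": entropy_vector([0.8, 0.1, 0.1]),
    "water": entropy_vector([0.1, 0.8, 0.1]),
}
label = classify([0.7, 0.2, 0.1], references)
print(label)
```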
Kang, J O; Ito, T; Fukazawa, T
1983-01-01
The effect of frozen storage on the biochemical properties of myofibrils, and of their major constituents, actin and myosin, was investigated. Extractability of myofibrillar proteins increased slightly for 3 weeks during frozen storage of muscle, decreasing thereafter. The change in myofibrillar ATPase activity during frozen storage was consistent with that of a reconstituted acto-heavy meromyosin (HMM) complex prepared from frozen stored muscle at the same weight ratio of actin to myosin as in situ. However, myosin ATPase activity showed a different pattern of change when compared with myofibrillar ATPase activity. The maximum velocity of acto-HMM ATPase activity and the apparent dissociation constant of the acto-HMM complex decreased for 1 week during frozen storage, increasing thereafter, indicating that the affinity of actin for myosin was greatest in muscle which had been frozen for 1 week. Copyright © 1983. Published by Elsevier Ltd.
Tiono, Alfred B; Kaboré, Youssouf; Traoré, Abdoulaye; Convelbo, Nathalie; Pagnoni, Franco; Sirima, Sodiomon B
2008-10-03
Home Management of Malaria (HMM) is one of the key strategies to reduce the burden of malaria for vulnerable populations in endemic countries. It is based on the evidence that well-trained community health workers can provide prompt and adequate care to patients close to their homes. The strategy has been shown to reduce malaria mortality and severe morbidity and has been adopted by the World Health Organization as a cornerstone of malaria control in Africa. However, the potential fall-out of this community-based strategy on the work burden at the peripheral health facility level has never been investigated. A two-arm interventional study was conducted in a rural health district of Burkina Faso. The HMM strategy was implemented in the catchment areas of seven community clinics (intervention arm). For the other seven community clinics in the control arm, no HMM intervention was implemented. In each of the study arms, presumptive treatment was provided for episodes of fever/malaria (defined operationally as malaria). The study drug was artemether-lumefantrine, which was sold at a subsidized price by community health workers/key opinion leaders at the community level and by the pharmacists at the health facility level. The outcome measured was the proportion of malaria cases among all health facility attendance (all-cause diseases) in both arms throughout the high transmission season. A total of 7,621 children were enrolled in the intervention arm and 7,605 in the control arm. During the study period, the proportions of malaria cases among all health facility attendance (all-cause diseases) were 21.0% (445/2,111, 95% CI 19.3%-22.7%) and 70.7% (2,595/3,671, 95% CI 68.5%-71.5%) in the intervention and control arms, respectively (p < 0.0001). The relative risk ratio for a fever/malaria episode to be treated at the health facility (HF) level was 30% (0.30 < RR < 0.32). The number of malaria episodes treated in the intervention arm was much higher than in the control arm (6,661 vs. 2,595), with malaria accounting for 87.4% of all disease episodes recorded in the intervention area and for 34.1% in the control area (P < 0.0001). Of all the malaria cases treated in the intervention arm, only 6.7% were treated at the health facility level. These findings suggest that implementation of HMM, by reducing the workload in health facilities, might contribute to an overall increase in the performance of the peripheral health facilities.
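The health-facility proportions and their ratio can be recomputed from the counts quoted in the abstract (445/2,111 and 2,595/3,671). The sketch below uses a simple normal-approximation interval, which is an assumption; the abstract does not say which interval method the authors used.

```python
import math

def proportion_ci(cases, total, z=1.96):
    """Proportion with a normal-approximation (Wald) 95% interval."""
    p = cases / total
    half_width = z * math.sqrt(p * (1.0 - p) / total)
    return p, p - half_width, p + half_width

# Counts as quoted in the abstract
p_int, lo_int, hi_int = proportion_ci(445, 2111)    # intervention arm
p_ctl, lo_ctl, hi_ctl = proportion_ci(2595, 3671)   # control arm

# Ratio of the two proportions (intervention relative to control)
risk_ratio = p_int / p_ctl

print(f"intervention: {p_int:.3f} ({lo_int:.3f} to {hi_int:.3f})")
print(f"control:      {p_ctl:.3f} ({lo_ctl:.3f} to {hi_ctl:.3f})")
print(f"risk ratio:   {risk_ratio:.3f}")
```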
2013-03-01
framework of orientation distribution functions and crack-induced texture o Quantify effects of temperature on damage behavior and damage monitoring...measurement model was obtained from hidden Markov modeling (HMM) of joint time-frequency (TF) features extracted from the PZT sensor signals using the...considered PZT sensor signals recorded from a bolted aluminum plate. About only 20% of the samples of a signal were first randomly selected as
Protein classification based on text document classification techniques.
Cheng, Betty Yee Man; Carbonell, Jaime G; Klein-Seetharaman, Judith
2005-03-01
The need for accurate, automated protein classification methods continues to increase as advances in biotechnology uncover new proteins. G-protein coupled receptors (GPCRs) are a particularly difficult superfamily of proteins to classify due to extreme diversity among its members. Previous comparisons of BLAST, k-nearest neighbor (k-NN), hidden Markov model (HMM) and support vector machine (SVM) using alignment-based features have suggested that classifiers at the complexity of SVM are needed to attain high accuracy. Here, analogous to document classification, we applied Decision Tree and Naive Bayes classifiers with chi-square feature selection on counts of n-grams (i.e. short peptide sequences of length n) to this classification task. Using the GPCR dataset and evaluation protocol from the previous study, the Naive Bayes classifier attained accuracies of 93.0% and 92.4% in level I and level II subfamily classification, respectively, while SVM has reported accuracies of 88.4% and 86.3%. This is a 39.7% and 44.5% reduction in residual error for level I and level II subfamily classification, respectively. The Decision Tree, while inferior to SVM, outperforms HMM in both level I and level II subfamily classification. For those GPCR families whose profiles are stored in the Protein FAMilies database of alignments and HMMs (PFAM), our method performs comparably to a search against those profiles. Finally, our method can be generalized to other protein families, as shown by applying it to the superfamily of nuclear receptors with 94.5%, 97.8% and 93.6% accuracy in family, level I and level II subfamily classification, respectively. Copyright 2005 Wiley-Liss, Inc.
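Two pieces of this abstract lend themselves to a short sketch: counting n-grams in a peptide sequence, and the residual error reduction derived from the reported accuracies. The function names and the example sequence are illustrative assumptions; the accuracy figures are the ones quoted in the abstract.

```python
from collections import Counter

def ngram_counts(sequence, n):
    """Count overlapping n-grams (short peptide substrings of length n)."""
    return Counter(sequence[i:i + n] for i in range(len(sequence) - n + 1))

def residual_error_reduction(baseline_acc, new_acc):
    """Percentage of the baseline's error removed by the new classifier."""
    baseline_err = 100.0 - baseline_acc
    new_err = 100.0 - new_acc
    return 100.0 * (baseline_err - new_err) / baseline_err

# Hypothetical peptide fragment
features = ngram_counts("MKTMK", 2)

# Accuracies quoted in the abstract: SVM vs. Naive Bayes
level_1 = residual_error_reduction(88.4, 93.0)
level_2 = residual_error_reduction(86.3, 92.4)
print(features, round(level_1, 1), round(level_2, 1))
```

The arithmetic reproduces the 39.7% and 44.5% figures stated in the abstract.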
Al Khatib, Haya K; Hall, Wendy L; Creedon, Alice; Ooi, Emily; Masri, Tala; McGowan, Laura; Harding, Scott V; Darzi, Julia; Pot, Gerda K
2018-01-01
ABSTRACT Background Evidence suggests that short sleep duration may be a newly identified modifiable risk factor for obesity, yet there is a paucity of studies to investigate this. Objective We assessed the feasibility of a personalized sleep extension protocol in adults aged 18–64 y who are habitually short sleepers (5 to <7 h), with sleep primarily measured by wrist actigraphy. In addition, we collected pilot data to assess the effects of extended sleep on dietary intake and quality measured by 7-d food diaries, resting and total energy expenditure, physical activity, and markers of cardiometabolic health. Design Forty-two normal-weight healthy participants who were habitually short sleepers completed this free-living, 4-wk, parallel-design randomized controlled trial. The sleep extension group (n = 21) received a behavioral consultation session targeting sleep hygiene. The control group (n = 21) maintained habitual short sleep. Results Rates of participation, attrition, and compliance were 100%, 6.5%, and 85.7%, respectively. The sleep extension group significantly increased time in bed [0:55 hours:minutes (h:mm); 95% CI: 0:37, 1:12 h:mm], sleep period (0:47 h:mm; 95% CI: 0:29, 1:05 h:mm), and sleep duration (0:21 h:mm; 95% CI: 0:06, 0:36 h:mm) compared with the control group. Sleep extension led to reduced intake of free sugars (–9.6 g; 95% CI: –16.0, –3.1 g) compared with control (0.7 g; 95% CI: –5.7, 7.2 g) (P = 0.042). A sensitivity analysis in plausible reporters showed that the sleep extension group reduced intakes of fat (percentage), carbohydrates (grams), and free sugars (grams) in comparison to the control group. There were no significant differences between groups in markers of energy balance or cardiometabolic health. Conclusions We showed the feasibility of extending sleep in adult short sleepers. 
Sleep extension led to reduced free sugar intakes and may be a viable strategy to facilitate limiting excessive consumption of free sugars in an obesity-promoting environment. This trial was registered at www.clinicaltrials.gov as NCT02787577. PMID:29381788
Hierarchical structure for audio-video based semantic classification of sports video sequences
NASA Astrophysics Data System (ADS)
Kolekar, M. H.; Sengupta, S.
2005-07-01
A hierarchical structure for sports event classification based on audio and video content analysis is proposed in this paper. Compared to event classification in other games, that of cricket is very challenging and as yet unexplored. We have successfully solved the cricket video classification problem using a six-level hierarchical structure. The first level performs event detection based on the audio energy and Zero Crossing Rate (ZCR) of the short-time audio signal. In the subsequent levels, we classify the events based on video features using a Hidden Markov Model implemented through Dynamic Programming (HMM-DP), with color or motion as a likelihood function. For some of the game-specific decisions, a rule-based classification is also performed. Our proposed hierarchical structure can easily be applied to other sports. Our results are very promising and we have moved a step forward towards addressing semantic classification problems in general.
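The two first-level audio features, short-time energy and zero crossing rate, can be computed per frame as in the sketch below. The frame length, hop size, and toy signal are assumptions for illustration; the abstract does not give the values or thresholds used.

```python
def frames(signal, frame_len, hop):
    """Split a signal into overlapping frames of frame_len samples."""
    return [signal[i:i + frame_len]
            for i in range(0, len(signal) - frame_len + 1, hop)]

def short_time_energy(frame):
    """Mean squared amplitude of the frame."""
    return sum(x * x for x in frame) / len(frame)

def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)

# Toy signal: a quiet constant segment followed by a loud alternating one
signal = [0.1] * 8 + [1.0, -1.0] * 4
features = [(short_time_energy(f), zero_crossing_rate(f))
            for f in frames(signal, frame_len=4, hop=4)]
print(features)
```

A detector built on these features would typically compare them with thresholds per frame, which is where a rule for flagging candidate events would go.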
Multimodal Speaker Diarization.
Noulas, A; Englebienne, G; Krose, B J A
2012-01-01
We present a novel probabilistic framework that fuses information coming from the audio and video modality to perform speaker diarization. The proposed framework is a Dynamic Bayesian Network (DBN) that is an extension of a factorial Hidden Markov Model (fHMM) and models the people appearing in an audiovisual recording as multimodal entities that generate observations in the audio stream, the video stream, and the joint audiovisual space. The framework is very robust to different contexts, makes no assumptions about the location of the recording equipment, and does not require labeled training data as it acquires the model parameters using the Expectation Maximization (EM) algorithm. We apply the proposed model to two meeting videos and a news broadcast video, all of which come from publicly available data sets. The results acquired in speaker diarization are in favor of the proposed multimodal framework, which outperforms the single modality analysis results and improves over the state-of-the-art audio-based speaker diarization.
Hidden Markov models reveal complexity in the diving behaviour of short-finned pilot whales
Quick, Nicola J.; Isojunno, Saana; Sadykova, Dina; Bowers, Matthew; Nowacek, Douglas P.; Read, Andrew J.
2017-01-01
Diving behaviour of short-finned pilot whales is often described by two states; deep foraging and shallow, non-foraging dives. However, this simple classification system ignores much of the variation that occurs during subsurface periods. We used multi-state hidden Markov models (HMM) to characterize states of diving behaviour and the transitions between states in short-finned pilot whales. We used three parameters (number of buzzes, maximum dive depth and duration) measured in 259 dives by digital acoustic recording tags (DTAGs) deployed on 20 individual whales off Cape Hatteras, North Carolina, USA. The HMM identified a four-state model as the best descriptor of diving behaviour. The state-dependent distributions for the diving parameters showed variation between states, indicative of different diving behaviours. Transition probabilities were considerably higher for state persistence than state switching, indicating that dive types occurred in bouts. Our results indicate that subsurface behaviour in short-finned pilot whales is more complex than a simple dichotomy of deep and shallow diving states, and labelling all subsurface behaviour as deep dives or shallow dives discounts a significant amount of important variation. We discuss potential drivers of these patterns, including variation in foraging success, prey availability and selection, bathymetry, physiological constraints and socially mediated behaviour. PMID:28361954
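The link between high state-persistence probabilities and dives occurring in bouts follows from the geometric distribution of dwell times in a Markov chain. The sketch below uses a hypothetical four-state transition matrix, not the fitted values from the study.

```python
def expected_bout_length(persistence):
    """Mean number of consecutive dives in a state with self-transition probability p.

    Dwell time in a Markov chain state is geometric, with mean 1 / (1 - p).
    """
    return 1.0 / (1.0 - persistence)

# Hypothetical 4-state transition matrix (each row sums to 1)
transition = [
    [0.75, 0.10, 0.10, 0.05],
    [0.10, 0.80, 0.05, 0.05],
    [0.20, 0.10, 0.60, 0.10],
    [0.05, 0.05, 0.40, 0.50],
]

bout_lengths = [expected_bout_length(row[i]) for i, row in enumerate(transition)]
print(bout_lengths)
```

With a persistence probability of 0.75, for example, a dive type is expected to repeat for four dives in a row before the animal switches state.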
NASA Astrophysics Data System (ADS)
Lee, Kwang Jin; Xiao, Yiming; Woo, Jae Heun; Kim, Eunsun; Kreher, David; Attias, André-Jean; Mathevet, Fabrice; Ribierre, Jean-Charles; Wu, Jeong Weon; André, Pascal
2017-07-01
Charge transfer (CT) is a fundamental and ubiquitous mechanism in biology, physics and chemistry. Here, we evidence that CT dynamics can be altered by multi-layered hyperbolic metamaterial (HMM) substrates. Taking triphenylene:perylene diimide dyad supramolecular self-assemblies as a model system, we reveal longer-lived CT states in the presence of HMM structures, with both charge separation and recombination characteristic times increased by factors of 2.4 and 1.7--that is, relative variations of 140 and 73%, respectively. To rationalize these experimental results in terms of driving force, we successfully introduce image dipole interactions in Marcus theory. The non-local effect herein demonstrated is directly linked to the number of metal-dielectric pairs, can be formalized in the dielectric permittivity, and is presented as a solid analogue to local solvent polarity effects. This model and extra PH3T:PC60BM results show the generality of this non-local phenomenon and that a wide range of kinetic tailoring opportunities can arise from substrate engineering. This work paves the way toward the design of artificial substrates to control CT dynamics of interest for applications in optoelectronics and chemistry.
Elmardi, Khalid A; Malik, Elfatih M; Abdelgadir, Tarig; Ali, Salah H; Elsyed, Abdalla H; Mudather, Mahmoud A; Elhassan, Asma H; Adam, Ishag
2009-03-09
Malaria remains a major public health problem especially in sub-Saharan Africa. Despite the efforts exerted to provide effective anti-malarial drugs, still some communities suffer from getting access to these services due to many barriers. This research aimed to assess the feasibility and acceptability of home-based management of malaria (HMM) strategy using artemisinin-based combination therapy (ACT) for treatment and rapid diagnostic test (RDT) for diagnosis. This is a study conducted in 20 villages in Um Adara area, South Kordofan state, Sudan. Two-thirds (66%) of the study community were seeking treatment from heath facilities, which were more than 5 km far from their villages with marked inaccessibility during rainy season. Volunteers (one per village) were trained on using RDTs for diagnosis and artesunate plus sulphadoxine-pyrimethamine for treating malaria patients, as well as referral of severe and non-malaria cases. A system for supply and monitoring was established based on the rural health centre, which acted as a link between the volunteers and the health system. Advocacy for the policy was done through different tools. Volunteers worked on non-monetary incentives but only a consultation fee of One Sudanese Pound (equivalent to US$0.5).Pre- and post-intervention assessment was done using household survey, focus group discussion with the community leaders, structured interview with the volunteers, and records and reports analysis. The overall adherence of volunteers to the project protocol in treating and referring cases was accepted that was only one of the 20 volunteers did not comply with the study guidelines. Although the use of RDTs seemed to have improved the level of accuracy and trust in the diagnosis, 30% of volunteers did not rely on the negative RDT results when treating fever cases. Almost all (94.7%) the volunteers felt that they were satisfied with the spiritual outcome of their new tasks. 
In addition, volunteers initiated advocacy campaigns supported by their village health committees, which were found to play a positive role in the project, demonstrating their acceptance of the HMM design. The planned system for supply was found to be effective. The project was found to improve accessibility to ACTs from 25% to 64.7% and treatment-seeking behaviour from 83.3% to 100% before and after the HMM implementation, respectively. The evaluation of the project established the feasibility of the planned model under Sudan's conditions. Moreover, the communities as well as the volunteers were found to be satisfied with and supportive of the system and the outcome. The problem of treating other febrile cases when the diagnosis is not malaria, and other non-fever cases, needs to be addressed as well.
The Evolution and Future of Marine Corps Medical Evacuation and Casualty Evacuation Operations
2011-03-16
Marine Corps University, Command and Staff College...commanders assigned three squadrons (HMM-161, HMM-286, and HMM-364) to a rotation to cover CASEVAC and MEDEVAC operations for all forces serving
Foreign Language Analysis and Recognition (FLARe)
2016-10-08
...Rates (CERs) were obtained with each feature set: (1) 19.2%, (2) 17.3%, and (3) 15.3%. Based on these results, a GMM-HMM speech recognition system...These systems were evaluated on the HUB4 and HKUST test partitions. Table 7 shows the CER obtained on each test set. Whereas including the HKUST data
Pavlistova, Lenka; Zemanova, Zuzana; Sarova, Iveta; Lhotska, Halka; Berkova, Adela; Spicka, Ivan; Michalova, Kyra
2014-01-01
Ploidy is an important prognostic factor in the risk stratification of multiple myeloma (MM) patients. Patients with MM can be divided into two groups according to the modal number of chromosomes: nonhyperdiploid (NH-MM) and hyperdiploid (H-MM), which has a more favorable outcome. The two ploidy groups represent two different oncogenetic pathways determined at the premalignant stage. The ploidy subtype also persists during the course of the disease, even during progression after the therapy, with only very rare cases of ploidy conversion. The clinical significance of ploidy conversion and its relation to drug resistance have been previously discussed. Here, we describe a female MM patient with a rare change in her ploidy status from H-MM to NH-MM, detected by cytogenetic and molecular cytogenetic examinations of consecutive bone marrow aspirates. We hypothesize that ploidy conversion (from H-MM to NH-MM) is associated with disease progression and acquired resistance to bortezomib/lenalidomide therapy. Copyright © 2014 Elsevier Inc. All rights reserved.
Tuning subwavelength-structured focus in the hyperbolic metamaterials
NASA Astrophysics Data System (ADS)
Pan, Rong; Tang, Zhixiang; Pan, Jin; Peng, Runwu
2016-10-01
In this paper, we systematically investigate light propagation in hyperbolic metamaterials (HMMs) covered by a subwavelength grating. Based on equal-frequency contour analyses, light in the HMM is predicted to propagate along a defined direction because of its hyperbolic dispersion, which is similar to the self-collimating effects in photonic crystals. Numerical simulations using the finite-difference time-domain method demonstrate a subwavelength bright spot at the intersection of adjacent directional beams. Unlike images in homogeneous media, the magnetic and electric fields at the spot are layered, especially the electric field component Ez, which is polarized along the propagation direction, i.e., the layer-normal direction. Moreover, Ez is hollow in the layer plane and is stronger than the other electric field component Ex. Therefore, the whole electric field is structured and its pattern can be tuned by the HMM's effective anisotropic electromagnetic parameters. Our results may be useful for generating subwavelength structured light.
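The directional propagation mentioned in this abstract can be illustrated with the textbook equal-frequency contour of a uniaxial effective medium. The sketch below is not taken from the paper: it assumes TM polarization, a lossless medium, the layer normal along z, and effective permittivities written here as \varepsilon_x (in-plane) and \varepsilon_z (layer-normal), all of which are notational assumptions.

```latex
% Equal-frequency contour for TM waves in a uniaxial effective medium
% (assumed notation: k_0 = \omega / c, layer normal along z)
\[
  \frac{k_x^{2}}{\varepsilon_z} + \frac{k_z^{2}}{\varepsilon_x} = k_0^{2}
\]
% The contour is a hyperbola when the two permittivities differ in sign:
\[
  \varepsilon_x \, \varepsilon_z < 0
\]
% For TM waves the time-averaged Poynting vector is normal to the contour,
% S_x \propto k_x / \varepsilon_z and S_z \propto k_z / \varepsilon_x, so for
% large |k| (along the asymptotes) every wave carries energy at the same
% angle \theta from the layer normal:
\[
  \tan\theta = \left| \frac{S_x}{S_z} \right|
             = \sqrt{-\frac{\varepsilon_x}{\varepsilon_z}}
\]
```

Under these assumptions, high-spatial-frequency components launched by a subwavelength grating all travel at the same angle, which is one way to read the "defined direction" and the bright spot where adjacent beams intersect.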
NASA Technical Reports Server (NTRS)
2001-01-01
Howmet Research Corporation was the first to commercialize an innovative cast metal technology developed at Auburn University, Auburn, Alabama. With funding assistance from NASA's Marshall Space Flight Center, Auburn University's Solidification Design Center (a NASA Commercial Space Center) developed accurate nickel-based superalloy data for casting molten metals. Through a contract agreement, Howmet used the data to develop computer model predictions of molten metals and molding materials in cast metal manufacturing. Howmet Metal Mold (HMM), part of Howmet Corporation Specialty Products, of Whitehall, Michigan, utilizes metal molds to manufacture net shape castings in various alloys and amorphous metal (metallic glass). By implementing the thermophysical property data from Auburn researchers, Howmet employs its newly developed computer model predictions to offer customers high-quality, low-cost products with significantly improved mechanical properties. Components fabricated with this new process replace components originally made from forgings or billet. Compared with products manufactured through traditional casting methods, Howmet's computer-modeled castings come out on top.
Accelerometry-based classification of human activities using Markov modeling.
Mannini, Andrea; Sabatini, Angelo Maria
2011-01-01
Accelerometers are a popular choice as body-motion sensors, partly because they can extract information useful for automatically inferring the physical activity in which the human subject is involved, besides their role in feeding estimators of biomechanical parameters. Automatic classification of human physical activities is highly attractive for pervasive computing systems, where contextual awareness may ease human-machine interaction, and in biomedicine, where wearable sensor systems are proposed for long-term monitoring. This paper is concerned with the machine learning algorithms needed to perform the classification task. Hidden Markov Model (HMM) classifiers are studied by contrasting them with Gaussian Mixture Model (GMM) classifiers. HMMs incorporate the statistical information available on movement dynamics into the classification process, without discarding the time history of previous outcomes as GMMs do. An example of the benefits of the obtained statistical leverage is illustrated and discussed by analyzing two datasets of accelerometer time series.
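The contrast drawn here between memoryless and sequence-aware classification can be sketched in a few lines. This is a toy illustration, not the authors' method: it assumes one scalar feature per sample, a single Gaussian per activity instead of a full mixture, and made-up parameters (`MEANS`, `STDS`, the 0.95 self-transition probability and the `OBS` values are all hypothetical).

```python
import math

def gauss_logpdf(x, mean, std):
    """Log density of a 1-D Gaussian."""
    return -0.5 * math.log(2 * math.pi * std * std) - (x - mean) ** 2 / (2 * std * std)

def framewise_labels(obs, means, stds):
    """Memoryless baseline: label each sample by its most likely class alone."""
    classes = range(len(means))
    return [max(classes, key=lambda s: gauss_logpdf(x, means[s], stds[s])) for x in obs]

def viterbi_labels(obs, means, stds, log_trans, log_init):
    """HMM decoding: most likely state sequence given the whole time series."""
    n = len(means)
    delta = [log_init[s] + gauss_logpdf(obs[0], means[s], stds[s]) for s in range(n)]
    back = []
    for x in obs[1:]:
        prev, ptr, delta = delta, [], []
        for s in range(n):
            best = max(range(n), key=lambda r: prev[r] + log_trans[r][s])
            ptr.append(best)
            delta.append(prev[best] + log_trans[best][s] + gauss_logpdf(x, means[s], stds[s]))
        back.append(ptr)
    # Trace the best path backwards from the most likely final state.
    state = max(range(n), key=lambda s: delta[s])
    path = [state]
    for ptr in reversed(back):
        state = ptr[state]
        path.append(state)
    return path[::-1]

# Hypothetical two-activity example: state 0 = "rest", state 1 = "walk".
MEANS, STDS = [0.0, 1.0], [0.5, 0.5]
LOG_TRANS = [[math.log(0.95), math.log(0.05)],
             [math.log(0.05), math.log(0.95)]]
LOG_INIT = [math.log(0.5), math.log(0.5)]
OBS = [0.1, -0.2, 0.6, 0.0, 0.1, 0.9, 1.2, 0.3, 1.1, 0.8]

print(framewise_labels(OBS, MEANS, STDS))                     # per-sample decisions
print(viterbi_labels(OBS, MEANS, STDS, LOG_TRANS, LOG_INIT))  # temporally smoothed
```

In this toy run the per-sample classifier flips label on the two noisy samples (0.6 and 0.3), while the Viterbi path keeps a single rest-to-walk transition, because each switch has to pay the transition penalty.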
Automated Error Detection in Physiotherapy Training.
Jovanović, Marko; Seiffarth, Johannes; Kutafina, Ekaterina; Jonas, Stephan M
2018-01-01
Manual skills teaching, such as physiotherapy education, requires immediate teacher feedback for the students during the learning process, which to date can only be provided by expert trainers. We present a machine-learning system, trained only on correct performances, that classifies and scores performed movements, identifies sources of errors in the movement and gives feedback to the learner. We acquire IMU and sEMG sensor data from a commercial-grade wearable device and construct an HMM-based model for gesture classification, scoring and feedback. We evaluate the model on publicly available and self-generated data of executions of an exemplary movement pattern. The model achieves an overall accuracy of 90.71% on the public dataset and 98.9% on our dataset. An AUC of 0.99 for the ROC of the scoring method was achieved in discriminating between correct and untrained incorrect executions. The proposed system demonstrated its suitability for scoring and feedback in manual skills training.
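One common way to score a movement against a model trained only on correct executions is to use the per-frame log-likelihood from the forward algorithm and compare it with a threshold. The sketch below assumes that approach; the abstract does not say how the score is computed. The sensor stream is reduced to three discrete symbols, and the left-to-right model `MODEL` and cut-off `THRESHOLD` are hypothetical values chosen for illustration.

```python
import math

def log_likelihood(obs, init, trans, emit):
    """Scaled forward algorithm: returns log P(obs | model) for a discrete HMM."""
    n = len(init)
    alpha = [init[s] * emit[s][obs[0]] for s in range(n)]
    total = 0.0
    for t in range(len(obs)):
        if t > 0:
            alpha = [sum(alpha[r] * trans[r][s] for r in range(n)) * emit[s][obs[t]]
                     for s in range(n)]
        norm = sum(alpha)
        if norm == 0.0:          # sequence impossible under the model
            return float("-inf")
        total += math.log(norm)  # accumulate the scaling factors
        alpha = [a / norm for a in alpha]
    return total

def score(obs, model):
    """Length-normalised log-likelihood, so short and long executions are comparable."""
    return log_likelihood(obs, *model) / len(obs)

# Hypothetical left-to-right model of a three-phase movement.
MODEL = (
    [1.0, 0.0, 0.0],                                        # always start in phase 0
    [[0.7, 0.3, 0.0], [0.0, 0.7, 0.3], [0.0, 0.0, 1.0]],    # stay or advance
    [[0.8, 0.1, 0.1], [0.1, 0.8, 0.1], [0.1, 0.1, 0.8]],    # phase i mostly emits symbol i
)
THRESHOLD = -0.8  # hypothetical; would be tuned on held-out correct executions

correct = [0, 0, 0, 1, 1, 2, 2, 2]
incorrect = [2, 2, 1, 1, 0, 0, 0, 2]
for name, seq in (("correct", correct), ("incorrect", incorrect)):
    print(name, round(score(seq, MODEL), 3), score(seq, MODEL) >= THRESHOLD)
```

Because the model never sees incorrect executions, a threshold on this score only says how well a sequence matches the learned movement. Locating the source of an error would need extra information, for example which frames contribute the lowest scaling factors.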
Krishnamoorthy, Ramasamy; Kim, Chang-Gi; Subramanian, Parthiban; Kim, Ki-Yoon; Selvakumar, Gopal; Sa, Tong-Min
2015-01-01
Arbuscular Mycorrhizal Fungi (AMF) play major roles in ecosystem functioning such as carbon sequestration, nutrient cycling, and plant growth promotion. It is important to know how this ecologically important soil microbial player is affected by soil abiotic factors particularly heavy metal and metalloid (HMM). The objective of this study was to understand the impact of soil HMM concentration on AMF abundance and community structure in the contaminated sites of South Korea. Soil samples were collected from the vicinity of an abandoned smelter and the samples were subjected to three complementary methods such as spore morphology, terminal restriction fragment length polymorphism (T-RFLP) and denaturing gradient gel electrophoresis (DGGE) for diversity analysis. Spore density was found to be significantly higher in highly contaminated soil compared to less contaminated soil. Spore morphological study revealed that Glomeraceae family was more abundant followed by Acaulosporaceae and Gigasporaceae in the vicinity of the smelter. T-RFLP and DGGE analysis confirmed the dominance of Funneliformis mosseae and Rhizophagus intraradices in all the study sites. Claroideoglomus claroideum, Funneliformis caledonium, Rhizophagus clarus and Funneliformis constrictum were found to be sensitive to high concentration of soil HMM. Richness and diversity of Glomeraceae family increased with significant increase in soil arsenic, cadmium and zinc concentrations. Our results revealed that the soil HMM has a vital impact on AMF community structure, especially with Glomeraceae family abundance, richness and diversity. PMID:26035444
Multi-Observation Continuous Density Hidden Markov Models for Anomaly Detection in Full Motion Video
2012-06-01
response profiles; 3.5 Method for measuring angular movement versus average direction...of movement; 3.6 Method for calculating Angular Deviation, Θ; 4.1 HMM produced by K Means Learning for agent H... Angular Deviation. A random variable, the difference in heading (in degrees) from the overall direction of movement over the sequence • S: Speed. A
Maritime Threat Detection Using Probabilistic Graphical Models
2012-01-01
CRF, unlike an HMM, can represent local features, and does not require feature concatenation. MLNs: For MLNs, we used Alchemy (Alchemy 2011), an...open source statistical relational learning and probabilistic inferencing package. Alchemy supports generative and discriminative weight learning, and...that Alchemy creates a new formula for every possible combination of the values for a1 and a2 that fit the type specified in their predicate
Ensemble Learning Method for Hidden Markov Models
2014-12-01
Ensemble HMM landmine detector. Mine signatures vary according to the mine type, mine size, and burial depth. Similarly, clutter signatures vary with soil...approaches for the different K groups depending on their size and homogeneity. In particular, we investigate the maximum likelihood (ML), the minimum...propose using and optimizing various training approaches for the different K groups depending on their size and homogeneity. In particular, we
Regad, Leslie; Martin, Juliette; Camproux, Anne-Claude
2011-06-20
One of the strategies for protein function annotation is to search for particular structural motifs that are known to be shared by proteins with a given function. Here, we present a systematic extraction of structural motifs of seven residues from protein loops and we explore their correspondence with functional sites. Our approach is based on the structural alphabet HMM-SA (Hidden Markov Model - Structural Alphabet), which allows simplification of protein structures into uni-dimensional sequences, and advanced pattern statistics adapted to short sequences. Structural motifs of interest are selected by looking for structural motifs significantly over-represented in SCOP superfamilies in protein loops. We discovered two types of structural motifs significantly over-represented in SCOP superfamilies: (i) ubiquitous motifs, shared by several superfamilies and (ii) superfamily-specific motifs, over-represented in few superfamilies. A comparison of ubiquitous words with known small structural motifs shows that they contain well-described motifs such as turn, niche or nest motifs. A comparison between superfamily-specific motifs and biological annotations of Swiss-Prot reveals that some of them actually correspond to functional sites involved in the binding sites of small ligands, such as ATP/GTP, NAD(P) and SAH/SAM. Our findings show that statistical over-representation in SCOP superfamilies is linked to functional features. The detection of over-represented motifs within structures simplified by HMM-SA is therefore a promising approach for prediction of functional sites and annotation of uncharacterized proteins.
2011-01-01
Background: One of the strategies for protein function annotation is to search for particular structural motifs that are known to be shared by proteins with a given function. Results: Here, we present a systematic extraction of structural motifs of seven residues from protein loops and we explore their correspondence with functional sites. Our approach is based on the structural alphabet HMM-SA (Hidden Markov Model - Structural Alphabet), which allows simplification of protein structures into uni-dimensional sequences, and advanced pattern statistics adapted to short sequences. Structural motifs of interest are selected by looking for structural motifs significantly over-represented in SCOP superfamilies in protein loops. We discovered two types of structural motifs significantly over-represented in SCOP superfamilies: (i) ubiquitous motifs, shared by several superfamilies and (ii) superfamily-specific motifs, over-represented in few superfamilies. A comparison of ubiquitous words with known small structural motifs shows that they contain well-described motifs such as turn, niche or nest motifs. A comparison between superfamily-specific motifs and biological annotations of Swiss-Prot reveals that some of them actually correspond to functional sites involved in the binding sites of small ligands, such as ATP/GTP, NAD(P) and SAH/SAM. Conclusions: Our findings show that statistical over-representation in SCOP superfamilies is linked to functional features. The detection of over-represented motifs within structures simplified by HMM-SA is therefore a promising approach for prediction of functional sites and annotation of uncharacterized proteins. PMID:21689388
The Discontinuous Galerkin Method for the Multiscale Modeling of Dynamics of Crystalline Solids
2007-08-26
...Dynamics method (MAAD) [2], the bridging scale method [47], the bridging domain methods [48], the heterogeneous multiscale method (HMM) [23, 36, 24], and...method consists of three components, 1. a macro solver for the continuum model, 2. a micro solver to equilibrate the atomistic system locally to the appro
Theory, Characterization and Applications of Infrared Hyperbolic Metamaterials
NASA Astrophysics Data System (ADS)
Fullager, Daniel B.
Hyperbolic Metamaterials (HMMs) are engineered structures capable of supporting light-matter interactions that are not normally observed in naturally occurring material systems. These unusual responses are enabled by an enhancement of the photonic density of states (PDOS) in the material. The PDOS enhancement is a result of deliberately introduced anisotropy via a permittivity sign-change in HMM structures which increases the number and frequency spread of possible wave vectors that propagate in the material. Subwavelength structural features allow effective medium theories to be invoked to construct the k-space isofrequency quadratic curves that, for HMMs, result in the k-space isofrequency contour transitioning from being a bounded surface to an unbounded one. Since the PDOS is the integral of the differential volume between k-space contours, unbounded manifolds lead to the implication of an infinite or otherwise drastically enhanced PDOS. Since stored heat can be thought of as a set of non-radiative electromagnetic modes, in this dissertation we demonstrate that HMMs provide an ideal platform to attempt to modify the thermal/IR emissivity of a material. We also show that HMMs provide a platform for broadband plasmonic sensing. The advent of commercial two photon polymerization tools has enabled the rapid production of nano- and microstructures which can be used as scaffolds for directive infrared scatterers. We describe how such directive components can be used to address thermal management needs in vacuum environments in order to maximize radiative thermal transfer. In this context, the fundamental limitations of enhanced spontaneous emission due to conjugate impedance matched scatterers are also explored. The HMM/conjugate scatterer system's performance is strongly correlated with the dielectric function of the negative permittivity component of the HMM.
In order to fully understand the significance of these engineered materials, we examine in detail the electromagnetic response of one ternary material system, aluminium-doped zinc oxide (AZO), whose tuneable plasma frequency makes it ideal for HMM and thermal transfer applications. This study draws upon first principle calculations from the open literature utilizing a Hubbard-U corrected model for the non-local interaction of charge carriers in AZO crystalline systems. We present the first complete dielectric function of industrially produced AZO samples from DC to 30,000 cm⁻¹ and conclude with an assessment of this material's suitability for the applications described.
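The effective medium step referred to in this abstract can be made concrete for one common geometry. The formulas below are the standard results for a planar stack of alternating layers much thinner than the wavelength. The abstract does not state which geometry the dissertation uses, so the multilayer, the fill fraction f and the symbols \varepsilon_m (negative-permittivity component, for instance AZO above its plasma wavelength) and \varepsilon_d (dielectric) are assumptions made for illustration.

```latex
% Effective permittivities of a deeply subwavelength two-component multilayer
% (assumed notation: f = volume fraction of the negative-permittivity layers)
\[
  \varepsilon_{\parallel} = f\,\varepsilon_m + (1 - f)\,\varepsilon_d ,
  \qquad
  \frac{1}{\varepsilon_{\perp}} = \frac{f}{\varepsilon_m} + \frac{1 - f}{\varepsilon_d}
\]
% The isofrequency surface is unbounded (hyperbolic) when the in-plane and
% layer-normal components have opposite signs:
\[
  \operatorname{Re}\,\varepsilon_{\parallel} \cdot \operatorname{Re}\,\varepsilon_{\perp} < 0
\]
```

Read this way, the dependence on "the dielectric function of the negative permittivity component" enters through \varepsilon_m: shifting the plasma frequency moves the spectral window in which the sign condition holds.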
HMMER web server: 2018 update.
Potter, Simon C; Luciani, Aurélien; Eddy, Sean R; Park, Youngmi; Lopez, Rodrigo; Finn, Robert D
2018-06-14
The HMMER webserver [http://www.ebi.ac.uk/Tools/hmmer] is a free-to-use service which provides fast searches against widely used sequence databases and profile hidden Markov model (HMM) libraries using the HMMER software suite (http://hmmer.org). The results of a sequence search may be summarized in a number of ways, allowing users to view and filter the significant hits by domain architecture or taxonomy. For large scale usage, we provide an application programmatic interface (API) which has been expanded in scope, such that all result presentations are available via both HTML and API. Furthermore, we have refactored our JavaScript visualization library to provide standalone components for different result representations. These consume the aforementioned API and can be integrated into third-party websites. The range of databases that can be searched against has been expanded, adding four sequence datasets (12 in total) and one profile HMM library (6 in total). To help users explore the biological context of their results, and to discover new data resources, search results are now supplemented with cross references to other EMBL-EBI databases.
Dittmar, W James; McIver, Lauren; Michalak, Pawel; Garner, Harold R; Valdez, Gregorio
2014-07-01
The wealth of publicly available gene expression and genomic data provides unique opportunities for computational inference to discover groups of genes that function to control specific cellular processes. Such genes are likely to have co-evolved and be expressed in the same tissues and cells. Unfortunately, the expertise and computational resources required to compare tens of genomes and gene expression data sets make this type of analysis difficult for the average end-user. Here, we describe the implementation of a web server that predicts genes involved in affecting specific cellular processes together with a gene of interest. We termed the server 'EvoCor', to denote that it detects functional relationships among genes through evolutionary analysis and gene expression correlation. This web server integrates profiles of sequence divergence derived by a Hidden Markov Model (HMM) and tissue-wide gene expression patterns to determine putative functional linkages between pairs of genes. This server is easy to use and freely available at http://pilot-hmm.vbi.vt.edu/. © The Author(s) 2014. Published by Oxford University Press on behalf of Nucleic Acids Research.
Hagopian, Raffi; Davidson, John R; Datta, Ruchira S; Samad, Bushra; Jarvis, Glen R; Sjölander, Kimmen
2010-07-01
We present the jump-start simultaneous alignment and tree construction using hidden Markov models (SATCHMO-JS) web server for simultaneous estimation of protein multiple sequence alignments (MSAs) and phylogenetic trees. The server takes as input a set of sequences in FASTA format, and outputs a phylogenetic tree and MSA; these can be viewed online or downloaded from the website. SATCHMO-JS is an extension of the SATCHMO algorithm, and employs a divide-and-conquer strategy to jump-start SATCHMO at a higher point in the phylogenetic tree, reducing the computational complexity of the progressive all-versus-all HMM-HMM scoring and alignment. Results on a benchmark dataset of 983 structurally aligned pairs from the PREFAB benchmark dataset show that SATCHMO-JS provides a statistically significant improvement in alignment accuracy over MUSCLE, Multiple Alignment using Fast Fourier Transform (MAFFT), ClustalW and the original SATCHMO algorithm. The SATCHMO-JS webserver is available at http://phylogenomics.berkeley.edu/satchmo-js. The datasets used in these experiments are available for download at http://phylogenomics.berkeley.edu/satchmo-js/supplementary/.
Web of Objects Based Ambient Assisted Living Framework for Emergency Psychiatric State Prediction
Alam, Md Golam Rabiul; Abedin, Sarder Fakhrul; Al Ameen, Moshaddique; Hong, Choong Seon
2016-01-01
Ambient assisted living can facilitate optimum health and wellness by aiding physical, mental and social well-being. In this paper, patients’ psychiatric symptoms are collected through lightweight biosensors and web-based psychiatric screening scales in a smart home environment and then analyzed through machine learning algorithms to provide ambient intelligence in a psychiatric emergency. The psychiatric states are modeled through a Hidden Markov Model (HMM), and the model parameters are estimated using a Viterbi path counting and scalable Stochastic Variational Inference (SVI)-based training algorithm. The most likely psychiatric state sequence of the corresponding observation sequence is determined, and an emergency psychiatric state is predicted through the proposed algorithm. Moreover, to enable personalized psychiatric emergency care, a web of objects-based framework is proposed for a smart-home environment. In this framework, the biosensor observations and the psychiatric rating scales are objectified and virtualized in the web space. Then, the web of objects of sensor observations and psychiatric rating scores are used to assess the dweller’s mental health status and to predict an emergency psychiatric state. The proposed psychiatric state prediction algorithm reported 83.03 percent prediction accuracy in an empirical performance study. PMID:27608023
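The counting half of Viterbi path counting can be sketched as follows: given state paths that have already been decoded, transition and emission probabilities are re-estimated from how often each event occurs along those paths. This is a generic illustration under stated assumptions, not the paper's algorithm: it uses discrete observations, add-one smoothing (`pseudo`) is an assumed choice, the decoding step is taken as given, and the SVI component is not shown.

```python
def count_parameters(paths, sequences, n_states, n_symbols, pseudo=1.0):
    """Re-estimate HMM transition and emission matrices from decoded state paths.

    paths[i][t] is the decoded state and sequences[i][t] the observed symbol
    at time t of the i-th sequence. `pseudo` is a smoothing pseudo-count.
    """
    trans = [[pseudo] * n_states for _ in range(n_states)]
    emit = [[pseudo] * n_symbols for _ in range(n_states)]
    for path, obs in zip(paths, sequences):
        for state, symbol in zip(path, obs):
            emit[state][symbol] += 1          # count state -> symbol emissions
        for prev, nxt in zip(path, path[1:]):
            trans[prev][nxt] += 1             # count state -> state transitions

    def normalise(rows):
        return [[c / sum(row) for c in row] for row in rows]

    return normalise(trans), normalise(emit)

# Hypothetical example: one sequence, two states, two symbols.
trans, emit = count_parameters([[0, 0, 1, 1]], [[0, 0, 1, 1]], n_states=2, n_symbols=2)
print(trans)
print(emit)
```

In a full training loop this step would alternate with Viterbi decoding under the current parameters until the decoded paths stop changing.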
Nonintrusive Load Monitoring Based on Advanced Deep Learning and Novel Signature.
Kim, Jihyun; Le, Thi-Thu-Huong; Kim, Howon
2017-01-01
Monitoring electricity consumption in the home is an important way to help reduce energy usage. Nonintrusive Load Monitoring (NILM) is an existing technique that helps us monitor electricity consumption effectively and at low cost. NILM is a promising approach to obtain estimates of the electrical power consumption of individual appliances from aggregate measurements of voltage and/or current in the distribution system. Among previous studies, Hidden Markov Model (HMM) based models have been studied extensively. However, the increasing number of appliances, multistate appliances, and appliances with similar power consumption are three major issues in NILM. In this paper, we address these problems through the following contributions. First, we propose a state-of-the-art energy disaggregation method based on a Long Short-Term Memory Recurrent Neural Network (LSTM-RNN) model and additional advanced deep learning. Second, we propose a novel signature to improve the classification performance of the proposed model in the multistate appliance case. We applied the proposed model to two datasets, UK-DALE and REDD. Our experimental results confirm that our model outperforms the advanced model. Thus, we show that our combination of advanced deep learning and a novel signature can be a robust solution to overcome NILM's issues and improve the performance of load identification.
Nonintrusive Load Monitoring Based on Advanced Deep Learning and Novel Signature
Le, Thi-Thu-Huong; Kim, Howon
2017-01-01
Monitoring electricity consumption in the home is an important way to help reduce energy usage. Nonintrusive Load Monitoring (NILM) is an existing technique that helps us monitor electricity consumption effectively and at low cost. NILM is a promising approach to obtain estimates of the electrical power consumption of individual appliances from aggregate measurements of voltage and/or current in the distribution system. Among previous studies, Hidden Markov Model (HMM) based models have been studied extensively. However, the increasing number of appliances, multistate appliances, and appliances with similar power consumption are three major issues in NILM. In this paper, we address these problems through the following contributions. First, we propose a state-of-the-art energy disaggregation method based on a Long Short-Term Memory Recurrent Neural Network (LSTM-RNN) model and additional advanced deep learning. Second, we propose a novel signature to improve the classification performance of the proposed model in the multistate appliance case. We applied the proposed model to two datasets, UK-DALE and REDD. Our experimental results confirm that our model outperforms the advanced model. Thus, we show that our combination of advanced deep learning and a novel signature can be a robust solution to overcome NILM's issues and improve the performance of load identification. PMID:29118809
Reactions of Free Radicals with Nitro-Compounds and Nitrates
1981-03-31
...the fragment derived from the nitrates but not from the nitro-compounds could undergo exothermic rearrangement. Product analyses and computer modelling were undertaken; these provided a clear explanation of why the...Nitrate; Reaction of Oxygen Atoms with Nitromethane; Reaction of Oxygen Atoms with Nitroethane; Products from Nitrocompounds; Effect of Carbon
Xu, Qifang; Dunbrack, Roland L
2012-11-01
Automating the assignment of existing domain and protein family classifications to new sets of sequences is an important task. Current methods often miss assignments because remote relationships fail to achieve statistical significance. Some assignments are not as long as the actual domain definitions because local alignment methods often cut alignments short. Long insertions in query sequences often erroneously result in two copies of the domain assigned to the query. Divergent repeat sequences in proteins are often missed. We have developed a multilevel procedure to produce nearly complete assignments of protein families of an existing classification system to a large set of sequences. We apply this to the task of assigning Pfam domains to sequences and structures in the Protein Data Bank (PDB). We found that HHsearch alignments frequently scored more remotely related Pfams in Pfam clans higher than closely related Pfams, thus, leading to erroneous assignment at the Pfam family level. A greedy algorithm allowing for partial overlaps was, thus, applied first to sequence/HMM alignments, then HMM-HMM alignments and then structure alignments, taking care to join partial alignments split by large insertions into single-domain assignments. Additional assignment of repeat Pfams with weaker E-values was allowed after stronger assignments of the repeat HMM. Our database of assignments, presented in a database called PDBfam, contains Pfams for 99.4% of chains >50 residues. The Pfam assignment data in PDBfam are available at http://dunbrack2.fccc.edu/ProtCid/PDBfam, which can be searched by PDB codes and Pfam identifiers. They will be updated regularly.
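The greedy assignment with partial overlaps described in this abstract can be sketched roughly as below. This is a simplified reading, not the PDBfam procedure itself: hits are ranked by E-value alone, a hit is kept when it overlaps every already accepted hit by at most `max_overlap` residues, and the `Hit` type, the field names and the value 15 are hypothetical.

```python
from typing import List, NamedTuple

class Hit(NamedTuple):
    family: str    # domain family identifier (hypothetical example names below)
    start: int     # first aligned residue, 1-based inclusive
    end: int       # last aligned residue, inclusive
    evalue: float  # smaller is more significant

def overlap(a: Hit, b: Hit) -> int:
    """Number of residues covered by both hits."""
    return max(0, min(a.end, b.end) - max(a.start, b.start) + 1)

def greedy_assign(hits: List[Hit], max_overlap: int = 15) -> List[Hit]:
    """Accept hits from best to worst E-value, tolerating small overlaps."""
    accepted: List[Hit] = []
    for hit in sorted(hits, key=lambda h: h.evalue):
        if all(overlap(hit, kept) <= max_overlap for kept in accepted):
            accepted.append(hit)
    return sorted(accepted, key=lambda h: h.start)

hits = [
    Hit("FamA", 1, 100, 1e-30),
    Hit("FamB", 90, 200, 1e-20),   # overlaps FamA by 11 residues: tolerated
    Hit("FamC", 50, 150, 1e-10),   # overlaps FamA by 51 residues: rejected
]
print([h.family for h in greedy_assign(hits)])
```

The abstract also describes joining partial alignments split by large insertions and a second pass for weaker repeat hits. Neither is covered by this sketch.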
Evaluating bacterial gene-finding HMM structures as probabilistic logic programs.
Mørk, Søren; Holmes, Ian
2012-03-01
Probabilistic logic programming offers a powerful way to describe and evaluate structured statistical models. To investigate the practicality of probabilistic logic programming for structure learning in bioinformatics, we undertook a simplified bacterial gene-finding benchmark in PRISM, a probabilistic dialect of Prolog. We evaluate Hidden Markov Model structures for bacterial protein-coding gene potential, including a simple null model structure, three structures based on existing bacterial gene finders and two novel model structures. We test standard versions as well as ADPH length modeling and three-state versions of the five model structures. The models are all represented as probabilistic logic programs and evaluated using the PRISM machine learning system in terms of statistical information criteria and gene-finding prediction accuracy, in two bacterial genomes. Neither of our implementations of the two currently most used model structures is the best performing in terms of statistical information criteria or prediction performance, suggesting that better-fitting models might be achievable. The source code of all PRISM models, data and additional scripts are freely available for download at: http://github.com/somork/codonhmm. Supplementary data are available at Bioinformatics online.
High-precision Pb Isotopes Reveal Two Small Magma Bodies Beneath the Summit of Kilauea Volcano
NASA Astrophysics Data System (ADS)
Pietruszka, A. J.; Heaton, D. E.; Marske, J. P.; Garcia, M. O.
2013-12-01
The summit magma storage reservoir of Kilauea Volcano is one of the most important components of the volcano's magmatic plumbing system, but its geometry is poorly known. High-precision Pb isotopic analyses of Kilauea summit lavas (1959-1982) define the minimum number of magma bodies within the summit reservoir and their volumes. The 206Pb/204Pb ratios of these lavas display a temporal decrease due to changes in the composition of the parental magma delivered to the volcano. Analyses of multiple lavas from some individual eruptions reveal small but significant differences in 206Pb/204Pb. The extra-caldera lavas from Aug. 1971 and Jul. 1974 display lower Pb isotope ratios and higher MgO contents (10 wt. %) than the intra-caldera lavas (MgO ~7-8 wt. %) from each eruption. From 1971 to 1982, the 206Pb/204Pb ratios of the lavas define two separate decreasing temporal trends. The intra-caldera lavas from 1971, 1974, 1975, Apr. 1982 and the lower MgO lavas from Sep. 1982 have higher 206Pb/204Pb ratios at a given time (compared to the extra-caldera lavas and the higher MgO lavas from Sep. 1982). These trends require that the intra- and extra-caldera lavas (and the Sep. 1982 lavas) were supplied from two separate, partially isolated magma bodies. Numerous studies (Fiske and Kinoshita, 1969; Klein et al., 1987) have long identified the locus of Kilauea's summit reservoir ~2 km southeast of Halemaumau (HMM) at a depth of ~2-7 km, but more recent investigations have discovered a second magma body located <1 km below the east rim of HMM (Battaglia et al., 2003; Johnson et al., 2010). The association between the vent locations of the extra-caldera lavas near the southeast rim of the caldera and their higher MgO contents suggests that these lavas tapped the deeper magma body. In contrast, the lower MgO intra-caldera lavas were likely derived from the shallow magma body beneath HMM. 
Residence time modeling based on the Pb isotope ratios of the lavas suggests that the magma volume of the deeper body is ~0.2 km3, whereas the shallow body holds a minimum of ~0.04 km3 of magma. These estimates are smaller than a previous calculation of ~2-3 km3 for Kilauea's summit reservoir based on trace element ratios (Pietruszka and Garcia, 1999), but are similar to the volume of the magma body that underlies Piton de la Fournaise Volcano on Réunion Island (Albarède, 1993).
Sand, Andreas; Kristiansen, Martin; Pedersen, Christian N S; Mailund, Thomas
2013-11-22
Hidden Markov models are widely used for genome analysis as they combine ease of modelling with efficient analysis algorithms. Calculating the likelihood of a model using the forward algorithm has worst-case time complexity linear in the length of the sequence and quadratic in the number of states in the model. For genome analysis, however, the length runs to millions or billions of observations, and when maximising the likelihood hundreds of evaluations are often needed. A time-efficient forward algorithm is therefore a key ingredient in an efficient hidden Markov model library. We have built a software library for efficiently computing the likelihood of a hidden Markov model. The library exploits commonly occurring substrings in the input to reuse computations in the forward algorithm. In a pre-processing step our library identifies common substrings and builds a structure over the computations in the forward algorithm which can be reused. This analysis can be saved between uses of the library and is independent of concrete hidden Markov models, so one preprocessing can be used to run a number of different models. Using this library, we achieve up to 78 times shorter wall-clock time for realistic whole-genome analyses with a real and reasonably complex hidden Markov model. In one particular case the analysis was performed in less than 8 minutes compared to 9.6 hours for the previously fastest library. We have implemented the preprocessing procedure and forward algorithm as a C++ library, zipHMM, with Python bindings for use in scripts. The library is available at http://birc.au.dk/software/ziphmm/.
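To make the cost quoted in this abstract concrete, here is a minimal sketch of the plain (uncompressed) forward algorithm with per-step scaling. It is not zipHMM's API and does not reproduce its substring reuse; the function name and the list-based model layout are assumptions made for illustration. Each observation costs one pass over all state pairs, which is where the "linear in sequence length, quadratic in states" bound comes from.

```python
import math

def forward_log_likelihood(init, trans, emit, obs):
    """Return log P(obs | model) using the scaled forward algorithm.

    init[i]     : probability of starting in state i
    trans[i][j] : probability of moving from state i to state j
    emit[i][o]  : probability that state i emits symbol o
    obs         : non-empty sequence of integer symbols with non-zero probability
    """
    n_states = len(init)
    # Initialisation: weight each start state by its first emission.
    alpha = [init[i] * emit[i][obs[0]] for i in range(n_states)]
    log_lik = 0.0
    for t in range(len(obs)):
        if t > 0:
            # Recursion: O(K^2) work per observation for K states.
            alpha = [
                emit[j][obs[t]] * sum(alpha[i] * trans[i][j] for i in range(n_states))
                for j in range(n_states)
            ]
        # Rescale so alpha sums to 1; the logs of the scale factors add up
        # to the log-likelihood and avoid underflow on long sequences.
        scale = sum(alpha)
        alpha = [a / scale for a in alpha]
        log_lik += math.log(scale)
    return log_lik
```

The rescaling step matters for the genome-length inputs the abstract mentions: without it, the raw forward variables underflow long before millions of observations have been processed.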
Hadlington, Lee; Murphy, Karen
2018-03-01
The current study focused on how engaging in media multitasking (MMT) and the experience of everyday cognitive failures impact on the individual's engagement in risky cybersecurity behaviors (RCsB). In total, 144 participants (32 males, 112 females) completed an online survey. The age range for participants was 18 to 43 years (M = 20.63, SD = 4.04). Participants completed three scales which included an inventory of weekly MMT, a measure of everyday cognitive failures, and RCsB. There was a significant difference between heavy media multitaskers (HMM), average media multitaskers (AMM), and light media multitaskers (LMM) in terms of RCsB, with HMM demonstrating more frequent risky behaviors than LMM or AMM. The HMM group also reported more cognitive failures in everyday life than the LMM group. A regression analysis showed that everyday cognitive failures and MMT acted as significant predictors for RCsB. These results expand our current understanding of the relationship between human factors and cybersecurity behaviors, which are useful to inform the design of training and intervention packages to mitigate RCsB.
HIV-1 Vif promotes the formation of high molecular mass APOBEC3G complexes
Goila-Gaur, Ritu; Khan, Mohammad A.; Miyagi, Eri; Kao, Sandra; Opi, Sandrine; Takeuchi, Hiroaki; Strebel, Klaus
2008-01-01
HIV-1 Vif inhibits the antiviral activity of APOBEC3G (APO3G) by inducing proteasomal degradation. Here, we studied the effects of Vif on APO3G in vitro. In this system, Vif did not cause APO3G degradation. Instead, Vif induced changes in APO3G that affected immunoprecipitation of the native protein. This effect required wt Vif and was reversed by heat-denaturation of APO3G. Sucrose gradient analysis demonstrated that wt Vif induced the gradual transition of APO3G translated in vitro or expressed in HeLa cells from a low molecular mass conformation to puromycin-sensitive high molecular mass (HMM) complexes. In the absence of Vif or the presence of biologically inactive Vif APO3G failed to form HMM complexes. Our results expose a novel function of Vif that promotes the assembly of APO3G into presumably packaging-incompetent HMM complexes and may explain how Vif can overcome the APO3G-imposed block to HIV replication under conditions of no or inefficient APO3G degradation. PMID:18023836
A Hidden Markov Model for Urban-Scale Traffic Estimation Using Floating Car Data.
Wang, Xiaomeng; Peng, Ling; Chi, Tianhe; Li, Mengzhu; Yao, Xiaojing; Shao, Jing
2015-01-01
Urban-scale traffic monitoring plays a vital role in reducing traffic congestion. Owing to its low cost and wide coverage, floating car data (FCD) serves as a novel approach to collecting traffic data. However, sparse probe data represents the vast majority of the data available on arterial roads in most urban environments. In order to overcome the problem of data sparseness, this paper proposes a hidden Markov model (HMM)-based traffic estimation model, in which the traffic condition on a road segment is considered as a hidden state that can be estimated according to the conditions of road segments having similar traffic characteristics. An algorithm based on clustering and pattern mining rather than on adjacency relationships is proposed to find clusters with road segments having similar traffic characteristics. A multi-clustering strategy is adopted to achieve a trade-off between clustering accuracy and coverage. Finally, the proposed model is designed and implemented on the basis of a real-time algorithm. Results of experiments based on real FCD confirm the applicability, accuracy, and efficiency of the model. In addition, the results indicate that the model is practicable for traffic estimation on urban arterials and works well even when more than 70% of the probe data are missing.
SVM-dependent pairwise HMM: an application to protein pairwise alignments.
Orlando, Gabriele; Raimondi, Daniele; Khan, Taushif; Lenaerts, Tom; Vranken, Wim F
2017-12-15
Methods able to provide reliable protein alignments are crucial for many bioinformatics applications. In recent years many different algorithms have been developed, and various kinds of information, from sequence conservation to secondary structure, have been used to improve alignment performance. This is especially relevant for proteins with highly divergent sequences. However, recent work suggests that different features may have different importance in diverse protein classes, and it would be an advantage to have more customizable approaches capable of dealing with different alignment definitions. Here we present Rigapollo, a highly flexible pairwise alignment method based on a pairwise HMM-SVM that can use any type of information to build alignments. Rigapollo lets the user decide the optimal features to align their protein class of interest. It outperforms current state-of-the-art methods on two well-known benchmark datasets when aligning highly divergent sequences. A Python implementation of the algorithm is available at http://ibsquare.be/rigapollo. Contact: wim.vranken@vub.be. Supplementary data are available at Bioinformatics online. © The Author (2017). Published by Oxford University Press.
A study on the real-time reliability of on-board equipment of train control system
NASA Astrophysics Data System (ADS)
Zhang, Yong; Li, Shiwei
2018-05-01
Real-time reliability evaluation is conducive to establishing a condition-based maintenance system for the purpose of guaranteeing continuous train operation. According to the inherent characteristics of the on-board equipment, the meaning of reliability evaluation of on-board equipment is defined and the evaluation index of real-time reliability is provided in this paper. From the perspective of methodology and practical application, the real-time reliability of the on-board equipment is discussed in detail, and a method of evaluating the real-time reliability of on-board equipment at the component level based on a Hidden Markov Model (HMM) is proposed. In this method the performance degradation data are used directly to perceive the hidden state transition process of the on-board equipment accurately, which gives a better description of the real-time reliability of the equipment.
Improved orthologous databases to ease protozoan targets inference.
Kotowski, Nelson; Jardim, Rodrigo; Dávila, Alberto M R
2015-09-29
Homology inference helps in identifying similarities as well as differences among organisms, which provides better insight into how closely related one might be to another. In addition, comparative genomics pipelines are widely adopted tools designed using different bioinformatics applications and algorithms. In this article, we propose a methodology to build improved orthologous databases with the potential to aid in protozoan target identification, one of the many tasks which benefit from comparative genomics tools. Our analyses are based on OrthoSearch, a comparative genomics pipeline originally designed to infer orthologs through protein-profile comparison, supported by an HMM, reciprocal best hits based approach. Our methodology allows OrthoSearch to confront two orthologous databases and to generate an improved new one. The resulting database can later be used to infer potential protozoan targets through a similarity analysis against the human genome. The protein sequences of Cryptosporidium hominis, Entamoeba histolytica and Leishmania infantum genomes were comparatively analyzed against three orthologous databases: (i) EggNOG KOG, (ii) ProtozoaDB and (iii) KEGG Orthology (KO). That allowed us to create two new orthologous databases, "KO + EggNOG KOG" and "KO + EggNOG KOG + ProtozoaDB", with 16,938 and 27,701 orthologous groups, respectively. These new orthologous databases were used for a regular OrthoSearch run. By confronting the "KO + EggNOG KOG" and "KO + EggNOG KOG + ProtozoaDB" databases with the protozoan species we were able to detect the following totals of orthologous groups and coverage (the ratio between the inferred orthologous groups and the species' total number of proteins): Cryptosporidium hominis: 1,821 (11%) and 3,254 (12%); Entamoeba histolytica: 2,245 (13%) and 5,305 (19%); Leishmania infantum: 2,702 (16%) and 4,760 (17%).
Using our HMM-based methodology and the largest created orthologous database, it was possible to infer 13 orthologous groups which represent potential protozoan targets; these were found because of our distant homology approach. We also provide the number of species-specific, pair-to-pair and core groups from such analyses, depicted in Venn diagrams. The orthologous databases generated by our HMM-based methodology provide a broader dataset, with larger amounts of orthologous groups when compared to the original databases used as input. Those may be used for several homology inference analyses, annotation tasks and protozoan targets identification.
Cover song identification by sequence alignment algorithms
NASA Astrophysics Data System (ADS)
Wang, Chih-Li; Zhong, Qian; Wang, Szu-Ying; Roychowdhury, Vwani
2011-10-01
Content-based music analysis has drawn much attention due to the rapidly growing digital music market. This paper describes a method that can be used to effectively identify cover songs. A cover song is a song that preserves only the crucial melody of its reference song but differs in some other acoustic properties. Hence, the beat/chroma-synchronous chromagram, which is insensitive to variation in the timbre or rhythm of songs but sensitive to the melody, is chosen. Key transposition is achieved by cyclically shifting the chromatic domain of the chromagram. By using the Hidden Markov Model (HMM) to obtain the time sequences of songs, the system is made even more robust. Because the Smith-Waterman alignment algorithm is used, the cover song and its reference need not have similar structure or length.
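The two mechanisms named in this abstract, cyclic shifting for key transposition and Smith-Waterman local alignment, can be sketched as follows. This is an illustrative simplification, not the authors' system: each beat is assumed to have been reduced to a single pitch class (an integer 0 to 11), and the function names and scoring values (match 2, mismatch -1, gap -1) are hypothetical choices.

```python
def smith_waterman(a, b, match=2, mismatch=-1, gap=-1):
    """Best local-alignment score between sequences a and b (linear gap cost)."""
    prev = [0] * (len(b) + 1)
    best = 0
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            diag = prev[j - 1] + (match if x == y else mismatch)
            # Local alignment: a cell never drops below zero, so a good
            # matching segment is not penalised by what surrounds it.
            score = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            curr.append(score)
            best = max(best, score)
        prev = curr
    return best

def best_transposed_score(ref, cover):
    """Try all 12 cyclic pitch-class shifts of cover and keep the best score."""
    return max(
        smith_waterman(ref, [(p + k) % 12 for p in cover])
        for k in range(12)
    )
```

Because the alignment is local, a cover that embeds the reference melody between an added intro and outro, in a different key, still reaches the full score for the shared segment.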
Interactive projection for aerial dance using depth sensing camera
NASA Astrophysics Data System (ADS)
Dubnov, Tammuz; Seldess, Zachary; Dubnov, Shlomo
2014-02-01
This paper describes an interactive performance system for floor and Aerial Dance that controls visual and sonic aspects of the presentation via a depth sensing camera (MS Kinect). In order to detect, measure and track free movement in space, 3 degree of freedom (3-DOF) tracking in space (on the ground and in the air) is performed using IR markers. Gesture tracking and recognition is performed using a simplified HMM model that allows robust mapping of the actor's actions to graphics and sound. Additional visual effects are achieved by segmentation of the actor body based on depth information, allowing projection of separate imagery on the performer and the backdrop. Artistic use of augmented reality performance relative to more traditional concepts of stage design and dramaturgy are discussed.
Murphy, Karen
2018-01-01
Abstract The current study focused on how engaging in media multitasking (MMT) and the experience of everyday cognitive failures impact on the individual's engagement in risky cybersecurity behaviors (RCsB). In total, 144 participants (32 males, 112 females) completed an online survey. The age range for participants was 18 to 43 years (M = 20.63, SD = 4.04). Participants completed three scales which included an inventory of weekly MMT, a measure of everyday cognitive failures, and RCsB. There was a significant difference between heavy media multitaskers (HMM), average media multitaskers (AMM), and light media multitaskers (LMM) in terms of RCsB, with HMM demonstrating more frequent risky behaviors than LMM or AMM. The HMM group also reported more cognitive failures in everyday life than the LMM group. A regression analysis showed that everyday cognitive failures and MMT acted as significant predictors for RCsB. These results expand our current understanding of the relationship between human factors and cybersecurity behaviors, which are useful to inform the design of training and intervention packages to mitigate RCsB. PMID:29638157
Intelligent classifier for dynamic fault patterns based on hidden Markov model
NASA Astrophysics Data System (ADS)
Xu, Bo; Feng, Yuguang; Yu, Jinsong
2006-11-01
It is difficult to build precise mathematical models for complex engineering systems because of the complexity of their structure and dynamic characteristics. Intelligent fault diagnosis introduces artificial intelligence and works without building an analytical mathematical model of the diagnostic object, so it is a practical approach to solving diagnostic problems of complex systems. This paper presents an intelligent fault diagnosis method, an integrated fault-pattern classifier based on the Hidden Markov Model (HMM). The classifier consists of a dynamic time warping (DTW) algorithm, a self-organizing feature mapping (SOFM) network and a Hidden Markov Model. First, the dynamic observation vector in the measurement space is processed by DTW to obtain the error vector, which contains the fault features of the system under test. Then a SOFM network is used as a feature extractor and vector quantization processor. Finally, fault diagnosis is realized by classifying fault patterns with the Hidden Markov Model classifier. Introducing dynamic time warping solves the problem of extracting features from the dynamic process vectors of a complex system such as an aeroengine, and makes it possible to diagnose complex systems using dynamic process information. Simulation experiments show that the diagnosis model is easy to extend, and that the fault pattern classifier is efficient and convenient for detecting and diagnosing new faults.
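The first stage of the classifier described in this abstract relies on dynamic time warping. A minimal sketch of the standard DTW distance between two scalar sequences is given below; how the paper derives its error vector from the warping is not stated in the abstract, so this shows only the textbook recurrence, with a hypothetical function name and an absolute-difference local cost chosen for illustration.

```python
def dtw_distance(x, y):
    """Dynamic time warping distance between two numeric sequences."""
    inf = float("inf")
    n, m = len(x), len(y)
    # cost[i][j] = cheapest cumulative cost of aligning x[:i] with y[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(x[i - 1] - y[j - 1])
            # Extend the cheapest of: stretch x, stretch y, or advance both.
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]
```

For example, `dtw_distance([1, 2, 3], [1, 2, 2, 3])` is zero because the repeated sample is absorbed by the warping, which is the property that makes DTW useful for comparing dynamic process measurements recorded at different speeds.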
A probabilistic union model with automatic order selection for noisy speech recognition.
Jancovic, P; Ming, J
2001-09-01
A critical issue in exploiting the potential of the sub-band-based approach to robust speech recognition is the method of combining the sub-band observations, for selecting the bands unaffected by noise. A new method for this purpose, i.e., the probabilistic union model, was recently introduced. This model has been shown to be capable of dealing with band-limited corruption, requiring no knowledge about the band position and statistical distribution of the noise. A parameter within the model, which we call its order, gives the best results when it equals the number of noisy bands. Since this information may not be available in practice, in this paper we introduce an automatic algorithm for selecting the order, based on the state duration pattern generated by the hidden Markov model (HMM). The algorithm has been tested on the TIDIGITS database corrupted by various types of additive band-limited noise with unknown noisy bands. The results have shown that the union model equipped with the new algorithm can achieve a recognition performance similar to that achieved when the number of noisy bands is known. The results show a very significant improvement over the traditional full-band model, without requiring prior information on either the position or the number of noisy bands. The principle of the algorithm for selecting the order based on state duration may also be applied to other sub-band combination methods.
Guerra, Jorge; Uddin, Jasim; Nilsen, Dawn; McInerney, James; Fadoo, Ammarah; Omofuma, Isirame B.; Hughes, Shatif; Agrawal, Sunil; Allen, Peter; Schambra, Heidi M.
2017-01-01
There currently exist no practical tools to identify functional movements in the upper extremities (UEs). This absence has limited the precise therapeutic dosing of patients recovering from stroke. In this proof-of-principle study, we aimed to develop an accurate approach for classifying UE functional movement primitives, which comprise functional movements. Data were generated from inertial measurement units (IMUs) placed on upper body segments of older healthy individuals and chronic stroke patients. Subjects performed activities commonly trained during rehabilitation after stroke. Data processing involved the use of a sliding window to obtain statistical descriptors, and resulting features were processed by a Hidden Markov Model (HMM). The likelihoods of the states, resulting from the HMM, were segmented by a second sliding window and their averages were calculated. The final predictions were mapped to human functional movement primitives using a Logistic Regression algorithm. Algorithm performance was assessed with a leave-one-out analysis, which determined its sensitivity, specificity, and positive and negative predictive values for all classified primitives. In healthy control and stroke participants, our approach identified functional movement primitives embedded in training activities with, on average, 80% precision. This approach may support functional movement dosing in stroke rehabilitation. PMID:28813877
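The first processing step in this abstract, a sliding window that turns raw IMU samples into statistical descriptors, might look like the sketch below. The abstract does not say which descriptors, window width, or step were used, so the mean, standard deviation, minimum and maximum here, along with the function name and parameters, are assumptions for illustration only.

```python
from statistics import mean, pstdev

def window_features(signal, width, step):
    """Return (mean, std, min, max) for each window of `width` samples."""
    features = []
    # Only full windows are used; a trailing partial window is dropped.
    for start in range(0, len(signal) - width + 1, step):
        window = signal[start:start + width]
        features.append((mean(window), pstdev(window), min(window), max(window)))
    return features
```

In a pipeline like the one described, each tuple would serve as one observation for the HMM; the same windowing idea could then be reused on the HMM state likelihoods before the final logistic regression step.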
dbHiMo: a web-based epigenomics platform for histone-modifying enzymes
Choi, Jaeyoung; Kim, Ki-Tae; Huh, Aram; Kwon, Seomun; Hong, Changyoung; Asiegbu, Fred O.; Jeon, Junhyun; Lee, Yong-Hwan
2015-01-01
Over the past two decades, epigenetics has evolved into a key concept for understanding regulation of gene expression. Among many epigenetic mechanisms, covalent modifications such as acetylation and methylation of lysine residues on core histones emerged as a major mechanism in epigenetic regulation. Here, we present the database for histone-modifying enzymes (dbHiMo; http://hme.riceblast.snu.ac.kr/) aimed at facilitating functional and comparative analysis of histone-modifying enzymes (HMEs). HMEs were identified by applying a search pipeline built upon profile hidden Markov model (HMM) to proteomes. The database incorporates 11 576 HMEs identified from 603 proteomes including 483 fungal, 32 plants and 51 metazoan species. The dbHiMo provides users with web-based personalized data browsing and analysis tools, supporting comparative and evolutionary genomics. With comprehensive data entries and associated web-based tools, our database will be a valuable resource for future epigenetics/epigenomics studies. Database URL: http://hme.riceblast.snu.ac.kr/ PMID:26055100
dbHiMo: a web-based epigenomics platform for histone-modifying enzymes.
Choi, Jaeyoung; Kim, Ki-Tae; Huh, Aram; Kwon, Seomun; Hong, Changyoung; Asiegbu, Fred O; Jeon, Junhyun; Lee, Yong-Hwan
2015-01-01
Over the past two decades, epigenetics has evolved into a key concept for understanding regulation of gene expression. Among many epigenetic mechanisms, covalent modifications such as acetylation and methylation of lysine residues on core histones emerged as a major mechanism in epigenetic regulation. Here, we present the database for histone-modifying enzymes (dbHiMo; http://hme.riceblast.snu.ac.kr/) aimed at facilitating functional and comparative analysis of histone-modifying enzymes (HMEs). HMEs were identified by applying a search pipeline built upon profile hidden Markov model (HMM) to proteomes. The database incorporates 11,576 HMEs identified from 603 proteomes including 483 fungal, 32 plants and 51 metazoan species. The dbHiMo provides users with web-based personalized data browsing and analysis tools, supporting comparative and evolutionary genomics. With comprehensive data entries and associated web-based tools, our database will be a valuable resource for future epigenetics/epigenomics studies. © The Author(s) 2015. Published by Oxford University Press.
A Feasibility Study of View-independent Gait Identification
2012-03-01
ice skates . For walking, the footprint records for single pixels form clusters that are well separated in space and time. (Any overlap of contact...Pattern Recognition 2007, 1-8. Cheng M-H, Ho M-F & Huang C-L (2008), "Gait Analysis for Human Identification Through Manifold Learning and HMM... Learning and Cybernetics 2005, 4516-4521 Moeslund T B & Granum E (2001), "A Survey of Computer Vision-Based Human Motion Capture", Computer Vision
Effector prediction in host-pathogen interaction based on a Markov model of a ubiquitous EPIYA motif
2010-01-01
Background: Effector secretion is a common strategy of pathogens in mediating host-pathogen interaction. Eight EPIYA-motif containing effectors have recently been discovered in six pathogens. Once these effectors enter host cells through type III/IV secretion systems (T3SS/T4SS), the tyrosine in the EPIYA motif is phosphorylated, which triggers the effectors to bind other proteins and manipulate host-cell functions. The objectives of this study are to evaluate the distribution pattern of the EPIYA motif across a broad range of biological species, to predict potential effectors with the EPIYA motif, and to suggest roles and biological functions of potential effectors in host-pathogen interactions. Results: A hidden Markov model (HMM) of five amino acids was built for the EPIYA motif based on the eight known effectors. Using this HMM to search the non-redundant protein database containing 9,216,047 sequences, we obtained 107,231 sequences with at least one EPIYA motif occurrence and 3,115 sequences with multiple repeats of the EPIYA motif. Although the EPIYA motif exists across a broad range of species, it is significantly over-represented in some particular groups of species. Most of the proteins containing at least four copies of the EPIYA motif are from intracellular bacteria, extracellular bacteria with T3SS or T4SS, or intracellular protozoan parasites. By combining the EPIYA motif and the adjacent SH2 binding motifs (KK, R4, Tarp and Tir), we built HMMs of nine amino acids and used them to predict many potential effectors in bacteria and protista. Some potential effectors for pathogens (such as Lawsonia intracellularis, Plasmodium falciparum and Leishmania major) are suggested. Conclusions: Our study indicates that the EPIYA motif may be a ubiquitous functional site for effectors that play an important pathogenicity role in mediating host-pathogen interactions.
We suggest that some intracellular protozoan parasites could secrete EPIYA-motif containing effectors through secretion systems similar to the T3SS/T4SS in bacteria. Our predicted effectors provide useful hypotheses for further studies. PMID:21143776
Ngasala, Billy E; Malmberg, Maja; Carlsson, Anja M; Ferreira, Pedro E; Petzold, Max G; Blessborn, Daniel; Bergqvist, Yngve; Gil, José P; Premji, Zul; Mårtensson, Andreas
2011-03-16
Home-management of malaria (HMM) strategy improves early access of anti-malarial medicines to high-risk groups in remote areas of sub-Saharan Africa. However, limited data are available on the effectiveness of using artemisinin-based combination therapy (ACT) within the HMM strategy. The aim of this study was to assess the effectiveness of artemether-lumefantrine (AL), presently the most favoured ACT in Africa, in under-five children with uncomplicated Plasmodium falciparum malaria in Tanzania, when provided by community health workers (CHWs) and administered unsupervised by parents or guardians at home. An open label, single arm prospective study was conducted in two rural villages with high malaria transmission in Kibaha District, Tanzania. Children presenting to CHWs with uncomplicated fever and a positive rapid malaria diagnostic test (RDT) were provisionally enrolled and provided AL for unsupervised treatment at home. Patients with microscopy confirmed P. falciparum parasitaemia were definitely enrolled and reviewed weekly by the CHWs during 42 days. Primary outcome measure was PCR corrected parasitological cure rate by day 42, as estimated by Kaplan-Meier survival analysis. This trial is registered with ClinicalTrials.gov, number NCT00454961. A total of 244 febrile children were enrolled between March-August 2007. Two patients were lost to follow up on day 14, and one patient withdrew consent on day 21. Some 141/241 (58.5%) patients had recurrent infection during follow-up, of whom 14 had recrudescence. The PCR corrected cure rate by day 42 was 93.0% (95% CI 88.3%-95.9%). The median lumefantrine concentration was statistically significantly lower in patients with recrudescence (97 ng/mL [IQR 0-234]; n = 10) compared with reinfections (205 ng/mL [114-390]; n = 92), or no parasite reappearance (217 [121-374] ng/mL; n = 70; p ≤ 0.046). 
Provision of AL by CHWs for unsupervised malaria treatment at home was highly effective, which provides evidence base for scaling-up implementation of HMM with AL in Tanzania.
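The primary outcome in this abstract is a cure rate estimated by Kaplan-Meier survival analysis. As a rough sketch of how that estimator works, the function below computes the product-limit curve from follow-up times and event flags. The function name and the toy data in the usage note are illustrative and not taken from the trial; treating withdrawals, losses to follow-up and reinfections as censored observations is an assumption about how such an analysis is typically set up, not something the abstract states.

```python
def kaplan_meier(times, events):
    """Return [(time, survival)] at each time where at least one event occurs.

    times[i]  : follow-up day for subject i
    events[i] : True if the outcome occurred, False if censored
    """
    survival = 1.0
    curve = []
    at_risk = len(times)
    for t in sorted(set(times)):
        deaths = sum(1 for ti, e in zip(times, events) if ti == t and e)
        leaving = sum(1 for ti in times if ti == t)
        if deaths:
            # Product-limit step: multiply by the fraction surviving day t.
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        # Everyone observed at t (event or censored) leaves the risk set.
        at_risk -= leaving
    return curve
```

With six hypothetical subjects, `kaplan_meier([7, 14, 14, 21, 42, 42], [True, False, True, True, False, False])` steps down at days 7, 14 and 21; censored subjects shrink the risk set without lowering the curve, which is how the estimator uses partial follow-up.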
Li, Shan; Dong, Xia; Su, Zhengchang
2013-07-30
Although prokaryotic gene transcription has been studied for decades, many aspects of the process remain poorly understood. In particular, recent studies have revealed that transcriptomes in many prokaryotes are far more complex than previously thought. Genes in an operon are often alternatively and dynamically transcribed under different conditions, and a large portion of genes and intergenic regions have antisense RNA (asRNA) and non-coding RNA (ncRNA) transcripts, respectively. Ironically, similar studies have not been conducted in the model bacterium E. coli K12, so it is unknown whether the bacterium possesses similarly complex transcriptomes. Furthermore, although RNA-seq has become the major method for analyzing the complexity of prokaryotic transcriptomes, it is still a challenging task to accurately assemble full-length transcripts using short RNA-seq reads. To fill these gaps, we have profiled the transcriptomes of E. coli K12 under different culture conditions and growth phases using a highly specific directional RNA-seq technique that can capture various types of transcripts in the bacterial cells, combined with a highly accurate and robust algorithm and tool, TruHMM (http://bioinfolab.uncc.edu/TruHmm_package/), for assembling full-length transcripts. We found that 46.9-63.4% of expressed operons were utilized in their putative alternative forms, 72.23-89.54% of genes had putative asRNA transcripts and 51.37-72.74% of intergenic regions had putative ncRNA transcripts under different culture conditions and growth phases. As has been demonstrated in many other prokaryotes, E. coli K12 also has highly complex and dynamic transcriptomes under different culture conditions and growth phases. Such complex and dynamic transcriptomes might play important roles in the physiology of the bacterium. TruHMM is a highly accurate and robust algorithm for assembling full-length transcripts in prokaryotes using directional RNA-seq short reads.
2013-01-01
Background: Although prokaryotic gene transcription has been studied for decades, many aspects of the process remain poorly understood. In particular, recent studies have revealed that transcriptomes in many prokaryotes are far more complex than previously thought. Genes in an operon are often alternatively and dynamically transcribed under different conditions, and a large portion of genes and intergenic regions have antisense RNA (asRNA) and non-coding RNA (ncRNA) transcripts, respectively. Ironically, similar studies have not been conducted in the model bacterium E. coli K12, so it is unknown whether the bacterium possesses similarly complex transcriptomes. Furthermore, although RNA-seq has become the major method for analyzing the complexity of prokaryotic transcriptomes, it is still a challenging task to accurately assemble full-length transcripts using short RNA-seq reads. Results: To fill these gaps, we have profiled the transcriptomes of E. coli K12 under different culture conditions and growth phases using a highly specific directional RNA-seq technique that can capture various types of transcripts in the bacterial cells, combined with a highly accurate and robust algorithm and tool, TruHMM (http://bioinfolab.uncc.edu/TruHmm_package/), for assembling full-length transcripts. We found that 46.9-63.4% of expressed operons were utilized in their putative alternative forms, 72.23-89.54% of genes had putative asRNA transcripts and 51.37-72.74% of intergenic regions had putative ncRNA transcripts under different culture conditions and growth phases. Conclusions: As has been demonstrated in many other prokaryotes, E. coli K12 also has highly complex and dynamic transcriptomes under different culture conditions and growth phases. Such complex and dynamic transcriptomes might play important roles in the physiology of the bacterium.
TruHMM is a highly accurate and robust algorithm for assembling full-length transcripts in prokaryotes using directional RNA-seq short reads. PMID:23899370
A stochastic HMM-based forecasting model for fuzzy time series.
Li, Sheng-Tun; Cheng, Yi-Chung
2010-10-01
Recently, fuzzy time series have attracted more academic attention than traditional time series due to their capability of dealing with the uncertainty and vagueness inherent in the data collected. The formulation of fuzzy relations is one of the key issues affecting forecasting results. Most of the present works adopt IF-THEN rules for relationship representation, which leads to higher computational overhead and rule redundancy. Sullivan and Woodall proposed a Markov-based formulation and a forecasting model to reduce computational overhead; however, its applicability is limited to handling one-factor problems. In this paper, we propose a novel forecasting model based on the hidden Markov model by enhancing Sullivan and Woodall's work to allow handling of two-factor forecasting problems. Moreover, in order to make the nature of conjecture and randomness of forecasting more realistic, the Monte Carlo method is adopted to estimate the outcome. To test the effectiveness of the resulting stochastic model, we conduct two experiments and compare the results with those from other models. The first experiment consists of forecasting the daily average temperature and cloud density in Taipei, Taiwan, and the second experiment is based on the Taiwan Weighted Stock Index by forecasting the exchange rate of the New Taiwan dollar against the U.S. dollar. In addition to improving forecasting accuracy, the proposed model adheres to the central limit theorem, and thus, the result statistically approximates to the real mean of the target value being forecast.
2015-09-30
playbacks Killer whale (O. orca) 10 4 8 1 2 LF pilot whale (G. melas) 30 8 14 4 8 Sperm whale (P. macrocephalus) 10 4 10 2 5 Humpback whale (M...exposure dataset of the long-finned pilot whale (Globicephala melas). A hidden Markov model (HMM) approach was developed to quantify behavioral states...Experimental Exposures of Killer (Orcinus orca), Long-Finned Pilot (Globicephala melas), and Sperm (Physeter macrocephalus) Whales to Naval Sonar. Aquat
Extracting volatility signal using maximum a posteriori estimation
NASA Astrophysics Data System (ADS)
Neto, David
2016-11-01
This paper outlines a methodology to estimate a denoised volatility signal for foreign exchange rates using a hidden Markov model (HMM). For this purpose, a maximum a posteriori (MAP) estimation is performed. A double exponential prior is used for the state variable (the log-volatility) in order to allow sharp jumps in realizations and, in turn, heavy-tailed marginal distributions of log-returns. We consider two routes to choose the regularization parameter and compare our MAP estimate to the realized volatility measure for three exchange rates.
Vujaklija, Ivan; Bielen, Ana; Paradžik, Tina; Biđin, Siniša; Goldstein, Pavle; Vujaklija, Dušica
2016-02-18
The massive accumulation of protein sequences arising from the rapid development of high-throughput sequencing, coupled with automatic annotation, results in high levels of incorrect annotations. In this study, we describe an approach to decrease annotation errors of protein families characterized by low overall sequence similarity. The GDSL lipolytic family comprises proteins with multifunctional properties and high potential for pharmaceutical and industrial applications. The number of proteins assigned to this family has increased rapidly over the last few years. In particular, the natural abundance of GDSL enzymes reported recently in plants indicates that they could be a good source of novel GDSL enzymes. We noticed that a significant proportion of annotated sequences lack specific GDSL motif(s) or catalytic residue(s). Here, we applied motif-based sequence analyses to identify enzymes possessing conserved GDSL motifs in selected proteomes across the plant kingdom. Motif-based HMM scanning (Viterbi decoding, VD, and posterior decoding, PD) and the PD/VD protocol described here were successfully applied to 12 selected plant proteomes to identify sequences with GDSL motifs. A significant number of the identified GDSL sequences were novel. Moreover, our scanning approach successfully detected protein sequences lacking at least one of the essential motifs (171/820) that had been annotated as GDSL by Pfam profile search (PfamA). Based on these analyses, we provide a curated list of GDSL enzymes from the selected plants. CLANS clustering and phylogenetic analysis helped us to gain better insight into the evolutionary relationships of all identified GDSL sequences. Three novel GDSL subfamilies as well as unreported variations in GDSL motifs were discovered in this study. In addition, analyses of selected proteomes showed a remarkable expansion of GDSL enzymes in the lycophyte Selaginella moellendorffii.
Finally, we provide a general motif-HMM scanner which is easily accessible through the graphical user interface ( http://compbio.math.hr/ ). Our results show that scanning with a carefully parameterized motif-HMM is an effective approach for annotation of protein families with low sequence similarity and conserved motifs. The results of this study expand current knowledge and provide new insights into the evolution of the large GDSL-lipase family in land plants.
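The difference between Viterbi decoding (VD) and posterior decoding (PD) can be illustrated with a minimal sketch. This is not the authors' scanner: the two-state model (B for background, M for motif), the two-letter alphabet and every probability below are hypothetical values chosen only for illustration. VD returns the single most probable state path, while PD returns, for each position, the probability of each state given the whole sequence.

```python
# Toy two-state HMM; all names and numbers are illustrative assumptions.
STATES = ("B", "M")  # B = background, M = motif
START = {"B": 0.9, "M": 0.1}
TRANS = {"B": {"B": 0.9, "M": 0.1}, "M": {"B": 0.2, "M": 0.8}}
EMIT = {"B": {"G": 0.1, "x": 0.9}, "M": {"G": 0.7, "x": 0.3}}


def viterbi(obs, states=STATES, start=START, trans=TRANS, emit=EMIT):
    """Return the single most probable state path (Viterbi decoding)."""
    score = [{s: start[s] * emit[s][obs[0]] for s in states}]
    back = []
    for o in obs[1:]:
        prev = score[-1]
        col, ptr = {}, {}
        for s in states:
            # Best predecessor of state s at this position.
            best = max(states, key=lambda r: prev[r] * trans[r][s])
            col[s] = prev[best] * trans[best][s] * emit[s][o]
            ptr[s] = best
        score.append(col)
        back.append(ptr)
    # Trace back from the best final state.
    path = [max(states, key=lambda s: score[-1][s])]
    for ptr in reversed(back):
        path.append(ptr[path[-1]])
    return path[::-1]


def posterior(obs, states=STATES, start=START, trans=TRANS, emit=EMIT):
    """Return per-position state posteriors (forward-backward algorithm)."""
    fwd = [{s: start[s] * emit[s][obs[0]] for s in states}]
    for o in obs[1:]:
        prev = fwd[-1]
        fwd.append({s: emit[s][o] * sum(prev[r] * trans[r][s] for r in states)
                    for s in states})
    bwd = [{s: 1.0 for s in states}]
    for o in reversed(obs[1:]):
        nxt = bwd[0]
        bwd.insert(0, {s: sum(trans[s][r] * emit[r][o] * nxt[r] for r in states)
                       for s in states})
    total = sum(fwd[-1].values())  # likelihood of the whole sequence
    return [{s: f[s] * b[s] / total for s in states} for f, b in zip(fwd, bwd)]


if __name__ == "__main__":
    seq = list("xxGGGxx")
    print(viterbi(seq))
    print([round(p["M"], 3) for p in posterior(seq)])
```

For long sequences the products underflow, so a practical implementation would work in log space or rescale each column; the sketch omits this for brevity.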
Continuous Human Action Recognition Using Depth-MHI-HOG and a Spotter Model
Eum, Hyukmin; Yoon, Changyong; Lee, Heejin; Park, Mignon
2015-01-01
In this paper, we propose a new method for spotting and recognizing continuous human actions using a vision sensor. The method is comprised of depth-MHI-HOG (DMH), action modeling, action spotting, and recognition. First, to effectively separate the foreground from background, we propose a method called DMH. It includes a standard structure for segmenting images and extracting features by using depth information, MHI, and HOG. Second, action modeling is performed to model various actions using extracted features. The modeling of actions is performed by creating sequences of actions through k-means clustering; these sequences constitute HMM input. Third, a method of action spotting is proposed to filter meaningless actions from continuous actions and to identify precise start and end points of actions. By employing the spotter model, the proposed method improves action recognition performance. Finally, the proposed method recognizes actions based on start and end points. We evaluate recognition performance by employing the proposed method to obtain and compare probabilities by applying input sequences in action models and the spotter model. Through various experiments, we demonstrate that the proposed method is efficient for recognizing continuous human actions in real environments. PMID:25742172
Taborri, Juri; Rossi, Stefano; Palermo, Eduardo; Patanè, Fabrizio; Cappa, Paolo
2014-09-02
In this work, we apply a hierarchical weighted decision, proposed and used in other research fields, to the recognition of gait phases. The developed and validated novel distributed classifier is based on a hierarchical weighted decision from the outputs of scalar Hidden Markov Models (HMM) applied to the angular velocities of foot, shank, and thigh. The angular velocities of ten healthy subjects were acquired via three uni-axial gyroscopes embedded in inertial measurement units (IMUs) during one walking task, repeated three times, on a treadmill. After validating the novel distributed classifier, together with scalar and vectorial classifiers already proposed in the literature, by cross-validation, the classifiers were compared for sensitivity, specificity, and computational load for all combinations of the three targeted anatomical segments. Moreover, the performance of the novel distributed classifier in the estimation of gait variability in terms of mean time and coefficient of variation was evaluated. The highest values of specificity and sensitivity (>0.98) for the three classifiers examined here were obtained when the angular velocity of the foot was processed. Distributed and vectorial classifiers reached acceptable values (>0.95) when the angular velocities of shank and thigh were analyzed. Distributed and scalar classifiers showed a computational load about 100 times lower than that of the vectorial classifier. In addition, distributed classifiers showed excellent reliability for the evaluation of mean time and good to excellent reliability for the coefficient of variation. In conclusion, owing to its better performance and small computational load, the novel distributed classifier proposed here can be implemented in real-time applications of gait-phase recognition, such as evaluating gait variability in patients or controlling active orthoses for the recovery of mobility of lower limb joints.
Echegaray, Sebastian; Bakr, Shaimaa; Rubin, Daniel L; Napel, Sandy
2017-10-06
The aim of this study was to develop an open-source, modular, locally run or server-based system for 3D radiomics feature computation that can be used on any computer system and included in existing workflows for understanding associations and building predictive models between image features and clinical data, such as survival. The QIFE exploits various levels of parallelization for use on multiprocessor systems. It consists of a managing framework and four stages: input, pre-processing, feature computation, and output. Each stage contains one or more swappable components, allowing run-time customization. We benchmarked the engine using various levels of parallelization on a cohort of CT scans presenting 108 lung tumors. Two versions of the QIFE have been released: (1) the open-source MATLAB code posted to GitHub, and (2) a compiled version loaded in a Docker container, posted to DockerHub, which can be easily deployed on any computer. The QIFE processed 108 objects (tumors) in 2:12 (h:mm) using one core, and in 1:04 (h:mm) using four cores with object-level parallelization. We developed the Quantitative Image Feature Engine (QIFE), an open-source feature-extraction framework that focuses on modularity, standards, parallelism, provenance, and integration. Researchers can easily integrate it with their existing segmentation and imaging workflows by creating input and output components that implement their existing interfaces. Computational efficiency can be improved by parallelizing execution at the cost of memory usage. Different parallelization levels provide different trade-offs, and the optimal setting will depend on the size and composition of the dataset to be processed.
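As a quick check on the benchmark figures, and assuming the two reported run times are hours:minutes, the speedup and parallel efficiency implied by moving from one core to four can be worked out as below. The helper names are hypothetical and are not part of the QIFE.

```python
def to_minutes(hmm):
    """Convert an 'h:mm' string to minutes."""
    hours, minutes = hmm.split(":")
    return int(hours) * 60 + int(minutes)


def speedup(serial, parallel, cores):
    """Return (speedup, parallel efficiency) for two 'h:mm' run times."""
    s = to_minutes(serial) / to_minutes(parallel)
    return s, s / cores


if __name__ == "__main__":
    s, e = speedup("2:12", "1:04", cores=4)
    print(f"speedup {s:.2f}x, efficiency {e:.0%}")
```

Under that reading, four cores give roughly a twofold speedup, about half of the ideal fourfold gain, which is consistent with the abstract's remark that parallelization levels involve trade-offs.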
Tracking the visual focus of attention for a varying number of wandering people.
Smith, Kevin; Ba, Sileye O; Odobez, Jean-Marc; Gatica-Perez, Daniel
2008-07-01
We define and address the problem of finding the visual focus of attention for a varying number of wandering people (VFOA-W), determining where people are looking when their movement is unconstrained. VFOA-W estimation is a new and important problem with implications for behavior understanding and cognitive science, as well as real-world applications. One such application, which we present in this article, monitors the attention passers-by pay to an outdoor advertisement. Our approach to the VFOA-W problem proposes a multi-person tracking solution based on a dynamic Bayesian network that simultaneously infers the (variable) number of people in a scene, their body locations, their head locations, and their head pose. For efficient inference in the resulting large variable-dimensional state-space, we propose a Reversible Jump Markov Chain Monte Carlo (RJMCMC) sampling scheme, as well as a novel global observation model which determines the number of people in the scene and localizes them. We propose a Gaussian Mixture Model (GMM) and Hidden Markov Model (HMM)-based VFOA-W model which uses head pose and location information to determine people's focus state. Our models are evaluated for tracking performance and ability to recognize people looking at an outdoor advertisement, with results indicating good performance on sequences where a moderate number of people pass in front of an advertisement.
Kogan, J A; Margoliash, D
1998-04-01
The performance of two techniques is compared for automated recognition of bird song units from continuous recordings. The advantages and limitations of dynamic time warping (DTW) and hidden Markov models (HMMs) are evaluated on a large database of male songs of zebra finches (Taeniopygia guttata) and indigo buntings (Passerina cyanea), which have different types of vocalizations and have been recorded under different laboratory conditions. Depending on the quality of recordings and complexity of song, the DTW-based technique gives excellent to satisfactory performance. Under challenging conditions such as noisy recordings or presence of confusing short-duration calls, good performance of the DTW-based technique requires careful selection of templates that may demand expert knowledge. Because HMMs are trained, equivalent or even better performance of HMMs can be achieved based only on segmentation and labeling of constituent vocalizations, albeit with many more training examples than DTW templates. One weakness in HMM performance is the misclassification of short-duration vocalizations or song units with more variable structure (e.g., some calls, and syllables of plastic songs). To address these and other limitations, new approaches for analyzing bird vocalizations are discussed.
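The template-matching idea behind the DTW-based technique can be sketched as follows. This is a generic dynamic time warping distance on one-dimensional feature sequences, not the authors' implementation; the absolute-difference local cost, the function names and the toy templates are assumptions made for illustration.

```python
def dtw_distance(a, b):
    """Dynamic time warping distance between two numeric sequences."""
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = best cumulative cost of aligning a[:i] with b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            cost[i][j] = local + min(cost[i - 1][j],      # stretch b
                                     cost[i][j - 1],      # stretch a
                                     cost[i - 1][j - 1])  # advance both
    return cost[n][m]


def classify(query, templates):
    """Return the label of the template closest to the query under DTW."""
    return min(templates, key=lambda label: dtw_distance(query, templates[label]))


if __name__ == "__main__":
    templates = {"up": [1, 2, 3], "down": [3, 2, 1]}
    print(classify([1, 2, 2, 3], templates))
```

Because the warping path may repeat elements, a slower rendition of a template still matches it closely, which is why the choice of templates matters more than the amount of training data in this approach.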
Speech Acquisition and Automatic Speech Recognition for Integrated Spacesuit Audio Systems
NASA Technical Reports Server (NTRS)
Huang, Yiteng; Chen, Jingdong; Chen, Shaoyan
2010-01-01
A voice-command human-machine interface system has been developed for spacesuit extravehicular activity (EVA) missions. A multichannel acoustic signal processing method has been created for distant speech acquisition in noisy and reverberant environments. This technology reduces noise by exploiting differences in the statistical nature of signal (i.e., speech) and noise that exist in the spatial and temporal domains. As a result, the automatic speech recognition (ASR) accuracy can be improved to the level at which crewmembers would find the speech interface useful. The developed speech human/machine interface will enable both crewmember usability and operational efficiency. It offers a fast rate of data/text entry and a small overall size, and can be lightweight. In addition, this design will free the hands and eyes of a suited crewmember. The system components and steps include beamforming/multi-channel noise reduction, single-channel noise reduction, speech feature extraction, feature transformation and normalization, feature compression, model adaptation, ASR HMM (Hidden Markov Model) training, and ASR decoding. A state-of-the-art phoneme recognizer can obtain an accuracy rate of 65 percent when the training and testing data are free of noise. When it is used in spacesuits, the rate drops to about 33 percent. With the developed microphone array speech-processing technologies, the performance is improved and the phoneme recognition accuracy rate rises to 44 percent. The recognizer can be further improved by combining the microphone array and HMM model adaptation techniques and using speech samples collected from inside spacesuits. In addition, arithmetic complexity models for the major HMM-based ASR components were developed. They can help real-time ASR system designers select proper tasks when facing constraints in computational resources.
Conserved structure and inferred evolutionary history of long terminal repeats (LTRs)
2013-01-01
Background Long terminal repeats (LTRs, consisting of U3-R-U5 portions) are important elements of retroviruses and related retrotransposons. They are difficult to analyse due to their variability. The aim was to obtain a more comprehensive view of structure, diversity and phylogeny of LTRs than hitherto possible. Results Hidden Markov models (HMM) were created for 11 clades of LTRs belonging to Retroviridae (class III retroviruses), animal Metaviridae (Gypsy/Ty3) elements and plant Pseudoviridae (Copia/Ty1) elements, complementing our work with Orthoretrovirus HMMs. The great variation in LTR length of plant Metaviridae and the few divergent animal Pseudoviridae prevented building HMMs from both of these groups. Animal Metaviridae LTRs had the same conserved motifs as retroviral LTRs, confirming that the two groups are closely related. The conserved motifs were the short inverted repeats (SIRs), integrase recognition signals (5´TGTTRNR…YNYAACA 3´); the polyadenylation signal or AATAAA motif; a GT-rich stretch downstream of the polyadenylation signal; and a less conserved AT-rich stretch corresponding to the core promoter element, the TATA box. Plant Pseudoviridae LTRs differed slightly in having a conserved TATA-box, TATATA, but no conserved polyadenylation signal, plus a much shorter R region. The sensitivity of the HMMs for detection in genomic sequences was around 50% for most models, at a relatively high specificity, suitable for genome screening. The HMMs yielded consensus sequences, which were aligned by creating an HMM model (a ‘Superviterbi’ alignment). This yielded a phylogenetic tree that was compared with a Pol-based tree. Both LTR and Pol trees supported monophyly of retroviruses. In both, Pseudoviridae was ancestral to all other LTR retrotransposons. However, the LTR trees showed the chromovirus portion of Metaviridae clustering together with Pseudoviridae, dividing Metaviridae into two portions with distinct phylogeny. 
Conclusion The HMMs clearly demonstrated a unitary conserved structure of LTRs, supporting that they arose once during evolution. We attempted to follow the evolution of LTRs by tracing their functional foundations, that is, acquisition of RNAse H, a combined promoter/ polyadenylation site, integrase, hairpin priming and the primer binding site (PBS). Available information did not support a simple evolutionary chain of events. PMID:23369192
Bahlmann, Claus; Burkhardt, Hans
2004-03-01
In this paper, we give a comprehensive description of our writer-independent online handwriting recognition system frog on hand. The focus of this work concerns the presentation of the classification/training approach, which we call cluster generative statistical dynamic time warping (CSDTW). CSDTW is a general, scalable, HMM-based method for variable-sized, sequential data that holistically combines cluster analysis and statistical sequence modeling. It can handle general classification problems that rely on this sequential type of data, e.g., speech recognition, genome processing, robotics, etc. Contrary to previous attempts, clustering and statistical sequence modeling are embedded in a single feature space and use a closely related distance measure. We show character recognition experiments of frog on hand using CSDTW on the UNIPEN online handwriting database. The recognition accuracy is significantly higher than reported results of other handwriting recognition systems. Finally, we describe the real-time implementation of frog on hand on a Linux Compaq iPAQ embedded device.
Human activities recognition by head movement using partial recurrent neural network
NASA Astrophysics Data System (ADS)
Tan, Henry C. C.; Jia, Kui; De Silva, Liyanage C.
2003-06-01
Traditionally, human activities recognition has been achieved mainly by the statistical pattern recognition methods or the Hidden Markov Model (HMM). In this paper, we propose a novel use of the connectionist approach for the recognition of ten simple human activities: walking, sitting down, getting up, squatting down and standing up, in both lateral and frontal views, in an office environment. By means of tracking the head movement of the subjects over consecutive frames from a database of different color image sequences, and incorporating the Elman model of the partial recurrent neural network (RNN) that learns the sequential patterns of relative change of the head location in the images, the proposed system is able to robustly classify all the ten activities performed by unseen subjects from both sexes, of different race and physique, with a recognition rate as high as 92.5%. This demonstrates the potential of employing partial RNN to recognize complex activities in the increasingly popular human-activities-based applications.
NASA Astrophysics Data System (ADS)
Chaudhary, A.; Payne, T.; Kinateder, K.; Dao, P.; Beecher, E.; Boone, D.; Elliott, B.
The objective of on-line flagging in this paper is to perform interactive assessment of geosynchronous satellite anomalies such as cross-tagging of satellites in a cluster, solar panel offset change, etc. This assessment will utilize a Bayesian belief propagation procedure and will include automated update of baseline signature data for the satellite, while accounting for seasonal changes. Its purpose is to enable an ongoing, automated assessment of satellite behavior through its life cycle using the photometry data collected during the synoptic search performed by a ground or space-based sensor as a part of its metrics mission. The change in the satellite features will be reported along with the probabilities of Type I and Type II errors. The objective of adaptive sequential hypothesis testing in this paper is to define future sensor tasking for the purpose of characterizing fine features of the satellite. The tasking will be designed to maximize new information with the least number of photometry data points to be collected during the synoptic search by a ground or space-based sensor. Its calculation is based on information entropy techniques. The tasking is defined by considering a sequence of hypotheses regarding the fine features of the satellite. The optimal observation conditions are then ordered so as to maximize new information about a chosen fine feature. The combined objective of on-line flagging and adaptive sequential hypothesis testing is to progressively discover new information about the features of geosynchronous satellites by leveraging the regular but sparse cadence of data collection during the synoptic search performed by a ground or space-based sensor.
Automated Algorithm to Detect Changes in Geostationary Satellite's Configuration and Cross-Tagging
Phan Dao, Air Force Research Laboratory/RVB
By characterizing geostationary satellites based on photometry and color photometry, analysts can evaluate a satellite's operational status and affirm its true identity. The process of ingesting photometry data and deriving satellite physical characteristics can be directed by analysts in a batch mode, meaning using a batch of recent data, or by automated algorithms in an on-line mode in which the assessment is updated with each new data point. Tools used for detecting change to a satellite's status or identity, whether performed with a human in the loop or by automated algorithms, are generally not built to detect with minimum latency and traceable confidence intervals. To alleviate those deficiencies, we investigate the use of Hidden Markov Models (HMM), in a Bayesian Network framework, to infer the hidden state (changed or unchanged) of a three-axis stabilized geostationary satellite using broadband and color photometry. Unlike frequentist statistics, which exploit only the stationary statistics of the observables in the database, HMM also exploits the temporal pattern of the observables. The algorithm also operates in “learning” mode to gradually evolve the HMM and accommodate natural changes such as those due to the seasonal dependence of a GEO satellite's light curve. Our technique is designed to operate with missing color data. The version that ingests both panchromatic and color data can accommodate gaps in color photometry data. That attribute is important because while color indices, e.g. Johnson R and B, enhance the belief (probability) of a hidden state, in real-world situations flux data is collected sporadically in an untasked collect, and color data is limited and sometimes absent. Fluxes are measured with experimental error, whose effect on the algorithm will be studied.
Photometry data in the AFRL's Geo Color Photometry Catalog and Geo Observations with Latitudinal Diversity Simultaneously (GOLDS) data sets are used to simulate a wide variety of operational changes and identity cross tags. The algorithm is tested against simulated sequences of observed magnitudes, mimicking the cadence of untasked SSN and other ground sensors, occasional operational changes and the possible occurrence of cross tags of in-cluster satellites. We would like to show that the on-line algorithm can detect change, sometimes right after the first post-change data point is analyzed, for zero latency. We also want to show the unsupervised “learning” capability that allows the HMM to evolve with time without the user's assistance. For example, the users are not required to “label” the true state of the data points.
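The on-line update of the belief in a hidden changed/unchanged state, including tolerance to missing data points, can be sketched with a standard HMM forward filter. This is not the authors' algorithm: the photometry is reduced here to a hypothetical discrete symbol ("nominal" or "off"), a missing measurement is represented by None and simply skips the measurement update, and all probabilities are invented for illustration.

```python
# Hypothetical two-state model; every number is an illustrative assumption.
STATES = ("unchanged", "changed")
START = {"unchanged": 0.99, "changed": 0.01}
TRANS = {
    "unchanged": {"unchanged": 0.98, "changed": 0.02},
    "changed": {"unchanged": 0.0, "changed": 1.0},  # change treated as permanent
}
EMIT = {
    "unchanged": {"nominal": 0.9, "off": 0.1},
    "changed": {"nominal": 0.2, "off": 0.8},
}


def filter_states(obs, states=STATES, start=START, trans=TRANS, emit=EMIT):
    """Return the filtered belief over states after each data point.

    A data point equal to None is treated as missing: the belief is
    propagated through the transition model but not updated.
    """
    belief = dict(start)
    history = []
    for t, o in enumerate(obs):
        if t > 0:
            # Prediction step: propagate the belief one time step.
            belief = {s: sum(belief[r] * trans[r][s] for r in states)
                      for s in states}
        if o is not None:
            # Measurement update: weight by the observation likelihood.
            belief = {s: belief[s] * emit[s][o] for s in states}
        norm = sum(belief.values())
        belief = {s: v / norm for s, v in belief.items()}
        history.append(belief)
    return history


if __name__ == "__main__":
    for b in filter_states(["nominal", None, "off", "off"]):
        print(round(b["changed"], 3))
```

Each new data point updates the belief immediately, which is what makes a low-latency flag possible; the belief threshold at which a change is declared would have to be chosen against the acceptable Type I and Type II error rates.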
On the use of hidden Markov models for gaze pattern modeling
NASA Astrophysics Data System (ADS)
Mannaru, Pujitha; Balasingam, Balakumar; Pattipati, Krishna; Sibley, Ciara; Coyne, Joseph
2016-05-01
Some of the conventional metrics derived from gaze patterns (on computer screens) to study visual attention, engagement and fatigue are saccade counts, nearest neighbor index (NNI) and duration of dwells/fixations. Each of these metrics has drawbacks in modeling the behavior of gaze patterns; one such drawback comes from the fact that some portions on the screen are not as important as some other portions on the screen. This is addressed by computing the eye gaze metrics corresponding to important areas of interest (AOI) on the screen. There are some challenges in developing accurate AOI based metrics: firstly, the definition of AOI is always fuzzy; secondly, it is possible that the AOI may change adaptively over time. Hence, there is a need to introduce eye-gaze metrics that are aware of the AOI in the field of view; at the same time, the new metrics should be able to automatically select the AOI based on the nature of the gazes. In this paper, we propose a novel way of computing NNI based on continuous hidden Markov models (HMM) that model the gazes as 2D Gaussian observations (x-y coordinates of the gaze) with the mean at the center of the AOI and covariance that is related to the concentration of gazes. The proposed modeling allows us to accurately compute the NNI metric in the presence of multiple, undefined AOI on the screen in the presence of intermittent casual gazing that is modeled as random gazes on the screen.
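For reference, the conventional NNI that the proposed method builds on can be computed as in the sketch below. It assumes the usual Clark-Evans definition (mean observed nearest-neighbor distance divided by the value expected for points scattered at random over the same area, 0.5 * sqrt(area / n)); the function name and the toy coordinates are hypothetical, and the HMM-based, AOI-aware variant of the paper is not reproduced here.

```python
import math


def nearest_neighbor_index(points, area):
    """Conventional NNI for 2D gaze points over a screen of the given area.

    Values below 1 suggest clustered gazes, values near 1 a random
    pattern, and values above 1 a dispersed, regular pattern.
    """
    n = len(points)
    if n < 2:
        raise ValueError("need at least two gaze points")
    total = 0.0
    for i, (xi, yi) in enumerate(points):
        # Distance from point i to its closest other point.
        total += min(math.hypot(xi - xj, yi - yj)
                     for j, (xj, yj) in enumerate(points) if j != i)
    observed = total / n
    expected = 0.5 * math.sqrt(area / n)
    return observed / expected


if __name__ == "__main__":
    clustered = [(0.0, 0.0), (0.01, 0.0), (0.0, 0.01), (0.01, 0.01)]
    print(nearest_neighbor_index(clustered, area=4.0))
```

The brute-force neighbor search is quadratic in the number of gaze points, which is acceptable for short recordings; a spatial index would be the natural replacement for long ones.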
Identification of related gene/protein names based on an HMM of name variations.
Yeganova, L; Smith, L; Wilbur, W J
2004-04-01
Gene and protein names follow few, if any, true naming conventions and are subject to great variation in different occurrences of the same name. This gives rise to two important problems in natural language processing. First, can one locate the names of genes or proteins in free text, and second, can one determine when two names denote the same gene or protein? The first of these problems is a special case of the problem of named entity recognition, while the second is a special case of the problem of automatic term recognition (ATR). We study the second problem, that of gene or protein name variation. Here we describe a system which, given a query gene or protein name, identifies related gene or protein names in a large list. The system is based on a dynamic programming algorithm for sequence alignment in which the mutation matrix is allowed to vary under the control of a fully trainable hidden Markov model.
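The dynamic programming alignment at the core of such a system can be sketched as a weighted edit distance between two names. This is a simplified illustration with fixed costs: in the system described above the mutation matrix varies under the control of a trained HMM, which is not reproduced here. The function name, the cost values and the choice to make separators cheap to insert or delete are all assumptions.

```python
def name_distance(a, b, sub_cost=1.0, gap_cost=1.0, sep_cost=0.1):
    """Weighted edit distance between two gene/protein names.

    Case is ignored, and inserting or deleting a separator character
    is cheaper than inserting or deleting any other character.
    """
    a, b = a.lower(), b.lower()
    separators = set(" -_/")

    def gap(ch):
        return sep_cost if ch in separators else gap_cost

    n, m = len(a), len(b)
    # d[i][j] = minimum cost of aligning a[:i] with b[:j].
    d = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        d[i][0] = d[i - 1][0] + gap(a[i - 1])
    for j in range(1, m + 1):
        d[0][j] = d[0][j - 1] + gap(b[j - 1])
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            match = 0.0 if a[i - 1] == b[j - 1] else sub_cost
            d[i][j] = min(d[i - 1][j - 1] + match,      # match or substitute
                          d[i - 1][j] + gap(a[i - 1]),  # delete from a
                          d[i][j - 1] + gap(b[j - 1]))  # insert from b
    return d[n][m]


if __name__ == "__main__":
    query = "IL-2"
    candidates = ["IL2", "IL6", "interleukin 2"]
    print(sorted(candidates, key=lambda c: name_distance(query, c)))
```

With fixed costs, a digit substitution such as IL2 versus IL6 costs the same as any other substitution even though it usually denotes a different gene, which hints at why a context-dependent, trainable cost matrix is attractive.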
Modern Computational Techniques for the HMMER Sequence Analysis
2013-01-01
This paper reviews recent research on modern computing architectures and on software- and hardware-accelerated algorithms for bioinformatics data analysis, with an emphasis on one of the most important sequence analysis applications, hidden Markov models (HMM). We present a detailed performance comparison of sequence analysis tools on various computing platforms recently developed in the bioinformatics community. The characteristics of sequence analysis, such as its data- and compute-intensive nature, make it very attractive to optimize and parallelize using both traditional software approaches and innovative hardware acceleration technologies. PMID:25937944
Sequence-based heuristics for faster annotation of non-coding RNA families.
Weinberg, Zasha; Ruzzo, Walter L
2006-01-01
Non-coding RNAs (ncRNAs) are functional RNA molecules that do not code for proteins. Covariance Models (CMs) are a useful statistical tool to find new members of an ncRNA gene family in a large genome database, using both sequence and, importantly, RNA secondary structure information. Unfortunately, CM searches are extremely slow. Previously, we created rigorous filters, which provably sacrifice none of a CM's accuracy, while making searches significantly faster for virtually all ncRNA families. However, these rigorous filters make searches slower than heuristics could be. In this paper we introduce profile HMM-based heuristic filters. We show that their accuracy is usually superior to heuristics based on BLAST. Moreover, we compared our heuristics with those used in tRNAscan-SE, whose heuristics incorporate a significant amount of work specific to tRNAs, where our heuristics are generic to any ncRNA. Performance was roughly comparable, so we expect that our heuristics provide a high-quality solution that--unlike family-specific solutions--can scale to hundreds of ncRNA families. The source code is available under GNU Public License at the supplementary web site.
Context-Sensitive Markov Models for Peptide Scoring and Identification from Tandem Mass Spectrometry
Grover, Himanshu; Wallstrom, Garrick; Wu, Christine C.
2013-01-01
Peptide and protein identification via tandem mass spectrometry (MS/MS) lies at the heart of proteomic characterization of biological samples. Several algorithms are able to search, score, and assign peptides to large MS/MS datasets. Most popular methods, however, underutilize the intensity information available in the tandem mass spectrum due to the complex nature of the peptide fragmentation process, thus contributing to loss of potential identifications. We present a novel probabilistic scoring algorithm called Context-Sensitive Peptide Identification (CSPI) based on highly flexible Input-Output Hidden Markov Models (IO-HMM) that capture the influence of peptide physicochemical properties on their observed MS/MS spectra. We use several local and global properties of peptides and their fragment ions from the literature. Comparison with two popular algorithms, Crux (re-implementation of SEQUEST) and X!Tandem, on multiple datasets of varying complexity, shows that peptide identification scores from our models are able to achieve greater discrimination between true and false peptides, identifying up to ∼25% more peptides at a False Discovery Rate (FDR) of 1%. We evaluated two alternative normalization schemes for fragment ion intensities, a global rank-based scheme and a local window-based scheme. Our results indicate the importance of appropriate normalization methods for learning superior models. Further, by combining our scores with Crux using a state-of-the-art procedure, Percolator, we demonstrate the utility of scoring features from intensity-based models, obtaining ∼4-8% additional identifications over Percolator at 1% FDR. IO-HMMs offer a scalable and flexible framework with several modeling choices to learn complex patterns embedded in MS/MS data. PMID:23289783
A novel Bayesian change-point algorithm for genome-wide analysis of diverse ChIPseq data types.
Xing, Haipeng; Liao, Willey; Mo, Yifan; Zhang, Michael Q
2012-12-10
ChIPseq is a widely used technique for investigating protein-DNA interactions. Read density profiles are generated by using next-sequencing of protein-bound DNA and aligning the short reads to a reference genome. Enriched regions are revealed as peaks, which often differ dramatically in shape, depending on the target protein(1). For example, transcription factors often bind in a site- and sequence-specific manner and tend to produce punctate peaks, while histone modifications are more pervasive and are characterized by broad, diffuse islands of enrichment(2). Reliably identifying these regions was the focus of our work. Algorithms for analyzing ChIPseq data have employed various methodologies, from heuristics(3-5) to more rigorous statistical models, e.g. Hidden Markov Models (HMMs)(6-8). We sought a solution that minimized the necessity for difficult-to-define, ad hoc parameters that often compromise resolution and lessen the intuitive usability of the tool. With respect to HMM-based methods, we aimed to curtail parameter estimation procedures and simple, finite state classifications that are often utilized. Additionally, conventional ChIPseq data analysis involves categorization of the expected read density profiles as either punctate or diffuse followed by subsequent application of the appropriate tool. We further aimed to replace the need for these two distinct models with a single, more versatile model, which can capably address the entire spectrum of data types. To meet these objectives, we first constructed a statistical framework that naturally modeled ChIPseq data structures using a cutting edge advance in HMMs(9), which utilizes only explicit formulas-an innovation crucial to its performance advantages. More sophisticated then heuristic models, our HMM accommodates infinite hidden states through a Bayesian model. We applied it to identifying reasonable change points in read density, which further define segments of enrichment. 
Our analysis revealed that our Bayesian Change Point (BCP) algorithm had a reduced computational complexity, evidenced by a shorter run time and a smaller memory footprint. The BCP algorithm was successfully applied to both punctate peak and diffuse island identification with robust accuracy and limited user-defined parameters. This illustrated both its versatility and ease of use. Consequently, we believe it can be implemented readily across broad ranges of data types and end users in a manner that is easily compared and contrasted, making it a great tool for ChIPseq data analysis that can aid in collaboration and corroboration between research groups. Here, we demonstrate the application of BCP to existing transcription factor(10,11) and epigenetic data(12) to illustrate its usefulness.
Clustering Multivariate Time Series Using Hidden Markov Models
Ghassempour, Shima; Girosi, Federico; Maeder, Anthony
2014-01-01
In this paper we describe an algorithm for clustering multivariate time series with variables taking both categorical and continuous values. Time series of this type are frequent in health care, where they represent the health trajectories of individuals. The problem is challenging because categorical variables make it difficult to define a meaningful distance between trajectories. We propose an approach based on Hidden Markov Models (HMMs), where we first map each trajectory into an HMM, then define a suitable distance between HMMs and finally proceed to cluster the HMMs with a method based on a distance matrix. We test our approach on a simulated, but realistic, data set of 1,255 trajectories of individuals of age 45 and over, on a synthetic validation set with known clustering structure, and on a smaller set of 268 trajectories extracted from the longitudinal Health and Retirement Survey. The proposed method can be implemented quite simply using standard packages in R and Matlab and may be a good candidate for solving the difficult problem of clustering multivariate time series with categorical variables using tools that do not require advanced statistical knowledge, and therefore are accessible to a wide range of researchers. PMID:24662996
Detecting critical state before phase transition of complex systems by hidden Markov model
NASA Astrophysics Data System (ADS)
Liu, Rui; Chen, Pei; Li, Yongjun; Chen, Luonan
Identifying the critical state or pre-transition state just before the occurrence of a phase transition is a challenging task, because the state of the system may show little apparent change before this critical transition during the gradual parameter variations. Such dynamics of phase transition is generally composed of three stages, i.e., before-transition state, pre-transition state, and after-transition state, which can be considered as three different Markov processes. Thus, based on this dynamical feature, we present a novel computational method, i.e., hidden Markov model (HMM), to detect the switching point of the two Markov processes from the before-transition state (a stationary Markov process) to the pre-transition state (a time-varying Markov process), thereby identifying the pre-transition state or early-warning signals of the phase transition. To validate the effectiveness, we apply this method to detect the signals of the imminent phase transitions of complex systems based on the simulated datasets, and further identify the pre-transition states as well as their critical modules for three real datasets, i.e., the acute lung injury triggered by phosgene inhalation, MCF-7 human breast cancer caused by heregulin, and HCV-induced dysplasia and hepatocellular carcinoma.
Jeong, Hyundoo; Yoon, Byung-Jun
2017-03-14
Network querying algorithms provide computational means to identify conserved network modules in large-scale biological networks that are similar to known functional modules, such as pathways or molecular complexes. Two main challenges for network querying algorithms are the high computational complexity of detecting potential isomorphism between the query and the target graphs and ensuring the biological significance of the query results. In this paper, we propose SEQUOIA, a novel network querying algorithm that effectively addresses these issues by utilizing a context-sensitive random walk (CSRW) model for network comparison and minimizing the network conductance of potential matches in the target network. The CSRW model, inspired by the pair hidden Markov model (pair-HMM) that has been widely used for sequence comparison and alignment, can accurately assess the node-to-node correspondence between different graphs by accounting for node insertions and deletions. The proposed algorithm identifies high-scoring network regions based on the CSRW scores, which are subsequently extended by maximally reducing the network conductance of the identified subnetworks. Performance assessment based on real PPI networks and known molecular complexes shows that SEQUOIA outperforms existing methods and clearly enhances the biological significance of the query results. The source code and datasets can be downloaded from http://www.ece.tamu.edu/~bjyoon/SEQUOIA .
NASA Astrophysics Data System (ADS)
Hossen, Jakir; Jacobs, Eddie L.; Chari, Srikant
2014-03-01
In this paper, we propose a real-time human versus animal classification technique using a pyro-electric sensor array and Hidden Markov Model. The technique starts with the variational energy functional level set segmentation technique to separate the object from background. After segmentation, we convert the segmented object to a signal by considering column-wise pixel values and then finding the wavelet coefficients of the signal. HMMs are trained to statistically model the wavelet features of individuals through an expectation-maximization learning process. Human versus animal classifications are made by evaluating a set of new wavelet feature data against the trained HMMs using the maximum-likelihood criterion. Human and animal data acquired using a pyro-electric sensor in different terrains are used for performance evaluation of the algorithms. Failures of the computationally efficient SURF feature based approach that we developed in our previous research are due to distorted images produced when the object runs very fast or when the temperature difference between target and background is not sufficient to accurately profile the object. We show that wavelet based HMMs work well for handling some of the distorted profiles in the data set. Further, the HMM achieves an improved classification rate over the SURF algorithm with almost the same computational time.
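The maximum-likelihood decision described in this abstract can be illustrated with a minimal sketch. It assumes discrete observation symbols and two hand-picked toy models; the names `human_hmm` and `animal_hmm` and all probabilities are hypothetical and are not taken from the paper, which uses wavelet features and models trained by expectation-maximization.

```python
import math

def forward_log_likelihood(obs, start, trans, emit):
    """Return log P(obs | model) for a discrete HMM (scaled forward algorithm)."""
    n = len(start)
    # Initialise with start probabilities times the emission of the first symbol.
    alpha = [start[i] * emit[i][obs[0]] for i in range(n)]
    log_lik = 0.0
    for t in range(len(obs)):
        if t > 0:
            # Propagate one step through the transition matrix, then emit obs[t].
            alpha = [
                sum(alpha[i] * trans[i][j] for i in range(n)) * emit[j][obs[t]]
                for j in range(n)
            ]
        scale = sum(alpha)
        if scale == 0.0:
            return float("-inf")  # the sequence is impossible under this model
        log_lik += math.log(scale)
        alpha = [a / scale for a in alpha]  # rescale to avoid underflow
    return log_lik

def classify(obs, models):
    """Pick the model name whose HMM gives the highest log-likelihood."""
    return max(models, key=lambda name: forward_log_likelihood(obs, *models[name]))

# Hypothetical toy models: (start, transition, emission) over symbols {0, 1}.
human_hmm = ([0.5, 0.5], [[0.9, 0.1], [0.1, 0.9]], [[0.9, 0.1], [0.8, 0.2]])
animal_hmm = ([0.5, 0.5], [[0.9, 0.1], [0.1, 0.9]], [[0.1, 0.9], [0.2, 0.8]])
models = {"human": human_hmm, "animal": animal_hmm}

label = classify([0, 0, 1, 0, 0], models)
```

Working with log-likelihoods and per-step rescaling keeps the comparison numerically stable for long feature sequences.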
A Stochastic Framework for Evaluating Seizure Prediction Algorithms Using Hidden Markov Models
Wong, Stephen; Gardner, Andrew B.; Krieger, Abba M.; Litt, Brian
2007-01-01
Responsive, implantable stimulation devices to treat epilepsy are now in clinical trials. New evidence suggests that these devices may be more effective when they deliver therapy before seizure onset. Despite years of effort, prospective seizure prediction, which could improve device performance, remains elusive. In large part, this is explained by lack of agreement on a statistical framework for modeling seizure generation and a method for validating algorithm performance. We present a novel stochastic framework based on a three-state hidden Markov model (HMM) (representing interictal, preictal, and seizure states) with the feature that periods of increased seizure probability can transition back to the interictal state. This notion reflects clinical experience and may enhance interpretation of published seizure prediction studies. Our model accommodates clipped EEG segments and formalizes intuitive notions regarding statistical validation. We derive equations for type I and type II errors as a function of the number of seizures, duration of interictal data, and prediction horizon length and we demonstrate the model’s utility with a novel seizure detection algorithm that appeared to predict seizure onset. We propose this framework as a vital tool for designing and validating prediction algorithms and for facilitating collaborative research in this area. PMID:17021032
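The three-state structure described in this abstract can be sketched as a small simulation of the hidden state chain alone, with no observation model. All transition probabilities below are hypothetical placeholders chosen for illustration, not values from the paper. The one property taken from the abstract is that the preictal state can return to the interictal state without a seizure.

```python
import random

STATES = ["interictal", "preictal", "seizure"]

# Hypothetical per-step transition probabilities; each row sums to 1.
# The nonzero preictal -> interictal entry encodes the "can transition back" feature.
TRANS = {
    "interictal": {"interictal": 0.95, "preictal": 0.05, "seizure": 0.0},
    "preictal": {"interictal": 0.10, "preictal": 0.80, "seizure": 0.10},
    "seizure": {"interictal": 1.0, "preictal": 0.0, "seizure": 0.0},
}

def simulate(n_steps, seed=0, state="interictal"):
    """Sample a hidden-state path of length n_steps from the toy chain."""
    rng = random.Random(seed)
    path = [state]
    for _ in range(n_steps - 1):
        row = TRANS[state]
        state = rng.choices(STATES, weights=[row[s] for s in STATES])[0]
        path.append(state)
    return path

path = simulate(500, seed=1)
```

In this toy chain a seizure can only be entered from the preictal state, so counting preictal episodes that end without a seizure gives a feel for how false warnings arise.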
In Vivo Control of CpG and Non-CpG DNA Methylation by DNA Methyltransferases
Arand, Julia; Spieler, David; Karius, Tommy; Branco, Miguel R.; Meilinger, Daniela; Meissner, Alexander; Jenuwein, Thomas; Xu, Guoliang; Leonhardt, Heinrich; Wolf, Verena; Walter, Jörn
2012-01-01
The enzymatic control of the setting and maintenance of symmetric and non-symmetric DNA methylation patterns in a particular genome context is not well understood. Here, we describe a comprehensive analysis of DNA methylation patterns generated by high resolution sequencing of hairpin-bisulfite amplicons of selected single copy genes and repetitive elements (LINE1, B1, IAP-LTR-retrotransposons, and major satellites). The analysis unambiguously identifies a substantial amount of regional incomplete methylation maintenance, i.e. hemimethylated CpG positions, with variant degrees among cell types. Moreover, non-CpG cytosine methylation is confined to ESCs and exclusively catalysed by Dnmt3a and Dnmt3b. This sequence position–, cell type–, and region-dependent non-CpG methylation is strongly linked to neighboring CpG methylation and requires the presence of Dnmt3L. The generation of a comprehensive data set of 146,000 CpG dyads was used to apply and develop parameter estimated hidden Markov models (HMM) to calculate the relative contribution of DNA methyltransferases (Dnmts) for de novo and maintenance DNA methylation. The comparative modelling included wild-type ESCs and mutant ESCs deficient for Dnmt1, Dnmt3a, Dnmt3b, or Dnmt3a/3b, respectively. The HMM analysis identifies a considerable de novo methylation activity for Dnmt1 at certain repetitive elements and single copy sequences. Dnmt3a and Dnmt3b contribute de novo function. However, both enzymes are also essential to maintain symmetrical CpG methylation at distinct repetitive and single copy sequences in ESCs. PMID:22761581
Temporal stability of visual search-driven biometrics
NASA Astrophysics Data System (ADS)
Yoon, Hong-Jun; Carmichael, Tandy R.; Tourassi, Georgia
2015-03-01
Previously, we have shown the potential of using an individual's visual search pattern as a possible biometric. That study focused on viewing images displaying dot-patterns with different spatial relationships to determine which pattern can be more effective in establishing the identity of an individual. In this follow-up study we investigated the temporal stability of this biometric. We performed an experiment with 16 individuals asked to search for a predetermined feature of a random-dot pattern as we tracked their eye movements. Each participant completed four testing sessions consisting of two dot patterns repeated twice. One dot pattern displayed concentric circles shifted to the left or right side of the screen overlaid with visual noise, and participants were asked which side the circles were centered on. The second dot-pattern displayed a number of circles (between 0 and 4) scattered on the screen overlaid with visual noise, and participants were asked how many circles they could identify. Each session contained 5 untracked tutorial questions and 50 tracked test questions (200 total tracked questions per participant). To create each participant's "fingerprint", we constructed a Hidden Markov Model (HMM) from the gaze data representing the underlying visual search and cognitive process. The accuracy of the derived HMM models was evaluated using cross-validation for various time-dependent train-test conditions. Subject identification accuracy ranged from 17.6% to 41.8% for all conditions, which is significantly higher than random guessing (1/16 = 6.25%). The results suggest that visual search pattern is a promising, temporally stable personalized fingerprint of perceptual organization.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Yoon, Hong-Jun; Carmichael, Tandy; Tourassi, Georgia
Previously, we have shown the potential of using an individual's visual search pattern as a possible biometric. That study focused on viewing images displaying dot-patterns with different spatial relationships to determine which pattern can be more effective in establishing the identity of an individual. In this follow-up study we investigated the temporal stability of this biometric. We performed an experiment with 16 individuals asked to search for a predetermined feature of a random-dot pattern as we tracked their eye movements. Each participant completed four testing sessions consisting of two dot patterns repeated twice. One dot pattern displayed concentric circles shifted to the left or right side of the screen overlaid with visual noise, and participants were asked which side the circles were centered on. The second dot-pattern displayed a number of circles (between 0 and 4) scattered on the screen overlaid with visual noise, and participants were asked how many circles they could identify. Each session contained 5 untracked tutorial questions and 50 tracked test questions (200 total tracked questions per participant). To create each participant's "fingerprint", we constructed a Hidden Markov Model (HMM) from the gaze data representing the underlying visual search and cognitive process. The accuracy of the derived HMM models was evaluated using cross-validation for various time-dependent train-test conditions. Subject identification accuracy ranged from 17.6% to 41.8% for all conditions, which is significantly higher than random guessing (1/16 = 6.25%). The results suggest that visual search pattern is a promising, fairly stable personalized fingerprint of perceptual organization.
A hybrid smartphone indoor positioning solution for mobile LBS.
Liu, Jingbin; Chen, Ruizhi; Pei, Ling; Guinness, Robert; Kuusniemi, Heidi
2012-12-12
Smartphone positioning is an enabling technology used to create new business in the navigation and mobile location-based services (LBS) industries. This paper presents a smartphone indoor positioning engine named HIPE that can be easily integrated with mobile LBS. HIPE is a hybrid solution that fuses measurements of smartphone sensors with wireless signals. The smartphone sensors are used to measure the user's motion dynamics information (MDI), which represent the spatial correlation of various locations. Two algorithms based on hidden Markov model (HMM) problems, the grid-based filter and the Viterbi algorithm, are used in this paper as the central processor for data fusion to resolve the position estimates, and these algorithms are applicable for different applications, e.g., real-time navigation and location tracking, respectively. HIPE is more widely applicable for various motion scenarios than solutions proposed in previous studies because it uses no deterministic motion models, which have been commonly used in previous works. The experimental results showed that HIPE can provide adequate positioning accuracy and robustness for different scenarios of MDI combinations. HIPE is a cost-efficient solution, and it can work flexibly with different smartphone platforms, which may have different types of sensors available for the measurement of MDI data. The reliability of the positioning solution was found to increase with increasing precision of the MDI data.
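The Viterbi algorithm named in this abstract can be sketched for a toy positioning problem. The three-cell corridor, the transition and emission tables, and the observation sequence are all hypothetical and only illustrate the decoding step; they are not HIPE's actual models or data.

```python
import math

def viterbi(obs, start, trans, emit):
    """Return the most likely hidden-state path for a discrete HMM (log-space)."""
    def log(p):
        return math.log(p) if p > 0 else float("-inf")

    n = len(start)
    score = [log(start[s]) + log(emit[s][obs[0]]) for s in range(n)]
    back = []  # back[t][s] = best predecessor of state s at step t + 1
    for o in obs[1:]:
        prev = score
        pointers, score = [], []
        for s in range(n):
            best = max(range(n), key=lambda r: prev[r] + log(trans[r][s]))
            pointers.append(best)
            score.append(prev[best] + log(trans[best][s]) + log(emit[s][o]))
        back.append(pointers)
    # Trace the best path backwards from the best final state.
    state = max(range(n), key=lambda s: score[s])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path

# Hypothetical corridor of three cells; a user stays or moves to a neighbouring cell.
start = [1 / 3, 1 / 3, 1 / 3]
trans = [[0.7, 0.3, 0.0], [0.15, 0.7, 0.15], [0.0, 0.3, 0.7]]
# Observation k means "the signal fingerprint best matches cell k" (assumed 80% reliable).
emit = [[0.8, 0.1, 0.1], [0.1, 0.8, 0.1], [0.1, 0.1, 0.8]]

track = viterbi([0, 0, 1, 2, 2], start, trans, emit)
```

Viterbi needs the whole observation sequence before it traces back, which fits the abstract's pairing of it with location tracking, while a filter that updates the position distribution at every step fits real-time navigation.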
Plaisier, Christopher L; Bare, J Christopher; Baliga, Nitin S
2011-07-01
Transcriptome profiling studies have produced staggering numbers of gene co-expression signatures for a variety of biological systems. A significant fraction of these signatures will be partially or fully explained by miRNA-mediated targeted transcript degradation. miRvestigator takes as input lists of co-expressed genes from Caenorhabditis elegans, Drosophila melanogaster, G. gallus, Homo sapiens, Mus musculus or Rattus norvegicus and identifies the specific miRNAs that are likely to bind to 3' un-translated region (UTR) sequences to mediate the observed co-regulation. The novelty of our approach is the miRvestigator hidden Markov model (HMM) algorithm which systematically computes a similarity P-value for each unique miRNA seed sequence from the miRNA database miRBase to an overrepresented sequence motif identified within the 3'-UTR of the query genes. We have made this miRNA discovery tool accessible to the community by integrating our HMM algorithm with a proven algorithm for de novo discovery of miRNA seed sequences and wrapping these algorithms into a user-friendly interface. Additionally, the miRvestigator web server also produces a list of putative miRNA binding sites within 3'-UTRs of the query transcripts to facilitate the design of validation experiments. The miRvestigator is freely available at http://mirvestigator.systemsbiology.net.
The cellular source for APOBEC3G's incorporation into HIV-1
2011-01-01
Background Human APOBEC3G (hA3G) has been identified as a cellular inhibitor of HIV-1 infectivity. Viral incorporation of hA3G is an essential step for its antiviral activity. Although the mechanism underlying hA3G virion encapsidation has been investigated extensively, the cellular source of viral hA3G remains unclear. Results Previous studies have shown that hA3G forms low-molecular-mass (LMM) and high-molecular-mass (HMM) complexes. Our work herein provides evidence that the majority of newly-synthesized hA3G interacts with membrane lipid raft domains to form Lipid raft-associated hA3G (RA hA3G), which serve as the precursor of the mature HMM hA3G complex, while a minority of newly-synthesized hA3G remains in the cytoplasm as a soluble LMM form. The distribution of hA3G among the soluble LMM form, the RA LMM form and the mature forms of HMM is regulated by a mechanism involving the N-terminal part of the linker region and the C-terminus of hA3G. Mutagenesis studies reveal a direct correlation between the ability of hA3G to form the RA LMM complex and its viral incorporation. Conclusions Together these data suggest that the Lipid raft-associated LMM A3G complex functions as the cellular source of viral hA3G. PMID:21211018
Cui, Xuefeng; Lu, Zhiwu; Wang, Sheng; Jing-Yan Wang, Jim; Gao, Xin
2016-06-15
Protein homology detection, a fundamental problem in computational biology, is an indispensable step toward predicting protein structures and understanding protein functions. Despite the advances in recent decades on sequence alignment, threading and alignment-free methods, protein homology detection remains a challenging open problem. Recently, network methods that try to find transitive paths in the protein structure space demonstrate the importance of incorporating network information of the structure space. Yet, current methods merge the sequence space and the structure space into a single space, and thus introduce inconsistency in combining different sources of information. We present a novel network-based protein homology detection method, CMsearch, based on cross-modal learning. Instead of exploring a single network built from the mixture of sequence and structure space information, CMsearch builds two separate networks to represent the sequence space and the structure space. It then learns sequence-structure correlation by simultaneously taking sequence information, structure information, sequence space information and structure space information into consideration. We tested CMsearch on two challenging tasks, protein homology detection and protein structure prediction, by querying all 8332 PDB40 proteins. Our results demonstrate that CMsearch is insensitive to the similarity metrics used to define the sequence and the structure spaces. By using HMM-HMM alignment as the sequence similarity metric, CMsearch clearly outperforms state-of-the-art homology detection methods and the CASP-winning template-based protein structure prediction methods. Our program is freely available for download from http://sfb.kaust.edu.sa/Pages/Software.aspx . Contact: xin.gao@kaust.edu.sa. Supplementary data are available at Bioinformatics online. © The Author 2016. Published by Oxford University Press.
Online Farsi digit recognition using their upper half structure
NASA Astrophysics Data System (ADS)
Ghods, Vahid; Sohrabi, Mohammad Karim
2015-03-01
In this paper, we investigated the efficiency of using the upper half of Farsi numerical digit structure. In other words, half of the data (the upper half of the digit shapes) was exploited for the recognition of Farsi numerical digits. This method can be used for both offline and online recognition. Using half of the data is more efficient in processing speed and data transfer and, in this application, in accuracy. A hidden Markov model (HMM) was used to classify online Farsi digits. Evaluation was performed on the TMU dataset. This dataset contains more than 1200 samples of online handwritten Farsi digits. The proposed method yielded a higher recognition rate.
Heuristic algorithm for optical character recognition of Arabic script
NASA Astrophysics Data System (ADS)
Yarman-Vural, Fatos T.; Atici, A.
1996-02-01
In this paper, a heuristic method is developed for segmentation, feature extraction and recognition of the Arabic script. The study is part of a large project for the transcription of the documents in Ottoman Archives. A geometrical and topological feature analysis method is developed for segmentation and feature extraction stages. Chain code transformation is applied to main strokes of the characters which are then classified by the hidden Markov model (HMM) in the recognition stage. Experimental results indicate that the performance of the proposed method is impressive, provided that the thinning process does not yield spurious branches.
Automatic identification of individual killer whales.
Brown, Judith C; Smaragdis, Paris; Nousek-McGregor, Anna
2010-09-01
Following the successful use of HMM and GMM models for classification of a set of 75 calls of northern resident killer whales into call types [Brown, J. C., and Smaragdis, P., J. Acoust. Soc. Am. 125, 221-224 (2009)], the use of these same methods has been explored for the identification of vocalizations from the same call type N2 of four individual killer whales. With an average of 20 vocalizations from each of the individuals the pairwise comparisons have an extremely high success rate of 80 to 100% and the identifications within the entire group yield around 78%.
Force Sensor Based Tool Condition Monitoring Using a Heterogeneous Ensemble Learning Model
Wang, Guofeng; Yang, Yinwei; Li, Zhimeng
2014-01-01
Tool condition monitoring (TCM) plays an important role in improving machining efficiency and guaranteeing workpiece quality. In order to realize reliable recognition of the tool condition, a robust classifier needs to be constructed to depict the relationship between tool wear states and sensory information. However, because of the complexity of the machining process and the uncertainty of the tool wear evolution, it is hard for a single classifier to fit all the collected samples without sacrificing generalization ability. In this paper, heterogeneous ensemble learning is proposed to realize tool condition monitoring in which the support vector machine (SVM), hidden Markov model (HMM) and radial basis function (RBF) are selected as base classifiers and a stacking ensemble strategy is further used to reflect the relationship between the outputs of these base classifiers and tool wear states. Based on the heterogeneous ensemble learning classifier, an online monitoring system is constructed in which the harmonic features are extracted from force signals and a minimal redundancy and maximal relevance (mRMR) algorithm is utilized to select the most prominent features. To verify the effectiveness of the proposed method, a titanium alloy milling experiment was carried out and samples with different tool wear states were collected to build the proposed heterogeneous ensemble learning classifier. Moreover, the homogeneous ensemble learning model and majority voting strategy are also adopted to make a comparison. The analysis and comparison results show that the proposed heterogeneous ensemble learning classifier performs better in both classification accuracy and stability. PMID:25405514
Force sensor based tool condition monitoring using a heterogeneous ensemble learning model.
Wang, Guofeng; Yang, Yinwei; Li, Zhimeng
2014-11-14
Tool condition monitoring (TCM) plays an important role in improving machining efficiency and guaranteeing workpiece quality. In order to realize reliable recognition of the tool condition, a robust classifier needs to be constructed to depict the relationship between tool wear states and sensory information. However, because of the complexity of the machining process and the uncertainty of the tool wear evolution, it is hard for a single classifier to fit all the collected samples without sacrificing generalization ability. In this paper, heterogeneous ensemble learning is proposed to realize tool condition monitoring in which the support vector machine (SVM), hidden Markov model (HMM) and radial basis function (RBF) are selected as base classifiers and a stacking ensemble strategy is further used to reflect the relationship between the outputs of these base classifiers and tool wear states. Based on the heterogeneous ensemble learning classifier, an online monitoring system is constructed in which the harmonic features are extracted from force signals and a minimal redundancy and maximal relevance (mRMR) algorithm is utilized to select the most prominent features. To verify the effectiveness of the proposed method, a titanium alloy milling experiment was carried out and samples with different tool wear states were collected to build the proposed heterogeneous ensemble learning classifier. Moreover, the homogeneous ensemble learning model and majority voting strategy are also adopted to make a comparison. The analysis and comparison results show that the proposed heterogeneous ensemble learning classifier performs better in both classification accuracy and stability.
2012-01-01
Background Hidden Markov Models (HMMs) are a powerful tool for protein domain identification. The Pfam database notably provides a large collection of HMMs which are widely used for the annotation of proteins in newly sequenced organisms. In Pfam, each domain family is represented by a curated multiple sequence alignment from which a profile HMM is built. In spite of their high specificity, HMMs may lack sensitivity when searching for domains in divergent organisms. This is particularly the case for species with a biased amino-acid composition, such as P. falciparum, the main causal agent of human malaria. In this context, fitting HMMs to the specificities of the target proteome can help identify additional domains. Results Using P. falciparum as an example, we compare approaches that have been proposed for this problem, and present two alternative methods. Because previous attempts strongly rely on known domain occurrences in the target species or its close relatives, they mainly improve the detection of domains which belong to already identified families. Our methods learn global correction rules that adjust amino-acid distributions associated with the match states of HMMs. These rules are applied to all match states of the whole HMM library, thus enabling the detection of domains from previously absent families. Additionally, we propose a procedure to estimate the proportion of false positives among the newly discovered domains. Starting with the Pfam standard library, we build several new libraries with the different HMM-fitting approaches. These libraries are first used to detect new domain occurrences with low E-values. Second, by applying the Co-Occurrence Domain Discovery (CODD) procedure we have recently proposed, the libraries are further used to identify likely occurrences among potential domains with higher E-values. Conclusion We show that the new approaches allow identification of several domain families previously absent in the P. falciparum proteome and the Apicomplexa phylum, and identify many domains that are not detected by previous approaches. In terms of the number of newly discovered domains, the new approaches outperform the previous ones when no close species are available or when they are used to identify likely occurrences among potential domains with high E-values. All predictions on P. falciparum have been integrated into a dedicated website which pools all known/new annotations of protein domains and functions for this organism. Software implementing the two proposed approaches is available at the same address: http://www.lirmm.fr/~terrapon/HMMfit/ PMID:22548871
Terrapon, Nicolas; Gascuel, Olivier; Maréchal, Eric; Bréhélin, Laurent
2012-05-01
Hidden Markov Models (HMMs) are a powerful tool for protein domain identification. The Pfam database notably provides a large collection of HMMs which are widely used for the annotation of proteins in newly sequenced organisms. In Pfam, each domain family is represented by a curated multiple sequence alignment from which a profile HMM is built. In spite of their high specificity, HMMs may lack sensitivity when searching for domains in divergent organisms. This is particularly the case for species with a biased amino-acid composition, such as P. falciparum, the main causal agent of human malaria. In this context, fitting HMMs to the specificities of the target proteome can help identify additional domains. Using P. falciparum as an example, we compare approaches that have been proposed for this problem, and present two alternative methods. Because previous attempts strongly rely on known domain occurrences in the target species or its close relatives, they mainly improve the detection of domains which belong to already identified families. Our methods learn global correction rules that adjust amino-acid distributions associated with the match states of HMMs. These rules are applied to all match states of the whole HMM library, thus enabling the detection of domains from previously absent families. Additionally, we propose a procedure to estimate the proportion of false positives among the newly discovered domains. Starting with the Pfam standard library, we build several new libraries with the different HMM-fitting approaches. These libraries are first used to detect new domain occurrences with low E-values. Second, by applying the Co-Occurrence Domain Discovery (CODD) procedure we have recently proposed, the libraries are further used to identify likely occurrences among potential domains with higher E-values. We show that the new approaches allow identification of several domain families previously absent in the P. falciparum proteome and the Apicomplexa phylum, and identify many domains that are not detected by previous approaches. In terms of the number of newly discovered domains, the new approaches outperform the previous ones when no close species are available or when they are used to identify likely occurrences among potential domains with high E-values. All predictions on P. falciparum have been integrated into a dedicated website which pools all known/new annotations of protein domains and functions for this organism. Software implementing the two proposed approaches is available at the same address: http://www.lirmm.fr/~terrapon/HMMfit/
Taborri, Juri; Rossi, Stefano; Palermo, Eduardo; Patanè, Fabrizio; Cappa, Paolo
2014-01-01
In this work, we decided to apply a hierarchical weighted decision, proposed and used in other research fields, for the recognition of gait phases. The developed and validated novel distributed classifier is based on hierarchical weighted decision from outputs of scalar Hidden Markov Models (HMM) applied to angular velocities of foot, shank, and thigh. The angular velocities of ten healthy subjects were acquired via three uni-axial gyroscopes embedded in inertial measurement units (IMUs) during one walking task, repeated three times, on a treadmill. After validating the novel distributed classifier and the scalar and vectorial classifiers already proposed in the literature with cross-validation, the classifiers were compared for sensitivity, specificity, and computational load for all combinations of the three targeted anatomical segments. Moreover, the performance of the novel distributed classifier in the estimation of gait variability in terms of mean time and coefficient of variation was evaluated. The highest values of specificity and sensitivity (>0.98) for the three classifiers examined here were obtained when the angular velocity of the foot was processed. Distributed and vectorial classifiers reached acceptable values (>0.95) when the angular velocities of shank and thigh were analyzed. Distributed and scalar classifiers showed values of computational load about 100 times lower than the one obtained with the vectorial classifier. In addition, distributed classifiers showed an excellent reliability for the evaluation of mean time and a good/excellent reliability for the coefficient of variation. In conclusion, due to its better performance and small computational load, the novel distributed classifier proposed here can be implemented in real-time applications of gait phase recognition, such as evaluating gait variability in patients or controlling active orthoses for the recovery of mobility of lower limb joints. PMID:25184488
Detection of Splice Sites Using Support Vector Machine
NASA Astrophysics Data System (ADS)
Varadwaj, Pritish; Purohit, Neetesh; Arora, Bhumika
Automatic identification and annotation of exon and intron regions of genes from DNA sequences has been an important research area in the field of computational biology. Several approaches, viz. Hidden Markov Models (HMM), Artificial Intelligence (AI) based machine learning, and Digital Signal Processing (DSP) techniques, have been used extensively and independently by various researchers to address this challenging task. In this work, we propose a Support Vector Machine based kernel learning approach for detection of splice sites (the exon-intron boundary) in a gene. Electron-Ion Interaction Potential (EIIP) values of nucleotides have been used for mapping character sequences to corresponding numeric sequences. A Radial Basis Function (RBF) kernel SVM is trained using the EIIP numeric sequences. It was then tested on a test gene dataset for detection of splice sites by shifting a window (of 12 residues). The window size and various important parameters of the SVM kernel have been optimized for better accuracy. Receiver Operating Characteristic (ROC) curves have been utilized for displaying the sensitivity rate of the classifier, and results showed 94.82% accuracy for splice site detection on the test dataset.
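The encoding and window-shifting steps described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' code: the EIIP values are the ones commonly quoted for the four nucleotides (the abstract does not list them), the function names and the example sequence are hypothetical, and the SVM training step is left out.

```python
# EIIP values commonly quoted for DNA nucleotides (assumption: the abstract
# itself does not give the numbers).
EIIP = {"A": 0.1260, "C": 0.1340, "G": 0.0806, "T": 0.1335}


def eiip_encode(seq):
    """Map a DNA string to its EIIP numeric sequence."""
    return [EIIP[base] for base in seq.upper()]


def sliding_windows(seq, size=12, step=1):
    """Yield (start, encoded_window) for every full window of `size` residues."""
    encoded = eiip_encode(seq)
    for start in range(0, len(encoded) - size + 1, step):
        yield start, encoded[start:start + size]


# Hypothetical 20-residue sequence, used only to show the window count.
windows = list(sliding_windows("ATGGTAAGTCCATTGCAGGT"))
```

Each encoded window would then be passed to a classifier as one fixed-length feature vector; the window size of 12 follows the abstract, while the step of 1 is an assumption.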
Extracting duration information in a picture category decoding task using hidden Markov Models
NASA Astrophysics Data System (ADS)
Pfeiffer, Tim; Heinze, Nicolai; Frysch, Robert; Deouell, Leon Y.; Schoenfeld, Mircea A.; Knight, Robert T.; Rose, Georg
2016-04-01
Objective. Adapting classifiers for the purpose of brain signal decoding is a major challenge in brain-computer-interface (BCI) research. In a previous study we showed in principle that hidden Markov models (HMM) are a suitable alternative to the well-studied static classifiers. However, since we investigated a rather straightforward task, advantages from modeling of the signal could not be assessed. Approach. Here, we investigate a more complex data set in order to find out to what extent HMMs, as a dynamic classifier, can provide useful additional information. We show for a visual decoding problem that besides category information, HMMs can simultaneously decode picture duration without requiring additional training. This decoding is based on a strong correlation that we found between picture duration and the behavior of the Viterbi paths. Main results. Decoding accuracies of up to 80% could be obtained for category and duration decoding with a single classifier trained on category information only. Significance. The extraction of multiple types of information using a single classifier enables the processing of more complex problems, while preserving good training results even on small databases. Therefore, it provides a convenient framework for online real-life BCI utilizations.
Ni, Yepeng; Liu, Jianbo; Liu, Shan; Bai, Yaxin
2016-01-01
With the rapid development of smartphones and wireless networks, indoor location-based services have become more and more prevalent. Due to the sophisticated propagation of radio signals, the Received Signal Strength Indicator (RSSI) shows a significant variation during pedestrian walking, which introduces critical errors in deterministic indoor positioning. To solve this problem, we present a novel method to improve the indoor pedestrian positioning accuracy by embedding a fuzzy pattern recognition algorithm into a Hidden Markov Model. The fuzzy pattern recognition algorithm follows the rule that the RSSI fading has a positive correlation to the distance between the measuring point and the AP location even during a dynamic positioning measurement. Through this algorithm, we use the RSSI variation trend to replace the specific RSSI value to achieve a fuzzy positioning. The transition probability of the Hidden Markov Model is trained by the fuzzy pattern recognition algorithm with pedestrian trajectories. Using the Viterbi algorithm with the trained model, we can obtain a set of hidden location states. In our experiments, we demonstrate that, compared with the deterministic pattern matching algorithm, our method can greatly improve the positioning accuracy and shows robust environmental adaptability. PMID:27618053
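The Viterbi decoding step mentioned in this abstract can be illustrated with a small log-space implementation. This is a generic textbook sketch, not the authors' method: the two location states, the "rising"/"falling" RSSI-trend observations, and all probabilities are hypothetical values chosen only to make the example runnable.

```python
import math


def viterbi(obs, states, start_p, trans_p, emit_p):
    """Return the most likely hidden-state path for an observation sequence."""
    # best[t][s]: log-probability of the best path that ends in state s at time t
    best = [{s: math.log(start_p[s]) + math.log(emit_p[s][obs[0]]) for s in states}]
    back = [{}]
    for t in range(1, len(obs)):
        best.append({})
        back.append({})
        for s in states:
            # Pick the predecessor that maximises the path score into s.
            prev, score = max(
                ((p, best[t - 1][p] + math.log(trans_p[p][s])) for p in states),
                key=lambda item: item[1],
            )
            best[t][s] = score + math.log(emit_p[s][obs[t]])
            back[t][s] = prev
    # Trace back from the best final state.
    path = [max(states, key=lambda s: best[-1][s])]
    for t in range(len(obs) - 1, 0, -1):
        path.append(back[t][path[-1]])
    return path[::-1]


# Hypothetical toy model: is the pedestrian near or far from an access point?
states = ("near", "far")
start_p = {"near": 0.5, "far": 0.5}
trans_p = {"near": {"near": 0.8, "far": 0.2}, "far": {"near": 0.2, "far": 0.8}}
emit_p = {"near": {"rising": 0.9, "falling": 0.1},
          "far": {"rising": 0.1, "falling": 0.9}}
path = viterbi(["rising", "rising", "falling", "falling"],
               states, start_p, trans_p, emit_p)
```

All probabilities here are non-zero, so taking logarithms is safe; a model with zero-probability transitions would need explicit handling of log(0).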
Automatic classification of animal vocalizations
NASA Astrophysics Data System (ADS)
Clemins, Patrick J.
2005-11-01
Bioacoustics, the study of animal vocalizations, has begun to use increasingly sophisticated analysis techniques in recent years. Some common tasks in bioacoustics are repertoire determination, call detection, individual identification, stress detection, and behavior correlation. Each research study, however, uses a wide variety of different measured variables, called features, and classification systems to accomplish these tasks. The well-established field of human speech processing has developed a number of different techniques to perform many of the aforementioned bioacoustics tasks. Mel-frequency cepstral coefficients (MFCCs) and perceptual linear prediction (PLP) coefficients are two popular feature sets. The hidden Markov model (HMM), a statistical model similar to a finite automaton, is the most commonly used supervised classification model and is capable of modeling both temporal and spectral variations. This research designs a framework that applies models from human speech processing for bioacoustic analysis tasks. The development of the generalized perceptual linear prediction (gPLP) feature extraction model is one of the more important novel contributions of the framework. Perceptual information from the species under study can be incorporated into the gPLP feature extraction model to represent the vocalizations as the animals might perceive them. By including this perceptual information and modifying parameters of the HMM classification system, this framework can be applied to a wide range of species. The effectiveness of the framework is shown by analyzing African elephant and beluga whale vocalizations. The features extracted from the African elephant data are used as input to a supervised classification system and compared to results from traditional statistical tests. The gPLP features extracted from the beluga whale data are used in an unsupervised classification system and the results are compared to labels assigned by experts.
The development of a framework from which to build animal vocalization classifiers will provide bioacoustics researchers with a consistent platform to analyze and classify vocalizations. A common framework will also allow studies to compare results across species and institutions. In addition, the use of automated classification techniques can speed analysis and uncover behavioral correlations not readily apparent using traditional techniques.
NASA Astrophysics Data System (ADS)
Wan, Weibing; Yuan, Lingfeng; Zhao, Qunfei; Fang, Tao
2018-01-01
Saliency detection has been applied to the target acquisition case. This paper proposes a two-dimensional hidden Markov model (2D-HMM) that exploits the hidden semantic information of an image to detect its salient regions. A spatial pyramid histogram of oriented gradient descriptors is used to extract features. After encoding the image by a learned dictionary, the 2D-Viterbi algorithm is applied to infer the saliency map. This model can predict fixation of the targets and further creates robust and effective depictions of the targets' change in posture and viewpoint. To validate the model with a human visual search mechanism, two eyetrack experiments are employed to train our model directly from eye movement data. The results show that our model achieves better performance than visual attention. Moreover, it indicates the plausibility of utilizing visual track data to identify targets.
Yin, Xiang; Long, Chang; Li, Junhao; Zhu, Hua; Chen, Lin; Guan, Jianguo; Li, Xun
2015-10-19
Microwave absorbers have important applications in various areas including stealth, camouflage, and antenna. Here, we have designed an ultra-broadband light absorber by integrating two different-sized tapered hyperbolic metamaterial (HMM) waveguides, each of which has wide but different absorption bands due to broadband slow-light response, into a unit cell. Both the numerical and experimental results demonstrate that in such a design strategy, the low absorption bands between high absorption bands with a single-sized tapered HMM waveguide array can be effectively eliminated, resulting in a largely expanded absorption bandwidth ranging from 2.3 to 40 GHz. The presented ultra-broadband light absorber is also insensitive to polarization and robust against incident angle. Our results offer a further step in developing practical artificial electromagnetic absorbers, which will impact a broad range of applications at microwave frequencies.
Arakawa, Toshiya; Tanave, Akira; Ikeuchi, Shiho; Takahashi, Aki; Kakihara, Satoshi; Kimura, Shingo; Sugimoto, Hiroki; Asada, Nobuhiko; Shiroishi, Toshihiko; Tomihara, Kazuya; Tsuchiya, Takashi; Koide, Tsuyoshi
2014-08-30
Owing to their complex nature, social interaction tests normally require the observation of video data by a human researcher, and thus are difficult to use in large-scale studies. We previously established a statistical method, a hidden Markov model (HMM), which enables the differentiation of two social states ("interaction" and "indifference"), and three social states ("sniffing", "following", and "indifference"), automatically in silico. Here, we developed freeware called DuoMouse for the rapid evaluation of social interaction behavior. This software incorporates five steps: (1) settings, (2) video recording, (3) tracking from the video data, (4) HMM analysis, and (5) visualization of the results. Using DuoMouse, we mapped a genetic locus related to social interaction. We previously reported that a consomic strain, B6-Chr6C(MSM), with its chromosome 6 substituted for one from MSM/Ms, showed more social interaction than C57BL/6 (B6). We made four subconsomic strains, C3, C5, C6, and C7, each of which has a shorter segment of chromosome 6 derived from B6-Chr6C, and conducted social interaction tests on these strains. DuoMouse indicated that C6, but not C3, C5, and C7, showed higher interaction, sniffing, and following than B6, specifically in males. The data obtained by human observation showed high concordance to those from DuoMouse. The results indicated that the MSM-derived chromosomal region present in C6 (but not in C3, C5, and C7) is associated with increased social behavior. This method to analyze social interaction will aid primary screening for difference in social behavior in mice. Copyright © 2014 Elsevier B.V. All rights reserved.
Borgy, Benjamin; Reboud, Xavier; Peyrard, Nathalie; Sabbadin, Régis; Gaba, Sabrina
2015-01-01
Predicting the population dynamics of annual plants is a challenge due to their hidden seed banks in the field. However, such predictions are highly valuable for determining management strategies, specifically in agricultural landscapes. In agroecosystems, most weed seeds survive during unfavourable seasons and persist for several years in the seed bank. This causes difficulties in making accurate predictions of weed population dynamics and life history traits (LHT). Consequently, it is very difficult to identify management strategies that limit both weed populations and species diversity. In this article, we present a method of assessing weed population dynamics from both standing plant time series data and an unknown seed bank. We use a Hidden Markov Model (HMM) to obtain estimates of over 3,080 botanical records for three major LHT: seed survival in the soil, plant establishment (including post-emergence mortality), and seed production of 18 common weed species. Maximum likelihood and Bayesian approaches were complementarily used to estimate LHT values. The results showed that the LHT provided by the HMM enabled fairly accurate estimates of weed populations in different crops. There was a positive correlation between estimated germination rates and an index of the specialisation to the crop type (IndVal). The relationships among the estimated LHTs, and those between the estimated LHTs and the ecological characteristics of weeds, provided insights into weed strategies. For example, a common strategy to cope with agricultural practices in several weeds was to produce fewer seeds and increase germination rates. This knowledge, especially of LHT for each type of crop, should provide valuable information for developing sustainable weed management strategies. PMID:26427023
Using Three-color Single-molecule FRET to Study the Correlation of Protein Interactions.
Götz, Markus; Wortmann, Philipp; Schmid, Sonja; Hugel, Thorsten
2018-01-30
Single-molecule Förster resonance energy transfer (smFRET) has become a widely used biophysical technique to study the dynamics of biomolecules. For many molecular machines in a cell, proteins have to act together with interaction partners in a functional cycle to fulfill their task. The extension of two-color to multi-color smFRET makes it possible to simultaneously probe more than one interaction or conformational change. This not only adds a new dimension to smFRET experiments but it also offers the unique possibility to directly study the sequence of events and to detect correlated interactions when using an immobilized sample and a total internal reflection fluorescence microscope (TIRFM). Therefore, multi-color smFRET is a versatile tool for studying biomolecular complexes in a quantitative manner and in previously unachievable detail. Here, we demonstrate how to overcome the special challenges of multi-color smFRET experiments on proteins. We present detailed protocols for obtaining the data and for extracting kinetic information. This includes trace selection criteria, state separation, and the recovery of state trajectories from the noisy data using a 3D ensemble Hidden Markov Model (HMM). Compared to other methods, the kinetic information is not recovered from dwell time histograms but directly from the HMM. The maximum likelihood framework allows us to critically evaluate the kinetic model and to provide meaningful uncertainties for the rates. By applying our method to the heat shock protein 90 (Hsp90), we are able to disentangle the nucleotide binding and the global conformational changes of the protein. This allows us to directly observe the cooperativity between the two nucleotide binding pockets of the Hsp90 dimer.
Edwards, Laura J; Moisés, Abú; Nzaramba, Mathias; Cassimo, Aboobacar; Silva, Laura; Mauricio, Joaquim; Wester, C William; Vermund, Sten H; Moon, Troy D
2015-03-12
Avante Zambézia is an initiative of a Non-Governmental Organization (NGO), Friends in Global Health, LLC (FGH) and the Vanderbilt Institute for Global Health (VIGH) to provide technical assistance to the Mozambican Ministry of Health (MoH) in rural Zambézia Province. Avante Zambézia developed a district level Health Management Mentorship (HMM) program to strengthen health systems in ten of Zambézia's 17 districts. Our objective was to preliminarily analyze changes in four domains of health system capacity after the HMM's first year: accounting, Human Resources (HRs), Monitoring and Evaluation (M&E), and transportation management. Quantitative metrics were developed in each domain. During district visits for weeklong, on-site mentoring, the health management mentoring teams documented each indicator as a success ratio percentage. We analyzed data using linear regressions of each indicator's mean success ratio across all districts submitting a report over time. Of the four domains, district performance in the accounting domain was the strongest and most sustained. Linear regressions of mean monthly compliance for HR objectives indicated improvement in three of six mean success ratios. The M&E capacity domain showed the least overall improvement. The one indicator analyzed for transportation management suggested progress. Our outcome evaluation demonstrates improvement in health system performance during an HMM initiative. Evaluating which elements of our mentoring program are succeeding in strengthening district level health systems is vital in preparing to transition fiscal and managerial responsibility to local authorities. © 2015 by Kerman University of Medical Sciences.
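The trend analysis described in this abstract (a linear regression of an indicator's mean success ratio over time) amounts to an ordinary least-squares line fit. The sketch below shows one way to compute it; the function name and the monthly values are made up for illustration and are not data from the study.

```python
def fit_line(x, y):
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Hypothetical mean success ratios (%) for one indicator over six months.
months = [1, 2, 3, 4, 5, 6]
mean_success = [50.0, 55.0, 60.0, 65.0, 70.0, 75.0]
slope, intercept = fit_line(months, mean_success)
```

A positive slope would indicate that the indicator improved on average over the reporting period; whether the trend is statistically meaningful requires a significance test that this sketch omits.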
DOE Office of Scientific and Technical Information (OSTI.GOV)
Haack, Jeffrey; Shohet, Gil
2016-12-02
The software implements a heterogeneous multiscale method (HMM), which involves solving a classical molecular dynamics (MD) problem and then computing the entropy production in order to obtain the relaxation times towards equilibrium for use in a Bhatnagar-Gross-Krook (BGK) solver.
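For orientation, the BGK model replaces the full collision operator with relaxation toward a local equilibrium distribution. In a standard textbook form (the notation is assumed here and is not taken from the software's documentation), the relaxation time \tau is the quantity that the MD step is described as supplying:

```latex
\[
  \frac{\partial f}{\partial t} + \mathbf{v} \cdot \nabla_{\mathbf{x}} f
  = \frac{1}{\tau} \left( M[f] - f \right)
\]
```

Here f(\mathbf{x}, \mathbf{v}, t) is the particle distribution function and M[f] is the local Maxwellian with the same density, momentum, and energy as f.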
Colonoscopy video quality assessment using hidden Markov random fields
NASA Astrophysics Data System (ADS)
Park, Sun Young; Sargent, Dusty; Spofford, Inbar; Vosburgh, Kirby
2011-03-01
With colonoscopy becoming a common procedure for individuals aged 50 or more who are at risk of developing colorectal cancer (CRC), colon video data is being accumulated at an ever increasing rate. However, the clinically valuable information contained in these videos is not being maximally exploited to improve patient care and accelerate the development of new screening methods. One of the well-known difficulties in colonoscopy video analysis is the abundance of frames with no diagnostic information. Approximately 40% - 50% of the frames in a colonoscopy video are contaminated by noise, acquisition errors, glare, blur, and uneven illumination. Therefore, filtering out low quality frames containing no diagnostic information can significantly improve the efficiency of colonoscopy video analysis. To address this challenge, we present a quality assessment algorithm to detect and remove low quality, uninformative frames. The goal of our algorithm is to discard low quality frames while retaining all diagnostically relevant information. Our algorithm is based on a hidden Markov model (HMM) in combination with two measures of data quality to filter out uninformative frames. Furthermore, we present a two-level framework based on an embedded hidden Markov model (EHMM) to incorporate the proposed quality assessment algorithm into a complete, automated diagnostic image analysis system for colonoscopy video.
Optimized Algorithms for Prediction Within Robotic Tele-Operative Interfaces
NASA Technical Reports Server (NTRS)
Martin, Rodney A.; Wheeler, Kevin R.; Allan, Mark B.; SunSpiral, Vytas
2010-01-01
Robonaut, the humanoid robot developed at the Dexterous Robotics Laboratory at NASA Johnson Space Center, serves as a testbed for human-robot collaboration research and development efforts. One of the recent efforts investigates how adjustable autonomy can provide for a safe and more effective completion of manipulation-based tasks. A predictive algorithm developed in previous work was deployed as part of a software interface that can be used for long-distance tele-operation. In this work, Hidden Markov Models (HMMs) were trained on data recorded during tele-operation of basic tasks. In this paper we provide the details of this algorithm, how to improve upon the methods via optimization, and also present viable alternatives to the original algorithmic approach. We show that all of the algorithms presented can be optimized to meet the specifications of the metrics shown as being useful for measuring the performance of the predictive methods.
Molecular dynamics simulations in hybrid particle-continuum schemes: Pitfalls and caveats
NASA Astrophysics Data System (ADS)
Stalter, S.; Yelash, L.; Emamy, N.; Statt, A.; Hanke, M.; Lukáčová-Medvid'ová, M.; Virnau, P.
2018-03-01
Heterogeneous multiscale methods (HMM) combine molecular accuracy of particle-based simulations with the computational efficiency of continuum descriptions to model flow in soft matter liquids. In these schemes, molecular simulations typically pose a computational bottleneck, which we investigate in detail in this study. We find that it is preferable to simulate many small systems as opposed to a few large systems, and that a choice of a simple isokinetic thermostat is typically sufficient while thermostats such as Lowe-Andersen allow for simulations at elevated viscosity. We discuss suitable choices for time steps and finite-size effects which arise in the limit of very small simulation boxes. We also argue that if colloidal systems are considered as opposed to atomistic systems, the gap between microscopic and macroscopic simulations regarding time and length scales is significantly smaller. We propose a novel reduced-order technique for the coupling to the macroscopic solver, which allows us to approximate a non-linear stress-strain relation efficiently and thus further reduce computational effort of microscopic simulations.
Optimal satisfaction degree in energy harvesting cognitive radio networks
NASA Astrophysics Data System (ADS)
Li, Zan; Liu, Bo-Yang; Si, Jiang-Bo; Zhou, Fu-Hui
2015-12-01
A cognitive radio (CR) network with energy harvesting (EH) is considered to improve both spectrum efficiency and energy efficiency. A hidden Markov model (HMM) is used to characterize the imperfect spectrum sensing process. In order to maximize the whole satisfaction degree (WSD) of the cognitive radio network, a tradeoff between the average throughput of the secondary user (SU) and the interference to the primary user (PU) is analyzed. We formulate the satisfaction degree optimization problem as a mixed integer nonlinear programming (MINLP) problem. The satisfaction degree optimization problem is solved by using a differential evolution (DE) algorithm. The proposed optimization problem allows the network to adaptively achieve the optimal solution based on its required quality of service (QoS). Numerical results are given to verify our analysis. Project supported by the National Natural Science Foundation of China (Grant No. 61301179), the Doctorial Programs Foundation of the Ministry of Education of China (Grant No. 20110203110011), and the 111 Project (Grant No. B08038).
A Context-Recognition-Aided PDR Localization Method Based on the Hidden Markov Model
Lu, Yi; Wei, Dongyan; Lai, Qifeng; Li, Wen; Yuan, Hong
2016-01-01
Indoor positioning has recently become an important field of interest because global navigation satellite systems (GNSS) are usually unavailable in indoor environments. Pedestrian dead reckoning (PDR) is a promising localization technique for indoor environments since it can be implemented on widely used smartphones equipped with low cost inertial sensors. However, the PDR localization severely suffers from the accumulation of positioning errors, and other external calibration sources should be used. In this paper, a context-recognition-aided PDR localization model is proposed to calibrate PDR. The context is detected by employing particular human actions or characteristic objects and it is matched to the context pre-stored offline in the database to get the pedestrian’s location. The Hidden Markov Model (HMM) and Recursive Viterbi Algorithm are used to do the matching, which reduces the time complexity and saves the storage. In addition, the authors design the turn detection algorithm and take the context of corner as an example to illustrate and verify the proposed model. The experimental results show that the proposed localization method can fix the pedestrian’s starting point quickly and improves the positioning accuracy of PDR by 40.56% at most with perfect stability and robustness at the same time. PMID:27916922
Muxstep: an open-source C++ multiplex HMM library for making inferences on multiple data types.
Veličković, Petar; Liò, Pietro
2016-08-15
With the development of experimental methods and technology, we are able to reliably gain access to data in larger quantities, dimensions and types. This has great potential for the improvement of machine learning (as the learning algorithms have access to a larger space of information). However, conventional machine learning approaches used thus far on single-dimensional data inputs are unlikely to be expressive enough to accurately model the problem in higher dimensions; in fact, it should generally be most suitable to represent our underlying models as some form of complex networks with nontrivial topological features. As the first step in establishing such a trend, we present MUXSTEP, an open-source library utilising multiplex networks for the purposes of binary classification on multiple data types. The library is designed to be used out-of-the-box for developing models based on the multiplex network framework, as well as easily modifiable to suit problem modelling needs that may differ significantly from the default approach described. The full source code is available on GitHub: https://github.com/PetarV-/muxstep. Contact: petar.velickovic@cl.cam.ac.uk. Supplementary data are available at Bioinformatics online. © The Author 2016. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
A Hybrid Smartphone Indoor Positioning Solution for Mobile LBS
Liu, Jingbin; Chen, Ruizhi; Pei, Ling; Guinness, Robert; Kuusniemi, Heidi
2012-01-01
Smartphone positioning is an enabling technology used to create new business in the navigation and mobile location-based services (LBS) industries. This paper presents a smartphone indoor positioning engine named HIPE that can be easily integrated with mobile LBS. HIPE is a hybrid solution that fuses measurements of smartphone sensors with wireless signals. The smartphone sensors are used to measure the user’s motion dynamics information (MDI), which represent the spatial correlation of various locations. Two algorithms based on hidden Markov model (HMM) problems, the grid-based filter and the Viterbi algorithm, are used in this paper as the central processor for data fusion to resolve the position estimates, and these algorithms are applicable for different applications, e.g., real-time navigation and location tracking, respectively. HIPE is more widely applicable for various motion scenarios than solutions proposed in previous studies because it uses no deterministic motion models, which have been commonly used in previous works. The experimental results showed that HIPE can provide adequate positioning accuracy and robustness for different scenarios of MDI combinations. HIPE is a cost-efficient solution, and it can work flexibly with different smartphone platforms, which may have different types of sensors available for the measurement of MDI data. The reliability of the positioning solution was found to increase with increasing precision of the MDI data. PMID:23235455
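The grid-based filter named in this abstract is, in its generic form, a discrete Bayes filter over candidate locations: predict with a motion model, then reweight by the measurement likelihood. The sketch below shows one such cycle; it is not HIPE's implementation, and the three-cell grid, transition matrix, and likelihood values are hypothetical.

```python
def grid_filter_step(belief, transition, likelihood):
    """One predict/update cycle of a discrete Bayes (grid-based) filter."""
    n = len(belief)
    # Predict: propagate the belief through the motion (transition) model.
    predicted = [sum(belief[i] * transition[i][j] for i in range(n))
                 for j in range(n)]
    # Update: weight each cell by the measurement likelihood, then renormalise.
    unnormalised = [predicted[j] * likelihood[j] for j in range(n)]
    total = sum(unnormalised)
    if total == 0:
        raise ValueError("measurement is impossible under the current belief")
    return [u / total for u in unnormalised]


# Hypothetical three-cell corridor with a uniform prior.
belief = [1 / 3, 1 / 3, 1 / 3]
transition = [[0.7, 0.3, 0.0],
              [0.2, 0.6, 0.2],
              [0.0, 0.3, 0.7]]
likelihood = [0.1, 0.7, 0.2]  # assumed P(wireless measurement | cell)
posterior = grid_filter_step(belief, transition, likelihood)
```

Running this step once per measurement yields a filtered position estimate in real time, whereas the Viterbi algorithm recovers the single best trajectory after the fact, which matches the abstract's split between real-time navigation and location tracking.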
Reduced kernel recursive least squares algorithm for aero-engine degradation prediction
NASA Astrophysics Data System (ADS)
Zhou, Haowen; Huang, Jinquan; Lu, Feng
2017-10-01
Kernel adaptive filters (KAFs) generate a radial basis function (RBF) network that grows linearly with the number of training samples, thereby lacking sparseness. To deal with this drawback, traditional sparsification techniques select a subset of original training data based on a certain criterion to train the network and discard the redundant data directly. Although these methods curb the growth of the network effectively, it should be noted that information conveyed by these redundant samples is omitted, which may lead to accuracy degradation. In this paper, we present a novel online sparsification method which requires much less training time without sacrificing accuracy. Specifically, a reduced kernel recursive least squares (RKRLS) algorithm is developed based on the reduced technique and the linear independency. Unlike conventional methods, our novel methodology employs these redundant data to update the coefficients of the existing network. Due to the effective utilization of the redundant data, the novel algorithm achieves better accuracy, although the network size is significantly reduced. Experiments on time series prediction and online regression demonstrate that the RKRLS algorithm requires much less computation while maintaining satisfactory accuracy. Finally, we propose an enhanced multi-sensor prognostic model based on RKRLS and Hidden Markov Model (HMM) for remaining useful life (RUL) estimation. A case study in a turbofan degradation dataset is performed to evaluate the performance of the novel prognostic approach.
Goodswen, Stephen J.; Kennedy, Paul J.; Ellis, John T.
2012-01-01
Next generation sequencing technology is advancing genome sequencing at an unprecedented level. By unravelling the code within a pathogen’s genome, every possible protein (prior to post-translational modifications) can theoretically be discovered, irrespective of life cycle stages and environmental stimuli. Now more than ever there is a great need for high-throughput ab initio gene finding. Ab initio gene finders use statistical models to predict genes and their exon-intron structures from the genome sequence alone. This paper evaluates whether existing ab initio gene finders can effectively predict genes to deduce proteins that have presently missed capture by laboratory techniques. An aim here is to identify possible patterns of prediction inaccuracies for gene finders as a whole irrespective of the target pathogen. All currently available ab initio gene finders are considered in the evaluation but only four fulfil high-throughput capability: AUGUSTUS, GeneMark_hmm, GlimmerHMM, and SNAP. These gene finders require training data specific to a target pathogen and consequently the evaluation results are inextricably linked to the availability and quality of the data. The pathogen, Toxoplasma gondii, is used to illustrate the evaluation methods. The results support current opinion that predicted exons by ab initio gene finders are inaccurate in the absence of experimental evidence. However, the results reveal some patterns of inaccuracy that are common to all gene finders and these inaccuracies may provide a focus area for future gene finder developers. PMID:23226328
Artificial Intelligence Software for Assessing Postural Stability
NASA Technical Reports Server (NTRS)
Lieberman, Erez; Forth, Katharine; Paloski, William
2013-01-01
A software package reads and analyzes pressure distributions from sensors mounted under a person's feet. Pressure data from sensors mounted in shoes, or in a platform, can be used to describe postural stability (assessing competence to deficiency) and to determine the person's present activity (running, walking, squatting, falling). This package has three parts: a preprocessing algorithm for reading input from pressure sensors; a Hidden Markov Model (HMM), which is used to determine the person's present activity and level of sensing-motor competence; and a suite of graphical algorithms, which allows visual representation of the person's activity and vestibular function over time.
Sharma, Ronesh; Bayarjargal, Maitsetseg; Tsunoda, Tatsuhiko; Patil, Ashwini; Sharma, Alok
2018-01-21
Intrinsically Disordered Proteins (IDPs) lack stable tertiary structure and actively participate in performing various biological functions. These IDPs expose short binding regions called Molecular Recognition Features (MoRFs) that permit interaction with structured protein regions. Upon interaction they undergo a disorder-to-order transition, as a result of which their functionality arises. Predicting these MoRFs in disordered protein sequences is a challenging task. In this study, we present MoRFpred-plus, an improvement over our previously proposed predictor for identifying MoRFs in disordered protein sequences. Two independent propensity scores are computed by incorporating physicochemical properties and HMM profiles; these scores are combined to predict the final MoRF propensity score for a given residue. The first score reflects the characteristics of a query residue to be part of a MoRF region based on the composition and similarity of assumed MoRF and flank regions. The second score reflects the characteristics of a query residue to be part of a MoRF region based on the properties of the flanks around the given residue in the query protein sequence. The propensity scores are processed and common averaging is applied to generate the final prediction score of MoRFpred-plus. Performance of the proposed predictor is compared with available MoRF predictors, MoRFchibi, MoRFpred, and ANCHOR. Using previously collected training and test sets used to evaluate the mentioned predictors, the proposed predictor outperforms these predictors and generates a lower false positive rate. In addition, MoRFpred-plus is a downloadable predictor, which makes it useful as it can be used as input to other computational tools. https://github.com/roneshsharma/MoRFpred-plus/wiki/MoRFpred-plus:-Download. Copyright © 2017 Elsevier Ltd. All rights reserved.
Towards parameter-free classification of sound effects in movies
NASA Astrophysics Data System (ADS)
Chu, Selina; Narayanan, Shrikanth; Kuo, C.-C. J.
2005-08-01
The problem of identifying intense events via multimedia data mining in films is investigated in this work. Movies are mainly characterized by dialog, music, and sound effects. We begin our investigation with detecting interesting events through sound effects. Sound effects are neither speech nor music, but are closely associated with interesting events such as car chases and gun shots. In this work, we utilize low-level audio features including MFCC and energy to identify sound effects. It was shown in previous work that the Hidden Markov model (HMM) works well for speech/audio signals. However, this technique requires a careful choice in designing the model and choosing correct parameters. In this work, we introduce a framework that will avoid such necessity and works well with semi- and non-parametric learning algorithms.
Tamaki, Yukihiro; Harakuni, Tetsuya; Yamaguchi, Rui; Miyata, Takeshi; Arakawa, Takeshi
2016-03-04
The cholera toxin B subunit (CTB) is secreted in its pentameric form from Escherichia coli if its leader peptide is replaced with one of E. coli origin. However, the secretion of the pentamer is generally severely impaired when the molecule is mutated or fused to a foreign peptide. Therefore, we attempted to regenerate pentameric CTB from the inclusion bodies (IBs) of E. coli. Stepwise dialysis of the IBs solubilized in guanidine hydrochloride predominantly generated soluble high-molecular-mass (HMM) aggregates and only a small fraction of pentamer. Three methods to reassemble homogeneous pentameric molecules were evaluated: (i) using a pentameric coiled-coil fusion partner, expecting it to function as an assembly core; (ii) optimizing the protein concentration during refolding; and (iii) eliminating contaminants before refolding. Coiled-coil fusion had some effect, but substantial amounts of HMM aggregates were still generated. Varying the protein concentration from 0.05 mg/mL to 5 mg/mL had almost no effect. In contrast, eliminating the contaminants before refolding had a robust effect, and only the pentamer was regenerated, with no detectable HMM aggregates. Surprisingly, the protein concentration at refolding was up to 5 mg/mL when the contaminants were removed, with no adverse effects on refolding. The regenerated pentamer was indistinguishable in its biochemical and immunological characteristics from CTB secreted from E. coli or choleragenoid from Vibrio cholerae. This study provides a simple but very efficient strategy for pentamerizing CTB with a highly homogeneous molecular conformation, with which it may be feasible to engineer CTB derivatives and CTB fusion antigens. Copyright © 2016 Elsevier Ltd. All rights reserved.
Qdot Labeled Actin Super Resolution Motility Assay Measures Low Duty Cycle Muscle Myosin Step-Size
Wang, Yihua; Ajtai, Katalin; Burghardt, Thomas P.
2013-01-01
Myosin powers contraction in heart and skeletal muscle and is a leading target for mutations implicated in inheritable muscle diseases. During contraction, myosin transduces ATP free energy into the work of muscle shortening against resisting force. Muscle shortening involves relative sliding of myosin and actin filaments. Skeletal actin filaments were fluorescence labeled with a streptavidin conjugate quantum dot (Qdot) binding biotin-phalloidin on actin. Single Qdots were imaged in time with total internal reflection fluorescence microscopy, then spatially localized to 1-3 nanometers using a super-resolution algorithm as they translated with actin over a surface coated with skeletal heavy meromyosin (sHMM) or full length β-cardiac myosin (MYH7). Average Qdot-actin velocity matches measurements with rhodamine-phalloidin labeled actin. The sHMM Qdot-actin velocity histogram contains low velocity events corresponding to actin translation in quantized steps of ~5 nm. The MYH7 velocity histogram has quantized steps at 3 and 8 nm in addition to 5 nm, and larger compliance than sHMM depending on MYH7 surface concentration. Low duty cycle skeletal and cardiac myosin present challenges for a single molecule assay because actomyosin dissociates quickly and the freely moving element diffuses away. The in vitro motility assay has modestly more actomyosin interactions and methylcellulose inhibited diffusion to sustain the complex while preserving a subset of encounters that do not overlap in time on a single actin filament. A single myosin step is isolated in time and space, then characterized using super-resolution. The approach provides quick, quantitative, and inexpensive step-size measurement for low duty cycle muscle myosin. PMID:23383646
Dunbrack, Roland L.
2012-01-01
Motivation: Automating the assignment of existing domain and protein family classifications to new sets of sequences is an important task. Current methods often miss assignments because remote relationships fail to achieve statistical significance. Some assignments are not as long as the actual domain definitions because local alignment methods often cut alignments short. Long insertions in query sequences often erroneously result in two copies of the domain assigned to the query. Divergent repeat sequences in proteins are often missed. Results: We have developed a multilevel procedure to produce nearly complete assignments of protein families of an existing classification system to a large set of sequences. We apply this to the task of assigning Pfam domains to sequences and structures in the Protein Data Bank (PDB). We found that HHsearch alignments frequently scored more remotely related Pfams in Pfam clans higher than closely related Pfams, thus, leading to erroneous assignment at the Pfam family level. A greedy algorithm allowing for partial overlaps was, thus, applied first to sequence/HMM alignments, then HMM–HMM alignments and then structure alignments, taking care to join partial alignments split by large insertions into single-domain assignments. Additional assignment of repeat Pfams with weaker E-values was allowed after stronger assignments of the repeat HMM. Our database of assignments, presented in a database called PDBfam, contains Pfams for 99.4% of chains >50 residues. Availability: The Pfam assignment data in PDBfam are available at http://dunbrack2.fccc.edu/ProtCid/PDBfam, which can be searched by PDB codes and Pfam identifiers. They will be updated regularly. Contact: Roland.Dunbracks@fccc.edu PMID:22942020
The Roles of APOBEC3G Complexes in the Incorporation of APOBEC3G into HIV-1
Zhang, Quan; Liu, Zhenlong; Jia, Pingping; Zhou, Jinming; Guo, Fei; You, Xuefu; Yu, Liyan; Zhao, Lixun; Jiang, Jiandong; Cen, Shan
2013-01-01
Background The incorporation of human APOBEC3G (hA3G) into HIV is required for exerting its antiviral activity, therefore the mechanism underlying hA3G virion encapsidation has been investigated extensively. hA3G was shown to form low-molecular-mass (LMM) and high-molecular-mass (HMM) complexes. The function of different forms of hA3G in its viral incorporation remains unclear. Methodology/Principal Findings In this study, we investigated the subcellular distribution and lipid raft association of hA3G using subcellular fractionation, membrane floatation assay and pulse-chase radiolabeling experiments respectively, and studied the correlation between the ability of hA3G to form the different complex and its viral incorporation. Our work herein provides evidence that the majority of newly-synthesized hA3G interacts with membrane lipid raft domains to form Lipid raft-associated hA3G (RA hA3G), which serve as the precursor of mature HMM hA3G complex, while a minority of newly-synthesized hA3G remains in the cytoplasm as a soluble LMM form. The distribution of hA3G among the soluble LMM form, the RA LMM form and the mature forms of HMM is regulated by a mechanism involving the N-terminal part of the linker region and the C-terminus of hA3G. Mutagenesis studies reveal a direct correlation between the ability of hA3G to form the RA LMM complex and its viral incorporation. Conclusions/Significance Together these data suggest that the Lipid raft-associated LMM A3G complex functions as the cellular source of viral hA3G. PMID:24098356
Specific acoustic models for spontaneous and dictated style in Indonesian speech recognition
NASA Astrophysics Data System (ADS)
Vista, C. B.; Satriawan, C. H.; Lestari, D. P.; Widyantoro, D. H.
2018-03-01
The performance of an automatic speech recognition system is affected by differences in speech style between the data the model is originally trained upon and incoming speech to be recognized. In this paper, the usage of GMM-HMM acoustic models for specific speech styles is investigated. We develop two systems for the experiments; the first employs a speech style classifier to predict the speech style of incoming speech, either spontaneous or dictated, then decodes this speech using an acoustic model specifically trained for that speech style. The second system uses both acoustic models to recognise incoming speech and decides upon a final result by calculating a confidence score of decoding. Results show that training specific acoustic models for spontaneous and dictated speech styles confers a slight recognition advantage as compared to a baseline model trained on a mixture of spontaneous and dictated training data. In addition, the speech style classifier approach of the first system produced slightly more accurate results than the confidence scoring employed in the second system.
Tunable graphene-based hyperbolic metamaterial operating in SCLU telecom bands.
Janaszek, Bartosz; Tyszka-Zawadzka, Anna; Szczepański, Paweł
2016-10-17
The tunability of graphene-based hyperbolic metamaterial structure operating in SCLU telecom bands is investigated. For the first time it has been shown that for the proper design of a graphene/dielectric multilayer stack, the HMM Type I, Epsilon-Near-Zero and Type II regimes are possible by changing the biasing potential. Numerical results reveal the effect of structure parameters such as the thickness of the dielectric layer as well as a number of graphene sheets in a unit cell (i.e., dielectric/graphene bilayer) on the tunability range and shape of the dispersion characteristics (i.e., Type I/ENZ/Type II) in SCLU telecom bands. This kind of materials could offer a technological platform for novel devices having various applications in optical communications technology.
Daas, Martinus J A; Nijsse, Bart; van de Weijer, Antonius H P; Groenendaal, Bart W A J; Janssen, Fons; van der Oost, John; van Kranenburg, Richard
2018-06-27
Consolidated bioprocessing (CBP) is a cost-effective approach for the conversion of lignocellulosic biomass to biofuels and biochemicals. The enzymatic conversion of cellulose to glucose requires the synergistic action of three types of enzymes: exoglucanases, endoglucanases and β-glucosidases. The thermophilic, hemicellulolytic Geobacillus thermodenitrificans T12 was shown to harbor desired features for CBP, although it lacks the desired endo and exoglucanases required for the conversion of cellulose. Here, we report the expression of both endoglucanase and exoglucanase encoding genes by G. thermodenitrificans T12, in an initial attempt to express cellulolytic enzymes that complement the enzymatic machinery of this strain. A metagenome screen was performed on 73 G. thermodenitrificans strains using HMM profiles of all known CAZy families that contain endo and/or exoglucanases. Two putative endoglucanases, GE39 and GE40, belonging to glucoside hydrolase family 5 (GH5) were isolated and expressed in both E. coli and G. thermodenitrificans T12. Structure modeling of GE39 revealed a folding similar to a GH5 exo-1,3-β-glucanase from S. cerevisiae. However, we determined GE39 to be a β-xylosidase having pronounced activity towards p-nitrophenyl-β-D-xylopyranoside. Structure modelling of GE40 revealed its protein architecture to be similar to a GH5 endoglucanase from B. halodurans, and its endoglucanase activity was confirmed by enzymatic activity against 2-hydroxyethylcellulose, carboxymethylcellulose and barley β-glucan. Additionally, we introduced expression constructs into T12 containing Geobacillus sp. 70PC53 endoglucanase gene celA and both endoglucanase genes (M1 and M2) from Geobacillus sp. WSUCF1. Finally, we introduced expression constructs into T12 containing the C. thermocellum exoglucanases celK and celS genes and the endoglucanase celC gene. We identified a novel G. thermodenitrificans β-xylosidase (GE39) and a novel endoglucanase (GE40) using a metagenome screen based on multiple HMM profiles. We successfully expressed both genes in E. coli and functionally expressed the GE40 endoglucanase in G. thermodenitrificans T12. Additionally, the heterologous production of active CelK, a C. thermocellum derived exoglucanase, and CelA, a Geobacillus derived endoglucanase, was demonstrated with strain T12. The native hemicellulolytic activity and the heterologous cellulolytic activity described in this research provide a good basis for the further development of G. thermodenitrificans T12 as a host for consolidated bioprocessing.
Wearable flex sensor system for multiple badminton player grip identification
NASA Astrophysics Data System (ADS)
Jacob, Alvin; Zakaria, Wan Nurshazwani Wan; Tomari, Mohd Razali Bin Md; Sek, Tee Kian; Suberi, Anis Azwani Muhd
2017-09-01
This paper focuses on the development of a wearable sensor system to identify the different types of badminton grip used by a player during training. Badminton movements and strokes are fast and dynamic, and most of the movements involved are difficult to identify with the naked eye. Moreover, high processing optometric motion capture systems are expensive and impose a computational burden. Therefore, this paper suggests the development of a sensorized glove using flex sensors to measure a badminton player's finger flexion angles. The proposed Hand Monitoring Module (HMM) is connected to a personal computer through Bluetooth to enable wireless data transmission. The usability and feasibility of the HMM to identify different grip types were examined through a series of experiments, where the system exhibited 70% detection ability for the five different grip types. The outcome plays a major role in training players to use the proper grips for a badminton stroke to achieve a more powerful and accurate stroke execution.
NASA Astrophysics Data System (ADS)
Stisen, S.; Højberg, A. L.; Troldborg, L.; Refsgaard, J. C.; Christensen, B. S. B.; Olsen, M.; Henriksen, H. J.
2012-11-01
Precipitation gauge catch correction is often given very little attention in hydrological modelling compared to model parameter calibration. This is critical because significant precipitation biases often make the calibration exercise pointless, especially when supposedly physically-based models are in play. This study addresses the general importance of appropriate precipitation catch correction through a detailed modelling exercise. An existing precipitation gauge catch correction method addressing solid and liquid precipitation is applied, both as national mean monthly correction factors based on a historic 30 yr record and as gridded daily correction factors based on local daily observations of wind speed and temperature. The two methods, named the historic mean monthly (HMM) and the time-space variable (TSV) correction, resulted in different winter precipitation rates for the period 1990-2010. The resulting precipitation datasets were evaluated through the comprehensive Danish National Water Resources model (DK-Model), revealing major differences in both model performance and optimised model parameter sets. Simulated stream discharge is improved significantly when introducing the TSV correction, whereas the simulated hydraulic heads and multi-annual water balances performed similarly due to recalibration adjusting model parameters to compensate for input biases. The resulting optimised model parameters are much more physically plausible for the model based on the TSV correction of precipitation. A proxy-basin test where calibrated DK-Model parameters were transferred to another region without site specific calibration showed better performance for parameter values based on the TSV correction. Similarly, the performances of the TSV correction method were superior when considering two single years with a much dryer and a much wetter winter, respectively, as compared to the winters in the calibration period (differential split-sample tests). 
We conclude that TSV precipitation correction should be carried out for studies requiring a sound dynamic description of hydrological processes, and it is of particular importance when using hydrological models to make predictions for future climates when the snow/rain composition will differ from the past climate. This conclusion is expected to be applicable for mid to high latitudes, especially in coastal climates where winter precipitation types (solid/liquid) fluctuate significantly, causing climatological mean correction factors to be inadequate.
Unsupervised automated high throughput phenotyping of RNAi time-lapse movies.
Failmezger, Henrik; Fröhlich, Holger; Tresch, Achim
2013-10-04
Gene perturbation experiments in combination with fluorescence time-lapse cell imaging are a powerful tool in reverse genetics. High content applications require tools for the automated processing of the large amounts of data. These tools include in general several image processing steps, the extraction of morphological descriptors, and the grouping of cells into phenotype classes according to their descriptors. This phenotyping can be applied in a supervised or an unsupervised manner. Unsupervised methods are suitable for the discovery of formerly unknown phenotypes, which are expected to occur in high-throughput RNAi time-lapse screens. We developed an unsupervised phenotyping approach based on Hidden Markov Models (HMMs) with multivariate Gaussian emissions for the detection of knockdown-specific phenotypes in RNAi time-lapse movies. The automated detection of abnormal cell morphologies allows us to assign a phenotypic fingerprint to each gene knockdown. By applying our method to the Mitocheck database, we show that a phenotypic fingerprint is indicative of a gene's function. Our fully unsupervised HMM-based phenotyping is able to automatically identify cell morphologies that are specific for a certain knockdown. Beyond the identification of genes whose knockdown affects cell morphology, phenotypic fingerprints can be used to find modules of functionally related genes.
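For readers unfamiliar with HMMs with multivariate Gaussian emissions, the core quantity such models evaluate is the likelihood of a descriptor sequence, computed with the forward algorithm. The sketch below is a generic textbook version written for illustration only; it is not the authors' phenotyping pipeline, and all names (log_gauss, forward_loglik, the parameter layout) are assumptions.

```python
import numpy as np


def log_gauss(x, mean, cov):
    """Log density of a multivariate normal N(mean, cov) at point x."""
    x, mean, cov = np.asarray(x, float), np.asarray(mean, float), np.asarray(cov, float)
    diff = x - mean
    _, logdet = np.linalg.slogdet(cov)
    maha = diff @ np.linalg.solve(cov, diff)  # squared Mahalanobis distance
    return -0.5 * (len(mean) * np.log(2 * np.pi) + logdet + maha)


def forward_loglik(X, pi, A, means, covs):
    """Log-likelihood of the observation sequence X under a Gaussian-emission HMM.

    X: (T, d) observations; pi: (K,) initial probabilities;
    A: (K, K) transitions with A[i, j] = P(state j at t | state i at t-1).
    """
    X, pi, A = np.asarray(X, float), np.asarray(pi, float), np.asarray(A, float)
    K = len(pi)
    log_b = np.array([[log_gauss(x, means[k], covs[k]) for k in range(K)] for x in X])
    log_alpha = np.log(pi) + log_b[0]
    for t in range(1, len(X)):
        m = log_alpha.max()  # shift by the maximum for numerical stability
        log_alpha = m + np.log(np.exp(log_alpha - m) @ A) + log_b[t]
    m = log_alpha.max()
    return float(m + np.log(np.exp(log_alpha - m).sum()))
```

In an unsupervised setting the parameters would be estimated (for example by expectation-maximization) rather than supplied by hand; the function above only scores a sequence under given parameters.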
Statistical Inference in Hidden Markov Models Using k-Segment Constraints
Titsias, Michalis K.; Holmes, Christopher C.; Yau, Christopher
2016-01-01
Hidden Markov models (HMMs) are one of the most widely used statistical methods for analyzing sequence data. However, the reporting of output from HMMs has largely been restricted to the presentation of the most-probable (MAP) hidden state sequence, found via the Viterbi algorithm, or the sequence of most probable marginals using the forward–backward algorithm. In this article, we expand the amount of information we could obtain from the posterior distribution of an HMM by introducing linear-time dynamic programming recursions that, conditional on a user-specified constraint in the number of segments, allow us to (i) find MAP sequences, (ii) compute posterior probabilities, and (iii) simulate sample paths. We collectively call these recursions k-segment algorithms and illustrate their utility using simulated and real examples. We also highlight the prospective and retrospective use of k-segment constraints for fitting HMMs or exploring existing model fits. Supplementary materials for this article are available online. PMID:27226674
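As a point of reference for the standard output the abstract mentions, the following is a minimal sketch of the ordinary (unconstrained) Viterbi algorithm for a discrete-emission HMM, plus a helper that counts the segments in the resulting path. It does not implement the k-segment recursions introduced in the article; the function names and the toy parameterisation are assumptions made for illustration.

```python
import numpy as np


def viterbi(log_pi, log_A, log_B, obs):
    """Most probable hidden state path (MAP sequence) in log space.

    log_pi: (K,) log initial probabilities
    log_A:  (K, K) log transitions, log_A[i, j] = log P(j | i)
    log_B:  (K, M) log emissions, log_B[k, o] = log P(o | k)
    obs:    sequence of integer observation symbols
    """
    K, T = log_pi.shape[0], len(obs)
    delta = np.empty((T, K))            # best log score of a path ending in each state
    back = np.zeros((T, K), dtype=int)  # argmax predecessor for backtracking
    delta[0] = log_pi + log_B[:, obs[0]]
    for t in range(1, T):
        scores = delta[t - 1][:, None] + log_A  # scores[i, j]: come from i, move to j
        back[t] = np.argmax(scores, axis=0)
        delta[t] = scores[back[t], np.arange(K)] + log_B[:, obs[t]]
    path = [int(np.argmax(delta[-1]))]
    for t in range(T - 1, 0, -1):
        path.append(int(back[t, path[-1]]))
    return path[::-1]


def count_segments(path):
    """Number of maximal runs of identical states in a path."""
    return 1 + sum(a != b for a, b in zip(path, path[1:]))
```

A k-segment constraint, as described in the abstract, would restrict attention to paths for which count_segments(path) equals a user-specified value; the recursions that do this efficiently are given in the article and are not reproduced here.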
Surface-Controlled Properties of Myosin Studied by Electric Field Modulation.
van Zalinge, Harm; Ramsey, Laurence C; Aveyard, Jenny; Persson, Malin; Mansson, Alf; Nicolau, Dan V
2015-08-04
The efficiency of dynamic nanodevices using surface-immobilized protein molecular motors, which have been proposed for diagnostics, drug discovery, and biocomputation, critically depends on the ability to precisely control the motion of motor-propelled, individual cytoskeletal filaments transporting cargo to designated locations. The efficiency of these devices also critically depends on the proper function of the propelling motors, which is controlled by their interaction with the surfaces they are immobilized on. Here we use a microfluidic device to study how the motion of the motile elements, i.e., actin filaments propelled by heavy meromyosin (HMM) motor fragments immobilized on various surfaces, is altered by the application of electrical loads generated by an external electric field with strengths ranging from 0 to 8 kV m⁻¹. Because the motility is intimately linked to the function of surface-immobilized motors, the study also showed how the adsorption properties of HMM on various surfaces, such as nitrocellulose (NC), trimethylchlorosilane (TMCS), poly(methyl methacrylate) (PMMA), poly(tert-butyl methacrylate) (PtBMA), and poly(butyl methacrylate) (PBMA), can be characterized using an external field. It was found that at an electric field of 5 kV m⁻¹ the force exerted on the filaments is sufficient to overcome the friction-like resistive force of the inactive motors. It was also found that the effect of assisting electric fields on the relative increase in the sliding velocity was markedly higher for the TMCS-derivatized surface than for all other polymer-based surfaces. An explanation of this behavior, based on the molecular rigidity of the TMCS-on-glass surfaces as opposed to the flexibility of the polymer-based ones, is considered. To this end, the proposed microfluidic device could be used to select appropriate surfaces for future lab-on-a-chip applications as illustrated here for the almost ideal TMCS surface. 
Furthermore, the proposed methodology can be used to gain fundamental insights into the functioning of protein molecular motors, such as the force exerted by the motors under different operational conditions.
In vitro assays of molecular motors--impact of motor-surface interactions.
Mansson, Alf; Balaz, Martina; Albet-Torres, Nuria; Rosengren, K Johan
2008-05-01
In many types of biophysical studies of both single molecules and ensembles of molecular motors the motors are adsorbed to artificial surfaces. Some of the most important assay systems of this type (in vitro motility assays and related single molecule techniques) will be briefly described together with an account of breakthroughs in the understanding of actomyosin function that have resulted from their use. A poorly characterized, but potentially important, entity in these studies is the mechanism of motor adsorption to surfaces and the effects of motor surface interactions on experimental results. A better understanding of these phenomena is also important for the development of commercially viable nanotechnological applications powered by molecular motors. Here, we will consider several aspects of motor surface interactions with a particular focus on heavy meromyosin (HMM) from skeletal muscle. These aspects will be related to heavy meromyosin structure and relevant parts of the vast literature on protein-surface interactions for non-motor proteins. An overview of methods for studying motor-surface interactions will also be given. The information is used as a basis for further development of a model for HMM-surface interactions and is discussed in relation to experiments where nanopatterning has been employed for in vitro reconstruction of actomyosin order. The challenges and potentials of this approach in biophysical studies, compared to the use of self-assembly of biological components into supramolecular protein aggregates (e.g. myosin filaments) will be considered. Finally, this review will consider the implications for further developments of motor-powered lab-on-a-chip devices.
Fifty years of progress in speech and speaker recognition
NASA Astrophysics Data System (ADS)
Furui, Sadaoki
2004-10-01
Speech and speaker recognition technology has made very significant progress in the past 50 years. The progress can be summarized by the following changes: (1) from template matching to corpus-based statistical modeling, e.g., HMM and n-grams, (2) from filter bank/spectral resonance to cepstral features (Cepstrum + DCepstrum + DDCepstrum), (3) from heuristic time-normalization to DTW/DP matching, (4) from “distance”-based to likelihood-based methods, (5) from maximum likelihood to discriminative approach, e.g., MCE/GPD and MMI, (6) from isolated word to continuous speech recognition, (7) from small vocabulary to large vocabulary recognition, (8) from context-independent units to context-dependent units for recognition, (9) from clean speech to noisy/telephone speech recognition, (10) from single speaker to speaker-independent/adaptive recognition, (11) from monologue to dialogue/conversation recognition, (12) from read speech to spontaneous speech recognition, (13) from recognition to understanding, (14) from single-modality (audio signal only) to multi-modal (audio/visual) speech recognition, (15) from hardware recognizer to software recognizer, and (16) from no commercial application to many practical commercial applications. Most of these advances have taken place in both the fields of speech recognition and speaker recognition. The majority of technological changes have been directed toward the purpose of increasing robustness of recognition, including many other additional important techniques not noted above.
Complex Sequencing Rules of Birdsong Can be Explained by Simple Hidden Markov Processes
Katahira, Kentaro; Suzuki, Kenta; Okanoya, Kazuo; Okada, Masato
2011-01-01
Complex sequencing rules observed in birdsongs provide an opportunity to investigate the neural mechanism for generating complex sequential behaviors. To relate the findings from studying birdsongs to other sequential behaviors such as human speech and musical performance, it is crucial to characterize the statistical properties of the sequencing rules in birdsongs. However, the properties of the sequencing rules in birdsongs have not yet been fully addressed. In this study, we investigate the statistical properties of the complex birdsong of the Bengalese finch (Lonchura striata var. domestica). Based on manually annotated syllable labels, we first show that there are significant higher-order context dependencies in Bengalese finch songs, that is, which syllable appears next depends on more than one previous syllable. We then analyze acoustic features of the song and show that higher-order context dependencies can be explained using first-order hidden state transition dynamics with redundant hidden states. This model corresponds to hidden Markov models (HMMs), well-known statistical models with a wide range of applications in time series modeling. The song annotation produced by these models with first-order hidden state dynamics agreed well with manual annotation; the score was comparable to that of a second-order HMM and surpassed that of the zeroth-order model (the Gaussian mixture model; GMM), which does not use context information. Our results imply that the hierarchical representation with hidden state dynamics may underlie the neural implementation for generating complex behavioral sequences with higher-order dependencies. PMID:21915345
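The central idea above, that first-order dynamics over redundant hidden states can produce higher-order dependencies among observed syllables, can be illustrated with a toy example. The states a1 and a2 below both emit the syllable "a"; the state names, the deterministic transitions, and the helper functions are invented for this sketch and are not taken from the study.

```python
from collections import defaultdict

# First-order hidden dynamics: each state has exactly one successor.
NEXT_STATE = {"a1": "b", "b": "a2", "a2": "c", "c": "a1"}
# Two redundant hidden states (a1, a2) emit the same syllable "a".
EMIT = {"a1": "a", "b": "b", "a2": "a", "c": "c"}


def generate(n, state="a1"):
    """Emit n syllables by walking the first-order hidden chain."""
    out = []
    for _ in range(n):
        out.append(EMIT[state])
        state = NEXT_STATE[state]
    return out


def successors(seq, order):
    """Map each context of `order` preceding syllables to the set of syllables that follow it."""
    table = defaultdict(set)
    for i in range(order, len(seq)):
        table[tuple(seq[i - order:i])].add(seq[i])
    return dict(table)


song = generate(12)                 # a b a c a b a c a b a c
first_order = successors(song, 1)   # "a" is followed by both "b" and "c"
second_order = successors(song, 2)  # two syllables of context remove the ambiguity
```

At the level of observed syllables, what follows "a" cannot be predicted from "a" alone, yet the hidden chain that generated the song is strictly first-order.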
Adaptive partially hidden Markov models with application to bilevel image coding.
Forchhammer, S; Rasmussen, T S
1999-01-01
Partially hidden Markov models (PHMMs) have previously been introduced. The transition and emission/output probabilities from hidden states, as known from the HMMs, are conditioned on the past. This way, the HMM may be applied to images introducing the dependencies of the second dimension by conditioning. In this paper, the PHMM is extended to multiple sequences with a multiple token version and adaptive versions of PHMM coding are presented. The different versions of the PHMM are applied to lossless bilevel image coding. To reduce and optimize the model cost and size, the contexts are organized in trees and effective quantization of the parameters is introduced. The new coding methods achieve results that are better than the JBIG standard on selected test images, although at the cost of increased complexity. By the minimum description length principle, the methods presented for optimizing the code length may apply as guidance for training (P)HMMs for, e.g., segmentation or recognition purposes. Thereby, the PHMM models provide a new approach to image modeling.
AdOn HDP-HMM: An Adaptive Online Model for Segmentation and Classification of Sequential Data.
Bargi, Ava; Xu, Richard Yi Da; Piccardi, Massimo
2017-09-21
Recent years have witnessed an increasing need for the automated classification of sequential data, such as activities of daily living, social media interactions, financial series, and others. With the continuous flow of new data, it is critical to classify the observations on-the-fly and without being limited by a predetermined number of classes. In addition, a model should be able to update its parameters in response to a possible evolution in the distributions of the classes. This compelling problem, however, does not seem to have been adequately addressed in the literature, since most studies focus on offline classification over predefined class sets. In this paper, we present a principled solution for this problem based on an adaptive online system leveraging Markov switching models and hierarchical Dirichlet process priors. This adaptive online approach is capable of classifying the sequential data over an unlimited number of classes while meeting the memory and delay constraints typical of streaming contexts. In this paper, we introduce an adaptive ''learning rate'' that is responsible for balancing the extent to which the model retains its previous parameters or adapts to new observations. Experimental results on stationary and evolving synthetic data and two video data sets, TUM Assistive Kitchen and collated Weizmann, show a remarkable performance in terms of segmentation and classification, particularly for sequences from evolutionary distributions and/or those containing previously unseen classes.
R, Elakkiya; K, Selvamani
2017-09-22
Subunit segmentation and modelling in medical sign language is an important topic in linguistically oriented, vision-based Sign Language Recognition (SLR). Many earlier efforts defined functional subunits in terms of linguistic syllables, but syllable-based subunit extraction is not feasible with real-world computer vision techniques. In addition, present recognition systems are designed to detect signer-dependent actions under restricted laboratory conditions. This paper addresses these two issues in visual sign language recognition: (1) subunit extraction and (2) signer-independent action. Subunit extraction involves the sequential and parallel breakdown of sign gestures without any prior knowledge of syllables or of the number of subunits. A novel Bayesian Parallel Hidden Markov Model (BPaHMM) is introduced for subunit extraction; it combines the features of manual and non-manual parameters to yield better results in classification and recognition of signs. Signer-independent action aims at using a single web camera for different signer behaviour patterns and for cross-signer validation. Experimental results show that the proposed signer-independent subunit-level modelling for sign language classification and recognition yields improvement and variations when compared with other existing works.
Time Series Expression Analyses Using RNA-seq: A Statistical Approach
Oh, Sunghee; Song, Seongho; Grabowski, Gregory; Zhao, Hongyu; Noonan, James P.
2013-01-01
RNA-seq is becoming the de facto standard approach for transcriptome analysis with ever-reducing cost. It has considerable advantages over conventional technologies (microarrays) because it allows for direct identification and quantification of transcripts. Many time series RNA-seq datasets have been collected to study the dynamic regulations of transcripts. However, statistically rigorous and computationally efficient methods are needed to explore the time-dependent changes of gene expression in biological systems. These methods should explicitly account for the dependencies of expression patterns across time points. Here, we discuss several methods that can be applied to model timecourse RNA-seq data, including statistical evolutionary trajectory index (SETI), autoregressive time-lagged regression (AR(1)), and hidden Markov model (HMM) approaches. We use three real datasets and simulation studies to demonstrate the utility of these dynamic methods in temporal analysis. PMID:23586021
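As a rough illustration of the autoregressive time-lagged idea mentioned in the abstract above, the following sketch fits x[t] = intercept + slope * x[t-1] by ordinary least squares to one expression profile. The function name `fit_ar1` and the toy series are assumptions made here for illustration; the abstract does not give the authors' exact model, and real RNA-seq counts would normally be normalized and transformed first.

```python
def fit_ar1(series):
    """Least-squares fit of x[t] = intercept + slope * x[t-1].

    Hypothetical helper: needs at least three time points and a
    non-constant series (otherwise the variance below is zero).
    """
    prev, curr = series[:-1], series[1:]
    n = len(prev)
    mean_prev = sum(prev) / n
    mean_curr = sum(curr) / n
    # Variance of the lagged values and covariance with the current values.
    var_prev = sum((p - mean_prev) ** 2 for p in prev)
    cov = sum((p - mean_prev) * (c - mean_curr) for p, c in zip(prev, curr))
    slope = cov / var_prev
    intercept = mean_curr - slope * mean_prev
    return intercept, slope


# Toy expression profile that doubles at every time point.
intercept, slope = fit_ar1([1.0, 2.0, 4.0, 8.0])
```

A slope near zero would suggest that the value at one time point carries little information about the next, which is the dependence such methods are meant to capture.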
Time series segmentation: a new approach based on Genetic Algorithm and Hidden Markov Model
NASA Astrophysics Data System (ADS)
Toreti, A.; Kuglitsch, F. G.; Xoplaki, E.; Luterbacher, J.
2009-04-01
The subdivision of a time series into homogeneous segments has been performed using various methods applied to different disciplines. In climatology, for example, it is accompanied by the well-known homogenization problem and the detection of artificial change points. In this context, we present a new method (GAMM) based on Hidden Markov Model (HMM) and Genetic Algorithm (GA), applicable to series of independent observations (and easily adaptable to autoregressive processes). A left-to-right hidden Markov model, estimating the parameters and the best-state sequence, respectively, with the Baum-Welch and Viterbi algorithms, was applied. In order to avoid the well-known dependence of the Baum-Welch algorithm on the initial condition, a Genetic Algorithm was developed. This algorithm is characterized by mutation, elitism and a crossover procedure implemented with some restrictive rules. Moreover, the function to be minimized was derived following the approach of Kehagias (2004), i.e. it is the so-called complete log-likelihood. The number of states was determined applying a two-fold cross-validation procedure (Celeux and Durand, 2008). Because this last issue is complex and influences the whole analysis, a Multi Response Permutation Procedure (MRPP; Mielke et al., 1981) was inserted. It tests the model with K+1 states (where K is the number of states of the best model) if its likelihood is close to that of the K-state model. Finally, an evaluation of the performance of GAMM, applied as a break detection method in the field of climate time series homogenization, is shown. 1. G. Celeux and J.B. Durand, Comput Stat 2008. 2. A. Kehagias, Stoch Envir Res 2004. 3. P.W. Mielke, K.J. Berry, G.W. Brier, Monthly Wea Rev 1981.
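To make the segmentation step concrete, here is a minimal sketch of Viterbi decoding for a left-to-right HMM, where each state change in the decoded path marks a candidate break point. It is not the GAMM method itself: Gaussian emissions, hand-picked means and standard deviations, and a single `stay_prob` shared by all states (including the last one) are simplifying assumptions made here, whereas the abstract estimates parameters with Baum-Welch initialized by a Genetic Algorithm.

```python
import math


def gaussian_logpdf(x, mean, std):
    """Log density of a normal distribution at x."""
    return -0.5 * math.log(2 * math.pi * std * std) - (x - mean) ** 2 / (2 * std * std)


def viterbi_left_to_right(obs, means, stds, stay_prob=0.9):
    """Most likely state path when a state can only stay or move one step right."""
    n_states = len(means)
    log_stay = math.log(stay_prob)
    log_move = math.log(1.0 - stay_prob)
    neg_inf = float("-inf")
    # The chain is forced to start in state 0.
    score = [neg_inf] * n_states
    score[0] = gaussian_logpdf(obs[0], means[0], stds[0])
    back = []
    for x in obs[1:]:
        new_score, pointers = [], []
        for k in range(n_states):
            stay = score[k] + log_stay
            move = score[k - 1] + log_move if k > 0 else neg_inf
            best, prev = (move, k - 1) if move > stay else (stay, k)
            new_score.append(best + gaussian_logpdf(x, means[k], stds[k]))
            pointers.append(prev)
        score = new_score
        back.append(pointers)
    # Backtrack from the best final state.
    state = max(range(n_states), key=lambda k: score[k])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path


def change_points(path):
    """Indices at which the decoded state differs from the previous one."""
    return [t for t in range(1, len(path)) if path[t] != path[t - 1]]


series = [0.1, -0.2, 0.0, 0.1, 5.1, 4.9, 5.2, 5.0]
breaks = change_points(viterbi_left_to_right(series, [0.0, 5.0], [1.0, 1.0]))
```

On this toy series the decoded path switches state where the level jumps, so `breaks` holds the index of the first observation in the second segment.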
The effectiveness of position- and composition-specific gap costs for protein similarity searches.
Stojmirović, Aleksandar; Gertz, E Michael; Altschul, Stephen F; Yu, Yi-Kuo
2008-07-01
The flexibility in gap cost enjoyed by hidden Markov models (HMMs) is expected to afford them better retrieval accuracy than position-specific scoring matrices (PSSMs). We attempt to quantify the effect of more general gap parameters by separately examining the influence of position- and composition-specific gap scores, as well as by comparing the retrieval accuracy of the PSSMs constructed using an iterative procedure to that of the HMMs provided by Pfam and SUPERFAMILY, curated ensembles of multiple alignments. We found that position-specific gap penalties have an advantage over uniform gap costs. We did not explore optimizing distinct uniform gap costs for each query. For Pfam, PSSMs iteratively constructed from seeds based on HMM consensus sequences perform equivalently to HMMs that were adjusted to have constant gap transition probabilities, albeit with much greater variance. We observed no effect of composition-specific gap costs on retrieval performance. These results suggest possible improvements to the PSI-BLAST protein database search program. The scripts for performing evaluations are available upon request from the authors.
Tilney, L G
1975-02-01
When Limulus sperm are induced to undergo the acrosomal reaction, a process, 50 μm in length, is generated in a few seconds. This process rotates as it elongates; thus the acrosomal process literally screws through the jelly of the egg. Within the process is a bundle of filaments which before induction are coiled up inside the sperm. The filament bundle exists in three stable states in the sperm. One of the states can be isolated in pure form. It is composed of only three proteins whose molecular weights (mol wt) are 43,000, 55,000, and 95,000. The 43,000 mol wt protein is actin, based on its molecular weight, net charge, morphology, G-F transformation, and heavy meromyosin (HMM) binding. The 55,000 mol wt protein is in equimolar ratio to actin and is not tubulin, binds tenaciously to actin, and inhibits HMM binding. Evidence is presented that both the 55,000 mol wt protein and the 95,000 mol wt protein (possibly alpha-actinin) are also present in Limulus muscle. Presumably these proteins function in the sperm in holding the actin filaments together. Before the acrosomal reaction, the actin filaments are twisted over one another in a supercoil; when the reaction is completed, the filaments lie parallel to each other and form an actin paracrystal. This change in their packing appears to give rise to the motion of the acrosomal process and is under the control of the 55,000 mol wt protein and the 95,000 mol wt protein.
Milestone, C B; Stuthridge, T R; Fulthorpe, R R
2007-01-01
This paper forms part of a series of biological treatment colour behaviour studies. Surveys across a range of mills have observed colour increases in aerated stabilisation basins of 20-45%. Much of the colour formation has been demonstrated to occur in high molecular mass effluent organic constituents (HMM) present in bleach plant effluents. Removing material greater than 3000 Da essentially eliminated the colour-forming ability in both E and D stage wastewaters. We have also shown that pulp and paper sludges contain anaerobic bacteria capable of reducing humic-like materials. Colour formation was correlated to the anoxic conditions and the availability of readily biodegradable organic constituents during the wastewater treatment process. Overall, these studies suggest that colour formation in pulp and paper biological treatment systems may be caused by anaerobic bacteria using HMM material from the bleaching effluents as an electron acceptor for growth. This leads to the reduction of the material, which in turn leads to non-reversible internal changes, such as intra-molecular polymerisation or formation of chromophoric functional groups.
2007-09-01
B3b. Open water performance characteristics, stock propellers 5234 and 5235 ... B4. Principal dimensions of...
Nguyen, Nam-Ninh; Srihari, Sriganesh; Leong, Hon Wai; Chong, Ket-Fah
2015-10-01
Determining the entire complement of enzymes and their enzymatic functions is a fundamental step for reconstructing the metabolic network of cells. High quality enzyme annotation helps in enhancing metabolic networks reconstructed from the genome, especially by reducing gaps and increasing the enzyme coverage. Currently, structure-based and network-based approaches can only cover a limited number of enzyme families, and the accuracy of homology-based approaches can be further improved. The bottom-up homology-based approach improves the coverage by rebuilding Hidden Markov Model (HMM) profiles for all known enzymes. However, its clustering procedure relies firmly on the BLAST similarity score, ignoring protein domains/patterns, and is sensitive to changes in cut-off thresholds. Here, we use functional domain architecture to score the association between domain families and enzyme families (Domain-Enzyme Association Scoring, DEAS). The DEAS score is used to calculate the similarity between proteins, which is then used in the clustering procedure instead of the sequence similarity score. We improve the enzyme annotation protocol using a stringent classification procedure, and by choosing optimal threshold settings and checking for active sites. Our analysis shows that our stringent protocol EnzDP can cover up to 90% of enzyme families available in Swiss-Prot. It achieves a high accuracy of 94.5% based on five-fold cross-validation. EnzDP outperforms existing methods across several testing scenarios. Thus, EnzDP serves as a reliable automated tool for enzyme annotation and metabolic network reconstruction. Available at: www.comp.nus.edu.sg/~nguyennn/EnzDP.
Lalys, Florent; Riffaud, Laurent; Bouget, David; Jannin, Pierre
2012-01-01
The need for a better integration of the new generation of Computer-Assisted-Surgical (CAS) systems has been recently emphasized. One necessity to achieve this objective is to retrieve data from the Operating Room (OR) with different sensors, then to derive models from these data. Recently, the use of videos from cameras in the OR has demonstrated its efficiency. In this paper, we propose a framework to assist in the development of systems for the automatic recognition of high-level surgical tasks using microscope video analysis. We validated its use on cataract procedures. The idea is to combine state-of-the-art computer vision techniques with time series analysis. The first step of the framework consisted of the definition of several visual cues for extracting semantic information, thereby characterizing each frame of the video. Five different image-based classifiers were therefore implemented. A pupil segmentation step was also applied for dedicated visual cue detection. Time series classification algorithms were then applied to model time-varying data. Dynamic Time Warping (DTW) and Hidden Markov Models (HMM) were tested. This association combined the advantages of all methods for better understanding of the problem. The framework was finally validated through various studies. Six binary visual cues were chosen along with 12 phases to detect, obtaining accuracies of 94%. PMID:22203700
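The time-series step described above can be illustrated with a small Dynamic Time Warping sketch over sequences of binary visual-cue vectors. The Hamming distance as local cost and the toy two-cue frames are assumptions chosen here for illustration; the abstract does not state which local distance or step pattern the authors used.

```python
def hamming(frame_a, frame_b):
    """Number of binary cues on which two frames disagree."""
    return sum(x != y for x, y in zip(frame_a, frame_b))


def dtw_distance(seq_a, seq_b):
    """Classic DTW cost between two sequences of cue tuples."""
    inf = float("inf")
    n, m = len(seq_a), len(seq_b)
    # cost[i][j] = best cost of aligning the first i frames of seq_a
    # with the first j frames of seq_b.
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = hamming(seq_a[i - 1], seq_b[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j],      # frame of seq_a repeated against seq_b[j-1]
                cost[i][j - 1],      # frame of seq_b repeated against seq_a[i-1]
                cost[i - 1][j - 1],  # both sequences advance
            )
    return cost[n][m]


# The same cue pattern played at different speeds aligns at zero cost.
slow = [(0, 0), (1, 0), (1, 0), (1, 1)]
fast = [(0, 0), (1, 0), (1, 1)]
same = dtw_distance(slow, fast)
```

Because warping absorbs differences in duration, two recordings of the same phase sequence can match closely even when one surgeon works faster than another.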
Behaviour Recognition from Sensory Streams in Smart Environments
NASA Astrophysics Data System (ADS)
Chua, Sook-Ling; Marsland, Stephen; Guesgen, Hans W.
One application of smart homes is to take sensor activations from a variety of sensors around the house and use them to recognise the particular behaviours of the inhabitants. This can be useful for monitoring of the elderly or cognitively impaired, amongst other applications. Since the behaviours themselves are not directly observed, only the observations by sensors, it is common to build a probabilistic model of how behaviours arise from these observations, for example in the form of a Hidden Markov Model (HMM). In this paper we present a method of selecting which of a set of trained HMMs best matches the current observations, together with experiments showing that it can reliably detect and segment the sensor stream into behaviours. We demonstrate our algorithm on real sensor data obtained from the MIT PlaceLab. The results show a significant improvement in the recognition accuracy over other approaches.
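One common way to choose among several trained HMMs is to score the observed sensor sequence under each model with the forward algorithm and keep the most likely one. The sketch below does that for discrete-output HMMs; it is not necessarily the selection method of the paper, and the model names, probabilities and sensor codes are invented here for illustration.

```python
import math


def log_likelihood(obs, start, trans, emit):
    """Scaled forward algorithm: log P(obs | model) for a discrete-output HMM."""
    n = len(start)
    alpha = [start[i] * emit[i][obs[0]] for i in range(n)]
    loglik = 0.0
    for t in range(len(obs)):
        if t > 0:
            o = obs[t]
            alpha = [
                sum(alpha[i] * trans[i][j] for i in range(n)) * emit[j][o]
                for j in range(n)
            ]
        total = sum(alpha)
        if total == 0.0:
            return float("-inf")  # sequence impossible under this model
        # Rescale to avoid underflow and accumulate the log of the scale factor.
        loglik += math.log(total)
        alpha = [a / total for a in alpha]
    return loglik


def best_model(obs, models):
    """Name of the model with the highest log-likelihood for obs."""
    return max(models, key=lambda name: log_likelihood(obs, *models[name]))


# Hypothetical behaviours; sensor 0 = kitchen, sensor 1 = bedroom.
sticky = [[0.9, 0.1], [0.1, 0.9]]
models = {
    "cooking": ([0.5, 0.5], sticky, [[0.8, 0.2], [0.7, 0.3]]),
    "sleeping": ([0.5, 0.5], sticky, [[0.2, 0.8], [0.1, 0.9]]),
}
label = best_model([0, 0, 1, 0], models)
```

Applying the same scoring to a sliding window over the sensor stream would give a simple, if naive, way to segment it into behaviours.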
Multi-stream LSTM-HMM decoding and histogram equalization for noise robust keyword spotting.
Wöllmer, Martin; Marchi, Erik; Squartini, Stefano; Schuller, Björn
2011-09-01
Highly spontaneous, conversational, and potentially emotional and noisy speech is known to be a challenge for today's automatic speech recognition (ASR) systems, which highlights the need for advanced algorithms that improve speech features and models. Histogram Equalization is an efficient method to reduce the mismatch between clean and noisy conditions by normalizing all moments of the probability distribution of the feature vector components. In this article, we propose to combine histogram equalization and multi-condition training for robust keyword detection in noisy speech. To better cope with conversational speaking styles, we show how contextual information can be effectively exploited in a multi-stream ASR framework that dynamically models context-sensitive phoneme estimates generated by a long short-term memory neural network. The proposed techniques are evaluated on the SEMAINE database, a corpus containing emotionally colored conversations with a cognitive system for "Sensitive Artificial Listening".
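Histogram equalization of a feature component can be sketched as mapping each value through its empirical cumulative distribution onto a fixed reference distribution. The version below uses a standard normal reference and rank-based quantiles; both choices, and the function name, are assumptions made for illustration rather than details taken from the article.

```python
from statistics import NormalDist


def histogram_equalize(values):
    """Map one feature component onto a standard normal reference by rank."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    reference = NormalDist()  # mean 0, standard deviation 1
    out = [0.0] * n
    for rank, i in enumerate(order):
        # (rank + 0.5) / n keeps the quantile strictly inside (0, 1).
        out[i] = reference.inv_cdf((rank + 0.5) / n)
    return out


# Values on very different scales end up on the same reference scale.
equalized = histogram_equalize([10.0, 200.0, 30.0])
```

Since only ranks matter, a clean and a noisy version of the same utterance are pushed toward the same target distribution, which is the mismatch reduction the abstract refers to.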
Offline Signature Verification Using the Discrete Radon Transform and a Hidden Markov Model
NASA Astrophysics Data System (ADS)
Coetzer, J.; Herbst, B. M.; du Preez, J. A.
2004-12-01
We developed a system that automatically authenticates offline handwritten signatures using the discrete Radon transform (DRT) and a hidden Markov model (HMM). Given the robustness of our algorithm and the fact that only global features are considered, satisfactory results are obtained. Using a database of 924 signatures from 22 writers, our system achieves an equal error rate (EER) of 18% when only high-quality forgeries (skilled forgeries) are considered and an EER of 4.5% in the case of only casual forgeries. These signatures were originally captured offline. Using another database of 4800 signatures from 51 writers, our system achieves an EER of 12.2% when only skilled forgeries are considered. These signatures were originally captured online and then digitally converted into static signature images. These results compare well with the results of other algorithms that consider only global features.
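The equal error rate quoted above is the operating point at which the false acceptance rate (forgeries accepted) equals the false rejection rate (genuine signatures rejected). A minimal sketch of how it can be approximated from two lists of match scores follows; the scores are made up, and the convention that a higher score means "more likely genuine" is an assumption of this example.

```python
def equal_error_rate(genuine, forgery):
    """Approximate EER by scanning every observed score as a threshold."""
    best_gap, eer = None, None
    for threshold in sorted(set(genuine) | set(forgery)):
        # A signature is accepted when its score is >= threshold.
        frr = sum(s < threshold for s in genuine) / len(genuine)
        far = sum(s >= threshold for s in forgery) / len(forgery)
        gap = abs(far - frr)
        if best_gap is None or gap < best_gap:
            best_gap, eer = gap, (far + frr) / 2
    return eer


# One forgery scores above one genuine signature, so the two error rates meet at 25%.
eer = equal_error_rate([0.9, 0.8, 0.7, 0.4], [0.1, 0.2, 0.3, 0.6])
```

With few scores the two error curves may not cross exactly, which is why the sketch reports the midpoint at the threshold where they are closest.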
Popular song and lyrics synchronization and its application to music information retrieval
NASA Astrophysics Data System (ADS)
Chen, Kai; Gao, Sheng; Zhu, Yongwei; Sun, Qibin
2006-01-01
An automatic synchronization system for a popular song and its lyrics is presented in the paper. The system includes two main components: a) automatically detecting vocal/non-vocal segments in the audio signal and b) automatically aligning the acoustic signal of the song with its lyrics using speech recognition techniques and positioning the boundaries of the lyrics in their acoustic realization at multiple levels simultaneously (e.g. the word/syllable level and the phrase level). The GMM models and a set of HMM-based acoustic model units are carefully designed and trained for the detection and alignment. To eliminate the severe mismatch due to the diversity of musical signals and the sparse training data available, an unsupervised adaptation technique such as maximum likelihood linear regression (MLLR) is exploited for tailoring the models to the real environment, which improves the robustness of the synchronization system. To further reduce the effect of missed non-vocal music on alignment, a novel grammar net is built to direct the alignment. To our knowledge, this is the first automatic synchronization system based only on low-level acoustic features such as MFCC. We evaluate the system on a Chinese song dataset collected from 3 popular singers. We obtain 76.1% for the boundary accuracy at the syllable level (BAS) and 81.5% for the boundary accuracy at the phrase level (BAP) using fully automatic vocal/non-vocal detection and alignment. The synchronization system has many applications, such as multi-modality (audio and textual) content-based popular song browsing and retrieval. Through this study, we would like to open up the discussion of some challenging problems in developing a robust synchronization system for large-scale databases.
Detection of goal events in soccer videos
NASA Astrophysics Data System (ADS)
Kim, Hyoung-Gook; Roeber, Steffen; Samour, Amjad; Sikora, Thomas
2005-01-01
In this paper, we present an automatic extraction of goal events in soccer videos by using audio track features alone without relying on expensive-to-compute video track features. The extracted goal events can be used for high-level indexing and selective browsing of soccer videos. The detection of soccer video highlights using audio contents comprises three steps: 1) extraction of audio features from a video sequence, 2) event candidate detection of highlight events based on the information provided by the feature extraction methods and the Hidden Markov Model (HMM), 3) goal event selection to finally determine the video intervals to be included in the summary. For this purpose we compared the performance of the well-known Mel-scale Frequency Cepstral Coefficients (MFCC) feature extraction method vs. the MPEG-7 Audio Spectrum Projection (ASP) feature extraction method based on three different decomposition methods, namely Principal Component Analysis (PCA), Independent Component Analysis (ICA) and Non-Negative Matrix Factorization (NMF). To evaluate our system we collected five soccer game videos from various sources. In total we have seven hours of soccer games consisting of eight gigabytes of data. One of the five soccer games is used as the training data (e.g., announcers' excited speech, audience ambient speech noise, audience clapping, environmental sounds). Our goal event detection results are encouraging.
Human body contour data based activity recognition.
Myagmarbayar, Nergui; Yuki, Yoshida; Imamoglu, Nevrez; Gonzalez, Jose; Otake, Mihoko; Yu, Wenwei
2013-01-01
This research aims to develop autonomous bio-monitoring mobile robots that are capable of tracking and measuring patients' motions, recognizing the patients' behavior based on observation data, and calling for medical personnel in emergency situations in the home environment. The robots to be developed will bring about cost-effective, safe and easier at-home rehabilitation for most motor-function impaired patients (MIPs). In our previous research, a full framework was established towards this research goal. In this research, we aimed at improving human activity recognition by using contour data of the tracked human subject extracted from the depth images as the signal source, instead of the lower limb joint angle data used in the previous research, which are more likely to be affected by the motion of the robot and human subjects. Several geometric parameters, such as the ratio of height to weight of the tracked human subject and the distance (in pixels) between the centroid points of the upper and lower parts of the human body, were calculated from the contour data and used as the features for activity recognition. A Hidden Markov Model (HMM) is employed to classify different human activities from the features. Experimental results showed that human activity recognition could be achieved with a high correct rate.
Kann, Maricel G.; Sheetlin, Sergey L.; Park, Yonil; Bryant, Stephen H.; Spouge, John L.
2007-01-01
The sequencing of complete genomes has created a pressing need for automated annotation of gene function. Because domains are the basic units of protein function and evolution, a gene can be annotated from a domain database by aligning domains to the corresponding protein sequence. Ideally, complete domains are aligned to protein subsequences, in a ‘semi-global alignment’. Local alignment, which aligns pieces of domains to subsequences, is common in high-throughput annotation applications, however. It is a mature technique, with the heuristics and accurate E-values required for screening large databases and evaluating the screening results. Hidden Markov models (HMMs) provide an alternative theoretical framework for semi-global alignment, but their use is limited because they lack heuristic acceleration and accurate E-values. Our new tool, GLOBAL, overcomes some limitations of previous semi-global HMMs: it has accurate E-values and the possibility of the heuristic acceleration required for high-throughput applications. Moreover, according to a standard of truth based on protein structure, two semi-global HMM alignment tools (GLOBAL and HMMer) had comparable performance in identifying complete domains, but distinctly outperformed two tools based on local alignment. When searching for complete protein domains, therefore, GLOBAL avoids disadvantages commonly associated with HMMs, yet maintains their superior retrieval performance. PMID:17596268
Text Categorization for Multi-Page Documents: A Hybrid Naive Bayes HMM Approach.
ERIC Educational Resources Information Center
Frasconi, Paolo; Soda, Giovanni; Vullo, Alessandro
Text categorization is typically formulated as a concept learning problem where each instance is a single isolated document. This paper is interested in a more general formulation where documents are organized as page sequences, as naturally occurring in digital libraries of scanned books and magazines. The paper describes a method for classifying…
Image segmentation using hidden Markov Gauss mixture models.
Pyun, Kyungsuk; Lim, Johan; Won, Chee Sun; Gray, Robert M
2007-07-01
Image segmentation is an important tool in image processing and can serve as an efficient front end to sophisticated algorithms and thereby simplify subsequent processing. We develop a multiclass image segmentation method using hidden Markov Gauss mixture models (HMGMMs) and provide examples of segmentation of aerial images and textures. HMGMMs incorporate supervised learning, fitting the observation probability distribution given each class by a Gauss mixture estimated using vector quantization with a minimum discrimination information (MDI) distortion. We formulate the image segmentation problem using a maximum a posteriori criterion and find the hidden states that maximize the posterior density given the observation. We estimate both the hidden Markov parameter and hidden states using a stochastic expectation-maximization algorithm. Our results demonstrate that HMGMM provides better classification in terms of Bayes risk and spatial homogeneity of the classified objects than do several popular methods, including classification and regression trees, learning vector quantization, causal hidden Markov models (HMMs), and multiresolution HMMs. The computational load of HMGMM is similar to that of the causal HMM.
One-Shot Learning of Human Activity With an MAP Adapted GMM and Simplex-HMM.
Rodriguez, Mario; Orrite, Carlos; Medrano, Carlos; Makris, Dimitrios
2016-05-10
This paper presents a novel activity class representation using a single sequence for training. The contribution of this representation lies in the ability to train a one-shot learning recognition system, useful in new scenarios where capturing and labeling sequences is expensive or impractical. The method uses a universal background model of local descriptors obtained from source databases available on-line and adapts it to a new sequence in the target scenario through a maximum a posteriori adaptation. Each activity sample is encoded as a sequence of normalized bags of features and modeled by a new hidden Markov model formulation, where the expectation-maximization algorithm for training is modified to deal with observations consisting of vectors in a unit simplex. Extensive experiments in recognition have been performed using one-shot learning over the public datasets Weizmann, KTH, and IXMAS. These experiments demonstrate the discriminative properties of the representation and the validity of application in recognition systems, achieving state-of-the-art results.
The Gypsy Database (GyDB) of mobile genetic elements: release 2.0
Llorens, Carlos; Futami, Ricardo; Covelli, Laura; Domínguez-Escribá, Laura; Viu, Jose M.; Tamarit, Daniel; Aguilar-Rodríguez, Jose; Vicente-Ripolles, Miguel; Fuster, Gonzalo; Bernet, Guillermo P.; Maumus, Florian; Munoz-Pomer, Alfonso; Sempere, Jose M.; Latorre, Amparo; Moya, Andres
2011-01-01
This article introduces the second release of the Gypsy Database of Mobile Genetic Elements (GyDB 2.0): a research project devoted to the evolutionary dynamics of viruses and transposable elements based on their phylogenetic classification (per lineage and protein domain). The Gypsy Database (GyDB) is a long-term project that is continuously progressing, and that owing to the high molecular diversity of mobile elements requires to be completed in several stages. GyDB 2.0 has been powered with a wiki to allow other researchers participate in the project. The current database stage and scope are long terminal repeats (LTR) retroelements and relatives. GyDB 2.0 is an update based on the analysis of Ty3/Gypsy, Retroviridae, Ty1/Copia and Bel/Pao LTR retroelements and the Caulimoviridae pararetroviruses of plants. Among other features, in terms of the aforementioned topics, this update adds: (i) a variety of descriptions and reviews distributed in multiple web pages; (ii) protein-based phylogenies, where phylogenetic levels are assigned to distinct classified elements; (iii) a collection of multiple alignments, lineage-specific hidden Markov models and consensus sequences, called GyDB collection; (iv) updated RefSeq databases and BLAST and HMM servers to facilitate sequence characterization of new LTR retroelement and caulimovirus queries; and (v) a bibliographic server. GyDB 2.0 is available at http://gydb.org. PMID:21036865
Analysis of Acoustic Features in Speakers with Cognitive Disorders and Speech Impairments
NASA Astrophysics Data System (ADS)
Saz, Oscar; Simón, Javier; Rodríguez, W. Ricardo; Lleida, Eduardo; Vaquero, Carlos
2009-12-01
This work presents the results of the analysis of the acoustic features (formants and the three suprasegmental features: tone, intensity and duration) of the vowel production in a group of 14 young speakers suffering from different kinds of speech impairments due to physical and cognitive disorders. A corpus of unimpaired children's speech is used to determine the reference values for these features in speakers without any kind of speech impairment within the same domain as the impaired speakers, that is, 57 isolated words. The signal processing to extract the formant and pitch values is based on a Linear Prediction Coefficients (LPCs) analysis of the segments considered as vowels in a Hidden Markov Model (HMM) based Viterbi forced alignment. Intensity and duration are also based on the outcome of the automated segmentation. As the main conclusion of the work, it is shown that the intelligibility of the vowel production is lowered in impaired speakers even when the vowel is perceived as correct by human labelers. The decrease in intelligibility is due to a 30% increase in confusability in the formants map, a 50% reduction in the discriminative power in energy between stressed and unstressed vowels, and a 50% increase in the standard deviation of the length of the vowels. On the other hand, impaired speakers keep good control of tone in the production of stressed and unstressed vowels.
Mining protein loops using a structural alphabet and statistical exceptionality
2010-01-01
Background: Protein loops encompass 50% of protein residues in available three-dimensional structures. These regions are often involved in protein functions, e.g. binding site, catalytic pocket... However, the description of protein loops with conventional tools is not an easy task. Regular secondary structures, helices and strands, have been widely studied whereas loops, because they are highly variable in terms of sequence and structure, are difficult to analyze. Due to data sparsity, long loops have rarely been systematically studied. Results: We developed a simple and accurate method that allows the description and analysis of the structures of short and long loops using structural motifs without restriction on loop length. This method is based on the structural alphabet HMM-SA. HMM-SA allows the simplification of a three-dimensional protein structure into a one-dimensional string of states, where each state is a four-residue prototype fragment, called structural letter. The difficult task of the structural grouping of huge data sets is thus easily accomplished by handling structural letter strings as in conventional protein sequence analysis. We systematically extracted all seven-residue fragments in a bank of 93000 protein loops and grouped them according to the structural-letter sequence, named structural word. This approach permits a systematic analysis of loops of all sizes since we consider the structural motifs of seven residues rather than complete loops. We focused the analysis on highly recurrent words of loops (observed more than 30 times). Our study reveals that 73% of loop-lengths are covered by only 3310 highly recurrent structural words out of 28274 observed words. These structural words have low structural variability (mean RMSd of 0.85 Å). As expected, half of these motifs display a flanking-region preference but interestingly, two thirds are shared by short (less than 12 residues) and long loops. 
Moreover, half of recurrent motifs exhibit a significant level of amino-acid conservation with at least four significant positions and 87% of long loops contain at least one such word. We complement our analysis with the detection of statistically over-represented patterns of structural letters as in conventional DNA sequence analysis. About 30% (930) of structural words are over-represented, and cover about 40% of loop lengths. Interestingly, these words exhibit lower structural variability and higher sequential specificity, suggesting structural or functional constraints. Conclusions: We developed a method to systematically decompose and study protein loops using recurrent structural motifs. This method is based on the structural alphabet HMM-SA and not on structural alignment and geometrical parameters. We extracted meaningful structural motifs that are found in both short and long loops. To our knowledge, it is the first time that pattern mining helps to increase the signal-to-noise ratio in protein loops. This finding helps to better describe protein loops and might help reduce the complexity of long-loop analysis. Detailed results are available at http://www.mti.univ-paris-diderot.fr/publication/supplementary/2009/ACCLoop/. PMID:20132552
Mining protein loops using a structural alphabet and statistical exceptionality.
Regad, Leslie; Martin, Juliette; Nuel, Gregory; Camproux, Anne-Claude
2010-02-04
Protein loops encompass 50% of protein residues in available three-dimensional structures. These regions are often involved in protein functions, e.g. binding sites and catalytic pockets. However, the description of protein loops with conventional tools is a difficult task. Regular secondary structures, helices and strands, have been widely studied whereas loops, because they are highly variable in terms of sequence and structure, are difficult to analyze. Due to data sparsity, long loops have rarely been systematically studied. We developed a simple and accurate method that allows the description and analysis of the structures of short and long loops using structural motifs without restriction on loop length. This method is based on the structural alphabet HMM-SA. HMM-SA allows the simplification of a three-dimensional protein structure into a one-dimensional string of states, where each state is a four-residue prototype fragment, called a structural letter. The difficult task of the structural grouping of huge data sets is thus easily accomplished by handling structural letter strings as in conventional protein sequence analysis. We systematically extracted all seven-residue fragments in a bank of 93000 protein loops and grouped them according to the structural-letter sequence, named a structural word. This approach permits a systematic analysis of loops of all sizes since we consider the structural motifs of seven residues rather than complete loops. We focused the analysis on highly recurrent words of loops (observed more than 30 times). Our study reveals that 73% of loop-lengths are covered by only 3310 highly recurrent structural words (out of 28274 observed words). These structural words have low structural variability (mean RMSd of 0.85 Å). As expected, half of these motifs display a flanking-region preference but interestingly, two thirds are shared by short (less than 12 residues) and long loops.
Moreover, half of recurrent motifs exhibit a significant level of amino-acid conservation with at least four significant positions and 87% of long loops contain at least one such word. We complement our analysis with the detection of statistically over-represented patterns of structural letters as in conventional DNA sequence analysis. About 30% (930) of structural words are over-represented, and cover about 40% of loop lengths. Interestingly, these words exhibit lower structural variability and higher sequential specificity, suggesting structural or functional constraints. We developed a method to systematically decompose and study protein loops using recurrent structural motifs. This method is based on the structural alphabet HMM-SA and not on structural alignment and geometrical parameters. We extracted meaningful structural motifs that are found in both short and long loops. To our knowledge, this is the first time that pattern mining has been used to increase the signal-to-noise ratio in protein loops. This finding helps to better describe protein loops and might help to decrease the complexity of long-loop analysis. Detailed results are available at http://www.mti.univ-paris-diderot.fr/publication/supplementary/2009/ACCLoop/.
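The word-counting step described in this abstract can be illustrated with a minimal sketch. It assumes that each loop has already been encoded as a string of structural letters and that consecutive four-residue letters overlap by three residues, so that a seven-residue fragment corresponds to four letters. The function names, the toy loops and the threshold used below are hypothetical and are not taken from the paper.

```python
from collections import Counter

WORD_LEN = 4  # assumption: four overlapping four-residue letters span seven residues


def count_words(loops, word_len=WORD_LEN):
    """Count every overlapping structural word across the encoded loops."""
    counts = Counter()
    for letters in loops:
        for i in range(len(letters) - word_len + 1):
            counts[letters[i:i + word_len]] += 1
    return counts


def recurrent_words(counts, min_count):
    """Keep the words observed more than min_count times."""
    return {word for word, n in counts.items() if n > min_count}


def coverage(loops, words, word_len=WORD_LEN):
    """Fraction of letter positions covered by at least one word in `words`."""
    covered = total = 0
    for letters in loops:
        mask = [False] * len(letters)
        for i in range(len(letters) - word_len + 1):
            if letters[i:i + word_len] in words:
                for j in range(i, i + word_len):
                    mask[j] = True
        covered += sum(mask)
        total += len(letters)
    return covered / total if total else 0.0


# Toy data: three hypothetical loops written in an arbitrary letter alphabet.
toy_loops = ["ABCDE", "ABCDX", "ZZABCD"]
toy_counts = count_words(toy_loops)
toy_recurrent = recurrent_words(toy_counts, min_count=2)
toy_coverage = coverage(toy_loops, toy_recurrent)
```

With real data the threshold would be 30 occurrences, as stated in the abstract; the toy threshold of 2 is only there to make the small example non-trivial.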
Sourty, Marion; Thoraval, Laurent; Roquet, Daniel; Armspach, Jean-Paul; Foucher, Jack; Blanc, Frédéric
2016-01-01
Exploring time-varying connectivity networks in neurodegenerative disorders is a recent field of research in functional MRI. Dementia with Lewy bodies (DLB) represents 20% of the neurodegenerative forms of dementia. Fluctuations of cognition and vigilance are the key symptoms of DLB. To date, no dynamic functional connectivity (DFC) investigations of this disorder have been performed. In this paper, we refer to the concept of connectivity state as a piecewise stationary configuration of functional connectivity between brain networks. From this concept, we propose a new method for group-level as well as for subject-level studies to compare and characterize connectivity state changes between a set of resting-state networks (RSNs). Dynamic Bayesian networks, statistical and graph theory-based models, enable one to learn dependencies between interacting state-based processes. Product hidden Markov models (PHMM), an instance of dynamic Bayesian networks, are introduced here to capture both statistical and temporal aspects of DFC of a set of RSNs. This analysis was based on sliding-window cross-correlations between seven RSNs extracted from a group independent component analysis performed on 20 healthy elderly subjects and 16 patients with DLB. Statistical models of DFC differed in patients compared to healthy subjects for the occipito-parieto-frontal network, the medial occipital network and the right fronto-parietal network. In addition, pairwise comparisons of DFC of RSNs revealed a decrease of dependency between these two visual networks (occipito-parieto-frontal and medial occipital networks) and the right fronto-parietal control network. The analysis of DFC state changes thus pointed out networks related to the cognitive functions that are known to be impaired in DLB: visual processing as well as attentional and executive functions. 
Besides this context, product HMM applied to RSNs cross-correlations offers a promising new approach to investigate structural and temporal aspects of brain DFC.
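The sliding-window correlation that feeds the model can be sketched as follows. This is only an illustration, assuming zero-lag Pearson correlation between two network time courses; the window length, the step and the toy signals are hypothetical, and the abstract does not specify these choices.

```python
import math


def pearson(x, y):
    """Zero-lag Pearson correlation of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # a constant window carries no correlation information
    return sxy / math.sqrt(sxx * syy)


def sliding_correlation(x, y, window, step=1):
    """Correlation of x and y inside each window slid along the time axis."""
    return [
        pearson(x[start:start + window], y[start:start + window])
        for start in range(0, len(x) - window + 1, step)
    ]


# Toy time courses for two hypothetical networks.
net_a = [0, 1, 2, 3, 2, 1, 0]
net_b = [0, 1, 2, 3, 4, 5, 6]
dfc = sliding_correlation(net_a, net_b, window=4)
```

The resulting sequence of window-wise correlations is the kind of observation series that a state-based model such as a product HMM could then segment into connectivity states.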
HuMiChip: Development of a Functional Gene Array for the Study of Human Microbiomes
DOE Office of Scientific and Technical Information (OSTI.GOV)
Tu, Q.; Deng, Ye; Lin, Lu
Microbiomes play very important roles in terms of nutrition, health and disease by interacting with their hosts. Based on sequence data currently available in public domains, we have developed a functional gene array to monitor both organismal and functional gene profiles of normal microbiota in human and mouse hosts, and such an array is called human and mouse microbiota array, HMM-Chip. First, seed sequences were identified from KEGG databases, and used to construct a seed database (seedDB) containing 136 gene families in 19 metabolic pathways closely related to human and mouse microbiomes. Second, a mother database (motherDB) was constructed with 81 genomes of bacterial strains with 54 from gut and 27 from oral environments, and 16 metagenomes, and used for selection of genes and probe design. Gene prediction was performed by Glimmer3 for bacterial genomes, and by the Metagene program for metagenomes. In total, 228,240 and 801,599 genes were identified for bacterial genomes and metagenomes, respectively. Then the motherDB was searched against the seedDB using the HMMer program, and gene sequences in the motherDB that were highly homologous with seed sequences in the seedDB were used for probe design by the CommOligo software. Different degrees of specific probes, including gene-specific, inclusive and exclusive group-specific probes were selected. All candidate probes were checked against the motherDB and NCBI databases for specificity. Finally, 7,763 probes covering 91.2% (12,601 out of 13,814) HMMer confirmed sequences from 75 bacterial genomes and 16 metagenomes were selected. This developed HMM-Chip is able to detect the diversity and abundance of functional genes, the gene expression of microbial communities, and potentially, the interactions of microorganisms and their hosts.
Identification of the sequence motif of glycoside hydrolase 13 family members
Kumar, Vikash
2011-01-01
A bioinformatics analysis of the sequences of glycoside hydrolase (GH) 13 family enzymes, such as α-amylase, cyclodextrin glycosyltransferase (CGTase), branching enzyme and cyclomaltodextrinase, has been carried out using hidden Markov model (HMM) profiles in order to find the sequence motifs that govern the reaction specificities of these enzymes. This analysis suggests the existence of such sequence motifs, whose residues constitute the −1 to +3 catalytic subsites of the enzyme. Hence, by introducing mutations in the residues of these four subsites, one can change the reaction specificities of the enzymes. In general, the α-amylase sequence motif was observed to have lower sequence conservation than the other motifs of the GH13 family members. PMID:21544166
Recognition of surgical skills using hidden Markov models
NASA Astrophysics Data System (ADS)
Speidel, Stefanie; Zentek, Tom; Sudra, Gunther; Gehrig, Tobias; Müller-Stich, Beat Peter; Gutt, Carsten; Dillmann, Rüdiger
2009-02-01
Minimally invasive surgery is a highly complex medical discipline and can be regarded as a major breakthrough in surgical technique. A minimally invasive intervention requires enhanced motor skills to deal with difficulties like the complex hand-eye coordination and restricted mobility. To alleviate these constraints we propose to enhance the surgeon's capabilities by providing a context-aware assistance using augmented reality techniques. To recognize and analyze the current situation for context-aware assistance, we need intraoperative sensor data and a model of the intervention. Characteristics of a situation are the performed activity, the used instruments, the surgical objects and the anatomical structures. Important information about the surgical activity can be acquired by recognizing the surgical gesture performed. Surgical gestures in minimally invasive surgery like cutting, knot-tying or suturing are here referred to as surgical skills. We use the motion data from the endoscopic instruments to classify and analyze the performed skill and even use it for skill evaluation in a training scenario. The system uses Hidden Markov Models (HMM) to model and recognize a specific surgical skill like knot-tying or suturing with an average recognition rate of 92%.
Computational intelligence techniques for biological data mining: An overview
NASA Astrophysics Data System (ADS)
Faye, Ibrahima; Iqbal, Muhammad Javed; Said, Abas Md; Samir, Brahim Belhaouari
2014-10-01
Computational techniques have been successfully utilized for highly accurate analysis and modeling of multifaceted, raw biological data gathered from various genome sequencing projects. These techniques are proving much more effective at overcoming the limitations of traditional in-vitro experiments on the constantly increasing sequence data. However, the most critical problems that have caught the attention of researchers include, but are not limited to: accurate structure and function prediction of unknown proteins, protein subcellular localization prediction, finding protein-protein interactions, protein fold recognition, and analysis of microarray gene expression data. To solve these problems, various classification and clustering techniques using machine learning have been extensively used in the published literature. These techniques include neural network algorithms, genetic algorithms, fuzzy ARTMAP, K-Means, K-NN, SVM, rough set classifiers, decision trees and HMM-based algorithms. Major difficulties in applying the above algorithms include the limitations of previous feature encoding and selection methods in extracting the best features, increasing classification accuracy and decreasing the running time overheads of the learning algorithms. The application of this research would be potentially useful in drug design and in the diagnosis of some diseases. This paper presents a concise overview of the well-known protein classification techniques.
Fan, Qixiang; Qiang, Maoshan
2014-01-01
The concern for workers' safety in construction industry is reflected in many studies focusing on static safety risk identification and assessment. However, studies on real-time safety risk assessment aimed at reducing uncertainty and supporting quick response are rare. A method for real-time safety risk assessment (RTSRA) to implement a dynamic evaluation of worker safety states on construction site has been proposed in this paper. The method provides construction managers who are in charge of safety with more abundant information to reduce the uncertainty of the site. A quantitative calculation formula, integrating the influence of static and dynamic hazards and that of safety supervisors, is established to link the safety risk of workers with the locations of on-site assets. By employing the hidden Markov model (HMM), the RTSRA provides a mechanism for processing location data provided by the real-time location system (RTLS) and analyzing the probability distributions of different states in terms of false positives and negatives. Simulation analysis demonstrated the logic of the proposed method and how it works. Application case shows that the proposed RTSRA is both feasible and effective in managing construction project safety concerns. PMID:25114958
Jiang, Hanchen; Lin, Peng; Fan, Qixiang; Qiang, Maoshan
2014-01-01
The concern for workers' safety in construction industry is reflected in many studies focusing on static safety risk identification and assessment. However, studies on real-time safety risk assessment aimed at reducing uncertainty and supporting quick response are rare. A method for real-time safety risk assessment (RTSRA) to implement a dynamic evaluation of worker safety states on construction site has been proposed in this paper. The method provides construction managers who are in charge of safety with more abundant information to reduce the uncertainty of the site. A quantitative calculation formula, integrating the influence of static and dynamic hazards and that of safety supervisors, is established to link the safety risk of workers with the locations of on-site assets. By employing the hidden Markov model (HMM), the RTSRA provides a mechanism for processing location data provided by the real-time location system (RTLS) and analyzing the probability distributions of different states in terms of false positives and negatives. Simulation analysis demonstrated the logic of the proposed method and how it works. Application case shows that the proposed RTSRA is both feasible and effective in managing construction project safety concerns.
Learning cellular sorting pathways using protein interactions and sequence motifs.
Lin, Tien-Ho; Bar-Joseph, Ziv; Murphy, Robert F
2011-11-01
Proper subcellular localization is critical for proteins to perform their roles in cellular functions. Proteins are transported by different cellular sorting pathways, some of which take a protein through several intermediate locations until reaching its final destination. The pathway a protein is transported through is determined by carrier proteins that bind to specific sequence motifs. In this article, we present a new method that integrates protein interaction and sequence motif data to model how proteins are sorted through these sorting pathways. We use a hidden Markov model (HMM) to represent protein sorting pathways. The model is able to determine intermediate sorting states and to assign carrier proteins and motifs to the sorting pathways. In simulation studies, we show that the method can accurately recover an underlying sorting model. Using data for yeast, we show that our model leads to accurate prediction of subcellular localization. We also show that the pathways learned by our model recover many known sorting pathways and correctly assign proteins to the path they utilize. The learned model identified new pathways and their putative carriers and motifs and these may represent novel protein sorting mechanisms. Supplementary results and software implementation are available from http://murphylab.web.cmu.edu/software/2010_RECOMB_pathways/.
Feasibility Study for Hotel/Motel Career Program for Harper College. Volume XIX, No. 1.
ERIC Educational Resources Information Center
Lucas, John A.; And Others
In spring 1990, a study was conducted at William Rainey Harper College (WRHC) to determine the feasibility of adding a career program in Hotel/Motel Management (HMM) to the current Food Service Program. Surveys were sent to 53 hotels and motels in the WRHC service area to determine employment demands that would affect the hiring of graduates of…
Xu, Jingting; Hu, Hong; Dai, Yang
The identification of enhancers is a challenging task. Various types of epigenetic information including histone modification have been utilized in the construction of enhancer prediction models based on a diverse panel of machine learning schemes. However, DNA methylation profiles generated from whole genome bisulfite sequencing (WGBS) have not been fully explored for their potential in enhancer prediction despite the fact that low methylated regions (LMRs) have been implied to be distal active regulatory regions. In this work, we propose a prediction framework, LMethyR-SVM, using LMRs identified from cell-type-specific WGBS DNA methylation profiles and a weighted support vector machine learning framework. In LMethyR-SVM, the set of cell-type-specific LMRs is further divided into three sets: reliable positive, likely positive and likely negative, according to their resemblance to a small set of experimentally validated enhancers in the VISTA database based on an estimated non-parametric density distribution. Then, the prediction model is obtained by solving a weighted support vector machine. We demonstrate the performance of LMethyR-SVM by using the WGBS DNA methylation profiles derived from the human embryonic stem cell type (H1) and the fetal lung fibroblast cell type (IMR90). The predicted enhancers are highly conserved with a reasonable validation rate based on a set of commonly used positive markers including transcription factors, p300 binding and DNase-I hypersensitive sites. In addition, we show evidence that a large fraction of the LMethyR-SVM predicted enhancers are not predicted by ChromHMM in the H1 cell type and that they are more enriched for the FANTOM5 enhancers. Our work suggests that low methylated regions detected from the WGBS data are useful as complementary resources to histone modification marks in developing models for the prediction of cell-type-specific enhancers.
Inference of Functionally-Relevant N-acetyltransferase Residues Based on Statistical Correlations.
Neuwald, Andrew F; Altschul, Stephen F
2016-12-01
Over evolutionary time, members of a superfamily of homologous proteins sharing a common structural core diverge into subgroups filling various functional niches. At the sequence level, such divergence appears as correlations that arise from residue patterns distinct to each subgroup. Such a superfamily may be viewed as a population of sequences corresponding to a complex, high-dimensional probability distribution. Here we model this distribution as hierarchical interrelated hidden Markov models (hiHMMs), which describe these sequence correlations implicitly. By characterizing such correlations one may hope to obtain information regarding functionally-relevant properties that have thus far evaded detection. To do so, we infer a hiHMM distribution from sequence data using Bayes' theorem and Markov chain Monte Carlo (MCMC) sampling, which is widely recognized as the most effective approach for characterizing a complex, high dimensional distribution. Other routines then map correlated residue patterns to available structures with a view to hypothesis generation. When applied to N-acetyltransferases, this reveals sequence and structural features indicative of functionally important, yet generally unknown biochemical properties. Even for sets of proteins for which nothing is known beyond unannotated sequences and structures, this can lead to helpful insights. We describe, for example, a putative coenzyme-A-induced-fit substrate binding mechanism mediated by arginine residue switching between salt bridge and π-π stacking interactions. A suite of programs implementing this approach is available (psed.igs.umaryland.edu).
Rohlman, C E; Blanco, M R; Walter, N G
2016-01-01
The spliceosome is a biomolecular machine that, in all eukaryotes, accomplishes site-specific splicing of introns from precursor messenger RNAs (pre-mRNAs) with high fidelity. Operating at the nanometer scale, where inertia and friction have lost the dominant role they play in the macroscopic realm, the spliceosome is highly dynamic and assembles its active site around each pre-mRNA anew. To understand the structural dynamics underlying the molecular motors, clocks, and ratchets that achieve functional accuracy in the yeast spliceosome (a long-standing model system), we have developed single-molecule fluorescence resonance energy transfer (smFRET) approaches that report changes in intra- and intermolecular interactions in real time. Building on our work using hidden Markov models (HMMs) to extract kinetic and conformational state information from smFRET time trajectories, we recognized that HMM analysis of individual state transitions as independent stochastic events is insufficient for a biomolecular machine as complex as the spliceosome. In this chapter, we elaborate on the recently developed smFRET-based Single-Molecule Cluster Analysis (SiMCAn) that dissects the intricate conformational dynamics of a pre-mRNA through the splicing cycle in a model-free fashion. By leveraging hierarchical clustering techniques developed for Bioinformatics, SiMCAn efficiently analyzes large datasets to first identify common molecular behaviors. Through a second level of clustering based on the abundance of dynamic behaviors exhibited by defined functional intermediates that have been stalled by biochemical or genetic tools, SiMCAn then efficiently assigns pre-mRNA FRET states and transitions to specific splicing complexes, with the potential to find heretofore undescribed conformations. SiMCAn thus arises as a general tool to analyze dynamic cellular machines more broadly. © 2016 Elsevier Inc. All rights reserved.
Zou, Yi-Bo; Chen, Yi-Min; Gao, Ming-Ke; Liu, Quan; Jiang, Si-Yu; Lu, Jia-Hui; Huang, Chen; Li, Ze-Yu; Zhang, Dian-Hua
2017-08-01
Coronary heart disease preoperative diagnosis plays an important role in the treatment of vascular interventional surgery. Actually, most doctors are used to diagnosing the position of the vascular stenosis and then empirically estimating vascular stenosis by selective coronary angiography images instead of using mouse, keyboard and computer during preoperative diagnosis. The invasive diagnostic modality is short of intuitive and natural interaction and the results are not accurate enough. Aiming at above problems, the coronary heart disease preoperative gesture interactive diagnostic system based on Augmented Reality is proposed. The system uses Leap Motion Controller to capture hand gesture video sequences and extract the features which that are the position and orientation vector of the gesture motion trajectory and the change of the hand shape. The training planet is determined by K-means algorithm and then the effect of gesture training is improved by multi-features and multi-observation sequences for gesture training. The reusability of gesture is improved by establishing the state transition model. The algorithm efficiency is improved by gesture prejudgment which is used by threshold discriminating before recognition. The integrity of the trajectory is preserved and the gesture motion space is extended by employing space rotation transformation of gesture manipulation plane. Ultimately, the gesture recognition based on SRT-HMM is realized. The diagnosis and measurement of the vascular stenosis are intuitively and naturally realized by operating and measuring the coronary artery model with augmented reality and gesture interaction techniques. All of the gesture recognition experiments show the distinguish ability and generalization ability of the algorithm and gesture interaction experiments prove the availability and reliability of the system.
Energy-aware scheduling of surveillance in wireless multimedia sensor networks.
Wang, Xue; Wang, Sheng; Ma, Junjie; Sun, Xinyao
2010-01-01
Wireless sensor networks involve a large number of sensor nodes with limited energy supply, which impacts the behavior of their application. In wireless multimedia sensor networks, sensor nodes are equipped with audio and visual information collection modules. Multimedia contents are ubiquitously retrieved in surveillance applications. To solve the energy problems during target surveillance with wireless multimedia sensor networks, an energy-aware sensor scheduling method is proposed in this paper. Sensor nodes which acquire acoustic signals are deployed randomly in the sensing fields. Target localization is based on the signal energy feature provided by multiple sensor nodes, employing particle swarm optimization (PSO). During the target surveillance procedure, sensor nodes are adaptively grouped in a totally distributed manner. Specifically, the target motion information is extracted by a forecasting algorithm, which is based on the hidden Markov model (HMM). The forecasting results are utilized to awaken sensor nodes in the vicinity of the future target position. According to two properties, signal energy feature and residual energy, the sensor nodes separately decide whether to participate in target detection with a fuzzy control approach. Meanwhile, the local routing scheme of data transmission towards the observer is discussed. Experimental results demonstrate the efficiency of energy-aware scheduling of surveillance in wireless multimedia sensor networks, where significant energy saving is achieved by the sensor awakening approach and data transmission paths are calculated with low computational complexity.
Orellana, Luis H.; Rodriguez-R, Luis M.; Konstantinidis, Konstantinos T.
2016-10-07
Functional annotation of metagenomic and metatranscriptomic data sets relies on similarity searches based on e-value thresholds resulting in an unknown number of false positive and negative matches. To overcome these limitations, we introduce ROCker, aimed at identifying position-specific, most-discriminant thresholds in sliding windows along the sequence of a target protein, accounting for non-discriminative domains shared by unrelated proteins. ROCker employs the receiver operating characteristic (ROC) curve to minimize false discovery rate (FDR) and calculate the best thresholds based on how simulated shotgun metagenomic reads of known composition map onto well-curated reference protein sequences and thus, differs from HMM profiles and related methods. We showcase ROCker using ammonia monooxygenase (amoA) and nitrous oxide reductase (nosZ) genes, mediating oxidation of ammonia and the reduction of the potent greenhouse gas, N2O, to inert N2, respectively. ROCker typically showed 60-fold lower FDR when compared to the common practice of using fixed e-values. Previously uncounted ‘atypical’ nosZ genes were found to be two times more abundant, on average, than their typical counterparts in most soil metagenomes and the abundance of bacterial amoA was quantified against the highly-related particulate methane monooxygenase (pmoA). Therefore, ROCker can reliably detect and quantify target genes in short-read metagenomes.
Nguyen, Thao Thi; Chon, Tae-Soo; Kim, Jaehan; Seo, Young-Su; Heo, Muyoung
2017-07-01
Secreted proteins (secretomes) play crucial roles during bacterial pathogenesis in both plant and human hosts. The identification and characterization of secretomes in the two plant pathogens Burkholderia glumae BGR1 and B. gladioli BSR3, which cause diseases in rice such as seedling blight, panicle blight, and grain rot, are important steps not only to understand the disease-causing mechanisms but also to find remedies for the diseases. Here, we identified two datasets of secretomes in B. glumae BGR1 and B. gladioli BSR3, which consist of 118 and 111 proteins, respectively, using a mass spectrometry approach and literature curation. Next, we characterized the functional properties, potential secretion pathways and sequence information properties of the secretomes of the two plant pathogens in a comparative analysis using various computational approaches. The ratio of potential non-classically secreted proteins (NCSPs) to classically secreted proteins (CSPs) in B. glumae BGR1 was greater than that in B. gladioli BSR3. For CSPs, the putative hydrophobic regions (PHRs), which are essential for the secretion process of CSPs, were screened in detail at their N-terminal sequences using a hidden Markov model (HMM)-based method. In total, 31 pairs of homologous proteins in the two bacterial secretomes were identified based on global alignment (identity ≥ 70%). Our results may facilitate the understanding of the species-specific features of secretomes in two plant pathogenic Burkholderia species.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Orellana, Luis H.; Rodriguez-R, Luis M.; Konstantinidis, Konstantinos T.
Functional annotation of metagenomic and metatranscriptomic data sets relies on similarity searches based on e-value thresholds resulting in an unknown number of false positive and negative matches. To overcome these limitations, we introduce ROCker, aimed at identifying position-specific, most-discriminant thresholds in sliding windows along the sequence of a target protein, accounting for non-discriminative domains shared by unrelated proteins. ROCker employs the receiver operating characteristic (ROC) curve to minimize false discovery rate (FDR) and calculate the best thresholds based on how simulated shotgun metagenomic reads of known composition map onto well-curated reference protein sequences and thus, differs from HMM profiles and related methods. We showcase ROCker using ammonia monooxygenase (amoA) and nitrous oxide reductase (nosZ) genes, mediating oxidation of ammonia and the reduction of the potent greenhouse gas, N2O, to inert N2, respectively. ROCker typically showed 60-fold lower FDR when compared to the common practice of using fixed e-values. Previously uncounted ‘atypical’ nosZ genes were found to be two times more abundant, on average, than their typical counterparts in most soil metagenomes and the abundance of bacterial amoA was quantified against the highly-related particulate methane monooxygenase (pmoA). Therefore, ROCker can reliably detect and quantify target genes in short-read metagenomes.
2017-01-01
Abstract Functional annotation of metagenomic and metatranscriptomic data sets relies on similarity searches based on e-value thresholds resulting in an unknown number of false positive and negative matches. To overcome these limitations, we introduce ROCker, aimed at identifying position-specific, most-discriminant thresholds in sliding windows along the sequence of a target protein, accounting for non-discriminative domains shared by unrelated proteins. ROCker employs the receiver operating characteristic (ROC) curve to minimize false discovery rate (FDR) and calculate the best thresholds based on how simulated shotgun metagenomic reads of known composition map onto well-curated reference protein sequences and thus, differs from HMM profiles and related methods. We showcase ROCker using ammonia monooxygenase (amoA) and nitrous oxide reductase (nosZ) genes, mediating oxidation of ammonia and the reduction of the potent greenhouse gas, N2O, to inert N2, respectively. ROCker typically showed 60-fold lower FDR when compared to the common practice of using fixed e-values. Previously uncounted ‘atypical’ nosZ genes were found to be two times more abundant, on average, than their typical counterparts in most soil metagenomes and the abundance of bacterial amoA was quantified against the highly-related particulate methane monooxygenase (pmoA). Therefore, ROCker can reliably detect and quantify target genes in short-read metagenomes. PMID:28180325
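The idea of deriving a score threshold from a ROC curve, rather than using a fixed e-value, can be illustrated with a small sketch. It assumes that simulated reads with known origin have been scored against a reference and that the threshold is chosen by maximizing Youden's J (true positive rate minus false positive rate). That criterion, the function name and the toy scores are assumptions made for illustration; this is not ROCker's actual implementation or selection rule.

```python
def best_threshold(scores, labels):
    """Return (threshold, J) maximizing Youden's J = TPR - FPR.

    A read is accepted when its score is >= threshold. `labels` holds 1 for
    reads that truly derive from the target gene and 0 for all other reads.
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("need at least one positive and one negative read")
    best_t, best_j = None, -1.0
    for t in sorted(set(scores)):
        tp = sum(1 for s, l in zip(scores, labels) if l and s >= t)
        fp = sum(1 for s, l in zip(scores, labels) if not l and s >= t)
        j = tp / positives - fp / negatives
        if j > best_j:
            best_t, best_j = t, j
    return best_t, best_j


# Toy alignment scores for simulated reads mapping to one hypothetical window.
window_scores = [10, 20, 30, 40, 50, 60]
window_labels = [0, 0, 1, 0, 1, 1]
threshold, youden_j = best_threshold(window_scores, window_labels)
```

In a position-specific scheme, a function of this kind would be called once per sliding window along the reference protein, giving a separate threshold for each region.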
Modeling Driver Behavior near Intersections in Hidden Markov Model
Li, Juan; He, Qinglian; Zhou, Hang; Guan, Yunlin; Dai, Wei
2016-01-01
Intersections are one of the major locations where safety is a big concern to drivers. Inappropriate driver behaviors in response to frequent changes when approaching intersections often lead to intersection-related crashes or collisions. Thus to better understand driver behaviors at intersections, especially in the dilemma zone, a Hidden Markov Model (HMM) is utilized in this study. With the discrete data processing, the observed dynamic data of vehicles are used for the inference of the Hidden Markov Model. The Baum-Welch (B-W) estimation algorithm is applied to calculate the vehicle state transition probability matrix and the observation probability matrix. When combined with the Forward algorithm, the most likely state of the driver can be obtained. Thus the model can be used to measure the stability and risk of driver behavior. It is found that drivers’ behaviors in the dilemma zone are of lower stability and higher risk compared with those in other regions around intersections. In addition to the B-W estimation algorithm, the Viterbi Algorithm is utilized to predict the potential dangers of vehicles. The results can be applied to driving assistance systems to warn drivers to avoid possible accidents. PMID:28009838
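The decoding step mentioned in this abstract can be sketched with a small Viterbi implementation for a discrete HMM. The two hidden states, the three observation symbols, and all probabilities below are hypothetical values chosen for illustration, not parameters from the study; in practice they would come from Baum-Welch estimation.

```python
import math

def _log(p):
    # log(0) is treated as minus infinity so impossible paths never win.
    return math.log(p) if p > 0 else float("-inf")

def viterbi(obs, states, start_p, trans_p, emit_p):
    """Most likely hidden-state path for a discrete observation sequence."""
    # Log-probability of the best path ending in each state at time 0.
    score = {s: _log(start_p[s]) + _log(emit_p[s][obs[0]]) for s in states}
    back = []  # back[t][s] = best predecessor of state s at time t + 1
    for o in obs[1:]:
        prev, score, pointers = score, {}, {}
        for s in states:
            best_prev = max(states, key=lambda r: prev[r] + _log(trans_p[r][s]))
            score[s] = prev[best_prev] + _log(trans_p[best_prev][s]) + _log(emit_p[s][o])
            pointers[s] = best_prev
        back.append(pointers)
    # Trace the best path backwards from the best final state.
    path = [max(states, key=lambda s: score[s])]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    return path[::-1]

# Hypothetical two-state driver model with discretized vehicle observations.
states = ("stable", "risky")
start_p = {"stable": 0.8, "risky": 0.2}
trans_p = {"stable": {"stable": 0.9, "risky": 0.1},
           "risky": {"stable": 0.3, "risky": 0.7}}
emit_p = {"stable": {"steady": 0.8, "brake": 0.1, "accelerate": 0.1},
          "risky": {"steady": 0.1, "brake": 0.5, "accelerate": 0.4}}
obs = ["steady", "brake", "brake", "accelerate", "steady"]
path = viterbi(obs, states, start_p, trans_p, emit_p)
print(path)
```

Working in log space avoids numerical underflow when the observation sequence is long.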
N-substituted methyl maleamates as larvicidal compounds against Aedes aegypti (Diptera: Culicidae).
Harburguer, Laura; Gonzalez, Paula V; Gonzalez Audino, Paola; Zerba, Eduardo; Masuh, Héctor
2018-02-01
Severe human arboviral diseases can be transmitted by the mosquito Aedes aegypti (L.), including dengue, chikungunya, zika, and yellow fever. The use of larvicides in containers that are potential breeding sites and cannot be eliminated is the main alternative in control programs. However, their continuous and widespread use has caused an increase in insecticide-resistant populations of this mosquito. The aim of this study was to evaluate the effect of three N-substituted methyl maleamates as larvicides on Ae. aegypti: N-propyl methyl maleamate (PMM), N-butyl methyl maleamate (BMM), and N-hexyl methyl maleamate (HMM). These compounds could have a different mode of action from those larvicides known so far. We evaluated larval mortality after 1 and 24 h of exposure and found that mortality was fast, occurring within the first 60 min. HMM was slightly more effective, with LC50 values of 0.7 and 0.3 ppm for 1 and 24 h of exposure and LC95 values of 11 and 3 ppm. Our results demonstrate that N-substituted methyl maleamates have insecticidal properties for the control of Ae. aegypti larvae. These compounds could become useful alternatives to traditional larvicides after studying their insecticidal mechanism as well as their toxicity towards non-target organisms.
Discriminating the reaction types of plant type III polyketide synthases
Shimizu, Yugo; Ogata, Hiroyuki; Goto, Susumu
2017-01-01
Abstract Motivation: Functional prediction of paralogs is challenging in bioinformatics because of rapid functional diversification after gene duplication events combined with parallel acquisitions of similar functions by different paralogs. Plant type III polyketide synthases (PKSs), producing various secondary metabolites, represent a paralogous family that has undergone gene duplication and functional alteration. Currently, there is no computational method available for the functional prediction of type III PKSs. Results: We developed a plant type III PKS reaction predictor, pPAP, based on the recently proposed classification of type III PKSs. pPAP combines two kinds of similarity measures: one calculated by profile hidden Markov models (pHMMs) built from functionally and structurally important partial sequence regions, and the other based on mutual information between residue positions. pPAP targets PKSs acting on ring-type starter substrates, and classifies their functions into four reaction types. The pHMM approach discriminated two reaction types with high accuracy (97.5%, 39/40), but its accuracy decreased when discriminating three reaction types (87.8%, 43/49). When combined with a correlation-based approach, all 49 PKSs were correctly discriminated, and pPAP was still highly accurate (91.4%, 64/70) even after adding other reaction types. These results suggest pPAP, which is based on linear discriminant analyses of similarity measures, is effective for plant type III PKS function prediction. Availability and Implementation: pPAP is freely available at ftp://ftp.genome.jp/pub/tools/ppap/ Contact: goto@kuicr.kyoto-u.ac.jp Supplementary information: Supplementary data are available at Bioinformatics online. PMID:28334262
Pineda-Peña, Andrea-Clemencia; Faria, Nuno Rodrigues; Imbrechts, Stijn; Libin, Pieter; Abecasis, Ana Barroso; Deforche, Koen; Gómez-López, Arley; Camacho, Ricardo J; de Oliveira, Tulio; Vandamme, Anne-Mieke
2013-10-01
To investigate differences in pathogenesis, diagnosis and resistance pathways between HIV-1 subtypes, an accurate subtyping tool for large datasets is needed. We aimed to evaluate the performance of automated subtyping tools to classify the different subtypes and circulating recombinant forms using pol, the most sequenced region in clinical practice. We also present the upgraded version 3 of the Rega HIV subtyping tool (REGAv3). HIV-1 pol sequences (PR+RT) for 4674 patients retrieved from the Portuguese HIV Drug Resistance Database, and 1872 pol sequences trimmed from full-length genomes retrieved from the Los Alamos database were classified with statistical-based tools such as COMET, jpHMM and STAR; similarity-based tools such as NCBI and Stanford; and phylogenetic-based tools such as REGA version 2 (REGAv2), REGAv3, and SCUEAL. The performance of these tools, for pol, and for PR and RT separately, was compared in terms of reproducibility, sensitivity and specificity with respect to the gold standard which was manual phylogenetic analysis of the pol region. The sensitivity and specificity for subtypes B and C was more than 96% for seven tools, but was variable for other subtypes such as A, D, F and G. With regard to the most common circulating recombinant forms (CRFs), the sensitivity and specificity for CRF01_AE was ~99% with statistical-based tools, with phylogenetic-based tools and with Stanford, one of the similarity based tools. CRF02_AG was correctly identified for more than 96% by COMET, REGAv3, Stanford and STAR. All the tools reached a specificity of more than 97% for most of the subtypes and the two main CRFs (CRF01_AE and CRF02_AG). Other CRFs were identified only by COMET, REGAv2, REGAv3, and SCUEAL and with variable sensitivity. When analyzing sequences for PR and RT separately, the performance for PR was generally lower and variable between the tools. 
Similarity and statistical-based tools were 100% reproducible, but this was lower for phylogenetic-based tools such as REGA (~99%) and SCUEAL (~96%). REGAv3 had an improved performance for subtype B and CRF02_AG compared to REGAv2 and is now able to also identify all epidemiologically relevant CRFs. In general the best performing tools, in alphabetical order, were COMET, jpHMM, REGAv3, and SCUEAL when analyzing pure subtypes in the pol region, and COMET and REGAv3 when analyzing most of the CRFs. Based on this study, we recommend to confirm subtyping with 2 well performing tools, and be cautious with the interpretation of short sequences. Copyright © 2013 The Authors. Published by Elsevier B.V. All rights reserved.
Learning Cellular Sorting Pathways Using Protein Interactions and Sequence Motifs
Lin, Tien-Ho; Bar-Joseph, Ziv
2011-01-01
Abstract Proper subcellular localization is critical for proteins to perform their roles in cellular functions. Proteins are transported by different cellular sorting pathways, some of which take a protein through several intermediate locations until reaching its final destination. The pathway a protein is transported through is determined by carrier proteins that bind to specific sequence motifs. In this article, we present a new method that integrates protein interaction and sequence motif data to model how proteins are sorted through these sorting pathways. We use a hidden Markov model (HMM) to represent protein sorting pathways. The model is able to determine intermediate sorting states and to assign carrier proteins and motifs to the sorting pathways. In simulation studies, we show that the method can accurately recover an underlying sorting model. Using data for yeast, we show that our model leads to accurate prediction of subcellular localization. We also show that the pathways learned by our model recover many known sorting pathways and correctly assign proteins to the path they utilize. The learned model identified new pathways and their putative carriers and motifs and these may represent novel protein sorting mechanisms. Supplementary results and software implementation are available from http://murphylab.web.cmu.edu/software/2010_RECOMB_pathways/. PMID:21999284
Detection and diagnosis of bearing and cutting tool faults using hidden Markov models
NASA Astrophysics Data System (ADS)
Boutros, Tony; Liang, Ming
2011-08-01
Over the last few decades, the research for new fault detection and diagnosis techniques in machining processes and rotating machinery has attracted increasing interest worldwide. This development was mainly stimulated by the rapid advance in industrial technologies and the increase in complexity of machining and machinery systems. In this study, the discrete hidden Markov model (HMM) is applied to detect and diagnose mechanical faults. The technique is tested and validated successfully using two scenarios: tool wear/fracture and bearing faults. In the first case the model correctly detected the state of the tool (i.e., sharp, worn, or broken) whereas in the second application, the model classified the severity of the fault seeded in two different engine bearings. The success rate obtained in our tests for fault severity classification was above 95%. In addition to the fault severity, a location index was developed to determine the fault location. This index has been applied to determine the location (inner race, ball, or outer race) of a bearing fault with an average success rate of 96%. The training time required to develop the HMMs was less than 5 s in both the monitoring cases.
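One common way to use discrete HMMs for this kind of diagnosis is to train one model per machine condition and assign a new observation sequence to the model with the highest likelihood. The sketch below shows that decision rule with the forward algorithm; the rule itself, the condition names, and every probability are assumptions for illustration, since the abstract does not state how the models were compared.

```python
def forward_likelihood(obs, start_p, trans_p, emit_p):
    """P(obs | model) by the forward algorithm; states and symbols are indices."""
    n = len(start_p)
    alpha = [start_p[i] * emit_p[i][obs[0]] for i in range(n)]
    for o in obs[1:]:
        # Propagate through the transitions, then weight by the emission.
        alpha = [
            sum(alpha[i] * trans_p[i][j] for i in range(n)) * emit_p[j][o]
            for j in range(n)
        ]
    return sum(alpha)

def classify(obs, models):
    """Name of the model that gives the observation sequence the highest likelihood."""
    return max(models, key=lambda name: forward_likelihood(obs, *models[name]))

# Hypothetical models: (start, transition, emission); symbol 0 = low, 1 = high vibration.
models = {
    "sharp": ([0.9, 0.1], [[0.9, 0.1], [0.5, 0.5]], [[0.9, 0.1], [0.5, 0.5]]),
    "worn": ([0.2, 0.8], [[0.5, 0.5], [0.1, 0.9]], [[0.6, 0.4], [0.1, 0.9]]),
}
print(classify([1, 1, 0, 1], models))
print(classify([0, 0, 0, 0], models))
```

For long sequences the unscaled forward variables underflow, so a practical version would rescale `alpha` at each step or work with log-probabilities.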
López-Orenes, Antonio; Bueso, María C; Conesa, Héctor M; Calderón, Antonio A; Ferrer, María A
2017-01-01
Soil pollution by heavy metals/metalloids (HMMs) is a problem worldwide. To prevent dispersion of contaminated particles by erosion, the maintenance of a vegetative cover is needed. Successful plant establishment in multi-polluted soils can be hampered not only by HMM toxicities, but also by soil nutrient deficiencies and the co-occurrence of abiotic stresses. Some plant species are able to thrive under these multi-stress scenarios often linked to marked fluctuations in environmental factors. This study aimed to investigate the metabolic adjustments involved in Zygophyllum fabago acclimative responses to conditions prevailing in HMM-enriched mine-tailings piles, during Mediterranean spring and summer. To this end, fully expanded leaves, and rhizosphere soil, of three contrasting mining and non-mining populations of Z. fabago grown spontaneously in south-eastern Spain were sampled in two consecutive years. Approximately 50 biochemical, physiological and edaphic parameters were examined, including leaf redox components, primary and secondary metabolites, endogenous levels of salicylic acid, and physicochemical properties of soil (fertility parameters and total concentration of HMMs). Multivariate data analysis showed a clear distinction in antioxidative/oxidative profiles between and within the populations studied. Levels of chlorophylls, proteins and proline characterized control plants whereas antioxidant capacity and C- and S-based antioxidant compounds were biomarkers of mining plants. Seasonal variations were characterized by higher levels of alkaloids and PAL and soluble peroxidase activities in summer, and by soluble sugars and hydroxycinnamic acids in spring irrespective of the population considered. Although the antioxidant systems are subjected to seasonal variations, the way and the intensity with which every population changes its antioxidative/oxidative profile seem to be determined by soil conditions. In short, Z. fabago displays a high physiological plasticity that allows it to successfully shift its metabolism to withstand the multiple stresses that plants must cope with in mine tailings piles under Mediterranean climatic conditions. Copyright © 2016 Elsevier B.V. All rights reserved.
Landfors, Mattias; Philip, Philge; Rydén, Patrik; Stenberg, Per
2011-01-01
Genome-wide analysis of gene expression or protein binding patterns using different array or sequencing based technologies is now routinely performed to compare different populations, such as treatment and reference groups. It is often necessary to normalize the data obtained to remove technical variation introduced in the course of conducting experimental work, but standard normalization techniques are not capable of eliminating technical bias in cases where the distribution of the truly altered variables is skewed, i.e. when a large fraction of the variables are either positively or negatively affected by the treatment. However, several experiments are likely to generate such skewed distributions, including ChIP-chip experiments for the study of chromatin, gene expression experiments for the study of apoptosis, and SNP-studies of copy number variation in normal and tumour tissues. A preliminary study using spike-in array data established that the capacity of an experiment to identify altered variables and generate unbiased estimates of the fold change decreases as the fraction of altered variables and the skewness increases. We propose the following work-flow for analyzing high-dimensional experiments with regions of altered variables: (1) Pre-process raw data using one of the standard normalization techniques. (2) Investigate if the distribution of the altered variables is skewed. (3) If the distribution is not believed to be skewed, no additional normalization is needed. Otherwise, re-normalize the data using a novel HMM-assisted normalization procedure. (4) Perform downstream analysis. Here, ChIP-chip data and simulated data were used to evaluate the performance of the work-flow. It was found that skewed distributions can be detected by using the novel DSE-test (Detection of Skewed Experiments). 
Furthermore, applying the HMM-assisted normalization to experiments where the distribution of the truly altered variables is skewed results in considerably higher sensitivity and lower bias than can be attained using standard and invariant normalization methods. PMID:22132175
Hand, belt, pocket or bag: Practical activity tracking with mobile phones
Antos, Stephen A.; Albert, Mark V.; Kording, Konrad P.
2013-01-01
For rehabilitation and diagnoses, an understanding of patient activities and movements is important. Modern smartphones have built in accelerometers which promise to enable quantifying minute-by-minute what patients do (e.g. walk or sit). Such a capability could inform recommendations of physical activities and improve medical diagnostics. However, a major problem is that during everyday life, we carry our phone in different ways, e.g. on our belt, in our pocket, in our hand, or in a bag. The recorded accelerations are not only affected by our activities but also by the phone’s location. Here we develop a method to solve this kind of problem, based on the intuition that activities change rarely, and phone locations change even less often. A Hidden Markov Model (HMM) tracks changes across both activities and locations, enabled by a static Support Vector Machine (SVM) classifier that probabilistically identifies activity-location pairs. We find that this approach improves tracking accuracy on healthy subjects as compared to a static classifier alone. The obtained method can be readily applied to patient populations. Our research enables the use of phones as activity tracking devices, without the need of previous approaches to instruct subjects to always carry the phone in the same location. PMID:24091138
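The intuition that activities and phone locations change rarely can be sketched as a "sticky" HMM that filters the per-window probabilities of a static classifier. In the sketch below the classifier outputs are treated directly as emission likelihoods, which is a simplification; the `stay` probability, the function name `smooth`, and the toy probabilities are all assumptions for illustration rather than the authors' settings.

```python
def smooth(frame_probs, stay=0.95):
    """Filter per-window class probabilities with a sticky HMM.

    frame_probs: frame_probs[t][k] is the static classifier's probability
                 of class k (an activity-location pair) in window t.
    stay:        assumed probability of remaining in the same class.
    """
    k = len(frame_probs[0])
    switch = (1.0 - stay) / (k - 1)
    belief = [1.0 / k] * k  # uniform prior over classes
    labels = []
    for probs in frame_probs:
        # Predict: most mass stays put, the rest spreads over the other classes.
        predicted = [
            sum(belief[i] * (stay if i == j else switch) for i in range(k))
            for j in range(k)
        ]
        # Update: weight by the classifier output, then renormalize.
        unnorm = [predicted[j] * probs[j] for j in range(k)]
        total = sum(unnorm)
        belief = [u / total for u in unnorm]
        labels.append(max(range(k), key=lambda j: belief[j]))
    return labels

# Hypothetical outputs for two classes; the third window is a noisy reading.
frame_probs = [[0.9, 0.1], [0.8, 0.2], [0.4, 0.6], [0.9, 0.1], [0.85, 0.15]]
raw = [max(range(len(p)), key=p.__getitem__) for p in frame_probs]
smoothed = smooth(frame_probs)
print(raw, smoothed)
```

In this toy run the static labels flip briefly in the third window, while the filtered labels stay on the first class because a single weak reading does not outweigh the assumed persistence.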
Hand, belt, pocket or bag: Practical activity tracking with mobile phones.
Antos, Stephen A; Albert, Mark V; Kording, Konrad P
2014-07-15
For rehabilitation and diagnoses, an understanding of patient activities and movements is important. Modern smartphones have built in accelerometers which promise to enable quantifying minute-by-minute what patients do (e.g. walk or sit). Such a capability could inform recommendations of physical activities and improve medical diagnostics. However, a major problem is that during everyday life, we carry our phone in different ways, e.g. on our belt, in our pocket, in our hand, or in a bag. The recorded accelerations are not only affected by our activities but also by the phone's location. Here we develop a method to solve this kind of problem, based on the intuition that activities change rarely, and phone locations change even less often. A hidden Markov model (HMM) tracks changes across both activities and locations, enabled by a static support vector machine (SVM) classifier that probabilistically identifies activity-location pairs. We find that this approach improves tracking accuracy on healthy subjects as compared to a static classifier alone. The obtained method can be readily applied to patient populations. Our research enables the use of phones as activity tracking devices, without the need of previous approaches to instruct subjects to always carry the phone in the same location. Copyright © 2013 Elsevier B.V. All rights reserved.
Lozano, Roberto; Ponce, Olga; Ramirez, Manuel; Mostajo, Nelly; Orjeda, Gisella
2012-01-01
The majority of disease resistance (R) genes identified to date in plants encode a nucleotide-binding site (NBS) and leucine-rich repeat (LRR) domain containing protein. Additional domains such as coiled-coil (CC) and TOLL/interleukin-1 receptor (TIR) domains can also be present. In the recently sequenced Solanum tuberosum group phureja genome we used HMM models and manual curation to annotate 435 NBS-encoding R gene homologs and 142 NBS-derived genes that lack the NBS domain. Highly similar homologs for most previously documented Solanaceae R genes were identified. A surprising ∼41% (179) of the 435 NBS-encoding genes are pseudogenes primarily caused by premature stop codons or frameshift mutations. Alignment of 81.80% of the 577 homologs to S. tuberosum group phureja pseudomolecules revealed non-random distribution of the R-genes; 362 of 470 genes were found in high density clusters on 11 chromosomes. PMID:22493716
Hyperbolic metamaterial lens with hydrodynamic nonlocal response.
Yan, Wei; Mortensen, N Asger; Wubs, Martijn
2013-06-17
We investigate the effects of hydrodynamic nonlocal response in hyperbolic metamaterials (HMMs), focusing on the experimentally realizable parameter regime where unit cells are much smaller than an optical wavelength but much larger than the wavelengths of the longitudinal pressure waves of the free-electron plasma in the metal constituents. We derive the nonlocal corrections to the effective material parameters analytically, and illustrate the noticeable nonlocal effects on the dispersion curves numerically. As an application, we find that the focusing characteristics of a HMM lens in the local-response approximation and in the hydrodynamic Drude model can differ considerably. In particular, the optimal frequency for imaging in the nonlocal theory is blueshifted with respect to that in the local theory. Thus, to detect whether nonlocal response is at work in a hyperbolic metamaterial, we propose to measure the near-field distribution of a hyperbolic metamaterial lens.
Phagonaute: A web-based interface for phage synteny browsing and protein function prediction.
Delattre, Hadrien; Souiai, Oussema; Fagoonee, Khema; Guerois, Raphaël; Petit, Marie-Agnès
2016-09-01
Distant homology search tools are of great help to predict viral protein functions. However, due to the lack of profile databases dedicated to viruses, they can lack sensitivity. We constructed HMM profiles for more than 80,000 proteins from both phages and archaeal viruses, and performed all pairwise comparisons with HHsearch program. The whole resulting database can be explored through a user-friendly "Phagonaute" interface to help predict functions. Results are displayed together with their genetic context, to strengthen inferences based on remote homology. Beyond function prediction, this tool permits detections of co-occurrences, often indicative of proteins completing a task together, and observation of conserved patterns across large evolutionary distances. As a test, Herpes simplex virus I was added to Phagonaute, and 25% of its proteome matched to bacterial or archaeal viral protein counterparts. Phagonaute should therefore help virologists in their quest for protein functions and evolutionary relationships. Copyright © 2016 Elsevier Inc. All rights reserved.
Nanopatterned organic semiconductors for visible light communications
NASA Astrophysics Data System (ADS)
Yang, Xilu; Dong, Yurong; Zeng, Pan; Yu, Yan; Xie, Yujun; Gong, Junyi; Shi, Meng; Liang, Rongqing; Ou, Qiongrong; Chi, Nan; Zhang, Shuyu
2018-03-01
Visible light communication (VLC) is becoming an important and promising supplement to the existing Wi-Fi network for the coming 5G communications. Organic light-emitting semiconductors present much faster fluorescent decay rates than conventional colour-converting phosphors and are therefore capable of achieving much higher bandwidths. Here we explore how nanopatterned organic semiconductors can further enhance the data rates of VLC links by improving bandwidths and signal-to-noise ratios (SNRs) and by supporting spatial multiplexing. We first demonstrate a colour-converting VLC system based on nanopatterned hyperbolic metamaterials (HMM), the bandwidth of which is enhanced by 50%. With regard to enhancing SNRs, we achieve a tripling of optical gain by integrating a nanopatterned luminescent concentrator with a signal receiver. In addition, we demonstrate highly directional fluorescent VLC antennas based on nanoimprinted polymer films, paving the way to achieving parallel VLC communications via spatial multiplexing. These results indicate that nanopatterned organic semiconductors provide a promising route to high-speed VLC links.
Constrained Versions of DEDICOM for Use in Unsupervised Part-Of-Speech Tagging
DOE Office of Scientific and Technical Information (OSTI.GOV)
Dunlavy, Daniel; Chew, Peter A.
This report describes extensions of DEDICOM (DEcomposition into DIrectional COMponents) data models [3] that incorporate bound and linear constraints. The main purpose of these extensions is to investigate the use of improved data models for unsupervised part-of-speech tagging, as described by Chew et al. [2]. In that work, a single domain, two-way DEDICOM model was computed on a matrix of bigram frequencies of tokens in a corpus and used to identify parts-of-speech as an unsupervised approach to that problem. An open problem identified in that work was the computation of a DEDICOM model that more closely resembled the matrices used in a Hidden Markov Model (HMM), specifically through post-processing of the DEDICOM factor matrices. The work reported here consists of the description of several models that aim to provide a direct solution to that problem and a way to fit those models. The approach taken here is to incorporate the model requirements as bound and linear constraints into the DEDICOM model directly and solve the data fitting problem as a constrained optimization problem. This is in contrast to the typical approaches in the literature, where the DEDICOM model is fit using unconstrained optimization approaches, and model requirements are satisfied as a post-processing step.
Sola, J; Braun, F; Muntane, E; Verjus, C; Bertschi, M; Hugon, F; Manzano, S; Benissa, M; Gervaix, A
2016-08-01
Pneumonia remains the leading cause of mortality worldwide among children under the age of five, with 1.4 million deaths every year. Unfortunately, in low-resource settings, very limited diagnostic support aids are provided to point-of-care practitioners. The current UNICEF/WHO case management algorithm relies on the use of a chronometer to manually count breath rates in pediatric patients; there is thus a major need for more sophisticated tools to diagnose pneumonia that increase the sensitivity and specificity of breath-rate-based algorithms. These tools should be low cost and adapted to practitioners with limited training. In this work, a novel concept of an unsupervised tool for the diagnosis of childhood pneumonia is presented. The concept relies on the automated analysis of respiratory sounds as recorded by a point-of-care electronic stethoscope. By identifying the presence of auscultation sounds at different chest locations, this diagnostic tool is intended to estimate a pneumonia likelihood score. After presenting the overall architecture of an algorithm to estimate pneumonia scores, the importance of a robust unsupervised method to identify inspiratory and expiratory phases of a respiratory cycle is highlighted. Based on data from an ongoing study involving pediatric pneumonia patients, a first algorithm to segment respiratory sounds is suggested. The unsupervised algorithm relies on a Mel-frequency filter bank, a two-step Gaussian Mixture Model (GMM) description of data, and a final Hidden Markov Model (HMM) interpretation of inspiratory-expiratory sequences. Finally, illustrative results on the first recruited patients are provided. The presented algorithm opens the door to a new family of unsupervised respiratory sound analyzers that could improve future versions of case management algorithms for the diagnosis of pneumonia in low-resource settings.
HMM Sequential Hypothesis Tests for Intrusion Detection in MANETs Extended Abstract
2003-01-01
securing the routing protocols of mobile ad hoc wireless networks has been done in prevention. Intrusion detection systems play a complementary... hops of A would be unable to communicate with B and vice versa [1]. 1.2 The role of intrusion detection in security In order to provide reliable
De novo identification of highly diverged protein repeats by probabilistic consistency.
Biegert, A; Söding, J
2008-03-15
An estimated 25% of all eukaryotic proteins contain repeats, which underlines the importance of duplication for evolving new protein functions. Internal repeats often correspond to structural or functional units in proteins. Methods capable of identifying diverged repeated segments or domains at the sequence level can therefore assist in predicting domain structures, inferring hypotheses about function and mechanism, and investigating the evolution of proteins from smaller fragments. We present HHrepID, a method for the de novo identification of repeats in protein sequences. It is able to detect the sequence signature of structural repeats in many proteins that have not yet been known to possess internal sequence symmetry, such as outer membrane beta-barrels. HHrepID uses HMM-HMM comparison to exploit evolutionary information in the form of multiple sequence alignments of homologs. In contrast to a previous method, the new method (1) generates a multiple alignment of repeats; (2) utilizes the transitive nature of homology through a novel merging procedure with fully probabilistic treatment of alignments; (3) improves alignment quality through an algorithm that maximizes the expected accuracy; (4) is able to identify different kinds of repeats within complex architectures by a probabilistic domain boundary detection method and (5) improves sensitivity through a new approach to assess statistical significance. Server: http://toolkit.tuebingen.mpg.de/hhrepid; Executables: ftp://ftp.tuebingen.mpg.de/pub/protevo/HHrepID
Genome-Wide Analysis of Type VI System Clusters and Effectors in Burkholderia Species.
Nguyen, Thao Thi; Lee, Hyun-Hee; Park, Inmyoung; Seo, Young-Su
2018-02-01
Type VI secretion system (T6SS) has been discovered in a variety of gram-negative bacteria as a versatile weapon to stimulate the killing of eukaryotic cells or prokaryotic competitors. Type VI secretion effectors (T6SEs) are well known as key virulence factors for important pathogenic bacteria. In many Burkholderia species, T6SS has evolved as the most complicated secretion pathway with distinguished types to translocate diverse T6SEs, suggesting their essential roles in this genus. Here we attempted to detect and characterize T6SSs and potential T6SEs in target genomes of plant-associated and environmental Burkholderia species based on computational analyses. In total, 66 potential functional T6SS clusters were found in 30 target Burkholderia bacterial genomes, of which 33% possess three or four clusters. The core proteins in each cluster were specified and phylogenetic trees of three components (i.e., TssC, TssD, TssL) were constructed to elucidate the relationship among the identified T6SS clusters. Next, we identified 322 potential T6SEs in the target genomes based on homology searches and explored the important domains conserved in effector candidates. In addition, using the screening approach based on the profile hidden Markov model (pHMM) of T6SEs that possess markers for type VI effectors (MIX motif) (MIX T6SEs), 57 revealed proteins that were not included in training datasets were recognized as novel MIX T6SE candidates from the Burkholderia species. This approach could be useful to identify potential T6SEs from other bacterial genomes.
Tumor propagation model using generalized hidden Markov model
NASA Astrophysics Data System (ADS)
Park, Sun Young; Sargent, Dustin
2017-02-01
Tumor tracking and progression analysis using medical images is a crucial task for physicians to provide accurate and efficient treatment plans, and monitor treatment response. Tumor progression is tracked by manual measurement of tumor growth performed by radiologists. Several methods have been proposed to automate these measurements with segmentation, but many current algorithms are confounded by attached organs and vessels. To address this problem, we present a new generalized tumor propagation model considering time-series prior images and local anatomical features using a Hierarchical Hidden Markov model (HMM) for tumor tracking. First, we apply the multi-atlas segmentation technique to identify organs/sub-organs using pre-labeled atlases. Second, we apply a semi-automatic direct 3D segmentation method to label the initial boundary between the lesion and neighboring structures. Third, we detect vessels in the ROI surrounding the lesion. Finally, we apply the propagation model with the labeled organs and vessels to accurately segment and measure the target lesion. The algorithm has been designed in a general way to be applicable to various body parts and modalities. In this paper, we evaluate the proposed algorithm on lung and lung nodule segmentation and tracking. We report the algorithm's performance by comparing the longest diameter and nodule volumes using the FDA lung Phantom data and a clinical dataset.
Characterizing and Differentiating Brain State Dynamics via Hidden Markov Models
Ou, Jinli; Xie, Li; Jin, Changfeng; Li, Xiang; Zhu, Dajiang; Jiang, Rongxin; Chen, Yaowu
2014-01-01
Functional connectivity measured from resting state fMRI (R-fMRI) data has been widely used to examine the brain’s functional activities and has been recently used to characterize and differentiate brain conditions. However, the dynamical transition patterns of the brain’s functional states have been less explored. In this work, we propose a novel computational framework to quantitatively characterize the brain state dynamics via hidden Markov models (HMMs) learned from the observations of temporally dynamic functional connectomics, denoted as functional connectome states. The framework has been applied to the R-fMRI dataset including 44 post-traumatic stress disorder (PTSD) patients and 51 normal control (NC) subjects. Experimental results show that both PTSD and NC brains were undergoing remarkable changes in resting state and mainly transiting amongst a few brain states. Interestingly, further prediction with the best-matched HMM demonstrates that PTSD would enter into, but could not disengage from, a negative mood state. Importantly, 84 % of PTSD patients and 86 % of NC subjects are successfully classified via multiple HMMs using majority voting. PMID:25331991
Application of Dynamic naïve Bayesian classifier to comprehensive drought assessment
NASA Astrophysics Data System (ADS)
Park, D. H.; Lee, J. Y.; Lee, J. H.; Kim, T. W.
2017-12-01
Drought monitoring has been studied extensively because of the widespread impacts and complex causes of drought. The most important component of drought monitoring is estimating the characteristics and extent of drought through quantitative measurement. Drought assessment that considers the different aspects of a complicated drought condition and the uncertainty of the drought index is of great significance for accurate drought monitoring. This study used the dynamic Naïve Bayesian Classifier (DNBC), which is an extension of the Hidden Markov Model (HMM), to model and classify drought using various drought indices for integrated drought assessment. To provide a stable model for the combined use of multiple drought indices, the DNBC was employed to perform multi-index drought assessment by aggregating the effects of different types of drought while considering the inherent uncertainty. Drought classification was performed by the DNBC using several drought indices, namely the Standardized Precipitation Index (SPI), Streamflow Drought Index (SDI), and Normalized Vegetation Supply Water Index (NVSWI), which reflect meteorological, hydrological, and agricultural drought characteristics. Overall, the results showed that, in comparison with unidirectional (SPI, SDI, and NVSWI) or multivariate (Composite Drought Index, CDI) drought assessment, the proposed DNBC was able to classify drought synthetically while considering uncertainty. The model thus provides a method for comprehensive drought assessment through the combined use of different drought indices.
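As a rough illustration of how a dynamic naïve Bayesian classifier combines several indices, the sketch below runs forward filtering over a hidden drought state whose observed indices are assumed conditionally independent given that state. The discretization of each index into categories, the function name, and all probabilities used to exercise it are assumptions made for illustration; they are not the study's calibrated model.

```python
def dnbc_filter(observations, prior, trans, emissions):
    """Forward filtering for a dynamic naive Bayes classifier.

    observations       : list of tuples, one discretized value per index
                         (e.g. SPI, SDI, NVSWI categories) at each time step
    prior[j]           : initial probability of hidden drought state j
    trans[i][j]        : probability of moving from state i to state j
    emissions[k][j][v] : P(index k takes value v | state j)

    Returns the most probable state at each step and the final belief.
    """
    n = len(prior)
    belief = list(prior)
    states = []
    for t, obs in enumerate(observations):
        if t > 0:
            # Markov prediction step
            belief = [sum(belief[i] * trans[i][j] for i in range(n))
                      for j in range(n)]
        # Naive Bayes update: indices are independent given the state
        for k, value in enumerate(obs):
            belief = [belief[j] * emissions[k][j][value] for j in range(n)]
        total = sum(belief)
        belief = [b / total for b in belief]
        states.append(max(range(n), key=belief.__getitem__))
    return states, belief
```

Because the per-index likelihoods are simply multiplied, adding a further index only requires one more emission table, which is the main practical appeal of the factorized form.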
Respiratory disease, behavior, and survival of mountain goat kids
Blanchong, Julie A.; Anderson, Christopher A.; Clark, Nicholas J.; Klaver, Robert W.; Plummer, Paul J.; Cox, Mike; Mcadoo, Caleb; Wolff, Peregrine L.
2018-01-01
Bacterial pneumonia is a threat to bighorn sheep (Ovis canadensis) populations. Bighorn sheep in the East Humboldt Mountain Range (EHR), Nevada, USA, experienced a pneumonia epizootic in 2009–2010. Testing of mountain goats (Oreamnos americanus) that were captured or found dead on this range during and after the epizootic detected bacteria commonly associated with bighorn sheep pneumonia die‐offs. Additionally, in years subsequent to the bighorn sheep epizootic, the mountain goat population had low kid:adult ratios, a common outcome for bighorn sheep populations that have experienced a pneumonia epizootic. We hypothesized that pneumonia was present and negatively affecting mountain goat kids in the EHR. From June–August 2013–2015, we attempted to observe mountain goat kids with marked adult females in the EHR at least once per week to document signs of respiratory disease; identify associations between respiratory disease, activity levels, and subsequent disappearance (i.e., death); and estimate weekly survival. Each time we observed a kid with a marked adult female, we recorded any signs of respiratory disease and collected behavior data that we fit to a 3‐state discrete hidden Markov model (HMM) to predict a kid's state (active vs. sedentary) and its probability of disappearing. We first observed clinical signs of respiratory disease in kids in late July–early August each summer. We observed 8 of 31 kids with marked adult females with signs of respiratory disease on 13 occasions. On 11 of these occasions, the HMM predicted that kids were in the sedentary state, which was associated with increased probability of subsequent death. We estimated overall probability of kid survival from June–August to be 0.19 (95% CI = 0.08–0.38), which was lower than has been reported in other mountain goat populations. We concluded that respiratory disease was present in the mountain goat kids in the EHR and negatively affected their activity levels and survival. 
Our results raise concerns about potential effects of pneumonia to mountain goat populations and the potential for disease transmission between mountain goats and bighorn sheep where the species are sympatric.
Taborri, Juri; Scalona, Emilia; Palermo, Eduardo; Rossi, Stefano; Cappa, Paolo
2015-09-23
Gait-phase recognition is a necessary functionality to drive robotic rehabilitation devices for lower limbs. Hidden Markov Models (HMMs) represent a viable solution, but they need subject-specific training, making data processing very time-consuming. Here, we validated an inter-subject procedure to avoid the intra-subject one in two, four and six gait-phase models in pediatric subjects. The inter-subject procedure consists in the identification of a standardized parameter set to adapt the model to measurements. We tested the inter-subject procedure both on scalar and distributed classifiers. Ten healthy children and ten hemiplegic children, each equipped with two Inertial Measurement Units placed on shank and foot, were recruited. The sagittal component of angular velocity was recorded by gyroscopes while subjects performed four walking trials on a treadmill. The goodness of classifiers was evaluated with the Receiver Operating Characteristic. The results provided a goodness from good to optimum for all examined classifiers (0 < G < 0.6), with the best performance for the distributed classifier in two-phase recognition (G = 0.02). Differences were found among gait partitioning models, while no differences were found between training procedures with the exception of the shank classifier. Our results raise the possibility of avoiding subject-specific training in HMM for gait-phase recognition and its implementation to control exoskeletons for the pediatric population. PMID:26404309
Estimating the Entropy of Binary Time Series: Methodology, Some Theory and a Simulation Study
NASA Astrophysics Data System (ADS)
Gao, Yun; Kontoyiannis, Ioannis; Bienenstock, Elie
2008-06-01
Partly motivated by entropy-estimation problems in neuroscience, we present a detailed and extensive comparison between some of the most popular and effective entropy estimation methods used in practice: The plug-in method, four different estimators based on the Lempel-Ziv (LZ) family of data compression algorithms, an estimator based on the Context-Tree Weighting (CTW) method, and the renewal entropy estimator. METHODOLOGY: Three new entropy estimators are introduced; two new LZ-based estimators, and the “renewal entropy estimator,” which is tailored to data generated by a binary renewal process. For two of the four LZ-based estimators, a bootstrap procedure is described for evaluating their standard error, and a practical rule of thumb is heuristically derived for selecting the values of their parameters in practice. THEORY: We prove that, unlike their earlier versions, the two new LZ-based estimators are universally consistent, that is, they converge to the entropy rate for every finite-valued, stationary and ergodic process. An effective method is derived for the accurate approximation of the entropy rate of a finite-state hidden Markov model (HMM) with known distribution. Heuristic calculations are presented and approximate formulas are derived for evaluating the bias and the standard error of each estimator. SIMULATION: All estimators are applied to a wide range of data generated by numerous different processes with varying degrees of dependence and memory. The main conclusions drawn from these experiments include: (i) For all estimators considered, the main source of error is the bias. (ii) The CTW method is repeatedly and consistently seen to provide the most accurate results. (iii) The performance of the LZ-based estimators is often comparable to that of the plug-in method. 
(iv) The main drawback of the plug-in method is its computational inefficiency; with small word-lengths it fails to detect longer-range structure in the data, and with longer word-lengths the empirical distribution is severely undersampled, leading to large biases.
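For reference, the plug-in method discussed in conclusion (iv) can be sketched in a few lines: estimate the empirical distribution of overlapping words of a fixed length and divide its Shannon entropy by the word length. The function name and the choice of overlapping windows are assumptions of this sketch, not details confirmed by the abstract.

```python
import math
from collections import Counter


def plugin_entropy_rate(bits, word_length):
    """Plug-in estimate of the entropy rate in bits per symbol.

    Counts overlapping words of `word_length` symbols, computes the Shannon
    entropy of their empirical distribution, and normalizes by word length.
    """
    words = [tuple(bits[i:i + word_length])
             for i in range(len(bits) - word_length + 1)]
    counts = Counter(words)
    total = len(words)
    block_entropy = -sum((c / total) * math.log2(c / total)
                         for c in counts.values())
    return block_entropy / word_length
```

On the strictly alternating sequence `[0, 1] * 50`, whose true entropy rate is zero, word length 1 yields an estimate of 1 bit per symbol because single symbols look like fair coin flips, and longer words bring the estimate down only gradually. This is the short-word-length failure mode the abstract describes; the opposite problem, undersampling, appears once the number of possible words approaches the number of samples.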
Maxwell AFB, Montgomery, Alabama. Revised Uniform Summary of Surface Weather Observations (RUSSWO)
1974-09-19
[Tabulated surface weather observations; values illegible in extraction] ...WIND DIRECTION AND SPEED (FROM HOURLY OBSERVATIONS)...MAXWELL AFB...
Li, Ao; Liu, Zongzhi; Lezon-Geyda, Kimberly; Sarkar, Sudipa; Lannin, Donald; Schulz, Vincent; Krop, Ian; Winer, Eric; Harris, Lyndsay; Tuck, David
2011-01-01
There is an increasing interest in using single nucleotide polymorphism (SNP) genotyping arrays for profiling chromosomal rearrangements in tumors, as they allow simultaneous detection of copy number and loss of heterozygosity with high resolution. Critical issues such as signal baseline shift due to aneuploidy, normal cell contamination, and the presence of GC content bias have been reported to dramatically alter SNP array signals and complicate accurate identification of aberrations in cancer genomes. To address these issues, we propose a novel Global Parameter Hidden Markov Model (GPHMM) to unravel tangled genotyping data generated from tumor samples. In contrast to other HMM methods, a distinct feature of GPHMM is that the issues mentioned above are quantitatively modeled by global parameters and integrated within the statistical framework. We developed an efficient EM algorithm for parameter estimation. We evaluated performance on three data sets and show that GPHMM can correctly identify chromosomal aberrations in tumor samples containing as few as 10% cancer cells. Furthermore, we demonstrated that the estimation of global parameters in GPHMM provides information about the biological characteristics of tumor samples and the quality of genotyping signal from SNP array experiments, which is helpful for data quality control and outlier detection in cohort studies. PMID:21398628
Edwards, Laura J.; Moisés, Abú; Nzaramba, Mathias; Cassimo, Aboobacar; Silva, Laura; Mauricio, Joaquim; Wester, C. William; Vermund, Sten H.; Moon, Troy D.
2015-01-01
Background: Avante Zambézia is an initiative of a Non-Governmental Organization (NGO), Friends in Global Health, LLC (FGH) and the Vanderbilt Institute for Global Health (VIGH) to provide technical assistance to the Mozambican Ministry of Health (MoH) in rural Zambézia Province. Avante Zambézia developed a district level Health Management Mentorship (HMM) program to strengthen health systems in ten of Zambézia’s 17 districts. Our objective was to preliminarily analyze changes in four domains of health system capacity after the HMM’s first year: accounting, Human Resources (HRs), Monitoring and Evaluation (M&E), and transportation management. Methods: Quantitative metrics were developed in each domain. During district visits for weeklong, on-site mentoring, the health management mentoring teams documented each indicator as a success ratio percentage. We analyzed data using linear regressions of each indicator’s mean success ratio across all districts submitting a report over time. Results: Of the four domains, district performance in the accounting domain was the strongest and most sustained. Linear regressions of mean monthly compliance for HR objectives indicated improvement in three of six mean success ratios. The M&E capacity domain showed the least overall improvement. The one indicator analyzed for transportation management suggested progress. Conclusion: Our outcome evaluation demonstrates improvement in health system performance during a HMM initiative. Evaluating which elements of our mentoring program are succeeding in strengthening district level health systems is vital in preparing to transition fiscal and managerial responsibility to local authorities. PMID:26029894
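The trend analysis described in the Methods (a linear regression of each indicator's mean success ratio over time) amounts to an ordinary least-squares fit. The sketch below is a generic version of that calculation; the function name and the example values are hypothetical and do not reproduce the program's data.

```python
def linear_trend(months, success_ratios):
    """Ordinary least-squares fit of success ratio (%) against month index.

    Returns (slope, intercept); a positive slope indicates improvement.
    """
    n = len(months)
    mean_x = sum(months) / n
    mean_y = sum(success_ratios) / n
    sxx = sum((x - mean_x) ** 2 for x in months)
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(months, success_ratios))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept
```

For instance, made-up monthly means of 50, 60, 70 and 80 percent over months 1 to 4 give a slope of 10 percentage points per month.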
Tunable infrared hyperbolic metamaterials with periodic indium-tin-oxide nanorods
Guo, Peijun; Chang, Robert P. H.; Schaller, Richard D.
2017-07-10
Hyperbolic metamaterials (HMMs) are artificially engineered optical media that have been used for light confinement, excited state decay rate engineering, and subwavelength imaging, due to their highly anisotropic permittivity and with it the capability of supporting high-k modes. HMMs in the infrared range can be conceived for additional applications such as free space communication, thermal engineering, and molecular sensing. Here, we demonstrate infrared HMMs comprised of periodic indium-tin-oxide nanorod arrays (ITO-NRAs). We show that the ITO-NRA based HMMs exhibit a stationary epsilon-near-pole resonance in the near-infrared regime that is insensitive to the filling ratio, and a highly tunable epsilon-near-zero resonance in the mid-infrared range depending on the array periodicity. Experimental results are supported by finite-element simulations, in which the ITO-NRAs are treated both explicitly and as an effective hyperbolic medium. Lastly, our work presents a low-loss HMM platform with favorable spectral tunability in the infrared range.
NASA Astrophysics Data System (ADS)
Mukhtar, Husneni; Montgomery, Paul; Gianto; Susanto, K.
2016-01-01
In order to develop image processing that is widely used in geo-processing and analysis, we introduce an alternative technique for the characterization of rock samples. The technique that we have used for characterizing inhomogeneous surfaces is based on Coherence Scanning Interferometry (CSI). An optical probe is first used to scan over the depth of the surface roughness of the sample. Then, to analyse the measured fringe data, we use the Five Sample Adaptive method to obtain quantitative results of the surface shape. To analyse the surface roughness parameters, Hmm and Rq, a new window resizing analysis technique is employed. The results of the morphology and surface roughness analysis show micron and nano-scale information which is characteristic of each rock type and its history. These could be used for mineral identification and studies in rock movement on different surfaces. Image processing is thus used to define the physical parameters of the rock surface.
Chuk, Tim; Chan, Antoni B; Hsiao, Janet H
2017-12-01
The hidden Markov model (HMM)-based approach for eye movement analysis is able to reflect individual differences in both spatial and temporal aspects of eye movements. Here we used this approach to understand the relationship between eye movements during face learning and recognition, and its association with recognition performance. We discovered holistic (i.e., mainly looking at the face center) and analytic (i.e., specifically looking at the two eyes in addition to the face center) patterns during both learning and recognition. Although for both learning and recognition, participants who adopted analytic patterns had better recognition performance than those with holistic patterns, a significant positive correlation between the likelihood of participants' patterns being classified as analytic and their recognition performance was only observed during recognition. Significantly more participants adopted holistic patterns during learning than recognition. Interestingly, about 40% of the participants used different patterns between learning and recognition, and among them 90% switched their patterns from holistic at learning to analytic at recognition. In contrast to the scan path theory, which posits that eye movements during learning have to be recapitulated during recognition for the recognition to be successful, participants who used the same or different patterns during learning and recognition did not differ in recognition performance. The similarity between their learning and recognition eye movement patterns also did not correlate with their recognition performance. These findings suggested that perceptuomotor memory elicited by eye movement patterns during learning does not play an important role in recognition. In contrast, the retrieval of diagnostic information for recognition, such as the eyes for face recognition, is a better predictor for recognition performance. Copyright © 2017 Elsevier Ltd. All rights reserved.
1995-01-01
...expensive) option is to track the mean and variance of each input feature instead of the min and max. Then a sigmoid is the natural choice for a mapping...Scaling Down: Applying Large Vocabulary Hybrid HMM-MLP Methods to Telephone Recognition of Digits and Natural Numbers, Kristine Ma, Nelson Morgan...1 if y_t > 1; y_t + 1 if y_t < 0, where ε_t is uncorrelated Gaussian noise with a variance of σ^2 = 0.01. Figure 2 (left) shows the time series. Figure 2...
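The first fragment above mentions tracking the mean and variance of each input feature and then applying a sigmoid. One way this could look for a single feature is sketched below, using Welford's online update; the class name and the use of the population variance are assumptions, since the surrounding text is not available.

```python
import math


class RunningSigmoidScaler:
    """Tracks the running mean and variance of one feature (Welford's
    method) and maps new values into (0, 1) with a sigmoid of the z-score."""

    def __init__(self):
        self.count = 0
        self.mean = 0.0
        self.m2 = 0.0  # running sum of squared deviations from the mean

    def update(self, x):
        self.count += 1
        delta = x - self.mean
        self.mean += delta / self.count
        self.m2 += delta * (x - self.mean)

    def variance(self):
        return self.m2 / self.count if self.count else 0.0

    def transform(self, x):
        std = math.sqrt(self.variance())
        z = (x - self.mean) / std if std > 0 else 0.0
        return 1.0 / (1.0 + math.exp(-z))
```

Unlike min/max scaling, a single outlier shifts the mean and variance only slightly, and the sigmoid keeps extreme inputs inside the unit interval.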
1980-08-01
...OHIO RIVER BASIN, TROUT RUN, CAMBRIA COUNTY, PENNSYLVANIA, NDI No. PA 00444, PennDER No. 11-17...CAMBRIA COUNTY, COMMONWEALTH OF PENNSYLVANIA, NDI No. PA 00444, PennDER No. 11-17, PHASE I INSPECTION REPORT, NATIONAL DAM SAFETY PROGRAM...Construction History - The dam was designed by Andrew B. Crichton, Civil and Mining Engineer, Johnstown, Pennsylvania. The dam was constructed in 1909 and 1910...
Joint Source-Channel Decoding of Variable-Length Codes with Soft Information: A Survey
NASA Astrophysics Data System (ADS)
Guillemot, Christine; Siohan, Pierre
2005-12-01
Multimedia transmission over time-varying wireless channels presents a number of challenges beyond the capabilities conceived so far for third-generation networks. Efficient quality-of-service (QoS) provisioning for multimedia on these channels may in particular require a loosening and a rethinking of the layer separation principle. In that context, joint source-channel decoding (JSCD) strategies have gained attention as viable alternatives to separate decoding of source and channel codes. A statistical framework based on hidden Markov models (HMM) capturing dependencies between the source and channel coding components sets the foundation for the optimal design of techniques for joint decoding of source and channel codes. The problem has been widely addressed in the research community, by considering both fixed-length codes (FLC) and variable-length source codes (VLC) widely used in compression standards. Joint source-channel decoding of VLC raises specific difficulties because the segmentation of the received bitstream into source symbols is random. This paper surveys recent theoretical and practical advances in the area of JSCD with soft information of VLC-encoded sources. It first describes the main paths followed for designing efficient estimators for VLC-encoded sources, the key component of the JSCD iterative structure. It then presents the main issues involved in the application of the turbo principle to JSCD of VLC-encoded sources as well as the main approaches to source-controlled channel decoding. The survey concludes with performance illustrations from real image and video decoding systems.
Distribution and prediction of catalytic domains in 2-oxoglutarate dependent dioxygenases
2012-01-01
Background: The 2-oxoglutarate dependent superfamily is a diverse group of non-haem dioxygenases, and is present in prokaryotes, eukaryotes, and archaea. The enzymes differ in substrate preference and reaction chemistry, a factor that precludes their classification by homology studies and electronic annotation schemes alone. In this work, I propose and explore the rationale of using substrates to classify structurally similar alpha-ketoglutarate dependent enzymes. Findings: Differential catalysis in phylogenetic clades of 2-OG dependent enzymes is determined by the interactions of a subset of active-site amino acids. Identifying these with existing computational methods is challenging and not feasible for all proteins. A clustering protocol based on validated mechanisms of catalysis of known molecules, in tandem with group-specific hidden Markov model profiles, is able to differentiate and sequester these enzymes. Access to this repository is by a web server that compares user-defined unknown sequences to these pre-defined profiles and outputs a list of predicted catalytic domains. The server is free and is accessible at the following URL (http://comp-biol.theacms.in/H2OGpred.html). Conclusions: The proposed stratification is a novel attempt at classifying and predicting 2-oxoglutarate dependent function. In addition, the server will provide researchers with a tool to compare their data to a comprehensive list of HMM profiles of catalytic domains. This work will aid efforts by investigators to screen and characterize putative 2-OG dependent sequences. The profile database will be updated at regular intervals. PMID:22862831
Short text sentiment classification based on feature extension and ensemble classifier
NASA Astrophysics Data System (ADS)
Liu, Yang; Zhu, Xie
2018-05-01
With the rapid development of Internet social media, mining the emotional tendencies of short texts on the Internet to acquire useful information has attracted the attention of researchers. At present, the commonly used approaches can be divided into rule-based classification and statistical machine learning classification methods. Although micro-blog sentiment analysis has made good progress, shortcomings remain, such as insufficient accuracy and strong dependence in the sentiment classification effect. Aiming at the characteristics of Chinese short texts, such as limited information, sparse features, and diverse expressions, this paper considers expanding the original text by mining related semantic information from comments, forwards and other related information. First, Word2vec is used to compute word similarity in order to extend the feature words. Then, an ensemble classifier composed of SVM, KNN and HMM is used to analyze the sentiment of micro-blog short texts. The experimental results show that the proposed method can make good use of the comment and forwarding information to extend the original features. Compared with the traditional method, the accuracy, recall and F1 value obtained by this method are improved.
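The two steps named in the abstract, similarity-based feature extension and an ensemble decision, can be sketched as follows. The embedding vectors are toy stand-ins for Word2vec output, the similarity threshold is arbitrary, and majority voting is only one plausible way to combine the SVM, KNN and HMM outputs; the abstract does not state the combination rule, so all of these are assumptions.

```python
import math
from collections import Counter


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def extend_features(words, embeddings, threshold=0.8):
    """Append every vocabulary word whose embedding is close to a word
    already present in the short text."""
    extended = list(words)
    for word in words:
        if word not in embeddings:
            continue
        for candidate, vector in embeddings.items():
            if (candidate not in extended
                    and cosine(embeddings[word], vector) >= threshold):
                extended.append(candidate)
    return extended


def majority_vote(predictions):
    """Combine the labels predicted by the individual classifiers."""
    return Counter(predictions).most_common(1)[0][0]
```

With three base classifiers and two sentiment labels a strict majority always exists; with more labels, ties fall to whichever label was counted first.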
Improving Fishing Pattern Detection from Satellite AIS Using Data Mining and Machine Learning.
de Souza, Erico N; Boerder, Kristina; Matwin, Stan; Worm, Boris
2016-01-01
A key challenge in contemporary ecology and conservation is the accurate tracking of the spatial distribution of various human impacts, such as fishing. While coastal fisheries in national waters are closely monitored in some countries, existing maps of fishing effort elsewhere are fraught with uncertainty, especially in remote areas and the High Seas. Better understanding of the behavior of the global fishing fleets is required in order to prioritize and enforce fisheries management and conservation measures worldwide. Satellite-based Automatic Information Systems (S-AIS) are now commonly installed on most ocean-going vessels and have been proposed as a novel tool to explore the movements of fishing fleets in near real time. Here we present approaches to identify fishing activity from S-AIS data for three dominant fishing gear types: trawl, longline and purse seine. Using a large dataset containing worldwide fishing vessel tracks from 2011-2015, we developed three methods to detect and map fishing activities: for trawlers we produced a Hidden Markov Model (HMM) using vessel speed as observation variable. For longliners we have designed a Data Mining (DM) approach using an algorithm inspired from studies on animal movement. For purse seiners a multi-layered filtering strategy based on vessel speed and operation time was implemented. Validation against expert-labeled datasets showed average detection accuracies of 83% for trawler and longliner, and 97% for purse seiner. Our study represents the first comprehensive approach to detect and identify potential fishing behavior for three major gear types operating on a global scale. We hope that this work will enable new efforts to assess the spatial and temporal distribution of global fishing effort and make global fisheries activities transparent to ocean scientists, managers and the public.
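To make the trawler approach concrete, the sketch below decodes a sequence of vessel speeds with a two-state HMM and Gaussian emissions using the Viterbi algorithm. The state names, the speed means and standard deviations (in knots), and the transition probabilities are illustrative assumptions, not the parameters fitted in the study.

```python
import math


def gaussian_logpdf(x, mean, std):
    """Log density of a normal distribution."""
    return -0.5 * math.log(2 * math.pi * std * std) - (x - mean) ** 2 / (2 * std * std)


def viterbi(speeds, states, log_start, log_trans, emission_params):
    """Most likely state sequence for observed speeds.

    emission_params[s] = (mean, std) of speed while in state s.
    """
    n = len(states)
    scores = [log_start[s] + gaussian_logpdf(speeds[0], *emission_params[s])
              for s in range(n)]
    backpointers = []
    for x in speeds[1:]:
        new_scores, pointers = [], []
        for j in range(n):
            best_i = max(range(n), key=lambda i: scores[i] + log_trans[i][j])
            new_scores.append(scores[best_i] + log_trans[best_i][j]
                              + gaussian_logpdf(x, *emission_params[j]))
            pointers.append(best_i)
        scores = new_scores
        backpointers.append(pointers)
    # Trace the best path backwards from the best final state
    state = max(range(n), key=scores.__getitem__)
    path = [state]
    for pointers in reversed(backpointers):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return [states[s] for s in path]


# Hypothetical parameters: slow speeds while fishing, fast while steaming.
STATES = ["fishing", "steaming"]
LOG_START = [math.log(0.5), math.log(0.5)]
LOG_TRANS = [[math.log(0.9), math.log(0.1)],
             [math.log(0.1), math.log(0.9)]]
EMISSIONS = [(3.5, 1.0), (10.0, 2.0)]
```

The high self-transition probability smooths over isolated speed fluctuations, so a single slow reading during transit is less likely to be labeled as fishing than it would be under a per-point speed threshold.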
Pairagon: a highly accurate, HMM-based cDNA-to-genome aligner.
Lu, David V; Brown, Randall H; Arumugam, Manimozhiyan; Brent, Michael R
2009-07-01
The most accurate way to determine the intron-exon structures in a genome is to align spliced cDNA sequences to the genome. Thus, cDNA-to-genome alignment programs are a key component of most annotation pipelines. The scoring system used to choose the best alignment is a primary determinant of alignment accuracy, while heuristics that prevent consideration of certain alignments are a primary determinant of runtime and memory usage. Both accuracy and speed are important considerations in choosing an alignment algorithm, but scoring systems have received much less attention than heuristics. We present Pairagon, a pair hidden Markov model based cDNA-to-genome alignment program, as the most accurate aligner for sequences with high- and low-identity levels. We conducted a series of experiments testing alignment accuracy with varying sequence identity. We first created 'perfect' simulated cDNA sequences by splicing the sequences of exons in the reference genome sequences of fly and human. The complete reference genome sequences were then mutated to various degrees using a realistic mutation simulator and the perfect cDNAs were aligned to them using Pairagon and 12 other aligners. To validate these results with natural sequences, we performed cross-species alignment using orthologous transcripts from human, mouse and rat. We found that aligner accuracy is heavily dependent on sequence identity. For sequences with 100% identity, Pairagon achieved accuracy levels of >99.6%, with one quarter of the errors of any other aligner. Furthermore, for human/mouse alignments, which are only 85% identical, Pairagon achieved 87% accuracy, higher than any other aligner. Pairagon source and executables are freely available at http://mblab.wustl.edu/software/pairagon/
Modelling the molecular composition and nuclear-spin chemistry of collapsing prestellar sources
NASA Astrophysics Data System (ADS)
Hily-Blant, P.; Faure, A.; Rist, C.; Pineau des Forêts, G.; Flower, D. R.
2018-04-01
We study the gravitational collapse of prestellar sources and the associated evolution of their chemical composition. We use the University of Grenoble Alpes Astrochemical Network (UGAN), which includes reactions involving the different nuclear-spin states of H2, H_3^+, and of the hydrides of carbon, nitrogen, oxygen, and sulfur, for reactions involving up to seven protons. In addition, species-to-species rate coefficients are provided for the ortho/para interconversion of the H_3^+ + H2 system and isotopic variants. The composition of the medium is followed from an initial steady state through the early phase of isothermal gravitational collapse. Both the freeze-out of the molecules on to grains and the coagulation of the grains were incorporated in the model. The predicted abundances and column densities of the spin isomers of ammonia and its deuterated forms are compared with those measured recently towards the prestellar cores H-MM1, L16293E, and Barnard B1. We find that gas-phase processes alone account satisfactorily for the observations, without recourse to grain-surface reactions. In particular, our model reproduces both the isotopologue abundance ratios and the ortho:para ratios of NH2D and NHD2 within observational uncertainties. More accurate observations are necessary to distinguish between full scrambling processes - as assumed in our gas-phase network - and direct nucleus- or atom-exchange reactions.
Modelling the molecular composition and nuclear-spin chemistry of collapsing pre-stellar sources
NASA Astrophysics Data System (ADS)
Hily-Blant, P.; Faure, A.; Rist, C.; Pineau des Forêts, G.; Flower, D. R.
2018-07-01
We study the gravitational collapse of pre-stellar sources and the associated evolution of their chemical composition. We use the University of Grenoble Alpes Astrochemical Network (UGAN), which includes reactions involving the different nuclear-spin states of H2, H_3^+, and of the hydrides of carbon, nitrogen, oxygen, and sulphur, for reactions involving up to seven protons. In addition, species-to-species rate coefficients are provided for the ortho/para interconversion of the H_3^+ + H2 system and isotopic variants. The composition of the medium is followed from an initial steady state through the early phase of isothermal gravitational collapse. Both the freeze-out of the molecules on to grains and the coagulation of the grains were incorporated in the model. The predicted abundances and column densities of the spin isomers of ammonia and its deuterated forms are compared with those measured recently towards the pre-stellar cores H-MM1, L16293E, and Barnard B1. We find that gas-phase processes alone account satisfactorily for the observations, without recourse to grain-surface reactions. In particular, our model reproduces both the isotopologue abundance ratios and the ortho:para ratios of NH2D and NHD2 within observational uncertainties. More accurate observations are necessary to distinguish between full scrambling processes - as assumed in our gas-phase network - and direct nucleus- or atom-exchange reactions.
The Dfam database of repetitive DNA families.
Hubley, Robert; Finn, Robert D; Clements, Jody; Eddy, Sean R; Jones, Thomas A; Bao, Weidong; Smit, Arian F A; Wheeler, Travis J
2016-01-04
Repetitive DNA, especially that due to transposable elements (TEs), makes up a large fraction of many genomes. Dfam is an open access database of families of repetitive DNA elements, in which each family is represented by a multiple sequence alignment and a profile hidden Markov model (HMM). The initial release of Dfam, featured in the 2013 NAR Database Issue, contained 1143 families of repetitive elements found in humans, and was used to produce more than 100 Mb of additional annotation of TE-derived regions in the human genome, with improved speed. Here, we describe recent advances, most notably expansion to 4150 total families including a comprehensive set of known repeat families from four new organisms (mouse, zebrafish, fly and nematode). We describe improvements to coverage, and to our methods for identifying and reducing false annotation. We also describe updates to the website interface. The Dfam website has moved to http://dfam.org. Seed alignments, profile HMMs, hit lists and other underlying data are available for download. © The Author(s) 2015. Published by Oxford University Press on behalf of Nucleic Acids Research.
Kinetic Characterization of Nonmuscle Myosin IIB at the Single Molecule Level
Nagy, Attila; Takagi, Yasuharu; Billington, Neil; Sun, Sara A.; Hong, Davin K. T.; Homsher, Earl; Wang, Aibing; Sellers, James R.
2013-01-01
Nonmuscle myosin IIB (NMIIB) is a cytoplasmic myosin, which plays an important role in cell motility by maintaining cortical tension. It forms bipolar thick filaments with ∼14 myosin molecule dimers on each side of the bare zone. Our previous studies showed that NMIIB is a moderately high duty ratio (∼20–25%) motor. The ADP release step (∼0.35 s−1) of NMIIB is only ∼3 times faster than the rate-limiting phosphate release (0.13 ± 0.01 s−1). The aim of this study was to relate the known in vitro kinetic parameters to the results of single molecule experiments and to compare the kinetic and mechanical properties of single- and double-headed myosin fragments and nonmuscle IIB thick filaments. Examination of the kinetics of NMIIB interaction with actin at the single molecule level was accomplished using total internal reflection fluorescence (TIRF) with fluorescence imaging with 1-nm accuracy (FIONA) and dual-beam optical trapping. At a physiological ATP concentration (1 mM), the rate of detachment of the single-headed and double-headed molecules was similar (∼0.4 s−1). Using optical tweezers we found that the power stroke sizes of single- and double-headed heavy meromyosin (HMM) were each ∼6 nm. No signs of processive stepping at the single molecule level were observed in the case of NMIIB-HMM in optical tweezers or TIRF/in vitro motility experiments. In contrast, robust motility of individual fluorescently labeled thick filaments of full-length NMIIB was observed on actin filaments. Our results are in good agreement with the previous steady-state and transient kinetic studies and show that the individual nonprocessive nonmuscle myosin IIB molecules form a highly processive unit when polymerized into filaments. PMID:23148220
Speech endpoint detection with non-language speech sounds for generic speech processing applications
NASA Astrophysics Data System (ADS)
McClain, Matthew; Romanowski, Brian
2009-05-01
Non-language speech sounds (NLSS) are sounds produced by humans that do not carry linguistic information. Examples of these sounds are coughs, clicks, breaths, and filled pauses such as "uh" and "um" in English. NLSS are prominent in conversational speech, but can be a significant source of errors in speech processing applications. Traditionally, these sounds are ignored by speech endpoint detection algorithms, where speech regions are identified in the audio signal prior to processing. The ability to filter NLSS as a pre-processing step can significantly enhance the performance of many speech processing applications, such as speaker identification, language identification, and automatic speech recognition. In order to be used in all such applications, NLSS detection must be performed without the use of language models that provide knowledge of the phonology and lexical structure of speech. This is especially relevant to situations where the languages used in the audio are not known a priori. We present the results of preliminary experiments using data from American and British English speakers, in which segments of audio are classified as language speech sounds (LSS) or NLSS using a set of acoustic features designed for language-agnostic NLSS detection and a hidden Markov model (HMM) to model speech generation. The results of these experiments indicate that the features and model used are capable of detecting certain types of NLSS, such as breaths and clicks, while detection of other types of NLSS such as filled pauses will require future research.
Reductive evolution and the loss of PDC/PAS domains from the genus Staphylococcus
2013-01-01
Background: The Per-Arnt-Sim (PAS) domain represents a ubiquitous structural fold that is involved in bacterial sensing and adaptation systems, including several virulence related functions. Although PAS domains and the subclass of PhoQ-DcuS-CitA (PDC) domains have a common structure, there is limited amino acid sequence similarity. To gain greater insight into the evolution of PDC/PAS domains present in the bacterial kingdom and staphylococci in specific, the PDC/PAS domains from the genomic sequences of 48 bacteria, representing 5 phyla, were identified using the sensitive search method based on HMM-to-HMM comparisons (HHblits). Results: A total of 1,007 PAS domains and 686 PDC domains distributed over 1,174 proteins were identified. For 28 Gram-positive bacteria, the distribution, organization, and molecular evolution of PDC/PAS domains were analyzed in greater detail, with a special emphasis on the genus Staphylococcus. Compared to other bacteria the staphylococci have relatively fewer proteins (6–9) containing PDC/PAS domains. As a general rule, the staphylococcal genomes examined in this study contain a core group of seven PDC/PAS domain-containing proteins consisting of WalK, SrrB, PhoR, ArlS, HssS, NreB, and GdpP. The exceptions to this rule are: 1) S. saprophyticus lacks the core NreB protein; 2) S. carnosus has two additional PAS domain containing proteins; 3) S. epidermidis, S. aureus, and S. pseudintermedius have an additional protein with two PDC domains that is predicted to code for a sensor histidine kinase; 4) S. lugdunensis has an additional PDC containing protein predicted to be a sensor histidine kinase. Conclusions: This comprehensive analysis demonstrates that variation in PDC/PAS domains among bacteria has limited correlations to the genome size or pathogenicity; however, our analysis established that bacteria having a motile phase in their life cycle have significantly more PDC/PAS-containing proteins. 
In addition, our analysis revealed a tremendous amount of variation in the number of PDC/PAS-containing proteins within genera. This variation extended to the Staphylococcus genus, which had between 6 and 9 PDC/PAS proteins and some of these appear to be previously undescribed signaling proteins. This latter point is important because most staphylococcal proteins that contain PDC/PAS domains regulate virulence factor synthesis or antibiotic resistance. PMID:23902280
Reductive evolution and the loss of PDC/PAS domains from the genus Staphylococcus.
Shah, Neethu; Gaupp, Rosmarie; Moriyama, Hideaki; Eskridge, Kent M; Moriyama, Etsuko N; Somerville, Greg A
2013-07-31
The Per-Arnt-Sim (PAS) domain represents a ubiquitous structural fold that is involved in bacterial sensing and adaptation systems, including several virulence related functions. Although PAS domains and the subclass of PhoQ-DcuS-CitA (PDC) domains have a common structure, there is limited amino acid sequence similarity. To gain greater insight into the evolution of PDC/PAS domains present in the bacterial kingdom and staphylococci in specific, the PDC/PAS domains from the genomic sequences of 48 bacteria, representing 5 phyla, were identified using the sensitive search method based on HMM-to-HMM comparisons (HHblits). A total of 1,007 PAS domains and 686 PDC domains distributed over 1,174 proteins were identified. For 28 Gram-positive bacteria, the distribution, organization, and molecular evolution of PDC/PAS domains were analyzed in greater detail, with a special emphasis on the genus Staphylococcus. Compared to other bacteria the staphylococci have relatively fewer proteins (6-9) containing PDC/PAS domains. As a general rule, the staphylococcal genomes examined in this study contain a core group of seven PDC/PAS domain-containing proteins consisting of WalK, SrrB, PhoR, ArlS, HssS, NreB, and GdpP. The exceptions to this rule are: 1) S. saprophyticus lacks the core NreB protein; 2) S. carnosus has two additional PAS domain containing proteins; 3) S. epidermidis, S. aureus, and S. pseudintermedius have an additional protein with two PDC domains that is predicted to code for a sensor histidine kinase; 4) S. lugdunensis has an additional PDC containing protein predicted to be a sensor histidine kinase. This comprehensive analysis demonstrates that variation in PDC/PAS domains among bacteria has limited correlations to the genome size or pathogenicity; however, our analysis established that bacteria having a motile phase in their life cycle have significantly more PDC/PAS-containing proteins. 
In addition, our analysis revealed a tremendous amount of variation in the number of PDC/PAS-containing proteins within genera. This variation extended to the Staphylococcus genus, which had between 6 and 9 PDC/PAS proteins and some of these appear to be previously undescribed signaling proteins. This latter point is important because most staphylococcal proteins that contain PDC/PAS domains regulate virulence factor synthesis or antibiotic resistance.
Web life: The Evil Mad Scientist Project
NASA Astrophysics Data System (ADS)
2009-04-01
What is it? Have you ever tried to electrocute a hot dog? Wondered how to make a robot out of a toothbrush, watch battery and phone-pager motor? Seen a cantaloupe melon and thought, "Hmm, I could make this look like the Death Star from the original Star Wars films"? If you have not, but you would like to - preferably as soon as you can find a pager motor - then this is the site for you. The Evil Mad Scientist Project (EMSP) blog is packed full of ideas for unusual, silly and frequently physics-related creations that bring science out of the laboratory and into kitchens, backyards and tool sheds.
Multi-sensor physical activity recognition in free-living.
Ellis, Katherine; Godbole, Suneeta; Kerr, Jacqueline; Lanckriet, Gert
Physical activity monitoring in free-living populations has many applications for public health research, weight-loss interventions, context-aware recommendation systems and assistive technologies. We present a system for physical activity recognition that is learned from a free-living dataset of 40 women who wore multiple sensors for seven days. The multi-level classification system first learns low-level codebook representations for each sensor and uses a random forest classifier to produce minute-level probabilities for each activity class. Then a higher-level HMM layer learns patterns of transitions and durations of activities over time to smooth the minute-level predictions.
Perturbation theory in the catalytic rate constant of the Henri-Michaelis-Menten enzymatic reaction.
Bakalis, Evangelos; Kosmas, Marios; Papamichael, Emmanouel M
2012-11-01
The Henri-Michaelis-Menten (HMM) mechanism of enzymatic reaction is studied by means of perturbation theory in the reaction rate constant k_2 of product formation. We present analytical solutions that provide the concentrations of the enzyme (E), the substrate (S), as well as those of the enzyme-substrate complex (C), and the product (P) as functions of time. For k_2 small compared to k_{-1}, we properly describe the entire enzymatic activity from the beginning of the reaction up to longer times without imposing extra conditions on the initial concentrations E_0 and S_0, which can be comparable or much different.
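For orientation, the mechanism in this abstract, E + S <-> C -> E + P, can also be followed numerically. The sketch below is not the authors' perturbation solution: it integrates the standard mass-action equations with a fixed-step fourth-order Runge-Kutta scheme. The forward binding constant `k1`, the function names, and all numerical values are illustrative assumptions; only the species E, S, C, P and the constants for unbinding and product formation come from the abstract.

```python
def rates(state, k1, k_minus1, k2):
    """Mass-action right-hand side for E + S <-> C -> E + P."""
    e, s, c, p = state
    binding = k1 * e * s        # E + S -> C
    unbinding = k_minus1 * c    # C -> E + S
    catalysis = k2 * c          # C -> E + P
    return (
        -binding + unbinding + catalysis,  # dE/dt
        -binding + unbinding,              # dS/dt
        binding - unbinding - catalysis,   # dC/dt
        catalysis,                         # dP/dt
    )


def integrate(e0, s0, k1, k_minus1, k2, t_end, steps):
    """Return (E, S, C, P) at time t_end, starting with no complex and no product."""
    state = (e0, s0, 0.0, 0.0)
    dt = t_end / steps

    def shifted(base, slope, h):
        # Move every concentration a distance h along the given slope.
        return tuple(x + h * d for x, d in zip(base, slope))

    for _ in range(steps):
        a = rates(state, k1, k_minus1, k2)
        b = rates(shifted(state, a, dt / 2), k1, k_minus1, k2)
        c = rates(shifted(state, b, dt / 2), k1, k_minus1, k2)
        d = rates(shifted(state, c, dt), k1, k_minus1, k2)
        state = tuple(
            x + dt / 6 * (ka + 2 * kb + 2 * kc + kd)
            for x, ka, kb, kc, kd in zip(state, a, b, c, d)
        )
    return state


if __name__ == "__main__":
    # Illustrative values only: k2 is chosen small compared to k_minus1.
    e, s, c, p = integrate(e0=1.0, s0=2.0, k1=1.0, k_minus1=1.0, k2=0.1,
                           t_end=10.0, steps=10_000)
    print(f"E={e:.4f} S={s:.4f} C={c:.4f} P={p:.4f}")
```

Two conservation laws give a quick sanity check on any such integration: E + C stays equal to E_0, and S + C + P stays equal to S_0.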
Kao, Hui-Ju; Weng, Shun-Long; Huang, Kai-Yao; Kaunang, Fergie Joanda; Hsu, Justin Bo-Kai; Huang, Chien-Hsun; Lee, Tzong-Yi
2017-12-21
Carbonylation, which takes place through oxidation of reactive oxygen species (ROS) on specific residues, is an irreversibly oxidative modification of proteins. It has been reported that the carbonylation is related to a number of metabolic or aging diseases including diabetes, chronic lung disease, Parkinson's disease, and Alzheimer's disease. Due to the lack of computational methods dedicated to exploring motif signatures of protein carbonylation sites, we were motivated to exploit an iterative statistical method to characterize and identify carbonylated sites with motif signatures. By manually curating experimental data from research articles, we obtained 332, 144, 135, and 140 verified substrate sites for K (lysine), R (arginine), T (threonine), and P (proline) residues, respectively, from 241 carbonylated proteins. In order to examine the informative attributes for classifying between carbonylated and non-carbonylated sites, multifarious features including composition of twenty amino acids (AAC), composition of amino acid pairs (AAPC), position-specific scoring matrix (PSSM), and positional weighted matrix (PWM) were investigated in this study. Additionally, in an attempt to explore the motif signatures of carbonylation sites, an iterative statistical method was adopted to detect statistically significant dependencies of amino acid compositions between specific positions around substrate sites. Profile hidden Markov model (HMM) was then utilized to train a predictive model from each motif signature. Moreover, based on the method of support vector machine (SVM), we adopted it to construct an integrative model by combining the values of bit scores obtained from profile HMMs. The combinatorial model could provide an enhanced performance with evenly predictive sensitivity and specificity in the evaluation of cross-validation and independent testing. 
This study provides a new scheme for exploring potential motif signatures at substrate sites of protein carbonylation. The usefulness of the revealed motifs in the identification of carbonylated sites is demonstrated by their effective performance in cross-validation and independent testing. Finally, these substrate motifs were adopted to build an available online resource (MDD-Carb, http://csb.cse.yzu.edu.tw/MDDCarb/) and are also anticipated to facilitate the study of large-scale carbonylated proteomes.
Jia, Lei; Li, Lin; Gui, Tao; Liu, Siyang; Li, Hanping; Han, Jingwan; Guo, Wei; Liu, Yongjian; Li, Jingyun
2016-09-21
With increasing data on HIV-1, a more relevant molecular model describing mechanism details of HIV-1 genetic recombination usually requires upgrades. Currently an incomplete structural understanding of the copy choice mechanism along with several other issues in the field that lack elucidation led us to perform an analysis of the correlation between breakpoint distributions and (1) the probability of base pairing, and (2) intersubtype genetic similarity to further explore structural mechanisms. Near full length sequences of URFs from Asia, Europe, and Africa (one sequence/patient), and representative sequences of worldwide CRFs were retrieved from the Los Alamos HIV database. Their recombination patterns were analyzed by jpHMM in detail. Then the relationships between breakpoint distributions and (1) the probability of base pairing, and (2) intersubtype genetic similarities were investigated. Pearson correlation test showed that all URF groups and the CRF group exhibit the same breakpoint distribution pattern. Additionally, the Wilcoxon two-sample test indicated a significant and inexplicable limitation of recombination in regions with high pairing probability. These regions have been found to be strongly conserved across distinct biological states (i.e., strong intersubtype similarity), and genetic similarity has been determined to be a very important factor promoting recombination. Thus, the results revealed an unexpected disagreement between intersubtype similarity and breakpoint distribution, which were further confirmed by genetic similarity analysis. Our analysis reveals a critical conflict between results from natural HIV-1 isolates and those from HIV-1-based assay vectors in which genetic similarity has been shown to be a very critical factor promoting recombination. These results indicate the region with high-pairing probabilities may be a more fundamental factor affecting HIV-1 recombination than sequence similarity in natural HIV-1 infections. 
Our findings will be relevant in furthering the understanding of HIV-1 recombination mechanisms.
NASA Astrophysics Data System (ADS)
Kayasith, Prakasith; Theeramunkong, Thanaruk
It is a tedious and subjective task to measure the severity of dysarthria by manually evaluating a speaker's speech using available standard assessment methods based on human perception. This paper presents an automated approach to assess the speech quality of a dysarthric speaker with cerebral palsy. With the consideration of two complementary factors, speech consistency and speech distinction, a speech quality indicator called speech clarity index (Ψ) is proposed as a measure of the speaker's ability to produce a consistent speech signal for a certain word and distinguishable speech signals for different words. As an application, it can be used to assess speech quality and forecast the speech recognition rate of speech made by an individual dysarthric speaker before actual exhaustive implementation of an automatic speech recognition system for the speaker. The effectiveness of Ψ as a speech recognition rate predictor is evaluated by rank-order inconsistency, correlation coefficient, and root-mean-square of difference. The evaluations were done by comparing its predicted recognition rates with those predicted by the standard methods, the articulatory and intelligibility tests, based on two recognition systems (HMM and ANN). The results show that Ψ is a promising indicator for predicting the recognition rate of dysarthric speech. All experiments were done on a speech corpus composed of speech data from eight normal speakers and eight dysarthric speakers.
Transmembrane Topology and Signal Peptide Prediction Using Dynamic Bayesian Networks
Reynolds, Sheila M.; Käll, Lukas; Riffle, Michael E.; Bilmes, Jeff A.; Noble, William Stafford
2008-01-01
Hidden Markov models (HMMs) have been successfully applied to the tasks of transmembrane protein topology prediction and signal peptide prediction. In this paper we expand upon this work by making use of the more powerful class of dynamic Bayesian networks (DBNs). Our model, Philius, is inspired by a previously published HMM, Phobius, and combines a signal peptide submodel with a transmembrane submodel. We introduce a two-stage DBN decoder that combines the power of posterior decoding with the grammar constraints of Viterbi-style decoding. Philius also provides protein type, segment, and topology confidence metrics to aid in the interpretation of the predictions. We report a relative improvement of 13% over Phobius in full-topology prediction accuracy on transmembrane proteins, and a sensitivity and specificity of 0.96 in detecting signal peptides. We also show that our confidence metrics correlate well with the observed precision. In addition, we have made predictions on all 6.3 million proteins in the Yeast Resource Center (YRC) database. This large-scale study provides an overall picture of the relative numbers of proteins that include a signal-peptide and/or one or more transmembrane segments as well as a valuable resource for the scientific community. All DBNs are implemented using the Graphical Models Toolkit. Source code for the models described here is available at http://noble.gs.washington.edu/proj/philius. A Philius Web server is available at http://www.yeastrc.org/philius, and the predictions on the YRC database are available at http://www.yeastrc.org/pdr. PMID:18989393
Cuadrat, Rafael R. C.; Cury, Juliano C.; Dávila, Alberto M. R.
2015-01-01
Marine environments harbor a wide range of microorganisms from the three domains of life. These microorganisms have great potential to enable discovery of new enzymes and bioactive compounds for industrial use. However, only ~1% of microorganisms from the environment can currently be identified through cultured isolates, limiting the discovery of new compounds. To overcome this limitation, a metagenomics approach has been widely adopted for biodiversity studies on samples from marine environments. In this study, we screened metagenomes in order to estimate the potential for new natural compound synthesis mediated by diversity in the Polyketide Synthase (PKS) and Nonribosomal Peptide Synthetase (NRPS) genes. The samples were collected from the Praia dos Anjos (Angel’s Beach) surface water, Arraial do Cabo (Rio de Janeiro state, Brazil), an environment affected by upwelling. In order to evaluate the potential for screening natural products in Arraial do Cabo samples, we used KS (keto-synthase) and C (condensation) domains (from PKS and NRPS, respectively) to build hidden Markov models (HMMs). From both samples, a total of 84 KS and 46 C novel domain sequences were obtained, showing the potential of this environment for the discovery of new genes of biotechnological interest. These domains were classified by phylogenetic analysis and this was the first study conducted to screen PKS and NRPS genes in an upwelling-affected sample. PMID:26633360
Domain fusion analysis by applying relational algebra to protein sequence and domain databases
Truong, Kevin; Ikura, Mitsuhiko
2003-01-01
Background: Domain fusion analysis is a useful method to predict functionally linked proteins that may be involved in direct protein-protein interactions or in the same metabolic or signaling pathway. As separate domain databases like BLOCKS, PROSITE, Pfam, SMART, PRINTS-S, ProDom, TIGRFAMs, and amalgamated domain databases like InterPro continue to grow in size and quality, a computational method to perform domain fusion analysis that leverages these efforts will become increasingly powerful. Results: This paper proposes a computational method employing relational algebra to find domain fusions in protein sequence databases. The feasibility of this method was illustrated on the SWISS-PROT+TrEMBL sequence database using domain predictions from the Pfam HMM (hidden Markov model) database. We identified 235 and 189 putative functionally linked protein partners in H. sapiens and S. cerevisiae, respectively. From scientific literature, we were able to confirm many of these functional linkages, while the remainder offer testable experimental hypotheses. Results can be viewed at . Conclusion: As the analysis can be computed quickly on any relational database that supports standard SQL (structured query language), it can be dynamically updated along with the sequence and domain databases, thereby improving the quality of predictions over time. PMID:12734020
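To make the relational idea concrete, here is a minimal sketch of a domain-fusion query using Python's built-in `sqlite3` module. It is an illustration under stated assumptions, not the paper's schema or query: the single table `hit(protein, domain)`, the function name `find_domain_fusions`, and the toy identifiers are all hypothetical. The query looks for two distinct proteins that each carry one of a pair of domains, where both domains also occur together in a third, composite protein.

```python
import sqlite3


def find_domain_fusions(hits):
    """Return (protein_a, protein_b, composite) rows for candidate fusion events.

    `hits` is an iterable of (protein, domain) pairs, e.g. parsed from
    domain predictions. The table layout is an assumption for this sketch.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE hit (protein TEXT, domain TEXT)")
    conn.executemany("INSERT INTO hit VALUES (?, ?)", hits)
    query = """
        SELECT DISTINCT a.protein, b.protein, ca.protein
        FROM hit AS a
        JOIN hit AS b  ON a.domain < b.domain          -- each domain pair once
        JOIN hit AS ca ON ca.domain = a.domain         -- composite has domain A
        JOIN hit AS cb ON cb.domain = b.domain
                      AND cb.protein = ca.protein      -- and also domain B
        WHERE a.protein <> b.protein
          AND a.protein <> ca.protein
          AND b.protein <> ca.protein
          -- the two partners must not themselves contain the other domain
          AND NOT EXISTS (SELECT 1 FROM hit AS x
                          WHERE x.protein = a.protein AND x.domain = b.domain)
          AND NOT EXISTS (SELECT 1 FROM hit AS y
                          WHERE y.protein = b.protein AND y.domain = a.domain)
        ORDER BY 1, 2, 3
    """
    rows = conn.execute(query).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    toy_hits = [
        ("P1", "D1"),             # partner carrying only D1
        ("P2", "D2"),             # partner carrying only D2
        ("F", "D1"), ("F", "D2"), # composite protein carrying both
        ("P3", "D3"),             # unrelated protein
    ]
    print(find_domain_fusions(toy_hits))
```

Because the whole analysis is a self-join on one table, rerunning it after the domain table is refreshed is cheap, which matches the abstract's point about updating predictions alongside the underlying databases.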
Caracheo, Barak F.; Emberly, Eldon; Hadizadeh, Shirin; Hyman, James M.; Seamans, Jeremy K.
2013-01-01
Foraging typically involves two distinct phases, an exploration phase where an organism explores its local environment in search of needed resources and an exploitation phase where a discovered resource is consumed. The behavior and cognitive requirements of exploration and exploitation are quite different and yet organisms can quickly and efficiently switch between them many times during a foraging bout. The present study investigated neural activity state dynamics in the anterior cingulate sub-region of the rat medial prefrontal cortex (mPFC) when a reliable food source was introduced into an environment. Distinct and largely independent states were detected using a Hidden Markov Model (HMM) when food was present or absent in the environment. Measures of neural entropy or complexity decreased when rats went from exploring the environment to exploiting a reliable food source. Exploration in the absence of food was associated with many weak activity states, while bouts of food consumption were characterized by fewer stronger states. Widespread activity state changes in the mPFC may help to inform foraging decisions and focus behavior on what is currently most prominent or valuable in the environment. PMID:23745102
TagDust2: a generic method to extract reads from sequencing data.
Lassmann, Timo
2015-01-28
Arguably the most basic step in the analysis of next generation sequencing (NGS) data involves the extraction of mappable reads from the raw reads produced by sequencing instruments. The presence of barcodes, adaptors and artifacts subject to sequencing errors makes this step non-trivial. Here I present TagDust2, a generic approach utilizing a library of hidden Markov models (HMM) to accurately extract reads from a wide array of possible read architectures. TagDust2 extracts more reads of higher quality compared to other approaches. Processing of multiplexed single, paired end and libraries containing unique molecular identifiers is fully supported. Two additional post processing steps are included to exclude known contaminants and filter out low complexity sequences. Finally, TagDust2 can automatically detect the library type of sequenced data from a predefined selection. Taken together TagDust2 is a feature rich, flexible and adaptive solution to go from raw to mappable NGS reads in a single step. The ability to recognize and record the contents of raw reads will help to automate and demystify the initial, and often poorly documented, steps in NGS data analysis pipelines. TagDust2 is freely available at: http://tagdust.sourceforge.net.
Bayesian switching factor analysis for estimating time-varying functional connectivity in fMRI.
Taghia, Jalil; Ryali, Srikanth; Chen, Tianwen; Supekar, Kaustubh; Cai, Weidong; Menon, Vinod
2017-07-15
There is growing interest in understanding the dynamical properties of functional interactions between distributed brain regions. However, robust estimation of temporal dynamics from functional magnetic resonance imaging (fMRI) data remains challenging due to limitations in extant multivariate methods for modeling time-varying functional interactions between multiple brain areas. Here, we develop a Bayesian generative model for fMRI time-series within the framework of hidden Markov models (HMMs). The model is a dynamic variant of the static factor analysis model (Ghahramani and Beal, 2000). We refer to this model as Bayesian switching factor analysis (BSFA) as it integrates factor analysis into a generative HMM in a unified Bayesian framework. In BSFA, brain dynamic functional networks are represented by latent states which are learnt from the data. Crucially, BSFA is a generative model which estimates the temporal evolution of brain states and transition probabilities between states as a function of time. An attractive feature of BSFA is the automatic determination of the number of latent states via Bayesian model selection arising from penalization of excessively complex models. Key features of BSFA are validated using extensive simulations on carefully designed synthetic data. We further validate BSFA using fingerprint analysis of multisession resting-state fMRI data from the Human Connectome Project (HCP). Our results show that modeling temporal dependencies in the generative model of BSFA results in improved fingerprinting of individual participants. Finally, we apply BSFA to elucidate the dynamic functional organization of the salience, central-executive, and default mode networks, three core neurocognitive systems with a central role in cognitive and affective information processing (Menon, 2011). 
Across two HCP sessions, we demonstrate a high level of dynamic interactions between these networks and determine that the salience network has the highest temporal flexibility among the three networks. Our proposed methods provide a novel and powerful generative model for investigating dynamic brain connectivity. Copyright © 2017 Elsevier Inc. All rights reserved.
Bayesian Estimation and Inference Using Stochastic Electronics
Thakur, Chetan Singh; Afshar, Saeed; Wang, Runchun M.; Hamilton, Tara J.; Tapson, Jonathan; van Schaik, André
2016-01-01
In this paper, we present the implementation of two types of Bayesian inference problems to demonstrate the potential of building probabilistic algorithms in hardware using a single set of building blocks with the ability to perform these computations in real time. The first implementation, referred to as the BEAST (Bayesian Estimation and Stochastic Tracker), demonstrates a simple problem where an observer uses an underlying Hidden Markov Model (HMM) to track a target in one dimension. In this implementation, sensors make noisy observations of the target position at discrete time steps. The tracker learns the transition model for target movement, and the observation model for the noisy sensors, and uses these to estimate the target position by solving the Bayesian recursive equation online. We show the tracking performance of the system and demonstrate how it can learn the observation model, the transition model, and the external distractor (noise) probability interfering with the observations. In the second implementation, referred to as the Bayesian INference in DAG (BIND), we show how inference can be performed in a Directed Acyclic Graph (DAG) using stochastic circuits. We show how these building blocks can be easily implemented using simple digital logic gates. An advantage of the stochastic electronic implementation is that it is robust to certain types of noise, which may become an issue in integrated circuit (IC) technology with feature sizes in the order of tens of nanometers due to their low noise margin, the effect of high-energy cosmic rays and the low supply voltage. In our framework, the flipping of random individual bits would not affect the system performance because information is encoded in a bit stream. PMID:27047326
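The "Bayesian recursive equation" mentioned above can be sketched in software as a predict/update loop over a one-dimensional grid of candidate positions. This is a plain floating-point illustration, not the stochastic-circuit implementation the abstract describes, and it uses fixed models rather than learned ones. The motion model (stay or step one cell, with probability mass kept in place at the ends of the grid), the sensor model (`p_hit` for the true cell, the remainder spread evenly over the others), and all function and parameter names are assumptions made for the example.

```python
def predict(belief, p_stay, p_step):
    """Push the belief through the motion model (p_stay + 2 * p_step == 1)."""
    n = len(belief)
    out = [0.0] * n
    for i, mass in enumerate(belief):
        out[i] += p_stay * mass
        for j in (i - 1, i + 1):
            if 0 <= j < n:
                out[j] += p_step * mass
            else:
                out[i] += p_step * mass  # cannot leave the grid: mass stays put
    return out


def update(prior, observed, p_hit):
    """Weight the prior by the sensor likelihood and renormalise."""
    n = len(prior)
    p_miss = (1.0 - p_hit) / (n - 1)  # wrong readings spread over other cells
    weighted = [(p_hit if i == observed else p_miss) * p
                for i, p in enumerate(prior)]
    total = sum(weighted)
    return [w / total for w in weighted]


def track(observations, n_cells, p_stay=0.6, p_step=0.2, p_hit=0.8):
    """Run the recursion from a uniform prior; return final belief and MAP estimates."""
    belief = [1.0 / n_cells] * n_cells
    estimates = []
    for z in observations:
        belief = update(predict(belief, p_stay, p_step), z, p_hit)
        estimates.append(max(range(n_cells), key=belief.__getitem__))
    return belief, estimates


if __name__ == "__main__":
    final_belief, path = track([2, 2, 3, 3, 4], n_cells=10)
    print(path, round(max(final_belief), 3))
```

Each step needs only the previous belief and the newest observation, which is what makes the recursion suitable for online use.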
Affective State Level Recognition in Naturalistic Facial and Vocal Expressions.
Meng, Hongying; Bianchi-Berthouze, Nadia
2014-03-01
Naturalistic affective expressions change at a rate much slower than the typical rate at which video or audio is recorded. This increases the probability that consecutive recorded instants of expressions represent the same affective content. In this paper, we exploit such a relationship to improve the recognition performance of continuous naturalistic affective expressions. Using datasets of naturalistic affective expressions (AVEC 2011 audio and video dataset, PAINFUL video dataset) continuously labeled over time and over different dimensions, we analyze the transitions between levels of those dimensions (e.g., transitions in pain intensity level). We use an information theory approach to show that the transitions occur very slowly and hence suggest modeling them as first-order Markov models. The dimension levels are considered to be the hidden states in the Hidden Markov Model (HMM) framework. Their discrete transition and emission matrices are trained by using the labels provided with the training set. The recognition problem is converted into a best path-finding problem to obtain the best hidden states sequence in HMMs. This is a key difference from previous use of HMMs as classifiers. Modeling of the transitions between dimension levels is integrated in a multistage approach, where the first level performs a mapping between the affective expression features and a soft decision value (e.g., an affective dimension level), and further classification stages are modeled as HMMs that refine that mapping by taking into account the temporal relationships between the output decision labels. The experimental results for each of the unimodal datasets show overall performance to be significantly above that of a standard classification system that does not take into account temporal relationships. In particular, the results on the AVEC 2011 audio dataset outperform all other systems presented at the international competition.
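The two ingredients described here, discrete matrices estimated from labels and a best-path search over hidden states, can be sketched as follows. This is a generic illustration, not the authors' system: the add-one smoothing, the uniform start distribution, the function names, and the toy sequences are assumptions. `estimate_matrix` counts (row, column) pairs, so the same helper serves for transitions (level at t, level at t+1) and emissions (true level, first-stage decision); `viterbi` then finds the most probable level sequence for a series of first-stage decisions.

```python
import math


def estimate_matrix(pairs, n_rows, n_cols, smoothing=1.0):
    """Row-normalised counts of (row, col) pairs; smoothing avoids zero entries."""
    counts = [[smoothing] * n_cols for _ in range(n_rows)]
    for r, c in pairs:
        counts[r][c] += 1.0
    return [[v / sum(row) for v in row] for row in counts]


def viterbi(obs, start, trans, emit):
    """Most probable hidden-state path for `obs`, computed in log space.

    All probabilities used must be strictly positive (math.log(0) raises).
    """
    n = len(start)
    score = [math.log(start[s]) + math.log(emit[s][obs[0]]) for s in range(n)]
    back = []
    for o in obs[1:]:
        prev = score
        score, pointers = [], []
        for s in range(n):
            # Best predecessor of state s at this time step.
            best = max(range(n), key=lambda q: prev[q] + math.log(trans[q][s]))
            pointers.append(best)
            score.append(prev[best] + math.log(trans[best][s]) + math.log(emit[s][o]))
        back.append(pointers)
    # Trace back from the best final state.
    state = max(range(n), key=score.__getitem__)
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path


if __name__ == "__main__":
    # Toy training data: labelled levels and the noisy first-stage decisions.
    levels = [0, 0, 0, 0, 1, 1, 1, 1, 0, 0, 0, 0]
    decisions = [0, 0, 1, 0, 1, 1, 0, 1, 0, 0, 0, 0]
    trans = estimate_matrix(zip(levels, levels[1:]), 2, 2)
    emit = estimate_matrix(zip(levels, decisions), 2, 2)
    print(viterbi(decisions, [0.5, 0.5], trans, emit))
```

When levels persist far longer than the sampling interval, the self-transition probabilities dominate and isolated contradictory decisions tend to be overruled by their neighbours.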
Bayesian Estimation and Inference Using Stochastic Electronics.
Thakur, Chetan Singh; Afshar, Saeed; Wang, Runchun M; Hamilton, Tara J; Tapson, Jonathan; van Schaik, André
2016-01-01
In this paper, we present the implementation of two types of Bayesian inference problems to demonstrate the potential of building probabilistic algorithms in hardware using a single set of building blocks with the ability to perform these computations in real time. The first implementation, referred to as the BEAST (Bayesian Estimation and Stochastic Tracker), demonstrates a simple problem where an observer uses an underlying Hidden Markov Model (HMM) to track a target in one dimension. In this implementation, sensors make noisy observations of the target position at discrete time steps. The tracker learns the transition model for target movement, and the observation model for the noisy sensors, and uses these to estimate the target position by solving the Bayesian recursive equation online. We show the tracking performance of the system and demonstrate how it can learn the observation model, the transition model, and the external distractor (noise) probability interfering with the observations. In the second implementation, referred to as the Bayesian INference in DAG (BIND), we show how inference can be performed in a Directed Acyclic Graph (DAG) using stochastic circuits. We show how these building blocks can be easily implemented using simple digital logic gates. An advantage of the stochastic electronic implementation is that it is robust to certain types of noise, which may become an issue in integrated circuit (IC) technology with feature sizes on the order of tens of nanometers due to their low noise margin, the effect of high-energy cosmic rays and the low supply voltage. In our framework, the flipping of random individual bits would not affect the system performance because information is encoded in a bit stream.
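The Bayesian recursion used for one-dimensional tracking can be illustrated in software. The sketch below is a plain floating-point version of the predict/update step on a small grid of positions; it is not the stochastic-circuit implementation described in the abstract, and it uses fixed, hand-picked models rather than learned ones. The random-walk motion model, the sensor model with uniformly spread distractor readings, and the parameters `p_stay` and `p_hit` are assumptions chosen for illustration.

```python
def random_walk_transition(n, p_stay=0.6):
    """Hypothetical motion model: stay put, or step to an adjacent cell (n >= 2)."""
    matrix = [[0.0] * n for _ in range(n)]
    for i in range(n):
        neighbors = [j for j in (i - 1, i + 1) if 0 <= j < n]
        matrix[i][i] = p_stay
        for j in neighbors:
            matrix[i][j] = (1.0 - p_stay) / len(neighbors)
    return matrix

def sensor_likelihood(n, z, p_hit=0.7):
    """Hypothetical sensor: P(z | x) is p_hit at the true cell, spread evenly elsewhere."""
    return [p_hit if x == z else (1.0 - p_hit) / (n - 1) for x in range(n)]

def bayes_filter_step(belief, transition, likelihood):
    """One step of the recursion: predict with the transition model, then update."""
    n = len(belief)
    predicted = [sum(belief[i] * transition[i][j] for i in range(n)) for j in range(n)]
    unnormalized = [predicted[j] * likelihood[j] for j in range(n)]
    total = sum(unnormalized)
    return [u / total for u in unnormalized]

def track(observations, n):
    """Run the filter online and return the final belief and per-step position estimates."""
    belief = [1.0 / n] * n                      # uniform prior over positions
    transition = random_walk_transition(n)
    estimates = []
    for z in observations:
        belief = bayes_filter_step(belief, transition, sensor_likelihood(n, z))
        estimates.append(max(range(n), key=lambda x: belief[x]))
    return belief, estimates

belief, estimates = track([2, 2, 0, 2], n=5)    # estimates → [2, 2, 2, 2]
```

With these assumed models, the single distractor reading at cell 0 does not move the estimate, because the motion model makes a two-cell jump impossible in one step and the prior mass at cell 2 outweighs the one conflicting observation.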
2017-10-06
>> HOUSTON, WE HAVE A PODCAST. WELCOME TO THE OFFICIAL PODCAST OF THE NASA JOHNSON SPACE CENTER, EPISODE 13: “BEFORE HIS FIRST FLIGHT.” I’M GARY JORDAN AND I’LL BE YOUR HOST TODAY. SO THIS IS THE PODCAST WHERE WE BRING IN THE EXPERTS, LIKE NASA SCIENTISTS, ENGINEERS, SOMETIMES EVEN ASTRONAUTS, AND THEY ALL TELL YOU THE COOLEST THINGS GOING ON HERE AT NASA. SO TODAY, WE’RE TALKING WITH MARK VANDE HEI. HE’S A U.S. ASTRONAUT HERE AT THE JOHNSON SPACE CENTER IN HOUSTON, TEXAS, AND HE JUST LAUNCHED TO THE INTERNATIONAL SPACE STATION ON SEPTEMBER 12th, 2017 TO GO TO SPACE FOR THE VERY FIRST TIME. WE HAD A GREAT DISCUSSION ABOUT HIS EXPECTATIONS FOR FLYING TO SPACE AND SOME OF THE WORK AND HIS TRAINING THAT HE HAD TO GO THROUGH TO GET READY FOR HIS VOYAGE TO THE STATION. SO WITH NO FURTHER DELAY, LET’S GO LIGHTSPEED AND JUMP RIGHT AHEAD TO OUR TALK WITH MR. MARK VANDE HEI. ENJOY. [ MUSIC ] >> T MINUS FIVE SECONDS AND COUNTING. MARK. [ INDISTINCT RADIO CHATTER ] >> HOUSTON, WE HAVE A PODCAST. [ MUSIC ] >> ALL RIGHT, WELL, THANKS FOR COMING TODAY, MARK. I KNOW YOU’RE VERY BUSY, ESPECIALLY COMING SO CLOSE TO YOUR LAUNCH DATE. SO THAT’S SEPTEMBER AGAIN, RIGHT? >> IT IS SEPTEMBER 13th. >> IT IS SEPTEMBER, OKAY. SO THAT’S WITH-- NOW, IT’S KIND OF CHANGED UP A BIT, RIGHT? SO NOW WE’RE TALKING-- YOU’RE LAUNCHING WITH ALEXANDER AND JOE, RIGHT? >> THAT’S CORRECT. >> ALEXANDER MISURKIN AND JOE ACABA. SO, I MEAN, THIS IS YOUR VERY FIRST FLIGHT COMING UP SOON, SO YOU’VE BEEN BUSY TRAINING FOR YEARS. I MEAN, YOU WERE SELECTED IN 2009, IF I’M NOT MISTAKEN, RIGHT? >> THAT’S CORRECT. >> THERE’S A LOT OF TRAINING TO BE HAD SO, I MEAN, LET’S TALK ABOUT SOME OF THOSE THINGS. LIKE, WHAT WERE YOUR-- WHAT ARE YOUR EXPECTATIONS AND WHAT ARE YOU PREPARING FOR REALLY? I MEAN, WHAT DOES AN ASTRONAUT NEED TO KNOW BEFORE THEY LAUNCH? >> SO, THE PRIMARY THING WE NEED TO KNOW IS HOW TO-- I WOULD SAY THE PRIMARY THING WE NEED TO KNOW IS HOW TO FOLLOW INSTRUCTIONS. >> ALL RIGHT. 
>> BECAUSE WE REALLY ARE SERVING AS THE EYES AND HANDS OF A LOT OF OTHER PEOPLE THAT AREN’T THERE WITH US BUT ARE ABLE TO SUPPORT US. >> MM-HMM. >> SO THAT’S THE PRIMARY THING. YOU ALSO NEED TO KNOW HOW TO WORK WELL WITH THE OTHER PEOPLE THAT YOU’RE LIVING WITH. >> THAT’S RIGHT. >> AND MAKE SURE YOU TAKE CARE OF EACH OTHER, MAKE SURE THAT EVERYTHING’S FULLY FUNCTIONAL, AND THEN AFTER THAT I WOULD SAY WE HAVE TO HAVE ALL THE TECHNICAL SKILLS TO DO OUR JOB THAT ARE OPERATE THE SCIENCE EXPERIMENTS AND BE ABLE TO KEEP THE SPACE STATION ACTUALLY RUNNING. >> NICE. NOW, I MEAN, SO WE TALKED A LITTLE BIT ON A PREVIOUS EPISODE WITH RANDY BRESNIK ABOUT SOME OF THE THINGS YOU HAVE TO LEARN, BUT JUST LIKE AN OVERVIEW OF SOME OF THE THINGS, LIKE, IN TERMS OF KNOWING WHAT TO DO ON THE STATION. >> MM-HMM. >> YOU’RE TALKING ALL THE DIFFERENT SYSTEMS, RIGHT? SO, KOMRADE DESCRIBED MORE FIXING THE TOILET. >> YEAH, YEAH. >> AND YOU KNOW, LEARNING HOW TO DO AN EVA AND EVERYTHING IN BETWEEN. >> YEAH. >> SO IS THAT KIND OF WHAT YOU’VE BEEN DOING OVER THE PAST-- >> ABSOLUTELY. I’VE GOT-- KOMRADE’S GOING TO BE THE COMMANDER SO THERE’S SOME-- CERTAINLY SOME ADDITIONAL THINGS HE’S GOT TO LEARN. >> OKAY. >> BUT, BY AND LARGE, THE CREW MEMBERS ON THE SPACE STATION, WHEN THERE’S NOT AN EMERGENCY TAKING PLACE, WE’RE ALL KIND OF EQUAL. >> MM-HMM. >> CERTAINLY THE COMMANDER, WHEN AN EMERGENCY IS HAPPENING, HE’S-- THAT’S THE PERSON THAT’S MAKING THOSE TOUGH CALLS AND PULLING THE TEAM TOGETHER. >> MM-HMM. >> AND HE WILL ALSO COORDINATE ON BEHALF OF THE ENTIRE TEAM. BUT, CREW MEMBERS ON THE STATION ARE GENERALISTS. WE HAVE TO HAVE A SKILL SET THAT WILL ALLOW US TO DO WHATEVER THE GROUND NEEDS US TO DO AND THAT DOES INVOLVE EVA TRAINING, OF COURSE. >> MM-HMM. >> THAT INVOLVES ROBOTICS TRAINING. THAT INVOLVES MEDICAL TRAINING, TOO, JUST IN CASE SOMETHING COMES UP, WE’LL HAVE TO TAKE CARE OF EACH OTHER. THAT’S BEEN PRETTY INTERESTING. >> YEAH. >> DID KOMRADE TALK AT ALL ABOUT THAT? 
>> ABOUT THE-- WHICH PART? >> THE MEDICAL TRAINING? >> YEAH, OH, YEAH. I MEAN, JUST A TINY LITTLE BIT. WE ACTUALLY ONLY HAD ABOUT 25 MINUTES TO TALK, SO HE TALKED-- I MEAN, MOSTLY A LITTLE BIT. HE SAID, I MEAN, YOU HAVE TO-- YOU HAVE TO KNOW KIND OF THE BASICS OF MEDICAL TRAINING IN CASE THERE’S AN EMERGENCY SITUATION, BUT HE ALSO MENTIONED THAT YOU HAVE-- YOU CAN CALL DOWN TO DOCTORS AND THEY CAN WALK YOU THROUGH SOME OF THOSE THINGS. >> ABSOLUTELY. >> AND I GUESS THAT KIND OF HELPS, RIGHT? BECAUSE ESPECIALLY NOT BEING A DOCTOR AND YOU GUYS-- ONE THING I SAID LAST TIME WAS YOU HAVE TO BE A JACK OF ALL TRADES AND A MASTER OF ALL IN SORT OF A-- IN A WAY, I GUESS. YOU HAVE TO REALLY KNOW THE SYSTEMS. >> IN A WAY, BUT THE GROUND IS ALWAYS THERE TO HELP OUT. >> THAT’S TRUE. >> FOR EXAMPLE, WE HAD AN EVENT THAT INVOLVED US SIMULATING THAT ONE OF THE CREW MEMBERS NEEDED CPR. >> MM-HMM. >> AND IT HAD BEEN SIX MONTHS AT LEAST, MAYBE EVEN A YEAR, SINCE MY PREVIOUS TRAINING ON THAT AND THE INSTRUCTORS DID A GOOD JOB OF SAYING, “OKAY, GO FOR IT.” SO, I KNEW I SHOULD DO CHEST COMPRESSIONS. I KNEW I SHOULD GIVE-- DO BREATHS PERIODICALLY. >> RIGHT, RIGHT. >> BUT, I WASN’T 100% CERTAIN OF WHAT NUMBER OF BREATHS, WHAT NUMBER OF REPETITIONS. >> RIGHT. >> SO I JUST STARTED, AND THEN THEY REMINDED AS PART OF THE TRAINING THAT, “HEY, LOOK, WHEN YOU HAVE THAT UNCERTAINTY-- YOU DID A GOOD JOB OF GETTING STARTING, BUT THE GROUND’S THERE TO HELP ANSWER THAT QUESTION. YOU COULD’VE GOT-- SAID, “HEY, WE NEED THIS CONFERENCE RIGHT NOW AND LET’S GET A DOCTOR TALKING TO US AND MAKE SURE WE’RE DOING THE RIGHT THINGS.”” >> MM-HMM. >> SO BECAUSE YOU HAVE TO KNOW SO MUCH SOMETIMES THE DETAILS-- THE GROUND CAN REALLY HELP YOU OUT WITH THAT. >> YEAH, AND THEY’RE THERE 24/7, RIGHT? >> ABSOLUTELY. >> SO YOU CAN CALL DOWN AND SAY, “HEY, SOMETHING’S GOING ON. I NEED HELP.” >> YES. YES. >> AND YOU GUYS WALK THROUGH ALL OF THOSE DIFFERENT THINGS. 
SO, I MEAN, ON TOP OF JUST TRAINING FOR SOME OF THE THINGS ON THE INTERNATIONAL SPACE STATION THAT YOU’RE GOING TO BE DOING, ESPECIALLY EMERGENCY SITUATIONS, YOU GO THROUGH OTHER TYPES OF TRAINING TOO, RIGHT? DON’T YOU DO SURVIVAL TRAINING AND THINGS LIKE THAT? >> YEAH, ABSOLUTELY. WE HAVE THE-- FIRST OF ALL, THERE’S LAND SURVIVAL TRAINING-- ONE OF THE FIRST THINGS YOU DO AS ASTRONAUT CANDIDATES. >> MM-HMM. >> I BELIEVE THE NEXT CLASS IS GOING TO DO THAT AT FORT RUCKER, IT’S AN ARMY BASE. >> OKAY. >> THEN, THERE’S LAND SURVIVAL TRAIN-- NO, I ALREADY TALKED ABOUT THAT. THERE’S LAND SURVIVAL TRAINING THAT WE DO AS ASTRONAUT CANDIDATES. >> RIGHT. >> AND THEN, THE NEXT SURVIVAL TRAINING YOU DO IS ACTUALLY AFTER YOU’RE ASSIGNED TO A SOYUZ CREW. THERE’S WINTER SURVIVAL TRAINING IN CASE YOUR SOYUZ LANDS SOME PLACE WHERE THE SEARCH AND RESCUE FORCES CAN’T GET TO YOU AS QUICKLY AS YOU’D LIKE. >> OH. >> AND YOU MAY HAVE TO BE SOME PLACE IN THE WINTER IN RUSSIA AND HAVE TO BE ABLE TO SURVIVE FOR A COUPLE DAYS. >> OH, WOW. >> WORST CASE. >> RIGHT, RIGHT. >> SO WE DO THAT TRAINING. THAT’S ALSO A VERY GOOD TIME FOR THE CREW TO BOND WITH EACH OTHER, AS YOU CAN IMAGINE. >> YEAH. >> THERE’S ALSO NOMINALLY, THE SOYUZ LANDS ON LAND. >> RIGHT. >> BUT, WE ALSO HAVE WATER SURVIVAL TRAINING. >> OKAY, JUST IN CASE IT DOES LAND ON WATER. >> JUST IN CASE. WELL, IF THERE’S A REALLY URGENT NEED TO DESCEND. >> RIGHT. >> AND WE’RE NOT GOING TO WORRY ABOUT WHERE ON THE EARTH WE HIT. >> RIGHT. >> POSSIBLY, IF IT’S THAT-- NORMALLY, WE’RE VERY-- >> A LOT OF BAD THINGS HAVE TO HAPPEN IN A ROW TO GET TO THAT POINT. >> YES. WE REALLY WANT TO LAND IN SPECIFIC PLACES, BUT JUST IN CASE, THERE’S THE OPTION. MUCH OF THE EARTH IS COVERED WITH WATER, SO WE LEARNED HOW TO DEAL WITH THAT SITUATION AS WELL. >> RIGHT. SO, YOU DID DO THE WINTER SURVIVAL TRAINING, RIGHT? YOU HAD TO GO THROUGH THAT. 
WHAT ARE-- DO YOU HAVE ANY GOOD STORIES OF-- YOU SAID IT WAS A GOOD TIME TO BOND WITH YOUR CREWMATES, SO ARE THERE ANY GOOD STORIES THERE? >> SURE. SO, THE TRAINING CONSISTS OF STAYING UP. FOR US, WE STAYED UP FOR TWO NIGHTS. >> MM-HMM. >> THE FIRST NIGHT YOU EGRESS THE SOYUZ CAPSULE THAT THEY PUT OUT IN THE FOREST. WE’VE GOT A REALLY GOOD SET OF COLD WEATHER GEAR THAT WE PUT ON. >> MM-HMM. >> AND SO, WE PUT ALL THAT STUFF ON, AND THEN WE USE THE SEAT LINERS, THAT ARE MOLDED TO US, THAT ARE IN THE-- I WOULD CALL IT KIND OF LIKE A BUCKET INSIDE THE SOYUZ. >> OH. >> WE CAN TAKE THOSE OUT AND USE THOSE AS SLEDS. SO WE PUT A BUNCH OF GEAR ON THAT. >> OH, I SEE. >> AND YOU GOT TO DRAG THOSE THROUGH TO A PLACE TO FIND A PLACE TO SET UP CAMP. >> COOL. >> OF COURSE, THE PARACHUTE THAT THE SOYUZ LANDS WITH IS HUGE, SO THAT’S A MASSIVE RESOURCE OF CLOTH. >> MM-HMM. >> SO THE FIRST NIGHT, WHAT WE DID IS HAD TO SET UP A LEAN-TO AND USED BOTH TIMBER THAT WE FOUND IN THE AREA, AND STRINGS FROM THE PARACHUTE, AND THE ACTUAL CLOTH FROM THE PARACHUTE, AS WELL AS A LOT OF BRANCHES TO SET UP A SHELTER. BUT, THAT WAS REALLY-- THAT NIGHT WAS ALL ABOUT THE FIRE. >> OH. >> BECAUSE THE LEAN-TO JUST KEPT US FROM LOSING ALL THE HEAT, BUT WE WERE KIND OF SLEEPING-- THERE WAS TWO PEOPLE KIND OF SLEEPING ON TOP OF EACH OTHER JUST ABOUT-- >> SORRY, A LEAN-TO IS LIKE-- IS THAT A SHELTER THAT, I’M ASSUMING, LEANS UP AGAINST SOMETHING? IS THAT WHAT THAT IS? >> A LEAN-TO-- IMAGINE IF YOU HAD A PLANE THAT WAS-- LIKE, A HALF OF A ROOF. >> OKAY. >> AND ALL IT IS IS ONE WALL THAT GOES FROM MAYBE ABOUT WAIST HIGH DOWN TO THE GROUND, WITH ENOUGH SPACE UNDERNEATH IT SO THAT TWO PEOPLE COULD BE SLEEPING UNDERNEATH IT WITH THE LENGTH OF THEIR BODIES FACING OUT TO THE OPEN. >> I SEE, OKAY. >> AND WHAT WE DO WITH THAT IS WE LIGHT A FIRE ON THE OPEN SIDE SO THAT THEY GET A LOT OF WARMTH, AND THE FACT THAT YOU HAVE THAT BACKDROP HELPS REFLECT SOME OF THAT HEAT DOWN TOWARDS YOU. >> NICE. 
BUT, IT DOESN’T TRAP ANY OF THE SMOKE OR ANYTHING LIKE THAT? >> IDEALLY, NO. >> YEAH. >> NO. BUT, THAT’S WHY I SAID, IT’S ALL ABOUT THE FIRE. >> RIGHT. >> IF THE FIRE GOES OUT, THAT LEAN-TO IS REALLY WORTHLESS. >> RIGHT. >> SO, ONE PERSON’S AWAKE AND CONSTANTLY CUTTING WOOD, BECAUSE TO KEEP THE FIRE GOING IT’S AMAZING HOW MUCH WOOD YOU NEED IN THAT ENVIRONMENT. >> WOW. >> WE DID THAT. MY TWO RUSSIANS THAT I-- INITIALLY I WAS GOING TO LAUNCH WITH TWO RUSSIANS, SO I DID THAT WITH TWO RUSSIANS. >> I SEE. >> THEY HAD BOTH DONE THIS BEFORE. THEY WERE REALLY, REALLY GOOD WITH THE MATERIAL WE HAD. >> NICE. >> AND WERE SMART ENOUGH THAT THEY KNEW THAT THE NEXT DAY WE’D HAVE TO SET UP A TEEPEE. SO, OUR LEAN-TO KIND OF HAD A FEW PIECES THAT WE COULD USE FOR THE TEEPEE READY TO GO, SO WE JUST HAD TO CHANGE THE LEAN-TO AND WE KIND OF TURNED IT INTO A TEEPEE ON THE NEXT DAY. >> OH. >> SO, THE TEEPEE WAS GREAT. WE-- IT’S MUCH MORE COMFORTABLE. IT HAD A MUCH SMALLER FIRE INSIDE THE TEEPEE. >> OH, OKAY. >> SO, YOU HAD TO MAKE THE TEEPEE ON THE SECOND DAY BECAUSE IT’S-- I GUESS, IT’S MORE INTENSIVE TO BUILD? IS THAT WHY? >> IT TAKES LONGER TO BUILD. >> I SEE. >> BUT, IT’S ALSO MUCH BETTER SHELTER. >> OKAY. >> SO, IT’S THE TYPE OF THING THAT-- QUITE HONESTLY, I THINK ALL OF US WOULD’VE PREFERRED TO GO RIGHT TO THE TEEPEE, BECAUSE-- I MEAN, I’M NOT 100% CERTAIN IT REALLY IS-- TAKES LONGER TO BUILD, BUT THE RUSSIANS WANTED US TO HAVE THE EXPERIENCE BUILDING BOTH TYPES. >> I SEE. >> AND TO UNDERSTAND WHAT IT TOOK TO LIVE IN BOTH OF THEM. >> OKAY, OKAY. >> YOU NEED A LOT LESS LUMBER TO KEEP THE TEEPEE WARM, BUT AGAIN, WE WERE BOTH-- WE WERE EXPERIENCING BOTH SITUATIONS. >> MM-HMM. WOW. AND THEN, I GUESS, YOU HAVE SURVIVAL TRAINING. WHAT OTHER KINDS OF THINGS DO YOU GO THROUGH? >> WELL, ONE OF THE BIG DEALS FOR ASTRONAUTS THAT WORK AT NASA IS WE COME FROM A LOT OF DIFFERENT BACKGROUNDS-- >> OKAY. >> --FROM MICROBIOLOGIST TO NAVY SEALS. 
SO, WE’VE GOT TO BE ABLE TO HAVE A CULTURE WHERE ALL THOSE PEOPLE CAN COME TOGETHER AND OPERATE IN A-- OPERATE HIGHLY TECHNICAL MACHINES IN AN ENVIRONMENT WHERE IF YOU MESS IT UP YOU COULD DIE. >> RIGHT. >> SO, ANOTHER THING THAT’S REALLY VERY, VERY INTERESTING IS WE USE T-38s. IT’S A-- IT’S THE SAME TYPE OF AIRCRAFT THAT THE AIR FORCE USES TO TRAIN PILOTS. >> OKAY. >> SO WE USED THOSE. >> MM-HMM. >> THE NICE THING ABOUT IS, MUCH LIKE-- WE CAN’T FLY PEOPLE IN SPACE VERY OFTEN, BUT WE CAN PUT PEOPLE IN THESE JETS VERY OFTEN AND IT-- YOU HAVE TO-- THE JET MOVES REALLY, REALLY FAST, SO YOU HAVE TO BE ABLE TO THINK FAST. YOU’VE ALSO GOT TO COORDINATE WITH THE GROUND AND THEY WILL DIRECT YOU WHAT TO DO, AND AT TIMES YOU HAVE TO MAKE DECISIONS THAT REQUIRE YOU TO SAY, “HEY, I GET WHAT YOU JUST SAID, BUT WE REALLY NEED TO DO THIS BECAUSE WE’RE IN A TOUGH SITUATION,” FOR EXAMPLE. >> OKAY. >> AND YOU HAVE TO COORDINATE WITH THE OTHER CREW MEMBER BECAUSE IT’S A TWO COCKPIT AIRCRAFT. THERE’S A PILOT AND TYPICALLY WE CALL HIM A BACK SEATER. YOU WORK AS THE NAVIGATOR AND COMMUNICATOR IN A NOMINAL SITUATION. >> AND WAS THAT YOUR JOB? >> WELL, BECAUSE I’M NOT A MILITARY PILOT, YES. >> OKAY. >> SO ALL OF THE FRONT SEATERS ARE MILITARY PILOTS IF THEY’RE ASTRONAUTS, AND THEY ARE INSTRUCTOR PILOTS, TYPICALLY FROM THE MILITARY AS WELL IF THEY’RE NOT ASTRONAUTS. IT’S A GREAT DEAL TO HAVE TO GO FLY AROUND IN A JET AS PART OF YOUR JOB. >> RIGHT. DID YOU END UP FLYING A FELLOW ASTRONAUT? OR DID YOU FLY WITH ONE OF THE PILOTS THAT THEY HAD, I GUESS? >> INITIALLY, YOU FLY WITH INSTRUCTORS. >> OKAY. >> BUT, BY AND LARGE, ALMOST EVERY FLIGHT IS WITH THE-- ANOTHER ASTRONAUT PILOT. >> I SEE. DID ANY OF THEM MESS WITH YOU AT ANY TIME OR TRY TO MAKE YOU THROW UP OR ANYTHING LIKE THAT? >> NO. SO, ONE TIME THOUGH-- SO ONE OF THE THINGS THEY ALWAYS TELL-- BECAUSE THEY’RE VERY EXPERIENCED AND WE’RE NOT, IT’S REAL EASY TO JUST ASSUME THAT THEY KNOW HOW TO DO EVERYTHING. 
THEY CAN FLY THAT JET COMPLETELY BY THEMSELVES. >> AWESOME. >> SO IT CAN BE A LITTLE INTIMIDATING WHEN YOU GET IN THE BACK SEAT. YOU KNOW THE FRONT SEATER CAN DO EVERYTHING BY THEMSELVES. >> MM-HMM. >> BUT, THEY REALLY WANT YOU TO BE ENGAGED AND RECOGNIZE THAT IF THEY DO SOMETHING STUPID THAT WOULD KILL THEM IT’S GOING TO KILL BOTH OF US. >> RIGHT. >> YOU’RE A NANO SECOND BEHIND THEM. AND WE TRAIN AND THE ASTRONAUT PILOTS ALLOW US TO DO EVERYTHING. THEY’LL ALLOW US TO FLY THE JET, DO THE COMMUNICATIONS, DO THE NAVIGATION, JUST TO GET GOOD AT THAT, BECAUSE THERE’S A-- FOR EXAMPLE, IF SOMETHING HAPPENED TO THE PILOT, YOU MIGHT HAVE TO DO THAT. >> RIGHT. >> AND IT’S MORE FUN FOR US. AND ACTUALLY, A LOT OF THE ASTRONAUT PILOTS HAVE EXPERIENCED WITH BEING AN INSTRUCTOR PILOTS, SO THEY’RE GOOD AT THAT. >> MM-HMM. >> WELL, ONE TIME, BARRY WILMORE WAS TRYING TO MAKE SURE I WAS PAYING ATTENTION, AND I WAS SUPPOSED TO BE CLIMBING TO A SPECIFIC ALTITUDE, AND JUST MAYBE ABOUT 500 FEET BEFORE I NEEDED TO START LEVELING OFF, HE SAID, “SO, WHERE DO YOU GO TO CHURCH?” AND I STOPPED PAYING ATTENTION TO WHAT WAS GOING ON IN THE JET AND THEN I STARTED TALKING TO HIM. AND THEN HE DID THAT ON PURPOSE SO THAT HE-- SO THEN I RECOGNIZED I NEED TO PRIORITIZE WHAT I WAS DOING TO THE JET MORE, AND SO THEN HE WAITED UNTIL I WAS REALLY FLYING STRAIGHT THROUGH THE ALTITUDE I WAS SUPPOSED TO BE LEVELING OFF AT AND SAID, “CHECK YOUR ALTITUDE.” AND THEN I DID. ANOTHER TIME, WE’RE NOT-- AS A BACK SEATER, I’M NOT ALLOWED TO FLY WITHIN 200 FEET OF THE GROUND, BUT YOU CAN FLY TOWARDS AN AIRPORT, GET TO 200 FEET, AND THEN ACT LIKE THERE’S A PROBLEM ON THE RUNWAY, AND THEN BASICALLY ADD POWER TO THE JET AND GO THROUGH THE TAKE OFF PROCESS. >> I SEE. >> WELL, EARLIER ON IN MY TRAINING, I WAS FLYING WITH ANOTHER GUY AND HE DID A REALLY GOOD JOB OF LETTING ME MESS UP AS MUCH AS POSSIBLE BEFORE HE’D CORRECT ME SO THAT I WOULD LEARN. SAME TYPE OF THING, I GAVE IT A LOT OF POWER, I STARTED CLIMBING. 
>> MM-HMM. >> I DIDN’T-- I WASN’T EXPERIENCED ENOUGH TO RECOGNIZE THAT RIGHT AFTER I STARTED CLIMBING I NEEDED TO REDUCE THE POWER. >> OH. >> SO, I WAS REALLY, REALLY SPEEDING UP AND I ONLY HAD TO CLIMB UP TO 3,000 FEET, WHICH YOU DO REALLY FAST IN THAT JET IF YOU HAVEN’T TAKEN THE POWER OUT. >> WHOA. >> AND SO, SAME THING, I GOT TO 3,000 FEET, I WAS CLIMBING REALLY, REALLY FAST, HE SAID, “CHECK YOUR ALTITUDE.” AND MY IMMEDIATE RESPONSE WASN’T TO TAKE OUT THE POWER, IT WAS JUST TO PITCH THE NOSE FORWARD, WHICH MEANT THAT ANYTHING THAT I HAD LOOSE IN THE JET JUST HIT THE CEILING BECAUSE I JUST WENT DOWN SO FAST ALL THE SUDDEN. >> WHOA. >> REALLY GOOD TRAINING. >> YEAH. >> I DIDN’T FORGET THAT LESSON. >> YEAH. THAT’S GOOD THAT YOU GUYS ARE ALWAYS KEEPING EACH OTHER IN CHECK. I’M SURE THAT ALL YOUR ASTRONAUT-- YOUR FELLOW ASTRONAUTS ARE CONSTANTLY DOING THIS, RIGHT? THEY’RE GIVING YOU ADVICE AND ANYTHING LIKE THAT. >> ABSOLUTELY. >> NOW, YOU BEING A FIRST TIME FLYER, I’M SURE THEY’VE GIVEN YOU SOME OF THOSE EXPERIENCES, ESPECIALLY SOME OF YOUR CLASSMATES, RIGHT? >> MM-HMM. >> SO WE HAVE REID WISEMAN, AND I’M TRYING TO THINK. >> MIKE HOPKINS. >> MIKE HOPKINS. >> KJELL LINDGREN. >> KJELL-- ALL THESE GUYS HAVE FLOWN BEFORE. >> KATE RUBINS. >> YEAH, THAT’S RIGHT, KATE MOST RECENTLY. SO, HAVE THESE GUYS GIVEN YOU SOME ADVICE, COME TO YOU AND SAY, “HEY, THIS”-- YOU KNOW, ANY KIND OF THINGS THAT YOU HAVE TO BE WATCHING OUT FOR? >> ABSOLUTELY. >> YEAH. >> AND NOT JUST THEM, ALL OF THEM. >> RIGHT. >> EVERYTHING FROM IF YOU’RE HAVING A BAD DAY DON’T TALK TO IT ON THE-- DON’T TALK TO PEOPLE ABOUT IT ON THE RADIO, TO EXPECTATIONS ON HOW TO-- AS YOU’RE GETTING READY FOR THE LAUNCH AND YOUR FAMILY’S IN KAZAKHSTAN, GETTING READY FOR THAT, WHAT TO EXPECT OUT OF THAT. >> ANY GOOD NUGGETS THAT THEY’VE TOLD YOU? >> CHRIS CASSIDY TOLD ME THAT ONE OF THE THINGS TO DO WHEN YOU’RE DOING A PROCEDURE IS TO MAKE SURE-- THERE’S NOTES BLOCKS IN A LOT OF THE PROCEDURES. >> MM-HMM. 
>> AND HE SAID, “THE NOTES BLOCKS AREN’T REQUIRED FOR US TO READ.” >> HMM. >> BUT, YOU REALLY NEED TO READ THOSE BECAUSE THEY TYPICALLY GIVE YOU THE BIG PICTURE. >> HMM. >> AND SO, WHEN YOU READ THOSE CAREFULLY, THEN AS YOU’RE DOING THE STEPS IT’LL PREVENT YOU FROM DOING THOSE STEPS BLINDLY, WHICH HELPS YOU BE A LITTLE MORE ACCURATE IN HOW YOU’RE DOING THE PROCEDURE. SO IF YOU KNOW WHY YOU’RE DOING THIS PARTICULAR THING THEN IT’S A LOT EASIER TO RECOGNIZE WHEN YOU’RE PRESSING THE WRONG-- ABOUT TO PRESS THE WRONG BUTTON BECAUSE IT DOESN’T MAKE SENSE. >> I SEE. >> MAYBE YOU MISREAD THAT STEP LATER ON. >> OKAY, SO LIKE, ALL THE LITTLE DETAILS, I’M SURE. >> THERE’S A-- OH, YEAH. YES, YES. >> SO, I MEAN, IS THERE ANYTHING THAT YOU-- THAT ANY ASTRONAUT HAS GIVEN YOU SO FAR JUST TO ALWAYS KEEP THIS IN MIND. I GUESS, THE NOTES IS ONE OF THEM, BUT ESPECIALLY-- MAYBE SOYUZ ASCENT OR SOMETHING, YOU KNOW, MAYBE LEAN BACK. I REMEMBER, WHAT WAS-- I WAS TALKING WITH SHANE KIMBROUGH JUST RECENTLY AND THEY SAID ONCE HE GETS TO A CERTAIN POINT YOU GOT TO MAKE SURE YOU STRAP DOWN, OTHERWISE YOU’RE GOING TO GO FLYING UP OR SOMETHING LIKE THAT. ANY KIND OF PIECES OF ADVICE LIKE THAT? WELL, IT DOESN’T EVEN HAVE TO BE OPERATIONAL. IT COULD BE YOU’RE GOING TO THE BATHROOM AND YOU HAVE TO MAKE SURE THAT YOU TURN THE FAN ON FIRST OR ONE OF THOSE THINGS. >> MM-HMM. >> I’M SURE YOU GO THROUGH ALL OF THOSE THINGS. >> KEEP TRACK OF YOUR STUFF. SO, ONE OF THE THINGS THAT WE’RE VERY COMFORTABLE WITH ON EARTH IS WHEN YOU PUT SOMETHING DOWN IT’S DOWN. >> MM-HMM. >> AND WE TEND TO THINK OF LEAVING THINGS ON A TWO DIMENSIONAL SURFACE AND STAYING THERE. >> YEAH. >> BUT, YOU HAVE AN EXTRA DIMENSION IN SPACE AND YOU HAVE TO PUT A LITTLE EXTRA EFFORT INTO REMEMBERING, LIKE, ANOTHER DIMENSION THAT IT COULD BE SOME PLACE ELSE, TOO. >> THAT’S RIGHT. >> THAT CAN BE CHALLENGING FOR PEOPLE, IS JUST REALLY SLOWING YOURSELF DOWN ENOUGH TO LOOK AT WHERE YOU PUT SOMETHING AND VISUALIZE WHAT’S AROUND YOU. 
BECAUSE YOU COULD COME BACK TO THE SAME PLACE, AND IF YOU WEREN’T VERY DELIBERATE ABOUT LOOKING AT THAT PLACE FROM AN ORIENTATION THAT YOU ALWAYS TAKE, YOU MIGHT COME IN THERE UPSIDE DOWN AND BE LIKE, “WELL, I REMEMBER PUTTING IT SOMEWHERE IN HERE, BUT NOTHING LOOKS-- I CAN’T PICTURE IT IN THIS SPOT.” >> YEAH. >> SO, THINGS LIKE THAT. >> I REMEMBER TALKING TO MIKE HOPKINS A COUPLE-- WELL, PROBABLY MORE THAN A COUPLE MONTHS AGO, BUT HE-- ONE THINGS THAT ALWAYS STUCK WITH ME WAS HE WAS TALKING ABOUT HE WAS WORKING ON THIS RACK, I GUESS, AND HE HAD TO PULL IT BACK AND GET TO-- GET BEHIND IT. AND JUST THE WAY THAT HE WAS DOING IT, HE JUST-- IT WAS HARD TO REACH. AND I DON’T KNOW IF HE’S TOLD YOU THE SAME STORY, BUT IT WAS HARD TO REACH AND HE CALLS TO THE GROUND, TELLS HIM HIS PROBLEM, AND HE’S LIKE-- AND THEY’RE LIKE, “WELL, FLIP UPSIDE DOWN.” AND HE’S LIKE, “OH, YEAH, I CAN DO THAT.” AND SO, I GUESS YOU’RE TRAINING ON THE GROUND, BUT YOU DO HAVE THE LIMITATIONS OF GRAVITY ON THE GROUND EVEN THOUGH YOU HAVE ALL THESE MOCK UPS. BUT, FLIPPING UPSIDE DOWN WAS-- IT SOLVED THE PROBLEM IMMEDIATELY. HE GOT A WHOLE NEW VANTAGE POINT, BUT YOU CAN’T PRACTICE FLIPPING UP ON-- IN 1G ON THE AIRPLANE. >> YOU CAN'T. YEAH, DEFINITELY CAN’T. >> OH. SO AN ASTRONAUT CLASS, JUST ACTUALLY RECENTLY GOT SELECTED. DOES THIS BRING BACK ANY KIND OF ANY MEMORIES OF WHEN YOU GOT SELECTED AS AN ASTRONAUT BACK IN 2009? >> YES, DEFINITELY. I’VE SEEN A LOT OF THOSE ASTRONAUT HOPEFULS THAT HAVE BEEN EITHER IN THE GYM. >> YEAH. >> OR GOING TO THEIR INTERVIEWS OR WHATEVER. THAT IS AN EMOTIONAL ROLLERCOASTER. I DON’T ENVY THEM AT ALL. >> BECAUSE YOU WENT THROUGH IT. >> ABSOLUTELY, YEAH. >> YEAH, YEAH. >> IT’S-- I THINK I DID A PRETTY GOOD JOB OF ASSUMING THERE WAS NO HOPE THAT I WOULD GET THE JOB AND THAT MADE IT A LOT LESS STRESSFUL. 
IN FACT, THE ONLY TIME THAT I GOT KIND OF LIKE, “WHOA, BE CAREFUL,” WAS WHEN I THOUGHT I HAD JUST DONE SOMETHING REALLY, REALLY SUCCESSFUL AND MAYBE THERE’S A CHANCE I’LL GET THIS JOB. I THOUGHT, “NO, NO, NO. DON’T DO THAT TO YOURSELF.” >> BECAUSE THAT’S WHEN YOU GET-- YOU MAKE YOURSELF ALL NERVOUS, RIGHT, I GUESS? >> THAT’S WHEN YOU-- IF YOU HAVE NOTHING TO LOSE, THEN IT’S NO BIG DEAL. >> RIGHT. >> I JUST WOULD’VE-- IF I DIDN’T GET THE JOB I WOULD’VE HAD-- STILL HAD A REALLY COOL EXPERIENCE GETTING THE FIRST HAND EXPERIENCE OF WHAT THE ASTRONAUT SELECTION PROCESS IS LIKE, IF NOTHING ELSE. >> YEAH, I MEAN, WHAT IS IT LIKE, RIGHT? I MEAN, YOU SAY IT’S STRESSFUL AND THERE’S THINGS, BUT WHAT ARE THEY DOING THROUGHOUT THIS INTERVIEW PROCESS? >> WELL, I WOULD SAY IT’S-- I’M CERTAIN THAT THE PROCESS THAT THIS CLASS THAT REALLY HASN’T BEEN SELECTED YET, BUT IS IN THE PROCESS OF FINISHING BEING SELECTED. >> UH-HUH, AT THIS TIME THROUGH. >> I’M SURE THEIR-- I KNOW THEIR PROCESS HAS CHANGED SINCE WE WENT THROUGH, BUT THERE’S PSYCHOLOGICAL EXAMINATIONS THAT WE DID. >> OH, WOW. YEAH. >> THERE WAS GROUP PROBLEM SOLVING EXERCISES THAT WE DID. THERE WAS A LOT OF MEDICAL EXAMS, ESPECIALLY BY THE SECOND INTERVIEW. A LOT OF THAT IS CHECKING TO MAKE SURE THAT YOU DON’T HAVE ANY MEDICAL ISSUES. >> RIGHT. THERE ARE-- OF COURSE, THERE’S AN INTERVIEW. EACH TIME YOU COME TO VISIT NASA, THE FIRST TIME AND THE SECOND TIME, THERE’S AN HOUR LONG INTERVIEW. >> MM-HMM. >> THERE-- >> SO, IT’S TWO TIMES THAT YOU COME? YOU COME-- >> WELL, THE FIRST TIME-- >> OKAY. >> FOR MY CLASS, THE FIRST TIME THEY INTERVIEWED PEOPLE THEY INVITED 120 PEOPLE TO COME. >> OKAY. >> AND THEN, OF THAT 120 THEY PARED IT DOWN TO 40 OR 50 FOR A SECOND INTERVIEW. >> WOW. >> AND BECAUSE THE MEDICAL EXAMS, YOU CAN IMAGINE ARE SO EXPENSIVE, THEY ONLY GIVE THE MEDICAL EXAMS MOSTLY TO THAT SMALLER GROUP. >> MAKES SENSE. 
I MEAN, HONESTLY, LIKE TO BE AN ASTRONAUT, NOT ONLY DO YOU HAVE TO BE SUPER SMART AND BE ABLE TO GET ALONG WITH YOUR CREWMATES AND EVERYTHING, BUT YOU HAVE TO MAKE SURE YOU’RE IN TIP TOP PHYSICAL SHAPE AND THAT NOTHING COULD POSSIBLY GO WRONG. YOU WERE FORTUNATE ENOUGH TO ACTUALLY GET THE CALL TO BE-- >> YES, YEAH. >> WHAT WAS THAT LIKE? WHERE WERE YOU? >> I WAS ACTUALLY IN THE MISSION CONTROL CENTER WORKING AS A CAPCOM THAT DAY. >> OH. >> SO IT WAS-- I’M PRETTY SURE THEY DIDN’T KNOW WHERE I WAS. I ANSWERED MY CELL PHONE AND IT WAS TOUGH BECAUSE I WAS SO EXCITED, BUT I WASN’T IN A SITUATION WHERE I WAS ALLOWED TO ANNOUNCE IT TO ANYBODY. >> RIGHT. >> SO I’M SITTING AROUND A WHOLE BUNCH OF OTHER PEOPLE THAT I’M WORKING WITH AND I JUST WANTED TO CHEER, BUT I JUST-- AND I HAD TO-- BUT, I WAS STILL WORKING ON CONSOLE. I HAD TO BE LISTENING FOR THE CREW TO CALL AND I HAD TO BE LISTENING TO WHAT THE GROUND WAS TALKING ABOUT. >> YEAH. >> SO I HAD TO JUST ACT LIKE IT DIDN’T HAPPEN AND JUST GET BACK TO WORK. >> SO, IN THAT SITUATION, FROM WHAT I UNDERSTAND, YOU’RE ONLY ALLOWED TO TELL VERY FEW PEOPLE, LIKE YOUR WIFE AND YOUR PARENTS. >> I TOLD MY WIFE-- YUP. >> AND THAT’S PRETTY MUCH IT. >> YEAH, I THINK I SENT MY WIFE AN EMAIL, TOLD HER WHAT HAD HAPPENED, AND THEN ONLY ABOUT THREE HOURS LATER DID I-- THAT I SENT HER ANOTHER EMAIL THAT SAID, “OH, AND DON’T TELL ANYBODY ELSE.” >> OH. [ LAUGHING ] >> YEAH, LET’S JUST SAY THAT WASN’T QUITE AS SUCCESSFUL AS I SHOULD’VE MADE IT. >> OH, MAN, THAT HAD TO BE-- I CAN’T EVEN IMAGINE JUST GETTING THAT CALL. THAT WOULD BE-- >> I WAS-- YEAH, I WAS PRETTY EXCITED. >> YEAH. >> LET’S GO BACK TO SOME OF THE OTHER TRAINING. SO YOU HAVE-- WE TALKED ABOUT A LITTLE JUST TRAINING FOR ON ORBIT, SURVIVAL TRAINING. HOW ABOUT, I GUESS, SOYUZ TRAINING. NOW, YOU SAID THAT NOW THEY SWITCHED THE CREWS AROUND AND NOW YOU HAVE TO LEARN A LOT MORE. NOW YOU HAVE TO-- YOU HAVE TO BE IN THE KIND OF NOT THE HOT SEAT BUT I GUESS ONE OF THE HOT SEATS? 
IS THAT HOW THAT WORKS? >> YES. >> OKAY. >> AS I INITIALLY STARTED TRAINING I WAS IN THE RIGHT SEAT. >> OKAY. >> WHICH HAS VERY LIMITED RESPONSIBILITIES. THE CREW-- WELL, EXAMPLE, JACK FISCHER AND FYODOR YURCHIKHIN, WHEN THEY LAUNCHED THEY DIDN’T HAVE ANYBODY IN THE RIGHT SEAT. >> RIGHT. >> THEY DON’T-- YOU DON’T NEED SOMEONE TO BE THERE. >> OKAY. >> THERE ARE SOME THINGS THAT ARE MORE UNCOMFORTABLE FOR-- IT’S VERY, VERY HELPFUL TO HAVE A RIGHT SEATER, AND I REALIZED THAT WHEN I STARTED TRAINING AS A LEFT SEATER BECAUSE YOU NEED SO MUCH MORE TIME TO TRAIN AS A LEFT SEATER. >> MM-HMM. >> YOU DON’T ALWAYS HAVE THE RIGHT SEATER THERE. AND SO, JUST HAVING AN ADDITIONAL PERSON WHO YOU CAN SAY, “HEY, REMIND ME WHEN-- TELL ME WHEN FIVE MINUTES GOES BY,” OR “CALCULATE AT WHAT RATE THE PRESSURE'S DROPPING SO THAT WE CAN FIGURE OUT HOW MUCH TIME WE HAVE TO-- CAN WE WAIT TO LAND AT OUR NOMINAL LANDING SPOT? OR DO WE HAVE TO START THE LANDING PROCESS IMMEDIATELY, WHEREVER THAT TAKES US?” I’M TALKING ABOUT SITUATIONS IN THE SIMULATIONS IN RUSSIA WHERE THEY’RE MAKING IT A REALLY BAD DAY IN THE SOYUZ. >> RIGHT. YEAH. >> SO, WHEN I CHANGED TO BEING A LEFT SEATER IT WAS A LOT-- YOU’RE REALLY HELPING TO OPERATE THE SPACECRAFT. >> MM-HMM. >> THE TRAINING’S GOOD, BUT YOU CAN IMAGINE THE FIRST TIME YOU’RE IN THERE PRESSING BUTTONS AND RECOGNIZING THAT, “IF I MESS THIS UP THIS IS REALLY GOING TO BE BAD.” AND I’VE DONE IT SO MANY TIMES NOW THAT I’M WELL PAST WORRYING ABOUT THAT. >> OH, YEAH. >> BUT, THERE'S A LOT THAT GOES ON AND IT’S-- THE TRAINERS THERE DO A REALLY GOOD JOB OF MAKING YOU READY FOR A REALLY, REALLY BAD DAY, BUT EVEN GIVEN SIX MALFUNCTIONS-- WELL, FOR EXAMPLE, ONE OF THE SIMULATIONS THAT I DON’T THINK I’LL EVER FORGET WAS WE WERE DOCKING WITH THE SPACE STATION AND THIS-- THE AUTOMATIC SYSTEMS TO DOCK HAD STOPPED WORKING, SO THE COMMANDER HAD TO TAKE OVER AND DO EVERYTHING MANUALLY. >> MM-HMM. 
>> AND THEN, WE GOT UP TO THE SPACE STATION, WE MADE CONTACT WITH THE SPACE STATION. I WAS EXPECTING THE SIMULATION TO END AT ANY MOMENT, BECAUSE ALL WE HAD TO DO AT THIS POINT WAS-- THE WAY THE SOYUZ DOCKING MECHANISM WORKS IS THERE’S A PROBE THAT STICKS OUT THE FRONT, AND THEN ONCE IT MAKES CONNECTION WITH THE SPACE STATION THEN THE NEXT STEP IS YOU RETRACT THAT PROBE AND THAT DRAWS THE TWO SPACECRAFTS TOGETHER. >> OKAY. >> SO WE’RE IN THAT SITUATION, WE’RE CONNECTED NOW TO THE SPACE STATION, BUT THE RETRACTION MECHANISM DIDN’T WORK. >> OH. >> SO WE COULDN’T GET THAT LAST DISTANCE TO CLOSE THE GAP WITH THE SPACE STATION. AND SO, WE’RE GOING THROUGH THE TROUBLESHOOTING FOR THAT. IT WASN’T-- NOTHING HAD TO HAPPEN SUPER FAST. >> MM-HMM. >> WE HAD TIME, SO WE’RE KIND OF GOING THROUGH THAT PROCEDURE. >> OKAY. >> AND THEN, IN THE MIDST OF THAT, SUDDENLY SIMULATED SMOKE STARTED COMING FROM UNDERNEATH THE SPACECRAFT. >> FANTASTIC. >> SO HERE WE ARE-- SO IN THE MIDST OF THAT, WE HAD A FIRE WHERE WE COULDN’T GET TO THE SPACE STATION. WE HAD TO DO AN EMERGENCY UNDOCKING AND THEN HAD-- SO WE HAD TO GO THROUGH THE WHOLE EMERGENCY DESCENT PROCESS. >> WOW. >> AND IT WAS JUST TOTAL-- IT WAS A LOT OF-- TONS OF STUFF HAD TO HAPPEN REALLY FAST AT THAT POINT. >> WOW. YEAH, BECAUSE I MEAN, IF YOU’RE GOING THROUGH THE SIMULATION YOU THINK, LIKE YOU SAID, THIS IS THE LAST THING. >> YEAH, I WAS MENTALLY KIND OF ON THE, LIKE, WINDING DOWN, LIKE, “OKAY, IT WON’T BE LONG NOW AND WE’LL BE DONE.” >> YEAH. >> AND THEN, IT WAS LIKE A WHOLE OTHER SIMULATION STARTED. >> WOW. OH, MY GOSH. THE THINGS YOU GUYS HAVE TO GO THROUGH IS JUST UNREAL. >> BUT, IT’S REALLY KIND OF COOL, TOO. >> IT IS. IT IS. BUT, THAT’S WHAT YOU HAVE TO DO, RIGHT? SO A LOT OF THE-- A LOT OF THE TRAINING IS NOT ONLY KIND OF UNDERSTANDING THE SYSTEMS AND DOING JUST THE DAY TO DAY STUFF, BUT REALLY, “HEY, IF THIS SCENARIO HAPPENS, THIS IS WHAT YOU DO. IF THIS SCENARIO”-- LIKE, A LOT OF PROCEDURAL STUFF. 
>> AND NOT ONLY THAT, BUT IT’S IMPORTANT THAT WE’RE DOING IT AS A CREW BECAUSE THE STYLES OF EACH PERSON ARE DIFFERENT. AND UNDERSTANDING WHAT THE EXPECTATIONS OF THAT SOYUZ COMMANDER ARE FOR ME AS A LEFT SEATER VERSUS THE CREW WHO HAD TRAINED FOR YEARS TO DO THAT ROLE WHERE I WAS GETTING ANOTHER SIX MONTHS TO DO THAT. >> YEAH. >> SO THE TEAMWORK ASPECT IS HUGE. >> RIGHT. I MEAN, THAT’S TRUE FOR SOME OF THESE THINGS, BUT ALSO, I GUESS, EVA TRAINING, TRAINING IN THE NEUTRAL BUOYANCY LABORATORY. >> YES. >> SO I’M SURE YOU’VE DONE THAT BEFORE, RIGHT? >> A LOT, YUP. >> YEAH, SO WHAT KIND-- HOW OFTEN HAVE YOU BEEN IN DOING THAT KIND OF TRAINING AND SORT OF WHAT IS IT LIKE? >> BEFORE I GOT ASSIGNED, I DID IT ABOUT AN AVERAGE OF SEVEN TIMES A YEAR. >> OKAY. >> AND I THINK I WAS KIND OF PUSHING TO GET MORE OPPORTUNITIES TO DO THAT. >> OKAY. >> NOW THAT I’VE BEEN ASSIGNED, IT’S PROBABLY BEEN A LITTLE LESS THAN THAT. >> INTERESTING. >> BUT, IT’S ALWAYS A SIX HOUR-- IT’S TYPICALLY SIX HOURS UNDERWATER-- >> RIGHT. >> --IN THE EXTERNAL MOBILITY UNIT IS WHAT WE CALL IT, THE SPACEWALKING SPACESUIT. >> MM-HMM, EMU. >> MM-HMM. AND JUST IN CASE PEOPLE AREN’T AWARE, THE WAY THAT WORKS IS THERE’S DIVERS THAT ARE AROUND US TO HELP BALANCE THE SUIT TO MAKE IT AS GOOD AS POSSIBLE A SIMULATION OF WEIGHTLESSNESS. >> RIGHT. >> IT’S-- BECAUSE OF THE AIR VOLUME IN THE SUIT AND THE FACT THAT THE SUIT IS ACTUALLY QUITE HEAVY, IT WOULD BE REALLY EASY TO END UP IN A SITUATION WHERE YOUR LEGS ARE REALLY, REALLY LIGHT AND YOUR CHEST IS HEAVY, AND YOU WOULDN’T HAVE THE STRENGTH TO FLIP YOURSELF SO THAT YOUR FEET ARE BACK UNDERNEATH YOU AGAIN. >> RIGHT. >> SO THE DIVERS WILL HELP TRY TO MAKE IT SEEM A LITTLE MORE LIKE YOU’RE OUT IN SPACE, HOWEVER, THE SUIT IS FLOATING. YOU’RE NOT FLOATING INSIDE THE SUIT. >> YEAH. >> SO IF YOU’RE UPSIDE DOWN IN THE SUIT THEN ALL THE WEIGHT OF YOUR BODY MIGHT BE RESTING ON YOUR SHOULDERS, SO IT’S-- IT CAN NEVER BE A PERFECT SIMULATION. >> YEAH. 
I GUESS, I MEAN, FROM WHAT I’VE HEARD IS KIND OF-- SO, LIKE YOU SAID IT, YOU’RE UNDERWATER IN THIS HUGE POOL THAT’S LIKE 40 FEET DEEP, JUST ENORMOUS, AND THEY HAVE FULL SCALE MOCKUPS OF THE ISS UNDERNEATH SO YOU CAN ACTUALLY KIND OF FEEL LIKE WHAT IT WOULD BE TO BE ON THE STATION AND HAVE KIND OF THE MUSCLE MEMORY TO KNOW, “OKAY, THIS IS HERE, AND THIS IS HERE, AND THEN THIS HANDRAIL’S HERE,” SO YOU KNOW KIND OF WHERE TO GRAB ON AND EVERYTHING. BUT, FROM WHAT I UNDERSTAND, IS YOU’RE RIGHT, IT’S PROBABLY AS CLOSE TO SIMULATING WHAT IT’S LIKE TO ACTUALLY DO A SPACEWALK AS POSSIBLE. >> MM-HMM. >> BUT, FIRST OF ALL, YEAH, IF YOU’RE UPSIDE DOWN IN SPACE, THAT’S IT, YOU’RE JUST UPSIDE DOWN BUT YOU’RE STILL KIND OF FLOATING IN THE SUIT. >> MM-HMM. >> WHEREAS, YOU STILL HAVE GRAVITY ON EARTH, SO YOU’RE RIGHT, YOU FEEL THE WHOLE WEIGHT. BUT THEN ALSO MOVING, YOU STILL HAVE THAT WATER RESISTANCE, RIGHT. >> THAT’S TRUE. THAT’S VERY TRUE. >> SO I GUESS THINGS FLY A LITTLE BIT QUICKER IN SPACE THAN THEY WOULD IF YOU WERE TO TOSS THEM OR MOVE YOUR HAND OR SOMETHING IN UNDERWATER. AND I’M SURE YOU’VE KIND OF NOTICED A LITTLE BIT OF THAT, RIGHT? AND MAYBE THE DIVERS ARE SORT OF-- ARE SORT OF PUSHING THINGS A LITTLE BIT FASTER SO THAT IT SIMULATES IT? >> NO, WE-- I THINK SOMETIMES BECAUSE IT’S SO HARD FOR THE DIVERS TO TELL WHAT YOU’RE TRYING TO DO. >> OKAY, YEAH. >> THEY TEND TO LIKE LET YOU DO WHAT YOU NEED TO DO, UNLESS THEY CAN TELL IF THERE’S A SITUATION WHERE IT’S CLEARLY NOT. OR, YOU MIGHT-- WHAT I STARTED DOING WITH THE DIVERS IS I REALIZED THAT SOME THINGS THERE’S NO NEED FOR YOU TO FIGHT THROUGH JUST TOUGHING SOMETHING OUT. >> MM-HMM. >> SOMETIMES THEY’LL SAY-- WELL, FOR EXAMPLE, WE HAVE A BODY RESTRAINT TETHER. >> OKAY. >> IT’S KIND OF LIKE A SNAKE THAT YOU CAN RIGIDIZE IN A CERTAIN SHAPE. >> MM-HMM. >> AND IT’S LIKE A THIRD ARM. YOU CAN USE IT TO ATTACH YOURSELF TO THE SPACE STATION SO YOU HAVE TWO HANDS FREE AND YOU CAN DO WORK. >> MM-HMM. 
>> OR, IF YOU HAVE A LARGE WHAT WE CALL AN ORU, AN ORBITAL REPLACEABLE UNIT. >> OKAY, IT’S LIKE A SPARE PART ALMOST? >> A SPARE PART. >> YEAH. RIGHT. >> IT COULD BE VERY LARGE. IT COULD BE REALLY TINY. >> OKAY. >> YOU CAN ATTACH THAT TO THAT BODY RESTRAINT TETHER AND TRANSLATE ALONG AND IT'LL JUST BE THERE. >> OKAY. >> WELL, IMAGINE THAT THAT THING WANTS TO FLOAT UP TO THE SURFACE OF THE WATER. >> RIGHT. >> OR WANTS TO SINK TO THE BOTTOM OF THE POOL. THE DIVERS WILL HOLD ON TO THAT, BUT THEN YOU COULD POTENTIALLY HAVE THIS ARM STICKING OFF OF YOUR HIP AND IF A DIVER DOESN’T REALIZE THAT YOU’RE TRYING REALLY HARD TO ROTATE TOWARDS YOUR RIGHT SHOULDER YOU’RE NOT JUST TRYING TO ROTATE YOURSELF, YOU’RE SUDDENLY TRYING TO ROTATE THIS DIVER WITH A TANK WHO’S HOLDING ON TO THAT. >> RIGHT. >> SO WHEN I REALIZED THAT THAT BECOMES AN ISSUE SOMETIMES IS THAT I JUST SAY, “HEY, I’M NOT SURE WHY, BUT I’M HAVING A HARD TIME ROTATING TOWARDS MY RIGHT SHOULDER.” AND THEN SUDDENLY IT’LL BECOME VERY EASY TO ROTATE TOWARDS MY RIGHT SHOULDER. >> SO YOU DON’T HAVE DIRECT COMMUNICATIONS WITH THE DIVERS THEN? >> OH, THERE’S UNDERWATER SPEAKERS. >> OH. >> SO EVERYTHING YOU’RE SAYING-- IF THERE’S A LOT OF NOISE UNDERWATER, BECAUSE WHEN WE DO SCUBA STUFF SOMETIMES IT IS HARD TO HEAR. >> UH-HUH. >> WHEN YOU’RE BLOWING BUBBLES OUT, THERE’S A LOT OF NOISE FROM THE BUBBLES. BUT IF THEY STOP BREATHING FOR A MOMENT THEY CAN HEAR WHAT YOU’RE SAYING AND THEY’RE REALLY, REALLY GOOD ABOUT KEEPING TRACK OF WHAT WE’RE SAYING. >> THAT’S RIGHT. YEAH, AND THEY DO-- I MEAN, I’VE SPOKEN WITH DIVERS IN THE PAST AND THEY DO-- SO YOU GUYS DO SIX HOUR KIND OF SIMULATIONS UNDERWATER AND THEY DO TWO HOUR ROTATIONS. >> MM-HMM. >> AND IT’S A LITTLE BIT DIFFERENT BECAUSE THE ASTRONAUTS ARE IN THE EMUs, SO YOU GUYS HAVE THE LIQUID COOLING GARMENT, AND YOU GUYS ARE AT A PRETTY GOOD TEMPERATURE. BUT FOR THEM, TWO HOURS IS A LONG TIME TO BE IN THE POOL AND THE TEMPERATURES, SO THEY DO THAT KIND OF ROTATION THING. 
>> YEAH, THAT’S TRUE. YEAH, IT’S ALSO PARTLY BECAUSE IT’S SUCH-- THEY’RE RESPONSIBLE FOR OUR SAFETY AND IT’S A VERY-- THEY’VE GOT TO BE VERY, VERY ATTENTIVE SO THEY GOT TO MAKE SURE THEY’RE SUPER ALERT. AND THERE ARE LIMITATIONS FOR HOW LONG YOU CAN DIVE ON THOSE TANKS. >> YEAH. YEAH. SO, I MEAN, ONE OF THE THINGS I THINK ABOUT WITH BEING AN ASTRONAUT AND PREPARING TO BE AN ASTRONAUT IS JUST HOW PHYSICALLY ABLE YOU HAVE TO BE. YOU HAVE TO-- BECAUSE YOU’RE TALK-- I MEAN, WE’RE TALKING ABOUT SPACESUITS, THESE ARE VERY HEAVY AND BEING ABLE TO SPEND SIX HOURS UNDERWATER IN A POOL, NOT EATING, YOU KNOW, I’D BE SO HUNGRY AFTER SIX HOURS. BUT, THINGS LIKE THAT, WHAT DO YOU DO TO STAY HEALTHY AND TO MAKE SURE YOU’RE PHYSICALLY AT YOUR PEAK TO MAKE SURE YOU’RE ABLE TO DO ALL OF THESE CRAZY THINGS-- SURVIVE IN RUSSIA IN THE WINTER, AND STUFF LIKE THAT? >> SO, I HAD A BOSS ONE TIME WHEN I FIRST-- EARLY IN MY ARMY CAREER, THAT SAID MAKE PHYSICAL TRAINING THE FIRST PRIORITY OF EVERY DAY. >> HMM. >> AND I THINK SOMETIMES WE DON’T GIVE OURSELVES PERMISSION TO DO THAT. WE MIGHT FEEL A LITTLE GUILTY, LIKE IT ALMOST SEEMS SELFISH. >> YEAH. >> BUT, BECAUSE MY BOSS TOLD ME THAT, IT REALLY IS SOMETHING THAT STUCK WITH ME AND I REALLY I CAN’T AFFORD TO ALWAYS MAKE IT THE FIRST PRIORITY OF EVERY DAY. >> MM-HMM. >> BUT, I’VE RECOGNIZED THAT IT REALLY DOES NEED TO BE A PRIORITY AND THE NICE THING ABOUT THIS JOB IS THE JOB GIVES US OPPORTUNITIES TO DO THAT. >> MM-HMM. >> IT’S GOT A GREAT FACILITY. WE’VE GOT GREAT TRAINERS AND WE’VE ALSO GOT-- IF WE INJURE OURSELVES WE’VE GOT PEOPLE THAT’LL HELP US GET REHABILITATED AS QUICKLY AS POSSIBLE. >> AND YOU GUYS-- THE ASTRONAUTS ACTUALLY HAVE THEIR OWN GYM HERE, RIGHT, AT THE JOHNSON SPACE CENTER? >> IT’S ACTUALLY NOT REALLY CALLED THE ASTRONAUT GYM. >> OH, OKAY. >> IT’S MORE DESIGNED TOWARDS A REHABILITATION FACILITY. >> OH. >> SO, WHEN PEOPLE COME BACK FROM SPACE, WE NEED-- THEY’VE GOT TO READAPT TO LIVING IN GRAVITY AGAIN. >> RIGHT. 
>> AND THAT’S REALLY THE PRIMARY FUNCTION. >> MM-HMM. >> IT WORKS OUT THAT AS A SECONDARY BENEFIT OF THAT IS WE GET SOME REALLY GOOD WORKOUT FACILITIES. >> THAT’S RIGHT. I REMEMBER TALKING WITH, AGAIN, SHANE KIMBROUGH A COUPLE WEEKS AGO, I THINK AT THIS POINT. YEAH, A COUPLE WEEKS AGO AND HE HAD-- I GOT THE CHANCE TO TALK WITH HIM JUST TWO DAYS AFTER HE LANDED. >> MM-HMM. >> AND HE WAS ALREADY WORKING OUT. IT’S CRAZY. I MEAN, HE WAS TALKING ABOUT BEING DIZZY JUST RIGHT AFTER LANDING, AND THEN, BAM, HE’S UP ON HIS FEET AND BEING REHABILITATED. >> MM-HMM. >> THAT’S CRAZY. SO, ARE THERE ANY OTHER SORT OF TRAINING ASPECTS THAT, LIKE, WE NEED TO KNOW BASED-- >> INTERESTING STUFF? >> YEAH, INTERESTING STUFF THAT YOU GO THROUGH THAT JUST, YOU KNOW, A CIVILIAN LIKE US DON’T REALLY GET TO EXPERIENCE. YOU KNOW, I KNOW ABOUT THE SURVIVAL TRAINING, ALL THE DIFFERENT THINGS THAT YOU DO TO PREPARE FOR BEING ON ORBIT, LEARNING ALL THE SYSTEMS, LEARNING HOW TO DO EVAs, ALL THESE DIFFERENT THINGS. >> YEAH, THERE’S ANOTHER FACILITY THAT I THINK IS REALLY, REALLY NEAT. IT’S CALLED THE VIRTUAL REALITY LAB HERE AT JOHNSON SPACE CENTER. >> OH. >> HAVE YOU EVER BEEN OVER THERE? >> YOU KNOW, I’VE SEEN IT. OH, IS THAT THE ONE WHERE YOU SIT IN THE CHAIR AND THEY PUT THE GOGGLES OVER YOU AND YOU HAVE THE HANDS-- YES, I’VE DONE THAT, YEAH. >> THAT’S AMAZING. THERE’S TWO THINGS THAT I’VE REALLY GOTTEN A KICK OUT OF LATELY DOING OVER THERE. ONE IS THE-- PRACTICING USING THE SAFER-- >> OH, OKAY. >> SO, EVERY TIME WE DO A SPACE WALK, WE’RE ALWAYS TETHERED TO THE SPACE STATION, SO THAT-- AND WE’RE LOCALLY TETHERED, SO IF YOU LET GO, YOU SHOULD STAY RIGHT WITHIN HANDS REACH OF SOMETHING. >> RIGHT. >> BUT ALSO ANOTHER, MUCH LONGER TETHER, JUST IN CASE WE MESS THAT UP, THAT WILL KEEP US SAFELY ATTACHED TO THE SPACE STATION. BUT IF WE MESS BOTH OF THOSE THINGS UP, THERE’S ALSO A THING CALLED THE SIMPLIFIED AID FOR EVA RESCUE. IT’S CALLED A SAFER. >> SAFER. 
>> IT LOOKS LIKE A BACKPACK THAT WE WEAR THAT’S BASICALLY A JET PACK. >> YEAH. >> BUT IT’S GOT VERY LIMITED RESOURCES AND YOU NEED TO KNOW HOW TO USE IT. SO, TO PRACTICE FLYING YOURSELF AS AN INDEPENDENT SPACECRAFT BACK TO THE SPACE STATION REQUIRES A LITTLE BIT OF TRAINING. SO, WHAT THEY DO IN THAT TRAINING IS THEY’LL TELL YOU, “OKAY, HERE’S WHERE WE’RE GOING TO START. YOU CAN SEE THE SPACE STATION RIGHT THERE.” I MEAN, YOU’RE WEARING THOSE GOGGLES, SO YOU CAN LOOK IN ANY DIRECTION AND YOU SEE EITHER STARS OR THE EARTH OR THE SPACE STATION. >> MM-HMM. >> AND THEN, THEY’LL SAY, “OKAY, WE’RE GOING TO START THE SIMULATION.” AND THEY’LL PUSH YOU OFF OF THE SPACE STATION. >> WHOA! >> SO THE SPACE STATION WILL BE SPINNING AND YOU’LL BE-- THE DISTANCE WILL BE INCREASING BETWEEN YOU AND THE SPACE STATION. >> SO YOU’RE SORT OF TUMBLING IN THIS SIMULATION, RIGHT? >> YES, ABSOLUTELY. >> OH, WHOA! >> AND YOU HAVE TO DO THAT BECAUSE IT TAKES A LITTLE BIT OF TIME FOR-- THEY KNOW THAT IT TAKES SOME TIME TO DEPLOY THE SAFER AND THE HAND CONTROLLERS AND THINGS LIKE THAT. >> OKAY. >> SO, MAYBE TEN SECONDS. I CAN’T REMEMBER EXACTLY. >> MM-HMM. >> AND THEY’LL TELL YOU-- BECAUSE, YOU’RE INITIALLY-- THEY DON’T HAVE A MOCK UP WHERE YOU HAVE TO ACTUALLY DEPLOY THE SAFER. YOU START OFF WITH HOLDING IT IN YOUR HANDS. >> OH. >> BECAUSE THEY KNOW IT’S GOING TO TAKE SOME TIME, THEY DON’T LET YOU START IT RIGHT AWAY. >> MAKES SENSE, OKAY. >> SO THEY’LL SAY, “OKAY, NOW YOU CAN START IT.” BUT, THE FIRST THING YOU’VE GOT TO DO IS CALL THE GROUND AND SAY, “HEY, THIS IS EV2. I’M NOT CONNECTED TO THE SPACE STATION. I’M HEADING NADIR AND I’M DEPLOYING THE SAFER.” WHICH, YOU CAN IMAGINE, WOULD BE A VERY UNCOMFORTABLE SITUATION. >> OH, YEAH. YEAH, THAT’S A VERY CALM WAY OF SAYING, “HEY, I’M PLUMMETING TOWARDS EARTH, BY THE WAY.” >> AND IT’S A PRETTY SLOW SPEED, THANKFULLY. >> THAT’S TRUE. >> BECAUSE IT WOULD HAVE TO BE A SPEED WHERE YOU PUSHED YOURSELF OFF. >> OKAY, OKAY. 
>> BUT THE SAFER’S REALLY NEAT. ONCE YOU DEPLOY IT, IT WILL STOP ITSELF. SO, YOU MIGHT BE SPINNING, BUT ONCE YOU-- IT’S GOT SENSORS, SO IT WILL STOP ALL THE ROTATIONS. SO, YOU’LL BE FIXED IN ONE LOCATION. IT MIGHT BE LOOKING AWAY FROM THE SPACE STATION, BUT AT LEAST YOU’RE NOT ROTATING ANYMORE. AND THEN WE’RE TRAINED FIRST TO YAW, TO FIND THE SPACE STATION. >> OKAY. >> AND THEN-- SO WE START THAT YAW AND THEN ONCE YOU GET TO THE RIGHT STOP PLACE, THEN YOU PRESS A BUTTON AND IT’LL STOP THAT ROTATION AGAIN. >> FANCY. >> BASICALLY, YOU HAVE A LITTLE BIT OF AN IMPULSE. DON’T USE UP MUCH OF THE RESOURCES. >> RIGHT. >> YOU WAIT, BE PATIENT, WAIT FOR THE SPACE STATION TO BE LINED UP, AND THEN YOU STOP IT, AND THEN YOU CAN ADJUST YOUR PITCH. >> OKAY. >> GIVE IT JUST A LITTLE BIT, BE PATIENT, WAIT SO YOU’RE JUST LINED UP. AND THEN YOU CHANGE IT FROM ADJUSTING ROTATIONS TO ADJUSTING THE TRANSLATIONS. >> OKAY. >> SO, IDEALLY, AT THAT POINT, YOU’RE LINED UP EXACTLY WHERE YOU WANT TO GO, WHICH SHOULD BE EXACTLY WHERE YOU LEFT FROM, AND THEN YOU JUST GIVE IT A POSITIVE X. SO YOU START TRANSLATING DIRECTLY TOWARDS IT, JUST A LITTLE BIT. AND, IF YOUR AIM IS GOOD, YOU SHOULDN’T HAVE TO MAKE ANY ADJUSTMENTS AND YOU HAVE PLENTY OF RESOURCES TO GET BACK. >> ALL RIGHT. >> IF YOU MESS UP-- MAYBE YOU FORGOT HOW TO CONTROL IT-- YOU COULD BURN THROUGH HALF OF YOUR STUFF AND JUST COMPLETELY MISS THE SPACE STATION. >> OKAY, SO, IT’S NOT LIKE A JETPACK HOW YOU WOULD IMAGINE IN LIKE A SCI-FI MOVIE, WHERE YOU’RE JUST KIND OF ZOOMING AROUND. IT’S STOP, PRESS A BUTTON, TURN, PRESS A BUTTON, LEAN FORWARD, OR WHATEVER IT IS. >> YOU DON’T WANT TO OVERDO ANY OF THOSE THINGS. >> RIGHT. >> YOU WANT TO DO EVERYTHING-- YOU WANT TO BE VERY CALM ABOUT IT. >> VERY METHODICAL, YEAH. >> AND THEY’LL DO IT AT A VARIETY OF LOCATIONS. THEY’LL DO IT FROM DIFFERENT VELOCITIES OF SEPARATION. >> OKAY. >> SO, THAT’S REALLY GOOD TRAINING. >> YEAH. >> ANOTHER THING-- DO YOU HAVE ANY QUESTIONS ABOUT THAT? 
>> NO-- WELL, I MEAN, THE ONE THING I WAS GOING TO ASK WAS: DO YOU GUYS HAVE A COMPETITION TO SEE HOW ACCURATE YOU CAN GO ON THAT FIRST-- BECAUSE YOU SAID YOU’VE GOT TO LINE UP AND THE HOPE IS THAT YOU PRESS THE BUTTON ONCE AND THEN YOU GO RIGHT WHERE-- DO YOU GUYS HAVE COMPETITIONS TO SEE WHO’S THE MOST ACCURATE? >> I HAVEN’T EVER WALKED OUT OF THERE AND TRIED TO COMPARE HOW MUCH PROPELLANT I HAD LEFT TO SOMEBODY ELSE. BUT MAYBE THAT MIGHT BE A GOOD THING TO DO IN THE FUTURE. WE’LL HAVE LIKE AN ASTRONAUT OLYMPICS. >> YEAH, THAT WOULD BE FUN. >> THAT WOULD BE REALLY FUN. >> YEAH! >> OR REALLY HUMBLING. >> YEAH! GO THROUGH THE TRAINING AND SEE-- DO LIKE LITTLE THINGS LIKE THAT. >> “HOW’D YOU SCORE?” >> “I HAD THIS MUCH PROPELLANT LEFT.” >> “OOH! I HAD THIS MUCH.” >> NO, BUT GO ON. YOU WERE GOING TO SAY SOMETHING ELSE. >> OH, ANOTHER THING THAT I THOUGHT WAS REALLY INTERESTING IN VIRTUAL REALITY LAB IS THEY TRAIN YOU HOW TO DO MASS HANDLING. SO, YOU PUT ON THOSE GLASSES AGAIN. >> OKAY. >> THIS TIME, AGAIN, YOU’RE SITTING IN THE CHAIR. BUT THEY HAVE, BASICALLY, HANDLES, LIKE WE WOULD HAVE FOR AN ORU. >> MM-HMM. >> IT COULD BE SOMETHING THAT, IN SPACE, HAS A MASS OF 1,000 KILOGRAMS. IT COULD BE SOMETHING THAT’S 200 KILOGRAMS. BUT THEY CAN SET UP THE COMPUTER, THE SIMULATION TO OPERATE THAT WAY. AND IT’S ATTACHED TO A BUNCH OF STRINGS IN EACH DIRECTION. >> OH. >> SO, YOU CAN START IT MOVING AND YOU’LL FEEL THE FORCE. AS YOU GET IT MOVING-- YOU CAN IMAGINE IF IT’S A TON-- >> RIGHT. >> AS YOU GET IT MOVING, IT’S HARDER TO GET IT TO STOP MOVING. AND MAYBE IT’S HARD TO GET-- >> OH. >> SO THINGS ARE, WE CALL IT, WEIGHTLESS. >> RIGHT. >> BUT THEY HAVE A LOT OF INERTIA. THEY HAVE THE SAME AMOUNT OF INERTIA AS THEY HAVE ON THE GROUND. >> MM-HMM. >> IF SOMETHING WEIGHS A LOT, IT’S GOING TO TAKE MORE FORCE TO GET IT STARTED MOVING-- >> MM-HMM. >> --AND MORE FORCE TO STOP IT MOVING. 
AND IT’S A REALLY INTERESTING-- IT’S THE CLOSEST TO DEALING WITH WEIGHTLESSNESS THAT I’VE EVER FELT, BECAUSE I HAD A LARGE OBJECT THAT I NEEDED TO LINE UP OVER SOME PINS. AND THEN, ONCE I GOT IT OVER THE PINS, I HAD TO LOWER IT DOWN. THE FIRST TIME I DID IT, I THINK, AS MOST PEOPLE WOULD, YOU HAVE A TENDENCY TO WANT TO BE MOVING IT ALL THE TIME. SO, I GRABBED THIS OBJECT. IT SEEMS REALLY HEAVY. I GET IT STARTED MOVING, BUT I KIND OF KEEP PUSHING IT. I’M USING MY STRENGTH TO KEEP IT MOVING. >> RIGHT. >> AND THEN, I HAD TO USE EVEN MORE STRENGTH TO GET IT TO STOP MOVING. THE SECOND TIME I DID IT, I REALIZED THAT ONCE I GOT IT STARTED MOVING I COULD ALMOST-- I COULD TAKE MY HAND-- BECAUSE IT WAS ALREADY MOVING. NOTHING’S GOING TO STOP IT FROM MOVING. >> MM-HMM. >> SO, ONCE I GOT IT JUST MOVING REALLY SLOWLY I JUST PUT MY FINGERTIPS ON THOSE HANDLES AND THEY KEPT MOVING. >> OH. >> AND THEN I JUST-- VERY RELAXED AND VERY CALMLY WAITED FOR IT TO GET TO THE RIGHT SPOT. AND I GAVE IT VERY LITTLE PRESSURE TO STOP IT, THIS MASSIVE OBJECT. >> WOW! >> AND THEN I-- WHEN I WANTED TO MOVE IT DOWN-- I JUST GAVE IT A LITTLE BIT OF A NUDGE. AS SOON AS I KNEW THAT IT WAS MOVING IN THE RIGHT DIRECTION, I JUST USED MY FINGERTIPS AND LET IT GO. AND I SUSPECT, WHEN YOU’RE IN SPACE, DOING A SPACE WALK THAT, BECAUSE WE’RE IN THE POOL, YOU’RE GOING TO HAVE THIS TENDENCY, WHEN WE WERE TRAINING AS A NEWBIE, TO WANT TO FEEL LIKE YOU’VE GOT TO CONTINUOUSLY FORCE YOURSELF TO KEEP MOVING. >> RIGHT. >> BUT ONCE YOU START GETTING YOURSELF TO MOVE IN THE RIGHT DIRECTION, YOU JUST HAVE TO USE FINGERTIP PRESSURE TO TEND YOURSELF AND MAKE SURE YOU’RE CONTINUING TO DO THE RIGHT THING. >> SO, THAT’S THE NICE PAIRING BETWEEN DOING SIMULATION RUNS IN THE NEUTRAL BUOYANCY LABORATORY AND THEN GOING TO THE VIRTUAL REALITY AND DOING-- YOU JUST GET A DIFFERENT PERSPECTIVE. >> EXACTLY. IN THE NBL-- IN THE NEUTRAL BUOYANCY LAB-- >> YEAH. >> YOU CAN MOVE 100 METERS. >> MM-HMM. 
>> IN THE VIRTUAL REALITY LAB, YOU CAN MOVE ABOUT A FOOT. YOU CAN MOVE SOMETHING ABOUT A FOOT. SO, IT’S REALLY JUST A FINE TUNING OF THINGS. >> IT’S THE LITTLE THINGS. BUT THEY’RE REALLY IMPORTANT, RIGHT? >> ABSOLUTELY. >> KNOWING THAT IF YOU TRY TO TUG THIS BIG, MASSIVE OBJECT REALLY, REALLY FAST, IT’S GOING TO BE REALLY HARD TO STOP. >> YES. >> THOSE ARE LITTLE THINGS BUT, ALSO, EXTREMELY IMPORTANT. ALL RIGHT. WELL, MARK, THANKS FOR TAKING THE TIME TO ACTUALLY SIT DOWN AND TALK THROUGH SOME OF THE ASTRONAUT TRAINING AND WHAT IT WAS LIKE TO BE SELECTED AS AN ASTRONAUT, ALL OF THE ABOVE. I KNOW YOU’RE VERY BUSY, SO I KNOW THIS IS A BIG CHUNK OF TIME FOR YOU. SO, THAT WAS AWESOME. BUT, FOR THE LISTENERS, IF YOU WANT TO KNOW MORE, AND FOLLOW MARK’S JOURNEY ONCE HE GOES TO THE INTERNATIONAL SPACE STATION, STAY TUNED UNTIL AFTER THE MUSIC CLOSING CREDITS THAT WE HAVE HERE AND WE’LL TELL YOU EXACTLY WHERE YOU NEED TO GO. SO, THANKS AGAIN, MARK, FOR COMING ON THE SHOW. >> THANK YOU. [ MUSIC ] >> HOUSTON, GO AHEAD. >> I’M ON THE SPACE SHUTTLE. >> ROGER, ZERO-G AND I FEEL FINE. >> SHUTTLE HAS CLEARED THE TOWER. >> WE CAME IN PEACE FOR ALL MANKIND. >> IT’S ACTUALLY A HUGE HONOR TO BREAK THE RECORD LIKE THIS. >> NOT BECAUSE THEY ARE EASY, BUT BECAUSE THEY ARE HARD. >> HOUSTON, WELCOME TO SPACE. >> HEY, THANKS FOR STICKING AROUND. SO, TODAY WE TALKED WITH MARK VANDE HEI. HE’S GOING TO BE LAUNCHING TO THE INTERNATIONAL SPACE STATION LATER THIS YEAR OR MAYBE RIGHT NOW, DEPENDING ON WHEN THIS PODCAST GETS POSTED. BUT MARK IS ON SOCIAL MEDIA. HE’S ON TWITTER @ASTRO_SABOT. THAT’S S-A-B-O-T, AND YOU CAN FOLLOW HIS JOURNEY ABOARD THE INTERNATIONAL SPACE STATION AS HE TALKS ABOUT HIS DAY-TO-DAY LIFE AND MAYBE TAKES SOME PHOTOS FROM THAT VANTAGE POINT 250 MILES ABOVE THE EARTH. YOU CAN ALSO SEE HIS JOURNEY AT NASA.GOV/ISS. WE HAVE UPDATES ALL THE TIME ON WHAT’S GOING ON ABOARD THE INTERNATIONAL SPACE STATION. 
SOME OF THE RESEARCH STUDIES AND EXPERIMENTS THAT MARK WILL BE TAKING PART OF WHILE HE’S ABOARD. ON SOCIAL MEDIA, WE’RE VERY ACTIVE. JUST GO TO FACEBOOK, TWITTER, OR INSTAGRAM. ON FACEBOOK IT’S INTERNATIONAL SPACE STATION, ON TWITTER IT’S @SPACE_STATION, AND ON INSTAGRAM IT’S @ISS. WE’LL BE FOLLOWING MARK THROUGHOUT HIS JOURNEY AND POSTING PICTURES OF HIM AND SOME OF THE THINGS THAT HE’S DOING WHILE ON THAT ORBITING COMPLEX. YOU CAN ALSO USE THE #ASKNASA ON ANY ONE OF THOSE PLATFORMS AND SUBMIT AN IDEA FOR THE PODCAST, MAYBE ASK ANY QUESTIONS, AND WE’LL MAKE SURE TO ANSWER IT IN A LATER PODCAST. THIS PODCAST WAS RECORDED ON MAY THE 4th. THAT’S RIGHT, WE RECORDED TWO PODCASTS ON MAY THE 4th. MAY THE FOURTH BE WITH YOU. SUPER LATE. I’M STILL GOING TO SAY IT. AND SPECIAL THANKS TO JOHN STOLL, ALEX PERRYMAN, PAT RYAN, AND JOHN STREETER FOR MAKING THIS PODCAST HAPPEN. AND THANKS AGAIN TO MR. MARK VANDE HEI FOR COMING ON THE SHOW. WE’LL BE BACK NEXT WEEK.
Substrate specificity of low-molecular mass bacterial DD-peptidases.
Nemmara, Venkatesh V; Dzhekieva, Liudmila; Sarkar, Kumar Subarno; Adediran, S A; Duez, Colette; Nicholas, Robert A; Pratt, R F
2011-11-22
The bacterial DD-peptidases or penicillin-binding proteins (PBPs) catalyze the formation and regulation of cross-links in peptidoglycan biosynthesis. They are classified into two groups, the high-molecular mass (HMM) and low-molecular mass (LMM) enzymes. The latter group, which is subdivided into classes A-C (LMMA, -B, and -C, respectively), is believed to catalyze DD-carboxypeptidase and endopeptidase reactions in vivo. To date, the specificity of their reactions with particular elements of peptidoglycan structure has not, in general, been defined. This paper describes the steady-state kinetics of hydrolysis of a series of specific peptidoglycan-mimetic peptides, representing various elements of stem peptide structure, catalyzed by a range of LMM PBPs (the LMMA enzymes, Escherichia coli PBP5, Neisseria gonorrhoeae PBP4, and Streptococcus pneumoniae PBP3, and the LMMC enzymes, the Actinomadura R39 DD-peptidase, Bacillus subtilis PBP4a, and N. gonorrhoeae PBP3). The R39 enzyme (LMMC), like the previously studied Streptomyces R61 DD-peptidase (LMMB), specifically and rapidly hydrolyzes stem peptide fragments with a free N-terminus. In accord with this result, the crystal structures of the R61 and R39 enzymes display a binding site specific to the stem peptide N-terminus. These are water-soluble enzymes, however, with no known specific function in vivo. On the other hand, soluble versions of the remaining enzymes of those noted above, all of which are likely to be membrane-bound and/or associated in vivo and have been assigned particular roles in cell wall biosynthesis and maintenance, show little or no specificity for peptides containing elements of peptidoglycan structure. Peptidoglycan-mimetic boronate transition-state analogues do inhibit these enzymes but display notable specificity only for the LMMC enzymes, where, unlike peptide substrates, they may be able to effectively induce a specific active site structure. 
The manner in which LMMA (and HMM) DD-peptidases achieve substrate specificity, both in vitro and in vivo, remains unknown. © 2011 American Chemical Society
Oehr, Lucy; Anderson, Jacqueline
2017-11-01
To undertake a systematic review and meta-analysis of the relationship between microstructural damage and cognitive function after hospitalized mixed-mechanism (HMM) mild traumatic brain injury (mTBI). PsycInfo, EMBASE, and MEDLINE were used to find relevant empirical articles published between January 2002 and January 2016. Studies that examined the specific relationship between diffusion tensor imaging (DTI) and cognitive test performance were included. The final sample comprised previously medically and psychiatrically healthy adults with HMM mTBI. Specific data were extracted including mTBI definitional criteria, descriptive statistics, outcome measures, and specific results of associations between DTI metrics and cognitive test performance. Of the 248 original articles retrieved and reviewed, 8 studies met all inclusion criteria and were included in the meta-analysis. The meta-analysis revealed statistically significant associations between reduced white matter integrity and poor performance on measures of attention (fractional anisotropy [FA]: d=.413, P<.001; mean diffusivity [MD]: d=-.407, P=.001), memory (FA: d=.347, P<.001; MD: d=-.568, P<.001), and executive function (FA: d=.246, P<.05), which persisted beyond 1 month postinjury. The findings from the meta-analysis provide clear support for an association between in vivo markers of underlying neuropathology and cognitive function after mTBI. Furthermore, these results demonstrate clearly for the first time that in vivo markers of structural neuropathology are associated with cognitive dysfunction within the domains of attention, memory, and executive function. These findings provide an avenue for future research to examine the causal relationship between mTBI-related neuropathology and cognitive dysfunction. 
Furthermore, they have important implications for clinical management of patients with mTBI because they provide a more comprehensive understanding of factors that are associated with cognitive dysfunction after mTBI. Copyright © 2017 American Congress of Rehabilitation Medicine. Published by Elsevier Inc. All rights reserved.
Nuclear Thermal Propulsion Development Risks
NASA Technical Reports Server (NTRS)
Kim, Tony
2015-01-01
There are clear advantages of development of a Nuclear Thermal Propulsion (NTP) for a crewed mission to Mars. NTP for in-space propulsion enables more ambitious space missions by providing high thrust at high specific impulse (Isp approximately 900 sec) that is 2 times the best theoretical performance possible for chemical rockets. Missions can be optimized for maximum payload capability to take more payload with reduced total mass to orbit; saving cost on reduction of the number of launch vehicles needed. Or missions can be optimized to minimize trip time significantly to reduce the deep space radiation exposure to the crew. NTR propulsion technology is a game changer for space exploration to Mars and beyond. However, 'NUCLEAR' is a word that is feared and vilified by some groups and the hostility towards development of any nuclear systems can meet great opposition by the public as well as from national leaders and people in authority. The public often associates the 'nuclear' word with weapons of mass destruction. The development of NTP is at risk due to unwarranted public fears and clear honest communication of nuclear safety will be critical to the success of the development of the NTP technology. Reducing cost to NTP development is critical to its acceptance and funding. In the past, highly inflated cost estimates of a full-scale development nuclear engine due to Category I nuclear security requirements and costly regulatory requirements have put the NTP technology as a low priority. Innovative approaches utilizing low enriched uranium (LEU). Even though NTP can be a small source of radiation to the crew, NTP can facilitate significant reduction of crew exposure to solar and cosmic radiation by reducing trip times by 3-4 months. Current Human Mars Mission (HMM) trajectories with conventional propulsion systems and fuel-efficient transfer orbits exceed astronaut radiation exposure limits. 
Utilizing extra propellant from one additional SLS launch and available energy in the NTP fuel, HMM radiation exposure can be reduced significantly.
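To make the specific-impulse comparison in this abstract concrete, the following is a minimal sketch using the ideal (Tsiolkovsky) rocket equation. The 900 s figure is from the abstract, and 450 s for chemical propulsion is only what the abstract's "2 times" statement implies. The delta-v value and all function names are hypothetical, chosen purely for illustration, and are not taken from the report.

```python
import math

G0 = 9.80665  # standard gravity in m/s^2, converts Isp (s) to exhaust velocity (m/s)


def exhaust_velocity(isp_s):
    """Effective exhaust velocity in m/s for a specific impulse given in seconds."""
    return isp_s * G0


def mass_ratio(delta_v, isp_s):
    """Ideal initial-to-final mass ratio for one burn (Tsiolkovsky rocket equation)."""
    return math.exp(delta_v / exhaust_velocity(isp_s))


# ASSUMPTION: a hypothetical 4 km/s burn, not a value from the abstract.
ASSUMED_DELTA_V = 4000.0

ntp_ratio = mass_ratio(ASSUMED_DELTA_V, 900.0)       # Isp quoted in the abstract
chemical_ratio = mass_ratio(ASSUMED_DELTA_V, 450.0)  # half of 900 s, as the abstract implies

print(f"NTP mass ratio:      {ntp_ratio:.2f}")
print(f"Chemical mass ratio: {chemical_ratio:.2f}")
```

Because the mass ratio depends exponentially on delta-v divided by exhaust velocity, halving the specific impulse squares the required mass ratio for the same burn. This is one way to read the abstract's point about reduced total mass to orbit.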
Change and Anomaly Detection in Real-Time GPS Data
NASA Astrophysics Data System (ADS)
Granat, R.; Pierce, M.; Gao, X.; Bock, Y.
2008-12-01
The California Real-Time Network (CRTN) is currently generating real-time GPS position data at a rate of 1-2Hz at over 80 locations. The CRTN data presents the possibility of studying dynamical solid earth processes in a way that complements existing seismic networks. To realize this possibility we have developed a prototype system for detecting changes and anomalies in the real-time data. Through this system, we can correlate changes in multiple stations in order to detect signals with geographical extent. Our approach involves developing a statistical model for each GPS station in the network, and then using those models to segment the time series into a number of discrete states described by the model. We use a hidden Markov model (HMM) to describe the behavior of each station; fitting the model to the data requires neither labeled training examples nor a priori information about the system. As such, HMMs are well suited to this problem domain, in which the data remains largely uncharacterized. There are two main components to our approach. The first is the model fitting algorithm, regularized deterministic annealing expectation-maximization (RDAEM), which provides robust, high-quality results. The second is a web service infrastructure that connects the data to the statistical modeling analysis and allows us to easily present the results of that analysis through a web portal interface. This web service approach facilitates the automatic updating of station models to keep pace with dynamical changes in the data. Our web portal interface is critical to the process of interpreting the data. A Google Maps interface allows users to visually interpret state changes not only on individual stations but across the entire network. Users can drill down from the map interface to inspect detailed results for individual stations, download the time series data, and inspect fitted models. 
Alternatively, users can use the web portal to look at the evolution of changes on the network by moving backwards and forwards in time.
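The segmentation step described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' system: it assumes the HMM parameters are already fitted (the RDAEM fitting step is not reproduced here), uses one-dimensional Gaussian emissions, and decodes states with the standard Viterbi algorithm, which the abstract does not name. The function names, parameter values, and synthetic series are all hypothetical.

```python
import math


def viterbi_gaussian(obs, means, stds, trans, start):
    """Most likely state path for a 1-D Gaussian-emission HMM (log domain)."""
    n_states = len(means)

    def log_emit(x, k):
        # Log density of observation x under state k's Gaussian.
        z = (x - means[k]) / stds[k]
        return -0.5 * z * z - math.log(stds[k] * math.sqrt(2.0 * math.pi))

    # delta[k]: best log-probability of any path that ends in state k.
    delta = [math.log(start[k]) + log_emit(obs[0], k) for k in range(n_states)]
    back = []  # back[t][k]: best predecessor of state k at step t + 1
    for x in obs[1:]:
        prev, delta, ptr = delta, [], []
        for k in range(n_states):
            best = max(range(n_states),
                       key=lambda j: prev[j] + math.log(trans[j][k]))
            ptr.append(best)
            delta.append(prev[best] + math.log(trans[best][k]) + log_emit(x, k))
        back.append(ptr)

    # Trace the best path backwards from the most likely final state.
    state = max(range(n_states), key=lambda k: delta[k])
    path = [state]
    for ptr in reversed(back):
        state = ptr[state]
        path.append(state)
    path.reverse()
    return path


def change_points(path):
    """Indices at which the decoded state differs from the previous sample."""
    return [t for t in range(1, len(path)) if path[t] != path[t - 1]]


# Synthetic position series with a step offset (hypothetical data, arbitrary units).
series = [0.1, -0.2, 0.0, 0.3, -0.1, 5.2, 4.9, 5.1, 4.8, 5.0]
path = viterbi_gaussian(
    series,
    means=[0.0, 5.0],               # ASSUMED fitted state means
    stds=[0.5, 0.5],                # ASSUMED fitted state standard deviations
    trans=[[0.95, 0.05], [0.05, 0.95]],
    start=[0.5, 0.5],
)
print(path, change_points(path))
```

In a setting like the one described, a decoder of this kind would be run per station, and state changes that coincide across nearby stations could then be compared. How the actual system correlates stations is not specified in the abstract.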
Electrical Power Systems for NASA's Space Transportation Program
NASA Technical Reports Server (NTRS)
Lollar, Louis F.; Maus, Louis C.
1998-01-01
Marshall Space Flight Center (MSFC) is the National Aeronautics and Space Administration's (NASA) lead center for space transportation systems development. These systems include earth to orbit launch vehicles, as well as vehicles for orbital transfer and deep space missions. The tasks for these systems include research, technology maturation, design, development, and integration of space transportation and propulsion systems. One of the key elements in any transportation system is the electrical power system (EPS). Every transportation system has to have some form of electrical power and the EPS for each of these systems tends to be as varied and unique as the missions they are supporting. The Preliminary Design Office (PD) at MSFC is tasked to perform feasibility analyses and preliminary design studies for new projects, particularly in the space transportation systems area. All major subsystems, including electrical power, are included in each of these studies. Three example systems being evaluated in PD at this time are the Liquid Fly Back Booster (LFBB) system, the Human Mission to Mars (HMM) study, and a tether based flight experiment called the Propulsive Small Expendable Deployer System (ProSEDS). These three systems are in various stages of definition in the study phase.
Real-time classification of auditory sentences using evoked cortical activity in humans
NASA Astrophysics Data System (ADS)
Moses, David A.; Leonard, Matthew K.; Chang, Edward F.
2018-06-01
Objective. Recent research has characterized the anatomical and functional basis of speech perception in the human auditory cortex. These advances have made it possible to decode speech information from activity in brain regions like the superior temporal gyrus, but no published work has demonstrated this ability in real-time, which is necessary for neuroprosthetic brain-computer interfaces. Approach. Here, we introduce a real-time neural speech recognition (rtNSR) software package, which was used to classify spoken input from high-resolution electrocorticography signals in real-time. We tested the system with two human subjects implanted with electrode arrays over the lateral brain surface. Subjects listened to multiple repetitions of ten sentences, and rtNSR classified what was heard in real-time from neural activity patterns using direct sentence-level and HMM-based phoneme-level classification schemes. Main results. We observed single-trial sentence classification accuracies of 90% or higher for each subject with less than 7 minutes of training data, demonstrating the ability of rtNSR to use cortical recordings to perform accurate real-time speech decoding in a limited vocabulary setting. Significance. Further development and testing of the package with different speech paradigms could influence the design of future speech neuroprosthetic applications.
BAGEL4: a user-friendly web server to thoroughly mine RiPPs and bacteriocins.
van Heel, Auke J; de Jong, Anne; Song, Chunxu; Viel, Jakob H; Kok, Jan; Kuipers, Oscar P
2018-05-21
Interest in secondary metabolites such as RiPPs (ribosomally synthesized and posttranslationally modified peptides) is increasing worldwide. To facilitate the research in this field we have updated our mining web server. BAGEL4 is faster than its predecessor and is now fully independent from ORF-calling. Gene clusters of interest are discovered using the core-peptide database and/or through HMM motifs that are present in associated context genes. The databases used for mining have been updated and extended with literature references and links to UniProt and NCBI. Additionally, we have included automated promoter and terminator prediction and the option to upload RNA expression data, which can be displayed along with the identified clusters. Further improvements include the annotation of the context genes, which is now based on a fast blast against the prokaryote part of the UniRef90 database, and the improved web-BLAST feature that dynamically loads structural data such as internal cross-linking from UniProt. Overall BAGEL4 provides the user with more information through a user-friendly web-interface which simplifies data evaluation. BAGEL4 is freely accessible at http://bagel4.molgenrug.nl.
The Essentials of Protein Import in the Degenerate Mitochondrion of Entamoeba histolytica
Dolezal, Pavel; Dagley, Michael J.; Kono, Maya; Wolynec, Peter; Likić, Vladimir A.; Foo, Jung Hock; Sedinová, Miroslava; Tachezy, Jan; Bachmann, Anna; Bruchhaus, Iris; Lithgow, Trevor
2010-01-01
Several essential biochemical processes are situated in mitochondria. The metabolic transformation of mitochondria in distinct lineages of eukaryotes created proteomes ranging from thousands of proteins to what appears to be a much simpler scenario. In the case of Entamoeba histolytica, tiny mitochondria known as mitosomes have undergone extreme reduction. Only recently a single complete metabolic pathway of sulfate activation has been identified in these organelles. The E. histolytica mitosomes do not produce ATP needed for the sulfate activation pathway and for three molecular chaperones, Cpn60, Cpn10 and mtHsp70. The already characterized ADP/ATP carrier would thus be essential to provide cytosolic ATP for these processes, but how the equilibrium of inorganic phosphate could be maintained was unknown. Finally, how the mitosomal proteins are translocated to the mitosomes had remained unclear. We used a hidden Markov model (HMM) based search of the E. histolytica genome sequence to discover candidate (i) mitosomal phosphate carrier complementing the activity of the ADP/ATP carrier and (ii) membrane-located components of the protein import machinery that includes the outer membrane translocation channel Tom40 and membrane assembly protein Sam50. Using in vitro and in vivo systems we show that E. histolytica contains a minimalist set-up of the core import components in order to accommodate a handful of mitosomal proteins. The anaerobic and parasitic lifestyle of E. histolytica has produced one of the simplest known mitochondrial compartments of all eukaryotes. Comparisons with mitochondria of another amoeba, Dictyostelium discoideum, emphasize just how dramatic the reduction of the protein import apparatus was after the loss of archetypal mitochondrial functions in the mitosomes of E. histolytica. PMID:20333239
Ye, Weixing; Zhu, Lei; Liu, Yingying; Crickmore, Neil; Peng, Donghai; Ruan, Lifang; Sun, Ming
2012-07-01
We have designed a high-throughput system for the identification of novel crystal protein genes (cry) from Bacillus thuringiensis strains. The system was developed with two goals: (i) to acquire the mixed plasmid-enriched genomic sequence of B. thuringiensis using next-generation sequencing biotechnology, and (ii) to identify cry genes with a computational pipeline (using BtToxin_scanner). In our pipeline method, we employed three different kinds of well-developed prediction methods, BLAST, hidden Markov model (HMM), and support vector machine (SVM), to predict the presence of Cry toxin genes. The pipeline proved to be fast (average speed, 1.02 Mb/min for proteins and open reading frames [ORFs] and 1.80 Mb/min for nucleotide sequences), sensitive (it detected 40% more protein toxin genes than a keyword extraction method using genomic sequences downloaded from GenBank), and highly specific. Twenty-one strains from our laboratory's collection were selected based on their plasmid pattern and/or crystal morphology. The plasmid-enriched genomic DNA was extracted from these strains and mixed for Illumina sequencing. The sequencing data were de novo assembled, and a total of 113 candidate cry sequences were identified using the computational pipeline. Twenty-seven candidate sequences were selected on the basis of their low level of sequence identity to known cry genes, and eight full-length genes were obtained with PCR. Finally, three new cry-type genes (primary ranks) and five cry holotypes, which were designated cry8Ac1, cry7Ha1, cry21Ca1, cry32Fa1, and cry21Da1 by the B. thuringiensis Toxin Nomenclature Committee, were identified. The system described here is both efficient and cost-effective and can greatly accelerate the discovery of novel cry genes.
An accurate algorithm for the detection of DNA fragments from dilution pool sequencing experiments.
Bansal, Vikas
2018-01-01
The short read lengths of current high-throughput sequencing technologies limit the ability to recover long-range haplotype information. Dilution pool methods for preparing DNA sequencing libraries from high molecular weight DNA fragments enable the recovery of long DNA fragments from short sequence reads. These approaches require computational methods for identifying the DNA fragments using aligned sequence reads and assembling the fragments into long haplotypes. Although a number of computational methods have been developed for haplotype assembly, the problem of identifying DNA fragments from dilution pool sequence data has not received much attention. We formulate the problem of detecting DNA fragments from dilution pool sequencing experiments as a genome segmentation problem and develop an algorithm that uses dynamic programming to optimize a likelihood function derived from a generative model for the sequence reads. This algorithm uses an iterative approach to automatically infer the mean background read depth and the number of fragments in each pool. Using simulated data, we demonstrate that our method, FragmentCut, has 25-30% greater sensitivity compared with an HMM based method for fragment detection and can also detect overlapping fragments. On a whole-genome human fosmid pool dataset, the haplotypes assembled using the fragments identified by FragmentCut had greater N50 length, 16.2% lower switch error rate and 35.8% lower mismatch error rate compared with two existing methods. We further demonstrate the greater accuracy of our method using two additional dilution pool datasets. FragmentCut is available from https://bansal-lab.github.io/software/FragmentCut. Contact: vibansal@ucsd.edu. Supplementary data are available at Bioinformatics online.
Automated surgical skill assessment in RMIS training.
Zia, Aneeq; Essa, Irfan
2018-05-01
Manual feedback in basic robot-assisted minimally invasive surgery (RMIS) training can consume a significant amount of time from expert surgeons' schedules and is prone to subjectivity. In this paper, we explore the usage of different holistic features for automated skill assessment using only robot kinematic data and propose a weighted feature fusion technique for improving score prediction performance. Moreover, we also propose a method for generating 'task highlights' which can give surgeons more directed feedback regarding which segments had the most effect on the final skill score. We perform our experiments on the publicly available JHU-ISI Gesture and Skill Assessment Working Set (JIGSAWS) and evaluate four different types of holistic features from robot kinematic data: sequential motion texture (SMT), discrete Fourier transform (DFT), discrete cosine transform (DCT) and approximate entropy (ApEn). The features are then used for skill classification and exact skill score prediction. Along with using these features individually, we also evaluate the performance using our proposed weighted combination technique. The task highlights are produced using DCT features. Our results demonstrate that these holistic features outperform all previous Hidden Markov Model (HMM)-based state-of-the-art methods for skill classification on the JIGSAWS dataset. Also, our proposed feature fusion strategy significantly improves performance for skill score predictions, achieving up to 0.61 average Spearman correlation coefficient. Moreover, we provide an analysis of how the proposed task highlights can relate to different surgical gestures within a task. Holistic features capturing global information from robot kinematic data can successfully be used for evaluating surgeon skill in basic surgical tasks on the da Vinci robot. Using the framework presented can potentially allow for real-time score feedback in RMIS training and help surgical trainees have more focused training.
Identification of copy number variants in whole-genome data using Reference Coverage Profiles
Glusman, Gustavo; Severson, Alissa; Dhankani, Varsha; Robinson, Max; Farrah, Terry; Mauldin, Denise E.; Stittrich, Anna B.; Ament, Seth A.; Roach, Jared C.; Brunkow, Mary E.; Bodian, Dale L.; Vockley, Joseph G.; Shmulevich, Ilya; Niederhuber, John E.; Hood, Leroy
2015-01-01
The identification of DNA copy numbers from short-read sequencing data remains a challenge for both technical and algorithmic reasons. The raw data for these analyses are measured in tens to hundreds of gigabytes per genome; transmitting, storing, and analyzing such large files is cumbersome, particularly for methods that analyze several samples simultaneously. We developed a very efficient representation of depth of coverage (150–1000× compression) that enables such analyses. Current methods for analyzing variants in whole-genome sequencing (WGS) data frequently miss copy number variants (CNVs), particularly hemizygous deletions in the 1–100 kb range. To fill this gap, we developed a method to identify CNVs in individual genomes, based on comparison to joint profiles pre-computed from a large set of genomes. We analyzed depth of coverage in over 6000 high quality (>40×) genomes. The depth of coverage has strong sequence-specific fluctuations only partially explained by global parameters like %GC. To account for these fluctuations, we constructed multi-genome profiles representing the observed or inferred diploid depth of coverage at each position along the genome. These Reference Coverage Profiles (RCPs) take into account the diverse technologies and pipeline versions used. Normalization of the scaled coverage to the RCP followed by hidden Markov model (HMM) segmentation enables efficient detection of CNVs and large deletions in individual genomes. Use of pre-computed multi-genome coverage profiles improves our ability to analyze each individual genome. We make available RCPs and tools for performing these analyses on personal genomes. We expect the increased sensitivity and specificity for individual genome analysis to be critical for achieving clinical-grade genome interpretation. PMID:25741365
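As an illustration of the HMM segmentation step mentioned in the abstract above, the following is a minimal sketch and not the authors' implementation. It assumes the coverage has already been normalized so that diploid bins sit near 1.0, and it decodes the most likely state per bin with the Viterbi algorithm. The function name `viterbi_segment`, the three-state layout (loss, normal, gain), the state means, the shared standard deviation and the self-transition probability are all hypothetical choices made for the example.

```python
import math

def viterbi_segment(ratios, means=(0.5, 1.0, 1.5), sd=0.15, stay=0.99):
    """Most likely state per bin (0 = loss, 1 = normal, 2 = gain).

    ratios: non-empty list of normalized coverage values (assumed input).
    means, sd, stay: illustrative parameters, not taken from the paper.
    """
    n_states = len(means)
    switch = (1.0 - stay) / (n_states - 1)
    log_trans = [[math.log(stay if i == j else switch) for j in range(n_states)]
                 for i in range(n_states)]

    def log_emit(x, k):
        # Gaussian log-density, dropping the constant shared by all states.
        return -0.5 * ((x - means[k]) / sd) ** 2

    # Initialise with a uniform prior over states.
    score = [math.log(1.0 / n_states) + log_emit(ratios[0], k)
             for k in range(n_states)]
    back = []
    for x in ratios[1:]:
        prev = score
        pointers, score = [], []
        for j in range(n_states):
            best = max(range(n_states), key=lambda i: prev[i] + log_trans[i][j])
            pointers.append(best)
            score.append(prev[best] + log_trans[best][j] + log_emit(x, j))
        back.append(pointers)

    # Trace back from the best final state.
    state = max(range(n_states), key=lambda k: score[k])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    return path[::-1]
```

A high self-transition probability discourages calling isolated noisy bins as events, which is the usual motivation for segmenting with an HMM rather than thresholding each bin independently.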
Chandra, Saket; Kazmi, Andaleeb Z; Ahmed, Zainab; Roychowdhury, Gargi; Kumari, Veena; Kumar, Manish; Mukhopadhyay, Kunal
2017-07-01
NB-ARC domain-containing resistance genes from the wheat genome were identified, characterized and localized on chromosome arms that displayed differential yet positive response during incompatible and compatible leaf rust interactions. Wheat (Triticum aestivum L.) is an important cereal crop; however, its production is affected severely by numerous diseases including rusts. An efficient, cost-effective and ecologically viable approach to control pathogens is through host resistance. In wheat, high numbers of resistance loci are present but only a few have been identified and cloned. A comprehensive analysis of the NB-ARC-containing genes in the complete wheat genome was accomplished in this study. Complete NB-ARC encoding genes were mined from the Ensembl Plants database to predict 604 NB-ARC containing sequences using the HMM approach. Genome-wide analysis of orthologous clusters in the NB-ARC-containing sequences of wheat and other members of the Poaceae family revealed maximum homology with Oryza sativa indica and Brachypodium distachyon. The identification of overlap between orthologous clusters enabled the elucidation of the function and evolution of resistance proteins. The distributions of the NB-ARC domain-containing sequences were found to be balanced among the three wheat sub-genomes. Wheat chromosome arms 4AL and 7BL had the most NB-ARC domain-containing contigs. The spatio-temporal expression profiling studies exemplified the positive role of these genes in resistant and susceptible wheat plants during incompatible and compatible interaction in response to the leaf rust pathogen Puccinia triticina. Two NB-ARC domain-containing sequences were modelled in silico, cloned and sequenced to analyze their fine structures. The data obtained in this study will augment isolation, characterization and application of NB-ARC resistance genes in marker-assisted selection based breeding programs for improving rust resistance in wheat.
Fast and accurate inference of local ancestry in Latino populations
Baran, Yael; Pasaniuc, Bogdan; Sankararaman, Sriram; Torgerson, Dara G.; Gignoux, Christopher; Eng, Celeste; Rodriguez-Cintron, William; Chapela, Rocio; Ford, Jean G.; Avila, Pedro C.; Rodriguez-Santana, Jose; Burchard, Esteban Gonzàlez; Halperin, Eran
2012-01-01
Motivation: It is becoming increasingly evident that the analysis of genotype data from recently admixed populations is providing important insights into medical genetics and population history. Such analyses have been used to identify novel disease loci, to understand recombination rate variation and to detect recent selection events. The utility of such studies crucially depends on accurate and unbiased estimation of the ancestry at every genomic locus in recently admixed populations. Although various methods have been proposed and shown to be extremely accurate in two-way admixtures (e.g. African Americans), only a few approaches have been proposed and thoroughly benchmarked on multi-way admixtures (e.g. Latino populations of the Americas). Results: To address these challenges we introduce here methods for local ancestry inference which leverage the structure of linkage disequilibrium in the ancestral population (LAMP-LD), and incorporate the constraint of Mendelian segregation when inferring local ancestry in nuclear family trios (LAMP-HAP). Our algorithms uniquely combine hidden Markov models (HMMs) of haplotype diversity within a novel window-based framework to achieve superior accuracy as compared with published methods. Further, unlike previous methods, the structure of our HMM does not depend on the number of reference haplotypes but on a fixed constant, and it is thereby capable of utilizing large datasets while remaining highly efficient and robust to over-fitting. Through simulations and analysis of real data from 489 nuclear trio families from the mainland US, Puerto Rico and Mexico, we demonstrate that our methods achieve superior accuracy compared with published methods for local ancestry inference in Latinos. Availability: http://lamp.icsi.berkeley.edu/lamp/lampld/ Contact: bpasaniu@hsph.harvard.edu Supplementary information: Supplementary data are available at Bioinformatics online. PMID:22495753
An Overview of Mars Vicinity Transportation Concepts for a Human Mars Mission
NASA Technical Reports Server (NTRS)
Dexter, Carol E.; Kos, Larry
1998-01-01
To send a piloted mission to Mars, transportation systems must be developed for the Earth to Orbit, trans Mars injection (TMI), capture into Mars orbit, Mars descent, surface stay, Mars ascent, trans Earth injection (TEI), and Earth return phases. This paper presents a brief overview of the transportation systems for the Human Mars Mission (HMM) only in the vicinity of Mars. This includes: capture into Mars orbit, Mars descent, surface stay, and Mars ascent. Development of feasible mission scenarios now is important for identification of critical technology areas that must be developed to support future human missions. Although there is no funded human Mars mission today, architecture studies are focusing on missions traveling to Mars between 2011 and the early 2020's.
Energy efficient cooperation in underlay RFID cognitive networks for a water smart home.
Nasir, Adnan; Hussain, Syed Imtiaz; Soong, Boon-Hee; Qaraqe, Khalid
2014-09-30
Shrinking water resources all over the world and increasing costs of water consumption have prompted water users and distribution companies to come up with water conserving strategies. We have proposed an energy-efficient smart water monitoring application in [1], using low power RFIDs. In the home environment, there exist many primary interferences within a room, such as cell-phones, Bluetooth devices, TV signals, cordless phones and WiFi devices. In order to reduce the interference from our proposed RFID network for these primary devices, we have proposed a cooperating underlay RFID cognitive network for our smart application on water. These underlay RFIDs should strictly adhere to the interference thresholds to work in parallel with the primary wireless devices [2]. This work is an extension of our previous ventures proposed in [2,3], and we enhanced the previous efforts by introducing a new system model and RFIDs. Our proposed scheme is mutually energy efficient and maximizes the signal-to-noise ratio (SNR) for the RFID link, while keeping the interference levels for the primary network below a certain threshold. A closed form expression for the probability density function (pdf) of the SNR at the destination reader/writer and outage probability are derived. Analytical results are verified through simulations. It is also shown that in comparison to non-cognitive selective cooperation, this scheme performs better in the low SNR region for cognitive networks. Moreover, the hidden Markov model's (HMM) multi-level variant hierarchical hidden Markov model (HHMM) approach is used for pattern recognition and event detection for the data received for this system [4]. Using this model, a feedback and decision algorithm is also developed. This approach has been applied to simulated water pressure data from RFID motes, which were embedded in metallic water pipes.
ADaCGH: A Parallelized Web-Based Application and R Package for the Analysis of aCGH Data
Díaz-Uriarte, Ramón; Rueda, Oscar M.
2007-01-01
Background: Copy number alterations (CNAs) in genomic DNA have been associated with complex human diseases, including cancer. One of the most common techniques to detect CNAs is array-based comparative genomic hybridization (aCGH). The availability of aCGH platforms and the need for identification of CNAs has resulted in a wealth of methodological studies. Methodology/Principal Findings: ADaCGH is an R package and a web-based application for the analysis of aCGH data. It implements eight methods for detection of CNAs, gains and losses of genomic DNA, including all of the best performing ones from two recent reviews (CBS, GLAD, CGHseg, HMM). For improved speed, we use parallel computing (via MPI). Additional information (GO terms, PubMed citations, KEGG and Reactome pathways) is available for individual genes, and for sets of genes with altered copy numbers. Conclusions/Significance: ADaCGH represents a qualitative increase in the standards of these types of applications: a) all of the best performing algorithms are included, not just one or two; b) we do not limit ourselves to providing a thin layer of CGI on top of existing BioConductor packages, but instead carefully use parallelization, examining different schemes, and are able to achieve significant decreases in user waiting time (factors up to 45×); c) we have added functionality not currently available in some methods, to adapt to recent recommendations (e.g., merging of segmentation results in wavelet-based and CGHseg algorithms); d) we incorporate redundancy, fault-tolerance and checkpointing, which are unique among web-based, parallelized applications; e) all of the code is available under open source licenses, allowing others to build upon, copy, and adapt our code for other software projects. PMID:17710137
ADaCGH: A parallelized web-based application and R package for the analysis of aCGH data.
Díaz-Uriarte, Ramón; Rueda, Oscar M
2007-08-15
Copy number alterations (CNAs) in genomic DNA have been associated with complex human diseases, including cancer. One of the most common techniques to detect CNAs is array-based comparative genomic hybridization (aCGH). The availability of aCGH platforms and the need for identification of CNAs has resulted in a wealth of methodological studies. ADaCGH is an R package and a web-based application for the analysis of aCGH data. It implements eight methods for detection of CNAs, gains and losses of genomic DNA, including all of the best performing ones from two recent reviews (CBS, GLAD, CGHseg, HMM). For improved speed, we use parallel computing (via MPI). Additional information (GO terms, PubMed citations, KEGG and Reactome pathways) is available for individual genes, and for sets of genes with altered copy numbers. ADaCGH represents a qualitative increase in the standards of these types of applications: a) all of the best performing algorithms are included, not just one or two; b) we do not limit ourselves to providing a thin layer of CGI on top of existing BioConductor packages, but instead carefully use parallelization, examining different schemes, and are able to achieve significant decreases in user waiting time (factors up to 45x); c) we have added functionality not currently available in some methods, to adapt to recent recommendations (e.g., merging of segmentation results in wavelet-based and CGHseg algorithms); d) we incorporate redundancy, fault-tolerance and checkpointing, which are unique among web-based, parallelized applications; e) all of the code is available under open source licenses, allowing others to build upon, copy, and adapt our code for other software projects.
Domain fusion analysis by applying relational algebra to protein sequence and domain databases.
Truong, Kevin; Ikura, Mitsuhiko
2003-05-06
Domain fusion analysis is a useful method to predict functionally linked proteins that may be involved in direct protein-protein interactions or in the same metabolic or signaling pathway. As separate domain databases like BLOCKS, PROSITE, Pfam, SMART, PRINTS-S, ProDom, TIGRFAMs, and amalgamated domain databases like InterPro continue to grow in size and quality, a computational method to perform domain fusion analysis that leverages on these efforts will become increasingly powerful. This paper proposes a computational method employing relational algebra to find domain fusions in protein sequence databases. The feasibility of this method was illustrated on the SWISS-PROT+TrEMBL sequence database using domain predictions from the Pfam HMM (hidden Markov model) database. We identified 235 and 189 putative functionally linked protein partners in H. sapiens and S. cerevisiae, respectively. From scientific literature, we were able to confirm many of these functional linkages, while the remainder offer testable experimental hypothesis. Results can be viewed at http://calcium.uhnres.utoronto.ca/pi. As the analysis can be computed quickly on any relational database that supports standard SQL (structured query language), it can be dynamically updated along with the sequence and domain databases, thereby improving the quality of predictions over time.
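To make the relational-algebra idea in the abstract above concrete, here is a minimal sketch using Python's built-in `sqlite3` module. It is not the authors' schema or query: the single table `hit(protein, domain)`, the function name `find_fusion_links` and the exact join conditions are assumptions made for illustration. The query looks for a "fused" protein carrying two different domains and links any two other proteins that each carry only one of them.

```python
import sqlite3

def find_fusion_links(hits):
    """hits: iterable of (protein, domain) pairs (hypothetical input format).

    Returns sorted (protein_a, protein_b, fused_protein) triples where the
    fused protein carries both domains and the two linked proteins carry
    one each.
    """
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE hit (protein TEXT, domain TEXT)")
    con.executemany("INSERT INTO hit VALUES (?, ?)", hits)
    rows = con.execute(
        """
        SELECT DISTINCT a.protein, b.protein, fa.protein
        FROM hit AS fa
        JOIN hit AS fb ON fa.protein = fb.protein AND fa.domain < fb.domain
        JOIN hit AS a  ON a.domain = fa.domain AND a.protein <> fa.protein
        JOIN hit AS b  ON b.domain = fb.domain AND b.protein <> fa.protein
        WHERE a.protein <> b.protein
          -- exclude candidates that themselves carry the partner domain
          AND NOT EXISTS (SELECT 1 FROM hit x
                          WHERE x.protein = a.protein AND x.domain = fb.domain)
          AND NOT EXISTS (SELECT 1 FROM hit y
                          WHERE y.protein = b.protein AND y.domain = fa.domain)
        ORDER BY 1, 2, 3
        """
    ).fetchall()
    con.close()
    return rows
```

Because the whole analysis is one declarative query over a domain-hit table, it can be rerun whenever the underlying sequence or domain tables are refreshed, which matches the abstract's point about dynamic updates.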
Identification and analysis of integrons and cassette arrays in bacterial genomes
Touchon, Marie; Néron, Bertrand; Rocha, Eduardo PC
2016-01-01
Integrons recombine gene arrays and favor the spread of antibiotic resistance. Their broader roles in bacterial adaptation remain mysterious, partly due to lack of computational tools. We made a program – IntegronFinder – to identify integrons with high accuracy and sensitivity. IntegronFinder is available as a standalone program and as a web application. It searches for attC sites using covariance models, for integron-integrases using HMM profiles, and for other features (promoters, attI site) using pattern matching. We searched for integrons, integron-integrases lacking attC sites, and clusters of attC sites lacking a neighboring integron-integrase in bacterial genomes. All these elements are especially frequent in genomes of intermediate size. They are missing in some key phyla, such as α-Proteobacteria, which might reflect selection against cell lineages that acquire integrons. The similarity between attC sites is proportional to the number of cassettes in the integron, and is particularly low in clusters of attC sites lacking integron-integrases. The latter are unexpectedly abundant in genomes lacking integron-integrases or their remains, and have a large novel pool of cassettes lacking homologs in the databases. They might represent an evolutionary step between the acquisition of genes within integrons and their stabilization in the new genome. PMID:27130947
2012-01-01
Background Short-chain dehydrogenases/reductases (SDRs) form one of the largest and oldest NAD(P)(H) dependent oxidoreductase families. Despite a conserved ‘Rossmann-fold’ structure, members of the SDR superfamily exhibit low sequence similarities, which constituted a bottleneck in terms of identification. Recent classification methods, relying on hidden-Markov models (HMMs), improved identification and enabled the construction of a nomenclature. However, functional annotations of plant SDRs remain scarce. Results Wide-scale analyses were performed on ten plant genomes. The combination of hidden Markov model (HMM) based analyses and similarity searches led to the construction of an exhaustive inventory of plant SDR. With 68 to 315 members found in each analysed genome, the inventory confirmed the over-representation of SDRs in plants compared to animals, fungi and prokaryotes. The plant SDRs were first classified into three major types — ‘classical’, ‘extended’ and ‘divergent’ — but a minority (10% of the predicted SDRs) could not be classified into these general types (‘unknown’ or ‘atypical’ types). In a second step, we could categorize the vast majority of land plant SDRs into a set of 49 families. Out of these 49 families, 35 appeared early during evolution since they are commonly found through all the Green Lineage. Yet, some SDR families — tropinone reductase-like proteins (SDR65C), ‘ABA2-like’-NAD dehydrogenase (SDR110C), ‘salutaridine/menthone-reductase-like’ proteins (SDR114C), ‘dihydroflavonol 4-reductase’-like proteins (SDR108E) and ‘isoflavone-reductase-like’ (SDR460A) proteins — have undergone significant functional diversification within vascular plants since they diverged from Bryophytes. 
Interestingly, these diversified families are either involved in the secondary metabolism routes (terpenoids, alkaloids, phenolics) or participate in developmental processes (hormone biosynthesis or catabolism, flower development), in opposition to SDR families involved in primary metabolism which are poorly diversified. Conclusion The application of HMMs to plant genomes enabled us to identify 49 families that encompass all Angiosperms (‘higher plants’) SDRs, each family being sufficiently conserved to enable simpler analyses based only on overall sequence similarity. The multiplicity of SDRs in plant kingdom is mainly explained by the diversification of large families involved in different secondary metabolism pathways, suggesting that the chemical diversification that accompanied the emergence of vascular plants acted as a driving force for SDR evolution. PMID:23167570
Sub-seasonal-to-seasonal Reservoir Inflow Forecast using Bayesian Hierarchical Hidden Markov Model
NASA Astrophysics Data System (ADS)
Mukhopadhyay, S.; Arumugam, S.
2017-12-01
Sub-seasonal-to-seasonal (S2S) (15-90 days) streamflow forecasting is an emerging area of research that provides seamless information for reservoir operation from weather time scales to seasonal time scales. From an operational perspective, sub-seasonal inflow forecasts are highly valuable as these enable water managers to decide short-term releases (15-30 days), while holding water for seasonal needs (e.g., irrigation and municipal supply) and to meet end-of-the-season target storage at a desired level. We propose a Bayesian Hierarchical Hidden Markov Model (BHHMM) to develop S2S inflow forecasts for the Tennessee Valley Area (TVA) reservoir system. Here, the hidden states are predicted by relevant indices that influence the inflows at S2S time scale. The hidden Markov model also captures both the spatial and temporal hierarchy in predictors that operate at S2S time scale, with model parameters being estimated as a posterior distribution using a Bayesian framework. We present our work in two steps, namely a single-site model and a multi-site model. For proof of concept, we consider inflows to Douglas Dam, Tennessee, in the single-site model. For the multi-site model we consider reservoirs in the upper Tennessee valley. Streamflow forecasts are issued and updated continuously every day at S2S time scale. We considered precipitation forecasts obtained from NOAA Climate Forecast System (CFSv2) GCM as predictors for developing S2S streamflow forecasts along with relevant indices for predicting hidden states. Spatial dependence of the inflow series of reservoirs is also preserved in the multi-site model. To circumvent the non-normality of the data, we consider the HMM in a Generalized Linear Model setting. Skill of the proposed approach is tested using split sample validation against a traditional multi-site canonical correlation model developed using the same set of predictors.
From the posterior distribution of the inflow forecasts, we also highlight different system behavior under varied global and local scale climatic influences from the developed BHHMM.
Energy Efficient Cooperation in Underlay RFID Cognitive Networks for a Water Smart Home
Nasir, Adnan; Hussain, Syed Imtiaz; Soong, Boon-Hee; Qaraqe, Khalid
2014-01-01
Shrinking water resources all over the world and increasing costs of water consumption have prompted water users and distribution companies to come up with water conserving strategies. We have proposed an energy-efficient smart water monitoring application in [1], using low power RFIDs. In the home environment, there exist many primary interferences within a room, such as cell-phones, Bluetooth devices, TV signals, cordless phones and WiFi devices. In order to reduce the interference from our proposed RFID network for these primary devices, we have proposed a cooperating underlay RFID cognitive network for our smart application on water. These underlay RFIDs should strictly adhere to the interference thresholds to work in parallel with the primary wireless devices [2]. This work is an extension of our previous ventures proposed in [2,3], and we enhanced the previous efforts by introducing a new system model and RFIDs. Our proposed scheme is mutually energy efficient and maximizes the signal-to-noise ratio (SNR) for the RFID link, while keeping the interference levels for the primary network below a certain threshold. A closed form expression for the probability density function (pdf) of the SNR at the destination reader/writer and outage probability are derived. Analytical results are verified through simulations. It is also shown that in comparison to non-cognitive selective cooperation, this scheme performs better in the low SNR region for cognitive networks. Moreover, the hidden Markov model’s (HMM) multi-level variant hierarchical hidden Markov model (HHMM) approach is used for pattern recognition and event detection for the data received for this system [4]. Using this model, a feedback and decision algorithm is also developed. This approach has been applied to simulated water pressure data from RFID motes, which were embedded in metallic water pipes. PMID:25271565
Time-dependence of graph theory metrics in functional connectivity analysis
Chiang, Sharon; Cassese, Alberto; Guindani, Michele; Vannucci, Marina; Yeh, Hsiang J.; Haneef, Zulfi; Stern, John M.
2016-01-01
Brain graphs provide a useful way to computationally model the network structure of the connectome, and this has led to increasing interest in the use of graph theory to quantitate and investigate the topological characteristics of the healthy brain and brain disorders on the network level. The majority of graph theory investigations of functional connectivity have relied on the assumption of temporal stationarity. However, recent evidence increasingly suggests that functional connectivity fluctuates over the length of the scan. In this study, we investigate the stationarity of brain network topology using a Bayesian hidden Markov model (HMM) approach that estimates the dynamic structure of graph theoretical measures of whole-brain functional connectivity. In addition to extracting the stationary distribution and transition probabilities of commonly employed graph theory measures, we propose two estimators of temporal stationarity: the S-index and N-index. These indexes can be used to quantify different aspects of the temporal stationarity of graph theory measures. We apply the method and proposed estimators to resting-state functional MRI data from healthy controls and patients with temporal lobe epilepsy. Our analysis shows that several graph theory measures, including small-world index, global integration measures, and betweenness centrality, may exhibit greater stationarity over time and therefore be more robust. Additionally, we demonstrate that accounting for subject-level differences in the level of temporal stationarity of network topology may increase discriminatory power in discriminating between disease states. Our results confirm and extend findings from other studies regarding the dynamic nature of functional connectivity, and suggest that using statistical models which explicitly account for the dynamic nature of functional connectivity in graph theory analyses may improve the sensitivity of investigations and consistency across investigations. 
PMID:26518632
Time-dependence of graph theory metrics in functional connectivity analysis.
Chiang, Sharon; Cassese, Alberto; Guindani, Michele; Vannucci, Marina; Yeh, Hsiang J; Haneef, Zulfi; Stern, John M
2016-01-15
Brain graphs provide a useful way to computationally model the network structure of the connectome, and this has led to increasing interest in the use of graph theory to quantitate and investigate the topological characteristics of the healthy brain and brain disorders on the network level. The majority of graph theory investigations of functional connectivity have relied on the assumption of temporal stationarity. However, recent evidence increasingly suggests that functional connectivity fluctuates over the length of the scan. In this study, we investigate the stationarity of brain network topology using a Bayesian hidden Markov model (HMM) approach that estimates the dynamic structure of graph theoretical measures of whole-brain functional connectivity. In addition to extracting the stationary distribution and transition probabilities of commonly employed graph theory measures, we propose two estimators of temporal stationarity: the S-index and N-index. These indexes can be used to quantify different aspects of the temporal stationarity of graph theory measures. We apply the method and proposed estimators to resting-state functional MRI data from healthy controls and patients with temporal lobe epilepsy. Our analysis shows that several graph theory measures, including small-world index, global integration measures, and betweenness centrality, may exhibit greater stationarity over time and therefore be more robust. Additionally, we demonstrate that accounting for subject-level differences in the level of temporal stationarity of network topology may increase discriminatory power in discriminating between disease states. Our results confirm and extend findings from other studies regarding the dynamic nature of functional connectivity, and suggest that using statistical models which explicitly account for the dynamic nature of functional connectivity in graph theory analyses may improve the sensitivity of investigations and consistency across investigations. 
Copyright © 2015 Elsevier Inc. All rights reserved.
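The abstract above mentions extracting the stationary distribution and transition probabilities of an HMM. As a rough illustration of how those two quantities relate, the sketch below computes the stationary distribution of a small row-stochastic transition matrix. The two-state matrix and its values are hypothetical and are not taken from the study, and the Bayesian estimation of the transition probabilities themselves is not shown.

```python
import numpy as np

def stationary_distribution(P):
    """Return pi with pi @ P == pi and sum(pi) == 1 for a row-stochastic P."""
    P = np.asarray(P, dtype=float)
    n = P.shape[0]
    # Stack the balance equations (P^T - I) pi = 0 with the constraint sum(pi) = 1
    A = np.vstack([P.T - np.eye(n), np.ones(n)])
    b = np.zeros(n + 1)
    b[-1] = 1.0
    # Least squares solves the (consistent) overdetermined system
    pi, *_ = np.linalg.lstsq(A, b, rcond=None)
    return pi

# Hypothetical 2-state transition matrix (illustrative values only)
P = np.array([[0.9, 0.1],
              [0.3, 0.7]])
pi = stationary_distribution(P)
print(pi)
```

For this matrix the chain spends about three quarters of its time in the first state. A state with a large self-transition probability carries more stationary mass.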
Harrison, Thomas; Ruiz, Jaime; Sloan, Daniel B.; Ben-Hur, Asa; Boucher, Christina
2016-01-01
Pentatricopeptide repeat containing proteins (PPRs) bind to RNA transcripts originating from mitochondria and plastids. There are two classes of PPR proteins. The P class contains tandem P-type motif sequences, and the PLS class contains alternating P, L and S type sequences. In this paper, we describe a novel tool that predicts PPR-RNA interaction; specifically, our method, which we call aPPRove, determines where and how a PLS-class PPR protein will bind to RNA when given a PPR and one or more RNA transcripts by using a combinatorial binding code for site specificity proposed by Barkan et al. Our results demonstrate that aPPRove successfully locates how and where a PPR protein belonging to the PLS class can bind to RNA. For each binding event it outputs the binding site, the amino-acid-nucleotide interaction, and its statistical significance. Furthermore, we show that our method can be used to predict binding events for PLS-class proteins using a known edit site and the statistical significance of aligning the PPR protein to that site. In particular, we use our method to make a conjecture regarding an interaction between CLB19 and the second intronic region of ycf3. The aPPRove web server can be found at www.cs.colostate.edu/~approve. PMID:27560805
Nguyen, Van-Nui; Huang, Kai-Yao; Huang, Chien-Hsun; Chang, Tzu-Hao; Bretaña, Neil; Lai, K; Weng, Julia; Lee, Tzong-Yi
2015-01-01
In eukaryotes, ubiquitin conjugation is an important mechanism underlying proteasome-mediated degradation of proteins and, as such, plays an essential role in the regulation of many cellular processes. In the ubiquitin-proteasome pathway, E3 ligases play important roles by recognizing a specific protein substrate and catalyzing the attachment of ubiquitin to a lysine (K) residue. As more and more experimental data on ubiquitin conjugation sites become available, it becomes possible to develop prediction models that can be scaled to big data. However, no previous work has focused on investigating the substrate specificities of ubiquitinated proteins. Herein, we present an approach that uses an iterative statistical method to identify ubiquitin conjugation sites with substrate site specificities. In this investigation, a total of 6259 experimentally validated ubiquitinated proteins were obtained from dbPTM. After filtering out homologous fragments with 40% sequence identity, the training data set contained 2658 ubiquitination sites (positive data) and 5532 non-ubiquitinated sites (negative data). Because the substrate site specificities of E3 ligases are difficult to characterize by conventional sequence logo analysis, a recursive statistical method was applied to obtain significant conserved motifs. A profile hidden Markov model (profile HMM) was adopted to construct the predictive models learned from the identified substrate motifs. Five-fold cross-validation was then used to evaluate the predictive model, which achieved sensitivity, specificity, and accuracy of 73.07%, 65.46%, and 67.93%, respectively. Additionally, an independent testing set, completely blind to the training data of the predictive model, was used to demonstrate that the proposed method could provide a promising accuracy (76.13%) and outperform other ubiquitination site prediction tools. A case study demonstrated the effectiveness of the characterized substrate motifs for identifying ubiquitination sites. The proposed method presents a practical means of preliminary analysis and greatly diminishes the total number of potential targets required for further experimental confirmation. This method may help unravel their mechanisms and roles in E3 recognition and ubiquitin-mediated protein degradation.
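The three cross-validation figures quoted above are linked: accuracy is the class-size-weighted average of sensitivity and specificity. The short check below recombines the reported rates with the reported class sizes. It assumes the rates were computed over the pooled 2658 positive and 5532 negative sites, which the abstract does not state explicitly. The function name is hypothetical.

```python
def accuracy_from_rates(sensitivity, specificity, n_pos, n_neg):
    """Overall accuracy implied by per-class rates and class sizes."""
    true_pos = sensitivity * n_pos   # expected correctly predicted positives
    true_neg = specificity * n_neg   # expected correctly predicted negatives
    return (true_pos + true_neg) / (n_pos + n_neg)

# Rates and class sizes as reported in the abstract
acc = accuracy_from_rates(0.7307, 0.6546, n_pos=2658, n_neg=5532)
print(f"{100 * acc:.2f}%")  # prints "67.93%"
```

Under that assumption the reported accuracy is consistent with the other two figures. Because negatives outnumber positives roughly two to one, the overall accuracy sits closer to the specificity than to the sensitivity.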
Fonseca, Erica L.; Andrade, Bruno G. N.; Vicente, Ana C. P.
2018-01-01
The worldwide dispersion and sudden emergence of new antibiotic resistance genes (ARGs) have created a need to uncover which environments contribute most as their source and reservoir. ARGs closely related to those currently found in human pathogens occur in the resistome of anthropogenically impacted environments. However, the role of pristine environments as the origin and source of ARGs remains underexplored and controversial, particularly for the marine environments represented by the oceans. Here, given the nature of the ocean, we hypothesized that the resistome of this pristine/low-impacted marine environment is represented by distant ARG homologs. To test this hypothesis, we performed an in silico analysis of the Global Ocean Sampling (GOS) metagenomic project dataset, focusing on the metallo-β-lactamases (MβLs) as the ARG model. MβLs have been a challenge to public health, since they hydrolyze the carbapenems, one of the last therapeutic choices in clinics. Using Hidden Markov Model (HMM) profiles, we successfully identified a high diversity of distant MβL homologs related to the B1, B2, and B3 subclasses. The majority of them were distributed across the Atlantic, Indian, and Pacific Oceans and were related to the chromosomally encoded MβL GOB present in the genus Elizabethkingia. Only a small number of metagenomic sequence homologs were related to the acquired MβL enzymes (VIM, SPM-1, and AIM-1) that currently have an impact in clinics. Therefore, marine environments with low antibiotic impact, such as the ocean, are unlikely to be the source of the ARGs that have been posing an enormous threat to public health. PMID:29675014
Sloothaak, J; Odoni, D I; de Graaff, L H; Martins Dos Santos, V A P; Schaap, P J; Tamayo-Ramos, J A
2015-01-01
The development of biological processes that replace the existing petrochemical-based industry is one of the biggest challenges in biotechnology. Aspergillus niger is one of the main industrial producers of lignocellulolytic enzymes, which are used in the conversion of lignocellulosic feedstocks into fermentable sugars. Both the hydrolytic enzymes responsible for lignocellulose depolymerisation and the molecular mechanisms controlling their expression have been well described, but little is known about the transport systems for sugar uptake in A. niger. Understanding the transportome of A. niger is essential to achieve further improvements at the strain and process design level. Therefore, this study aims to identify and classify A. niger sugar transporters, using newly developed tools for in silico and in vivo analysis of its membrane-associated proteome. In the present work, a hidden Markov model (HMM) that performs well in the identification and segmentation of functionally validated glucose transporters was constructed. The model (HMMgluT) was used to analyse the A. niger membrane-associated proteome response to high and low glucose concentrations at a low pH. By combining the abundance patterns of the proteins found in the A. niger plasmalemma proteome with their HMMgluT scores, two new putative high-affinity glucose transporters, denoted MstG and MstH, were identified. MstG and MstH were functionally validated and biochemically characterised by heterologous expression in a S. cerevisiae glucose transport null mutant. They were shown to be a high-affinity glucose transporter (Km = 0.5 ± 0.04 mM) and a very high-affinity glucose transporter (Km = 0.06 ± 0.005 mM), respectively. This study, focusing for the first time on the membrane-associated proteome of the industrially relevant organism A. niger, shows the global response of the transportome to the availability of different glucose concentrations. Analysis of the A. niger transportome with the newly developed HMMgluT proved to be an efficient approach for the identification and classification of new glucose transporters.
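To give a feel for what the two reported Km values mean, the sketch below evaluates the fraction of maximal uptake rate, v/Vmax = [S] / (Km + [S]), for each transporter. It assumes simple Michaelis-Menten kinetics, which the abstract does not state. The glucose concentrations are arbitrary illustrative values, and the reported uncertainties on Km are ignored.

```python
def fractional_rate(s_mM, km_mM):
    """Fraction of Vmax reached at substrate concentration s (Michaelis-Menten)."""
    return s_mM / (km_mM + s_mM)

# Km values (mM) as reported in the abstract
KM = {"MstG": 0.5, "MstH": 0.06}

# Illustrative glucose concentrations in mM (not from the study)
for s in (0.05, 0.5, 5.0):
    rates = {name: round(fractional_rate(s, km), 3) for name, km in KM.items()}
    print(f"[glucose] = {s} mM -> v/Vmax: {rates}")
```

By definition each transporter runs at half its maximal rate when the glucose concentration equals its Km. Under this model the lower-Km transporter, MstH, therefore stays closer to saturation at scarce glucose levels.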
An automated approach for annual layer counting in ice cores
NASA Astrophysics Data System (ADS)
Winstrup, M.; Svensson, A.; Rasmussen, S. O.; Winther, O.; Steig, E.; Axelrod, A.
2012-04-01
The temporal resolution of some ice cores is sufficient to preserve seasonal information in the ice core record. In such cases, annual layer counting represents one of the most accurate methods to produce a chronology for the core. Yet, manual layer counting is a tedious and sometimes ambiguous job. As reliable layer recognition becomes more difficult, a manual approach increasingly relies on human interpretation of the available data. Thus, much may be gained by an automated and therefore objective approach for annual layer identification in ice cores. We have developed a novel method for automated annual layer counting in ice cores, which relies on Bayesian statistics. It uses algorithms from the statistical framework of Hidden Markov Models (HMM), originally developed for use in machine speech recognition. The strength of this layer detection algorithm lies in the way it is able to imitate the manual procedures for annual layer counting, while being based on purely objective criteria for annual layer identification. With this methodology, it is possible to determine the most likely position of multiple layer boundaries in an entire section of ice core data at once. It provides a probabilistic uncertainty estimate of the resulting layer count, hence ensuring a proper treatment of ambiguous layer boundaries in the data. Furthermore, multiple data series can be used simultaneously, allowing for a full multi-parameter annual layer counting method similar to a manual approach. In this study, the automated layer counting algorithm has been applied to data from the NGRIP ice core, Greenland. The NGRIP ice core has very high temporal resolution with depth, and hence the potential to be dated by annual layer counting far back in time. In previous studies [Andersen et al., 2006; Svensson et al., 2008], manual layer counting has been carried out back to 60 kyr BP. A comparison between the counted annual layers based on the two approaches will be presented and their differences discussed. Within the estimated uncertainties, the two methodologies agree. This shows the potential for a fully automated annual layer counting method to be operational for data sections where the annual layering is unknown.
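The idea of finding the most likely positions of several layer boundaries at once can be illustrated with a toy Viterbi decoder. This is only a sketch of the general HMM technique, not the authors' algorithm, which the abstract does not describe in detail. The two seasonal states, the Gaussian emission parameters, the transition probabilities and the synthetic signal are all assumptions made for illustration.

```python
import numpy as np

def viterbi(log_pi, log_A, log_B):
    """Most likely state path; log_B[t, k] = log p(obs_t | state k)."""
    T, K = log_B.shape
    delta = np.empty((T, K))             # best log-score of a path ending in k at t
    psi = np.zeros((T, K), dtype=int)    # back-pointers
    delta[0] = log_pi + log_B[0]
    for t in range(1, T):
        scores = delta[t - 1][:, None] + log_A   # scores[i, j]: move from i to j
        psi[t] = scores.argmax(axis=0)
        delta[t] = scores.max(axis=0) + log_B[t]
    path = np.empty(T, dtype=int)
    path[-1] = delta[-1].argmax()
    for t in range(T - 2, -1, -1):       # follow back-pointers from the end
        path[t] = psi[t + 1, path[t + 1]]
    return path

# Synthetic "seasonal" signal: three cycles of high (summer) then low (winter) values
obs = np.tile([1.0, 0.9, 1.1, -1.0, -0.8, -1.2], 3)

means, sigma = np.array([1.0, -1.0]), 0.5        # state 0 = summer, state 1 = winter
log_B = -0.5 * ((obs[:, None] - means) / sigma) ** 2   # Gaussian log-likelihood up to a constant
log_A = np.log([[0.8, 0.2], [0.2, 0.8]])         # "sticky" seasons
log_pi = np.log([0.5, 0.5])

path = viterbi(log_pi, log_A, log_B)
# Count a layer boundary at each winter -> summer transition
n_boundaries = int(np.sum((path[:-1] == 1) & (path[1:] == 0)))
print(path, n_boundaries)
```

Because the whole sequence is decoded jointly, a noisy sample only moves a boundary when the evidence outweighs the cost of an extra state switch. A single best path carries no uncertainty estimate, so a probabilistic layer count like the one the abstract describes would require more than this decoder provides.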