Building Simple Hidden Markov Models. Classroom Notes
ERIC Educational Resources Information Center
Ching, Wai-Ki; Ng, Michael K.
2004-01-01
Hidden Markov models (HMMs) are widely used in bioinformatics, speech recognition and many other areas. This note presents HMMs via the framework of classical Markov chain models. A simple example is given to illustrate the model. An estimation method for the transition probabilities of the hidden states is also discussed.
Phase transitions in Hidden Markov Models
NASA Astrophysics Data System (ADS)
Bechhoefer, John; Lathouwers, Emma
In Hidden Markov Models (HMMs), a Markov process is not directly accessible. In the simplest case, a two-state Markov model "emits" one of two "symbols" at each time step. We can think of these symbols as noisy measurements of the underlying state. With some probability, the symbol implies that the system is in one state when it is actually in the other. The ability to judge which state the system is in sets the efficiency of a Maxwell demon that observes state fluctuations in order to extract heat from a coupled reservoir. The state-inference problem is to infer the underlying state from such noisy measurements at each time step. We show that there can be a phase transition in such measurements: for measurement error rates below a certain threshold, the inferred state always matches the observation. For higher error rates, there can be continuous or discontinuous transitions to situations where keeping a memory of past observations improves the state estimate. We can partly understand this behavior by mapping the HMM onto a 1d random-field Ising model at zero temperature. We also present more recent work that explores a larger parameter space and more states. Research funded by NSERC, Canada.
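The state-inference problem described above can be illustrated with a small forward (Bayesian) filter. This is only a minimal sketch, not the authors' code: the function name `forward_filter` and the parameters `flip_prob` (probability that the hidden state switches per step) and `error_prob` (probability that the emitted symbol is wrong) are assumptions introduced here for illustration.

```python
import numpy as np

def forward_filter(obs, flip_prob, error_prob):
    """Return P(state_t = 1 | obs_1..t) for a two-state symmetric HMM.

    obs is a sequence of symbols in {0, 1}. All names are illustrative.
    """
    # Transition matrix: the hidden state flips with probability flip_prob.
    A = np.array([[1 - flip_prob, flip_prob],
                  [flip_prob, 1 - flip_prob]])
    # Emission matrix B[state, symbol]: the symbol is wrong with error_prob.
    B = np.array([[1 - error_prob, error_prob],
                  [error_prob, 1 - error_prob]])
    belief = np.array([0.5, 0.5])  # uniform prior over the initial state
    filtered = []
    for y in obs:
        predicted = belief @ A            # one-step prediction
        updated = predicted * B[:, y]     # weight by the symbol likelihood
        belief = updated / updated.sum()  # normalize to a probability vector
        filtered.append(belief[1])
    return np.array(filtered)

# Low error rate: the thresholded estimate follows each observation.
low_noise = (forward_filter([0, 0, 0, 1, 0, 1, 1, 0], 0.1, 0.05) > 0.5).astype(int)

# Higher error rate: after three 0s, a single 1 does not flip the estimate.
high_noise = (forward_filter([0, 0, 0, 1], 0.05, 0.3) > 0.5).astype(int)
```

With these illustrative parameter values, the low-noise estimate simply copies the observations, while the high-noise estimate relies on the memory of earlier symbols, which is the qualitative distinction the abstract describes.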
Zipf exponent of trajectory distribution in the hidden Markov model
NASA Astrophysics Data System (ADS)
Bochkarev, V. V.; Lerner, E. Yu
2014-03-01
This paper is a first step toward generalizing the previously obtained full classification of the asymptotic behavior of the probability of Markov chain trajectories to the case of hidden Markov models. The main goal is to study the power (Zipf) and nonpower asymptotics of the frequency list of trajectories of hidden Markov models and to obtain explicit formulae for the exponent of the power asymptotics. We consider several simple classes of hidden Markov models. We prove that the asymptotics for a hidden Markov model and for the corresponding Markov chain can be essentially different.
Estimating Neuronal Ageing with Hidden Markov Models
NASA Astrophysics Data System (ADS)
Wang, Bing; Pham, Tuan D.
2011-06-01
Neuronal degeneration is widely observed in normal ageing, while neurodegenerative diseases such as Alzheimer's disease cause neuronal degeneration at a faster rate, which can be regarded as accelerated ageing. Early intervention in such diseases could benefit subjects with the potential for a positive clinical outcome; early detection of disease-related structural alterations in the brain is therefore required. In this paper, we propose a computational approach for modelling MRI-based structural alteration with ageing using a hidden Markov model. The proposed hidden Markov model based brain structural model encodes intracortical tissue/fluid distribution using discrete wavelet transformation and vector quantization. Further, it captures gray matter volume loss, which is capable of reflecting subtle intracortical changes with ageing. Experiments were carried out on healthy subjects to validate its accuracy and robustness. The results show that it can predict brain age with a prediction error of 1.98 years without training data, which is better than other age prediction methods.
Hidden Markov Model Analysis of Multichromophore Photobleaching
Messina, Troy C.; Kim, Hiyun; Giurleo, Jason T.; Talaga, David S.
2007-01-01
The interpretation of single-molecule measurements is greatly complicated by the presence of multiple fluorescent labels. However, many molecular systems of interest consist of multiple interacting components. We investigate this issue using multiply labeled dextran polymers that we intentionally photobleach to the background on a single-molecule basis. Hidden Markov models allow for unsupervised analysis of the data to determine the number of fluorescent subunits involved in the fluorescence intermittency of the 6-carboxy-tetramethylrhodamine labels by counting the discrete steps in fluorescence intensity. The Bayes information criterion allows us to distinguish between hidden Markov models that differ by the number of states, that is, the number of fluorescent molecules. We determine information-theoretical limits and show via Monte Carlo simulations that the hidden Markov model analysis approaches these theoretical limits. This technique has resolving power of one fluorescing unit up to as many as 30 fluorescent dyes with the appropriate choice of dye and adequate detection capability. We discuss the general utility of this method for determining aggregation-state distributions as could appear in many biologically important systems and its adaptability to general photometric experiments. PMID:16913765
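For readers unfamiliar with the criterion mentioned above, the standard textbook form of the Bayes (Bayesian) information criterion is given below. The symbols are assumptions of this note rather than the paper's notation: \hat{L}_M is the maximized likelihood of a hidden Markov model M, k_M is its number of free parameters (which grows with the number of states), and n is the number of observations.

```latex
\mathrm{BIC}(M) = -2 \ln \hat{L}_M + k_M \ln n
```

Among candidate models with different numbers of states, the one with the lowest BIC is preferred; the k_M \ln n term penalizes adding states that do not improve the likelihood enough to justify the extra parameters.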
Active Inference for Binary Symmetric Hidden Markov Models
NASA Astrophysics Data System (ADS)
Allahverdyan, Armen E.; Galstyan, Aram
2015-10-01
We consider the active maximum a posteriori (MAP) inference problem for hidden Markov models (HMMs), where, given an initial MAP estimate of the hidden sequence, we choose to label certain states in the sequence to improve the estimation accuracy of the remaining states. We focus on the binary symmetric HMM and employ its known mapping to the 1d Ising model in random fields. From the statistical physics viewpoint, the active MAP inference problem reduces to analyzing the ground state of the 1d Ising model under modified external fields. We develop an analytical approach and obtain a closed form solution that relates the expected error reduction to model parameters under the specified active inference scheme. We then use this solution to determine the optimal active inference scheme in terms of error reduction, and examine the relation of those schemes to heuristic principles of uncertainty reduction and solution unicity.
Hidden Markov models for stochastic thermodynamics
NASA Astrophysics Data System (ADS)
Bechhoefer, John
2015-07-01
The formalism of state estimation and hidden Markov models can simplify and clarify the discussion of stochastic thermodynamics in the presence of feedback and measurement errors. After reviewing the basic formalism, we use it to shed light on a recent discussion of phase transitions in the optimized response of an information engine, for which measurement noise serves as a control parameter. The HMM formalism also shows that the value of additional information displays a maximum at intermediate signal-to-noise ratios. Finally, we discuss how systems open to information flow can apparently violate causality; the HMM formalism can quantify the performance gains due to such violations.
Multiple alignment using hidden Markov models
Eddy, S.R.
1995-12-31
A simulated annealing method is described for training hidden Markov models and producing multiple sequence alignments from initially unaligned protein or DNA sequences. Simulated annealing in turn uses a dynamic programming algorithm for correctly sampling suboptimal multiple alignments according to their probability and a Boltzmann temperature factor. The quality of simulated annealing alignments is evaluated on structural alignments of ten different protein families, and compared to the performance of other HMM training methods and the ClustalW program. Simulated annealing is better able to find near-global optima in the multiple alignment probability landscape than the other tested HMM training methods. Neither ClustalW nor simulated annealing produces consistently better alignments than the other. Examination of the specific cases in which ClustalW outperforms simulated annealing, and vice versa, provides insight into the strengths and weaknesses of current hidden Markov model approaches.
Mixture Hidden Markov Models in Finance Research
NASA Astrophysics Data System (ADS)
Dias, José G.; Vermunt, Jeroen K.; Ramos, Sofia
Finite mixture models have proven to be a powerful framework whenever unobserved heterogeneity cannot be ignored. We introduce to finance research the Mixture Hidden Markov Model (MHMM), which takes into account time and space heterogeneity simultaneously. This approach is flexible in the sense that it can deal with the specific features of financial time series data, such as asymmetry, kurtosis, and unobserved heterogeneity. This methodology is applied to model simultaneously 12 time series of Asian stock market indexes. Because we selected a heterogeneous sample of countries including both developed and emerging countries, we expect that heterogeneity in market returns due to country idiosyncrasies will show up in the results. The best fitting model was the one with two clusters at country level with different dynamics between the two regimes.
Plume mapping via hidden Markov methods.
Farrell, J A; Pang, Shuo; Li, Wei
2003-01-01
This paper addresses the problem of mapping likely locations of a chemical source using an autonomous vehicle operating in a fluid flow. The paper reviews biological plume-tracing concepts, reviews previous strategies for vehicle-based plume tracing, and presents a new plume mapping approach based on hidden Markov methods (HMM). HMM provide efficient algorithms for predicting the likelihood of odor detection versus position, the likelihood of source location versus position, the most likely path taken by the odor to a given location, and the path between two points most likely to result in odor detection. All four are useful for solving the odor source localization problem using an autonomous vehicle. The vehicle is assumed to be capable of detecting above threshold chemical concentration and sensing the fluid flow velocity at the vehicle location. The fluid flow is assumed to vary with space and time, and to have a high Reynolds number (Re>10). PMID:18238238
Probabilistic Resilience in Hidden Markov Models
NASA Astrophysics Data System (ADS)
Panerati, Jacopo; Beltrame, Giovanni; Schwind, Nicolas; Zeltner, Stefan; Inoue, Katsumi
2016-05-01
Originally defined in the context of ecological systems and environmental sciences, resilience has grown to be a property of major interest for the design and analysis of many other complex systems: resilient networks and robotics systems offer the desirable capability of absorbing disruption and transforming in response to external shocks, while still providing the services they were designed for. Starting from an existing formalization of resilience for constraint-based systems, we develop a probabilistic framework based on hidden Markov models. In doing so, we introduce two new important features: stochastic evolution and partial observability. Using our framework, we formalize a methodology for the evaluation of probabilities associated with generic properties, we describe an efficient algorithm for the computation of its essential inference step, and show that its complexity is comparable to other state-of-the-art inference algorithms.
Defect Detection Using Hidden Markov Random Fields
NASA Astrophysics Data System (ADS)
Dogandžić, Aleksandar; Eua-anant, Nawanat; Zhang, Benhong
2005-04-01
We derive an approximate maximum a posteriori (MAP) method for detecting NDE defect signals using hidden Markov random fields (HMRFs). In the proposed HMRF framework, a set of spatially distributed NDE measurements is assumed to form a noisy realization of an underlying random field that has a simple structure with Markovian dependence. Here, the random field describes the defect signals to be estimated or detected. The HMRF models incorporate measurement locations into the statistical analysis, which is important in scenarios where the same defect affects measurements at multiple locations. We also discuss initialization of the proposed HMRF detector and apply it to simulated eddy-current data and experimental ultrasonic C-scan data from an inspection of a cylindrical Ti 6-4 billet.
Stochastic motif extraction using hidden Markov model
Fujiwara, Yukiko; Asogawa, Minoru; Konagaya, Akihiko
1994-12-31
In this paper, we study the application of an HMM (hidden Markov model) to the problem of representing protein sequences by a stochastic motif. A stochastic protein motif represents the small segments of protein sequences that have a certain function or structure. The stochastic motif, represented by an HMM, has conditional probabilities to deal with the stochastic nature of the motif. This HMM directly reflects the characteristics of the motif, such as a protein periodical structure or grouping. In order to obtain the optimal HMM, we developed the "iterative duplication method" for HMM topology learning. It starts from a small fully-connected network and iterates the network generation and parameter optimization until it achieves sufficient discrimination accuracy. Using this method, we obtained an HMM for a leucine zipper motif. Compared to a symbolic pattern representation, which had an accuracy of 14.8 percent, the HMM achieved 79.3 percent in prediction. Additionally, the method can obtain an HMM for various types of zinc finger motifs, and it might separate the mixed data. We demonstrated that this approach is applicable to the validation of protein databases; a constructed HMM has indicated that one protein sequence annotated as "leucine-zipper like sequence" in the database is quite different from other leucine-zipper sequences in terms of likelihood, and we found this discrimination to be plausible.
Hidden Markov models for threat prediction fusion
NASA Astrophysics Data System (ADS)
Ross, Kenneth N.; Chaney, Ronald D.
2000-04-01
This work addresses the often neglected but important problem of Level 3 fusion, or threat refinement. This paper describes algorithms for threat prediction and test results from a prototype threat prediction fusion engine. The threat prediction fusion engine selectively models important aspects of the battlespace state using probability-based methods and information obtained from lower level fusion engines. Our approach uses hidden Markov models of a hierarchical threat state to find the most likely Course of Action (CoA) for the opposing forces. Decision trees use features derived from the CoA probabilities and other information to estimate the level of threat presented by the opposing forces. This approach provides the user with several measures associated with the level of threat, including: probability that the enemy is following a particular CoA, potential threat presented by the opposing forces, and likely time of the threat. The hierarchical approach used for modeling helps us efficiently represent the battlespace with a structure that permits scaling the models to larger scenarios without adding prohibitive computational costs or sacrificing model fidelity.
Time series segmentation with shifting means hidden Markov models
NASA Astrophysics Data System (ADS)
Kehagias, Ath.; Fortin, V.
2006-08-01
We present a new family of hidden Markov models and apply these to the segmentation of hydrological and environmental time series. The proposed hidden Markov models have a discrete state space and their structure is inspired from the shifting means models introduced by Chernoff and Zacks and by Salas and Boes. An estimation method inspired from the EM algorithm is proposed, and we show that it can accurately identify multiple change-points in a time series. We also show that the solution obtained using this algorithm can serve as a starting point for a Monte-Carlo Markov chain Bayesian estimation method, thus reducing the computing time needed for the Markov chain to converge to a stationary distribution.
MODELING PAVEMENT DETERIORATION PROCESSES BY POISSON HIDDEN MARKOV MODELS
NASA Astrophysics Data System (ADS)
Nam, Le Thanh; Kaito, Kiyoyuki; Kobayashi, Kiyoshi; Okizuka, Ryosuke
In pavement management, it is important to estimate lifecycle cost, which is composed of the expenses for repairing local damages, including potholes, and repairing and rehabilitating the surface and base layers of pavements, including overlays. In this study, a model is produced under the assumption that the deterioration process of pavement is a complex one that includes local damages, which occur frequently, and the deterioration of the surface and base layers of pavement, which progresses slowly. The variation in pavement soundness is expressed by a Markov deterioration model, and a Poisson hidden Markov deterioration model, in which the frequency of local damage depends on the distribution of pavement soundness, is formulated. In addition, the authors suggest a model estimation method using the Markov Chain Monte Carlo (MCMC) method, and attempt to demonstrate the applicability of the proposed Poisson hidden Markov deterioration model by studying concrete application cases.
Unsupervised Segmentation of Hidden Semi-Markov Non Stationary Chains
NASA Astrophysics Data System (ADS)
Lapuyade-Lahorgue, Jérôme; Pieczynski, Wojciech
2006-11-01
In the classical hidden Markov chain (HMC) model we have a hidden chain X, which is a Markov one and an observed chain Y. HMC are widely used; however, in some situations they have to be replaced by the more general "hidden semi-Markov chains" (HSMC) which are particular "triplet Markov chains" (TMC) T = (X, U, Y), where the auxiliary chain U models the semi-Markovianity of X. Otherwise, non stationary classical HMC can also be modeled by a triplet Markov stationary chain with, as a consequence, the possibility of parameters' estimation. The aim of this paper is to use simultaneously both properties. We consider a non stationary HSMC and model it as a TMC T = (X, U1, U2, Y), where U1 models the semi-Markovianity and U2 models the non stationarity. The TMC T being itself stationary, all parameters can be estimated by the general "Iterative Conditional Estimation" (ICE) method, which leads to unsupervised segmentation. We present some experiments showing the interest of the new model and related processing in image segmentation area.
Nonparametric identification and maximum likelihood estimation for hidden Markov models
Alexandrovich, G.; Holzmann, H.; Leister, A.
2016-01-01
Nonparametric identification and maximum likelihood estimation for finite-state hidden Markov models are investigated. We obtain identification of the parameters as well as the order of the Markov chain if the transition probability matrices have full-rank and are ergodic, and if the state-dependent distributions are all distinct, but not necessarily linearly independent. Based on this identification result, we develop a nonparametric maximum likelihood estimation theory. First, we show that the asymptotic contrast, the Kullback–Leibler divergence of the hidden Markov model, also identifies the true parameter vector nonparametrically. Second, for classes of state-dependent densities which are arbitrary mixtures of a parametric family, we establish the consistency of the nonparametric maximum likelihood estimator. Here, identification of the mixing distributions need not be assumed. Numerical properties of the estimates and of nonparametric goodness of fit tests are investigated in a simulation study.
Multivariate longitudinal data analysis with mixed effects hidden Markov models.
Raffa, Jesse D; Dubin, Joel A
2015-09-01
Multiple longitudinal responses are often collected as a means to capture relevant features of the true outcome of interest, which is often hidden and not directly measurable. We outline an approach which models these multivariate longitudinal responses as generated from a hidden disease process. We propose a class of models which uses a hidden Markov model with separate but correlated random effects between multiple longitudinal responses. This approach was motivated by a smoking cessation clinical trial, where a bivariate longitudinal response involving both a continuous and a binomial response was collected for each participant to monitor smoking behavior. A Bayesian method using Markov chain Monte Carlo is used. Comparison of separate univariate response models to the bivariate response models was undertaken. Our methods are demonstrated on the smoking cessation clinical trial dataset, and properties of our approach are examined through extensive simulation studies. PMID:25761965
Multiple testing for neuroimaging via hidden Markov random field.
Shu, Hai; Nan, Bin; Koeppe, Robert
2015-09-01
Traditional voxel-level multiple testing procedures in neuroimaging, mostly p-value based, often ignore the spatial correlations among neighboring voxels and thus suffer from substantial loss of power. We extend the local-significance-index based procedure originally developed for the hidden Markov chain models, which aims to minimize the false nondiscovery rate subject to a constraint on the false discovery rate, to three-dimensional neuroimaging data using a hidden Markov random field model. A generalized expectation-maximization algorithm for maximizing the penalized likelihood is proposed for estimating the model parameters. Extensive simulations show that the proposed approach is more powerful than conventional false discovery rate procedures. We apply the method to the comparison between mild cognitive impairment, a disease status with increased risk of developing Alzheimer's or another dementia, and normal controls in the FDG-PET imaging study of the Alzheimer's Disease Neuroimaging Initiative. PMID:26012881
A Hidden Markov Approach to Modeling Interevent Earthquake Times
NASA Astrophysics Data System (ADS)
Chambers, D.; Ebel, J. E.; Kafka, A. L.; Baglivo, J.
2003-12-01
A hidden Markov process, in which the interevent time distribution is a mixture of exponential distributions with different rates, is explored as a model for seismicity that does not follow a Poisson process. In a general hidden Markov model, one assumes that a system can be in any of a finite number k of states and there is a random variable of interest whose distribution depends on the state in which the system resides. The system moves probabilistically among the states according to a Markov chain; that is, given the history of visited states up to the present, the conditional probability that the next state is a specified one depends only on the present state. Thus the transition probabilities are specified by a k by k stochastic matrix. Furthermore, it is assumed that the actual states are unobserved (hidden) and that only the values of the random variable are seen. From these values, one wishes to estimate the sequence of states, the transition probability matrix, and any parameters used in the state-specific distributions. The hidden Markov process was applied to a data set of 110 interevent times for earthquakes in New England from 1975 to 2000. Using the Baum-Welch method (Baum et al., Ann. Math. Statist. 41, 164-171), we estimate the transition probabilities, find the most likely sequence of states, and estimate the k means of the exponential distributions. Using k=2 states, we found the data were fit well by a mixture of two exponential distributions, with means of approximately 5 days and 95 days. The steady state model indicates that after approximately one fourth of the earthquakes, the waiting time until the next event had the first exponential distribution and three fourths of the time it had the second. Three and four state models were also fit to the data; the data were inconsistent with a three state model but were well fit by a four state model.
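The relationship between the k by k stochastic matrix and the steady-state proportions described above can be sketched as follows. The abstract does not report the fitted transition matrix, so the matrix `P` below is a hypothetical one chosen only so that its stationary distribution is (1/4, 3/4); the means of 5 and 95 days are the approximate values quoted in the abstract, and the function name is an assumption of this note.

```python
import numpy as np

def stationary_distribution(P):
    """Solve pi @ P = pi with sum(pi) = 1 for a row-stochastic matrix P."""
    k = P.shape[0]
    # Stack the balance equations (P^T - I) pi = 0 with the normalization row.
    A = np.vstack([P.T - np.eye(k), np.ones(k)])
    b = np.zeros(k + 1)
    b[-1] = 1.0
    pi, *_ = np.linalg.lstsq(A, b, rcond=None)
    return pi

# Hypothetical 2-state transition matrix (NOT taken from the study).
P = np.array([[0.4, 0.6],
              [0.2, 0.8]])

# Approximate state-specific mean interevent times from the abstract, in days.
means = np.array([5.0, 95.0])

pi = stationary_distribution(P)   # long-run fraction of events in each state
mixture_mean = pi @ means         # long-run mean interevent time in this sketch
```

Under this illustrative matrix the chain spends one fourth of its steps in the short-interval state and three fourths in the long-interval state, matching the proportions stated in the abstract; any other matrix with the same stationary distribution would serve equally well for the illustration.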
Probabilistic Independence Networks for Hidden Markov Probability Models
NASA Technical Reports Server (NTRS)
Smyth, Padhraic; Heckerman, David; Jordan, Michael I
1996-01-01
In this paper we explore hidden Markov models (HMMs) and related structures within the general framework of probabilistic independence networks (PINs). The paper contains a self-contained review of the basic principles of PINs. It is shown that the well-known forward-backward (F-B) and Viterbi algorithms for HMMs are special cases of more general inference algorithms for arbitrary PINs.
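As a reminder of what the Viterbi algorithm mentioned above computes in the plain HMM case, here is a minimal log-domain sketch for a discrete HMM. It is a generic textbook version, not the PIN formulation of the paper; the function name, argument names, and the small two-state example are assumptions made for illustration.

```python
import numpy as np

def viterbi(obs, start, trans, emit):
    """Most likely hidden-state path for a discrete HMM (log domain).

    start[i]: initial probability of state i
    trans[i, j]: probability of moving from state i to state j
    emit[i, y]: probability of emitting symbol y in state i
    """
    with np.errstate(divide="ignore"):  # log(0) = -inf is acceptable here
        log_start = np.log(start)
        log_trans = np.log(trans)
        log_emit = np.log(emit)
    n_states, T = len(start), len(obs)
    score = np.empty((T, n_states))
    back = np.zeros((T, n_states), dtype=int)
    score[0] = log_start + log_emit[:, obs[0]]
    for t in range(1, T):
        # cand[i, j]: best score ending in state i at t-1, then moving to j
        cand = score[t - 1][:, None] + log_trans
        back[t] = cand.argmax(axis=0)
        score[t] = cand.max(axis=0) + log_emit[:, obs[t]]
    # Trace the best path backwards from the best final state.
    path = [int(score[-1].argmax())]
    for t in range(T - 1, 0, -1):
        path.append(int(back[t, path[-1]]))
    return path[::-1]

# Illustrative two-state, three-symbol model (values are made up).
start = np.array([0.6, 0.4])
trans = np.array([[0.7, 0.3],
                  [0.4, 0.6]])
emit = np.array([[0.5, 0.4, 0.1],
                 [0.1, 0.3, 0.6]])
best_path = viterbi([0, 1, 2], start, trans, emit)
```

Working in the log domain avoids numerical underflow on long sequences; the forward-backward algorithm has the same recursive structure with sums in place of the maximizations.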
Hidden Markov Models for Fault Detection in Dynamic Systems
NASA Technical Reports Server (NTRS)
Smyth, Padhraic
1994-01-01
Continuous monitoring of complex dynamic systems is an increasingly important issue in diverse areas such as nuclear plant safety, production line reliability, and medical health monitoring systems. Recent advances in both sensor technology and computational capabilities have made on-line permanent monitoring much more feasible than it was in the past. In this paper it is shown that a pattern recognition system combined with a finite-state hidden Markov model provides a particularly useful method for modelling temporal context in continuous monitoring. The parameters of the Markov model are derived from gross failure statistics such as the mean time between failures. The model is validated on a real-world fault diagnosis problem and it is shown that Markov modelling in this context offers significant practical benefits.
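One plausible way to turn a mean time between failures into Markov transition probabilities is sketched below. This is an assumption of this note, not necessarily the procedure used in the paper: it relies on the fact that, in a discrete-time chain, the dwell time in a state with exit probability p is geometric with mean 1/p steps. The function name, the two-state (normal/fault) layout, and the numerical values are all hypothetical.

```python
import numpy as np

def transition_matrix_from_mtbf(mtbf_hours, mean_fault_duration_hours, dt_hours):
    """Build a 2-state (normal, fault) transition matrix from gross statistics.

    A state with exit probability p per step has a mean dwell time of 1/p
    steps, so p is set to the sampling interval divided by the mean dwell time.
    """
    p_fail = dt_hours / mtbf_hours                    # normal -> fault
    p_recover = dt_hours / mean_fault_duration_hours  # fault -> normal
    if not (0 < p_fail <= 1 and 0 < p_recover <= 1):
        raise ValueError("sampling interval must not exceed the mean dwell times")
    return np.array([[1 - p_fail, p_fail],
                     [p_recover, 1 - p_recover]])

# Hypothetical numbers: one failure per 1000 h, faults last 10 h on average,
# and the monitoring system produces one decision per hour.
A = transition_matrix_from_mtbf(1000.0, 10.0, 1.0)
```

The resulting matrix is strongly diagonal, which encodes the temporal context referred to in the abstract: a single anomalous classifier output is unlikely to be interpreted as a state change.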
Infinite Factorial Unbounded-State Hidden Markov Model.
Valera, Isabel; Ruiz, Francisco J R; Perez-Cruz, Fernando
2016-09-01
There are many scenarios in artificial intelligence, signal processing or medicine, in which a temporal sequence consists of several unknown overlapping independent causes, and we are interested in accurately recovering those canonical causes. Factorial hidden Markov models (FHMMs) present the versatility to provide a good fit to these scenarios. However, in some scenarios, the number of causes or the number of states of the FHMM cannot be known or limited a priori. In this paper, we propose an infinite factorial unbounded-state hidden Markov model (IFUHMM), in which the number of parallel hidden Markov models (HMMs) and states in each HMM are potentially unbounded. We rely on a Bayesian nonparametric (BNP) prior over integer-valued matrices, in which the columns represent the Markov chains, the rows the time indexes, and the integers the state for each chain and time instant. First, we extend the existent infinite factorial binary-state HMM to allow for any number of states. Then, we modify this model to allow for an unbounded number of states and derive an MCMC-based inference algorithm that properly deals with the trade-off between the unbounded number of states and chains. We illustrate the performance of our proposed models in the power disaggregation problem. PMID:26571511
A coupled hidden Markov model for disease interactions.
Sherlock, Chris; Xifara, Tatiana; Telfer, Sandra; Begon, Mike
2013-08-01
To investigate interactions between parasite species in a host, a population of field voles was studied longitudinally, with presence or absence of six different parasites measured repeatedly. Although trapping sessions were regular, a different set of voles was caught at each session, leading to incomplete profiles for all subjects. We use a discrete time hidden Markov model for each disease with transition probabilities dependent on covariates via a set of logistic regressions. For each disease the hidden states for each of the other diseases at a given time point form part of the covariate set for the Markov transition probabilities from that time point. This allows us to gauge the influence of each parasite species on the transition probabilities for each of the other parasite species. Inference is performed via a Gibbs sampler, which cycles through each of the diseases, first using an adaptive Metropolis-Hastings step to sample from the conditional posterior of the covariate parameters for that particular disease given the hidden states for all other diseases and then sampling from the hidden states for that disease given the parameters. We find evidence for interactions between several pairs of parasites and of an acquired immune response for two of the parasites. PMID:24223436
Efficient Parallel Learning of Hidden Markov Chain Models on SMPs
NASA Astrophysics Data System (ADS)
Li, Lei; Fu, Bin; Faloutsos, Christos
Quad-core CPUs have become a common desktop configuration in today's offices. The increasing number of processors on a single chip opens new opportunities for parallel computing. Our goal is to make use of multi-core as well as multi-processor architectures to speed up large-scale data mining algorithms. In this paper, we present a general parallel learning framework, Cut-And-Stitch, for training hidden Markov chain models. In particular, we propose two model-specific variants, CAS-LDS for learning linear dynamical systems (LDS) and CAS-HMM for learning hidden Markov models (HMM). Our main contribution is a novel method to handle the data dependencies due to the chain structure of hidden variables, so as to parallelize the EM-based parameter learning algorithm. We implement CAS-LDS and CAS-HMM using OpenMP on two supercomputers and a quad-core commercial desktop. The experimental results show that parallel algorithms using Cut-And-Stitch achieve comparable accuracy and almost linear speedups over the traditional serial version.
Hidden Markov Modeling for Weigh-In-Motion Estimation
Abercrombie, Robert K; Ferragut, Erik M; Boone, Shane
2012-01-01
This paper describes a hidden Markov model to help address the weight measurement error that arises from complex vehicle oscillations of a system of discrete masses. At present, oscillations are reduced by using a smooth, flat, level approach and a constant, slow speed in a straight line. The model uses this inherent variability to assist in determining the true total weight and individual axle weights of a vehicle. The weight distribution dynamics of a generic moving vehicle were simulated. The model estimation converged to within 1% of the true mass for simulated data. The computational demands of this method, while much greater than simple averages, took only seconds to run on a desktop computer.
AIRWAY LABELING USING A HIDDEN MARKOV TREE MODEL
Ross, James C.; Díaz, Alejandro A.; Okajima, Yuka; Wassermann, Demian; Washko, George R.; Dy, Jennifer; San José Estépar, Raúl
2014-01-01
We present a novel airway labeling algorithm based on a Hidden Markov Tree Model (HMTM). We obtain a collection of discrete points along the segmented airway tree using particles sampling [1] and establish topology using Kruskal’s minimum spanning tree algorithm. Following this, our HMTM algorithm probabilistically assigns labels to each point. While alternative methods label airway branches out to the segmental level, we describe a general method and demonstrate its performance out to the subsubsegmental level (two generations further than previously published approaches). We present results on a collection of 25 computed tomography (CT) datasets taken from a Chronic Obstructive Pulmonary Disease (COPD) study. PMID:25436039
Improved Hidden-Markov-Model Method Of Detecting Faults
NASA Technical Reports Server (NTRS)
Smyth, Padhraic J.
1994-01-01
Method of automated, continuous monitoring to detect faults in complicated dynamic system based on hidden-Markov-model (HMM) approach. Simpler than another, recently proposed HMM method, but retains advantages of that method, including low susceptibility to false alarms, no need for mathematical model of dynamics of system under normal or faulty conditions, and ability to detect subtle changes in characteristics of monitored signals. Examples of systems monitored by use of this method include motors, turbines, and pumps critical in their applications; chemical-processing plants; powerplants; and biomedical systems.
Self-Organizing Hidden Markov Model Map (SOHMMM).
Ferles, Christos; Stafylopatis, Andreas
2013-12-01
A hybrid approach combining the Self-Organizing Map (SOM) and the Hidden Markov Model (HMM) is presented. The Self-Organizing Hidden Markov Model Map (SOHMMM) establishes a cross-section between the theoretic foundations and algorithmic realizations of its constituents. The respective architectures and learning methodologies are fused in an attempt to meet the increasing requirements imposed by the properties of deoxyribonucleic acid (DNA), ribonucleic acid (RNA), and protein chain molecules. The fusion and synergy of the SOM unsupervised training and the HMM dynamic programming algorithms bring forth a novel on-line gradient descent unsupervised learning algorithm, which is fully integrated into the SOHMMM. Since the SOHMMM carries out probabilistic sequence analysis with little or no prior knowledge, it can have a variety of applications in clustering, dimensionality reduction and visualization of large-scale sequence spaces, and also, in sequence discrimination, search and classification. Two series of experiments based on artificial sequence data and splice junction gene sequences demonstrate the SOHMMM's characteristics and capabilities. PMID:24001407
Colonoscopy video quality assessment using hidden Markov random fields
NASA Astrophysics Data System (ADS)
Park, Sun Young; Sargent, Dusty; Spofford, Inbar; Vosburgh, Kirby
2011-03-01
With colonoscopy becoming a common procedure for individuals aged 50 or more who are at risk of developing colorectal cancer (CRC), colon video data is being accumulated at an ever increasing rate. However, the clinically valuable information contained in these videos is not being maximally exploited to improve patient care and accelerate the development of new screening methods. One of the well-known difficulties in colonoscopy video analysis is the abundance of frames with no diagnostic information. Approximately 40%-50% of the frames in a colonoscopy video are contaminated by noise, acquisition errors, glare, blur, and uneven illumination. Therefore, filtering out low quality frames containing no diagnostic information can significantly improve the efficiency of colonoscopy video analysis. To address this challenge, we present a quality assessment algorithm to detect and remove low quality, uninformative frames. The goal of our algorithm is to discard low quality frames while retaining all diagnostically relevant information. Our algorithm is based on a hidden Markov model (HMM) in combination with two measures of data quality to filter out uninformative frames. Furthermore, we present a two-level framework based on an embedded hidden Markov model (EHMM) to incorporate the proposed quality assessment algorithm into a complete, automated diagnostic image analysis system for colonoscopy video.
Trajectory classification using switched dynamical hidden Markov models.
Nascimento, Jacinto C; Figueiredo, Mario; Marques, Jorge S
2010-05-01
This paper proposes an approach for recognizing human activities (more specifically, pedestrian trajectories) in video sequences, in a surveillance context. A system for automatic processing of video information for surveillance purposes should be capable of detecting, recognizing, and collecting statistics of human activity, reducing human intervention as much as possible. In the method described in this paper, human trajectories are modeled as a concatenation of segments produced by a set of low level dynamical models. These low level models are estimated in an unsupervised fashion, based on a finite mixture formulation, using the expectation-maximization (EM) algorithm; the number of models is automatically obtained using a minimum message length (MML) criterion. This leads to a parsimonious set of models tuned to the complexity of the scene. We describe the switching among the low-level dynamic models by a hidden Markov chain; thus, the complete model is termed a switched dynamical hidden Markov model (SD-HMM). The performance of the proposed method is illustrated with real data from two different scenarios: a shopping center and a university campus. A set of human activities in both scenarios is successfully recognized by the proposed system. These experiments show the ability of our approach to properly describe trajectories with sudden changes. PMID:20051342
On the entropy of a hidden Markov process
Jacquet, Philippe; Seroussi, Gadiel; Szpankowski, Wojciech
2008-01-01
We study the entropy rate of a hidden Markov process (HMP) defined by observing the output of a binary symmetric channel whose input is a first-order binary Markov process. Despite the simplicity of the models involved, the characterization of this entropy is a long standing open problem. By presenting the probability of a sequence under the model as a product of random matrices, one can see that the entropy rate sought is equal to a top Lyapunov exponent of the product. This offers an explanation for the elusiveness of explicit expressions for the HMP entropy rate, as Lyapunov exponents are notoriously difficult to compute. Consequently, we focus on asymptotic estimates, and apply the same product of random matrices to derive an explicit expression for a Taylor approximation of the entropy rate with respect to the parameter of the binary symmetric channel. The accuracy of the approximation is validated against empirical simulation results. We also extend our results to higher-order Markov processes and to Rényi entropies of any order. PMID:19169438
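The product-of-matrices view described in this abstract can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' method: it evaluates log2 Pr(y_1..y_n) by passing a state vector through the transition matrix and a diagonal channel matrix, rescaling at every step, and then uses -(1/n) log2 Pr(y_1..y_n) on one long simulated sequence as an empirical estimate of the entropy rate. The function names, the uniform initial distribution, and any numerical values are hypothetical choices made for illustration.

```python
import math
import random


def sequence_log2_prob(ys, trans, eps, init=(0.5, 0.5)):
    """Return log2 Pr(y_1..y_n) for a binary Markov chain seen through a BSC.

    trans[i][j] = Pr(X_{t+1} = j | X_t = i); eps is the channel crossover
    probability; init is an assumed initial state distribution.
    """
    alpha = list(init)
    log2_prob = 0.0
    for t, y in enumerate(ys):
        if t > 0:
            # Multiply the row vector by the transition matrix.
            alpha = [alpha[0] * trans[0][j] + alpha[1] * trans[1][j]
                     for j in range(2)]
        # Multiply by the diagonal channel matrix for the observed symbol y.
        alpha = [alpha[x] * ((1.0 - eps) if x == y else eps) for x in range(2)]
        # Rescale so the running product never underflows.
        scale = alpha[0] + alpha[1]
        log2_prob += math.log2(scale)
        alpha = [a / scale for a in alpha]
    return log2_prob


def simulate_observations(n, trans, eps, rng):
    """Draw n channel outputs, starting the chain from a fair coin flip."""
    x = rng.randrange(2)
    ys = []
    for _ in range(n):
        ys.append(x ^ (1 if rng.random() < eps else 0))
        x = 1 if rng.random() < trans[x][1] else 0
    return ys


def estimate_entropy_rate(n, trans, eps, seed=0):
    """Estimate the entropy rate in bits per symbol from one long sample."""
    ys = simulate_observations(n, trans, eps, random.Random(seed))
    return -sequence_log2_prob(ys, trans, eps) / n
```

A simple sanity check on the sketch: with eps = 0.5 the channel output is a sequence of independent fair bits, so the estimate equals 1 bit per symbol up to rounding, whatever transition matrix is supplied.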
Behavior Detection using Confidence Intervals of Hidden Markov Models
Griffin, Christopher H
2009-01-01
Markov models are commonly used to analyze real-world problems. Their combination of discrete states and stochastic transitions is suited to applications with deterministic and stochastic components. Hidden Markov Models (HMMs) are a class of Markov model commonly used in pattern recognition. Currently, HMMs recognize patterns using a maximum likelihood approach. One major drawback with this approach is that data observations are mapped to HMMs without considering the number of data samples available. Another problem is that this approach is only useful for choosing between HMMs. It does not provide a criterion for determining whether or not a given HMM adequately matches the data stream. In this work, we recognize complex behaviors using HMMs and confidence intervals. The certainty of a data match increases with the number of data samples considered. Receiver Operating Characteristic curves are used to find the optimal threshold for either accepting or rejecting an HMM description. We present one example using a family of HMMs to show the utility of the proposed approach. A second example using models extracted from a database of consumer purchases provides additional evidence that this approach can perform better than existing techniques.
ENSO informed Drought Forecasting Using Nonhomogeneous Hidden Markov Chain Model
NASA Astrophysics Data System (ADS)
Kwon, H.; Yoo, J.; Kim, T.
2013-12-01
The study aims at developing a new scheme to investigate the potential use of ENSO (El Niño/Southern Oscillation) for drought forecasting. In this regard, the objective of this study is to extend a previously developed nonhomogeneous hidden Markov chain model (NHMM) to identify climate states associated with drought that can potentially be used to forecast drought conditions using climate information. As a target variable for forecasting, the SPI (standardized precipitation index) is mainly utilized. This study collected monthly precipitation data over 56 stations that cover more than 30 years, and K-means cluster analysis using drought properties was applied to partition regions into mutually exclusive clusters. In this study, six main clusters were distinguished through the regionalization procedure. For each cluster, the NHMM was applied to estimate the transition probability of hidden states as well as drought conditions informed by large scale climate indices (e.g. SOI, Nino1.2, Nino3, Nino3.4, MJO and PDO). The NHMM coupled with large scale climate information shows promise as a technique for forecasting drought scenarios. A more detailed explanation of large scale climate patterns associated with the identified hidden states will be provided with anomaly composites of SSTs and SLPs. Acknowledgement: This research was supported by a grant (11CTIPC02) from the Construction Technology Innovation Program (CTIP) funded by the Ministry of Land, Transport and Maritime Affairs of the Korean government.
Decoding coalescent hidden Markov models in linear time
Harris, Kelley; Sheehan, Sara; Kamm, John A.; Song, Yun S.
2014-01-01
In many areas of computational biology, hidden Markov models (HMMs) have been used to model local genomic features. In particular, coalescent HMMs have been used to infer ancient population sizes, migration rates, divergence times, and other parameters such as mutation and recombination rates. As more loci, sequences, and hidden states are added to the model, however, the runtime of coalescent HMMs can quickly become prohibitive. Here we present a new algorithm for reducing the runtime of coalescent HMMs from quadratic in the number of hidden time states to linear, without making any additional approximations. Our algorithm can be incorporated into various coalescent HMMs, including the popular method PSMC for inferring variable effective population sizes. Here we implement this algorithm to speed up our demographic inference method diCal, which is equivalent to PSMC when applied to a sample of two haplotypes. We demonstrate that the linear-time method can reconstruct a population size change history more accurately than the quadratic-time method, given similar computation resources. We also apply the method to data from the 1000 Genomes project, inferring a high-resolution history of size changes in the European population. PMID:25340178
Hidden Markov Models for Detecting Aseismic Events in Southern California
NASA Astrophysics Data System (ADS)
Granat, R.
2004-12-01
We employ a hidden Markov model (HMM) to segment surface displacement time series collected by the Southern California Integrated Geodetic Network (SCIGN). These segmented time series are then used to detect regional events by observing the number of simultaneous mode changes across the network; if a large number of stations change at the same time, that indicates an event. The hidden Markov model (HMM) approach assumes that the observed data has been generated by an unobservable dynamical statistical process. The process is of a particular form such that each observation is coincident with the system being in a particular discrete state, which is interpreted as a behavioral mode. The dynamics of the model are constructed so that the next state is directly dependent only on the current state -- it is a first order Markov process. The model is completely described by a set of parameters: the initial state probabilities, the first order Markov chain state-to-state transition probabilities, and the probability distribution of observable outputs associated with each state. The result of this approach is that our segmentation decisions are based entirely on statistical changes in the behavior of the observed daily displacements. In general, finding the optimal model parameters to fit the data is a difficult problem. We present an innovative model fitting method that is unsupervised (i.e., it requires no labeled training data) and uses a regularized version of the expectation-maximization (EM) algorithm to ensure that model solutions are both robust with respect to initial conditions and of high quality. We demonstrate the reliability of the method as compared to standard model fitting methods and show that it results in lower noise in the mode change correlation signal used to detect regional events. We compare candidate events detected by this method to the seismic record and observe that most are not correlated with a significant seismic event. Our analysis
Combining Wavelet Transform and Hidden Markov Models for ECG Segmentation
NASA Astrophysics Data System (ADS)
Andreão, Rodrigo Varejão; Boudy, Jérôme
2006-12-01
This work aims at providing new insights on the electrocardiogram (ECG) segmentation problem using wavelets. The wavelet transform has been originally combined with a hidden Markov model (HMM) framework in order to carry out beat segmentation and classification. A group of five continuous wavelet functions commonly used in ECG analysis has been implemented and compared using the same framework. All experiments were realized on the QT database, which is composed of a representative number of ambulatory recordings of several individuals and is supplied with manual labels made by a physician. Our main contribution relies on the consistent set of experiments performed. Moreover, the results obtained in terms of beat segmentation and premature ventricular beat (PVC) detection are comparable to other works reported in the literature, independently of the type of the wavelet. Finally, through an original concept of combining two wavelet functions in the segmentation stage, we achieve our best performances.
Hidden Markov model using Dirichlet process for de-identification.
Chen, Tao; Cullen, Richard M; Godwin, Marshall
2015-12-01
For the 2014 i2b2/UTHealth de-identification challenge, we introduced a new non-parametric Bayesian hidden Markov model using a Dirichlet process (HMM-DP). The model intends to reduce task-specific feature engineering and to generalize well to new data. In the challenge we developed a variational method to learn the model and an efficient approximation algorithm for prediction. To accommodate out-of-vocabulary words, we designed a number of feature functions to model such words. The results show the model is capable of understanding local context cues to make correct predictions without manual feature engineering and performs as accurately as state-of-the-art conditional random field models in a number of categories. To incorporate long-range and cross-document context cues, we developed a skip-chain conditional random field model to align the results produced by HMM-DP, which further improved the performance. PMID:26407642
Hidden Markov models for fault detection in dynamic systems
NASA Technical Reports Server (NTRS)
Smyth, Padhraic J. (Inventor)
1993-01-01
The invention is a system failure monitoring method and apparatus which learns the symptom-fault mapping directly from training data. The invention first estimates the state of the system at discrete intervals in time. A feature vector x of dimension k is estimated from sets of successive windows of sensor data. A pattern recognition component then models the instantaneous estimate of the posterior class probability given the features, $p(w_i \mid x)$, $1 \le i \le m$. Finally, a hidden Markov model is used to take advantage of temporal context and estimate class probabilities conditioned on recent past history. In this hierarchical pattern of information flow, the time series data is transformed and mapped into a categorical representation (the fault classes) and integrated over time to enable robust decision-making.
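The last step of this pipeline, using temporal context to condition class probabilities on recent history, can be illustrated with a recursive filter. This is a hedged sketch and not the patented procedure: it assumes uniform class priors (so each instantaneous posterior is proportional to a likelihood), and the function name `filter_class_posteriors`, the transition matrix, and the example values are hypothetical.

```python
def filter_class_posteriors(posteriors, trans, init):
    """Recursively combine per-window class posteriors with a Markov prior.

    posteriors[t][j] is the instantaneous estimate p(w_j | x_t);
    trans[i][j] = Pr(class j at t+1 | class i at t); init is the class
    distribution assumed before the first window. Assumes uniform class
    priors, so each posterior is proportional to a likelihood.
    """
    m = len(init)
    belief = list(init)
    filtered = []
    for t, post in enumerate(posteriors):
        if t > 0:
            # Predict: push the previous belief through the transition matrix.
            belief = [sum(belief[i] * trans[i][j] for i in range(m))
                      for j in range(m)]
        # Update: weight the prediction by the instantaneous posterior.
        belief = [belief[j] * post[j] for j in range(m)]
        total = sum(belief)
        belief = [b / total for b in belief]
        filtered.append(belief)
    return filtered
```

With a "sticky" two-class transition matrix such as [[0.99, 0.01], [0.01, 0.99]], a single window whose instantaneous posterior briefly favors the fault class is outweighed by the preceding history, which is one way temporal integration can lower the false-alarm rate.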
Comparison of glycosyltransferase families using the profile hidden Markov model.
Kikuchi, Norihiro; Kwon, Yeon-Dae; Gotoh, Masanori; Narimatsu, Hisashi
2003-10-17
In order to investigate the relationship between glycosyltransferase families and the motif for them, we classified 47 glycosyltransferase families in the CAZy database into four superfamilies, GTS-A, -B, -C, and -D, using a profile Hidden Markov Model method. On the basis of the classification and the similarity between GTS-A and nucleotidylyltransferase family catalyzing the synthesis of nucleotide-sugar, we proposed that ancient oligosaccharide might have been synthesized by the origin of GTS-B whereas the origin of GTS-A might be the gene encoding for synthesis of nucleotide-sugar as the donor and have evolved to glycosyltransferases to catalyze the synthesis of divergent carbohydrates. We also suggested that the divergent evolution of each superfamily in the corresponding subcellular component has increased the complexities of eukaryotic carbohydrate structure. PMID:14521949
Pediatric heart sound segmentation using hidden Markov model.
Sedighian, Pouye; Subudhi, Andrew W; Scalzo, Fabien; Asgari, Shadnaz
2014-01-01
Recent advances in technology have enabled automatic cardiac auscultation using digital stethoscopes. This in turn creates the need for development of algorithms capable of automatic segmentation of heart sounds. Pediatric heart sound segmentation is a challenging task due to various confounding factors including the significant influence of respiration on children's heart sounds. The current work investigates the application of homomorphic filtering and Hidden Markov Model for the purpose of segmenting pediatric heart sounds. The efficacy of the proposed method is evaluated on the publicly available Pascal Challenge dataset and its performance is compared with those of three other existing methods. The results show that our proposed method achieves an accuracy of 92.4%±1.1% and 93.5%±1.1% in identifying the first and second heart sound components, respectively, and is superior to three other existing methods in terms of accuracy or computational complexity. PMID:25571237
Natural movement generation using hidden Markov models and principal components.
Kwon, Junghyun; Park, Frank C
2008-10-01
Recent studies have shown that the perception of natural movements (in the sense of being "humanlike") depends on both joint and task space characteristics of the movement. This paper proposes a movement generation framework that merges two established techniques from gesture recognition and motion generation, hidden Markov models (HMMs) and principal components, into an efficient and reliable means of generating natural movements, which uniformly considers joint and task space characteristics. Given human motion data that are classified into several movement categories, for each category, the principal components extracted from the joint trajectories are used as basis elements. An HMM is, in turn, designed and trained for each movement class using the human task space motion data. Natural movements are generated as the optimal linear combination of principal components, which yields the highest probability for the trained HMM. Experimental case studies with a prototype humanoid robot demonstrate the various advantages of our proposed framework. PMID:18784005
Hidden Markov models for fault detection in dynamic systems
NASA Technical Reports Server (NTRS)
Smyth, Padhraic J. (Inventor)
1995-01-01
The invention is a system failure monitoring method and apparatus which learns the symptom-fault mapping directly from training data. The invention first estimates the state of the system at discrete intervals in time. A feature vector x of dimension k is estimated from sets of successive windows of sensor data. A pattern recognition component then models the instantaneous estimate of the posterior class probability given the features, $p(w_i \mid x)$, $1 \le i \le m$. Finally, a hidden Markov model is used to take advantage of temporal context and estimate class probabilities conditioned on recent past history. In this hierarchical pattern of information flow, the time series data is transformed and mapped into a categorical representation (the fault classes) and integrated over time to enable robust decision-making.
Stylistic gait synthesis based on hidden Markov models
NASA Astrophysics Data System (ADS)
Tilmanne, Joëlle; Moinet, Alexis; Dutoit, Thierry
2012-12-01
In this work we present an expressive gait synthesis system based on hidden Markov models (HMMs), following and modifying a procedure originally developed for speaking style adaptation in speech synthesis. A large database of neutral motion capture walk sequences was used to train an HMM of average walk. The model was then used for automatic adaptation to a particular style of walk using only a small amount of training data from the target style. The open source toolkit that we adapted for motion modeling also enabled us to take into account the dynamics of the data and to model accurately the duration of each HMM state. We also address the assessment issue and propose a procedure for qualitative user evaluation of the synthesized sequences. Our tests show that the style of these sequences can easily be recognized and that the sequences look natural to the evaluators.
Bayesian restoration of a hidden Markov chain with applications to DNA sequencing.
Churchill, G A; Lazareva, B
1999-01-01
Hidden Markov models (HMMs) are a class of stochastic models that have proven to be powerful tools for the analysis of molecular sequence data. A hidden Markov model can be viewed as a black box that generates sequences of observations. The unobservable internal state of the box is stochastic and is determined by a finite state Markov chain. The observable output is stochastic with distribution determined by the state of the hidden Markov chain. We present a Bayesian solution to the problem of restoring the sequence of states visited by the hidden Markov chain from a given sequence of observed outputs. Our approach is based on a Monte Carlo Markov chain algorithm that allows us to draw samples from the full posterior distribution of the hidden Markov chain paths. The problem of estimating the probability of individual paths and the associated Monte Carlo error of these estimates is addressed. The method is illustrated by considering a problem of DNA sequence multiple alignment. The special structure for the hidden Markov model used in the sequence alignment problem is considered in detail. In conclusion, we discuss certain interesting aspects of biological sequence alignments that become accessible through the Bayesian approach to HMM restoration. PMID:10421527
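The "black box" description in this abstract translates directly into a sampler: a hidden finite-state Markov chain steps forward, and each visited state emits an output from its own distribution. The sketch below is a generic discrete HMM generator written for illustration; the function name `sample_hmm` and its argument layout are assumptions and are not taken from the paper.

```python
import random


def sample_hmm(init, trans, emit, length, rng):
    """Generate hidden states and observed outputs from a discrete HMM.

    init[i] = Pr(first state = i); trans[i][j] = Pr(next = j | current = i);
    emit[i][k] = Pr(output = k | state = i); rng is a random.Random instance.
    """
    n_states = len(init)
    n_symbols = len(emit[0])
    states, outputs = [], []
    state = rng.choices(range(n_states), weights=init)[0]
    for _ in range(length):
        states.append(state)
        # The output depends only on the current hidden state.
        outputs.append(rng.choices(range(n_symbols), weights=emit[state])[0])
        # The next state depends only on the current state (Markov property).
        state = rng.choices(range(n_states), weights=trans[state])[0]
    return states, outputs
```

Only `outputs` would be available to an observer; the restoration problem discussed in the abstract is to reason about `states` given `outputs` alone.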
Volatility: A hidden Markov process in financial time series
NASA Astrophysics Data System (ADS)
Eisler, Zoltán; Perelló, Josep; Masoliver, Jaume
2007-11-01
Volatility characterizes the amplitude of price return fluctuations. It is a central magnitude in finance closely related to the risk of holding a certain asset. Despite its popularity on trading floors, volatility is unobservable and only the price is known. Diffusion theory has many common points with the research on volatility, the key of the analogy being that volatility is a time-dependent diffusion coefficient of the random walk for the price return. We present a formal procedure to extract volatility from price data by assuming that it is described by a hidden Markov process which together with the price forms a two-dimensional diffusion process. We derive a maximum-likelihood estimate of the volatility path valid for a wide class of two-dimensional diffusion processes. The choice of the exponential Ornstein-Uhlenbeck (expOU) stochastic volatility model performs remarkably well in inferring the hidden state of volatility. The formalism is applied to the Dow Jones index. The main results are that (i) the distribution of estimated volatility is lognormal, which is consistent with the expOU model, (ii) the estimated volatility is related to trading volume by a power law of the form $\sigma \propto V^{0.55}$, and (iii) future returns are proportional to the current volatility, which suggests some degree of predictability for the size of future returns.
Volatility: a hidden Markov process in financial time series.
Eisler, Zoltán; Perelló, Josep; Masoliver, Jaume
2007-11-01
Volatility characterizes the amplitude of price return fluctuations. It is a central magnitude in finance closely related to the risk of holding a certain asset. Despite its popularity on trading floors, volatility is unobservable and only the price is known. Diffusion theory has many common points with the research on volatility, the key of the analogy being that volatility is a time-dependent diffusion coefficient of the random walk for the price return. We present a formal procedure to extract volatility from price data by assuming that it is described by a hidden Markov process which together with the price forms a two-dimensional diffusion process. We derive a maximum-likelihood estimate of the volatility path valid for a wide class of two-dimensional diffusion processes. The choice of the exponential Ornstein-Uhlenbeck (expOU) stochastic volatility model performs remarkably well in inferring the hidden state of volatility. The formalism is applied to the Dow Jones index. The main results are that (i) the distribution of estimated volatility is lognormal, which is consistent with the expOU model, (ii) the estimated volatility is related to trading volume by a power law of the form $\sigma \propto V^{0.55}$, and (iii) future returns are proportional to the current volatility, which suggests some degree of predictability for the size of future returns. PMID:18233716
Robust Hidden Markov Models for Geophysical Data Analysis
NASA Astrophysics Data System (ADS)
Granat, R. A.
2002-12-01
We employed robust hidden Markov models (HMMs) to perform statistical analysis of seismic events and crustal deformation. These models allowed us to classify different kinds of events or modes of deformation, and furthermore gave us a statistical basis for understanding relationships between different classes. A hidden Markov model is a statistical model for ordered data (typically in time). The observed data is assumed to have been generated by an unobservable statistical process of a particular form. This process is such that each observation is coincident with the system being in a particular discrete state. Furthermore, the next state is dependent on the current state; in other words, it is a first order Markov process. The model is completely described by a set of model parameters: the initial state probabilities, the first order Markov chain state-to-state transition probabilities, and the probabilities of observable outputs associated with each state. Application of the model to data involves optimizing these model parameters with respect to some function of the observations, typically the likelihood of the observations given the model. Our work focused on the fact that this objective function typically has a number of local maxima that is exponential in the model size (the number of states). This means that not only is it very difficult to discover the global maximum, but also that results can vary widely between applications of the model. For some domains, such as speech processing, sufficient a priori information about the system is available such that this problem can be avoided. However, for general scientific analysis, such a priori information is often not available, especially in cases where the HMM is being used as an exploratory tool for scientific understanding. Such was the case for the geophysical data sets used in this work. Our approach involves analytical location of sub-optimal local maxima; once the locations of these maxima have been found
Lee, Lee-Min; Jean, Fu-Rong
2016-08-01
Hidden Markov models have been widely applied to systems with sequential data. However, the conditional independence of the state outputs limits the output of a hidden Markov model to a piecewise constant random sequence, which is not a good approximation for many real processes. In this paper, a high-order hidden Markov model for piecewise linear processes is proposed to better approximate the behavior of a real process. A parameter estimation method based on the expectation-maximization algorithm was derived for the proposed model. Experiments on speech recognition of noisy Mandarin digits were conducted to examine the effectiveness of the proposed method. Experimental results show that the proposed method can reduce the recognition error rate compared to a baseline hidden Markov model. PMID:27586781
Identifying Seismicity Levels via Poisson Hidden Markov Models
NASA Astrophysics Data System (ADS)
Orfanogiannaki, K.; Karlis, D.; Papadopoulos, G. A.
2010-08-01
Poisson Hidden Markov models (PHMMs) are introduced to model temporal seismicity changes. In a PHMM the unobserved sequence of states is a finite-state Markov chain and the distribution of the observation at any time is Poisson with rate depending only on the current state of the chain. Thus, PHMMs allow a region to have a varying seismicity rate. We applied the PHMM to model earthquake frequencies in the seismogenic area of Killini, Ionian Sea, Greece, between 1990 and 2006. Simulations of data from the assumed model showed that it describes the true data quite well. The earthquake catalogue is dominated by main shocks occurring in 1993, 1997 and 2002. The time plot of PHMM seismicity states not only reproduces the three seismicity clusters but also quantifies the seismicity level and underlines the degree of strength of the serial dependence of the events at any point of time. Foreshock activity becomes quite evident before the three sequences with the gradual transition to states of cascade seismicity. Traditional analysis, based on the determination of highly significant changes of seismicity rates, failed to recognize foreshocks before the 1997 main shock due to the low number of events preceding that main shock. Thus, the PHMM performs better than traditional analysis, since the transition from one state to another depends not only on the total number of events involved but also on the current state of the system. Therefore, the PHMM recognizes significant changes of seismicity soon after they start, which is of particular importance for real-time recognition of foreshock activities and other seismicity changes.
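The PHMM definition given in this abstract (a finite-state hidden chain with a state-dependent Poisson rate) is enough to write down a scaled forward recursion. The following is a minimal sketch, not the authors' code: it returns the log-likelihood of a sequence of event counts together with the filtered state probabilities at each time step. The function names and all parameter values in the usage note are hypothetical.

```python
import math


def poisson_logpmf(k, rate):
    """Log of the Poisson probability of observing k events at the given rate."""
    return k * math.log(rate) - rate - math.lgamma(k + 1)


def phmm_forward(counts, init, trans, rates):
    """Scaled forward pass for a Poisson hidden Markov model.

    init[i] = Pr(first state = i); trans[i][j] = Pr(next = j | current = i);
    rates[i] is the Poisson rate in state i. Returns the log-likelihood of
    the counts and the filtered state distribution at every time step.
    """
    n = len(init)
    loglik = 0.0
    alpha = list(init)
    filtered = []
    for t, k in enumerate(counts):
        if t > 0:
            # Propagate yesterday's belief through the transition matrix.
            alpha = [sum(alpha[i] * trans[i][j] for i in range(n))
                     for j in range(n)]
        # Weight each state by how well its rate explains the count k.
        alpha = [alpha[j] * math.exp(poisson_logpmf(k, rates[j]))
                 for j in range(n)]
        scale = sum(alpha)
        loglik += math.log(scale)
        alpha = [a / scale for a in alpha]
        filtered.append(alpha)
    return loglik, filtered
```

For example, with assumed rates [0.5, 8.0] and counts [0, 0, 12], the filtered probability of the high-rate state jumps close to one at the third step. Because the belief is updated as soon as an unusual count arrives, a state change can be flagged shortly after it starts.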
Optical character recognition of handwritten Arabic using hidden Markov models
NASA Astrophysics Data System (ADS)
Aulama, Mohannad M.; Natsheh, Asem M.; Abandah, Gheith A.; Olama, Mohammed M.
2011-04-01
The problem of optical character recognition (OCR) of handwritten Arabic has not received a satisfactory solution yet. In this paper, an Arabic OCR algorithm is developed based on Hidden Markov Models (HMMs) combined with the Viterbi algorithm, which results in an improved and more robust recognition of characters at the sub-word level. Integrating the HMMs represents another step of the overall OCR trends being currently researched in the literature. The proposed approach exploits the structure of characters in the Arabic language in addition to their extracted features to achieve improved recognition rates. Useful statistical information of the Arabic language is initially extracted and then used to estimate the probabilistic parameters of the mathematical HMM. A new custom implementation of the HMM is developed in this study, where the transition matrix is built based on the collected large corpus, and the emission matrix is built based on the results obtained via the extracted character features. The recognition process is triggered using the Viterbi algorithm which employs the most probable sequence of sub-words. The model was implemented to recognize the sub-word unit of Arabic text raising the recognition rate from being linked to the worst recognition rate for any character to the overall structure of the Arabic language. Numerical results show that there is a potentially large recognition improvement by using the proposed algorithms.
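The decoding step mentioned in this abstract, finding the most probable hidden sequence with the Viterbi algorithm, can be sketched generically. The code below is a textbook log-space Viterbi for a discrete HMM and is not the authors' custom implementation; in the OCR setting the hidden states would presumably correspond to characters and the observations to extracted feature symbols, but the matrices in the usage note are toy values chosen only for illustration.

```python
import math


def _log(p):
    """Natural log that maps zero probability to negative infinity."""
    return math.log(p) if p > 0.0 else float("-inf")


def viterbi(obs, init, trans, emit):
    """Return the most probable hidden state path for a discrete HMM.

    init[s] = Pr(first state = s); trans[r][s] = Pr(next = s | current = r);
    emit[s][o] = Pr(observation = o | state = s).
    """
    n = len(init)
    score = [_log(init[s]) + _log(emit[s][obs[0]]) for s in range(n)]
    backptrs = []
    for o in obs[1:]:
        prev = score
        step, score = [], []
        for s in range(n):
            # Best predecessor of state s at this time step.
            best = max(range(n), key=lambda r: prev[r] + _log(trans[r][s]))
            step.append(best)
            score.append(prev[best] + _log(trans[best][s]) + _log(emit[s][o]))
        backptrs.append(step)
    # Trace back from the best final state.
    state = max(range(n), key=lambda s: score[s])
    path = [state]
    for step in reversed(backptrs):
        state = step[state]
        path.append(state)
    path.reverse()
    return path
```

With the toy parameters init = [0.6, 0.4], trans = [[0.7, 0.3], [0.4, 0.6]] and emit = [[0.5, 0.4, 0.1], [0.1, 0.3, 0.6]], the call viterbi([0, 1, 2], init, trans, emit) returns [0, 0, 1].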
Hidden Markov chain modeling for epileptic networks identification.
Le Cam, Steven; Louis-Dorr, Valérie; Maillard, Louis
2013-01-01
Partial epileptic seizures are often considered to be caused by a wrong balance between inhibitory and excitatory interneuron connections within a focal brain area. These abnormal balances are likely to result in loss of functional connectivities between remote brain structures, while functional connectivities within the incriminated zone are enhanced. The identification of the epileptic networks underlying these hypersynchronies is expected to contribute to a better understanding of the brain mechanisms responsible for the development of the seizures. To this end, threshold strategies are commonly applied, based on synchrony measurements computed from recordings of the electrophysiologic brain activity. However, such methods are reported to be prone to errors and false alarms. In this paper, we propose a hidden Markov chain modeling of the synchrony states with the aim of developing a reliable machine learning method for epileptic network inference. The method is applied to a real Stereo-EEG recording, demonstrating results consistent with the clinical evaluations and with the current knowledge on temporal lobe epilepsy. PMID:24110697
Identification and classification of conopeptides using profile Hidden Markov Models.
Laht, Silja; Koua, Dominique; Kaplinski, Lauris; Lisacek, Frédérique; Stöcklin, Reto; Remm, Maido
2012-03-01
Conopeptides are small toxins produced by predatory marine snails of the genus Conus. They are studied with increasing intensity due to their potential in neurosciences and pharmacology. The number of existing conopeptides is estimated to be 1 million, but only about 1000 have been described to date. Thanks to new high-throughput sequencing technologies the number of known conopeptides is likely to increase exponentially in the near future. There is therefore a need for a fast and accurate computational method for identification and classification of the novel conopeptides in large data sets. 62 profile Hidden Markov Models (pHMMs) were built for prediction and classification of all described conopeptide superfamilies and families, based on the different parts of the corresponding protein sequences. These models showed very high specificity in detection of new peptides. 56 out of 62 models do not give a single false positive in a test with the entire UniProtKB/Swiss-Prot protein sequence database. Our study demonstrates the usefulness of mature peptide models for automatic classification with accuracy of 96% for the mature peptide models and 100% for the pro- and signal peptide models. Our conopeptide profile HMMs can be used for finding and annotation of new conopeptides from large datasets generated by transcriptome or genome sequencing. To our knowledge this is the first time this kind of computational method has been applied to predict all known conopeptide superfamilies and some conopeptide families. PMID:22244925
Efficient inference of hidden Markov models from large observation sequences
NASA Astrophysics Data System (ADS)
Priest, Benjamin W.; Cybenko, George
2016-05-01
The hidden Markov model (HMM) is widely used to model time series data. However, the conventional Baum-Welch algorithm is known to perform poorly when applied to long observation sequences. The literature contains several alternatives that seek to improve the memory or time complexity of the algorithm. However, for an HMM with N states and an observation sequence of length T, these alternatives require at best O(N) space and O(N^2 T) time. Given the preponderance of applications that increasingly deal with massive amounts of data, an alternative whose time is O(T)+poly(N) is desired. Recent research presents an alternative to the Baum-Welch algorithm that relies on nonnegative matrix factorization. This document examines the space complexity of this alternative approach and proposes further optimizations using approaches adopted from the matrix sketching literature. The result is a streaming algorithm whose space complexity is constant and whose time complexity is linear with respect to the size of the observation sequence. The paper also presents a batch algorithm that allows for even further improved space complexity at the expense of an additional pass over the observation sequence.
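To see where an N^2 T time term and an O(N) working-space term can arise, here is a minimal sketch of a single scaled forward pass over a discrete HMM. It is not the algorithm proposed in the paper above, and it is not full Baum-Welch; the function name and argument layout are assumptions for illustration. Each of the T steps updates N values, and each update sums over N predecessors, while only the current vector of N values is kept.

```python
import math

def forward_log_likelihood(obs, start_p, trans_p, emit_p):
    """Log-likelihood of an observation sequence under a discrete HMM.

    Time is O(N^2 * T); working memory is O(N) beyond the model itself,
    because only the current forward vector `alpha` is stored.
    (Names are hypothetical, chosen for this sketch.)
    """
    n = len(start_p)
    alpha = [start_p[i] * emit_p[i][obs[0]] for i in range(n)]
    log_lik = 0.0
    for t in range(len(obs)):
        if t > 0:
            o = obs[t]
            # N new entries, each an O(N) sum over the previous vector.
            alpha = [
                emit_p[j][o] * sum(alpha[i] * trans_p[i][j] for i in range(n))
                for j in range(n)
            ]
        # Rescale at every step so long sequences do not underflow.
        scale = sum(alpha)
        log_lik += math.log(scale)
        alpha = [a / scale for a in alpha]
    return log_lik
```

The rescaling keeps the running vector normalized, so the log-likelihood is accumulated as a sum of log scale factors rather than as one very small product.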
Supervised learning of hidden Markov models for sequence discrimination
Mamitsuka, Hiroshi
1997-12-01
We present two supervised learning algorithms for hidden Markov models (HMMs) for sequence discrimination. When we model a class of sequences with an HMM, conventional learning algorithms for HMMs have trained the HMM with training examples belonging to the class, i.e. positive examples alone, while both of our methods allow us to use negative examples as well as positive examples. One of our algorithms minimizes a kind of distance between a target likelihood of a given training sequence and an actual likelihood of the sequence, which is obtained by a given HMM, using an additive type of parameter updating based on a gradient-descent learning. The other algorithm maximizes a criterion which represents a kind of ratio of the likelihood of a positive example to the likelihood of the total example, using a multiplicative type of parameter updating which is more efficient in actual computation time than the additive type one. We compare our two methods with two conventional methods on a type of cross-validation of actual motif classification experiments. Experimental results show that in terms of the average number of classification errors, our two methods out-perform the two conventional algorithms. 14 refs., 4 figs., 1 tab.
A hidden markov model derived structural alphabet for proteins.
Camproux, A C; Gautier, R; Tufféry, P
2004-06-01
Understanding and predicting protein structures depends on the complexity and the accuracy of the models used to represent them. We have set up a hidden Markov model that discretizes protein backbone conformation as a series of overlapping fragments (states), each four residues long. This approach learns simultaneously the geometry of the states and their connections. We obtain, using a statistical criterion, an optimal systematic decomposition of the conformational variability of the protein peptidic chain in 27 states with strong connection logic. This result is stable over different protein sets. Our model fits well with previous knowledge of protein architecture organisation and seems able to capture some subtle details of protein organisation, such as helix sub-level organisation schemes. Taking into account the dependence between the states results in a description of local protein structure of low complexity. On average, the model makes use of only 8.3 states among 27 to describe each position of a protein structure. Although we use short fragments, the learning process on entire protein conformations captures the logic of the assembly on a larger scale. Using such a model, the structure of proteins can be reconstructed with an average accuracy close to 1.1 Å root-mean-square deviation and for a complexity of only 3. Finally, we also observe that sequence specificity increases with the number of states of the structural alphabet. Such models can constitute a very relevant approach to the analysis of protein architecture, in particular for protein structure prediction. PMID:15147844
A Network of SCOP Hidden Markov Models and Its Analysis
2011-01-01
Background: The Structural Classification of Proteins (SCOP) database uses a large number of hidden Markov models (HMMs) to represent families and superfamilies composed of proteins that presumably share the same evolutionary origin. However, how the HMMs are related to one another has not been examined before. Results: In this work, taking into account the processes used to build the HMMs, we propose a working hypothesis to examine the relationships between HMMs and the families and superfamilies that they represent. Specifically, we perform an all-against-all HMM comparison using the HHsearch program (similar to BLAST) and construct a network where the nodes are HMMs and the edges connect similar HMMs. We hypothesize that the HMMs in a connected component belong to the same family or superfamily more often than expected under a random network connection model. Results show a pattern consistent with this working hypothesis. Moreover, the HMM network possesses features distinctly different from the previously documented biological networks, exemplified by the exceptionally high clustering coefficient and the large number of connected components. Conclusions: The current finding may provide guidance in devising computational methods to reduce the degree of overlap between the HMMs representing the same superfamilies, which may in turn enable more efficient large-scale sequence searches against the database of HMMs. PMID:21635719
Analysis of nanopore data using hidden Markov models
Schreiber, Jacob; Karplus, Kevin
2015-01-01
Motivation: Nanopore-based sequencing techniques can reconstruct properties of biosequences by analyzing the sequence-dependent ionic current steps produced as biomolecules pass through a pore. Typically this involves alignment of new data to a reference, where both reference construction and alignment have been performed by hand. Results: We propose an automated method for aligning nanopore data to a reference through the use of hidden Markov models. Several features that arise from prior processing steps and from the class of enzyme used can be simply incorporated into the model. Previously, the M2MspA nanopore was shown to be sensitive enough to distinguish between cytosine, methylcytosine and hydroxymethylcytosine. We validated our automated methodology on a subset of that data by automatically calculating an error rate for the distinction between the three cytosine variants and show that the automated methodology produces a 2–3% error rate, lower than the 10% error rate from previous manual segmentation and alignment. Availability and implementation: The data, output, scripts and tutorials replicating the analysis are available at https://github.com/UCSCNanopore/Data/tree/master/Automation. Contact: karplus@soe.ucsc.edu or jmschreiber91@gmail.com Supplementary information: Supplementary data are available from Bioinformatics online. PMID:25649617
Recognition of surgical skills using hidden Markov models
NASA Astrophysics Data System (ADS)
Speidel, Stefanie; Zentek, Tom; Sudra, Gunther; Gehrig, Tobias; Müller-Stich, Beat Peter; Gutt, Carsten; Dillmann, Rüdiger
2009-02-01
Minimally invasive surgery is a highly complex medical discipline and can be regarded as a major breakthrough in surgical technique. A minimally invasive intervention requires enhanced motor skills to deal with difficulties like the complex hand-eye coordination and restricted mobility. To alleviate these constraints we propose to enhance the surgeon's capabilities by providing a context-aware assistance using augmented reality techniques. To recognize and analyze the current situation for context-aware assistance, we need intraoperative sensor data and a model of the intervention. Characteristics of a situation are the performed activity, the used instruments, the surgical objects and the anatomical structures. Important information about the surgical activity can be acquired by recognizing the surgical gesture performed. Surgical gestures in minimally invasive surgery like cutting, knot-tying or suturing are here referred to as surgical skills. We use the motion data from the endoscopic instruments to classify and analyze the performed skill and even use it for skill evaluation in a training scenario. The system uses Hidden Markov Models (HMM) to model and recognize a specific surgical skill like knot-tying or suturing with an average recognition rate of 92%.
A clustering approach for estimating parameters of a profile hidden Markov model.
Aghdam, Rosa; Pezeshk, Hamid; Malekpour, Seyed Amir; Shemehsavar, Soudabeh; Eslahchi, Changiz
2013-01-01
A Profile Hidden Markov Model (PHMM) is a standard form of Hidden Markov Model used for modeling protein and DNA sequence families based on multiple alignment. In this paper, we implement the Baum-Welch algorithm and the Bayesian Monte Carlo Markov Chain (BMCMC) method for estimating the parameters of a small artificial PHMM. In order to improve the prediction accuracy of the parameter estimates, we classify the training data using the weighted values of sequences in the PHMM and then apply an algorithm for estimating the parameters of the PHMM. The results show that the BMCMC method performs better than Maximum Likelihood estimation. PMID:23865165
Smith, D A; Steffen, W; Simmons, R M; Sleep, J
2001-01-01
In single-molecule experiments on the interaction between myosin and actin, mechanical events are embedded in Brownian noise. Methods of detecting events have progressed from simple manual detection of shifts in the position record to threshold-based selection of intermittent periods of reduction in noise. However, none of these methods provides a "best fit" to the data. We have developed a Hidden-Markov algorithm that assumes a simple kinetic model for the actin-myosin interaction and provides automatic, threshold-free, maximum-likelihood detection of events. The method is developed for the case of a weakly trapped actin-bead dumbbell interacting with a stationary myosin molecule (Finer, J. T., R. M. Simmons, and J. A. Spudich. 1994. Nature. 368:113-119). The algorithm operates on the variance of bead position signals in a running window, and is tested using Monte Carlo simulations to formulate ways of determining the optimum window width. The working stroke is derived and corrected for actin-bead link compliance. With experimental data, we find that modulation of myosin binding by the helical structure of the actin filament complicates the determination of the working stroke; however, under conditions that produce a Gaussian distribution of bound levels (cf. Molloy, J. E., J. E. Burns, J. Kendrick-Jones, R. T. Tregear, and D. C. S. White. 1995. Nature. 378:209-212), four experiments gave working strokes in the range 5.4-6.3 nm for rabbit skeletal muscle myosin S1. PMID:11606292
Group association test using a hidden Markov model.
Cheng, Yichen; Dai, James Y; Kooperberg, Charles
2016-04-01
In the genomic era, group association tests are of great interest. Due to the overwhelming number of individual genomic features, the power of testing for association of a single genomic feature at a time is often very small, as are the effect sizes for most features. Many methods have been proposed to test association of a trait with a group of features within a functional unit as a whole, e.g. all SNPs in a gene, yet few of these methods account for the fact that generally a substantial proportion of the features are not associated with the trait. In this paper, we propose to model the association for each feature in the group as a mixture of features with no association and features with non-zero associations, to explicitly account for the possibility that a fraction of features may not be associated with the trait while other features in the group are. The feature-level associations are first estimated by generalized linear models; the sequence of these estimated associations is then modeled by a hidden Markov chain. To test for global association, we develop a modified likelihood ratio test based on a log-likelihood function that ignores higher order dependency plus a penalty term. We derive the asymptotic distribution of the likelihood ratio test under the null hypothesis. Furthermore, we obtain the posterior probability of association for each feature, which provides evidence of feature-level association and is useful for potential follow-up studies. In simulations and data application, we show that our proposed method performs well when compared with existing group association tests, especially when only a few features are associated with the outcome. PMID:26420797
Ensemble hidden Markov models with application to landmine detection
NASA Astrophysics Data System (ADS)
Hamdi, Anis; Frigui, Hichem
2015-12-01
We introduce an ensemble learning method for temporal data that uses a mixture of hidden Markov models (HMM). We hypothesize that the data are generated by K models, each of which reflects a particular trend in the data. The proposed approach, called ensemble HMM (eHMM), is based on clustering within the log-likelihood space and has two main steps. First, one HMM is fit to each of the N individual training sequences. For each fitted model, we evaluate the log-likelihood of each sequence. This results in an N-by-N log-likelihood distance matrix that will be partitioned into K groups using a relational clustering algorithm. In the second step, we learn the parameters of one HMM per cluster. We propose using and optimizing various training approaches for the different K groups depending on their size and homogeneity. In particular, we investigate the maximum likelihood (ML), the minimum classification error (MCE), and the variational Bayesian (VB) training approaches. Finally, to test a new sequence, its likelihood is computed in all the models and a final confidence value is assigned by combining the models' outputs using an artificial neural network. We propose both discrete and continuous versions of the eHMM. Our approach was evaluated on a real-world application for landmine detection using ground-penetrating radar (GPR). Results show that both the continuous and discrete eHMM can identify meaningful and coherent HMM mixture components that describe different properties of the data. Each HMM mixture component models a group of data that share common attributes. These attributes are reflected in the mixture model's parameters. The results indicate that the proposed method outperforms the baseline HMM that uses one model for each class in the data.
Target characterization using hidden Markov models and classifiers
Kil, D.H.; Shin, F.B.; Fricke, J.R.
1996-06-01
We investigate various projection spaces and extract key parameters or features from each space to characterize low-frequency active (LFA) target returns in a low-dimensional space. The projection spaces encompass (1) time-embedded phase map, (2) segmented matched filter output, (3) various time-frequency distribution functions, such as Reduced Interference Distribution, to capture time-varying echo signatures, and (4) principal component inversion for signal cleaning and characterization. We utilize both dynamic and static features and parameterize them with a hybrid classification methodology consisting of hidden Markov models, classifiers, and data fusion. This clue identification and evaluation process is complemented by concurrent work on target physics to enhance our understanding of the target echo formation process. As a function of target aspect, we can observe (1) back scatter dominated by axial n=0 modes propagating back and forth along the length of the shell, (2) direct scatter from shell discontinuities, (3) helical or creeping waves from phase matching between the acoustic waves and membrane waves (both shear and compressional), and (4) the "array response" of the shell, with coherent superposition of elemental scattering sites along the shell leading to a peak response near broadside. As a function of target structures (the empty shell and the ribbed/complex shells), we see considerable complexity brought about by multiple reflections of the membrane waves between the rings. We show the merit of fusing parameters estimated from these projection spaces in characterizing LFA target returns using the MIT/NRL scaled model data. Our hybrid classifiers outperform the matched filter-based recognizer by an average of 5-25%. This improvement can be attributed to a combination of good features that maximize inter-class discrimination and appropriate classifier topologies that exploit the underlying multi-dimensional feature probability density function.
Wang, Hongyan; Zhou, Xiaobo
2013-04-01
By altering the electrostatic charge of histones or providing binding sites for protein recognition molecules, chromatin marks have been proposed to regulate gene expression, a property that has motivated researchers to link these marks to cis-regulatory elements. With the help of next-generation sequencing technologies, we can now correlate one specific chromatin mark with regulatory elements (e.g. enhancers or promoters) and also build tools, such as hidden Markov models, to gain insight into mark combinations. However, hidden Markov models are limited by their generative nature and by the assumption that the current observation depends only on the current hidden state in the chain. Here, we employed two probabilistic graphical models, namely the linear conditional random field model and the multivariate hidden Markov model, to label gene regions with different states based on the recurrent and spatially coherent character of eight chromatin marks. Both models revealed chromatin states that may correspond to enhancers and promoters, transcribed regions, transcriptional elongation, and low-signal regions. We also found that the linear conditional random field model was more effective than the hidden Markov model in recognizing regulatory elements, such as promoter-, enhancer-, and transcriptional elongation-associated regions, making it the better choice for this task. PMID:23237214
Comparison of the Beta and the Hidden Markov Models of Trust in Dynamic Environments
NASA Astrophysics Data System (ADS)
Moe, Marie E. G.; Helvik, Bjarne E.; Knapskog, Svein J.
Computational trust and reputation models are used to aid the decision-making process in complex dynamic environments, where we are unable to obtain perfect information about the interaction partners. In this paper we present a comparison of our proposed hidden Markov trust model to the Beta reputation system. The hidden Markov trust model takes the time between observations into account. It also distinguishes between system states and uses methods previously applied to intrusion detection to predict which state an agent is in. We show that the hidden Markov trust model performs better at detecting changes in the behavior of agents, owing to its richer set of model features. This means that our trust model may be more realistic in dynamic environments. However, the increased model complexity also makes it harder to estimate parameter values for the model. We also show that the hidden Markov trust model can be parameterized so that it responds similarly to the Beta reputation system.
Tracking Problem Solving by Multivariate Pattern Analysis and Hidden Markov Model Algorithms
ERIC Educational Resources Information Center
Anderson, John R.
2012-01-01
Multivariate pattern analysis can be combined with Hidden Markov Model algorithms to track the second-by-second thinking as people solve complex problems. Two applications of this methodology are illustrated with a data set taken from children as they interacted with an intelligent tutoring system for algebra. The first "mind reading" application…
Post processing with first- and second-order hidden Markov models
NASA Astrophysics Data System (ADS)
Taghva, Kazem; Poudel, Srijana; Malreddy, Spandana
2013-01-01
In this paper, we present the implementation and evaluation of first-order and second-order Hidden Markov Models to identify and correct OCR errors in the post-processing of books. Our experiments show that the first-order model corrects approximately 10% of the errors with 100% precision, while the second-order model corrects a higher percentage of errors with much lower precision.
Estimation of the occurrence rate of strong earthquakes based on hidden semi-Markov models
NASA Astrophysics Data System (ADS)
Votsi, I.; Limnios, N.; Tsaklidis, G.; Papadimitriou, E.
2012-04-01
The present paper applies hidden semi-Markov models (HSMMs) in an attempt to reveal key features of earthquake generation associated with the actual stress field, which is not accessible to direct observation. The models generalize hidden Markov models by letting the hidden process form a semi-Markov chain. Considering that the states of the models correspond to levels of the actual stress field, the stress field level at the occurrence time of each strong event is revealed. The dataset concerns a well-catalogued, seismically active region incorporating a variety of tectonic styles. More specifically, the models are applied to Greece and its surrounding lands, using a complete data sample of strong (M ≥ 6.5) earthquakes that occurred in the study area from 1845 to the present. The earthquakes are grouped according to their magnitudes, and the cases of two and three magnitude ranges, with a corresponding number of states, are examined. The parameters of the HSMMs are estimated and their confidence intervals are calculated based on their asymptotic behavior. The rate of earthquake occurrence is introduced through the proposed HSMMs and its maximum likelihood estimator is calculated. The asymptotic properties of the estimator are studied, including uniform strong consistency and asymptotic normality. The confidence interval for the proposed estimator is given. We assume the state space of both the observable and the hidden process to be finite, the hidden Markov chain to be homogeneous and stationary, and the observations to be conditionally independent. The hidden states at the occurrence time of each strong event are revealed and the rate of occurrence of an anticipated earthquake is estimated on the basis of the proposed HSMMs. Moreover, the mean time to the first occurrence of a strong anticipated earthquake is estimated and its confidence interval is calculated.
Pan, Xiaoliang; Schwartz, Steven D
2016-07-14
Lactate dehydrogenase (LDH) catalyzes the interconversion of pyruvate and lactate. Recent isotope-edited IR spectroscopy suggests that conformational heterogeneity exists within the Michaelis complex of LDH, and this heterogeneity affects the propensity toward the on-enzyme chemical step for each Michaelis substate. By combining molecular dynamics simulations with Markov and hidden Markov models, we obtained a detailed kinetic network of the substates of the Michaelis complex of LDH. The ensemble-average electric fields exerted onto the vibrational probe were calculated to provide a direct comparison with the vibrational spectroscopy. Structural features of the Michaelis substates were also analyzed on atomistic scales. Our work not only clearly demonstrates the conformational heterogeneity in the Michaelis complex of LDH and its coupling to the reactivities of the substates, but it also suggests a methodology to simultaneously resolve kinetics and structures on atomistic scales, which can be directly compared with the vibrational spectroscopy. PMID:27347759
STDP Installs in Winner-Take-All Circuits an Online Approximation to Hidden Markov Model Learning
Kappel, David; Nessler, Bernhard; Maass, Wolfgang
2014-01-01
In order to cross a street without being run over, we need to be able to extract very fast hidden causes of dynamically changing multi-modal sensory stimuli, and to predict their future evolution. We show here that a generic cortical microcircuit motif, pyramidal cells with lateral excitation and inhibition, provides the basis for this difficult but all-important information processing capability. This capability emerges in the presence of noise automatically through effects of STDP on connections between pyramidal cells in Winner-Take-All circuits with lateral excitation. In fact, one can show that these motifs endow cortical microcircuits with functional properties of a hidden Markov model, a generic model for solving such tasks through probabilistic inference. Whereas in engineering applications this model is adapted to specific tasks through offline learning, we show here that a major portion of the functionality of hidden Markov models arises already from online applications of STDP, without any supervision or rewards. We demonstrate the emergent computing capabilities of the model through several computer simulations. The full power of hidden Markov model learning can be attained through reward-gated STDP. This is due to the fact that these mechanisms enable a rejection sampling approximation to theoretically optimal learning. We investigate the possible performance gain that can be achieved with this more accurate learning method for an artificial grammar task. PMID:24675787
Fusion of Hidden Markov Random Field models and its Bayesian estimation.
Destrempes, François; Angers, Jean-François; Mignotte, Max
2006-10-01
In this paper, we present a Hidden Markov Random Field (HMRF) data-fusion model. The proposed model is applied to the segmentation of natural images based on the fusion of colors and textons into Julesz ensembles. The corresponding Exploration/Selection/Estimation (ESE) procedure for the estimation of the parameters is presented. This method achieves the estimation of the parameters of the Gaussian kernels, the mixture proportions, the region labels, the number of regions, and the Markov hyper-parameter. Meanwhile, we present a new proof of the asymptotic convergence of the ESE procedure, based on original finite time bounds for the rate of convergence. PMID:17022259
NASA Astrophysics Data System (ADS)
Vaglica, Gabriella; Lillo, Fabrizio; Mantegna, Rosario N.
2010-07-01
Large trades in a financial market are usually split into smaller parts and traded incrementally over extended periods of time. We address these large trades as hidden orders. In order to identify and characterize hidden orders, we fit hidden Markov models to the time series of the sign of the tick-by-tick inventory variation of market members of the Spanish Stock Exchange. Our methodology probabilistically detects trading sequences, which are characterized by a significant majority of buy or sell transactions. We interpret these patches of sequential buying or selling transactions as proxies of the traded hidden orders. We find that the time, volume and number of transaction size distributions of these patches are fat tailed. Long patches are characterized by a large fraction of market orders and a low participation rate, while short patches have a large fraction of limit orders and a high participation rate. We observe the existence of a buy-sell asymmetry in the number, average length, average fraction of market orders and average participation rate of the detected patches. The detected asymmetry is clearly dependent on the local market trend. We also compare the hidden Markov model patches with those obtained with the segmentation method used in Vaglica et al (2008 Phys. Rev. E 77 036110), and we conclude that the former ones can be interpreted as a partition of the latter ones.
Hidden Markov Models for Zero-Inflated Poisson Counts with an Application to Substance Use
DeSantis, Stacia M.; Bandyopadhyay, Dipankar
2011-01-01
Paradigms for substance abuse cue-reactivity research involve short term pharmacological or stressful stimulation designed to elicit stress and craving responses in cocaine-dependent subjects. It is unclear as to whether stress induced from participation in such studies increases drug-seeking behavior. We propose a 2-state Hidden Markov model to model the number of cocaine abuses per week before and after participation in a stress- and cue-reactivity study. The hypothesized latent state corresponds to ‘high’ or ‘low’ use. To account for a preponderance of zeros, we assume a zero-inflated Poisson model for the count data. Transition probabilities depend on the prior week’s state, fixed demographic variables, and time-varying covariates. We adopt a Bayesian approach to model fitting, and use the conditional predictive ordinate statistic to demonstrate that the zero-inflated Poisson hidden Markov model outperforms other models for longitudinal count data. PMID:21538455
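The zero-inflated Poisson emission assumed in the abstract above can be written out as a short sketch. The parameterization (a structural-zero probability `pi` mixed with a Poisson of mean `lam`) is the standard textbook form and is an assumption here, as are the function and parameter names; the paper's covariate-dependent transition probabilities and Bayesian fitting are not shown.

```python
import math

def zip_pmf(k, lam, pi):
    """P(Y = k) under a zero-inflated Poisson distribution.

    With probability `pi` the count is a structural zero; otherwise it is
    drawn from Poisson(lam). In a 2-state HMM, each latent state (for
    example 'high' or 'low' use) would carry its own (lam, pi) pair.
    (Names are hypothetical, chosen for this sketch.)
    """
    poisson = math.exp(-lam) * lam ** k / math.factorial(k)
    if k == 0:
        # Zeros come from either the inflation component or the Poisson.
        return pi + (1.0 - pi) * poisson
    return (1.0 - pi) * poisson
```

Setting `pi = 0` recovers the ordinary Poisson distribution, which makes the extra mass at zero easy to see by comparison.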
The Application of Wavelet-Domain Hidden Markov Tree Model in Diabetic Retinal Image Denoising
Cui, Dong; Liu, Minmin; Hu, Lei; Liu, Keju; Guo, Yongxin; Jiao, Qing
2015-01-01
The wavelet-domain Hidden Markov Tree Model can properly describe the dependence and correlation of fundus angiographic images’ wavelet coefficients among scales. Based on the construction of the fundus angiographic images Hidden Markov Tree Models and Gaussian Mixture Models, this paper applied the expectation-maximization algorithm to estimate the wavelet coefficients of original fundus angiographic images and Bayesian estimation to denoise the fundus angiographic images. As shown in the experimental results, compared with other algorithms such as the mean filter and median filter, this method effectively improved the peak signal-to-noise ratio of fundus angiographic images after denoising and preserved the details of vascular edges in fundus angiographic images. PMID:26628926
A hidden Markov model for space-time precipitation
Zucchini, W.; Guttorp, P.
1991-08-01
Stochastic models for precipitation events in space and time over mesoscale spatial areas have important applications in hydrology, both as input to runoff models and as parts of general circulation models (GCMs) of global climate. A family of multivariate models for the occurrence/nonoccurrence of precipitation at N sites is constructed by assuming a different probability of events at the sites for each of a number of unobservable climate states. The climate process is assumed to follow a Markov chain. Simple formulae for first- and second-order parameter functions are derived, and used to find starting values for a numerical maximization of the likelihood. The method is illustrated by applying it to data for one site in Washington and to data for a network in the Great Plains.
Markov Chain Monte Carlo Sampling Methods for 1D Seismic and EM Data Inversion
Energy Science and Technology Software Center (ESTSC)
2008-09-22
This software provides several Markov chain Monte Carlo sampling methods for the Bayesian model developed for inverting 1D marine seismic and controlled source electromagnetic (CSEM) data. The current software can be used for individual inversion of seismic AVO and CSEM data and for joint inversion of both seismic and EM data sets. The structure of the software is very general and flexible, and it allows users to incorporate their own forward simulation codes and rock physics model codes easily into this software. Although the software was developed using the C and C++ computer languages, the user-supplied codes can be written in C, C++, or various versions of Fortran. The software provides clear interfaces for users to plug in their own codes. The output of this software is in a format that the free R software CODA can directly read to build MCMC objects.
A method of hidden Markov model optimization for use with geophysical data sets
NASA Technical Reports Server (NTRS)
Granat, R. A.
2003-01-01
Geophysics research has been faced with a growing need for automated techniques with which to process large quantities of data. A successful tool must meet a number of requirements: it should be consistent, require minimal parameter tuning, and produce scientifically meaningful results in reasonable time. We introduce a hidden Markov model (HMM)-based method for analysis of geophysical data sets that attempts to address these issues.
NASA Astrophysics Data System (ADS)
Choi, Yeontaek; Sim, Seungwoo; Lee, Sang-Hee
2014-06-01
The locomotion behavior of Caenorhabditis elegans has been extensively studied to understand the relationship between the changes in the organism's neural activity and the biomechanics. However, this relationship is not yet understood, because the worm responds in a complicated way to environmental factors, especially chemical stress. Constructing a mathematical model is helpful for understanding the locomotion behavior in various surrounding conditions. In the present study, we built three hidden Markov models for the crawling behavior of C. elegans in a controlled environment with no chemical treatment and in an environment polluted by formaldehyde, toluene, and benzene (0.1 ppm and 0.5 ppm for each case). The organism's crawling activity was recorded using a digital camcorder for 20 min at a rate of 24 frames per second. All shape patterns were quantified by branch length similarity entropy and classified into five groups by using the self-organizing map. To evaluate and establish the hidden Markov models, we compared correlation coefficients between the simulated behavior (i.e. temporal pattern sequence) generated by the models and the actual crawling behavior. The comparison showed that the hidden Markov models successfully characterize the crawling behavior. In addition, we briefly discussed the possibility of using the models together with the entropy to develop bio-monitoring systems for determining water quality.
Hidden Markov models and other machine learning approaches in computational molecular biology
Baldi, P.
1995-12-31
This tutorial was one of eight tutorials selected to be presented at the Third International Conference on Intelligent Systems for Molecular Biology which was held in the United Kingdom from July 16 to 19, 1995. Computational tools are increasingly needed to process the massive amounts of data, to organize and classify sequences, to detect weak similarities, to separate coding from non-coding regions, and to reconstruct the underlying evolutionary history. The fundamental problem in machine learning is the same as in scientific reasoning in general, as well as statistical modeling: to come up with a good model for the data. In this tutorial four classes of models are reviewed. They are: Hidden Markov models; artificial Neural Networks; Belief Networks; and Stochastic Grammars. When dealing with DNA and protein primary sequences, Hidden Markov models are one of the most flexible and powerful tools for alignments and data base searches. In this tutorial, attention is focused on the theory of Hidden Markov Models, and how to apply them to problems in molecular biology.
Detecting critical state before phase transition of complex systems by hidden Markov model
NASA Astrophysics Data System (ADS)
Liu, Rui; Chen, Pei; Li, Yongjun; Chen, Luonan
Identifying the critical state or pre-transition state just before the occurrence of a phase transition is a challenging task, because the state of the system may show little apparent change before this critical transition during the gradual parameter variations. Such dynamics of phase transition is generally composed of three stages, i.e., before-transition state, pre-transition state, and after-transition state, which can be considered as three different Markov processes. Thus, based on this dynamical feature, we present a novel computational method, i.e., hidden Markov model (HMM), to detect the switching point of the two Markov processes from the before-transition state (a stationary Markov process) to the pre-transition state (a time-varying Markov process), thereby identifying the pre-transition state or early-warning signals of the phase transition. To validate the effectiveness, we apply this method to detect the signals of the imminent phase transitions of complex systems based on the simulated datasets, and further identify the pre-transition states as well as their critical modules for three real datasets, i.e., the acute lung injury triggered by phosgene inhalation, MCF-7 human breast cancer caused by heregulin, and HCV-induced dysplasia and hepatocellular carcinoma.
Noé, Frank; Wu, Hao; Prinz, Jan-Hendrik; Plattner, Nuria
2013-11-14
Markov state models (MSMs) have been successful in computing metastable states, slow relaxation timescales and associated structural changes, and stationary or kinetic experimental observables of complex molecules from large amounts of molecular dynamics simulation data. However, MSMs approximate the true dynamics by assuming a Markov chain on a cluster discretization of the state space. This approximation is difficult to make for high-dimensional biomolecular systems, and the quality and reproducibility of MSMs have, therefore, been limited. Here, we discard the assumption that dynamics are Markovian on the discrete clusters. Instead, we only assume that the full phase-space molecular dynamics is Markovian, and a projection of this full dynamics is observed on the discrete states, leading to the concept of Projected Markov Models (PMMs). Robust estimation methods for PMMs are not yet available, but we derive a practically feasible approximation via Hidden Markov Models (HMMs). It is shown how various molecular observables of interest that are often computed from MSMs can be computed from HMMs/PMMs. The new framework is applicable to both simulation and single-molecule experimental data. We demonstrate its versatility by applications to educative model systems, a 1 ms Anton MD simulation of the bovine pancreatic trypsin inhibitor protein, and an optical tweezer force probe trajectory of an RNA hairpin. PMID:24320261
Bayesian Clustering Using Hidden Markov Random Fields in Spatial Population Genetics
François, Olivier; Ancelet, Sophie; Guillot, Gilles
2006-01-01
We introduce a new Bayesian clustering algorithm for studying population structure using individually geo-referenced multilocus data sets. The algorithm is based on the concept of hidden Markov random field, which models the spatial dependencies at the cluster membership level. We argue that (i) a Markov chain Monte Carlo procedure can implement the algorithm efficiently, (ii) it can detect significant geographical discontinuities in allele frequencies and regulate the number of clusters, (iii) it can check whether the clusters obtained without the use of spatial priors are robust to the hypothesis of discontinuous geographical variation in allele frequencies, and (iv) it can reduce the number of loci required to obtain accurate assignments. We illustrate and discuss the implementation issues with the Scandinavian brown bear and the human CEPH diversity panel data set. PMID:16888334
A path-independent method for barrier option pricing in hidden Markov models
NASA Astrophysics Data System (ADS)
Rashidi Ranjbar, Hedieh; Seifi, Abbas
2015-12-01
This paper presents a method for barrier option pricing under a Black-Scholes model with Markov switching. We extend the option pricing method of Buffington and Elliott to price continuously monitored barrier options under a Black-Scholes model with regime switching. We use a regime switching random Esscher transform in order to determine an equivalent martingale pricing measure, and then solve the resulting multidimensional integral for pricing barrier options. We have calculated prices for down-and-out call options under a two-state hidden Markov model using two different Monte-Carlo simulation approaches and the proposed method. A comparison of the results shows that our method is faster than Monte-Carlo simulation methods.
NASA Astrophysics Data System (ADS)
Dong, Ming; He, David
2007-07-01
Diagnostics and prognostics are two important aspects in a condition-based maintenance (CBM) program. However, these two tasks are often separately performed. For example, data might be collected and analysed separately for diagnosis and prognosis. This practice increases the cost and reduces the efficiency of CBM and may affect the accuracy of the diagnostic and prognostic results. In this paper, a statistical modelling methodology for performing both diagnosis and prognosis in a unified framework is presented. The methodology is developed based on segmental hidden semi-Markov models (HSMMs). An HSMM is a hidden Markov model (HMM) with temporal structures. Unlike HMM, an HSMM does not follow the unrealistic Markov chain assumption and therefore provides more powerful modelling and analysis capability for real problems. In addition, an HSMM allows modelling the time duration of the hidden states and therefore is capable of prognosis. To facilitate the computation in the proposed HSMM-based diagnostics and prognostics, new forward-backward variables are defined and a modified forward-backward algorithm is developed. The existing state duration estimation methods are inefficient because they require a huge storage and computational load. Therefore, a new approach is proposed for training HSMMs in which state duration probabilities are estimated on the lattice (or trellis) of observations and states. The model parameters are estimated through the modified forward-backward training algorithm. The estimated state duration probability distributions combined with state-changing point detection can be used to predict the useful remaining life of a system. The evaluation of the proposed methodology was carried out through a real world application: health monitoring of hydraulic pumps. In the tests, the recognition rates for all states are greater than 96%. For each individual pump, the recognition rate is increased by 29.3% in comparison with HMMs. Because of the temporal
(abstract) Modeling Protein Families and Human Genes: Hidden Markov Models and a Little Beyond
NASA Technical Reports Server (NTRS)
Baldi, Pierre
1994-01-01
We will first give a brief overview of Hidden Markov Models (HMMs) and their use in Computational Molecular Biology. In particular, we will describe a detailed application of HMMs to the G-Protein-Coupled-Receptor Superfamily. We will also describe a number of analytical results on HMMs that can be used in discrimination tests and database mining. We will then discuss the limitations of HMMs and some new directions of research. We will conclude with some recent results on the application of HMMs to human gene modeling and parsing.
Alignment of multiple proteins with an ensemble of Hidden Markov Models
Song, Yinglei; Qu, Junfeng; Hura, Gurdeep S.
2011-01-01
In this paper, we developed a new method that progressively constructs and updates a set of alignments by adding sequences in a certain order to each of the existing alignments. Each of the existing alignments is modelled with a profile Hidden Markov Model (HMM) and an added sequence is aligned to each of these profile HMMs. We introduced an integer parameter for the number of profile HMMs. The profile HMMs are then updated based on the alignments with leading scores. Our experiments on BaliBASE showed that our approach could efficiently explore the alignment space and significantly improve the alignment accuracy. PMID:20376922
Memetic Approaches for Optimizing Hidden Markov Models: A Case Study in Time Series Prediction
NASA Astrophysics Data System (ADS)
Bui, Lam Thu; Barlow, Michael
We propose a methodology for employing memetics (local search) within the framework of evolutionary algorithms to optimize parameters of hidden Markov models. With this proposal, the rate and frequency of using local search are automatically changed over time either at a population or individual level. At the population level, we allow the rate of using local search to decay over time to zero (at the final generation). At the individual level, each individual is equipped with information of when it will do local search and for how long. This information evolves over time alongside the main elements of the chromosome representing the individual.
Hidden Markov Model-based Pedestrian Navigation System using MEMS Inertial Sensors
NASA Astrophysics Data System (ADS)
Zhang, Yingjun; Liu, Wen; Yang, Xuefeng; Xing, Shengwei
2015-02-01
In this paper, a foot-mounted pedestrian navigation system using MEMS inertial sensors is implemented, where the zero-velocity detection is abstracted into a hidden Markov model with 4 states and 15 observations. Moreover, an observations extraction algorithm has been developed to extract observations from sensor outputs; sample sets are used to train and optimize the model parameters by the Baum-Welch algorithm. Finally, a navigation system is developed, and the performance of the pedestrian navigation system is evaluated using indoor and outdoor field tests, and the results show that position error is less than 3% of total distance travelled.
Estimating the pen trajectories of static signatures using hidden Markov models.
Nel, Emli-Mari; du Preez, Johan A; Herbst, B M
2005-11-01
Static signatures originate as handwritten images on documents and by definition do not contain any dynamic information. This lack of information makes static signature verification systems significantly less reliable than their dynamic counterparts. This study involves extracting dynamic information from static images, specifically the pen trajectory while the signature was created. We assume that a dynamic version of the static image is available (typically obtained during an earlier registration process). We then derive a hidden Markov model from the static image and match it to the dynamic version of the image. This match results in the estimated pen trajectory of the static image. PMID:16285373
NASA Astrophysics Data System (ADS)
Granat, R. A.; Clayton, R.; Kedar, S.; Kaneko, Y.
2003-12-01
We employ a robust hidden Markov model (HMM) based technique to perform statistical pattern analysis of suspected seismic and aseismic events in the poorly explored period band of minutes to hours. The technique allows us to classify known events and provides a statistical basis for finding and cataloging similar events represented elsewhere in the observations. In this work, we focus on data collected by the Southern California TriNet system. The hidden Markov model (HMM) approach assumes that the observed data has been generated by an unobservable dynamical statistical process. The process is of a particular form such that each observation is coincident with the system being in a particular discrete state. The dynamics of the model are constructed so that the next state is directly dependent only on the current state -- it is a first order Markov process. The model is completely described by a set of parameters: the initial state probabilities, the first order Markov chain state-to-state transition probabilities, and the probability distribution of observable outputs associated with each state. Application of the model to data involves optimizing these model parameters with respect to some function of the observations, typically the likelihood of the observations given the model. Our work focused on the fact that this objective function has a number of local maxima that is exponential in the model size (the number of states). This means that not only is it very difficult to discover the global maximum, but also that results can vary widely between applications of the model. For some domains which employ HMMs for such purposes, such as speech processing, sufficient a priori information about the system is available to avoid this problem. However, for seismic data in general such a priori information is not available. Our approach involves analytical location of sub-optimal local maxima; once the locations of these maxima have been found, then we can employ a
Statistical Inference in Hidden Markov Models Using k-Segment Constraints
Titsias, Michalis K.; Holmes, Christopher C.; Yau, Christopher
2016-01-01
Hidden Markov models (HMMs) are one of the most widely used statistical methods for analyzing sequence data. However, the reporting of output from HMMs has largely been restricted to the presentation of the most-probable (MAP) hidden state sequence, found via the Viterbi algorithm, or the sequence of most probable marginals using the forward–backward algorithm. In this article, we expand the amount of information we could obtain from the posterior distribution of an HMM by introducing linear-time dynamic programming recursions that, conditional on a user-specified constraint in the number of segments, allow us to (i) find MAP sequences, (ii) compute posterior probabilities, and (iii) simulate sample paths. We collectively call these recursions k-segment algorithms and illustrate their utility using simulated and real examples. We also highlight the prospective and retrospective use of k-segment constraints for fitting HMMs or exploring existing model fits. Supplementary materials for this article are available online. PMID:27226674
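For reference, the unconstrained MAP decoding that this abstract takes as its starting point can be sketched as a standard log-space Viterbi recursion. This is a minimal illustration for a discrete-emission HMM, not the k-segment algorithm of the paper; the function name, argument layout and the toy parameters are assumptions.

```python
import math


def viterbi(obs, states, log_init, log_trans, log_emit):
    """Return (most probable state path, its joint log-probability)."""
    # best[t][s]: log-probability of the best path ending in state s at time t
    best = [{s: log_init[s] + log_emit[s][obs[0]] for s in states}]
    back = []  # back[t-1][s]: best predecessor of state s at time t
    for o in obs[1:]:
        prev = best[-1]
        scores, pointers = {}, {}
        for s in states:
            p = max(states, key=lambda r: prev[r] + log_trans[r][s])
            pointers[s] = p
            scores[s] = prev[p] + log_trans[p][s] + log_emit[s][o]
        best.append(scores)
        back.append(pointers)
    # Trace back from the best final state.
    last = max(states, key=lambda s: best[-1][s])
    path = [last]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    path.reverse()
    return path, best[-1][last]


# Toy two-state model with hypothetical parameters.
STATES = ["A", "B"]
LOG_INIT = {s: math.log(0.5) for s in STATES}
LOG_TRANS = {r: {s: math.log(0.9 if r == s else 0.1) for s in STATES} for r in STATES}
LOG_EMIT = {
    "A": {0: math.log(0.9), 1: math.log(0.1)},
    "B": {0: math.log(0.1), 1: math.log(0.9)},
}

path, logp = viterbi([0, 0, 1, 1, 1], STATES, LOG_INIT, LOG_TRANS, LOG_EMIT)
```

Working in log space avoids numerical underflow on long sequences. The decoded `path` here contains a single change of state, which is the kind of segment count that a k-segment constraint would fix in advance.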
Strong and Weak 2D Topological Superconductivity in Hidden Quasi-1D Systems
NASA Astrophysics Data System (ADS)
Yang, Fan; Yao, Hong
2014-03-01
Partly motivated by the newly discovered family of bismuth-based superconductors including LaO1-xFxBiS2, we study possible 2D topological superconductivities (TSC) in hidden quasi-1D systems with spin-orbit couplings. By doing RPA calculations and renormalization group (RG) treatment, we theoretically find that in a large portion of the phase diagram with varying interaction strengths and spin-orbit coupling the ground state favors superconductivity with odd-parity pairing, which results in either chiral TSC or time reversal invariant weak-Z2 TSC. We shall discuss several ways to experimentally identify these strong and weak 2D topological superconductivities. Possible applications to the bismuth-based superconductors LaO1-xFxBiS2 will also be remarked.
Zhang, Yu-Chen; Zhang, Shao-Wu; Liu, Lian; Liu, Hui; Zhang, Lin; Cui, Xiaodong; Huang, Yufei; Meng, Jia
2015-01-01
With the development of new sequencing technology, the entire N6-methyl-adenosine (m6A) RNA methylome can now be profiled in an unbiased manner with the methylated RNA immune-precipitation sequencing technique (MeRIP-Seq), making it possible to detect differential methylation states of RNA between two conditions, for example, between normal and cancerous tissue. However, as an affinity-based method, MeRIP-Seq has yet to provide base-pair resolution; that is, a single methylation site determined from MeRIP-Seq data can in practice contain multiple RNA methylation residuals, some of which can be regulated by different enzymes and thus differentially methylated between two conditions. Since existing peak-based methods could not effectively differentiate multiple methylation residuals located within a single methylation site, we propose a hidden Markov model (HMM) based approach to address this issue. Specifically, the detected RNA methylation site is further divided into multiple adjacent small bins and then scanned with higher resolution using a hidden Markov model to model the dependency between spatially adjacent bins for improved accuracy. We tested the proposed algorithm on both simulated data and real data. The results suggest that the proposed algorithm clearly outperforms the existing peak-based approach on simulated systems and detects differential methylation regions with higher statistical significance on a real dataset. PMID:26301253
NASA Astrophysics Data System (ADS)
Cassisi, Carmelo; Prestifilippo, Michele; Cannata, Andrea; Montalto, Placido; Patanè, Domenico; Privitera, Eugenio
2016-07-01
From January 2011 to December 2015, Mt. Etna was mainly characterized by a cyclic eruptive behavior with more than 40 lava fountains from New South-East Crater. Using the RMS (Root Mean Square) of the seismic signal recorded by stations close to the summit area, an automatic recognition of the different states of volcanic activity (QUIET, PRE-FOUNTAIN, FOUNTAIN, POST-FOUNTAIN) has been applied for monitoring purposes. Since values of the RMS time series calculated on the seismic signal are generated from a stochastic process, we can try to model the system generating its sampled values, assumed to be a Markov process, using Hidden Markov Models (HMMs). HMMs analysis seeks to recover the sequence of hidden states from the observations. In our framework, observations are characters generated by the Symbolic Aggregate approXimation (SAX) technique, which maps RMS time series values with symbols of a pre-defined alphabet. The main advantages of the proposed framework, based on HMMs and SAX, with respect to other automatic systems applied on seismic signals at Mt. Etna, are the use of multiple stations and static thresholds to well characterize the volcano states. Its application on a wide seismic dataset of Etna volcano shows the possibility to guess the volcano states. The experimental results show that, in most of the cases, we detected lava fountains in advance.
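The symbolization step mentioned in this abstract can be illustrated with a minimal sketch of the textbook SAX procedure: z-normalize the series, average it over equal-width segments (piecewise aggregate approximation), and map each segment mean to a letter using breakpoints that split a standard normal distribution into equiprobable regions. The abstract refers to static thresholds, so the authors' configuration may differ; the function name, alphabet and sample values below are assumptions.

```python
from statistics import NormalDist, mean, pstdev


def sax(series, n_segments, alphabet="abcd"):
    """Map a numeric series to a SAX word of length n_segments."""
    if not 0 < n_segments <= len(series):
        raise ValueError("n_segments must be between 1 and len(series)")
    # 1. z-normalize (a constant series maps to all zeros).
    mu, sigma = mean(series), pstdev(series)
    z = [(x - mu) / sigma if sigma > 0 else 0.0 for x in series]
    # 2. Piecewise aggregate approximation: mean of each segment.
    n = len(z)
    paa = [
        mean(z[i * n // n_segments:(i + 1) * n // n_segments])
        for i in range(n_segments)
    ]
    # 3. Breakpoints cutting N(0, 1) into len(alphabet) equiprobable bins.
    k = len(alphabet)
    breakpoints = [NormalDist().inv_cdf(j / k) for j in range(1, k)]
    # 4. Each segment mean becomes the letter of the bin it falls in.
    return "".join(alphabet[sum(v > b for b in breakpoints)] for v in paa)


# Hypothetical RMS amplitudes: a quiet stretch followed by a burst.
word = sax([1, 1, 1, 1, 10, 10, 10, 10], n_segments=2)
```

The resulting string of symbols is what a discrete-emission HMM would then consume as its observation sequence.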
biomvRhsmm: Genomic Segmentation with Hidden Semi-Markov Model
Murani, Eduard; Ponsuksili, Siriluck
2014-01-01
High-throughput technologies like tiling array and next-generation sequencing (NGS) generate continuous homogeneous segments or signal peaks in the genome that represent transcripts and transcript variants (transcript mapping and quantification), regions of deletion and amplification (copy number variation), or regions characterized by particular common features like chromatin state or DNA methylation ratio (epigenetic modifications). However, the volume and output of data produced by these technologies present challenges in analysis. Here, a hidden semi-Markov model (HSMM) is implemented and tailored to handle multiple genomic profiles, to better facilitate genome annotation by assisting in the detection of transcripts, regulatory regions, and copy number variation by holistic microarray or NGS. With support for various data distributions, instead of limiting itself to one specific application, the proposed hidden semi-Markov model is designed to allow modeling options to accommodate different types of genomic data and to serve as a general segmentation engine. By incorporating genomic positions into the sojourn distribution of HSMM, with optional prior learning using annotation or previous studies, the modeling output is more biologically sensible. The proposed model has been compared with several other state-of-the-art segmentation models through simulation benchmarking, which shows that our efficient implementation achieves comparable or better sensitivity and specificity in genomic segmentation. PMID:24995333
Sun, Shuying; Yu, Xiaoqing
2016-03-01
DNA methylation is an epigenetic event that plays an important role in regulating gene expression. It is important to study DNA methylation, especially differential methylation patterns between two groups of samples (e.g. patients vs. normal individuals). With next generation sequencing technologies, it is now possible to identify differential methylation patterns by considering methylation at the single CG site level in an entire genome. However, it is challenging to analyze large and complex NGS data. In order to address this difficult question, we have developed a new statistical method using a hidden Markov model and Fisher's exact test (HMM-Fisher) to identify differentially methylated cytosines and regions. We first use a hidden Markov chain to model the methylation signals to infer the methylation state as Not methylated (N), Partly methylated (P), and Fully methylated (F) for each individual sample. We then use Fisher's exact test to identify differentially methylated CG sites. We present the HMM-Fisher method and compare it with commonly cited methods using both simulated data and real sequencing data. The results show that HMM-Fisher outperforms the currently available methods with which we compared it. HMM-Fisher is efficient and robust in identifying heterogeneous DM regions. PMID:26854292
Robertson, Colin; Sawford, Kate; Gunawardana, Walimunige S. N.; Nelson, Trisalyn A.; Nathoo, Farouk; Stephen, Craig
2011-01-01
Surveillance systems tracking health patterns in animals have potential for early warning of infectious disease in humans, yet there are many challenges that remain before this can be realized. Specifically, there remains the challenge of detecting early warning signals for diseases that are not known or are not part of routine surveillance for named diseases. This paper reports on the development of a hidden Markov model for analysis of frontline veterinary sentinel surveillance data from Sri Lanka. Field veterinarians collected data on syndromes and diagnoses using mobile phones. A model for submission patterns accounts for both sentinel-related and disease-related variability. Models for commonly reported cattle diagnoses were estimated separately. Region-specific weekly average prevalence was estimated for each diagnosis and partitioned into normal and abnormal periods. Visualization of state probabilities was used to indicate areas and times of unusual disease prevalence. The analysis suggests that hidden Markov modelling is a useful approach for surveillance datasets from novel populations and/or having little historical baselines. PMID:21949763
Prediction of earthquake hazard by hidden Markov model (around Bilecik, NW Turkey)
NASA Astrophysics Data System (ADS)
Can, Ceren; Ergun, Gul; Gokceoglu, Candan
2014-09-01
Earthquakes are one of the most important natural hazards to be evaluated carefully in engineering projects, due to their severely damaging effects on human life and human-made structures. The hazard of an earthquake is defined by several approaches, and consequently earthquake parameters such as the peak ground acceleration occurring in the area of interest can be determined. In an earthquake-prone area, the identification of seismicity patterns is an important task for assessing seismic activity and evaluating the risk of damage and loss associated with an earthquake occurrence. As a powerful and flexible framework to characterize temporal seismicity changes and reveal unexpected patterns, the Poisson hidden Markov model provides a better understanding of the nature of earthquakes. In this paper, the Poisson hidden Markov model is used to predict the earthquake hazard in Bilecik (NW Turkey), chosen for its important geographic location. Bilecik is in close proximity to the North Anatolian Fault Zone and situated between Ankara and Istanbul, the two biggest cities of Turkey. Consequently, there are major highways and railroads, and many engineering structures are being constructed in this area. The annual frequencies of earthquakes that occurred within a 100 km radius of Bilecik, from January 1900 to December 2012, with magnitudes (M) of at least 4.0 are modeled using a Poisson-HMM. The hazards for the next 35 years, from 2013 to 2047, around the area are obtained from the model by forecasting the annual frequencies of M ≥ 4 earthquakes.
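Illustrative note: a Poisson-HMM treats each annual earthquake count as a Poisson draw whose rate depends on a hidden state. The sketch below evaluates the log-likelihood of a count series with the scaled forward algorithm. It is a generic illustration, not the authors' implementation; the function names, the two-state structure and all parameter values are assumptions and are not taken from the paper.

```python
import math


def poisson_logpmf(k, lam):
    """Log of the Poisson probability mass function."""
    return k * math.log(lam) - lam - math.lgamma(k + 1)


def poisson_hmm_loglik(counts, init, trans, rates):
    """Log P(counts | model) via the scaled forward algorithm.

    init[j]     : probability of starting in state j
    trans[i][j] : probability of moving from state i to state j
    rates[j]    : Poisson rate of the counts emitted in state j
    """
    m = len(rates)
    loglik = 0.0
    alpha = None
    for t, k in enumerate(counts):
        emit = [math.exp(poisson_logpmf(k, lam)) for lam in rates]
        if t == 0:
            alpha = [init[j] * emit[j] for j in range(m)]
        else:
            alpha = [
                sum(alpha[i] * trans[i][j] for i in range(m)) * emit[j]
                for j in range(m)
            ]
        # Rescale at every step to avoid underflow on long series
        scale = sum(alpha)
        loglik += math.log(scale)
        alpha = [a / scale for a in alpha]
    return loglik


# Hypothetical annual counts and parameters (not from the paper)
counts = [1, 0, 2, 7, 9, 6, 1, 0]
ll = poisson_hmm_loglik(
    counts,
    init=[0.5, 0.5],
    trans=[[0.9, 0.1], [0.2, 0.8]],
    rates=[1.0, 7.0],
)
```

In practice the parameters would be estimated, for example by maximizing this log-likelihood, before any forecast is made.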
Quasi-hidden Markov model and its applications in cluster analysis of earthquake catalogs
NASA Astrophysics Data System (ADS)
Wu, Zhengxiao
2011-12-01
We identify a broad class of models, quasi-hidden Markov models (QHMMs), which include hidden Markov models (HMMs) as special cases. Applying the QHMM framework, this paper studies how an earthquake cluster propagates statistically. Two QHMMs are used to describe two different propagating patterns. The "mother-and-kids" model regards the first shock in an earthquake cluster as "mother" and the aftershocks as "kids," which occur in a neighborhood centered by the mother. In the "domino" model, however, the next aftershock strikes in a neighborhood centered by the most recent previous earthquake in the cluster, and therefore aftershocks act like dominoes. As the likelihood of QHMMs can be efficiently computed via the forward algorithm, likelihood-based model selection criteria can be calculated to compare these two models. We demonstrate this procedure using data from the central New Zealand region. For this data set, the mother-and-kids model yields a higher likelihood as well as smaller AIC and BIC. In other words, in the aforementioned area the next aftershock is more likely to occur near the first shock than near the latest aftershock in the cluster. This provides an answer, though not entirely satisfactorily, to the question "where will the next aftershock be?". The asymptotic consistency of the model selection procedure in the paper is duly established, namely that, when the number of the observations goes to infinity, with probability one the procedure picks out the model with the smaller deviation from the true model (in terms of relative entropy rate).
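Illustrative note: the likelihood-based criteria named above have the standard forms below, where the symbols are notation introduced here rather than taken from the paper: \hat{L} is the maximized likelihood (computable with the forward algorithm), k the number of free parameters, and n the number of observations. Lower values are preferred, so a model with a higher likelihood and no more parameters also has smaller AIC and BIC.

```latex
\mathrm{AIC} = 2k - 2\ln\hat{L},
\qquad
\mathrm{BIC} = k\ln n - 2\ln\hat{L}
```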
Hidden Markov Model analysis of force/torque information in telemanipulation
Hannaford, B.; Lee, P.
1991-10-01
A new model is developed for prediction and analysis of sensor information recorded during robotic performance of tasks by telemanipulation. The model uses the Hidden Markov Model (stochastic functions of Markov nets; HMM) to describe the task structure, the operator or intelligent controller's goal structure, and the sensor signals such as forces and torques arising from interaction with the environment. The Markov process portion encodes the task sequence/subgoal structure, and the observation densities associated with each subgoal state encode the expected sensor signals associated with carrying out that subgoal. Methodology is described for construction of the model parameters based on engineering knowledge of the task. The Viterbi algorithm is used for model based analysis of force signals measured during experimental teleoperation and achieves excellent segmentation of the data into subgoal phases. The Baum-Welch algorithm is used to identify the most likely HMM from a given experiment. The HMM achieves a structured, knowledge-base model with explicit uncertainties and mature, optimal identification algorithms.
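Illustrative note: segmenting a signal into subgoal phases with the Viterbi algorithm can be sketched as below. This is a generic log-domain Viterbi decoder for a discrete-observation HMM, not the authors' model, which uses observation densities over force/torque signals. The two-state structure, the quantized observations and all probabilities are assumptions chosen for illustration.

```python
import math


def viterbi(obs, init, trans, emit):
    """Most likely hidden-state path for a discrete-observation HMM.

    obs        : list of observation symbol indices
    init[j]    : probability of starting in state j
    trans[i][j]: probability of moving from state i to state j
    emit[j][o] : probability that state j emits symbol o
    """
    def log(p):
        return math.log(p) if p > 0 else float("-inf")

    m = len(init)
    score = [log(init[j]) + log(emit[j][obs[0]]) for j in range(m)]
    back = []  # back[t][j] = best predecessor of state j at step t + 1
    for o in obs[1:]:
        prev = score
        ptr, score = [], []
        for j in range(m):
            best = max(range(m), key=lambda i: prev[i] + log(trans[i][j]))
            ptr.append(best)
            score.append(prev[best] + log(trans[best][j]) + log(emit[j][o]))
        back.append(ptr)

    # Trace back from the best final state
    state = max(range(m), key=lambda j: score[j])
    path = [state]
    for ptr in reversed(back):
        state = ptr[state]
        path.append(state)
    return path[::-1]


# Hypothetical two-phase task with "sticky" states and noisy symbols
init = [0.5, 0.5]
trans = [[0.9, 0.1], [0.1, 0.9]]
emit = [[0.9, 0.1], [0.1, 0.9]]
phases = viterbi([0, 0, 0, 1, 1, 1], init, trans, emit)  # [0, 0, 0, 1, 1, 1]
```

Because the states are sticky, an isolated noisy symbol (for example `[0, 0, 1, 0, 0]`) is decoded as a single phase rather than as two phase changes, which is the behavior that makes the decoder useful for segmentation.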
Unsupervised SAR images change detection with hidden Markov chains on a sliding window
NASA Astrophysics Data System (ADS)
Bouyahia, Zied; Benyoussef, Lamia; Derrode, Stéphane
2007-10-01
This work deals with unsupervised change detection in bi-date Synthetic Aperture Radar (SAR) images. Whatever the indicator of change used, e.g. log-ratio or Kullback-Leibler divergence, we have observed poor quality change maps for some events when using the Hidden Markov Chain (HMC) model we focus on in this work. The main reason comes from the stationarity assumption involved in this model - and in most Markovian models such as Hidden Markov Random Fields - which cannot be justified in most observed scenes: changed areas are not necessarily stationary in the image. Besides the few non-stationary Markov models proposed in the literature, the aim of this paper is to describe a pragmatic solution to tackle non-stationarity by using a sliding window strategy. In this algorithm, the criterion image is scanned pixel by pixel, and a classical HMC model is applied only on neighboring pixels. By moving the window through the image, the process is able to produce a change map which can better exhibit non-stationary changes than the classical HMC applied directly on the whole criterion image. Special care is devoted to the estimation of the number of classes in each window, which can vary from one (no change) to three (positive change, negative change and no change), by using the corrected Akaike Information Criterion (AICc) suited to small samples. The quality assessment of the proposed approach is achieved with speckle-simulated images in which simulated changes are introduced. The windowed strategy is also evaluated with a pair of RADARSAT images bracketing the Nyiragongo volcano eruption event in January 2002. The available ground truth confirms the effectiveness of the proposed approach compared to a classical HMC-based strategy.
Maruotti, Antonello; Rocci, Roberto
2012-04-30
Hidden Markov models (HMMs) are frequently used to analyse longitudinal data, where the same set of subjects is repeatedly observed over time. In this context, several sources of heterogeneity may arise at individual and/or time level, which affect the hidden process, that is, the transition probabilities between the hidden states. In this paper, we propose the use of a finite mixture of non-homogeneous HMMs (NH-HMMs) to face the heterogeneity problem. The non-homogeneity of the model allows us to take into account observed sources of heterogeneity by means of a proper set of covariates, time and/or individual dependent, explaining the variations in the transition probabilities. Moreover, we handle the unobserved sources of heterogeneity at the individual level, due to, for example, omitted covariates, by introducing a random term with a discrete distribution. The resulting model is a finite mixture of NH-HMM that can be used to classify individuals according to their dynamic behaviour or to estimate a mixed NH-HMM without any assumption regarding the distribution of the random term following the non-parametric maximum likelihood approach. We test the effectiveness of the proposal through a simulation study and an application to real data on alcohol abuse. PMID:22302505
NASA Astrophysics Data System (ADS)
Turner, Sean; Galelli, Stefano; Wilcox, Karen
2015-04-01
Water reservoir systems are often affected by recurring large-scale ocean-atmospheric anomalies, known as teleconnections, that cause prolonged periods of climatological drought. Accurate forecasts of these events -- at lead times in the order of weeks and months -- may enable reservoir operators to take more effective release decisions to improve the performance of their systems. In practice this might mean a more reliable water supply system, a more profitable hydropower plant or a more sustainable environmental release policy. To this end, climate indices, which represent the oscillation of the ocean-atmospheric system, might be gainfully employed within reservoir operating models that adapt the reservoir operation as a function of the climate condition. This study develops a Stochastic Dynamic Programming (SDP) approach that can incorporate climate indices using a Hidden Markov Model. The model simulates the climatic regime as a hidden state following a Markov chain, with the state transitions driven by variation in climatic indices, such as the Southern Oscillation Index. Time series analysis of recorded streamflow data reveals the parameters of separate autoregressive models that describe the inflow to the reservoir under three representative climate states ("normal", "wet", "dry"). These models then define inflow transition probabilities for use in a classic SDP approach. The key advantage of the Hidden Markov Model is that it allows conditioning the operating policy not only on the reservoir storage and the antecedent inflow, but also on the climate condition, thus potentially allowing adaptability to a broader range of climate conditions. In practice, the reservoir operator would effect a water release tailored to a specific climate state based on available teleconnection data and forecasts. The approach is demonstrated on the operation of a realistic, stylised water reservoir with carry-over capacity in South-East Australia. Here teleconnections relating
ERIC Educational Resources Information Center
Stifter, Cynthia A.; Rovine, Michael
2015-01-01
The focus of the present longitudinal study, to examine mother-infant interaction during the administration of immunizations at 2 and 6 months of age, used hidden Markov modelling, a time series approach that produces latent states to describe how mothers and infants work together to bring the infant to a soothed state. Results revealed a…
Tracking Problem Solving by Multivariate Pattern Analysis and Hidden Markov Model Algorithms
Anderson, John R.
2011-01-01
Multivariate pattern analysis can be combined with hidden Markov model algorithms to track the second-by-second thinking as people solve complex problems. Two applications of this methodology are illustrated with a data set taken from children as they interacted with an intelligent tutoring system for algebra. The first “mind reading” application involves using fMRI activity to track what students are doing as they solve a sequence of algebra problems. The methodology achieves considerable accuracy at determining both what problem-solving step the students are taking and whether they are performing that step correctly. The second “model discovery” application involves using statistical model evaluation to determine how many substates are involved in performing a step of algebraic problem solving. This research indicates that different steps involve different numbers of substates and these substates are associated with different fluency in algebra problem solving. PMID:21820455
Detecting microcalcifications in digital mammograms using wavelet domain hidden Markov tree model.
Regentova, Emma; Zhang, Lei; Zheng, Jun; Veni, Gopaalkrishna
2006-01-01
In this paper we investigate the performance of statistical modeling of digital mammograms by means of the wavelet domain hidden Markov tree model (WHMT) for inclusion in a computer-aided diagnostic prompting system for detecting microcalcification (MC) clusters. The system incorporates: (1) gross segmentation of mammograms for obtaining the breast region; (2) elimination of the pepper-type noise; (3) block-wise wavelet transform of the breast signal and likelihood calculation; (4) image segmentation; (5) postprocessing for retaining MC clusters. FROC curves are obtained for all mammograms of the mini-MIAS database that contain MC clusters. 100% of true positive cases are detected by the system at 2.9 false positives per case. PMID:17945686
3D+t brain MRI segmentation using robust 4D Hidden Markov Chain.
Lavigne, François; Collet, Christophe; Armspach, Jean-Paul
2014-01-01
In recent years many automatic methods have been developed to help physicians diagnose brain disorders, but the problem remains complex. In this paper we propose a method to segment brain structures on two 3D multi-modal MR images taken at different times (longitudinal acquisition). A bias field correction is performed with an adaptation of the Hidden Markov Chain (HMC) allowing us to take into account the temporal correlation in addition to spatial neighbourhood information. To improve the robustness of the segmentation of the principal brain structures and to detect Multiple Sclerosis Lesions as outliers the Trimmed Likelihood Estimator (TLE) is used during the process. The method is validated on 3D+t brain MR images. PMID:25571045
McCallum, Kenneth Jordan; Wang, Ji-Ping
2013-07-01
Copy number variations (CNVs) are a significant source of genetic variation and have been found frequently associated with diseases such as cancers and autism. High-throughput sequencing data are increasingly being used to detect and quantify CNVs; however, the distributional properties of the data are not fully understood. A hidden Markov model (HMM) is proposed using inhomogeneous emission distributions based on negative binomial regression to account for the sequencing biases. The model is tested on the whole genome sequencing data and simulated data sets. An algorithm for CNV detection is implemented in the R package CNVfinder. The model based on negative binomial regression is shown to provide a good fit to the data and provides competitive performance compared with methods based on normalization of read counts. PMID:23428932
Ficz, Gabriella; Wolf, Verena; Walter, Jörn
2016-01-01
DNA methylation and demethylation are opposing processes that when in balance create stable patterns of epigenetic memory. The control of DNA methylation pattern formation by replication dependent and independent demethylation processes has been suggested to be influenced by Tet mediated oxidation of 5mC. Several alternative mechanisms have been proposed suggesting that 5hmC influences either replication dependent maintenance of DNA methylation or replication independent processes of active demethylation. Using high resolution hairpin oxidative bisulfite sequencing data, we precisely determine the amount of 5mC and 5hmC and model the contribution of 5hmC to processes of demethylation in mouse ESCs. We develop an extended hidden Markov model capable of accurately describing the regional contribution of 5hmC to demethylation dynamics. Our analysis shows that 5hmC has a strong impact on replication dependent demethylation, mainly by impairing methylation maintenance. PMID:27224554
Adaptation of hidden Markov models for recognizing speech of reduced frame rate.
Lee, Lee-Min; Jean, Fu-Rong
2013-12-01
The frame rate of the observation sequence in distributed speech recognition applications may be reduced to suit a resource-limited front-end device. In order to use models trained using full-frame-rate data in the recognition of reduced frame-rate (RFR) data, we propose a method for adapting the transition probabilities of hidden Markov models (HMMs) to match the frame rate of the observation. Experiments on the recognition of clean and noisy connected digits are conducted to evaluate the proposed method. Experimental results show that the proposed method can effectively compensate for the frame-rate mismatch between the training and the test data. Using our adapted model to recognize the RFR speech data, one can significantly reduce the computation time and achieve the same level of accuracy as that of a method, which restores the frame rate using data interpolation. PMID:23757520
NASA Astrophysics Data System (ADS)
Attaluri, Pavan K.; Chen, Zhengxin; Weerakoon, Aruna M.; Lu, Guoqing
Multiple criteria decision making (MCDM) has significant impact in bioinformatics. In the research reported here, we explore the integration of decision tree (DT) and Hidden Markov Model (HMM) for subtype prediction of human influenza A virus. Infection with influenza viruses continues to be an important public health problem. Viral strains of subtype H3N2 and H1N1 circulate in humans at least twice annually. Subtype detection depends mainly on the antigenic assay, which is time-consuming and not fully accurate. We have developed a Web system for accurate subtype detection of human influenza virus sequences. The preliminary experiment showed that this system is easy to use and powerful in identifying human influenza subtypes. Our next step is to examine the informative positions at the protein level and extend its current functionality to detect more subtypes. The web functions can be accessed at http://glee.ist.unomaha.edu/.
Identifying bubble collapse in a hydrothermal system using hidden Markov models
NASA Astrophysics Data System (ADS)
Dawson, Phillip B.; Benítez, M. C.; Lowenstern, Jacob B.; Chouet, Bernard A.
2012-01-01
Beginning in July 2003 and lasting through September 2003, the Norris Geyser Basin in Yellowstone National Park exhibited an unusual increase in ground temperature and hydrothermal activity. Using hidden Markov model theory, we identify over five million high-frequency (>15 Hz) seismic events observed at a temporary seismic station deployed in the basin in response to the increase in hydrothermal activity. The source of these seismic events is constrained to within ~100 m of the station, and produced ~3500-5500 events per hour with mean durations of ~0.35-0.45 s. The seismic event rate, air temperature, hydrologic temperatures, and surficial water flow of the geyser basin exhibited a marked diurnal pattern that was closely associated with solar thermal radiance. We interpret the source of the seismicity to be due to the collapse of small steam bubbles in the hydrothermal system, with the rate of collapse being controlled by surficial temperatures and daytime evaporation rates.
Development of the hidden Markov models based Lithuanian speech recognition system
NASA Astrophysics Data System (ADS)
Ringeliene, Z.; Lipeika, A.
2010-09-01
The paper presents a prototype of a speaker-independent Lithuanian isolated word recognition system. The system is based on hidden Markov models, a powerful statistical method for modeling speech signals. The prototype system can be used for investigations of Lithuanian word recognition and is a good starting point for the development of a more sophisticated recognition system. The system's graphical user interface is easy to control. Visualization of the entire recognition process is useful for analyzing the recognition results. Based on this recognizer, a system for Web browser control by voice was developed. The program, which implements control by voice commands, was integrated in the speech recognition system. The system performance was evaluated by using different sets of acoustic models and vocabularies.
HMM-DM: identifying differentially methylated regions using a hidden Markov model.
Yu, Xiaoqing; Sun, Shuying
2016-03-01
DNA methylation is an epigenetic modification involved in organism development and cellular differentiation. Identifying differential methylations can help to study genomic regions associated with diseases. Differential methylation studies on single-CG resolution have become possible with the bisulfite sequencing (BS) technology. However, there is still a lack of efficient statistical methods for identifying differentially methylated (DM) regions in BS data. We have developed a new approach named HMM-DM to detect DM regions between two biological conditions using BS data. This new approach first uses a hidden Markov model (HMM) to identify DM CG sites accounting for spatial correlation across CG sites and variation across samples, and then summarizes identified sites into regions. We demonstrate through a simulation study that our approach has a superior performance compared to BSmooth. We also illustrate the application of HMM-DM using a real breast cancer dataset. PMID:26887041
NASA Astrophysics Data System (ADS)
Jiang, Huiming; Chen, Jin; Dong, Guangming
2016-05-01
Hidden Markov model (HMM) has been widely applied in bearing performance degradation assessment. As a machine learning-based model, its accuracy, subsequently, is dependent on the sensitivity of the features used to estimate the degradation performance of bearings. It's a big challenge to extract effective features which are not influenced by other qualities or attributes uncorrelated with the bearing degradation condition. In this paper, a bearing performance degradation assessment method based on HMM and nuisance attribute projection (NAP) is proposed. NAP can filter out the effect of nuisance attributes in feature space through projection. The new feature space projected by NAP is more sensitive to bearing health changes and barely influenced by other interferences occurring in operation condition. To verify the effectiveness of the proposed method, two different experimental databases are utilized. The results show that the combination of HMM and NAP can effectively improve the accuracy and robustness of the bearing performance degradation assessment system.
Damage evaluation by a guided wave-hidden Markov model based method
NASA Astrophysics Data System (ADS)
Mei, Hanfei; Yuan, Shenfang; Qiu, Lei; Zhang, Jinjin
2016-02-01
Guided wave based structural health monitoring has shown great potential in aerospace applications. However, one of the key challenges of practical engineering applications is the accurate interpretation of the guided wave signals under time-varying environmental and operational conditions. This paper presents a guided wave-hidden Markov model based method to improve the damage evaluation reliability of real aircraft structures under time-varying conditions. In the proposed approach, an HMM based unweighted moving average trend estimation method, which can capture the trend of damage propagation from the posterior probability obtained by HMM modeling is used to achieve a probabilistic evaluation of the structural damage. To validate the developed method, experiments are performed on a hole-edge crack specimen under fatigue loading condition and a real aircraft wing spar under changing structural boundary conditions. Experimental results show the advantage of the proposed method.
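Illustrative note: an unweighted moving average over a sequence of HMM posterior probabilities can be sketched as below. This is only a generic trailing-window average; the abstract does not specify the window shape or length, so the function name, the trailing window and the sample values are assumptions and should not be read as the authors' procedure.

```python
def moving_average(values, window):
    """Unweighted trailing moving average; early points use a shorter window."""
    out = []
    for i in range(len(values)):
        lo = max(0, i - window + 1)
        chunk = values[lo:i + 1]
        out.append(sum(chunk) / len(chunk))
    return out


# Hypothetical posterior probabilities of a "damaged" state over time
posterior = [0.0, 0.0, 1.0, 1.0]
trend = moving_average(posterior, window=2)
```

Averaging smooths out isolated fluctuations in the posterior so that a sustained rise, rather than a single noisy measurement, drives the damage estimate.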
Hidden Markov models and neural networks for fault detection in dynamic systems
NASA Technical Reports Server (NTRS)
Smyth, Padhraic
1994-01-01
Neural networks plus hidden Markov models (HMM) can provide excellent detection and false alarm rate performance in fault detection applications, as shown in this viewgraph presentation. Modified models allow for novelty detection. Key contributions of neural network models are: (1) excellent nonparametric discrimination capability; (2) a good estimator of posterior state probabilities, even in high dimensions, and thus can be embedded within overall probabilistic model (HMM); and (3) simple to implement compared to other nonparametric models. Neural network/HMM monitoring model is currently being integrated with the new Deep Space Network (DSN) antenna controller software and will be on-line monitoring a new DSN 34-m antenna (DSS-24) by July, 1994.
Non-intrusive gesture recognition system combining with face detection based on Hidden Markov Model
NASA Astrophysics Data System (ADS)
Jin, Jing; Wang, Yuanqing; Xu, Liujing; Cao, Liqun; Han, Lei; Zhou, Biye; Li, Minggao
2014-11-01
A non-intrusive gesture recognition human-machine interaction system is proposed in this paper. In order to solve the hand positioning problem which is a difficulty in current algorithms, face detection is used for the pre-processing to narrow the search area and find user's hand quickly and accurately. Hidden Markov Model (HMM) is used for gesture recognition. A certain number of basic gesture units are trained as HMM models. At the same time, an improved 8-direction feature vector is proposed and used to quantify characteristics in order to improve the detection accuracy. The proposed system can be applied in interaction equipments without special training for users, such as household interactive television
Switched Fault Diagnosis Approach for Industrial Processes based on Hidden Markov Model
NASA Astrophysics Data System (ADS)
Wang, Lin; Yang, Chunjie; Sun, Youxian; Pan, Yijun; An, Ruqiao
2015-11-01
Traditional fault diagnosis methods based on hidden Markov model (HMM) use a unified method for feature extraction, such as principal component analysis (PCA), kernel principal component analysis (KPCA) and independent component analysis (ICA). However, every method has its own limitations. For example, PCA cannot extract nonlinear relationships among process variables. So it is inappropriate to extract all features of variables by only one method, especially when data characteristics are very complex. This article proposes a switched feature extraction procedure using PCA and KPCA based on nonlinearity measure. By the proposed method, we are able to choose the most suitable feature extraction method, which could improve the accuracy of fault diagnosis. A simulation from the Tennessee Eastman (TE) process demonstrates that the proposed approach is superior to the traditional one based on HMM and could achieve more accurate classification of various process faults.
Grinding Wheel Condition Monitoring with Hidden Markov Model-Based Clustering Methods
Liao, T. W.; Hua, G; Qu, Jun; Blau, Peter Julian
2006-01-01
The hidden Markov model (HMM) is well known for sequence modeling and has been used for condition monitoring. However, HMM-based clustering methods have been developed only recently. This article proposes an HMM-based clustering method for monitoring the condition of grinding wheels used in grinding operations. The proposed method first extracts features from signals based on discrete wavelet decomposition using a moving window approach. It then generates a distance (dissimilarity) matrix using HMM. Based on this distance matrix, several hierarchical and partitioning-based clustering algorithms are applied to obtain clustering results. The proposed methodology was tested with feature sequences extracted from acoustic emission signals. The results show that clustering accuracy is dependent upon cutting condition. A higher material removal rate seems to produce more discriminatory signals/features than a lower material removal rate. The effects of window size, wavelet decomposition level, wavelet basis, clustering algorithm, and data normalization were also studied.
Hidden Markov model approach to skill learning and its application to telerobotics
Yang, J. (Robotics Inst.; Univ. of Akron, OH, Dept. of Electrical Engineering); Xu, Y. (Robotics Inst.); Chen, C.S. (Dept. of Electrical Engineering)
1994-10-01
In this paper, the authors discuss the problem of how human skill can be represented as a parametric model using a hidden Markov model (HMM), and how an HMM-based skill model can be used to learn human skill. HMM is suitable for characterizing the doubly stochastic process, measurable action and immeasurable mental states, that is involved in skill learning. The authors formulated the learning problem as a multidimensional HMM and developed a testbed for a variety of skill learning applications. Based on "the most likely performance" criterion, the best action sequence can be selected from all previously measured action data by modeling the skill as an HMM. The proposed method has been implemented in the teleoperation control of a space station robot system, and some important implementation issues have been discussed. The method allows a robot to learn human skill for certain tasks and to improve motion performance.
Recognition of amyotrophic lateral sclerosis disease using factorial hidden Markov model.
Khorasani, Abed; Daliri, Mohammad Reza; Pooyan, Mohammad
2016-02-01
Amyotrophic lateral sclerosis (ALS) is a common neurological disorder that can change the pattern of gait in humans. One of the effective methods for recognition and analysis of gait patterns in ALS patients is utilizing stride interval time series. With proper preprocessing to remove unwanted artifacts from the raw stride interval times, followed by extraction of meaningful features from these data, the factorial hidden Markov model (FHMM) was used to distinguish ALS patients from healthy subjects. The classification accuracy, evaluated using the leave-one-out (LOO) cross-validation algorithm, showed that the FHMM method provides better recognition of ALS and healthy subjects compared to the standard HMM. Moreover, comparing our method with a state-of-the-art method named least square support vector machine (LS-SVM) showed the efficiency of the FHMM in distinguishing ALS subjects from healthy ones. PMID:26110481
Nastou, Katerina C; Tsaousis, Georgios N; Papandreou, Nikos C; Hamodrakas, Stavros J
2016-07-01
A large number of modular domains that exhibit specific lipid binding properties are present in many membrane proteins involved in trafficking and signal transduction. These domains are present in either eukaryotic peripheral membrane or transmembrane proteins and are responsible for the non-covalent interactions of these proteins with membrane lipids. Here we report a profile Hidden Markov Model based method capable of detecting Membrane Binding Proteins (MBPs) from information encoded in their amino acid sequence, called MBPpred. The method identifies MBPs that contain one or more of the Membrane Binding Domains (MBDs) that have been described to date, and further classifies these proteins based on their position in respect to the membrane, either as peripheral or transmembrane. MBPpred is available online at http://bioinformatics.biol.uoa.gr/MBPpred. This method was applied in selected eukaryotic proteomes, in order to examine the characteristics they exhibit in various eukaryotic kingdoms and phyla. PMID:27048983
Prestat, Emmanuel; David, Maude M; Hultman, Jenni; Taş, Neslihan; Lamendella, Regina; Dvornik, Jill; Mackelprang, Rachel; Myrold, David D; Jumpponen, Ari; Tringe, Susannah G; Holman, Elizabeth; Mavromatis, Konstantinos; Jansson, Janet K
2014-10-29
A new functional gene database, FOAM (Functional Ontology Assignments for Metagenomes), was developed to screen environmental metagenomic sequence datasets. FOAM provides a new functional ontology dedicated to classify gene functions relevant to environmental microorganisms based on Hidden Markov Models (HMMs). Sets of aligned protein sequences (i.e. 'profiles') were tailored to a large group of target KEGG Orthologs (KOs) from which HMMs were trained. The alignments were checked and curated to make them specific to the targeted KO. Within this process, sequence profiles were enriched with the most abundant sequences available to maximize the yield of accurate classifier models. An associated functional ontology was built to describe the functional groups and hierarchy. FOAM allows the user to select the target search space before HMM-based comparison steps and to easily organize the results into different functional categories and subcategories. FOAM is publicly available at http://portal.nersc.gov/project/m1317/FOAM/. PMID:25260589
Hidden Markov model analysis of force/torque information in telemanipulation
NASA Technical Reports Server (NTRS)
Hannaford, Blake; Lee, Paul
1991-01-01
A model for the prediction and analysis of sensor information recorded during robotic performance of telemanipulation tasks is presented. The model uses the hidden Markov model to describe the task structure, the operator's or intelligent controller's goal structure, and the sensor signals. A methodology for constructing the model parameters based on engineering knowledge of the task is described. It is concluded that the model and its optimal state estimation algorithm, the Viterbi algorithm, are very successful at the task of segmenting the data record into phases corresponding to subgoals of the task. The model provides a rich modeling structure within a statistical framework, which enables it to represent complex systems and be robust to real-world sensory signals.
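The segmentation step named in this abstract can be illustrated with a minimal Viterbi sketch for a discrete HMM. The two phases ("approach" = 0, "contact" = 1), the two observation symbols (low force = 0, high force = 1) and all probabilities below are hypothetical values chosen for illustration; they are not taken from the paper.

```python
import math

def viterbi(obs, start_p, trans_p, emit_p):
    """Return the most probable hidden-state path for a discrete HMM."""
    n_states = len(start_p)
    log = lambda p: math.log(p) if p > 0 else float("-inf")
    # delta[s]: best log-probability of any path that ends in state s
    delta = [log(start_p[s]) + log(emit_p[s][obs[0]]) for s in range(n_states)]
    back = []  # back[t][s]: best predecessor of state s at step t + 1
    for o in obs[1:]:
        prev, delta, ptr = delta, [], []
        for s in range(n_states):
            best = max(range(n_states), key=lambda r: prev[r] + log(trans_p[r][s]))
            ptr.append(best)
            delta.append(prev[best] + log(trans_p[best][s]) + log(emit_p[s][o]))
        back.append(ptr)
    # Trace the best path backwards from the best final state
    state = max(range(n_states), key=lambda s: delta[s])
    path = [state]
    for ptr in reversed(back):
        state = ptr[state]
        path.append(state)
    return path[::-1]

# Hypothetical two-phase task model (assumed numbers, not from the paper)
start_p = [0.9, 0.1]
trans_p = [[0.9, 0.1], [0.1, 0.9]]
emit_p = [[0.9, 0.1], [0.2, 0.8]]
print(viterbi([0, 0, 0, 1, 1, 1], start_p, trans_p, emit_p))
```

Working in log-space avoids numerical underflow on long records, which matters when a whole sensor trace is segmented at once.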
A hidden Markov model combined with climate indices for multidecadal streamflow simulation
NASA Astrophysics Data System (ADS)
Bracken, C.; Rajagopalan, B.; Zagona, E.
2014-10-01
Hydroclimate time series often exhibit very low year-to-year autocorrelation while showing prolonged wet and dry epochs reminiscent of regime-shifting behavior. Traditional stochastic time series models cannot capture the regime-shifting features thereby misrepresenting the risk of prolonged wet and dry periods, consequently impacting management and planning efforts. Upper Colorado River Basin (UCRB) annual flow series highlights this clearly. To address this, a simulation framework is developed using a hidden Markov (HM) model in combination with large-scale climate indices that drive multidecadal variability. We demonstrate this on the UCRB flows and show that the simulations are able to capture the regime features by reproducing the multidecadal spectral features present in the data where a basic HM model without climate information cannot.
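The regime-shifting idea can be sketched with a basic two-state hidden Markov simulation. This corresponds only to the "basic HM model without climate information" mentioned above, not to the authors' framework with climate indices; the function name, persistence probabilities, means and standard deviations are assumptions made for illustration.

```python
import random

def simulate_regimes(n_years, stay_p, means, sds, seed=0):
    """Simulate annual flows from a two-state (dry = 0, wet = 1) hidden Markov model."""
    rng = random.Random(seed)
    state = rng.randrange(2)
    states, flows = [], []
    for _ in range(n_years):
        states.append(state)
        # Gaussian emission whose parameters depend on the hidden regime
        flows.append(rng.gauss(means[state], sds[state]))
        # Leave the current regime with probability 1 - stay_p[state]
        if rng.random() > stay_p[state]:
            state = 1 - state
    return states, flows

# Hypothetical parameters: highly persistent regimes give long wet and dry epochs
states, flows = simulate_regimes(50, stay_p=[0.9, 0.9], means=[12.0, 17.0], sds=[3.0, 3.0])
print(states)
```

With high persistence the simulated series shows prolonged wet and dry spells even though each year's flow is drawn independently given the regime.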
NASA Astrophysics Data System (ADS)
Zhou, Haitao; Chen, Jin; Dong, Guangming; Wang, Hongchao; Yuan, Haodong
2016-01-01
Because rolling element bearings play an important role in rotating machines, condition monitoring and fault diagnosis systems should be established to avoid abrupt breakage during operation. Various features from the time, frequency and time-frequency domains are usually used for bearing or machinery condition monitoring. In this study, an NCA-based feature extraction (FE) approach is proposed to reduce the dimensionality of the original feature set and avoid the "curse of dimensionality". Furthermore, a coupled hidden Markov model (CHMM) based on multichannel data acquisition is applied to diagnose bearing or machinery faults. Two case studies are presented to validate the proposed approach in both bearing fault diagnosis and fault severity classification. The experimental results show that the proposed NCA-CHMM can remove redundant information, fuse data from different channels and improve the diagnosis results.
Hand Gesture Spotting Based on 3D Dynamic Features Using Hidden Markov Models
NASA Astrophysics Data System (ADS)
Elmezain, Mahmoud; Al-Hamadi, Ayoub; Michaelis, Bernd
In this paper, we propose an automatic system that handles hand gesture spotting and recognition simultaneously in stereo color image sequences without any time delay, based on Hidden Markov Models (HMMs). Color and a 3D depth map are used to segment hand regions. The hand trajectory is then determined using the Mean-shift algorithm and a Kalman filter to generate 3D dynamic features. Furthermore, the k-means clustering algorithm is employed to generate the HMM codewords. To spot meaningful gestures accurately, a non-gesture model is proposed, which provides a confidence limit for the likelihoods calculated by the other gesture models. The confidence measures are used as an adaptive threshold for spotting meaningful gestures. Experimental results show that the proposed system can successfully recognize isolated gestures with 98.33% reliability and meaningful gestures with 94.35% reliability for numbers (0-9).
Fast Bayesian Inference of Copy Number Variants using Hidden Markov Models with Wavelet Compression.
Wiedenhoeft, John; Brugel, Eric; Schliep, Alexander
2016-05-01
By integrating Haar wavelets with Hidden Markov Models, we achieve drastically reduced running times for Bayesian inference using Forward-Backward Gibbs sampling. We show that this improves detection of genomic copy number variants (CNV) in array CGH experiments compared to the state-of-the-art, including standard Gibbs sampling. The method concentrates computational effort on chromosomal segments which are difficult to call, by dynamically and adaptively recomputing consecutive blocks of observations likely to share a copy number. This makes routine diagnostic use and re-analysis of legacy data collections feasible; to this end, we also propose an effective automatic prior. An open source software implementation of our method is available at http://schlieplab.org/Software/HaMMLET/ (DOI: 10.5281/zenodo.46262). This paper was selected for oral presentation at RECOMB 2016, and an abstract is published in the conference proceedings. PMID:27177143
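To see why Haar wavelets help identify blocks of observations that share a level, here is a minimal sketch of an unnormalised Haar transform. It is not the HaMMLET implementation; the function names and the toy signal are assumptions, and the input length is assumed to be a power of two.

```python
def haar_step(values):
    """One level of an unnormalised Haar transform on an even-length list."""
    pairs = list(zip(values[0::2], values[1::2]))
    averages = [(a + b) / 2 for a, b in pairs]
    details = [(a - b) / 2 for a, b in pairs]
    return averages, details

def haar_transform(values):
    """Full decomposition: overall mean plus detail coefficients, finest level first."""
    details_by_level = []
    current = list(values)
    while len(current) > 1:
        current, details = haar_step(current)
        details_by_level.append(details)
    return current[0], details_by_level

# Toy signal with a single level change in the middle
mean, details = haar_transform([2, 2, 2, 2, 4, 4, 4, 4])
print(mean, details)  # 3.0 [[0.0, 0.0, 0.0, 0.0], [0.0, 0.0], [-1.0]]
```

Detail coefficients vanish inside constant stretches, so only coefficients near a breakpoint are non-zero; in noisy data, small coefficients suggest regions that can plausibly be treated as one block.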
Statistical Mechanics of Transcription-Factor Binding Site Discovery Using Hidden Markov Models
Mehta, Pankaj; Schwab, David J.; Sengupta, Anirvan M.
2011-01-01
Hidden Markov Models (HMMs) are a commonly used tool for inference of transcription factor (TF) binding sites from DNA sequence data. We exploit the mathematical equivalence between HMMs for TF binding and the “inverse” statistical mechanics of hard rods in a one-dimensional disordered potential to investigate learning in HMMs. We derive analytic expressions for the Fisher information, a commonly employed measure of confidence in learned parameters, in the biologically relevant limit where the density of binding sites is low. We then use techniques from statistical mechanics to derive a scaling principle relating the specificity (binding energy) of a TF to the minimum amount of training data necessary to learn it. PMID:22851788
A computationally efficient approach for hidden-Markov model-augmented fingerprint-based positioning
NASA Astrophysics Data System (ADS)
Roth, John; Tummala, Murali; McEachen, John
2016-09-01
This paper presents a computationally efficient approach for mobile subscriber position estimation in wireless networks. A method of data scaling assisted by timing adjust is introduced in fingerprint-based location estimation under a framework which allows for minimising computational cost. The proposed method maintains a comparable level of accuracy to the traditional case where no data scaling is used and is evaluated in a simulated environment under varying channel conditions. The proposed scheme is studied when it is augmented by a hidden-Markov model to match the internal parameters to the channel conditions that are present, thus minimising computational cost while maximising accuracy. Furthermore, the timing adjust quantity, available in modern wireless signalling messages, is shown to further reduce computational cost and increase accuracy when available. The results may be seen as a significant step towards integrating advanced position-based modelling with power-sensitive mobile devices.
Prestat, Emmanuel; David, Maude M.; Hultman, Jenni; Taş, Neslihan; Lamendella, Regina; Dvornik, Jill; Mackelprang, Rachel; Myrold, David D.; Jumpponen, Ari; Tringe, Susannah G.; Holman, Elizabeth; Mavromatis, Konstantinos; Jansson, Janet K.
2014-09-26
A new functional gene database, FOAM (Functional Ontology Assignments for Metagenomes), was developed to screen environmental metagenomic sequence datasets. FOAM provides a new functional ontology dedicated to classify gene functions relevant to environmental microorganisms based on Hidden Markov Models (HMMs). Sets of aligned protein sequences (i.e. ‘profiles’) were tailored to a large group of target KEGG Orthologs (KOs) from which HMMs were trained. The alignments were checked and curated to make them specific to the targeted KO. Within this process, sequence profiles were enriched with the most abundant sequences available to maximize the yield of accurate classifier models. An associated functional ontology was built to describe the functional groups and hierarchy. FOAM allows the user to select the target search space before HMM-based comparison steps and to easily organize the results into different functional categories and subcategories. FOAM is publicly available at http://portal.nersc.gov/project/m1317/FOAM/.
NASA Astrophysics Data System (ADS)
Liu, Qinming; Dong, Ming; Lv, Wenyuan; Geng, Xiuli; Li, Yupeng
2015-12-01
Health prognosis for equipment is considered a key process of the condition-based maintenance strategy. This paper presents an integrated framework for multi-sensor equipment diagnosis and prognosis based on an adaptive hidden semi-Markov model (AHSMM). Unlike the hidden semi-Markov model (HSMM), the basic algorithms in an AHSMM are first modified to decrease computation and space complexity. Then, the maximum likelihood linear regression transformations method is used to train the output and duration distributions to re-estimate all unknown parameters. The AHSMM is used to identify the hidden degradation state and obtain the transition probabilities among health states and durations. Finally, through the proposed hazard rate equations, one can predict the remaining useful life of equipment with multi-sensor information. Our main results are verified in a real-world application: monitoring hydraulic pumps from Caterpillar Inc. The results show that the proposed methods are more effective for health prognosis of equipment with multi-sensor monitoring.
Modeling carbachol-induced hippocampal network synchronization using hidden Markov models
NASA Astrophysics Data System (ADS)
Dragomir, Andrei; Akay, Yasemin M.; Akay, Metin
2010-10-01
In this work we studied the neural state transitions undergone by the hippocampal neural network using a hidden Markov model (HMM) framework. We first employed a measure based on the Lempel-Ziv (LZ) estimator to characterize the changes in the hippocampal oscillation patterns in terms of their complexity. These oscillations correspond to different modes of hippocampal network synchronization induced by the cholinergic agonist carbachol in the CA1 region of the mouse hippocampus. HMMs are then used to model the dynamics of the LZ-derived complexity signals as first-order Markov chains. Consequently, the signals corresponding to our oscillation recordings can be segmented into a sequence of statistically discriminated hidden states. The segmentation is used for detecting transitions in neural synchronization modes in data recorded from wild-type and triple transgenic (3xTG) mouse models of Alzheimer's disease (AD). Our data suggest that the transition from the low-frequency (delta range) continuous oscillation mode into high-frequency (theta range) oscillation, exhibiting repeated burst-type patterns, always occurs through a mode resembling a mixture of the two patterns, continuous with burst. The relatively random patterns of oscillation during this mode may reflect the fact that the neuronal network undergoes re-organization. Further insight into the time durations of these modes (retrieved via the HMM segmentation of the LZ-derived signals) reveals that the mixed mode lasts significantly longer (p < 10^-4) in 3xTG AD mice. These findings, coupled with the documented cholinergic neurotransmission deficits in the 3xTG mouse model, may be highly relevant for the case of AD.
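As a rough illustration of an LZ-based complexity measure, the sketch below counts phrases in the Lempel-Ziv (1976) parsing of a symbol string. This is an assumption about the general family of measure; the abstract does not specify the exact estimator, binarisation or normalisation the authors used, and the function name is hypothetical.

```python
def lz76_complexity(s):
    """Count phrases in the Lempel-Ziv (1976) exhaustive-history parsing of s."""
    i, count, n = 0, 0, len(s)
    while i < n:
        length = 1
        # Grow the phrase while it can be copied from earlier text (overlap allowed)
        while i + length <= n and s[i:i + length] in s[:i + length - 1]:
            length += 1
        count += 1
        i += length
    return count

# A regular sequence parses into fewer phrases than an irregular one
print(lz76_complexity("0101010101010101"))
print(lz76_complexity("0001101001000101"))
```

Computing this count over a sliding window of a binarised recording would give a complexity signal of the kind an HMM could then segment.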
Fuzzy hidden Markov chains segmentation for volume determination and quantitation in PET
NASA Astrophysics Data System (ADS)
Hatt, M.; Lamare, F.; Boussion, N.; Turzo, A.; Collet, C.; Salzenstein, F.; Roux, C.; Jarritt, P.; Carson, K.; Cheze-LeRest, C.; Visvikis, D.
2007-07-01
Accurate volume of interest (VOI) estimation in PET is crucial in different oncology applications such as response to therapy evaluation and radiotherapy treatment planning. The objective of our study was to compare the performance of the proposed algorithm for automatic lesion volume delineation, namely the fuzzy hidden Markov chains (FHMC), with that of the threshold-based techniques that are the current state of the art in clinical practice. Like the classical hidden Markov chain (HMC) algorithm, FHMC takes into account noise, voxel intensity and spatial correlation in order to classify a voxel as background or functional VOI. However, the novelty of the fuzzy model consists of the inclusion of an estimation of imprecision, which should subsequently lead to a better modelling of the 'fuzzy' nature of the object of interest boundaries in emission tomography data. The performance of the algorithms has been assessed on both simulated and acquired datasets of the IEC phantom, covering a large range of spherical lesion sizes (from 10 to 37 mm), contrast ratios (4:1 and 8:1) and image noise levels. Both lesion activity recovery and VOI determination tasks were assessed in reconstructed images using two different voxel sizes (8 mm³ and 64 mm³). In order to account for both the functional volume location and its size, the concept of % classification errors was introduced in the evaluation of volume segmentation using the simulated datasets. Results reveal that FHMC performs substantially better than the threshold-based methodology for functional volume determination or activity concentration recovery considering a contrast ratio of 4:1 and lesion sizes of <28 mm. Furthermore, differences between classification and volume estimation errors were smaller for the segmented volumes provided by the FHMC algorithm. Finally, the performance of the automatic algorithms was less susceptible to image noise levels in comparison to the threshold-based techniques. The analysis of both
Doan, Tan N; Kong, David C M; Marshall, Caroline; Kirkpatrick, Carl M J; McBryde, Emma S
2015-01-01
Little is known about the transmission dynamics of Acinetobacter baumannii in hospitals, despite such information being critical for designing effective infection control measures. In the absence of comprehensive epidemiological data, mathematical modelling is an attractive approach to understanding transmission process. The statistical challenge in estimating transmission parameters from infection data arises from the fact that most patients are colonised asymptomatically and therefore the transmission process is not fully observed. Hidden Markov models (HMMs) can overcome this problem. We developed a continuous-time structured HMM to characterise the transmission dynamics, and to quantify the relative importance of different acquisition sources of A. baumannii in intensive care units (ICUs) in three hospitals in Melbourne, Australia. The hidden states were the total number of patients colonised with A. baumannii (both detected and undetected). The model input was monthly incidence data of the number of detected colonised patients (observations). A Bayesian framework with Markov chain Monte Carlo algorithm was used for parameter estimations. We estimated that 96-98% of acquisition in Hospital 1 and 3 was due to cross-transmission between patients; whereas most colonisation in Hospital 2 was due to other sources (sporadic acquisition). On average, it takes 20 and 31 days for each susceptible individual in Hospital 1 and Hospital 3 to become colonised as a result of cross-transmission, respectively; whereas it takes 17 days to observe one new colonisation from sporadic acquisition in Hospital 2. The basic reproduction ratio (R0) for Hospital 1, 2 and 3 was 1.5, 0.02 and 1.6, respectively. Our study is the first to characterise the transmission dynamics of A. baumannii using mathematical modelling. We showed that HMMs can be applied to sparse hospital infection data to estimate transmission parameters despite unobserved events and imperfect detection of the organism
Local Autoencoding for Parameter Estimation in a Hidden Potts-Markov Random Field.
Song, Sanming; Si, Bailu; Herrmann, J Michael; Feng, Xisheng
2016-05-01
A local-autoencoding (LAE) method is proposed for the parameter estimation in a Hidden Potts-Markov random field model. Due to sampling cost, Markov chain Monte Carlo methods are rarely used in real-time applications. Like other heuristic methods, LAE is based on a conditional independence assumption. It adapts, however, the parameters in a block-by-block style with a simple Hebbian learning rule. Experiments with given label fields show that the LAE is able to converge in far less time than required for a scan. It is also possible to derive an estimate for LAE based on a Cramer–Rao bound that is similar to the classical maximum pseudolikelihood method. As a general algorithm, LAE can be used to estimate the parameters in anisotropic label fields. Furthermore, LAE is not limited to the classical Potts model and can be applied to other types of Potts models by simple label field transformations and straightforward learning rule extensions. Experimental results on image segmentations demonstrate the efficiency and generality of the LAE algorithm. PMID:27019491
The analysis of disease biomarker data using a mixed hidden Markov model (Open Access publication)
Detilleux, Johann C
2008-01-01
A mixed hidden Markov model (HMM) was developed for predicting breeding values of a biomarker (here, somatic cell score) and the individual probabilities of health and disease (here, mastitis) based upon the measurements of the biomarker. At a first level, the unobserved disease process (Markov model) was introduced and at a second level, the measurement process was modeled, making the link between the unobserved disease states and the observed biomarker values. This hierarchical formulation allows joint estimation of the parameters of both processes. The flexibility of this approach is illustrated on the simulated data. Firstly, lactation curves for the biomarker were generated based upon published parameters (mean, variance, and probabilities of infection) for cows with known clinical conditions (health or mastitis due to Escherichia coli or Staphylococcus aureus). Next, estimation of the parameters was performed via Gibbs sampling, assuming the health status was unknown. Results from the simulations and mathematics show that the mixed HMM is appropriate to estimate the quantities of interest although the accuracy of the estimates is moderate when the prevalence of the disease is low. The paper ends with some indications for further developments of the methodology. PMID:18694546
Estimating parameters of hidden Markov models based on marked individuals: use of robust design data
Kendall, William L.; White, Gary C.; Hines, James E.; Langtimm, Catherine A.; Yoshizaki, Jun
2012-01-01
Development and use of multistate mark-recapture models, which provide estimates of parameters of Markov processes in the face of imperfect detection, have become common over the last twenty years. Recently, estimating parameters of hidden Markov models, where the state of an individual can be uncertain even when it is detected, has received attention. Previous work has shown that ignoring state uncertainty biases estimates of survival and state transition probabilities, thereby reducing the power to detect effects. Efforts to adjust for state uncertainty have included special cases and a general framework for a single sample per period of interest. We provide a flexible framework for adjusting for state uncertainty in multistate models, while utilizing multiple sampling occasions per period of interest to increase precision and remove parameter redundancy. These models also produce direct estimates of state structure for each primary period, even for the case where there is just one sampling occasion. We apply our model to expected value data, and to data from a study of Florida manatees, to provide examples of the improvement in precision due to secondary capture occasions. We also provide user-friendly software to implement these models. This general framework could also be used by practitioners to consider constrained models of particular interest, or model the relationship between within-primary period parameters (e.g., state structure) and between-primary period parameters (e.g., state transition probabilities).
Kendall, William L; White, Gary C; Hines, James E; Langtimm, Catherine A; Yoshizaki, Jun
2012-04-01
Development and use of multistate mark-recapture models, which provide estimates of parameters of Markov processes in the face of imperfect detection, have become common over the last 20 years. Recently, estimating parameters of hidden Markov models, where the state of an individual can be uncertain even when it is detected, has received attention. Previous work has shown that ignoring state uncertainty biases estimates of survival and state transition probabilities, thereby reducing the power to detect effects. Efforts to adjust for state uncertainty have included special cases and a general framework for a single sample per period of interest. We provide a flexible framework for adjusting for state uncertainty in multistate models, while utilizing multiple sampling occasions per period of interest to increase precision and remove parameter redundancy. These models also produce direct estimates of state structure for each primary period, even for the case where there is just one sampling occasion. We apply our model to expected-value data, and to data from a study of Florida manatees, to provide examples of the improvement in precision due to secondary capture occasions. We have also implemented these models in program MARK. This general framework could also be used by practitioners to consider constrained models of particular interest, or to model the relationship between within-primary-period parameters (e.g., state structure) and between-primary-period parameters (e.g., state transition probabilities). PMID:22690641
Hidden Markov induced Dynamic Bayesian Network for recovering time evolving gene regulatory networks
Zhu, Shijia; Wang, Yadong
2015-01-01
Dynamic Bayesian Networks (DBN) have been widely used to recover gene regulatory relationships from time-series data in computational systems biology. Its standard assumption is ‘stationarity’, and therefore, several research efforts have been recently proposed to relax this restriction. However, those methods suffer from three challenges: long running time, low accuracy and reliance on parameter settings. To address these problems, we propose a novel non-stationary DBN model by extending each hidden node of Hidden Markov Model into a DBN (called HMDBN), which properly handles the underlying time-evolving networks. Correspondingly, an improved structural EM algorithm is proposed to learn the HMDBN. It dramatically reduces searching space, thereby substantially improving computational efficiency. Additionally, we derived a novel generalized Bayesian Information Criterion under the non-stationary assumption (called BWBIC), which can help significantly improve the reconstruction accuracy and largely reduce over-fitting. Moreover, the re-estimation formulas for all parameters of our model are derived, enabling us to avoid reliance on parameter settings. Compared to the state-of-the-art methods, the experimental evaluation of our proposed method on both synthetic and real biological data demonstrates more stably high prediction accuracy and significantly improved computation efficiency, even with no prior knowledge and parameter settings. PMID:26680653
A Hidden Markov Model for Urban-Scale Traffic Estimation Using Floating Car Data.
Wang, Xiaomeng; Peng, Ling; Chi, Tianhe; Li, Mengzhu; Yao, Xiaojing; Shao, Jing
2015-01-01
Urban-scale traffic monitoring plays a vital role in reducing traffic congestion. Owing to its low cost and wide coverage, floating car data (FCD) serves as a novel approach to collecting traffic data. However, sparse probe data represents the vast majority of the data available on arterial roads in most urban environments. In order to overcome the problem of data sparseness, this paper proposes a hidden Markov model (HMM)-based traffic estimation model, in which the traffic condition on a road segment is considered as a hidden state that can be estimated according to the conditions of road segments having similar traffic characteristics. An algorithm based on clustering and pattern mining rather than on adjacency relationships is proposed to find clusters with road segments having similar traffic characteristics. A multi-clustering strategy is adopted to achieve a trade-off between clustering accuracy and coverage. Finally, the proposed model is designed and implemented on the basis of a real-time algorithm. Results of experiments based on real FCD confirm the applicability, accuracy, and efficiency of the model. In addition, the results indicate that the model is practicable for traffic estimation on urban arterials and works well even when more than 70% of the probe data are missing. PMID:26710073
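How a hidden traffic state can still be tracked when many probe observations are missing can be sketched with the HMM forward (filtering) recursion, skipping the measurement update whenever an observation is absent. This is a generic illustration, not the clustering-based model of the paper; the states (free-flow = 0, congested = 1), symbols (fast = 0, slow = 1) and probabilities are assumed values.

```python
def forward_filter(obs, start_p, trans_p, emit_p):
    """Filtered state probabilities for a discrete HMM; None marks a missing observation."""
    n = len(start_p)
    belief = list(start_p)
    history = []
    for t, o in enumerate(obs):
        if t > 0:
            # Prediction step: propagate the belief through the transition matrix
            belief = [sum(belief[r] * trans_p[r][s] for r in range(n)) for s in range(n)]
        if o is not None:
            # Update step: weight by the emission likelihood and renormalise
            # (assumes the observation has non-zero probability under some state)
            belief = [belief[s] * emit_p[s][o] for s in range(n)]
            total = sum(belief)
            belief = [b / total for b in belief]
        history.append(belief)
    return history

# Hypothetical road segment with two missing probe reports in the middle
start_p = [0.5, 0.5]
trans_p = [[0.8, 0.2], [0.2, 0.8]]
emit_p = [[0.9, 0.1], [0.3, 0.7]]
for belief in forward_filter([0, None, None, 1], start_p, trans_p, emit_p):
    print([round(b, 3) for b in belief])
```

During gaps the belief drifts toward the chain's stationary distribution and sharpens again once a new observation arrives.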
Hidden Markov induced Dynamic Bayesian Network for recovering time evolving gene regulatory networks
NASA Astrophysics Data System (ADS)
Zhu, Shijia; Wang, Yadong
2015-12-01
Dynamic Bayesian Networks (DBN) have been widely used to recover gene regulatory relationships from time-series data in computational systems biology. Its standard assumption is ‘stationarity’, and therefore, several research efforts have been recently proposed to relax this restriction. However, those methods suffer from three challenges: long running time, low accuracy and reliance on parameter settings. To address these problems, we propose a novel non-stationary DBN model by extending each hidden node of Hidden Markov Model into a DBN (called HMDBN), which properly handles the underlying time-evolving networks. Correspondingly, an improved structural EM algorithm is proposed to learn the HMDBN. It dramatically reduces searching space, thereby substantially improving computational efficiency. Additionally, we derived a novel generalized Bayesian Information Criterion under the non-stationary assumption (called BWBIC), which can help significantly improve the reconstruction accuracy and largely reduce over-fitting. Moreover, the re-estimation formulas for all parameters of our model are derived, enabling us to avoid reliance on parameter settings. Compared to the state-of-the-art methods, the experimental evaluation of our proposed method on both synthetic and real biological data demonstrates more stably high prediction accuracy and significantly improved computation efficiency, even with no prior knowledge and parameter settings.
Detecting Gait Phases from RGB-D Images Based on Hidden Markov Model
Heravi, Hamed; Ebrahimi, Afshin; Olyaee, Ehsan
2016-01-01
Gait contains important information about the status of the human body and physiological signs. In many medical applications, it is important to monitor and accurately analyze the gait of the patient. Because walking exhibits reproducible patterns across several phases, separating these phases can be used for gait analysis. In this study, an image-processing method for extracting the phases of human gait from RGB-Depth images is presented. The sequence of depth images from the front view is processed to extract the lower body depth profile and distance features. The feature vector extracted from each image serves as the observation vector of the hidden Markov model, and the phases of gait are considered as the hidden states of the model. After training the model on randomly selected training images, the gait phases can be estimated using the model. The results confirm the 60–40% ratio of the two major phases of the gait, and the mid-stance phase is recognized with 85% precision. PMID:27563572
Michalopoulos, Kostas; Zervakis, Michalis; Deiber, Marie-Pierre; Bourbakis, Nikolaos
2016-09-01
We present a novel synergistic methodology for the spatio-temporal analysis of single Electroencephalogram (EEG) trials. This new methodology is based on the novel synergy of the Local Global Graph (LG graph), used to characterize the structural features of the EEG topography as a global descriptor for robust comparison of dominant topographies (microstates), and Hidden Markov Models (HMM), used to model the topographic sequence in a unique way. In particular, the LG graph descriptor defines similarity and distance measures that can be successfully used for the difficult comparison of the extracted LG graphs in the presence of noise. In addition, hidden states represent periods of stationary distribution of topographies that constitute the equivalent of the microstates in the model. The transitions between the different microstates and the formed syntactic patterns can reveal differences in the processing of the input stimulus between different pathologies. We train the HMM model to learn the transitions between the different microstates and express the syntactic patterns that appear in the single trials in a compact and efficient way. We applied this methodology to single trials from normal subjects and patients with Progressive Mild Cognitive Impairment (PMCI) to discriminate these two groups. The classification results show that this approach is capable of efficiently discriminating between control and Progressive MCI single trials. Results indicate that HMMs provide physiologically meaningful results that can be used in the syntactic analysis of Event Related Potentials. PMID:27255799
Zhu, Shijia; Wang, Yadong
2015-01-01
Dynamic Bayesian Networks (DBNs) have been widely used to recover gene regulatory relationships from time-series data in computational systems biology. Their standard assumption is 'stationarity', and several research efforts have recently been proposed to relax this restriction. However, those methods suffer from three challenges: long running time, low accuracy and reliance on parameter settings. To address these problems, we propose a novel non-stationary DBN model by extending each hidden node of a Hidden Markov Model into a DBN (called HMDBN), which properly handles the underlying time-evolving networks. Correspondingly, an improved structural EM algorithm is proposed to learn the HMDBN. It dramatically reduces the search space, thereby substantially improving computational efficiency. Additionally, we derived a novel generalized Bayesian Information Criterion under the non-stationary assumption (called BWBIC), which can help significantly improve the reconstruction accuracy and largely reduce over-fitting. Moreover, the re-estimation formulas for all parameters of our model are derived, enabling us to avoid reliance on parameter settings. Compared to the state-of-the-art methods, the experimental evaluation of our proposed method on both synthetic and real biological data demonstrates more consistently high prediction accuracy and significantly improved computational efficiency, even with no prior knowledge and parameter settings. PMID:26680653
Detecting Gait Phases from RGB-D Images Based on Hidden Markov Model.
Heravi, Hamed; Ebrahimi, Afshin; Olyaee, Ehsan
2016-01-01
Gait contains important information about the status of the human body and physiological signs. In many medical applications, it is important to monitor and accurately analyze the gait of the patient. Since walking shows reproducible signs in several phases, separating these phases can be used for gait analysis. In this study, a method based on image processing for extracting phases of human gait from RGB-Depth images is presented. The sequence of depth images from the front view has been processed to extract the lower body depth profile and distance features. The feature vector extracted from each image serves as the observation vector of a hidden Markov model, and the phases of gait are considered as hidden states of the model. After training the model using images randomly selected as training samples, the phase estimation of gait becomes possible using the model. The results confirm the rate of 60-40% of two major phases of the gait and also the mid-stance phase is recognized with 85% precision. PMID:27563572
NASA Astrophysics Data System (ADS)
Andriyas, S.; McKee, M.
2014-12-01
Anticipating farmers' irrigation decisions can provide the possibility of improving the efficiency of canal operations in on-demand irrigation systems. Although multiple factors are considered during irrigation decision making, for any given farmer there might be one factor playing a major role. Identification of that biophysical factor which led to a farmer deciding to irrigate is difficult because of high variability of those factors during the growing season. Analysis of the irrigation decisions of a group of farmers for a single crop can help to simplify the problem. We developed a hidden Markov model (HMM) to analyze irrigation decisions and explore the factor and level at which the majority of farmers decide to irrigate. The model requires observed variables as inputs and the hidden states. The chosen model inputs were relatively easily measured, or estimated, biophysical data, including such factors (i.e., those variables which are believed to affect irrigation decision-making) as cumulative evapotranspiration, soil moisture depletion, soil stress coefficient, and canal flows. Irrigation decision series were the hidden states for the model. The data for the work comes from the Canal B region of the Lower Sevier River Basin, near Delta, Utah. The main crops of the region are alfalfa, barley, and corn. A portion of the data was used to build and test the model capability to explore that factor and the level at which the farmer takes the decision to irrigate for future irrigation events. Both group and individual level behavior can be studied using HMMs. The study showed that the farmers cannot be classified into certain classes based on their irrigation decisions, but vary in their behavior from irrigation-to-irrigation across all years and crops. HMMs can be used to analyze what factor and, subsequently, what level of that factor on which the farmer most likely based the irrigation decision. The study shows that the HMM is a capable tool to study a process
Neuwald, Andrew F; Liu, Jun S
2004-01-01
Background: Certain protein families are highly conserved across distantly related organisms and belong to large and functionally diverse superfamilies. The patterns of conservation present in these protein sequences presumably are due to selective constraints maintaining important but unknown structural mechanisms with some constraints specific to each family and others shared by a larger subset or by the entire superfamily. To exploit these patterns as a source of functional information, we recently devised a statistically based approach called contrast hierarchical alignment and interaction network (CHAIN) analysis, which infers the strengths of various categories of selective constraints from co-conserved patterns in a multiple alignment. The power of this approach strongly depends on the quality of the multiple alignments, which thus motivated development of theoretical concepts and strategies to improve alignment of conserved motifs within large sets of distantly related sequences. Results: Here we describe a hidden Markov model (HMM), an algebraic system, and Markov chain Monte Carlo (MCMC) sampling strategies for alignment of multiple sequence motifs. The MCMC sampling strategies are useful both for alignment optimization and for adjusting position specific background amino acid frequencies for alignment uncertainties. Associated statistical formulations provide an objective measure of alignment quality as well as automatic gap penalty optimization. Improved alignments obtained in this way are compared with PSI-BLAST based alignments within the context of CHAIN analysis of three protein families: Giα subunits, prolyl oligopeptidases, and transitional endoplasmic reticulum (p97) AAA+ ATPases. Conclusion: While not entirely replacing PSI-BLAST based alignments, which likewise may be optimized for CHAIN analysis using this approach, these motif-based methods often more accurately align very distantly related sequences and thus can provide a better measure of
Partially ordered mixed hidden Markov model for the disablement process of older adults
Ip, Edward H.; Zhang, Qiang; Rejeski, W. Jack; Harris, Tamara B.; Kritchevsky, Stephen
2013-01-01
At both the individual and societal levels, the health and economic burden of disability in older adults is enormous in developed countries, including the U.S. Recent studies have revealed that the disablement process in older adults often comprises episodic periods of impaired functioning and periods that are relatively free of disability, amid a secular and natural trend of decline in functioning. Rather than an irreversible, progressive event that is analogous to a chronic disease, disability is better conceptualized and mathematically modeled as states that do not necessarily follow a strict linear order of good-to-bad. Statistical tools, including Markov models, which allow bidirectional transition between states, and random effects models, which allow individual-specific rate of secular decline, are pertinent. In this paper, we propose a mixed effects, multivariate, hidden Markov model to handle partially ordered disability states. The model generalizes the continuation ratio model for ordinal data in the generalized linear model literature and provides a formal framework for testing the effects of risk factors and/or an intervention on the transitions between different disability states. Under a generalization of the proportional odds ratio assumption, the proposed model circumvents the problem of a potentially large number of parameters when the number of states and the number of covariates are substantial. We describe a maximum likelihood method for estimating the partially ordered, mixed effects model and show how the model can be applied to a longitudinal data set that consists of N = 2,903 older adults followed for 10 years in the Health Aging and Body Composition Study. We further statistically test the effects of various risk factors upon the probabilities of transition into various severe disability states. The result can be used to inform geriatric and public health science researchers who study the disablement process. PMID:24058222
Martinez-Murcia, Francisco J; Górriz, Juan M; Ramírez, Javier; Ortiz, Andres
2016-11-01
The usage of biomedical imaging in the diagnosis of dementia is increasingly widespread. A number of works explore the possibilities of computational techniques and algorithms in what is called computer-aided diagnosis. Our work presents an automatic parametrization of the brain structure by means of a path generation algorithm based on hidden Markov models (HMMs). The path is traced using information of intensity and spatial orientation in each node, adapting to the structure of the brain. Each path is itself a useful way to characterize the distribution of the tissue inside the magnetic resonance imaging (MRI) image by, for example, extracting the intensity levels at each node or generating statistical information of the tissue distribution. Additionally, further processing consisting of a modification of the grey level co-occurrence matrix (GLCM) can be used to characterize the textural changes that occur throughout the path, yielding more meaningful values that could be associated with Alzheimer's disease (AD), as well as providing a significant feature reduction. This methodology achieves moderate performance, up to 80.3% accuracy using a single path in differential diagnosis involving Alzheimer-affected subjects versus controls belonging to the Alzheimer's Disease Neuroimaging Initiative (ADNI). PMID:27354189
Griffin, William A.; Li, Xun
2016-01-01
Sequential affect dynamics generated during the interaction of intimate dyads, such as married couples, are associated with a cascade of effects (some good and some bad) on each partner, close family members, and other social contacts. Although the effects are well documented, the probabilistic structures associated with micro-social processes connected to the varied outcomes remain enigmatic. Using extant data we developed a method of classifying and subsequently generating couple dynamics using a Hierarchical Dirichlet Process Hidden semi-Markov Model (HDP-HSMM). Our findings indicate that several key aspects of existing models of marital interaction are inadequate: affect state emissions and their durations, along with the expected variability differences between distressed and nondistressed couples, are present but highly nuanced; and most surprisingly, heterogeneity among highly satisfied couples necessitates that they be divided into subgroups. We review how this unsupervised learning technique generates plausible dyadic sequences that are sensitive to relationship quality and provides a natural mechanism for computational models of behavioral and affective micro-social processes. PMID:27187319
Application of hidden Markov models to biological data mining: a case study
NASA Astrophysics Data System (ADS)
Yin, Michael M.; Wang, Jason T.
2000-04-01
In this paper we present an example of biological data mining: the detection of splicing junction acceptors in eukaryotic genes. Identification or prediction of transcribed sequences from within genomic DNA has been a major rate-limiting step in the pursuit of genes. Programs currently available are far from being powerful enough to elucidate the gene structure completely. Here we develop a hidden Markov model (HMM) to represent the degeneracy features of splicing junction acceptor sites in eukaryotic genes. The HMM system is fully trained using an expectation maximization (EM) algorithm and the system performance is evaluated using the 10-way cross-validation method. Experimental results show that our HMM system can correctly classify more than 94% of the candidate sequences (including true and false acceptor sites) into the right categories. About 90% of the true acceptor sites and 96% of the false acceptor sites in the test data are classified correctly. These results are very promising considering that only the local information in DNA is used. The proposed model will be a very important component of an effective and accurate gene structure detection system currently being developed in our lab.
Yada, T; Hirosawa, M
1996-12-31
The gene-finding programs developed so far have not paid much attention to the detection of short protein coding regions (CDSs). However, the detection of short CDSs is important for the study of photosynthesis. We utilized GeneHacker, a gene-finding program based on the hidden Markov model (HMM), to detect short CDSs (from 90 to 300 bases) in a 1.0 mega contiguous sequence of cyanobacterium Synechocystis sp. strain PCC6803 which carries a complete set of genes for oxygenic photosynthesis. GeneHacker differs from other gene-finding programs based on the HMM in that it utilizes di-codon statistics as well. GeneHacker successfully detected seven out of the eight short CDSs annotated in this sequence and was clearly superior to GeneMark in this range of length. GeneHacker detected 94 potentially new CDSs, 9 of which have counterparts in the genetic databases. Four of the nine CDSs were less than 150 bases and were photosynthesis-related genes. The results show the effectiveness of GeneHacker in detecting very short CDSs corresponding to genes. PMID:9097038
A hidden Markov model that finds genes in E. coli DNA.
Krogh, A; Mian, I S; Haussler, D
1994-01-01
A hidden Markov model (HMM) has been developed to find protein coding genes in E. coli DNA using E. coli genome DNA sequence from the EcoSeq6 database maintained by Kenn Rudd. This HMM includes states that model the codons and their frequencies in E. coli genes, as well as the patterns found in the intergenic region, including repetitive extragenic palindromic sequences and the Shine-Dalgarno motif. To account for potential sequencing errors and/or frameshifts in raw genomic DNA sequence, it allows for the (very unlikely) possibility of insertions and deletions of individual nucleotides within a codon. The parameters of the HMM are estimated using approximately one million nucleotides of annotated DNA in EcoSeq6 and the model tested on a disjoint set of contigs containing about 325,000 nucleotides. The HMM finds the exact locations of about 80% of the known E. coli genes, and approximate locations for about 10%. It also finds several potentially new genes, and locates several places where insertion or deletion errors and/or frameshifts may be present in the contigs. PMID:7984429
Zacher, Benedikt; Lidschreiber, Michael; Cramer, Patrick; Gagneur, Julien; Tresch, Achim
2014-01-01
DNA replication, transcription and repair involve the recruitment of protein complexes that change their composition as they progress along the genome in a directed or strand-specific manner. Chromatin immunoprecipitation in conjunction with hidden Markov models (HMMs) has been instrumental in understanding these processes, as they segment the genome into discrete states that can be related to DNA-associated protein complexes. However, current HMM-based approaches are not able to assign forward or reverse direction to states or properly integrate strand-specific (e.g., RNA expression) with non-strand-specific (e.g., ChIP) data, which is indispensable to accurately characterize directed processes. To overcome these limitations, we introduce bidirectional HMMs which infer directed genomic states from occupancy profiles de novo. Application to RNA polymerase II-associated factors in yeast and chromatin modifications in human T cells recovers the majority of transcribed loci, reveals gene-specific variations in the yeast transcription cycle and indicates the existence of directed chromatin state patterns at transcribed, but not at repressed, regions in the human genome. In yeast, we identify 32 new transcribed loci, a regulated initiation–elongation transition, the absence of elongation factors Ctk1 and Paf1 from a class of genes, a distinct transcription mechanism for highly expressed genes and novel DNA sequence motifs associated with transcription termination. We anticipate bidirectional HMMs to significantly improve the analyses of genome-associated directed processes. PMID:25527639
NASA Astrophysics Data System (ADS)
Hossen, Jakir; Jacobs, Eddie L.; Chari, Srikant
2015-07-01
Linear pyroelectric array sensors have enabled useful classifications of objects such as humans and animals to be performed with relatively low-cost hardware in border and perimeter security applications. Ongoing research has sought to improve the performance of these sensors through signal processing algorithms. In the research presented here, we introduce the use of hidden Markov tree (HMT) models for object recognition in images generated by linear pyroelectric sensors. HMTs are trained to statistically model the wavelet features of individual objects through an expectation-maximization learning process. Human versus animal classification for a test object is made by evaluating its wavelet features against the trained HMTs using the maximum-likelihood criterion. The classification performance of this approach is compared to two other techniques; a texture, shape, and spectral component features (TSSF) based classifier and a speeded-up robust feature (SURF) classifier. The evaluation indicates that among the three techniques, the wavelet-based HMT model works well, is robust, and has improved classification performance compared to a SURF-based algorithm in equivalent computation time. When compared to the TSSF-based classifier, the HMT model has a slightly degraded performance but almost an order of magnitude improvement in computation time enabling real-time implementation.
Hypovigilance detection for UCAV operators based on a hidden Markov model.
Choi, Yerim; Kwon, Namyeon; Lee, Sungjun; Shin, Yongwook; Ryo, Chuh Yeop; Park, Jonghun; Shin, Dongmin
2014-01-01
With the advance of military technology, the number of unmanned combat aerial vehicles (UCAVs) has rapidly increased. However, it has been reported that the accident rate of UCAVs is much higher than that of manned combat aerial vehicles. One of the main reasons for the high accident rate of UCAVs is the hypovigilance problem which refers to the decrease in vigilance levels of UCAV operators while maneuvering. In this paper, we propose hypovigilance detection models for UCAV operators based on EEG signal to minimize the number of occurrences of hypovigilance. To enable detection, we have applied hidden Markov models (HMMs), two of which are used to indicate the operators' dual states, normal vigilance and hypovigilance, and, for each operator, the HMMs are trained as a detection model. To evaluate the efficacy and effectiveness of the proposed models, we conducted two experiments on the real-world data obtained by using EEG-signal acquisition devices, and they yielded satisfactory results. By utilizing the proposed detection models, the problem of hypovigilance of UCAV operators and the problem of high accident rate of UCAVs can be addressed. PMID:24963338
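One common way to use a pair of trained HMMs as a detector is to score a new observation sequence under each model and pick the more likely one. The abstract does not state its decision rule or its feature encoding, so the following is only a minimal sketch under assumptions: discrete symbols, a scaled forward algorithm, and hypothetical toy parameters (the `models` dictionary and its numbers are illustrative, not taken from the paper).

```python
import math


def forward_loglik(obs, start, trans, emit):
    """Log-likelihood of a discrete observation sequence under an HMM.

    obs   : list of symbol indices
    start : start[i]    = P(first state = i)
    trans : trans[i][j] = P(next state = j | current state = i)
    emit  : emit[i][k]  = P(symbol k | state i)
    """
    n = len(start)
    alpha = [start[i] * emit[i][obs[0]] for i in range(n)]
    loglik = 0.0
    for t in range(len(obs)):
        if t > 0:
            # Propagate one step and weight by the emission probability.
            alpha = [
                sum(alpha[i] * trans[i][j] for i in range(n)) * emit[j][obs[t]]
                for j in range(n)
            ]
        # Rescale at every step to avoid numerical underflow.
        scale = sum(alpha)
        loglik += math.log(scale)
        alpha = [a / scale for a in alpha]
    return loglik


def classify(obs, models):
    """Return the label of the model with the highest log-likelihood."""
    return max(models, key=lambda label: forward_loglik(obs, *models[label]))


# Hypothetical toy models; symbol 0 = "low activity", 1 = "high activity".
models = {
    "normal": ([0.5, 0.5], [[0.9, 0.1], [0.1, 0.9]], [[0.2, 0.8], [0.4, 0.6]]),
    "hypovigilance": ([0.5, 0.5], [[0.9, 0.1], [0.1, 0.9]], [[0.8, 0.2], [0.6, 0.4]]),
}
label = classify([0, 0, 1, 0, 0, 0], models)
```

With real EEG features the emissions would more plausibly be continuous (for example Gaussian), but the likelihood comparison itself would stay the same.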
Newton, Richard; Hinds, Jason; Wernisch, Lorenz
2006-01-01
Whole genome DNA microarray genomotyping experiments compare the gene content of different species or strains of bacteria. A statistical approach to analysing the results of these experiments was developed, based on a Hidden Markov model (HMM), which takes adjacency of genes along the genome into account when calling genes present or absent. The model was implemented in the statistical language R and applied to three datasets. The method is numerically stable with good convergence properties. Error rates are reduced compared with approaches that ignore spatial information. Moreover, the HMM circumvents a problem encountered in a conventional analysis: determining the cut-off value to use to classify a gene as absent. An Apache Struts web interface for the R script was created for the benefit of users unfamiliar with R. The application may be found at http://hmmgd.cryst.bbk.ac.uk/hmmgd. The source code illustrating how to run R scripts from an Apache Struts-based web application is available from the corresponding author on request. The application is also available for local installation if required. PMID:17140267
Bayesian hidden Markov models to identify RNA-protein interaction sites in PAR-CLIP.
Yun, Jonghyun; Wang, Tao; Xiao, Guanghua
2014-06-01
The photoactivatable ribonucleoside enhanced cross-linking immunoprecipitation (PAR-CLIP) has been increasingly used for the global mapping of RNA-protein interaction sites. There are two key features of the PAR-CLIP experiments: The sequence read tags are likely to form an enriched peak around each RNA-protein interaction site; and the cross-linking procedure is likely to introduce a specific mutation in each sequence read tag at the interaction site. Several ad hoc methods have been developed to identify the RNA-protein interaction sites using either sequence read counts or mutation counts alone; however, rigorous statistical methods for analyzing PAR-CLIP are still lacking. In this article, we propose an integrative model to establish a joint distribution of observed read and mutation counts. To pinpoint the interaction sites at single base-pair resolution, we developed a novel modeling approach that adopts non-homogeneous hidden Markov models to incorporate the nucleotide sequence at each genomic location. Both simulation studies and data application showed that our method outperforms the ad hoc methods, and provides reliable inferences for the RNA-protein binding sites from PAR-CLIP data. PMID:24571656
Combining hidden Markov models for comparing the dynamics of multiple sleep electroencephalograms.
Langrock, Roland; Swihart, Bruce J; Caffo, Brian S; Punjabi, Naresh M; Crainiceanu, Ciprian M
2013-08-30
In this manuscript, we consider methods for the analysis of populations of electroencephalogram signals during sleep for the study of sleep disorders using hidden Markov models (HMMs). Notably, we propose an easily implemented method for simultaneously modeling multiple time series that involve large amounts of data. We apply these methods to study sleep-disordered breathing (SDB) in the Sleep Heart Health Study (SHHS), a landmark study of SDB and cardiovascular consequences. We use the entire, longitudinally collected, SHHS cohort to develop HMM population parameters, which we then apply to obtain subject-specific Markovian predictions. From these predictions, we create several indices of interest, such as transition frequencies between latent states. Our HMM analysis of electroencephalogram signals uncovers interesting findings regarding differences in brain activity during sleep between those with and without SDB. These findings include stability of the percent time spent in HMM latent states across matched diseased and non-diseased groups and differences in the rate of transitioning. PMID:23348835
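Indices such as percent time spent in each latent state and the rate of transitioning can be computed directly from a decoded state sequence. The sketch below assumes the sequence has already been decoded into integer state labels; the function name and the example sequence are hypothetical and are not taken from the study.

```python
from collections import Counter


def occupancy_and_transition_rate(states):
    """Summarise a decoded latent-state sequence.

    Returns (percent_time, rate), where percent_time maps each state to the
    percentage of epochs spent in it, and rate is the fraction of consecutive
    epoch pairs in which the state changes.
    """
    total = len(states)
    counts = Counter(states)
    percent_time = {s: 100.0 * c / total for s, c in counts.items()}
    switches = sum(1 for a, b in zip(states, states[1:]) if a != b)
    rate = switches / (total - 1) if total > 1 else 0.0
    return percent_time, rate


# Hypothetical decoded sequence of eight epochs with two latent states.
percent_time, rate = occupancy_and_transition_rate([0, 0, 1, 1, 1, 0, 0, 0])
```

Two subjects can share the same `percent_time` while differing in `rate`, which is the kind of contrast the abstract reports between groups.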
An Enhanced Informed Watermarking Scheme Using the Posterior Hidden Markov Model
2014-01-01
Designing a practical watermarking scheme with high robustness, feasible imperceptibility, and large capacity remains one of the most important research topics in robust watermarking. This paper presents a posterior hidden Markov model (HMM)-based informed image watermarking scheme, which enhances the practicability of the prior-HMM-based informed watermarking with favorable robustness, imperceptibility, and capacity. To make the encoder and decoder use the (nearly) identical posterior HMM, each cover image at the encoder and each received image at the decoder are attacked with JPEG compression at an equivalently small quality factor (QF). The attacked images are then employed to estimate HMM parameter sets for the encoder and decoder, respectively. Numerical simulations show that a small QF of 5 is an optimum setting for practical use. Based on this posterior HMM, we develop an enhanced posterior-HMM-based informed watermarking scheme. Extensive experimental simulations show that the proposed scheme is comparable to its prior counterpart in which the HMM is estimated with the original image, but it avoids the transmission of the prior HMM from the encoder to the decoder. This thus enhances the practical application of HMM-based informed watermarking systems. Also, it is demonstrated that the proposed scheme has robustness comparable to the state-of-the-art with significantly reduced computation time. PMID:24574883
Karchin, Rachel; Cline, Melissa; Mandel-Gutfreund, Yael; Karplus, Kevin
2003-06-01
An important problem in computational biology is predicting the structure of the large number of putative proteins discovered by genome sequencing projects. Fold-recognition methods attempt to solve the problem by relating the target proteins to known structures, searching for template proteins homologous to the target. Remote homologs that may have significant structural similarity are often not detectable by sequence similarities alone. To address this, we incorporated predicted local structure, a generalization of secondary structure, into two-track profile hidden Markov models (HMMs). We did not rely on a simple helix-strand-coil definition of secondary structure, but experimented with a variety of local structure descriptions, following a principled protocol to establish which descriptions are most useful for improving fold recognition and alignment quality. On a test set of 1298 nonhomologous proteins, HMMs incorporating a 3-letter STRIDE alphabet improved fold recognition accuracy by 15% over amino-acid-only HMMs and 23% over PSI-BLAST, measured by ROC-65 numbers. We compared two-track HMMs to amino-acid-only HMMs on a difficult alignment test set of 200 protein pairs (structurally similar with 3-24% sequence identity). HMMs with a 6-letter STRIDE secondary track improved alignment quality by 62%, relative to DALI structural alignments, while HMMs with an STR track (an expanded DSSP alphabet that subdivides strands into six states) improved by 40% relative to CE. PMID:12784210
NASA Astrophysics Data System (ADS)
Zhou, Haitao; Chen, Jin; Dong, Guangming; Wang, Ran
2016-05-01
Many existing signal processing methods select a predefined basis function in advance. This basis function selection relies on a priori knowledge about the target signal, which is always infeasible in engineering applications. The dictionary learning method provides an ambitious direction: learning basis atoms from the data itself, with the objective of finding the underlying structure embedded in the signal. As a special case of dictionary learning methods, shift-invariant dictionary learning (SIDL) reconstructs an input signal using basis atoms in all possible time shifts. The property of shift-invariance is very suitable for extracting periodic impulses, which are a typical symptom of mechanical fault signals. After learning basis atoms, a signal can be decomposed into a collection of latent components, each reconstructed by one basis atom and its corresponding time shifts. In this paper, the SIDL method is introduced as an adaptive feature extraction technique. Then an effective approach based on SIDL and the hidden Markov model (HMM) is addressed for machinery fault diagnosis. The SIDL-based feature extraction is applied to analyze both simulated and experimental signals with specific notch size. This experiment shows that SIDL can successfully extract double impulses in the bearing signal. The second experiment presents an artificial fault experiment with different bearing fault types. Feature extraction based on the SIDL method is performed on each signal, and then the HMM is used to identify its fault type. The experimental results show that the proposed SIDL-HMM has good performance in bearing fault diagnosis.
Detection and diagnosis of bearing and cutting tool faults using hidden Markov models
NASA Astrophysics Data System (ADS)
Boutros, Tony; Liang, Ming
2011-08-01
Over the last few decades, the research for new fault detection and diagnosis techniques in machining processes and rotating machinery has attracted increasing interest worldwide. This development was mainly stimulated by the rapid advance in industrial technologies and the increase in complexity of machining and machinery systems. In this study, the discrete hidden Markov model (HMM) is applied to detect and diagnose mechanical faults. The technique is tested and validated successfully using two scenarios: tool wear/fracture and bearing faults. In the first case the model correctly detected the state of the tool (i.e., sharp, worn, or broken) whereas in the second application, the model classified the severity of the fault seeded in two different engine bearings. The success rate obtained in our tests for fault severity classification was above 95%. In addition to the fault severity, a location index was developed to determine the fault location. This index has been applied to determine the location (inner race, ball, or outer race) of a bearing fault with an average success rate of 96%. The training time required to develop the HMMs was less than 5 s in both the monitoring cases.
Enhancing speech recognition using improved particle swarm optimization based hidden Markov model.
Selvaraj, Lokesh; Ganesan, Balakrishnan
2014-01-01
Enhancing speech recognition is the primary intention of this work. In this paper a novel speech recognition method based on vector quantization and improved particle swarm optimization (IPSO) is suggested. The suggested methodology contains four stages, namely, (i) denoising, (ii) feature mining, (iii) vector quantization, and (iv) IPSO based hidden Markov model (HMM) technique (IP-HMM). At first, the speech signals are denoised using a median filter. Next, characteristics such as peak, pitch spectrum, Mel frequency cepstral coefficients (MFCC), mean, standard deviation, and minimum and maximum of the signal are extracted from the denoised signal. Following that, to accomplish the training process, the extracted characteristics are given to genetic algorithm based codebook generation in vector quantization. The initial populations are created by selecting random code vectors from the training set for the codebooks for the genetic algorithm process and IP-HMM helps in doing the recognition. At this point the creativeness will be done in terms of one of the genetic operation crossovers. The proposed speech recognition technique offers 97.14% accuracy. PMID:25478588
Wissel, Tobias; Pfeiffer, Tim; Frysch, Robert; Knight, Robert T.; Chang, Edward F.; Hinrichs, Hermann; Rieger, Jochem W.; Rose, Georg
2013-01-01
Objective: Support Vector Machines (SVM) have developed into a gold standard for accurate classification in Brain-Computer-Interfaces (BCI). The choice of the most appropriate classifier for a particular application depends on several characteristics in addition to decoding accuracy. Here we investigate the implementation of Hidden Markov Models (HMM) for online BCIs and discuss strategies to improve their performance. Approach: We compare the SVM, serving as a reference, and HMMs for classifying discrete finger movements obtained from the Electrocorticograms of four subjects doing a finger tapping experiment. The classifier decisions are based on a subset of low-frequency time domain and high gamma oscillation features. Main results: We show that decoding optimization between the two approaches is due to the way features are extracted and selected and less dependent on the classifier. An additional gain in HMM performance of up to 6% was obtained by introducing model constraints. Comparable accuracies of up to 90% were achieved with both SVM and HMM with the high gamma cortical response providing the most important decoding information for both techniques. Significance: We discuss technical HMM characteristics and adaptations in the context of the presented data as well as for general BCI applications. Our findings suggest that HMMs and their characteristics are promising for efficient online brain-computer interfaces. PMID:24045504
Classifying movement behaviour in relation to environmental conditions using hidden Markov models.
Patterson, Toby A; Basson, Marinelle; Bravington, Mark V; Gunn, John S
2009-11-01
1. Linking the movement and behaviour of animals to their environment is a central problem in ecology. Through the use of electronic tagging and tracking (ETT), collection of in situ data from free-roaming animals is now commonplace, yet statistical approaches enabling direct relation of movement observations to environmental conditions are still in development. 2. In this study, we examine the hidden Markov model (HMM) for behavioural analysis of tracking data. HMMs allow for prediction of latent behavioural states while directly accounting for the serial dependence prevalent in ETT data. Updating the probability of behavioural switches with tag or remote-sensing data provides a statistical method that links environmental data to behaviour in a direct and integrated manner. 3. It is important to assess the reliability of state categorization over the range of time-series lengths typically collected from field instruments and when movement behaviours are similar between movement states. Simulation with varying lengths of time series data and contrast between average movements within each state was used to test the HMM's ability to estimate movement parameters. 4. To demonstrate the methods in a realistic setting, the HMMs were used to categorize resident and migratory phases and the relationship between movement behaviour and ocean temperature using electronic tagging data from southern bluefin tuna (Thunnus maccoyii). Diagnostic tools to evaluate the suitability of different models and inferential methods for investigating differences in behaviour between individuals are also demonstrated. PMID:19563470
Prestat, Emmanuel; David, Maude M.; Hultman, Jenni; Taş, Neslihan; Lamendella, Regina; Dvornik, Jill; Mackelprang, Rachel; Myrold, David D.; Jumpponen, Ari; Tringe, Susannah G.; et al
2014-09-26
A new functional gene database, FOAM (Functional Ontology Assignments for Metagenomes), was developed to screen environmental metagenomic sequence datasets. FOAM provides a new functional ontology dedicated to classify gene functions relevant to environmental microorganisms based on Hidden Markov Models (HMMs). Sets of aligned protein sequences (i.e. ‘profiles’) were tailored to a large group of target KEGG Orthologs (KOs) from which HMMs were trained. The alignments were checked and curated to make them specific to the targeted KO. Within this process, sequence profiles were enriched with the most abundant sequences available to maximize the yield of accurate classifier models. An associated functional ontology was built to describe the functional groups and hierarchy. FOAM allows the user to select the target search space before HMM-based comparison steps and to easily organize the results into different functional categories and subcategories. FOAM is publicly available at http://portal.nersc.gov/project/m1317/FOAM/.
Discovering short linear protein motif based on selective training of profile hidden Markov models.
Song, Tao; Gu, Hong
2015-07-21
Short linear motifs (SLiMs) in proteins are relatively conserved sequence patterns within disordered regions of proteins, typically 3-10 amino acids in length. They play an important role in mediating protein-protein interactions. Discovering SLiMs by computational methods has attracted more and more attention, and most such methods are based on regular expressions and profiles. In this paper, a de novo motif discovery method was proposed based on profile hidden Markov models (HMMs), which can not only provide the emission probabilities of amino acids in the defined positions of SLiMs, but also model the undefined positions. We adopted the ordered region masking and the relative local conservation (RLC) masking to improve the signal to noise ratio of the query sequences, while applying evolutionary weighting so that the sequences important in the evolutionary process receive more attention in the selective training of profile HMMs. The experimental results show that our method and the profile-based method returned different subsets within a SLiMs dataset, and the performance of the two approaches is equivalent on a more realistic discovery dataset. Profile HMM-based motif discovery methods complement the existing methods and provide another way for SLiMs analysis. PMID:25791288
Complex RNA Folding Kinetics Revealed by Single-Molecule FRET and Hidden Markov Models
2014-01-01
We have developed a hidden Markov model and optimization procedure for photon-based single-molecule FRET data, which takes into account the trace-dependent background intensities. This analysis technique reveals an unprecedented amount of detail in the folding kinetics of the Diels–Alderase ribozyme. We find a multitude of extended (low-FRET) and compact (high-FRET) states. Five states were consistently and independently identified in two FRET constructs and at three Mg2+ concentrations. Structures generally tend to become more compact upon addition of Mg2+. Some compact structures are observed to significantly depend on Mg2+ concentration, suggesting a tertiary fold stabilized by Mg2+ ions. One compact structure was observed to be Mg2+-independent, consistent with stabilization by tertiary Watson–Crick base pairing found in the folded Diels–Alderase structure. A hierarchy of time scales was discovered, including dynamics of 10 ms or faster, likely due to tertiary structure fluctuations, and slow dynamics on the seconds time scale, presumably associated with significant changes in secondary structure. The folding pathways proceed through a series of intermediate secondary structures. There exist both compact pathways and more complex ones, which display tertiary unfolding, then secondary refolding, and, subsequently, again tertiary refolding. PMID:24568646
Sim, SeungWoo; Kang, Seung-Ho; Lee, Sang-Hee
2015-02-01
Subterranean termites live underground and build tunnel networks to obtain food and nesting space. After obtaining food, termites return to their nests to transfer it. The efficiency of termite movement through the tunnels is directly connected to their survival. Tunnels should therefore be optimized to ensure highly efficient returns. An optimization factor that strongly affects movement efficiency is tunnel curvature. In the present study, we investigated traveling behavior in tunnels with different curvatures. We then characterized traveling behavior at the level of the individual using hidden Markov models (HMMs) constructed from the experimental data. To observe traveling behavior, we designed 5-cm long artificial tunnels that had different curvatures. The tunnels had widths (W) of 2, 3, or 4 mm, and the linear distances (D) between the two ends of the tunnels were 20, 30, 40, or 50 mm. High values of D indicate low curvature. We systematically observed the traveling behavior of Coptotermes formosanus shiraki and Reticulitermes speratus kyushuensis and measured the time (τ) required for a termite to pass through the tunnel. Using HMM models, we calculated τ for different tunnels and compared the results with the τ of real termites. We characterized the traveling behavior in terms of transition probability matrices (TPM) and emission probability matrices (EPM) of HMMs. We briefly discussed the construction of sinusoidal-like tunnels in relation to the energy required for termites to pass through tunnels and provided suggestions for the development of more sophisticated HMMs to better understand termite foraging behavior. PMID:25562190
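As an illustration of how a transition probability matrix (TPM) and an emission probability matrix (EPM) together generate behavior, the following is a minimal sketch in Python. It is not the authors' model: the two states, the two symbols, and every probability below are hypothetical values chosen only to show the mechanics of sampling a state sequence and the symbols it emits.

```python
import random


def sample_index(probs, rng):
    """Draw an index according to a discrete probability vector."""
    r = rng.random()
    cumulative = 0.0
    for i, p in enumerate(probs):
        cumulative += p
        if r < cumulative:
            return i
    return len(probs) - 1  # guard against floating-point rounding


def simulate_hmm(initial, tpm, epm, steps, seed=0):
    """Generate hidden states and emitted symbols from a discrete HMM."""
    rng = random.Random(seed)
    states, symbols = [], []
    state = sample_index(initial, rng)
    for _ in range(steps):
        states.append(state)
        symbols.append(sample_index(epm[state], rng))  # emit from current state
        state = sample_index(tpm[state], rng)          # move to the next state
    return states, symbols


# Hypothetical two-state model: state 0 = "walking", state 1 = "pausing".
# Hypothetical symbols: 0 = "moved forward", 1 = "stayed in place".
INITIAL = [1.0, 0.0]
TPM = [[0.9, 0.1],
       [0.5, 0.5]]
EPM = [[0.95, 0.05],
       [0.10, 0.90]]

states, symbols = simulate_hmm(INITIAL, TPM, EPM, steps=20)
print(states)
print(symbols)
```

Under these assumptions, a simulated passage time could be read off as the number of steps needed to accumulate a given count of "moved forward" symbols, which is one way simulated and observed values of τ might be compared.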
Griffin, William A; Li, Xun
2016-01-01
Sequential affect dynamics generated during the interaction of intimate dyads, such as married couples, are associated with a cascade of effects (some good and some bad) on each partner, close family members, and other social contacts. Although the effects are well documented, the probabilistic structures associated with micro-social processes connected to the varied outcomes remain enigmatic. Using extant data we developed a method of classifying and subsequently generating couple dynamics using a Hierarchical Dirichlet Process Hidden semi-Markov Model (HDP-HSMM). Our findings indicate that several key aspects of existing models of marital interaction are inadequate: affect state emissions and their durations, along with the expected variability differences between distressed and nondistressed couples, are present but highly nuanced; and most surprisingly, heterogeneity among highly satisfied couples necessitates that they be divided into subgroups. We review how this unsupervised learning technique generates plausible dyadic sequences that are sensitive to relationship quality and provide a natural mechanism for computational models of behavioral and affective micro-social processes. PMID:27187319
Protein modeling with hybrid Hidden Markov Model/Neural network architectures
Baldi, P.; Chauvin, Y.
1995-12-31
Hidden Markov Models (HMMs) are useful in a number of tasks in computational molecular biology, and in particular to model and align protein families. We argue that HMMs are somewhat optimal within a certain modeling hierarchy. Single first order HMMs, however, have two potential limitations: a large number of unstructured parameters, and a built-in inability to deal with long-range dependencies. Hybrid HMM/Neural Network (NN) architectures attempt to overcome these limitations. In hybrid HMM/NN, the HMM parameters are computed by a NN. This provides a reparametrization that allows for flexible control of model complexity, and incorporation of constraints. The approach is tested on the immunoglobulin family. A hybrid model is trained, and a multiple alignment derived, with less than a fourth of the number of parameters used with previous single HMMs. To capture dependencies, however, one must resort to a larger hybrid model class, where the data is modeled by multiple HMMs. The parameters of the HMMs, and their modulation as a function of input or context, are again calculated by a NN.
A Hidden Markov Model for avalanche forecasting on Chowkibal-Tangdhar road axis in Indian Himalayas
NASA Astrophysics Data System (ADS)
Joshi, Jagdish Chandra; Srivastava, Sunita
2014-12-01
A numerical avalanche prediction scheme using Hidden Markov Model (HMM) has been developed for Chowkibal-Tangdhar road axis in J&K, India. The model forecast is in the form of different levels of avalanche danger (no, low, medium, and high) with a lead time of two days. Snow and meteorological data (maximum temperature, minimum temperature, fresh snow, fresh snow duration, standing snow) of past 12 winters (1992-2008) have been used to derive the model input variables (average temperature, fresh snow in 24 hrs, snow fall intensity, standing snow, Snow Temperature Index (STI) of the top layer, and STI of buried layer). As in HMMs, there are two sequences: a state sequence and a state dependent observation sequence; in the present model, different levels of avalanche danger are considered as different states of the model and Avalanche Activity Index (AAI) of a day, derived from the model input variables, as an observation. Validation of the model with independent data of two winters (2008-2009, 2009-2010) gives 80% accuracy for both day-1 and day-2. Comparison of various forecasting quality measures and Heidke Skill Score of the HMM and the NN model indicate better forecasting skill of the HMM.
Song, Changyue; Liu, Kaibo; Zhang, Xi; Chen, Lili; Xian, Xiaochen
2016-07-01
Obstructive sleep apnea (OSA) syndrome is a common sleep disorder suffered by an increasing number of people worldwide. As an alternative to polysomnography (PSG) for OSA diagnosis, the automatic OSA detection methods used in the current practice mainly concentrate on feature extraction and classifier selection based on collected physiological signals. However, one common limitation in these methods is that the temporal dependence of signals is usually ignored, which may result in critical information loss for OSA diagnosis. In this study, we propose a novel OSA detection approach based on ECG signals by considering temporal dependence within segmented signals. A discriminative hidden Markov model (HMM) and corresponding parameter estimation algorithms are provided. In addition, subject-specific transition probabilities within the model are employed to characterize the subject-to-subject differences of potential OSA patients. To validate our approach, 70 recordings obtained from the Physionet Apnea-ECG database were used. Accuracies of 97.1% for per-recording classification and 86.2% for per-segment OSA detection with satisfactory sensitivity and specificity were achieved. Compared with other existing methods that simply ignore the temporal dependence of signals, the proposed HMM-based detection approach delivers more satisfactory detection performance and could be extended to other disease diagnosis applications. PMID:26560867
Jones, Jonathan-Lee; Essa, Ehab; Xie, Xianghua
2015-08-01
We present a novel method to segment the lymph vessel wall in confocal microscopy images using Optimal Surface Segmentation (OSS) and hidden Markov Models (HMM). OSS is used to perform a pre-segmentation on the images, to act as the initial state for the HMM. We utilize a steerable filter to determine edge based filters for both of these segmentations, and use these features to build Gaussian probability distributions for both the vessel walls and the background. From this we infer the emission probability for the HMM, and the transmission probability is learned using a Baum-Welch algorithm. We transform the segmentation problem into one of cost minimization, with each node in the graph corresponding to one state, and the weight for each node being defined using its emission probability. We define the inter-relations between neighboring nodes using the transmission probability. Having constructed the problem, it is solved using the Viterbi algorithm, allowing the vessel to be reconstructed. The optimal solution can be found in polynomial time. We present qualitative and quantitative analysis to show the performance of the proposed method. PMID:26736778
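The Viterbi step mentioned in this abstract can be sketched for a small discrete HMM as follows. This is a generic textbook version in log space, not the authors' graph-based segmentation: the two states ("background", "wall"), the two observation symbols ("weak edge", "strong edge"), and all probabilities are hypothetical, and the emissions are discrete rather than Gaussian to keep the example short.

```python
import math


def viterbi(observations, initial, transition, emission):
    """Return the most likely state path and its log-probability."""
    n_states = len(initial)

    def log(p):
        return math.log(p) if p > 0 else float("-inf")

    # score[s] = best log-probability of any path that ends in state s
    score = [log(initial[s]) + log(emission[s][observations[0]])
             for s in range(n_states)]
    backpointers = []

    for obs in observations[1:]:
        new_score, pointers = [], []
        for s in range(n_states):
            best_prev = max(range(n_states),
                            key=lambda p: score[p] + log(transition[p][s]))
            new_score.append(score[best_prev]
                             + log(transition[best_prev][s])
                             + log(emission[s][obs]))
            pointers.append(best_prev)
        score = new_score
        backpointers.append(pointers)

    # Trace back from the best final state
    last = max(range(n_states), key=lambda s: score[s])
    path = [last]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    path.reverse()
    return path, score[last]


# Hypothetical model: state 0 = background, state 1 = vessel wall.
# Hypothetical symbols: 0 = weak edge response, 1 = strong edge response.
INITIAL = [0.8, 0.2]
TRANSITION = [[0.7, 0.3],
              [0.4, 0.6]]
EMISSION = [[0.9, 0.1],
            [0.2, 0.8]]

path, log_prob = viterbi([0, 0, 1, 1, 0], INITIAL, TRANSITION, EMISSION)
print(path, log_prob)
```

The cost is proportional to the sequence length times the square of the number of states, which is consistent with the polynomial-time claim in the abstract.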
Luo, Yuxuan; Feng, Jianjiang; Xu, Miao; Zhou, Jie; Min, James K; Xiong, Guanglei
2015-08-01
Computed tomography angiography (CTA) allows for not only diagnosis of coronary artery disease (CAD) with high spatial resolution but also monitoring the remodeling of vessel walls in the progression of CAD. Alignment of coronary arteries in CTA images acquired at different times (with a 3-7 years interval) is required to visualize and analyze the geometric and structural changes quantitatively. Previous work in image registration primarily focused on large anatomical structures and leads to suboptimal results when applied to registration of coronary arteries. In this paper, we develop a novel method to directly align the straightened coronary arteries in the cylindrical coordinate system guided by the extracted centerlines. By using a Hidden Markov Model (HMM), image intensity information from CTA and geometric information of extracted coronary arteries are combined to align coronary arteries. After registration, the pathological features in two straightened coronary arteries can be directly visualized side by side by synchronizing the corresponding cross-sectional slices and circumferential rotation angles. By evaluating with manually labeled landmarks, the average distance error is 1.6 mm. PMID:26736676
Identifying spatiotemporal migration patterns of non-volcanic tremors using hidden Markov models
NASA Astrophysics Data System (ADS)
Zhuang, J.; Wang, T.; Obara, K.; Tsuruoka, H.
2015-12-01
Tremor activity has been recently detected in various tectonic areas worldwide, and is spatially segmented and temporally recurrent. We design a type of hidden Markov models (HMMs) to investigate this phenomenon, where each state represents a distinct segment of tremor sources. We systematically analyze the tremor data from the Tokai region in southwest Japan using this model and find that tremors in this region concentrate around several distinct centers. We find: (1) The system is classified into three classes, background (quiescent), quasi-quiescent, and active states; (2) The region can be separated into two subsystems, the southwest and northeast parts, with most of the active transitions being among the states in each subsystem and the other transitions mainly to the quiescent/quasi-quiescent states; and (3) Tremor activity lasts longer in the northeastern part than in the southwest part. The success of this analysis indicates the power of HMMs in revealing the underlying physical process that drives non-volcanic tremors. Figure: The migration pattern for the HMM with 8 states. Top panel: Observed distances with the center μi of each state overlaid as the red line and ±σi on the left-hand side of the panel in green lines; Middle panel: the tracked most likely state sequence of the 8-state HMM; Bottom panel: the estimated probability of the data being in each state, with blank representing the probability of being in State 1 (the null state).
Glas, Julia; Dümcke, Sebastian; Zacher, Benedikt; Poron, Don; Gagneur, Julien; Tresch, Achim
2016-03-18
Hidden Markov models (HMMs) have been extensively used to dissect the genome into functionally distinct regions using data such as RNA expression or DNA binding measurements. It is a challenge to disentangle processes occurring on complementary strands of the same genomic region. We present the double-stranded HMM (dsHMM), a model for the strand-specific analysis of genomic processes. We applied dsHMM to yeast using strand specific transcription data, nucleosome data, and protein binding data for a set of 11 factors associated with the regulation of transcription. The resulting annotation recovers the mRNA transcription cycle (initiation, elongation, termination) while correctly predicting strand-specificity and directionality of the transcription process. We find that pre-initiation complex formation is an essentially undirected process, giving rise to a large number of bidirectional promoters and to pervasive antisense transcription. Notably, 12% of all transcriptionally active positions showed simultaneous activity on both strands. Furthermore, dsHMM reveals that antisense transcription is specifically suppressed by Nrd1, a yeast termination factor. PMID:26578558
Automatic sleep staging based on ECG signals using hidden Markov models.
Ying Chen; Xin Zhu; Wenxi Chen
2015-08-01
This study is designed to investigate the feasibility of automatic sleep staging using features derived only from electrocardiography (ECG) signals. The study was carried out using the framework of hidden Markov models (HMMs). The mean and SD values of heart rates (HRs) computed from each 30-second epoch served as the features. The two feature sequences were first detrended by ensemble empirical mode decomposition (EEMD), formed as a two-dimensional feature vector, and then converted into code vectors by the vector quantization (VQ) method. The output VQ indexes were utilized to estimate parameters for HMMs. The proposed model was tested and evaluated on a group of healthy individuals using leave-one-out cross-validation. The automatic sleep staging results were compared with PSG estimated ones. Results showed accuracies of 82.2%, 76.0%, 76.1% and 85.5% for deep, light, REM and wake sleep, respectively. The findings proved that the HRs-based HMM approach is feasible for automatic sleep staging and can pave the way for developing a more efficient, robust, and simple sleep staging system suitable for home application. PMID:26736316
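The vector quantization step, in which each two-dimensional feature vector is replaced by the index of its nearest code vector, can be sketched as below. This is an illustration only: the codebook and the per-epoch features are hypothetical numbers, the EEMD detrending described in the abstract is omitted, and in practice the codebook would be learned from training data rather than written by hand.

```python
def quantize(vectors, codebook):
    """Map each feature vector to the index of its nearest code vector."""
    def squared_distance(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    return [min(range(len(codebook)),
                key=lambda k: squared_distance(v, codebook[k]))
            for v in vectors]


# Hypothetical codebook of (mean HR, SD of HR) pairs.
CODEBOOK = [(55.0, 2.0), (65.0, 4.0), (75.0, 8.0)]
# Hypothetical features, one pair per 30-second epoch.
EPOCH_FEATURES = [(56.0, 2.5), (74.0, 7.0), (66.0, 3.0), (54.0, 1.5)]

vq_indexes = quantize(EPOCH_FEATURES, CODEBOOK)
print(vq_indexes)
```

The resulting index sequence is the kind of discrete observation sequence from which the parameters of a discrete HMM can then be estimated.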
He, Zhiquan; Ma, Wenji; Zhang, Jingfen; Xu, Dong
2015-01-01
Protein structure Quality Assessment (QA) is an essential component in protein structure prediction and analysis. The relationship between protein sequence and structure often serves as a basis for protein structure QA. In this work, we developed a new Hidden Markov Model (HMM) to assess the compatibility of protein sequence and structure for capturing their complex relationship. More specifically, the emission of the HMM consists of protein local structures in angular space, secondary structures, and sequence profiles. This model has two capabilities: (1) encoding local structure of each position by jointly considering sequence and structure information, and (2) assigning a global score to estimate the overall quality of a predicted structure, as well as local scores to assess the quality of specific regions of a structure, which provides useful guidance for targeted structure refinement. We compared the HMM model to state-of-the-art single structure quality assessment methods OPUSCA, DFIRE, GOAP, and RW in protein structure selection. Computational results showed our new score HMM.Z can achieve better overall selection performance on the benchmark datasets. PMID:26221066
Hidden semi-Markov models reveal multiphasic movement of the endangered Florida panther.
van de Kerk, Madelon; Onorato, David P; Criffield, Marc A; Bolker, Benjamin M; Augustine, Ben C; McKinley, Scott A; Oli, Madan K
2015-03-01
Animals must move to find food and mates, and to avoid predators; movement thus influences survival and reproduction, and ultimately determines fitness. Precise description of movement and understanding of spatial and temporal patterns as well as relationships with intrinsic and extrinsic factors is important both for theoretical and applied reasons. We applied hidden semi-Markov models (HSMM) to hourly geographic positioning system (GPS) location data to understand movement patterns of the endangered Florida panther (Puma concolor coryi) and to discern factors influencing these patterns. Three distinct movement modes were identified: (1) Resting mode, characterized by short step lengths and turning angles around 180°; (2) Moderately active (or intermediate) mode, characterized by intermediate step lengths and variable turning angles; and (3) Traveling mode, characterized by long step lengths and turning angles around 0°. Males and females, and females with and without kittens, exhibited distinctly different movement patterns. Using the Viterbi algorithm, we show that differences in movement patterns of male and female Florida panthers were a consequence of sex-specific differences in diurnal patterns of state occupancy and sex-specific differences in state-specific movement parameters, whereas the differences between females with and without dependent kittens were caused solely by variation in state occupancy. Our study demonstrates the use of HSMM methodology to precisely describe movement and to dissect differences in movement patterns according to sex and reproductive status. PMID:25251870
Adaptive hidden Markov model with anomaly States for price manipulation detection.
Cao, Yi; Li, Yuhua; Coleman, Sonya; Belatreche, Ammar; McGinnity, Thomas Martin
2015-02-01
Price manipulation refers to the activities of those traders who use carefully designed trading behaviors to manually push up or down the underlying equity prices for making profits. With increasing volumes and frequency of trading, price manipulation can be extremely damaging to the proper functioning and integrity of capital markets. The existing literature focuses on either empirical studies of market abuse cases or analysis of particular manipulation types based on certain assumptions. Effective approaches for analyzing and detecting price manipulation in real time are yet to be developed. This paper proposes a novel approach, called adaptive hidden Markov model with anomaly states (AHMMAS) for modeling and detecting price manipulation activities. Together with wavelet transformations and gradients as the feature extraction methods, the AHMMAS model caters to price manipulation detection and basic manipulation type recognition. The evaluation experiments conducted on seven stock tick data from NASDAQ and the London Stock Exchange and 10 simulated stock prices by stochastic differential equation show that the proposed AHMMAS model can effectively detect price manipulation patterns and outperforms the selected benchmark models. PMID:25608293
NASA Astrophysics Data System (ADS)
Nishiura, Takanobu; Nakamura, Satoshi
2003-10-01
Humans communicate with each other through speech by focusing on the target speech among environmental sounds in real acoustic environments. We can easily identify the target sound from other environmental sounds. For hands-free speech recognition, the identification of the target speech from environmental sounds is imperative. This mechanism may also be important for a self-moving robot to sense the acoustic environments and communicate with humans. Therefore, this paper first proposes hidden Markov model (HMM)-based environmental sound source identification. Environmental sounds are modeled by three states of HMMs and evaluated using 92 kinds of environmental sounds. The identification accuracy was 95.4%. This paper also proposes a new HMM composition method that composes speech HMMs and an HMM of categorized environmental sounds for robust environmental sound-added speech recognition. As a result of the evaluation experiments, we confirmed that the proposed HMM composition outperforms the conventional HMM composition with speech HMMs and a noise (environmental sound) HMM trained using noise periods prior to the target speech in a captured signal. [Work supported by Ministry of Public Management, Home Affairs, Posts and Telecommunications of Japan.]
Capturing the state transitions of seizure-like events using Hidden Markov models.
Guirgis, Mirna; Serletis, Demitre; Carlen, Peter L; Bardakjian, Berj L
2011-01-01
The purpose of this study was to investigate the number of states present in the progression of a seizure-like event (SLE). Of particular interest is to determine if there are more than two clearly defined states, as this would suggest that there is a distinct state preceding an SLE. Whole-intact hippocampus from C57/BL mice was used to model epileptiform activity induced by the perfusion of a low Mg2+/high K+ solution while extracellular field potentials were recorded from CA3 pyramidal neurons. Hidden Markov models (HMM) were used to model the state transitions of the recorded SLEs by incorporating various features of the Hilbert transform into the training algorithm; specifically, 2- and 3-state HMMs were explored. Although the 2-state model was able to distinguish between SLE and nonSLE behavior, it provided no improvements compared to visual inspection alone. However, the 3-state model was able to capture two distinct nonSLE states that visual inspection failed to discriminate. Moreover, by developing an HMM based system, a priori knowledge of the state transitions was not required, making this an ideal platform for seizure prediction algorithms. PMID:22254742
Reverse engineering a social agent-based hidden markov model--visage.
Chen, Hung-Ching Justin; Goldberg, Mark; Magdon-Ismail, Malik; Wallace, William A
2008-12-01
We present a machine learning approach to discover the agent dynamics that drives the evolution of the social groups in a community. We set up the problem by introducing an agent-based hidden Markov model for the agent dynamics: an agent's actions are determined by micro-laws. Nonetheless, we learn the agent dynamics from the observed communications without knowing state transitions. Our approach is to identify the appropriate micro-laws corresponding to an identification of the appropriate parameters in the model. The model identification problem is then formulated as a mixed optimization problem. To solve the problem, we develop a multistage learning process for determining the group structure, the group evolution, and the micro-laws of a community based on the observed set of communications among actors, without knowing the semantic contents. Finally, to test the quality of our approximations and the feasibility of the approach, we present the results of extensive experiments on synthetic data as well as the results on real communities, such as Enron email and Movie newsgroups. Insight into agent dynamics helps us understand the driving forces behind social evolution. PMID:19145665
Characterization of the crawling activity of Caenorhabditis elegans using a Hidden Markov model.
Lee, Sang-Hee; Kang, Seung-Ho
2015-12-01
The locomotion behavior of Caenorhabditis elegans has been studied extensively to understand the respective roles of neural control and biomechanics as well as the interaction between them. Constructing a mathematical model is helpful to understand the locomotion behavior in various surrounding conditions that are difficult to realize in experiments. In this study, we built three hidden Markov models (HMMs) for the crawling behavior of C. elegans in a controlled environment with no chemical treatment and in a formaldehyde-treated environment (0.1 and 0.5 ppm). The organism's crawling activity was recorded using a digital camcorder for 20 min at a rate of 24 frames per second. All shape patterns were quantified by branch length similarity (BLS) entropy and classified into four groups using the self-organizing map (SOM). Comparison of the simulated behavior generated by HMMs and the actual crawling behavior demonstrated that the HMM coupled with the SOM was successful in characterizing the crawling behavior. In addition, we briefly discussed the possibility of using the HMM together with BLS entropy to develop bio-monitoring systems to determine water quality. PMID:26319806
Extracting duration information in a picture category decoding task using hidden Markov Models
Pfeiffer, Tim; Heinze, Nicolai; Frysch, Robert; Deouell, Leon Y; Schoenfeld, Mircea A; Knight, Robert T; Rose, Georg
2016-01-01
Objective: Adapting classifiers for the purpose of brain signal decoding is a major challenge in brain–computer-interface (BCI) research. In a previous study we showed in principle that hidden Markov models (HMM) are a suitable alternative to the well-studied static classifiers. However, since we investigated a rather straightforward task, advantages from modeling of the signal could not be assessed. Approach: Here, we investigate a more complex data set in order to find out to what extent HMMs, as a dynamic classifier, can provide useful additional information. We show for a visual decoding problem that besides category information, HMMs can simultaneously decode picture duration without an additional training required. This decoding is based on a strong correlation that we found between picture duration and the behavior of the Viterbi paths. Main results: Decoding accuracies of up to 80% could be obtained for category and duration decoding with a single classifier trained on category information only. Significance: The extraction of multiple types of information using a single classifier enables the processing of more complex problems, while preserving good training results even on small databases. Therefore, it provides a convenient framework for online real-life BCI utilizations. PMID:26859831
Rotation-invariant image retrieval using hidden Markov tree for remote sensing data
NASA Astrophysics Data System (ADS)
Miao, Congcong; Zhao, Yindi
2014-11-01
The rapid increase in quantity of available remote sensing data brought an urgent need for intelligent retrieval techniques for remote sensing images. As one of the basic visual characteristics and important information sources of remote sensing images, texture is widely used in the scheme of remote sensing image retrieval. Since many images or regions with identical texture features usually show the diversity of direction, the consideration of rotation-invariance in the description of texture features is of significance both theoretically and practically. To address these issues, we develop a rotation-invariant image retrieval method based on the texture features of remote sensing images. We use the steerable pyramid transform to get the multi-scale and multi-orientation representation of texture images. Then we employ the hidden Markov tree (HMT) model, which provides a good tool to describe texture feature, to capture the dependencies across scales and orientations, by which the statistical properties of the transform domain coefficients can be obtained. Utilizing the inherent tree structure of the HMT and its fast training and likelihood computation algorithms, we can extract the rotation-invariant features of texture images. Similarity between the query image and each candidate image in the database can be measured by computing the Kullback-Leibler distance between the corresponding models. We evaluate the retrieval effectiveness of the algorithm with Brodatz texture database and remote sensing images. The experimental results show that this method has satisfactory performance in image retrieval and less sensitivity to texture rotation.
NASA Astrophysics Data System (ADS)
Hamdi, Anis; Missaoui, Oualid; Frigui, Hichem; Gader, Paul
2010-04-01
We propose a landmine detection algorithm that uses ensemble discrete hidden Markov models with context dependent training schemes. We hypothesize that the data are generated by K models. These different models reflect the fact that mines and clutter objects have different characteristics depending on the mine type, soil and weather conditions, and burial depth. Model identification is based on clustering in the log-likelihood space. First, one HMM is fit to each of the N individual sequences. For each fitted model, we evaluate the log-likelihood of each sequence. This results in an N x N log-likelihood distance matrix that is partitioned into K groups. In the second step, we learn the parameters of one discrete HMM per group. We propose using and optimizing various training approaches for the different K groups depending on their size and homogeneity. In particular, we investigate the maximum likelihood and the MCE-based discriminative training approaches. Results on large and diverse Ground Penetrating Radar data collections show that the proposed method can identify meaningful and coherent HMM models that describe different properties of the data. Each HMM models a group of alarm signatures that share common attributes such as clutter, mine type, and burial depth. Our initial experiments have also indicated that the proposed mixture model outperforms the baseline HMM that uses one model for the mine and one model for the background.
High range resolution radar target identification using the Prony model and hidden Markov models
NASA Astrophysics Data System (ADS)
Dewitt, Mark R.
1992-12-01
Fully polarized Xpatch signatures are transformed to two left circularly polarized signals. These two signals are then filtered by a linear FM pulse compression ('chirp') transfer function, corrupted by AWGN, and filtered by a filter matched to the 'chirp' transfer function. The bandwidth of the 'chirp' radar is about 750 MHz. Range profile feature extraction is performed using the TLS Prony Model parameter estimation technique developed at Ohio State University. Using the Prony Model, each scattering center is described by a polarization ellipse, relative energy, frequency response, and range. This representation of the target is vector quantized using a K-means clustering algorithm. Sequences of vector quantized scattering centers as well as sequences of vector quantized range profiles are used to synthesize target specific Hidden Markov Models (HMM's). The identification decision is made by determining which HMM has the highest probability of generating the unknown sequence. The data consist of synthesized Xpatch signatures of two targets which have been difficult to separate with other RTI algorithms. The RTI algorithm developed is clearly able to separate these two targets over a 10 by 10 degree (1 degree granularity) aspect angle window off the nose for SNR's as low as 0 dB. The classification rate is 100 percent for SNR's of 5 - 20 dB, 95 percent for a SNR of 0 dB and it drops rapidly for SNR's lower than 0 dB.
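The decision rule described above (pick the target whose HMM is most likely to have generated the unknown sequence) can be sketched as follows. This is a minimal illustration, not the report's implementation: it assumes discrete (vector-quantized) observations, uses the scaled forward algorithm to evaluate each model, and the two models `target_A` and `target_B` with their parameters are made up for the example.

```python
import math

def log_likelihood(obs, start, trans, emit):
    """Scaled forward algorithm: log P(obs | model) for a discrete HMM."""
    n = len(start)
    # Forward variables for the first observation.
    alpha = [start[i] * emit[i][obs[0]] for i in range(n)]
    loglik = 0.0
    for t in range(len(obs)):
        if t > 0:
            # Propagate through the transition matrix, then weight by emission.
            alpha = [
                sum(alpha[i] * trans[i][j] for i in range(n)) * emit[j][obs[t]]
                for j in range(n)
            ]
        scale = sum(alpha)
        if scale == 0.0:
            return float("-inf")  # sequence impossible under this model
        loglik += math.log(scale)
        alpha = [a / scale for a in alpha]  # rescale to avoid underflow
    return loglik

def classify(obs, models):
    """Return the name of the model with the highest log-likelihood."""
    return max(models, key=lambda name: log_likelihood(obs, *models[name]))

# Hypothetical 2-state, 2-symbol models: (start, transition, emission).
models = {
    "target_A": ([0.5, 0.5], [[0.9, 0.1], [0.1, 0.9]], [[0.9, 0.1], [0.1, 0.9]]),
    "target_B": ([0.5, 0.5], [[0.5, 0.5], [0.5, 0.5]], [[0.5, 0.5], [0.5, 0.5]]),
}
sequence = [0, 0, 0, 0, 1, 1, 1, 1]  # vector-quantized codebook indices
label = classify(sequence, models)
```

In practice the per-target models would be trained (for instance with Baum-Welch) on sequences of vector-quantized scattering centers or range profiles; only the comparison step is shown here.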
Enhancing Speech Recognition Using Improved Particle Swarm Optimization Based Hidden Markov Model
Selvaraj, Lokesh; Ganesan, Balakrishnan
2014-01-01
Enhancing speech recognition is the primary intention of this work. In this paper a novel speech recognition method based on vector quantization and improved particle swarm optimization (IPSO) is suggested. The suggested methodology contains four stages, namely, (i) denoising, (ii) feature mining, (iii) vector quantization, and (iv) IPSO based hidden Markov model (HMM) technique (IP-HMM). At first, the speech signals are denoised using a median filter. Next, characteristics such as peak, pitch spectrum, Mel frequency Cepstral coefficients (MFCC), mean, standard deviation, and minimum and maximum of the signal are extracted from the denoised signal. Following that, to accomplish the training process, the extracted characteristics are given to genetic algorithm based codebook generation in vector quantization. The initial populations for the genetic algorithm process are created by selecting random code vectors from the training set for the codebooks, and IP-HMM performs the recognition. The novelty of the approach lies in one of the genetic operations, the crossover. The proposed speech recognition technique offers 97.14% accuracy. PMID:25478588
Identifying bubble collapse in a hydrothermal system using hidden Markov models
Dawson, P.B.; Benitez, M.C.; Lowenstern, J. B.; Chouet, B.A.
2012-01-01
Beginning in July 2003 and lasting through September 2003, the Norris Geyser Basin in Yellowstone National Park exhibited an unusual increase in ground temperature and hydrothermal activity. Using hidden Markov model theory, we identify over five million high-frequency (>15 Hz) seismic events observed at a temporary seismic station deployed in the basin in response to the increase in hydrothermal activity. The source of these seismic events is constrained to within ~100 m of the station, and produced ~3500-5500 events per hour with mean durations of ~0.35-0.45 s. The seismic event rate, air temperature, hydrologic temperatures, and surficial water flow of the geyser basin exhibited a marked diurnal pattern that was closely associated with solar thermal radiance. We interpret the source of the seismicity to be due to the collapse of small steam bubbles in the hydrothermal system, with the rate of collapse being controlled by surficial temperatures and daytime evaporation rates. Copyright 2012 by the American Geophysical Union.
NASA Astrophysics Data System (ADS)
Uğuz, Harun; Kodaz, Halife
Doppler ultrasound has usually been preferred for investigating artery conditions over the last two decades, since it is a non-invasive method that is not risky. In this study, a biomedical system based on the Discrete Hidden Markov Model (DHMM) has been developed in order to classify the internal carotid artery Doppler signals recorded from 191 subjects (136 of whom had internal carotid artery stenosis and the rest of whom were healthy). The developed system comprises three stages. In the first stage, for feature extraction, the obtained Doppler signals were decomposed into sub-bands using the Discrete Wavelet Transform (DWT). In the second stage, the entropy of each sub-band was calculated using the Shannon entropy algorithm to reduce the dimensionality of the feature vectors obtained via DWT. In the third stage, the reduced features of the carotid artery Doppler signals were used as input patterns of the DHMM classifier. Our proposed method reached 97.38% classification accuracy with the 5-fold cross validation (CV) technique. The classification results showed that the proposed method is effective for classification of internal carotid artery Doppler signals.
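The first two stages described above (wavelet sub-band decomposition followed by one entropy value per sub-band) can be sketched as below. This is an illustrative sketch under stated assumptions rather than the authors' pipeline: the abstract does not name the wavelet or the exact entropy definition, so a Haar wavelet and the Shannon entropy of the normalised coefficient energies are assumed, and the test signal is synthetic.

```python
import math

def haar_dwt(signal, levels):
    """Return [detail_1, ..., detail_L, approx_L] of a Haar wavelet transform."""
    bands = []
    approx = list(signal)
    for _ in range(levels):
        # Pair up neighbouring samples; a trailing odd sample is dropped.
        pairs = [(approx[i], approx[i + 1]) for i in range(0, len(approx) - 1, 2)]
        detail = [(a - b) / math.sqrt(2) for a, b in pairs]
        approx = [(a + b) / math.sqrt(2) for a, b in pairs]
        bands.append(detail)
    bands.append(approx)
    return bands

def shannon_entropy(coeffs):
    """Shannon entropy (bits) of the normalised energy distribution of coeffs."""
    energy = [c * c for c in coeffs]
    total = sum(energy)
    if total == 0.0:
        return 0.0
    probs = [e / total for e in energy if e > 0.0]
    return -sum(p * math.log2(p) for p in probs)

def entropy_features(signal, levels=3):
    """One entropy value per sub-band: a short feature vector for the signal."""
    return [shannon_entropy(band) for band in haar_dwt(signal, levels)]

# Synthetic stand-in for a Doppler signal (assumption, for illustration only).
signal = [math.sin(0.3 * n) + 0.5 * math.sin(1.7 * n) for n in range(64)]
features = entropy_features(signal, levels=3)
```

A 64-sample signal is reduced here to four numbers (three detail bands plus the final approximation band), which is the kind of dimensionality reduction the abstract refers to; the resulting vectors would still need to be quantized before being fed to a discrete HMM.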
Maaskola, Jonas; Rajewsky, Nikolaus
2014-01-01
We present a discriminative learning method for pattern discovery of binding sites in nucleic acid sequences based on hidden Markov models. Sets of positive and negative example sequences are mined for sequence motifs whose occurrence frequency varies between the sets. The method offers several objective functions, but we concentrate on mutual information of condition and motif occurrence. We perform a systematic comparison of our method and numerous published motif-finding tools. Our method achieves the highest motif discovery performance, while being faster than most published methods. We present case studies of data from various technologies, including ChIP-Seq, RIP-Chip and PAR-CLIP, of embryonic stem cell transcription factors and of RNA-binding proteins, demonstrating practicality and utility of the method. For the alternative splicing factor RBM10, our analysis finds motifs known to be splicing-relevant. The motif discovery method is implemented in the free software package Discrover. It is applicable to genome- and transcriptome-scale data, makes use of available repeat experiments and aside from binary contrasts also more complex data configurations can be utilized. PMID:25389269
Extracting duration information in a picture category decoding task using hidden Markov Models
NASA Astrophysics Data System (ADS)
Pfeiffer, Tim; Heinze, Nicolai; Frysch, Robert; Deouell, Leon Y.; Schoenfeld, Mircea A.; Knight, Robert T.; Rose, Georg
2016-04-01
Objective. Adapting classifiers for the purpose of brain signal decoding is a major challenge in brain-computer-interface (BCI) research. In a previous study we showed in principle that hidden Markov models (HMM) are a suitable alternative to the well-studied static classifiers. However, since we investigated a rather straightforward task, advantages from modeling of the signal could not be assessed. Approach. Here, we investigate a more complex data set in order to find out to what extent HMMs, as a dynamic classifier, can provide useful additional information. We show for a visual decoding problem that besides category information, HMMs can simultaneously decode picture duration without any additional training. This decoding is based on a strong correlation that we found between picture duration and the behavior of the Viterbi paths. Main results. Decoding accuracies of up to 80% could be obtained for category and duration decoding with a single classifier trained on category information only. Significance. The extraction of multiple types of information using a single classifier enables the processing of more complex problems, while preserving good training results even on small databases. Therefore, it provides a convenient framework for online real-life BCI utilizations.
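To make the idea of reading a duration off a Viterbi path concrete, here is a minimal sketch. It assumes a toy two-state discrete HMM (state 0 as "baseline", state 1 as "stimulus") with made-up parameters, and it uses the longest run of the stimulus state as the duration estimate; the abstract does not specify which property of the Viterbi path the authors used, so that choice is an assumption.

```python
import math

def viterbi(obs, start, trans, emit):
    """Most probable state path for a discrete HMM, computed in log space."""
    n = len(start)

    def log(x):
        return math.log(x) if x > 0.0 else float("-inf")

    score = [log(start[s]) + log(emit[s][obs[0]]) for s in range(n)]
    back = []  # back[t][j]: best predecessor of state j at step t + 1
    for o in obs[1:]:
        prev = score
        pointers, score = [], []
        for j in range(n):
            best = max(range(n), key=lambda i: prev[i] + log(trans[i][j]))
            pointers.append(best)
            score.append(prev[best] + log(trans[best][j]) + log(emit[j][o]))
        back.append(pointers)
    # Trace back from the best final state.
    state = max(range(n), key=lambda s: score[s])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    return path[::-1]

def longest_run(path, state):
    """Length of the longest consecutive run of `state` in the path."""
    best = current = 0
    for s in path:
        current = current + 1 if s == state else 0
        best = max(best, current)
    return best

# Hypothetical model: state 0 = baseline, state 1 = stimulus.
start = [0.9, 0.1]
trans = [[0.8, 0.2], [0.2, 0.8]]
emit = [[0.9, 0.1], [0.2, 0.8]]
obs = [0, 0, 1, 1, 1, 1, 0, 0]

path = viterbi(obs, start, trans, emit)
duration = longest_run(path, state=1)  # in time steps
```

With these parameters the decoded path follows the observations, so the stimulus state is occupied for four time steps; multiplying by the sampling interval would give a duration in seconds.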
Temporal structure analysis of broadcast tennis video using hidden Markov models
NASA Astrophysics Data System (ADS)
Kijak, Ewa; Oisel, Lionel; Gros, Patrick
2003-01-01
This work aims at recovering the temporal structure of a broadcast tennis video from an analysis of the raw footage. Our method relies on a statistical model of the interleaving of shots, in order to group shots into predefined classes representing structural elements of a tennis video. This stochastic modeling is performed in the global framework of Hidden Markov Models (HMMs). The fundamental units are shots and transitions. In a first step, color and motion attributes of segmented shots are used to map shots into 2 classes: game (view of the full tennis court) and not game (medium, close up views, and commercials). In a second step, a trained HMM is used to analyze the temporal interleaving of shots. This analysis results in the identification of more complex structures, such as first missed services, short rallies that could be aces or services, long rallies, breaks that signify the end of a game, and replays that highlight interesting points. These higher-level unit structures can be used either to create summaries, or to allow non-linear browsing of the video.
Segmentation of cone-beam CT using a hidden Markov random field with informative priors
NASA Astrophysics Data System (ADS)
Moores, M.; Hargrave, C.; Harden, F.; Mengersen, K.
2014-03-01
Cone-beam computed tomography (CBCT) has enormous potential to improve the accuracy of treatment delivery in image-guided radiotherapy (IGRT). To assist radiotherapists in interpreting these images, we use a Bayesian statistical model to label each voxel according to its tissue type. The rich sources of prior information in IGRT are incorporated into a hidden Markov random field model of the 3D image lattice. Tissue densities in the reference CT scan are estimated using inverse regression and then rescaled to approximate the corresponding CBCT intensity values. The treatment planning contours are combined with published studies of physiological variability to produce a spatial prior distribution for changes in the size, shape and position of the tumour volume and organs at risk. The voxel labels are estimated using iterated conditional modes. The accuracy of the method has been evaluated using 27 CBCT scans of an electron density phantom. The mean voxel-wise misclassification rate was 6.2%, with Dice similarity coefficient of 0.73 for liver, muscle, breast and adipose tissue. By incorporating prior information, we are able to successfully segment CBCT images. This could be a viable approach for automated, online image analysis in radiotherapy.
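The two evaluation figures quoted above, the voxel-wise misclassification rate and the Dice similarity coefficient, can be computed as in the following sketch. The label arrays are tiny made-up examples, not data from the study, and flat lists stand in for a 3D voxel lattice.

```python
def dice(pred, truth, label):
    """Dice similarity coefficient 2|A ∩ B| / (|A| + |B|) for one tissue label."""
    a = sum(1 for p in pred if p == label)
    b = sum(1 for t in truth if t == label)
    both = sum(1 for p, t in zip(pred, truth) if p == label and t == label)
    # If the label is absent from both volumes, treat the agreement as perfect.
    return 2.0 * both / (a + b) if a + b else 1.0

def misclassification_rate(pred, truth):
    """Fraction of voxels whose predicted label differs from the reference."""
    return sum(1 for p, t in zip(pred, truth) if p != t) / len(truth)

# Hypothetical flattened label volumes (reference vs. estimated segmentation).
truth = ["liver", "liver", "muscle", "muscle", "adipose", "adipose"]
pred = ["liver", "muscle", "muscle", "muscle", "adipose", "adipose"]

rate = misclassification_rate(pred, truth)
dice_liver = dice(pred, truth, "liver")
dice_muscle = dice(pred, truth, "muscle")
```

Here one of six voxels is mislabelled, and the Dice score is computed separately per tissue class; how the per-class scores were aggregated into the single reported value is not stated in the abstract.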
A hidden Markov model for investigating recent positive selection through haplotype structure.
Chen, Hua; Hey, Jody; Slatkin, Montgomery
2015-02-01
Recent positive selection can increase the frequency of an advantageous mutant rapidly enough that a relatively long ancestral haplotype remains intact around it. We present a hidden Markov model (HMM) to identify such haplotype structures. With HMM identified haplotype structures, a population genetic model for the extent of ancestral haplotypes is then adopted for parameter inference of the selection intensity and the allele age. Simulations show that this method can detect selection under a wide range of conditions and has higher power than the existing frequency spectrum-based method. In addition, it provides good estimates of the selection coefficients and allele ages for strong selection. The method analyzes large data sets in a reasonable amount of running time. This method is applied to HapMap III data for a genome scan, and identifies a list of candidate regions putatively under recent positive selection. It is also applied to several genes known to be under recent positive selection, including the LCT, KITLG and TYRP1 genes in Northern Europeans, and OCA2 in East Asians, to estimate their allele ages and selection coefficients. PMID:25446961
A Hidden Markov Model for Investigating Recent Positive Selection through Haplotype Structure
Hey, Jody; Slatkin, Montgomery
2014-01-01
Recent positive selection can increase the frequency of an advantageous mutant rapidly enough that a relatively long ancestral haplotype remains intact around it. We present a hidden Markov model (HMM) to identify such haplotype structures. With HMM identified haplotype structures, a population genetic model for the extent of ancestral haplotypes is then adopted for parameter inference of the selection intensity and the allele age. Simulations show that this method can detect selection under a wide range of conditions and has higher power than the existing frequency spectrum-based method. In addition, it provides good estimates of the selection coefficients and allele ages for strong selection. The method analyzes large data sets in a reasonable amount of running time. This method is applied to HapMap III data for a genome scan, and identifies a list of candidate regions putatively under recent positive selection. It is also applied to several genes known to be under recent positive selection, including the LCT, KITLG and TYRP1 genes in Northern Europeans, and OCA2 in East Asians, to estimate their allele ages and selection coefficients. PMID:25446961
CR image filter methods research based on wavelet-domain hidden markov models
NASA Astrophysics Data System (ADS)
Wang, Jun-li; Wang, Yun-peng; Li, Da-yi; Li, Shi-wu; Kui, Hai-lin
2006-01-01
In computed radiography (CR) imaging, one must first understand the characteristics of the different kinds of noise and the relationship between the image signal and the noise. Based on the particular features of CR images and of medical image processing, we have studied filtering methods for noise in CR images. From a detailed analysis of the CR imaging system, the authors conclude that the two major noise sources are Gaussian white noise and Poisson noise. The different relationships between these two kinds of noise and the signal were then studied. Considering both the characteristics of CR images and the statistical features of wavelet-transformed images, a multiscale image filtering algorithm based on a two-state hidden Markov model (HMM) and a Gaussian mixture statistical model is used to reduce the Gaussian white noise in computed radiography images. The EM (expectation maximization) algorithm is used to estimate the noise coefficients at each scale and to obtain the power spectrum matrix; two filters, an IIR (infinite impulse response) Wiener filter and the HMM, are then combined according to scale size. Finally, experiments and a comparison with other denoising methods are presented.
Hypovigilance Detection for UCAV Operators Based on a Hidden Markov Model
Kwon, Namyeon; Shin, Yongwook; Ryo, Chuh Yeop; Park, Jonghun
2014-01-01
With the advance of military technology, the number of unmanned combat aerial vehicles (UCAVs) has rapidly increased. However, it has been reported that the accident rate of UCAVs is much higher than that of manned combat aerial vehicles. One of the main reasons for the high accident rate of UCAVs is the hypovigilance problem which refers to the decrease in vigilance levels of UCAV operators while maneuvering. In this paper, we propose hypovigilance detection models for UCAV operators based on EEG signal to minimize the number of occurrences of hypovigilance. To enable detection, we have applied hidden Markov models (HMMs), two of which are used to indicate the operators' dual states, normal vigilance and hypovigilance, and, for each operator, the HMMs are trained as a detection model. To evaluate the efficacy and effectiveness of the proposed models, we conducted two experiments on the real-world data obtained by using EEG-signal acquisition devices, and they yielded satisfactory results. By utilizing the proposed detection models, the problem of hypovigilance of UCAV operators and the problem of high accident rate of UCAVs can be addressed. PMID:24963338
Gene recognition in cyanobacterium genomic sequence data using the hidden Markov model.
Yada, T; Hirosawa, M
1996-01-01
We have developed a hidden Markov model (HMM) to detect the protein coding regions within one megabase of contiguous sequence data, registered in a database called GenBank in eight entries, of the genome of the cyanobacterium Synechocystis sp. strain PCC6803. Detection of the coding regions in each database entry was performed by using an HMM whose parameters were determined by taking the statistics from the rest of the entries. This HMM has states modeling the di-codons and their frequencies within coding regions and those modeling the base contents in the intergenic regions. Results of the cross-validation showed that the HMM recognized 92.1% of coding regions assigned in sequence annotation. In addition, it suggested 94 potential new coding regions whose lengths are longer than 90 bases. The recognition accuracy calculated at the level of individual bases was 90.7% for the coding regions and 88.1% for the intergenic regions. This corresponds to a correlation coefficient for coding region recognition of 0.784. Comparison of its prediction accuracy with that of GeneMark showed that the HMM has the same level of prediction accuracy as GeneMark on average. Since we can extend the HMM to utilize information such as SD sequences, the prediction accuracy of the HMM will be enhanced. It was observed that correlation was positive between the prediction rate of the coding regions and the G + C content at the third position of the codon. This suggests the possibility that the prediction rate of coding regions in the cyanobacteria sequence can be enhanced by improving the present HMM into one that reflects the classification of coding regions based on the G + C content. PMID:8877525
Lin, Yen-Jen; Chen, Yu-Tin; Hsu, Shu-Ni; Peng, Chien-Hua; Tang, Chuan-Yi; Yen, Tzu-Chen; Hsieh, Wen-Ping
2014-01-01
Copy number variation (CNV) has been reported to be associated with disease and various cancers. Hence, identifying the accurate position and the type of CNV is currently a critical issue. There are many tools aimed at detecting CNV regions, constructing haplotype phases on CNV regions, or estimating the numerical copy numbers. However, none of them can do all three tasks at the same time. This paper presents a method based on a hidden Markov model to detect parent specific copy number change on both chromosomes with signals from SNP arrays. A haplotype tree is constructed with dynamic branch merging to model the transition of the copy number status of the two alleles assessed at each SNP locus. The emission models are constructed for the genotypes formed with the two haplotypes. The proposed method can provide the segmentation points of the CNV regions as well as the haplotype phasing for the allelic status on each chromosome. The estimated copy numbers are provided as fractional numbers, which can accommodate the somatic mutation in cancer specimens that usually consist of heterogeneous cell populations. The algorithm is evaluated on simulated data and the previously published regions of CNV of the 270 HapMap individuals. The results were compared with five popular methods: PennCNV, genoCN, COKGEN, QuantiSNP and cnvHap. The application on oral cancer samples demonstrates how the proposed method can facilitate clinical association studies. The proposed algorithm exhibits sensitivity for CNV regions comparable to that of the best algorithm in our genome-wide study and demonstrates the highest detection rate in SNP dense regions. In addition, we provide better haplotype phasing accuracy than similar approaches. The clinical association carried out with our fractional estimate of copy numbers in the cancer samples provides better detection power than that with integer copy number states. PMID:24849202
Modeling strategic use of human computer interfaces with novel hidden Markov models.
Mariano, Laura J; Poore, Joshua C; Krum, David M; Schwartz, Jana L; Coskren, William D; Jones, Eric M
2015-01-01
Immersive software tools are virtual environments designed to give their users an augmented view of real-world data and ways of manipulating that data. As virtual environments, every action users make while interacting with these tools can be carefully logged, as can the state of the software and the information it presents to the user, giving these actions context. This data provides a high-resolution lens through which dynamic cognitive and behavioral processes can be viewed. In this report, we describe new methods for the analysis and interpretation of such data, utilizing a novel implementation of the Beta Process Hidden Markov Model (BP-HMM) for analysis of software activity logs. We further report the results of a preliminary study designed to establish the validity of our modeling approach. A group of 20 participants were asked to play a simple computer game, instrumented to log every interaction with the interface. Participants had no previous experience with the game's functionality or rules, so the activity logs collected during their naïve interactions capture patterns of exploratory behavior and skill acquisition as they attempted to learn the rules of the game. Pre- and post-task questionnaires probed for self-reported styles of problem solving, as well as task engagement, difficulty, and workload. We jointly modeled the activity log sequences collected from all participants using the BP-HMM approach, identifying a global library of activity patterns representative of the collective behavior of all the participants. Analyses show systematic relationships between both pre- and post-task questionnaires, self-reported approaches to analytic problem solving, and metrics extracted from the BP-HMM decomposition. Overall, we find that this novel approach to decomposing unstructured behavioral data within software environments provides a sensible means for understanding how users learn to integrate software functionality for strategic task pursuit. PMID
Automated Detection and Classification of Rockfall Induced Seismic Signals with Hidden-Markov-Models
NASA Astrophysics Data System (ADS)
Zeckra, M.; Hovius, N.; Burtin, A.; Hammer, C.
2015-12-01
Originally introduced in speech recognition, Hidden Markov Models are applied in different research fields of pattern recognition. In seismology, this technique has recently been introduced to improve common detection algorithms, like STA/LTA ratio or cross-correlation methods. Although the technique has mainly been used for the monitoring of volcanic activity, this study is one of its first applications to seismic signals induced by geomorphologic processes. With an array of eight broadband seismometers deployed around the steep Illgraben catchment (Switzerland) with high-level erosion, we studied a sequence of landslides triggered over a period of several days in winter. A preliminary manual classification led us to identify three main seismic signal classes that were used as a start for the HMM automated detection and classification: (1) rockslide signal, including a failure source and the debris mobilization along the slope, (2) rockfall signal from the remobilization of debris along the unstable slope, and (3) single cracking signal from the affected cliff observed before the rockslide events. Besides the ability to classify the whole dataset automatically, the HMM approach reflects the origin and the interactions of the three signal classes, which helps us to understand this geomorphic crisis and the possible triggering mechanisms for slope processes. The temporal distribution of crack events (duration > 5 s, frequency band [2-8] Hz) follows an inverse Omori law, which points to the catastrophic behaviour of the failure mechanisms and is of interest for warning purposes in rockslide risk assessment. Thanks to a dense seismic array and independent weather observations in the landslide area, this dataset also provides information about the triggering mechanisms, which exhibit a tight link between rainfall and freezing level fluctuations.
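For context, the STA/LTA ratio mentioned above as a common baseline detector can be sketched in a few lines. This shows the conventional trigger that HMM-based classification is meant to improve on, not the HMM method of the study; the window lengths, threshold, and synthetic trace are assumptions chosen for illustration.

```python
def sta_lta(signal, n_sta, n_lta):
    """Ratio of short-term to long-term average energy at each sample.

    Both windows end at the current sample; samples before the first full
    long-term window are left at 0.0.
    """
    energy = [x * x for x in signal]
    ratios = [0.0] * len(signal)
    for i in range(n_lta - 1, len(signal)):
        sta = sum(energy[i - n_sta + 1 : i + 1]) / n_sta
        lta = sum(energy[i - n_lta + 1 : i + 1]) / n_lta
        ratios[i] = sta / lta if lta > 0.0 else 0.0
    return ratios

def detect(signal, n_sta, n_lta, threshold):
    """Indices of samples where the STA/LTA ratio exceeds the threshold."""
    ratios = sta_lta(signal, n_sta, n_lta)
    return [i for i, r in enumerate(ratios) if r > threshold]

# Synthetic trace: low-amplitude noise floor with a short burst at samples 50-54.
trace = [0.1] * 50 + [1.0] * 5 + [0.1] * 45
triggers = detect(trace, n_sta=5, n_lta=50, threshold=3.0)
```

A detector of this kind flags any energy increase regardless of its origin, which is why a subsequent classification step is needed to separate rockslide, rockfall, and cracking signals.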
Automatic detection of alpine rockslides in continuous seismic data using hidden Markov models
NASA Astrophysics Data System (ADS)
Dammeier, Franziska; Moore, Jeffrey R.; Hammer, Conny; Haslinger, Florian; Loew, Simon
2016-02-01
Data from continuously recording permanent seismic networks can contain information about rockslide occurrence and timing complementary to eyewitness observations and thus aid in construction of robust event catalogs. However, detecting infrequent rockslide signals within large volumes of continuous seismic waveform data remains challenging and often requires demanding manual intervention. We adapted an automatic classification method using hidden Markov models to detect rockslide signals in seismic data from two stations in central Switzerland. We first processed 21 known rockslides, with event volumes spanning 3 orders of magnitude and station event distances varying by 1 order of magnitude, which resulted in 13 and 19 successfully classified events at the two stations. Retraining the models to incorporate seismic noise from the day of the event improved the respective results to 16 and 19 successful classifications. The missed events generally had low signal-to-noise ratio and small to medium volumes. We then processed nearly 14 years of continuous seismic data from the same two stations to detect previously unknown events. After postprocessing, we classified 30 new events as rockslides, of which we could verify three through independent observation. In particular, the largest new event, with estimated volume of 500,000 m³, was not generally known within the Swiss landslide community, highlighting the importance of regional seismic data analysis even in densely populated mountainous regions. Our method can be easily implemented as part of existing earthquake monitoring systems, and with an average event detection rate of about two per month, manual verification would not significantly increase operational workload.
HHMMiR: efficient de novo prediction of microRNAs using hierarchical hidden Markov models
Kadri, Sabah; Hinman, Veronica; Benos, Panayiotis V
2009-01-01
Background. MicroRNAs (miRNAs) are small non-coding single-stranded RNAs (20–23 nts) that are known to act as post-transcriptional and translational regulators of gene expression. Although they were initially overlooked, their role in many important biological processes, such as development, cell differentiation, and cancer, has been established in recent times. In spite of their biological significance, the identification of miRNA genes in newly sequenced organisms is still based, to a large degree, on extensive use of evolutionary conservation, which is not always available. Results. We have developed HHMMiR, a novel approach for de novo miRNA hairpin prediction in the absence of evolutionary conservation. Our method implements a Hierarchical Hidden Markov Model (HHMM) that utilizes region-based structural as well as sequence information of miRNA precursors. We first established a template for the structure of a typical miRNA hairpin by summarizing data from publicly available databases. We then used this template to develop the HHMM topology. Conclusion. Our algorithm achieved average sensitivity of 84% and specificity of 88%, on 10-fold cross-validation of human miRNA precursor data. We also show that this model, trained on human sequences, works well on hairpins from other vertebrate as well as invertebrate species. Furthermore, the human trained model was able to correctly classify ~97% of plant miRNA precursors. The success of this approach in such a diverse set of species indicates that sequence conservation is not necessary for miRNA prediction. This may lead to efficient prediction of miRNA genes in virtually any organism. PMID:19208136
Sourty, Marion; Thoraval, Laurent; Roquet, Daniel; Armspach, Jean-Paul; Foucher, Jack; Blanc, Frédéric
2016-01-01
Exploring time-varying connectivity networks in neurodegenerative disorders is a recent field of research in functional MRI. Dementia with Lewy bodies (DLB) represents 20% of the neurodegenerative forms of dementia. Fluctuations of cognition and vigilance are the key symptoms of DLB. To date, no dynamic functional connectivity (DFC) investigations of this disorder have been performed. In this paper, we refer to the concept of connectivity state as a piecewise stationary configuration of functional connectivity between brain networks. From this concept, we propose a new method for group-level as well as for subject-level studies to compare and characterize connectivity state changes between a set of resting-state networks (RSNs). Dynamic Bayesian networks, statistical and graph theory-based models, enable one to learn dependencies between interacting state-based processes. Product hidden Markov models (PHMM), an instance of dynamic Bayesian networks, are introduced here to capture both statistical and temporal aspects of DFC of a set of RSNs. This analysis was based on sliding-window cross-correlations between seven RSNs extracted from a group independent component analysis performed on 20 healthy elderly subjects and 16 patients with DLB. Statistical models of DFC differed in patients compared to healthy subjects for the occipito-parieto-frontal network, the medial occipital network and the right fronto-parietal network. In addition, pairwise comparisons of DFC of RSNs revealed a decrease of dependency between these two visual networks (occipito-parieto-frontal and medial occipital networks) and the right fronto-parietal control network. The analysis of DFC state changes thus pointed out networks related to the cognitive functions that are known to be impaired in DLB: visual processing as well as attentional and executive functions. Besides this context, product HMM applied to RSNs cross-correlations offers a promising new approach to investigate structural and
Modeling strategic use of human computer interfaces with novel hidden Markov models
Mariano, Laura J.; Poore, Joshua C.; Krum, David M.; Schwartz, Jana L.; Coskren, William D.; Jones, Eric M.
2015-01-01
Immersive software tools are virtual environments designed to give their users an augmented view of real-world data and ways of manipulating that data. As virtual environments, every action users make while interacting with these tools can be carefully logged, as can the state of the software and the information it presents to the user, giving these actions context. This data provides a high-resolution lens through which dynamic cognitive and behavioral processes can be viewed. In this report, we describe new methods for the analysis and interpretation of such data, utilizing a novel implementation of the Beta Process Hidden Markov Model (BP-HMM) for analysis of software activity logs. We further report the results of a preliminary study designed to establish the validity of our modeling approach. A group of 20 participants were asked to play a simple computer game, instrumented to log every interaction with the interface. Participants had no previous experience with the game's functionality or rules, so the activity logs collected during their naïve interactions capture patterns of exploratory behavior and skill acquisition as they attempted to learn the rules of the game. Pre- and post-task questionnaires probed for self-reported styles of problem solving, as well as task engagement, difficulty, and workload. We jointly modeled the activity log sequences collected from all participants using the BP-HMM approach, identifying a global library of activity patterns representative of the collective behavior of all the participants. Analyses show systematic relationships between both pre- and post-task questionnaires, self-reported approaches to analytic problem solving, and metrics extracted from the BP-HMM decomposition. Overall, we find that this novel approach to decomposing unstructured behavioral data within software environments provides a sensible means for understanding how users learn to integrate software functionality for strategic task pursuit. PMID
Sourty, Marion; Thoraval, Laurent; Roquet, Daniel; Armspach, Jean-Paul; Foucher, Jack; Blanc, Frédéric
2016-01-01
Exploring time-varying connectivity networks in neurodegenerative disorders is a recent field of research in functional MRI. Dementia with Lewy bodies (DLB) represents 20% of the neurodegenerative forms of dementia. Fluctuations of cognition and vigilance are the key symptoms of DLB. To date, no dynamic functional connectivity (DFC) investigations of this disorder have been performed. In this paper, we refer to the concept of connectivity state as a piecewise stationary configuration of functional connectivity between brain networks. From this concept, we propose a new method for group-level as well as for subject-level studies to compare and characterize connectivity state changes between a set of resting-state networks (RSNs). Dynamic Bayesian networks, statistical and graph theory-based models, enable one to learn dependencies between interacting state-based processes. Product hidden Markov models (PHMM), an instance of dynamic Bayesian networks, are introduced here to capture both statistical and temporal aspects of DFC of a set of RSNs. This analysis was based on sliding-window cross-correlations between seven RSNs extracted from a group independent component analysis performed on 20 healthy elderly subjects and 16 patients with DLB. Statistical models of DFC differed in patients compared to healthy subjects for the occipito-parieto-frontal network, the medial occipital network and the right fronto-parietal network. In addition, pairwise comparisons of DFC of RSNs revealed a decrease of dependency between these two visual networks (occipito-parieto-frontal and medial occipital networks) and the right fronto-parietal control network. The analysis of DFC state changes thus pointed out networks related to the cognitive functions that are known to be impaired in DLB: visual processing as well as attentional and executive functions. Besides this context, product HMM applied to RSNs cross-correlations offers a promising new approach to investigate structural and
NASA Astrophysics Data System (ADS)
Suvorova, S.; Sun, L.; Melatos, A.; Moran, W.; Evans, R. J.
2016-06-01
Gravitational wave searches for continuous-wave signals from neutron stars are especially challenging when the star's spin frequency is unknown a priori from electromagnetic observations and wanders stochastically under the action of internal (e.g., superfluid or magnetospheric) or external (e.g., accretion) torques. It is shown that frequency tracking by hidden Markov model (HMM) methods can be combined with existing maximum likelihood coherent matched filters like the F-statistic to surmount some of the challenges raised by spin wandering. Specifically, it is found that, for an isolated, biaxial rotor whose spin frequency walks randomly, HMM tracking of the F-statistic output from coherent segments with duration $T_{\rm drift} = 10$ d over a total observation time of $T_{\rm obs} = 1$ yr can detect signals with wave strains $h_0 > 2 \times 10^{-26}$ at a noise level characteristic of the Advanced Laser Interferometer Gravitational Wave Observatory (Advanced LIGO). For a biaxial rotor with randomly walking spin in a binary orbit, whose orbital period and semimajor axis are known approximately from electromagnetic observations, HMM tracking of the Bessel-weighted F-statistic output can detect signals with $h_0 > 8 \times 10^{-26}$. An efficient, recursive HMM solver based on the Viterbi algorithm is demonstrated, which requires $\sim 10^{3}$ CPU hours for a typical, broadband (0.5-kHz) search for the low-mass x-ray binary Scorpius X-1, including generation of the relevant F-statistic input. In a "realistic" observational scenario, Viterbi tracking successfully detects 41 out of 50 synthetic signals without spin wandering in stage I of the Scorpius X-1 Mock Data Challenge convened by the LIGO Scientific Collaboration down to a wave strain of $h_0 = 1.1 \times 10^{-25}$, recovering the frequency with a root-mean-square accuracy of $\leq 4.3 \times 10^{-3}$ Hz.
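To make the Viterbi step mentioned in the abstract above concrete, here is a minimal log-domain sketch of the generic algorithm. It is not the authors' solver: the three "frequency bins", the random-walk transition matrix, and the emission values are all hypothetical, chosen only so that the hidden state may stay put or move to an adjacent bin at each step.

```python
import math

NEG_INF = float("-inf")

def viterbi(log_init, log_trans, log_emit):
    """Return (most probable state path, its log-probability).

    log_init[i]     : log P(state_0 = i)
    log_trans[i][j] : log P(state_t = j | state_{t-1} = i)
    log_emit[t][j]  : log P(observation_t | state_t = j)
    """
    n_states = len(log_init)
    # score[j] = best log-probability of any path that ends in state j
    score = [log_init[j] + log_emit[0][j] for j in range(n_states)]
    backptr = []
    for t in range(1, len(log_emit)):
        prev = score
        step_ptr, score = [], []
        for j in range(n_states):
            best_i = max(range(n_states), key=lambda i: prev[i] + log_trans[i][j])
            step_ptr.append(best_i)
            score.append(prev[best_i] + log_trans[best_i][j] + log_emit[t][j])
        backptr.append(step_ptr)
    # Trace back from the best final state
    state = max(range(n_states), key=lambda j: score[j])
    best_score = score[state]
    path = [state]
    for step_ptr in reversed(backptr):
        state = step_ptr[state]
        path.append(state)
    path.reverse()
    return path, best_score

def _log(p):
    return math.log(p) if p > 0 else NEG_INF

# Hypothetical toy problem: 3 frequency bins, jumps only to adjacent bins.
init = [1 / 3, 1 / 3, 1 / 3]
trans = [[0.5, 0.5, 0.0],
         [1 / 3, 1 / 3, 1 / 3],
         [0.0, 0.5, 0.5]]
emit = [[0.90, 0.05, 0.05],   # step 0 favours bin 0
        [0.05, 0.90, 0.05],   # step 1 favours bin 1
        [0.05, 0.05, 0.90]]   # step 2 favours bin 2

path, score = viterbi([_log(p) for p in init],
                      [[_log(p) for p in row] for row in trans],
                      [[_log(p) for p in row] for row in emit])
print(path, score)
```

Working in log space avoids numerical underflow on long sequences, which matters when the number of time steps is large.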
Evaluation of various feature extraction methods for landmine detection using hidden Markov models
NASA Astrophysics Data System (ADS)
Hamdi, Anis; Frigui, Hichem
2012-06-01
Hidden Markov Models (HMM) have proved to be effective for detecting buried land mines using data collected by a moving-vehicle-mounted ground penetrating radar (GPR). The general framework for an HMM-based landmine detector consists of building an HMM model for mine signatures and an HMM model for clutter signatures. A test alarm is assigned a confidence proportional to the probability of that alarm being generated by the mine model and inversely proportional to its probability in the clutter model. The HMM models are built based on features extracted from GPR training signatures. These features are expected to capture the salient properties of the 3-dimensional alarms in a compact representation. The baseline HMM framework for landmine detection is based on gradient features. It models the time-varying behavior of GPR signals, encoded using edge direction information, to compute the likelihood that a sequence of measurements is consistent with a buried landmine. In particular, the HMM mine model learns the hyperbolic shape associated with the signature of a buried mine by three states that correspond to the succession of an increasing edge, a flat edge, and a decreasing edge. Recently, for the same application, other features have been used with different classifiers. In particular, the Edge Histogram Descriptor (EHD) has been used within a K-nearest neighbor classifier. Another descriptor is based on Gabor features and has been used within a discrete HMM classifier. A third feature, which is closely related to the EHD, is the Bar histogram feature. This feature has been used within a Neural Networks classifier for handwritten word recognition. In this paper, we propose an evaluation of the HMM-based landmine detection framework with several feature extraction techniques. We adapt and evaluate the EHD, Gabor, Bar, and baseline gradient feature extraction methods. We compare the performance of these features using a large and diverse GPR data collection.
Witowski, Vitali; Foraita, Ronja; Pitsiladis, Yannis; Pigeot, Iris; Wirsik, Norman
2014-01-01
Introduction The use of accelerometers to objectively measure physical activity (PA) has become the preferred method in recent years. Traditionally, cutpoints are used to assign impulse counts recorded by the devices to sedentary and activity ranges. Here, hidden Markov models (HMM) are used to improve the cutpoint method to achieve a more accurate identification of the sequence of modes of PA. Methods 1,000 days of labeled accelerometer data have been simulated. For the simulated data the actual sedentary behavior and activity range of each count is known. The cutpoint method is compared with HMMs based on the Poisson distribution (HMM[Pois]), the generalized Poisson distribution (HMM[GenPois]) and the Gaussian distribution (HMM[Gauss]) with regard to misclassification rate (MCR), bout detection, detection of the number of activities performed during the day and runtime. Results The cutpoint method had an MCR of 11%, followed by HMM[Pois] with 8%, HMM[GenPois] with 3% and HMM[Gauss] having the best MCR with less than 2%. HMM[Gauss] detected the correct number of bouts in 12.8% of the days, HMM[GenPois] in 16.1%, HMM[Pois] and the cutpoint method in none. HMM[GenPois] identified the correct number of activities in 61.3% of the days, whereas HMM[Gauss] only in 26.8%. HMM[Pois] did not identify the correct number at all and seemed to overestimate the number of activities. Runtime varied between 0.01 seconds (cutpoint), 2.0 minutes (HMM[Gauss]) and 14.2 minutes (HMM[GenPois]). Conclusions Using simulated data, HMM-based methods were superior in activity classification when compared to the traditional cutpoint method and seem to be appropriate to model accelerometer data. Of the HMM-based methods, HMM[Gauss] seemed to be the most appropriate choice to assess real-life accelerometer data. PMID:25464514
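The baseline that the abstract above compares against can be sketched in a few lines: each count is assigned to an activity range by fixed thresholds, and the misclassification rate is the share of epochs whose assigned range differs from the true one. The cutpoint values and range labels below are hypothetical placeholders, not the ones used in the study.

```python
from bisect import bisect_right

# Hypothetical cutpoints (counts per epoch); real studies use validated,
# device-specific thresholds.
CUTPOINTS = [100, 2000, 5000]
LABELS = ["sedentary", "light", "moderate", "vigorous"]

def classify_counts(counts, cutpoints=CUTPOINTS, labels=LABELS):
    """Assign each count to the range delimited by the sorted cutpoints."""
    return [labels[bisect_right(cutpoints, c)] for c in counts]

def misclassification_rate(predicted, truth):
    """Fraction of epochs whose predicted range differs from the true one."""
    wrong = sum(p != t for p, t in zip(predicted, truth))
    return wrong / len(truth)

predicted = classify_counts([0, 99, 100, 2500, 9000])
print(predicted)
```

Because each epoch is classified in isolation, a single noisy count can flip the label; an HMM instead ties neighbouring epochs together through its transition probabilities.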
2013-01-01
Background Fungal pathogens cause devastating losses in economically important cereal crops by utilising pathogen proteins to infect host plants. Secreted pathogen proteins are referred to as effectors and have thus far been identified by selecting small, cysteine-rich peptides from the secretome despite increasing evidence that not all effectors share these attributes. Results We take advantage of the availability of sequenced fungal genomes and present an unbiased method for finding putative pathogen proteins and secreted effectors in a query genome via comparative hidden Markov model analyses followed by unsupervised protein clustering. Our method returns experimentally validated fungal effectors in Stagonospora nodorum and Fusarium oxysporum as well as the N-terminal Y/F/WxC-motif from the barley powdery mildew pathogen. Application to the cereal pathogen Fusarium graminearum reveals a secreted phosphorylcholine phosphatase that is characteristic of hemibiotrophic and necrotrophic cereal pathogens and shares an ancient selection process with bacterial plant pathogens. Three F. graminearum protein clusters are found with an enriched secretion signal. One of these putative effector clusters contains proteins that share a [SG]-P-C-[KR]-P sequence motif in the N-terminal and show features not commonly associated with fungal effectors. This motif is conserved in secreted pathogenic Fusarium proteins and a prime candidate for functional testing. Conclusions Our pipeline has successfully uncovered conservation patterns, putative effectors and motifs of fungal pathogens that would have been overlooked by existing approaches that identify effectors as small, secreted, cysteine-rich peptides. It can be applied to any pathogenic proteome data, such as microbial pathogen data of plants and other organisms. PMID:24252298
Development of a brain MRI-based hidden Markov model for dementia recognition
2013-01-01
Background Dementia is an age-related cognitive decline which is indicated by an early degeneration of cortical and sub-cortical structures. Characterizing those morphological changes can help to understand the disease development and contribute to early disease prediction and prevention. But modeling that can best capture brain structural variability and can be valid in both disease classification and interpretation is extremely challenging. The current study aimed to establish a computational approach for modeling the magnetic resonance imaging (MRI)-based structural complexity of the brain using the framework of hidden Markov models (HMMs) for dementia recognition. Methods Regularity dimension and semi-variogram were used to extract structural features of the brains, and a vector quantization (VQ) method was applied to convert extracted feature vectors to prototype vectors. The output VQ indices were then utilized to estimate parameters for HMMs. To validate its accuracy and robustness, experiments were carried out on individuals who were characterized as non-demented or as having mild Alzheimer's disease. Four HMMs were constructed based on the cohorts of non-demented young, middle-aged, elderly and demented elderly subjects separately. Classification was carried out using a data set including both non-demented and demented individuals with a wide age range. Results The proposed HMMs succeeded in recognizing individuals who have mild Alzheimer's disease and achieved a better classification accuracy compared to other related works using different classifiers. Results have shown the ability of the proposed modeling for recognition of early dementia. Conclusion The findings from this research will allow individual classification to support the early diagnosis and prediction of dementia. By using the brain MRI-based HMMs developed in our proposed research, it will be more efficient, robust and can be easily used by clinicians as a computer-aid tool for validating imaging bio
Kaya, Yılmaz
2015-09-01
This paper proposes a novel approach to detecting epileptic seizures using electroencephalography (EEG), one of the most common methods for the diagnosis of epilepsy, based on one-dimensional local binary pattern (1D-LBP) and grey relational analysis (GRA) methods. The main aim of this paper is to evaluate and validate this approach, a computer-based quantitative EEG analysis method built on grey systems and intended to support decision-makers. In this study, 1D-LBP, which utilizes all data points, was employed to extract features from raw EEG signals, and the Fisher score (FS) was employed to select the representative features, which can also be regarded as hidden patterns. GRA was then performed to classify EEG signals using these Fisher-scored features. The experimental results of the proposed approach, which was validated on a public dataset, showed that it has a high accuracy in identifying epileptic EEG signals. For various combinations of epileptic EEG, such as the A-E, B-E, C-E, D-E, and A-D clusters, accuracies of 100, 96, 100, 99.00 and 100% were achieved, respectively. Also, this work presents an attempt to develop a new general-purpose hidden pattern determination scheme, which can be utilized for different categories of time-varying signals. PMID:26206400
NASA Astrophysics Data System (ADS)
Yoo, Jiyoung; Kwon, Hyun-Han; So, Byung-Jin; Rajagopalan, Balaji; Kim, Tae-Woong
2015-04-01
This study proposed a hidden Markov chain model-based drought analysis (HMM-DA) tool to understand the beginning and ending of meteorological drought and to further characterize typhoon-induced drought busters (TDB) by exploring spatiotemporal drought patterns in South Korea. It was found that typhoons have played a dominant role in ending drought events (EDE) during the typhoon season (July-September) over the last four decades (1974-2013). The percentage of EDEs terminated by TDBs was about 43-90%, mainly along coastal regions in South Korea. Furthermore, the TDBs, mainly during summer, have a positive role in managing extreme droughts during the subsequent autumn and spring seasons. The HMM-DA models the temporal dependencies between drought states using a Markov chain, consequently capturing the dependencies between droughts and typhoons well and thus enabling better performance in modeling spatiotemporal drought attributes compared to traditional methods.
Isojunno, Saana; Miller, Patrick J O
2016-01-01
The biological consequences of behavioral responses to anthropogenic noise depend on context. We explore the links between individual motivation, condition, and external constraints in a concept model and illustrate the use of motivational-behavioral states as a means to quantify the biologically relevant effects of tagging. Behavioral states were estimated from multiple streams of data in a hidden Markov model and used to test the change in foraging effort and the change in energetic success or cost given the effort. The presence of a tag boat elicited a short-term reduction in time spent in foraging states but not for proxies for success or cost within foraging states. PMID:26610996
NASA Astrophysics Data System (ADS)
Tan, Wei Lun; Yusof, Fadhilah; Yusop, Zulkifli
2016-04-01
This study models the northeast monsoon rainfall with a homogeneous hidden Markov model (HMM) using 40 rainfall stations in Peninsular Malaysia for the period 1975 to 2008. An HMM with six hidden states was selected based on the Bayesian information criterion (BIC), and every hidden state has distinct rainfall characteristics. Three of the states were found to correspond to wet conditions, while the remaining three states were found to correspond to dry conditions. The six hidden states were found to correspond with the associated atmospheric composites. The relationships between El Niño-Southern Oscillation (ENSO) and the sea surface temperatures (SST) in the Pacific Ocean are found regarding interannual variability. The wet (dry) states were found to be well correlated with the Niño 3.4 index, which was used to characterize the intensity of an ENSO event. This model is able to assess the behaviour of the rainfall characteristics with the large-scale atmospheric circulation; the monsoon rainfall is well correlated with the El Niño-Southern Oscillation in Peninsular Malaysia.
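Selecting the number of hidden states by BIC, as described in the abstract above, amounts to fitting one model per candidate state count and keeping the one with the lowest value of BIC = -2 ln L + k ln n. The sketch below assumes the fitting has already been done; the log-likelihoods, parameter counts, and sample size are invented for illustration and are unrelated to the study's data.

```python
import math

def bic(log_likelihood, n_params, n_obs):
    """Bayesian information criterion: lower is better."""
    return -2.0 * log_likelihood + n_params * math.log(n_obs)

def select_by_bic(candidates, n_obs):
    """candidates maps a state count to (log-likelihood, free parameters)."""
    scores = {k: bic(ll, p, n_obs) for k, (ll, p) in candidates.items()}
    best = min(scores, key=scores.get)
    return best, scores

# Hypothetical fitted models: {number of states: (log-likelihood, parameters)}
candidates = {2: (-1000.0, 5), 3: (-950.0, 11), 4: (-948.0, 19)}
best, scores = select_by_bic(candidates, n_obs=500)
print(best, scores)
```

The penalty term grows with the parameter count, so a model with more states is kept only when its likelihood gain outweighs the added complexity.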
Profile Hidden Markov Models for the Detection of Viruses within Metagenomic Sequence Data
Skewes-Cox, Peter; Sharpton, Thomas J.; Pollard, Katherine S.; DeRisi, Joseph L.
2014-01-01
Rapid, sensitive, and specific virus detection is an important component of clinical diagnostics. Massively parallel sequencing enables new diagnostic opportunities that complement traditional serological and PCR based techniques. While massively parallel sequencing promises the benefits of being more comprehensive and less biased than traditional approaches, it presents new analytical challenges, especially with respect to detection of pathogen sequences in metagenomic contexts. To a first approximation, the initial detection of viruses can be achieved simply through alignment of sequence reads or assembled contigs to a reference database of pathogen genomes with tools such as BLAST. However, recognition of highly divergent viral sequences is problematic, and may be further complicated by the inherently high mutation rates of some viral types, especially RNA viruses. In these cases, increased sensitivity may be achieved by leveraging position-specific information during the alignment process. Here, we constructed HMMER3-compatible profile hidden Markov models (profile HMMs) from all the virally annotated proteins in RefSeq in an automated fashion using a custom-built bioinformatic pipeline. We then tested the ability of these viral profile HMMs (“vFams”) to accurately classify sequences as viral or non-viral. Cross-validation experiments with full-length gene sequences showed that the vFams were able to recall 91% of left-out viral test sequences without erroneously classifying any non-viral sequences into viral protein clusters. Thorough reanalysis of previously published metagenomic datasets with a set of the best-performing vFams showed that they were more sensitive than BLAST for detecting sequences originating from more distant relatives of known viruses. To facilitate the use of the vFams for rapid detection of remote viral homologs in metagenomic data, we provide two sets of vFams, comprising more than 4,000 vFams each, in the HMMER3 format. We also
NASA Astrophysics Data System (ADS)
Faisan, Sylvain; Thoraval, Laurent; Armspach, Jean-Paul; Heitz, Fabrice; Foucher, Jack
2005-04-01
This paper presents a novel, completely unsupervised fMRI brain mapping approach that addresses the three problems of hemodynamic response function (HRF) shape variability, neural event timing, and fMRI response linearity. To make it robust, the method takes into account spatial and temporal information directly into the core of the activation detection process. In practice, activation detection is formulated in terms of temporal alignment between the sequence of hemodynamic response onsets (HROs) detected in the fMRI signal at υ and in the spatial neighbourhood of υ, and the sequence of "off-on" transitions observed in the input blocked stimulation paradigm (when considering epoch-related fMRI data), or the sequence of stimuli of the event-based paradigm (when considering event-related fMRI data). This multiple event sequence alignment problem, which comes under multisensor data fusion, is solved within the probabilistic framework of hidden Markov multiple event sequence models (HMMESMs), a special class of hidden Markov models. Results obtained on real and synthetic data compete with those obtained with the popular statistical parametric mapping (SPM) approach, but without necessitating any prior definition of the expected activation patterns, the HMMESM mapping approach being completely unsupervised.
Stifter, Cynthia A.; Rovine, Michael
2016-01-01
The present longitudinal study examined mother-infant interaction during the administration of immunizations at two and six months of age using hidden Markov modeling, a time series approach that produces latent states to describe how mothers and infants work together to bring the infant to a soothed state. Results revealed a 4-state model for the dyadic responses to a two-month inoculation, whereas a 6-state model best described the dyadic process at six months. Two of the states at two months and three of the states at six months suggested a progression from high intensity crying to no crying with parents using vestibular and auditory soothing methods. The use of feeding and/or pacifying to soothe the infant characterized one two-month state and two six-month states. These data indicate that with maturation and experience, the mother-infant dyad is becoming more organized around the soothing interaction. Using hidden Markov modeling to describe individual differences, as well as normative processes, is also presented and discussed.
Ghil, M.; Kravtsov, S.; Robertson, A. W.; Smyth, P.
2008-10-14
This project was a continuation of previous work under DOE CCPP funding, in which we had developed a twin approach of probabilistic network (PN) models (sometimes called dynamic Bayesian networks) and intermediate-complexity coupled ocean-atmosphere models (ICMs) to identify the predictable modes of climate variability and to investigate their impacts on the regional scale. We had developed a family of PNs (similar to Hidden Markov Models) to simulate historical records of daily rainfall, and used them to downscale GCM seasonal predictions. Using an idealized atmospheric model, we had established a novel mechanism through which ocean-induced sea-surface temperature (SST) anomalies might influence large-scale atmospheric circulation patterns on interannual and longer time scales; we had found similar patterns in a hybrid coupled ocean-atmosphere-sea-ice model. The goal of this continuation project was to build on these ICM results and PN model development to address prediction of rainfall and temperature statistics at the local scale, associated with global climate variability and change, and to investigate the impact of the latter on coupled ocean-atmosphere modes. Our main results from the grant consist of extensive further development of the hidden Markov models for rainfall simulation and downscaling together with the development of associated software; new intermediate coupled models; a new methodology of inverse modeling for linking ICMs with observations and GCM results; and observational studies of decadal and multi-decadal natural climate results, informed by ICM results.
Yang, Sejung; Lee, Byung-Uk
2015-01-01
In certain image acquisition processes, such as fluorescence microscopy or astronomy, only a limited number of photons can be collected due to various physical constraints. The resulting images suffer from signal-dependent noise, which can be modeled as a Poisson distribution, and a low signal-to-noise ratio. However, the majority of research on noise reduction algorithms focuses on signal-independent Gaussian noise. In this paper, we model noise as a combination of Poisson and Gaussian probability distributions to construct a more accurate model and adopt the contourlet transform, which provides a sparse representation of the directional components in images. We also apply hidden Markov models with a framework that neatly describes the spatial and interscale dependencies which are the properties of transformation coefficients of natural images. In this paper, an effective denoising algorithm for Poisson-Gaussian noise is proposed using the contourlet transform, hidden Markov models and noise estimation in the transform domain. We supplement the algorithm with cycle spinning and Wiener filtering for further improvements. We finally show experimental results with simulations and fluorescence microscopy images which demonstrate the improved performance of the proposed approach. PMID:26352138
NASA Astrophysics Data System (ADS)
Young, Dylan
Particle tracking offers significant insight into the molecular mechanics that govern the behavior of living cells. The analysis of molecular trajectories that transition between different motive states, such as diffusive, driven and tethered modes, is of considerable importance, with even single trajectories containing significant amounts of information about a molecule's environment and its interactions with cellular structures such as the cell cytoskeleton, membrane or extracellular matrix. Hidden Markov models (HMM) have been widely adopted to perform the segmentation of such complex tracks; however, robust methods for failure detection are required when HMMs are applied to individual particle tracks and limited data sets. Here, we show that extensive analysis of hidden Markov model outputs using data derived from multi-state Brownian dynamics simulations can be used both for the optimization of likelihood models and to generate custom failure tests based on a modified Bayesian Information Criterion. In the first instance, these failure tests can be applied to assess the quality of the HMM results. In addition, they provide critical information for the successful design of particle tracking experiments where trajectories containing multiple mobile states are expected.
Benoit, Julia S; Chan, Wenyaw; Luo, Sheng; Yeh, Hung-Wen; Doody, Rachelle
2016-04-30
Understanding the dynamic disease process is vital in early detection, diagnosis, and measuring progression. Continuous-time Markov chain (CTMC) methods have been used to estimate state-change intensities but challenges arise when stages are potentially misclassified. We present an analytical likelihood approach where the hidden state is modeled as a three-state CTMC model allowing for some observed states to be possibly misclassified. Covariate effects of the hidden process and misclassification probabilities of the hidden state are estimated without information from a 'gold standard' as comparison. Parameter estimates are obtained using a modified expectation-maximization (EM) algorithm, and identifiability of CTMC estimation is addressed. Simulation studies and an application studying Alzheimer's disease caregiver stress-levels are presented. The method was highly sensitive to detecting true misclassification and did not falsely identify error in the absence of misclassification. In conclusion, we have developed a robust longitudinal method for analyzing categorical outcome data when classification of disease severity stage is uncertain and the purpose is to study the process' transition behavior without a gold standard. PMID:26782946
Super-Resolution Using Hidden Markov Model and Bayesian Detection Estimation Framework
NASA Astrophysics Data System (ADS)
Humblot, Fabrice; Mohammad-Djafari, Ali
2006-12-01
This paper presents a new method for super-resolution (SR) reconstruction of a high-resolution (HR) image from several low-resolution (LR) images. The HR image is assumed to be composed of homogeneous regions. Thus, the a priori distribution of the pixels is modeled by a finite mixture model (FMM) and a Potts Markov model (PMM) for the labels. The whole a priori model is then a hierarchical Markov model. The LR images are assumed to be obtained from the HR image by lowpass filtering, arbitrary translation, decimation, and finally corruption by random noise. The problem is then put in a Bayesian detection and estimation framework, and appropriate algorithms are developed based on Markov chain Monte Carlo (MCMC) Gibbs sampling. In the end, we have not only an estimate of the HR image but also an estimate of the classification labels, which leads to a segmentation result.
Fiske, Ian J.; Royle, J. Andrew; Gross, Kevin
2014-01-01
Ecologists and wildlife biologists increasingly use latent variable models to study patterns of species occurrence when detection is imperfect. These models have recently been generalized to accommodate both a more expansive description of state than simple presence or absence, and Markovian dynamics in the latent state over successive sampling seasons. In this paper, we write these multi-season, multi-state models as hidden Markov models to find both maximum likelihood estimates of model parameters and finite-sample estimators of the trajectory of the latent state over time. These estimators are especially useful for characterizing population trends in species of conservation concern. We also develop parametric bootstrap procedures that allow formal inference about latent trend. We examine model behavior through simulation, and we apply the model to data from the North American Amphibian Monitoring Program.
A Hybrid of Deep Network and Hidden Markov Model for MCI Identification with Resting-State fMRI
Suk, Heung-Il; Lee, Seong-Whan; Shen, Dinggang
2015-01-01
In this paper, we propose a novel method for modelling functional dynamics in resting-state fMRI (rs-fMRI) for Mild Cognitive Impairment (MCI) identification. Specifically, we devise a hybrid architecture by combining Deep Auto-Encoder (DAE) and Hidden Markov Model (HMM). The roles of DAE and HMM are, respectively, to discover hierarchical non-linear relations among features, by which we transform the original features into a lower dimension space, and to model dynamic characteristics inherent in rs-fMRI, i.e., internal state changes. By building a generative model with HMMs for each class individually, we estimate the data likelihood of a test subject as MCI or normal healthy control, based on which we identify the clinical label. In our experiments, we achieved the maximal accuracy of 81.08% with the proposed method, outperforming state-of-the-art methods in the literature. PMID:27054199
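The classification rule in the abstract above (one generative HMM per class, label chosen by the higher data likelihood) can be sketched with the scaled forward algorithm. This is a simplified illustration, not the authors' pipeline: it assumes discrete observation symbols rather than auto-encoder features, and the class names and all model parameters are hypothetical.

```python
import math

def log_likelihood(obs, init, trans, emit):
    """Scaled forward algorithm for a discrete-emission HMM.

    init[i]     : P(state_0 = i)
    trans[i][j] : P(state_t = j | state_{t-1} = i)
    emit[i][k]  : P(symbol k | state i)
    Returns log P(obs | model).
    """
    n = len(init)
    alpha = [init[i] * emit[i][obs[0]] for i in range(n)]
    total = 0.0
    for t in range(len(obs)):
        if t > 0:
            alpha = [sum(alpha[i] * trans[i][j] for i in range(n)) * emit[j][obs[t]]
                     for j in range(n)]
        scale = sum(alpha)          # P(obs_t | obs_0..t-1)
        total += math.log(scale)
        alpha = [a / scale for a in alpha]  # renormalise to avoid underflow
    return total

def classify(obs, models):
    """Pick the class whose HMM assigns the sequence the highest likelihood."""
    return max(models, key=lambda label: log_likelihood(obs, *models[label]))

# Hypothetical two-state models: class_a favours symbol 0, class_b symbol 1.
shared_init = [0.5, 0.5]
shared_trans = [[0.9, 0.1], [0.1, 0.9]]
models = {
    "class_a": (shared_init, shared_trans, [[0.8, 0.2], [0.6, 0.4]]),
    "class_b": (shared_init, shared_trans, [[0.2, 0.8], [0.4, 0.6]]),
}
print(classify([0, 0, 0, 0], models), classify([1, 1, 1, 1], models))
```

Comparing log-likelihoods directly presumes equal class priors; unequal priors would add a log-prior term to each score.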
NASA Astrophysics Data System (ADS)
Yusof, Fadhilah; Kane, Ibrahim Lawal; Yusop, Zulkifli
2015-02-01
Precarious circumstances related to rainfall events can be due to very intense rainfall or to rainfall that persists over a long period of time. Such events may exceed the capacity of sewer systems, resulting in landslides or flooding. One conventional way of measuring the risk associated with persistent rain is through studies of long-term persistence and volatility persistence. This work investigates the persistence level of Kuantan daily rainfall using a hybrid of the autoregressive fractionally integrated moving average (ARFIMA) model and the hidden Markov model (HMM). The results show that rainfall variability returns quickly to its usual level, which suggests that periods of extreme wetness may not last; hence relatively stable rainfall behavior is observed in Kuantan. This will enhance the understanding of the process for the successful development and implementation of water resource tools to assess engineering and environmental problems such as flood control.
Camproux, A C; Tufféry, P
2005-08-01
Understanding and predicting protein structures depend on the complexity and the accuracy of the models used to represent them. We have recently set up a Hidden Markov Model to optimally compress protein three-dimensional conformations into a one-dimensional series of letters of a structural alphabet. Such a model learns simultaneously the shape of representative structural letters describing the local conformation and the logic of their connections, i.e. the transition matrix between the letters. Here, we move one step further and report some evidence that such a model of protein local architecture also captures some accurate amino acid features. All the letters have specific and distinct amino acid distributions. Moreover, we show that words of amino acids can have significant propensities for some letters. Perspectives point towards the prediction of the series of letters describing the structure of a protein from its amino acid sequence. PMID:16040198
Kim, Jongin; Lee, Suh-Kyung; Lee, Boreom
2013-01-01
The purpose of this paper is to determine whether electroencephalography (EEG) can be used as a tool for hearing impairment tests such as hearing screening. For this study, we recorded EEG responses to two syllables, /a/ and /u/, in Korean from three subjects at Gwangju Institute of Science and Technology. The ultimate goal of this study is to classify speech sound data regardless of their size using EEG; however, as an initial stage of the study, we classified only two different speech syllables using a Gaussian hidden Markov model. The result of this study shows a possibility that EEG could be used for hearing screening and other diagnostic tools related to speech perception. PMID:24110681
NASA Astrophysics Data System (ADS)
Lisiecki, L. E.; Ahn, S.; Khider, D.; Lawrence, C.
2015-12-01
Stratigraphic alignment is the primary way in which long marine climate records are placed on a common age model. We previously presented a probabilistic pairwise alignment algorithm, HMM-Match, which uses hidden Markov models to estimate alignment uncertainty, and applied it to the alignment of benthic δ18O records to the "LR04" global benthic stack of Lisiecki and Raymo (2005) (Lin et al., 2014). However, since the LR04 stack is deterministic, the algorithm does not account for uncertainty in the stack. Here we address this limitation by developing a probabilistic stack, HMM-Stack. In this model the stack is a probabilistic inhomogeneous hidden Markov model, a.k.a. profile HMM. The HMM-Stack is represented by a probabilistic model that "emits" each of the input records (Durbin et al., 1998). The unknown parameters of this model are learned from a set of input records using the expectation maximization (EM) algorithm. Because the multiple alignment of these records is unknown and uncertain, the expected contribution of each input point to each point in the stack is determined probabilistically. For each time step in the HMM-Stack, δ18O values are described by a Gaussian probability distribution. Available δ18O records (N=180) are employed to estimate the mean and variance of δ18O at each time point. The mean of HMM-Stack follows the predicted pattern of glacial cycles with increased amplitude after the Pliocene-Pleistocene boundary and also larger and longer cycles after the mid-Pleistocene transition. Furthermore, the δ18O variance increases with age, producing a substantial loss in the signal-to-noise ratio. Not surprisingly, uncertainty in alignment and thus estimated age also increase substantially in the older portion of the stack.
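As a rough illustration of how a Gaussian mean and variance can be estimated when each input point contributes only probabilistically, the sketch below shows a responsibility-weighted update of the kind commonly used in the M-step of EM. It is an assumption that the stack parameters are updated in this general form; the function name and the toy numbers are hypothetical and not taken from the study.

```python
def weighted_gaussian_update(values, weights):
    """Weighted mean and variance for one time point.

    values  : observed values from the input records
    weights : expected contribution (posterior probability) of each value
              to this time point, as computed in an E-step
    """
    total = sum(weights)
    mean = sum(w * v for w, v in zip(weights, values)) / total
    var = sum(w * (v - mean) ** 2 for w, v in zip(weights, values)) / total
    return mean, var


# Toy values and weights, for illustration only.
mean, var = weighted_gaussian_update([3.2, 3.6, 4.1], [0.7, 0.2, 0.1])
print(mean, var)
```

Points that are unlikely to align to a given time step receive small weights and so barely influence that step's mean and variance.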
A F Pimentel, Marco; Santos, Mauro D; Springer, David B; Clifford, Gari D
2015-08-01
Accurate heart beat detection in signals acquired from intensive care unit (ICU) patients is necessary both for establishing normality and for detecting abnormal events. Detection is normally performed by analysing the electrocardiogram (ECG) signal, and alarms are triggered when parameters derived from this signal exceed preset or variable thresholds. However, due to noisy and missing data, these alarms are frequently deemed to be false positives, and therefore ignored by clinical staff. The fusion of features derived from other signals, such as the arterial blood pressure (ABP) or the photoplethysmogram (PPG), has the potential to reduce such false alarms. In order to leverage the highly correlated temporal nature of the physiological signals, a hidden semi-Markov model (HSMM) approach, which uses the intra- and inter-beat depolarization interval, was designed to detect heart beats in such data. Features based on the wavelet transform, signal gradient and signal quality indices were extracted from the ECG and ABP waveforms for use in the HSMM framework. The presented method achieved an overall score of 89.13% on the hidden/test data set provided by the Physionet/Computing in Cardiology Challenge 2014: Robust Detection of Heart Beats in Multimodal Data. PMID:26218536
NASA Astrophysics Data System (ADS)
Pal, I.; Robertson, A. W.; Lall, U.; Cane, M. A.
2013-12-01
A multiscale-modeling framework for daily rainfall is considered and diagnostic results are presented for an application to the winter season in Northwest India. The daily rainfall process is considered to follow a Hidden Markov Model (HMM), with the hidden states assumed to be an unknown random function of slowly varying climatic modulation of the winter jet stream and moisture transport dynamics. The data used are from 14 stations over the Satluj River basin in northwest India in winter (Dec-Jan-Feb-Mar). The period considered is 1977/78-2005/06. The HMM identifies four discrete weather states, which are used to describe daily rainfall variability over the study region. The first hidden state has low rainfall occurrence and intensity, the second has modest occurrence and low intensity, the third has high occurrence but low to modest intensity and the fourth has high frequency and intensity of daily rainfall. Each state was found to be associated with a distinct atmospheric circulation pattern, with States 3 and 4 characterized by a zonally oriented wave train extending across Eurasia between 20N-40N, identified with 'Western Disturbances'. State 1, by contrast, is characterized by a lack of synoptic wave activity. The occurrence of State 4 is strongly conditioned by the El Nino and Indian Ocean Dipole (IOD) phenomena in winter, which is demonstrated using large-scale correlation maps based on mean sea level pressure (MSLP) and sea surface temperature (SST). This suggests that there is a tendency of higher frequency of the wet days and intense Western Disturbances in winter during El Nino and positive IOD years. These findings, derived from daily rainfall station records, help clarify the sequence of Northern Hemisphere mid-latitude storms bringing winter rainfall over Northwest India, and their association with potentially predictable low frequency modes on seasonal time scales and longer.
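Once an HMM of daily rainfall has been fitted, one common way to assign each day to a weather state is Viterbi decoding, which finds the single most probable state sequence. The abstract does not say which decoding method was used, so the sketch below is an assumption. It also simplifies heavily: two states instead of four, and a single binary dry/wet observation per day instead of rainfall at 14 stations. All names and parameter values are hypothetical, and every probability is assumed to be strictly positive.

```python
import math


def viterbi(observations, initial, trans, emit):
    """Most probable hidden state sequence for a discrete-emission HMM.

    initial[s]  : P(first state = s)
    trans[r][s] : P(next state = s | current state = r)
    emit[s][o]  : P(observation = o | state = s)
    """
    n = len(initial)
    # score[s] = best log probability of any path ending in state s.
    score = [
        math.log(initial[s]) + math.log(emit[s][observations[0]])
        for s in range(n)
    ]
    back = []  # back[t][s] = best predecessor of state s at step t + 1
    for obs in observations[1:]:
        pointers, new_score = [], []
        for s in range(n):
            best_prev = max(range(n), key=lambda r: score[r] + math.log(trans[r][s]))
            pointers.append(best_prev)
            new_score.append(
                score[best_prev] + math.log(trans[best_prev][s]) + math.log(emit[s][obs])
            )
        back.append(pointers)
        score = new_score

    # Trace back from the best final state.
    state = max(range(n), key=lambda s: score[s])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path


# Toy example: state 0 is mostly dry, state 1 is mostly wet; 0 = dry day, 1 = wet day.
initial = [0.5, 0.5]
trans = [[0.8, 0.2], [0.2, 0.8]]
emit = [[0.9, 0.1], [0.1, 0.9]]
print(viterbi([0, 0, 1, 1], initial, trans, emit))
```

The decoded state sequence can then be compared against circulation fields or climate indices, in the spirit of the composites and correlation maps described above.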
NASA Astrophysics Data System (ADS)
Pal, Indrani; Robertson, Andrew W.; Lall, Upmanu; Cane, Mark A.
2015-02-01
A multiscale-modeling framework for daily rainfall is considered and diagnostic results are presented for an application to the winter season in Northwest India. The daily rainfall process is considered to follow a hidden Markov model (HMM), with the hidden states assumed to be an unknown random function of slowly varying climatic modulation of the winter jet stream and moisture transport dynamics. The data used are from 14 stations over the Satluj River basin in winter (December-January-February-March). The period considered is 1977/78-2005/06. The HMM identifies four discrete weather states, which are used to describe daily rainfall variability over the study region. Each state was found to be associated with a distinct atmospheric circulation pattern, with the driest and drier states, States 1 and 2 respectively, characterized by a lack of synoptic wave activity. In contrast, the wetter and wettest states, States 3 and 4 respectively, are characterized by a zonally oriented wave train extending across Eurasia between 20N and 40N, identified with 'western disturbances' (WD). The occurrence of State 4 is strongly conditioned by the El Nino and Indian Ocean Dipole (IOD) phenomena in winter, which is demonstrated using large-scale correlation maps based on mean sea level pressure and sea surface temperature. This suggests that there is a tendency of higher frequency of the wet days and intense WD activities in winter during El Nino and positive IOD years. These findings, derived from daily rainfall station records, help clarify the sequence of Northern Hemisphere mid-latitude storms bringing winter rainfall over Northwest India, and their association with potentially predictable low frequency modes on seasonal time scales and longer.
Regentova, Emma; Zhang, Lei; Zheng, Jun; Veni, Gopalkrishna
2007-06-01
In this paper we investigate the performance of statistical modeling of digital mammograms by means of wavelet domain hidden Markov trees for inclusion in a computer-aided diagnostic prompting system. The system is designed for detecting clusters of microcalcifications. Their further discrimination as benign or malignant is to be done by radiologists. The model is used for segmenting images based on the maximum likelihood classifier enhanced by the weighting technique. Further classification incorporates spatial filtering for single microcalcification (MC) and microcalcification cluster (MCC) detection. Contrast filtering applied to the digital database for screening mammography (DDSM) dataset prior to spatial filtering greatly improves the classification accuracy. For all MC clusters of 40 mammograms from the mini-MIAS database of the Mammographic Image Analysis Society, 92.5%-100% of true positive cases can be detected under 2-3 false positives per image. For 150 DDSM cases, the designed system is capable of detecting up to 98% of true positives under 3.3% of false positive cases. PMID:17654922
Parida, Bikram K; Panda, Prasanna K; Misra, Namrata; Mishra, Barada K
2015-02-01
Modeling the three-dimensional (3D) structures of proteins assumes great significance because of its manifold applications in biomolecular research. Toward this goal, we present MaxMod, a graphical user interface (GUI) of the MODELLER program that combines the profile hidden Markov model (profile HMM) method with the Clustal Omega program to significantly improve the selection of homologous templates and target-template alignment for construction of accurate 3D protein models. MaxMod distinguishes itself from other existing GUIs of MODELLER software by implementing effortless modeling of proteins using templates that bear modified residues. Additionally, it provides various features such as loop optimization, express modeling (a feature where a protein model can be generated directly from its sequence, without any further user intervention) and automatic update of the PDB database, thus enhancing the user-friendly control of computational tasks. We find that HMM-based MaxMod performs better than other modeling packages in terms of execution time and model quality. MaxMod is freely available as a downloadable standalone tool for academic and non-commercial purposes at http://www.immt.res.in/maxmod/. PMID:25636267
NASA Astrophysics Data System (ADS)
Luk, B. L.; Liu, K. P.; Tong, F.; Man, K. F.
2010-05-01
The impact-acoustics method utilizes different information contained in the acoustic signals generated by tapping a structure with a small metal object. It offers a convenient and cost-efficient way to inspect tile-wall bonding integrity. However, surface irregularities cause abnormal multiple bounces in practical inspections. The spectral characteristics of those bounces can easily be confused with the signals obtained from different bonding qualities. As a result, they degrade the classic feature-based classification methods that operate in the frequency domain. Another crucial difficulty in practice is the additive noise present in real environments, which may also cause feature mismatch and false judgment. To solve this problem, the work described in this paper aims to develop a robust inspection method that applies a model-based strategy and utilizes wavelet domain features with hidden Markov modeling. It derives a bonding integrity recognition approach with enhanced immunity to surface roughness as well as to environmental noise. With the help of specially designed artificial sample slabs, experiments have been carried out with impact acoustic signals contaminated by real environmental noise acquired under practical inspection conditions. The results are compared with those obtained using the classic method to demonstrate the effectiveness of the proposed method.
Taborri, Juri; Scalona, Emilia; Palermo, Eduardo; Rossi, Stefano; Cappa, Paolo
2015-01-01
Gait-phase recognition is a necessary functionality to drive robotic rehabilitation devices for lower limbs. Hidden Markov Models (HMMs) represent a viable solution, but they need subject-specific training, making data processing very time-consuming. Here, we validated an inter-subject procedure to avoid the intra-subject one in two, four and six gait-phase models in pediatric subjects. The inter-subject procedure consists in the identification of a standardized parameter set to adapt the model to measurements. We tested the inter-subject procedure both on scalar and distributed classifiers. Ten healthy children and ten hemiplegic children, each equipped with two Inertial Measurement Units placed on shank and foot, were recruited. The sagittal component of angular velocity was recorded by gyroscopes while subjects performed four walking trials on a treadmill. The goodness of classifiers was evaluated with the Receiver Operating Characteristic. The results provided a goodness from good to optimum for all examined classifiers (0 < G < 0.6), with the best performance for the distributed classifier in two-phase recognition (G = 0.02). Differences were found among gait partitioning models, while no differences were found between training procedures with the exception of the shank classifier. Our results raise the possibility of avoiding subject-specific training in HMM for gait-phase recognition and its implementation to control exoskeletons for the pediatric population. PMID:26404309