-
Model-based clustering with Hidden Markov Model regression for time series with regime changes
Authors:
Faicel Chamroukhi,
Allou Samé,
Patrice Aknin,
Gérard Govaert
Abstract:
This paper introduces a novel model-based clustering approach for clustering time series which present changes in regime. It consists of a mixture of polynomial regressions governed by hidden Markov chains. The underlying hidden process for each cluster activates successively several polynomial regimes during time. The parameter estimation is performed by the maximum likelihood method through a de…
▽ More
This paper introduces a novel model-based clustering approach for clustering time series which present changes in regime. It consists of a mixture of polynomial regressions governed by hidden Markov chains. The underlying hidden process for each cluster activates successively several polynomial regimes during time. The parameter estimation is performed by the maximum likelihood method through a dedicated Expectation-Maximization (EM) algorithm. The proposed approach is evaluated using simulated time series and real-world time series issued from a railway diagnosis application. Comparisons with existing approaches for time series clustering, including the stand EM for Gaussian mixtures, $K$-means clustering, the standard mixture of regression models and mixture of Hidden Markov Models, demonstrate the effectiveness of the proposed approach.
△ Less
Submitted 25 December, 2013;
originally announced December 2013.
-
A regression model with a hidden logistic process for feature extraction from time series
Authors:
Faicel Chamroukhi,
Allou Samé,
Gérard Govaert,
Patrice Aknin
Abstract:
A new approach for feature extraction from time series is proposed in this paper. This approach consists of a specific regression model incorporating a discrete hidden logistic process. The model parameters are estimated by the maximum likelihood method performed by a dedicated Expectation Maximization (EM) algorithm. The parameters of the hidden logistic process, in the inner loop of the EM algor…
▽ More
A new approach for feature extraction from time series is proposed in this paper. This approach consists of a specific regression model incorporating a discrete hidden logistic process. The model parameters are estimated by the maximum likelihood method performed by a dedicated Expectation Maximization (EM) algorithm. The parameters of the hidden logistic process, in the inner loop of the EM algorithm, are estimated using a multi-class Iterative Reweighted Least-Squares (IRLS) algorithm. A piecewise regression algorithm and its iterative variant have also been considered for comparisons. An experimental study using simulated and real data reveals good performances of the proposed approach.
△ Less
Submitted 25 December, 2013;
originally announced December 2013.
-
A regression model with a hidden logistic process for signal parametrization
Authors:
Faicel Chamroukhi,
Allou Samé,
Gérard Govaert,
Patrice Aknin
Abstract:
A new approach for signal parametrization, which consists of a specific regression model incorporating a discrete hidden logistic process, is proposed. The model parameters are estimated by the maximum likelihood method performed by a dedicated Expectation Maximization (EM) algorithm. The parameters of the hidden logistic process, in the inner loop of the EM algorithm, are estimated using a multi-…
▽ More
A new approach for signal parametrization, which consists of a specific regression model incorporating a discrete hidden logistic process, is proposed. The model parameters are estimated by the maximum likelihood method performed by a dedicated Expectation Maximization (EM) algorithm. The parameters of the hidden logistic process, in the inner loop of the EM algorithm, are estimated using a multi-class Iterative Reweighted Least-Squares (IRLS) algorithm. An experimental study using simulated and real data reveals good performances of the proposed approach.
△ Less
Submitted 25 December, 2013;
originally announced December 2013.
-
Modèle à processus latent et algorithme EM pour la régression non linéaire
Authors:
Faicel Chamroukhi,
Allou Samé,
Gérard Govaert,
Patrice Aknin
Abstract:
A non linear regression approach which consists of a specific regression model incorporating a latent process, allowing various polynomial regression models to be activated preferentially and smoothly, is introduced in this paper. The model parameters are estimated by maximum likelihood performed via a dedicated expecation-maximization (EM) algorithm. An experimental study using simulated and real…
▽ More
A non linear regression approach which consists of a specific regression model incorporating a latent process, allowing various polynomial regression models to be activated preferentially and smoothly, is introduced in this paper. The model parameters are estimated by maximum likelihood performed via a dedicated expecation-maximization (EM) algorithm. An experimental study using simulated and real data sets reveals good performances of the proposed approach.
△ Less
Submitted 25 December, 2013;
originally announced December 2013.
-
Time series modeling by a regression approach based on a latent process
Authors:
Faicel Chamroukhi,
Allou Samé,
Gérard Govaert,
Patrice Aknin
Abstract:
Time series are used in many domains including finance, engineering, economics and bioinformatics generally to represent the change of a measurement over time. Modeling techniques may then be used to give a synthetic representation of such data. A new approach for time series modeling is proposed in this paper. It consists of a regression model incorporating a discrete hidden logistic process allo…
▽ More
Time series are used in many domains including finance, engineering, economics and bioinformatics generally to represent the change of a measurement over time. Modeling techniques may then be used to give a synthetic representation of such data. A new approach for time series modeling is proposed in this paper. It consists of a regression model incorporating a discrete hidden logistic process allowing for activating smoothly or abruptly different polynomial regression models. The model parameters are estimated by the maximum likelihood method performed by a dedicated Expectation Maximization (EM) algorithm. The M step of the EM algorithm uses a multi-class Iterative Reweighted Least-Squares (IRLS) algorithm to estimate the hidden process parameters. To evaluate the proposed approach, an experimental study on simulated data and real world data was performed using two alternative approaches: a heteroskedastic piecewise regression model using a global optimization algorithm based on dynamic programming, and a Hidden Markov Regression Model whose parameters are estimated by the Baum-Welch algorithm. Finally, in the context of the remote monitoring of components of the French railway infrastructure, and more particularly the switch mechanism, the proposed approach has been applied to modeling and classifying time series representing the condition measurements acquired during switch operations.
△ Less
Submitted 25 December, 2013;
originally announced December 2013.
-
A hidden process regression model for functional data description. Application to curve discrimination
Authors:
Faicel Chamroukhi,
Allou Samé,
Gérard Govaert,
Patrice Aknin
Abstract:
A new approach for functional data description is proposed in this paper. It consists of a regression model with a discrete hidden logistic process which is adapted for modeling curves with abrupt or smooth regime changes. The model parameters are estimated in a maximum likelihood framework through a dedicated Expectation Maximization (EM) algorithm. From the proposed generative model, a curve dis…
▽ More
A new approach for functional data description is proposed in this paper. It consists of a regression model with a discrete hidden logistic process which is adapted for modeling curves with abrupt or smooth regime changes. The model parameters are estimated in a maximum likelihood framework through a dedicated Expectation Maximization (EM) algorithm. From the proposed generative model, a curve discrimination rule is derived using the Maximum A Posteriori rule. The proposed model is evaluated using simulated curves and real world curves acquired during railway switch operations, by performing comparisons with the piecewise regression approach in terms of curve modeling and classification.
△ Less
Submitted 25 December, 2013;
originally announced December 2013.
-
Model-based clustering and segmentation of time series with changes in regime
Authors:
Allou Samé,
Faicel Chamroukhi,
Gérard Govaert,
Patrice Aknin
Abstract:
Mixture model-based clustering, usually applied to multidimensional data, has become a popular approach in many data analysis problems, both for its good statistical properties and for the simplicity of implementation of the Expectation-Maximization (EM) algorithm. Within the context of a railway application, this paper introduces a novel mixture model for dealing with time series that are subject…
▽ More
Mixture model-based clustering, usually applied to multidimensional data, has become a popular approach in many data analysis problems, both for its good statistical properties and for the simplicity of implementation of the Expectation-Maximization (EM) algorithm. Within the context of a railway application, this paper introduces a novel mixture model for dealing with time series that are subject to changes in regime. The proposed approach consists in modeling each cluster by a regression model in which the polynomial coefficients vary according to a discrete hidden process. In particular, this approach makes use of logistic functions to model the (smooth or abrupt) transitions between regimes. The model parameters are estimated by the maximum likelihood method solved by an Expectation-Maximization algorithm. The proposed approach can also be regarded as a clustering approach which operates by finding groups of time series having common changes in regime. In addition to providing a time series partition, it therefore provides a time series segmentation. The problem of selecting the optimal numbers of clusters and segments is solved by means of the Bayesian Information Criterion (BIC). The proposed approach is shown to be efficient using a variety of simulated time series and real-world time series of electrical power consumption from rail switching operations.
△ Less
Submitted 25 December, 2013;
originally announced December 2013.
-
An Efficient Approach to Sparse Linear Discriminant Analysis
Authors:
Luis Francisco Sanchez Merchante,
Yves Grandvalet,
Gerrad Govaert
Abstract:
We present a novel approach to the formulation and the resolution of sparse Linear Discriminant Analysis (LDA). Our proposal, is based on penalized Optimal Scoring. It has an exact equivalence with penalized LDA, contrary to the multi-class approaches based on the regression of class indicator that have been proposed so far. Sparsity is obtained thanks to a group-Lasso penalty that selects the sam…
▽ More
We present a novel approach to the formulation and the resolution of sparse Linear Discriminant Analysis (LDA). Our proposal, is based on penalized Optimal Scoring. It has an exact equivalence with penalized LDA, contrary to the multi-class approaches based on the regression of class indicator that have been proposed so far. Sparsity is obtained thanks to a group-Lasso penalty that selects the same features in all discriminant directions. Our experiments demonstrate that this approach generates extremely parsimonious models without compromising prediction performances. Besides prediction, the resulting sparse discriminant directions are also amenable to low-dimensional representations of data. Our algorithm is highly efficient for medium to large number of variables, and is thus particularly well suited to the analysis of gene expression data.
△ Less
Submitted 27 June, 2012;
originally announced June 2012.