-
Model-based clustering and classification using mixtures of multivariate skewed power exponential distributions
Authors:
Utkarsh J. Dang,
Michael P. B. Gallaugher,
Ryan P. Browne,
Paul D. McNicholas
Abstract:
Families of mixtures of multivariate power exponential (MPE) distributions have been previously introduced and shown to be competitive for cluster analysis in comparison to other elliptical mixtures including mixtures of Gaussian distributions. Herein, we propose a family of mixtures of multivariate skewed power exponential distributions to combine the flexibility of the MPE distribution with the ability to model skewness. These mixtures are more robust to variations from normality and can account for skewness, varying tail weight, and peakedness of data. A generalized expectation-maximization approach combining minorization-maximization and optimization based on accelerated line search algorithms on the Stiefel manifold is used for parameter estimation. These mixtures are implemented both in the model-based clustering and classification frameworks. Both simulated and benchmark data are used for illustration and comparison to other mixture families.
Submitted 20 January, 2023; v1 submitted 3 July, 2019;
originally announced July 2019.
-
Mixtures of Multivariate Power Exponential Distributions
Authors:
Utkarsh J. Dang,
Ryan P. Browne,
Paul D. McNicholas
Abstract:
An expanded family of mixtures of multivariate power exponential distributions is introduced. While fitting heavy-tails and skewness has received much attention in the model-based clustering literature recently, we investigate the use of a distribution that can deal with both varying tail-weight and peakedness of data. A family of parsimonious models is proposed using an eigen-decomposition of the scale matrix. A generalized expectation-maximization algorithm is presented that combines convex optimization via a minorization-maximization approach and optimization based on accelerated line search algorithms on the Stiefel manifold. Lastly, the utility of this family of models is illustrated using both toy and benchmark data.
Submitted 12 June, 2015;
originally announced June 2015.
-
Multivariate response and parsimony for Gaussian cluster-weighted models
Authors:
Utkarsh J. Dang,
Antonio Punzo,
Paul D. McNicholas,
Salvatore Ingrassia,
Ryan P. Browne
Abstract:
A family of parsimonious Gaussian cluster-weighted models is presented. This family concerns a multivariate extension to cluster-weighted modelling that can account for correlations between multivariate responses. Parsimony is attained by constraining parts of an eigen-decomposition imposed on the component covariance matrices. A sufficient condition for identifiability is provided and an expectation-maximization algorithm is presented for parameter estimation. Model performance is investigated on both synthetic and classical real data sets and compared with some popular approaches. Finally, accounting for linear dependencies in the presence of a linear regression structure is shown to offer better performance, vis-à-vis clustering, over existing methodologies.
Submitted 26 February, 2016; v1 submitted 3 November, 2014;
originally announced November 2014.
-
Accelerated Failure Time Models for Competing Risks in a Cluster Weighted Modelling Framework
Authors:
Utkarsh J. Dang,
Paul D. McNicholas
Abstract:
A novel approach for dealing with censored competing risks regression data is proposed. This is implemented by a mixture of accelerated failure time (AFT) models for a competing risks scenario within a cluster-weighted modelling (CWM) framework. Specifically, we make use of the log-normal AFT model here but any commonly used AFT model can be utilized. The alternating expectation conditional maximization algorithm (AECM) is used for parameter estimation and bootstrapping for standard error estimation. Finally, we present our results on some simulated and real competing risks data.
Submitted 3 December, 2013;
originally announced December 2013.
-
Families of Parsimonious Finite Mixtures of Regression Models
Authors:
Utkarsh J. Dang,
Paul D. McNicholas
Abstract:
Finite mixtures of regression models offer a flexible framework for investigating heterogeneity in data with functional dependencies. These models can be conveniently used for unsupervised learning on data with clear regression relationships. We extend such models by imposing an eigen-decomposition on the multivariate error covariance matrix. By constraining parts of this decomposition, we obtain families of parsimonious mixtures of regressions and mixtures of regressions with concomitant variables. These families of models account for correlations between multiple responses. An expectation-maximization algorithm is presented for parameter estimation and performance is illustrated on simulated and real data.
Submitted 2 December, 2013;
originally announced December 2013.