Mixture models and topography of mixtures


The main result of this article states that one can get as many as D+1 modes from just a two-component normal mixture in D dimensions. Multivariate mixture models are widely used for modeling homogeneous populations and for cluster analysis. Either the components themselves or the modes arising from these components are often used to extract individual clusters. Although these strategies work well in lower dimensions, our results show that high-dimensional mixtures are often very complex, and researchers should take extra precautions when using mixture models for cluster analysis. Further, our analysis shows that the number of modes depends on the component means and on the eigenvalues of the ratio of the two component covariance matrices, which in turn provides a clear guideline as to when one can use mixture analysis for clustering high-dimensional data.
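The D+1 claim can be checked numerically in the smallest interesting case, D = 2. The sketch below is an illustration only and is not taken from the article: it uses a standard fixed-point (mean-shift-style) hill climb for Gaussian mixtures, restricted to diagonal covariances to keep it dependency-free. The parameter values (means (0, 0) and (1, 1), variances (1, 0.05) and (0.05, 1), equal weights) and all function names are assumptions chosen for the demonstration.

```python
import math
import random


def component_weights(x, weights, means, variances):
    """Return pi_k * N(x; mu_k, diag(var_k)) for each component k."""
    out = []
    for pi_k, mu, var in zip(weights, means, variances):
        log_pdf = 0.0
        for xd, md, vd in zip(x, mu, var):
            log_pdf += -0.5 * math.log(2.0 * math.pi * vd) - 0.5 * (xd - md) ** 2 / vd
        out.append(pi_k * math.exp(log_pdf))
    return out


def climb_to_mode(x, weights, means, variances, tol=1e-12, max_iter=10000):
    """Fixed-point iteration: x <- precision-weighted average of the means,
    with each component weighted by its current density at x."""
    for _ in range(max_iter):
        w = component_weights(x, weights, means, variances)
        new_x = []
        for d in range(len(x)):
            num = sum(wk * mu[d] / var[d] for wk, mu, var in zip(w, means, variances))
            den = sum(wk / var[d] for wk, var in zip(w, variances))
            new_x.append(num / den)
        step = max(abs(a - b) for a, b in zip(new_x, x))
        x = new_x
        if step < tol:
            break
    return x


def find_modes(weights, means, variances, n_starts=200, seed=0, merge_tol=1e-3):
    """Climb from random starts in [-1, 2]^D and merge coincident end points."""
    rng = random.Random(seed)
    modes = []
    for _ in range(n_starts):
        start = [rng.uniform(-1.0, 2.0) for _ in means[0]]
        m = climb_to_mode(start, weights, means, variances)
        is_new = all(
            max(abs(a - b) for a, b in zip(m, other)) > merge_tol for other in modes
        )
        if is_new:
            modes.append(m)
    return sorted(modes)


# Hypothetical example parameters (not from the article): two elongated
# components in D = 2 that cross at right angles.
weights = [0.5, 0.5]
means = [[0.0, 0.0], [1.0, 1.0]]
variances = [[1.0, 0.05], [0.05, 1.0]]  # diagonal covariance entries

modes = find_modes(weights, means, variances)
print(len(modes), modes)
```

With these assumed parameters the search should report three modes: one near each component mean and a third near (20/21, 1/21), where the two elongated components overlap. Random restarts are used because a single hill climb finds only the mode of its own basin; the starting box and merge tolerance are ad hoc and would need adjusting for other parameter choices.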

Former students

  • Dan Ren
  • Bader Al-Ruwali

Collaborators

  • Bruce G. Lindsay (PhD Supervisor)
  • Marianthi Markatou
  • Shu-Chuan Chen

Related publications

Kernels, degrees of freedom, and power properties of quadratic distance goodness-of-fit tests
Lindsay B.G., Markatou M., and Ray S. Journal of the American Statistical Association. 109 (505)

On the number of modes of finite mixtures of elliptical distributions
Alexandrovich G., Holzmann H., and Ray S. Studies in Classification, Data Analysis, and Knowledge Organization.

On the upper bound of the number of modes of a multivariate normal mixture
Ray S. and Ren D. Journal of Multivariate Analysis. 108

Quadratic distances on probabilities: A unified foundation
Lindsay B.G., Markatou M., Ray S., Yang K.E., and Chen S.-C. Annals of Statistics. 36 (2)

Model selection in high dimensions: A quadratic-risk-based approach
Ray S. and Lindsay B.G. Journal of the Royal Statistical Society. Series B: Statistical Methodology. 70 (1)

The topography of multivariate normal mixtures
Ray S. and Lindsay B.G. Annals of Statistics. 33 (5)