Semi-parametric exponential family PCA : Reducing dimensions via non-parametric latent distribution estimation
eScholarship
Open Access Publications from the University of California

Abstract

Principal component analysis is a widely used technique for dimensionality reduction, but it is not based on a probability model. Many recently proposed dimension reduction methods are based on latent variable modelling with restrictive assumptions on the latent distribution. We present a semi-parametric technique, based on a latent variable model, for density modelling, dimensionality reduction and visualization. Unlike previous methods, we estimate the latent distribution non-parametrically. Using this estimated prior to reduce dimensions ensures that multi-modality is better preserved in the projected space. In addition, we allow the components of latent variable models to be drawn from the exponential family, which makes the method suitable for special data types, for example binary or count data. We discuss connections to other probabilistic and non-probabilistic dimension reduction schemes based on Gaussian and other exponential family distributions. Simulations on real-valued, binary and count data show favorable comparison to other related schemes, both in terms of separating different populations and generalization to unseen samples.
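
For orientation, the kind of model the abstract describes can be sketched as follows. This is an illustrative formulation under stated assumptions, not notation taken from the paper: the symbols V, b, a, Q, A, h, the number of atoms c, and the weights \pi_k are all hypothetical names introduced here. The sketch assumes the natural parameter of an exponential family observation model lies in a low-dimensional affine subspace, and that the unknown latent distribution Q is represented by a finite set of weighted atoms, one common form for a non-parametric estimate of a mixing distribution.

```latex
% Observation model (assumed form): exponential family with natural
% parameter \theta and log-partition function A.
p(x \mid \theta) = h(x)\, \exp\!\big( x^{\top}\theta - A(\theta) \big)

% Low-dimensional structure (assumed): \theta is an affine function of a
% latent vector a \in \mathbb{R}^{\ell}, with \ell smaller than the data dimension.
\theta = V a + b, \qquad a \sim Q

% Marginal density, with Q left unspecified rather than fixed to a Gaussian.
p(x) = \int p(x \mid V a + b)\, dQ(a)

% Discrete representation of the estimated Q (assumed): atoms a_1, \dots, a_c
% with weights \pi_k \ge 0 that sum to one.
p(x) \approx \sum_{k=1}^{c} \pi_k \, p(x \mid V a_k + b)

% One possible low-dimensional representation of x: the posterior mean of a.
\hat{a}(x) = \sum_{k=1}^{c}
  \frac{\pi_k \, p(x \mid V a_k + b)}{\sum_{j=1}^{c} \pi_j \, p(x \mid V a_j + b)} \; a_k
```

Under these assumptions, choosing a Gaussian observation model gives real-valued data, a Bernoulli model gives binary data, and a Poisson model gives count data, which is the sense in which an exponential family formulation covers the data types listed in the abstract. Because the atoms are not forced into a single unimodal shape, clusters in the data can map to distinct atoms in the projected space.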

Pre-2018 CSE ID: CS2004-0790
