Open Access Publications from the University of California

Bayesian Sparse Signal Recovery using Scale Mixtures with Applications to Speech

  • Author(s): Giri, Ritwik
  • Advisor(s): Rao, Bhaskar D
  • et al.

The Sparse Signal Recovery (SSR) problem has attracted considerable interest in recent years because of its significant impact on many engineering applications.

This thesis tackles the problem in a Bayesian framework. It introduces a generalized family of scale mixture distributions, the Power Exponential Scale Mixture (PESM), and analyzes its usefulness as a sparsity-promoting prior. We derive a unified MAP estimation (Type I) framework for SSR by employing an appropriate member of the PESM family, and show that this framework encompasses several popular regularization-based SSR algorithms. In addition, exploiting the natural hierarchical framework induced by the PESM family, we use these priors in a Type II (Empirical Bayes) framework and develop corresponding EM-based SSR algorithms. A multivariate extension of the PESM family (M-PESM) is also discussed, which yields a unified framework for imposing joint sparsity in the Multiple Measurement Vector (MMV) recovery problem.
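As a rough illustration of the kind of regularization-based algorithm that a Type I framework can encompass, the sketch below implements a generic iteratively reweighted l2 scheme (in the style of regularized FOCUSS) for the model y = Phi x. It is not taken from the thesis: the function name `reweighted_l2`, the parameters `p`, `lam`, and `eps`, and the toy problem sizes are all assumptions chosen for the example.

```python
import numpy as np


def reweighted_l2(Phi, y, p=1.0, lam=1e-6, n_iter=50, eps=1e-8):
    """Generic reweighted l2 sketch for sparse recovery (hypothetical helper).

    Each iteration solves a weighted ridge problem whose weights come from
    the previous estimate, which approximates an l_p penalty on x.
    """
    n, m = Phi.shape
    x = np.ones(m)  # non-sparse start so every coefficient can be active
    for _ in range(n_iter):
        # Diagonal of W: large entries are penalized less on the next pass.
        w = np.abs(x) ** (2.0 - p) + eps
        # G = Phi W Phi^T + lam I (Phi * w scales the columns of Phi by w).
        G = (Phi * w) @ Phi.T + lam * np.eye(n)
        # x = W Phi^T G^{-1} y
        x = w * (Phi.T @ np.linalg.solve(G, y))
    return x


# Toy noiseless problem: 20 measurements, 50 unknowns, 3 nonzeros (assumed sizes).
rng = np.random.default_rng(0)
n, m = 20, 50
Phi = rng.standard_normal((n, m))
x_true = np.zeros(m)
support = rng.choice(m, size=3, replace=False)
x_true[support] = [1.5, -2.0, 1.0]
y = Phi @ x_true

x_hat = reweighted_l2(Phi, y)
print("largest coefficients at:", np.sort(np.argsort(-np.abs(x_hat))[:3]))
print("true support:           ", np.sort(support))
```

On a small noiseless problem like this one, the estimate typically concentrates on a few coefficients. Different choices of `p` change the implied penalty, which is one informal way to see how a single reweighting template can cover several regularizers.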

We also present three specific applications of SSR in audio signal processing, which include problem-specific algorithmic enhancements but still build on the basic understanding of SSR. First, by employing a source prior from the M-PESM family in a joint blind source separation problem, we propose a class of reweighted algorithms for Independent Vector Analysis (IVA) that can exploit any intra-source correlation structure. Second, we propose an Empirical Bayes based Impulse Response (IR) estimator, which exploits both the sparse early reflections and the exponentially decaying reverberation tail of the Room Impulse Response/Relative Impulse Response as prior information. Third, sparsity in the residual is exploited for a speech modeling application, which uses the block-sparse structure of the glottal excitation as a prior to find the all-pole filter coefficients that model speech efficiently.
