Low-Dimensional Models for PCA and Regression

2013

Abstract

This thesis examines two separate statistical problems for which low-dimensional models are effective.

In the first part of this thesis, we examine the Robust Principal Components Analysis (RPCA) problem: given a matrix $\datam$ that is the sum of a low-rank matrix $\lowopt$ and a sparse noise matrix $\sparseopt$, recover $\lowopt$ and $\sparseopt$. This problem appears in various settings, including image processing, computer vision, and graphical models. Various polynomial-time heuristics and algorithms have been proposed to solve this problem. We introduce a block coordinate descent algorithm for this problem and prove a convergence result. In addition, our iterative algorithm has low complexity per iteration and empirically performs well on synthetic datasets.
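To make the problem setup concrete, here is a minimal sketch of a generic alternating (block coordinate) scheme on synthetic data. It is not the algorithm developed in the thesis, whose details the abstract does not give. It assumes a penalized objective $\frac{1}{2}\|M - L - S\|_F^2 + \lambda \|S\|_1$ with a rank constraint on $L$. The names `M`, `L`, `S`, `rank`, `lam`, and all function names are illustrative choices.

```python
import numpy as np


def make_synthetic(n_rows=40, n_cols=30, rank=3, sparsity=0.05, seed=0):
    """Build M = L0 + S0 with L0 low-rank and S0 sparse with large entries."""
    rng = np.random.default_rng(seed)
    L0 = rng.standard_normal((n_rows, rank)) @ rng.standard_normal((rank, n_cols))
    mask = rng.random((n_rows, n_cols)) < sparsity
    S0 = np.where(mask, rng.choice([-10.0, 10.0], size=(n_rows, n_cols)), 0.0)
    return L0 + S0, L0, S0


def soft_threshold(a, tau):
    """Entrywise soft-thresholding, the proximal operator of tau * ||.||_1."""
    return np.sign(a) * np.maximum(np.abs(a) - tau, 0.0)


def truncated_svd(a, rank):
    """Best rank-`rank` approximation of `a` in Frobenius norm."""
    u, s, vt = np.linalg.svd(a, full_matrices=False)
    return (u[:, :rank] * s[:rank]) @ vt[:rank]


def rpca_alternating(M, rank, lam, n_iter=50):
    """Alternate exact minimization over the L block and the S block.

    Assumed objective (not taken from the thesis):
        0.5 * ||M - L - S||_F^2 + lam * ||S||_1,  subject to rank(L) <= rank.
    """
    L = np.zeros_like(M)
    S = np.zeros_like(M)
    history = []
    for _ in range(n_iter):
        L = truncated_svd(M - S, rank)   # L block: low-rank fit to M - S
        S = soft_threshold(M - L, lam)   # S block: sparse fit to M - L
        objective = 0.5 * np.linalg.norm(M - L - S) ** 2 + lam * np.abs(S).sum()
        history.append(objective)
    return L, S, history
```

Each block update is an exact minimization with the other block held fixed, so the recorded objective cannot increase from one iteration to the next. The per-iteration cost is dominated by one SVD. This sketch carries no recovery guarantee.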

In the second part of this thesis, we examine a variant of ridge regression: unlike in the classical setting where we know that the parameter of interest lies near a single point, we instead only know that it lies near a known low-dimensional subspace. We formulate this regression problem as a convex optimization problem, and introduce an efficient block coordinate descent algorithm for solving it. We demonstrate that this ``subspace prior'' version of ridge regression is an appropriate model for understanding player effectiveness in basketball. In particular, we apply our algorithm to real-world data and demonstrate empirically that it produces a more accurate model of player effectiveness by showing that (1) the algorithm outperforms existing approaches and (2) it leads to a profitable betting strategy.
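The abstract does not state the exact optimization problem, so the following is only one plausible reading of a "subspace prior", offered as an illustration. It assumes the objective $\|y - X\beta\|_2^2 + \lambda \|\beta - Uz\|_2^2$, where the columns of $U$ are an orthonormal basis of the known subspace and $z$ holds the coordinates of the point in that subspace closest to $\beta$. The symbols `X`, `y`, `U`, `lam`, `beta`, `z` and the function name are hypothetical and not taken from the thesis.

```python
import numpy as np


def subspace_ridge_bcd(X, y, U, lam, n_iter=200):
    """Block coordinate descent for an assumed subspace-prior ridge objective.

    Assumed objective (an illustration, not the thesis's formulation):
        ||y - X @ beta||^2 + lam * ||beta - U @ z||^2
    where U has orthonormal columns spanning the known subspace.
    """
    n_features = X.shape[1]
    beta = np.zeros(n_features)
    z = np.zeros(U.shape[1])
    # These two quantities do not change across iterations.
    A = X.T @ X + lam * np.eye(n_features)
    Xty = X.T @ y
    for _ in range(n_iter):
        # beta block: ordinary ridge regression shrunk toward the point U @ z.
        beta = np.linalg.solve(A, Xty + lam * (U @ z))
        # z block: project beta onto the subspace (valid because U.T @ U = I).
        z = U.T @ beta
    return beta, z
```

With `z` fixed, the `beta` update is classical ridge regression centered at the single point `U @ z`. Letting `z` move is what replaces the point prior with a subspace prior. Under this assumed objective the minimizer also has the closed form `solve(X.T @ X + lam * (I - U @ U.T), X.T @ y)` whenever that matrix is invertible, which gives a convenient correctness check for the iteration.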

UC Berkeley