eScholarship
Open Access Publications from the University of California

UC Berkeley Electronic Theses and Dissertations

Spectrum-Revealing Randomized Matrix Factorization: Theory and Algorithms

Abstract

This thesis presents my work on spectrum-revealing randomized matrix algorithms. It is divided into three self-contained chapters.

In Chapter 1, we discuss spectrum-revealing Cholesky factorization and its applications to kernel methods. Kernel methods are among the most popular machine learning tools for data analysis. Since exact kernel methods can be prohibitively expensive for large problems, reliable low-rank matrix approximations and high-performance implementations have become indispensable for their practical application. We introduce spectrum-revealing Cholesky factorization, a reliable low-rank matrix factorization for kernel matrix approximation, and develop an efficient and effective randomized algorithm for computing it. Our numerical experiments demonstrate that this algorithm is as effective as other Cholesky-based kernel methods on machine learning problems, but significantly faster.
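To make the low-rank Cholesky idea concrete, here is a minimal sketch of the classical greedy pivoted partial Cholesky that such kernel approximations build on. The function name, tolerance, and greedy pivot rule are illustrative only; the thesis's spectrum-revealing Cholesky adds randomized pivot selection and extra column swaps with reliability guarantees, which this sketch does not reproduce.

```python
import numpy as np

def partial_pivoted_cholesky(K, max_rank):
    """Greedy diagonal-pivoted partial Cholesky: K ~ L @ L.T for PSD K.

    Illustrative sketch of the classical pivoted Cholesky idea only;
    the spectrum-revealing variant in the thesis uses randomized
    pivoting and additional swaps to guarantee approximation quality.
    """
    n = K.shape[0]
    d = np.array(np.diag(K), dtype=float)   # residual diagonal of K - L @ L.T
    tol = 1e-12 * max(d.max(), 0.0)         # illustrative early-stop tolerance
    L = np.zeros((n, max_rank))
    pivots = []
    k = 0
    for j in range(max_rank):
        p = int(np.argmax(d))               # greedy pivot: largest residual diagonal
        if d[p] <= tol:                     # residual numerically zero -> stop early
            break
        pivots.append(p)
        L[:, j] = (K[:, p] - L[:, :j] @ L[p, :j]) / np.sqrt(d[p])
        d -= L[:, j] ** 2                   # downdate the residual diagonal
        k += 1
    return L[:, :k], pivots
```

Each step costs O(nj) work, so a rank-k approximation of an n-by-n kernel matrix costs O(nk^2) instead of the O(n^3) of a full factorization, which is what makes such methods attractive for large kernel problems.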

In Chapter 2, we discuss spectrum-revealing QR factorization as well as a distributed-memory implementation of randomized QRCP. Factorizing large matrices by QR with column pivoting (QRCP) is substantially more expensive than QR without pivoting, owing to the communication costs required for pivoting decisions. In contrast, randomized QRCP (RQRCP) algorithms have proven empirically to be highly competitive with high-performance implementations of QR in processing time, on uniprocessor and shared-memory machines, and as reliable as QRCP in pivot quality. We show that RQRCP algorithms can be as reliable as QRCP, with failure probabilities decaying exponentially in the oversampling size, and we analyze the efficiency differences among RQRCP algorithms. More importantly, we develop distributed-memory implementations of RQRCP that significantly outperform the QRCP implementations in ScaLAPACK. As a further development, we introduce the concept of, and develop algorithms for computing, spectrum-revealing QR factorizations for low-rank matrix approximation, and demonstrate their effectiveness against leading low-rank approximation methods in both theoretical and numerical reliability and efficiency.
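The core RQRCP idea — choosing pivots from a small Gaussian sketch rather than from the full matrix — can be sketched in a few lines. This is a simplified single-pass illustration with illustrative names and an assumed oversampling default; the thesis's algorithms proceed blockwise with sketch updates and target distributed-memory machines, neither of which this toy version attempts.

```python
import numpy as np
from scipy.linalg import qr

def rqrcp_pivots(A, k, oversample=8, rng=None):
    """Choose k column pivots for A from a QRCP of a Gaussian sketch.

    One-shot illustration of the RQRCP idea: pivoting decisions are
    made on the small sketch B = Omega @ A, so the expensive
    column-norm updates never touch the full matrix A.
    """
    rng = np.random.default_rng(rng)
    m, n = A.shape
    ell = min(m, k + oversample)           # sketch rows: target rank + oversampling
    Omega = rng.standard_normal((ell, m))  # Gaussian test matrix
    B = Omega @ A                          # small sketch, ell x n
    _, _, piv = qr(B, mode='economic', pivoting=True)
    cols = piv[:k]                         # pivots chosen on the sketch only
    Q, R = np.linalg.qr(A[:, cols])        # unpivoted QR on the selected columns
    return Q, R, cols
```

The failure-probability result described above is what justifies this shortcut: with modest oversampling, pivots chosen on the sketch are, with overwhelming probability, as good as pivots chosen on the full matrix.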

In Chapter 3, we present Flip-Flop Spectrum-Revealing QR (Flip-Flop SRQR) factorization, a significantly faster and more reliable variant of Stewart's QLP factorization, for low-rank matrix approximation. Flip-Flop SRQR uses SRQR factorization to initialize a partial column-pivoted QR factorization and then computes a partial LQ factorization. As Stewart observed in his original QLP work, Flip-Flop SRQR tracks the exact singular values with "considerable fidelity". We develop singular-value lower bounds and residual-error upper bounds for the Flip-Flop SRQR factorization. When the singular values of the input matrix decay relatively quickly, the low-rank approximation computed by Flip-Flop SRQR is guaranteed to be as accurate as the truncated SVD. We also perform a complexity analysis to show that, for the same accuracy, Flip-Flop SRQR is faster than randomized subspace iteration for approximating the SVD, the standard method used in the MATLAB Tensor Toolbox. We additionally compare Flip-Flop SRQR with alternatives on two applications, tensor approximation and nuclear norm minimization, to demonstrate its efficiency and effectiveness.
