Rigorous Guarantees for Randomized Diagonalization Algorithms
Skip to main content
eScholarship
Open Access Publications from the University of California

UC Berkeley

UC Berkeley Electronic Theses and Dissertations bannerUC Berkeley

Rigorous Guarantees for Randomized Diagonalization Algorithms

Abstract

We rigorously analyze numerical methods for (approximately) computing the eigenvalues and eigenvectors of a matrix. Our main results are the following:\begin{itemize} \item \emph{Spectral bisection.} We show that a backward-stable solution to the eigenvalue problem can be obtained by a randomized algorithm in nearly matrix-multiplication time on any input. More specifically, we show that for every $\delta>0$, there is a randomized version of the spectral bisection algorithm that, on any input $A\in \bC$, with probability $1-O(1/n)$, finds matrices $V, D\in \bC^{n\times n}$ with $V$ invertible and $D$ diagonal such that $\|A-VDV^{-1}\|\leq \delta \|A\|$, using $O(T_\MM(n)\log^2(n/\delta))$ arithmetic operations, in finite arithmetic with $O(\log^4(n/\delta)\log n)$ bits of precision. Here $T_\MM(n)$ is the number of arithmetic operations required to multiply two $n\times n$ complex matrices numerically stably, known to satisfy $T_\MM(n)=O(n^{\omega+\eta})$ for every $\eta>0$ where $\omega$ is the exponent of matrix multiplication. \item \emph{The Hessenberg shifted QR algorithm.} We introduce a randomized shifting strategy for the Hessenberg QR algorithm, for which we prove (in the finite precision arithmetic model) rapid global convergence. It follows from our results, that for any $\delta , \phi>0$, a randomized version of the shifted Hessenberg QR algorithm can be used to compute, on any input $A\in \bC^{n\times n}$, with probability at least $1-\phi$, matrices $U, T\in \bC^{n\times n}$ with $U$ unitary and $T$ upper triangular, such that $\|A-UTU^*\|\leq \delta \|A\|,$ using $O(n^3\log(n/(\delta\phi))^2 \log \log (n/(\delta\phi)) )$ arithmetic operations performed using $O(\log^2(n/(\delta \phi))\log \log (n/(\delta\phi)))$ bits of precision. This provides the first rapidly globally convergent shifting strategy in the nonsymmetric case, which was an open problem even under the assumption of exact arithmetic. \iffalse The key ideas in the analysis are: (1) to use certain higher degree shifts to dampen transient effects due to nonnormality (2) to characterize and detect slowly converging trajectories with respect to such shifts in terms of certain symmetry considerations, and to break this symmetry explicitly in the shifting strategy (3) to introduce a criteria for early deflation to deal with potential numerical stabilities that arise from using high degree shifts. \fi \item \emph{The Lanczos algorithm.} We analyze the Lanczos algorithm where the initial vector $u$ is sampled uniformly at random from the unit sphere $\mathbb{S}^{n-1}$, and show that when run on a Hermitian input $A\in \bC^{n\times n}$ for $c_1\log n$ iterations (where $c_1$ depends on some coarse global property of the spectrum of $A$ but not on $n$) its outputs (both the Jacobi coefficients and the Ritz values) are exponentially concentrated around their medians. Our techniques also allow us to understand the location of the output in the regime of $k= O(\log n)$ iterations, and in particular we use them to show that the Lanczos algorithm can fail to identify outlying eigenvalues when run for less than $c_2\log n$, where $c_2$ depends again on the same global property of the spectrum of $A$ used to determine $c_1$. \end{itemize}

A key ingredient in showing the first two results is a smoothed analysis of the eigenvector condition number and minimum eigenvalue gap of a matrix. In this direction we show a general result that might be of independent interest in random matrix theory: if one adds independent (absolutely continuous with respect to the Lebesgue measure) random variables of small variance to the entries of an arbitrary matrix $A \in \mathbb{C}^{n\times n}$, with high probability, the resulting matrix $A' $ will have (relatively) stable eigenvenvalues and eigenvectors.

Main Content
For improved accessibility of PDF content, download the file to your device.
Current View