Fast Algorithms for Interior Point Methods
- Zhang, Qiuyi
- Advisor(s): Srivastava, Nikhil; Rao, Satish
Abstract
Interior point methods (IPMs) were first introduced as efficient polynomial-time algorithms for solving linear programs. Since then, they have enjoyed success in general convex optimization with the introduction of self-concordant barriers, replacing the ellipsoid method as the optimizer of choice in many settings. Compared to the ellipsoid method, interior point methods boast a better runtime complexity due to their $O(\sqrt{n})$ iteration complexity, where each iteration requires a linear system solve for the Newton step computation. This implies a naive $O(n^{0.5+\omega})$ total runtime for IPMs, where $\omega$ is the exponent of matrix multiplication.
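For concreteness, in the linear programming special case $\min_x \{ c^\top x : Ax = b,\ x \ge 0 \}$, each iteration of a standard primal-dual interior point method computes its Newton step by solving a linear system of the following form (this is standard background, not the specific step used in this thesis):
\begin{align*}
\begin{pmatrix} A & 0 & 0 \\ 0 & A^\top & I \\ S & 0 & X \end{pmatrix}
\begin{pmatrix} \delta_x \\ \delta_y \\ \delta_s \end{pmatrix}
=
\begin{pmatrix} b - Ax \\ c - A^\top y - s \\ \mu \mathbf{1} - XS\mathbf{1} \end{pmatrix},
\end{align*}
where $X = \mathrm{diag}(x)$, $S = \mathrm{diag}(s)$, $\mu > 0$ is the path parameter, and $\mathbf{1}$ is the all-ones vector. Solving such a dense system naively costs $O(n^{\omega})$ per iteration, which is the bottleneck referred to above.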
In a recent breakthrough work, [Cohen, Lee, Song'18] showed that we can solve linear programs in the IPM framework in current matrix multiplication time $\widetilde{O}(n^{\omega})$, implying that linear programs are computationally not much harder than matrix inversion. In this thesis, we extend this result to general Empirical Risk Minimization (ERM), showing that many convex optimization problems can be solved as efficiently as matrix inversion.
Specifically, many convex problems in machine learning and computer science share the following form:
\begin{align*}
\min_{x} \sum_{i} f_i( A_i x + b_i),
\end{align*}
where the $f_i$ are convex functions on $\mathbb{R}^{n_i}$ with constant $n_i$, $A_i \in \mathbb{R}^{n_i \times d}$, $b_i \in \mathbb{R}^{n_i}$, and $\sum_i n_i = n$. This problem generalizes linear programming, and we give an algorithm that runs in time
\begin{align*}
O^* ( ( n^{\omega} + n^{2.5 - \alpha/2} + n^{2+ 1/6} ) \log (1 / \delta) )
\end{align*}
where $\alpha$ is the dual exponent of matrix multiplication, $\delta$ is the relative accuracy, and $O^*$ hides sub-polynomial factors. Note that the runtime depends only logarithmically on the condition numbers and other data-dependent parameters, which are captured in $\delta$. For the current bounds $\omega \sim 2.38$ and $\alpha \sim 0.31$, our runtime $O^* ( n^{\omega} \log (n / \delta))$ matches the current best runtime for solving a dense least squares regression problem, which is a special case of the problem we consider. Very recently, [Alman'18] proved that all currently known techniques cannot yield an $\omega$ below $2.168$, which is larger than our $2 + 1/6$.
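To make the generality of the ERM form above concrete (an illustrative sketch with generic data $a_i \in \mathbb{R}^d$, $y_i \in \mathbb{R}$, not notation from the thesis): taking $n_i = 1$, $A_i = a_i^\top$, $b_i = -y_i$, and $f_i(t) = t^2$ recovers least squares regression, while taking labels $y_i \in \{\pm 1\}$, $A_i = -y_i a_i^\top$, $b_i = 0$, and $f_i(t) = \log(1 + e^t)$ recovers logistic regression:
\begin{align*}
\min_{x} \sum_{i} \bigl( a_i^\top x - y_i \bigr)^2
\qquad \text{and} \qquad
\min_{x} \sum_{i} \log\bigl( 1 + e^{-y_i a_i^\top x} \bigr).
\end{align*}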
We propose two novel concepts, which may be of independent interest (a rough sketch of the central path condition follows the list):
$\bullet$ We give a robust deterministic central path method, whereas the previous approach follows a stochastic central path, updating the weights by a random sparse vector.
$\bullet$ We propose an efficient data structure to maintain the central path of interior point methods even when the weight update vector is dense.
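For reference, in the linear programming special case the exact central path at parameter $\mu$ requires $x_i s_i = \mu$ in every coordinate, whereas a robust central path, stated only roughly here, merely keeps every product multiplicatively close to $\mu$:
\begin{align*}
\text{exact:}\quad x_i s_i = \mu \ \ \text{for all } i,
\qquad\qquad
\text{robust (roughly):}\quad \bigl| x_i s_i - \mu \bigr| \le \epsilon \mu \ \ \text{for all } i,
\end{align*}
for a small constant $\epsilon$. The slack in the robust invariant is what permits approximate, lazily maintained updates to the iterates by a data structure, rather than exact recomputation at every iteration.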