Asynchronous Parallel Coordinate Update Methods for Large Scale Problems
- Author(s): Peng, Zhimin
- Advisor(s): Yin, Wotao
This thesis focuses on coordinate update (CU) methods, which are useful for solving problems involving large or high-dimensional datasets. Coordinate update methods decompose a problem into simple subproblems, each of which updates one variable, or a small block of variables, while fixing the others. These methods can handle linear and nonlinear mappings, smooth and nonsmooth functions, and convex and nonconvex problems. In addition, they are easy to parallelize.
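As a concrete illustration of the decomposition described above, the sketch below applies single-coordinate updates to least squares, exactly minimizing over one variable at a time while the others stay fixed. The function name `coordinate_descent_lsq` and the choice of problem are illustrative, not taken from the thesis.

```python
import numpy as np

def coordinate_descent_lsq(A, b, num_passes=100):
    """Minimize ||Ax - b||^2 by cycling through exact single-coordinate updates."""
    m, n = A.shape
    x = np.zeros(n)
    r = A @ x - b                    # residual, kept up to date after each update
    col_sq = (A ** 2).sum(axis=0)    # precomputed squared column norms ||a_i||^2
    for _ in range(num_passes):
        for i in range(n):
            # exact minimizer over coordinate i with all other coordinates fixed
            step = (A[:, i] @ r) / col_sq[i]
            x[i] -= step
            r -= step * A[:, i]      # O(m) residual update instead of recomputing Ax
    return x
```

Each subproblem here is one-dimensional and has a closed-form solution, which is what makes the per-coordinate work cheap.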
The strong performance of coordinate update methods depends on solving simple subproblems. To derive simple subproblems for several new classes of applications, this thesis systematically studies coordinate friendly (CF) operators, which admit low-cost coordinate updates. Based on the discovered coordinate friendly operators, together with operator splitting techniques, I obtained new coordinate update algorithms for a variety of problems in machine learning, image processing, and sub-areas of optimization. Several of these problems are treated with coordinate updates for the first time.
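To make the "low-cost coordinate update" idea concrete, here is a minimal sketch of a coordinate-friendly evaluation of the gradient-step operator T(x) = x − η·Aᵀ(Ax − b): by caching the residual r = Ax − b, one coordinate of T(x) costs O(m) rather than the O(mn) needed to recompute Ax from scratch. The class name `CachedGradientStep` and this particular operator are chosen for illustration; they are assumptions, not the thesis's notation.

```python
import numpy as np

class CachedGradientStep:
    """Coordinate-friendly gradient-step operator T(x) = x - eta * A.T @ (A x - b).

    Caching r = A x - b makes one coordinate of T(x) cost O(m),
    versus O(mn) to recompute A x from scratch.
    """
    def __init__(self, A, b, eta):
        self.A, self.b, self.eta = A, b, eta
        self.x = np.zeros(A.shape[1])
        self.r = A @ self.x - b          # cached residual

    def update_coordinate(self, i):
        # i-th coordinate of T(x), computed from the cached residual in O(m)
        new_xi = self.x[i] - self.eta * (self.A[:, i] @ self.r)
        delta = new_xi - self.x[i]
        self.x[i] = new_xi
        self.r += delta * self.A[:, i]   # keep the cache consistent in O(m)
```

The cached quantity is what makes the operator coordinate friendly: the bookkeeping after a single-coordinate change is as cheap as the update itself.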
CU methods can be further scaled up to solve larger problems through asynchronous parallel (async-parallel) techniques. Asynchrony is crucial to parallel computing since it reduces synchronization wait, relaxes communication bottlenecks, and thus speeds up computation significantly. I proposed ARock, an algorithmic framework in which multiple agents (machines, processors, or cores) update coordinates in an asynchronous parallel fashion. At each step of ARock, an agent updates a randomly selected coordinate based on possibly out-of-date information. Convergence results are established for solving fixed-point problems. Numerical tests on regression and equation-solving problems show that ARock significantly outperforms its synchronous counterpart.
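The ARock step described above, where each agent updates a random coordinate using possibly stale shared state without waiting for the others, can be sketched with threads as follows. This is a hypothetical minimal illustration on a least-squares problem, not the thesis's implementation; the function name `arock_solve` and all parameters are assumptions.

```python
import threading
import numpy as np

def arock_solve(A, b, eta=0.05, num_threads=4, updates_per_thread=2000, seed=0):
    """Async-parallel random coordinate updates on f(x) = ||Ax - b||^2 / 2.

    Threads read and write the shared iterate x without locks, so each
    update may be based on slightly out-of-date information.
    """
    n = A.shape[1]
    x = np.zeros(n)

    def worker(thread_id):
        rng = np.random.default_rng(seed + thread_id)
        for _ in range(updates_per_thread):
            i = int(rng.integers(n))
            # Lock-free read of the shared iterate: other threads may have
            # written to x since this read began (stale information).
            g = A[:, i] @ (A @ x - b)    # i-th partial gradient
            x[i] -= eta * g              # in-place coordinate write, no lock

    threads = [threading.Thread(target=worker, args=(t,)) for t in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return x
```

No thread ever blocks waiting for the others, which is exactly the synchronization wait that the async-parallel scheme removes; the price is that updates may use stale iterates, which is why the convergence analysis is the nontrivial part.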
The other novel contribution of this thesis is an abstract framework for implementing asynchronous parallel algorithms on shared memory platforms. The programming model adopts the C++ thread library at its lowest level. At the highest level, the goal is to enable fast prototyping of async-parallel algorithms. I developed a multilevel approach to bridge the gap between expert low-level programming and novice-level programming. A spectrum of applications has been implemented under this framework.