Coordinate Update Algorithms: Theory and Applications
This thesis focuses on coordinate update methods, which are useful for solving problems involving large or high-dimensional datasets. They decompose a problem into simple subproblems, where each updates one, or a small block of, variables while xing others. These methods can deal with linear and nonlinear mappings, smooth and nonsmooth functions, as well as convex and nonconvex problems. In addition, they are easy to parallelize.
The great performance of coordinate update methods depends on solving simple subproblems. To derive simple subproblems for several new classes of applications, this thesis systematically studies coordinate friendly operators that perform low-cost coordinate updates.
Based on the discovered coordinate friendly operators, as well as operator splitting techniques, we obtain new coordinate update algorithms for a variety of problems in machine learning, image processing, as well as sub-areas of optimization. Several problems are treated with coordinate update for the first time in history.
Coordinate update algorithms can be used to develop decentralized algorithms for consensus optimization, which run over a network in which the agents communicate with their neighbors and perform local computation. Decentralized algorithms have broad applications and attract interest by itself. The second part of this thesis proposes an asynchronous, decentralized algorithm.
In the proposed algorithm, each agent can compute and communicate independently at different times, for different durations, with the information it has even if the latest information from its neighbors is not yet available. Such an asynchronous algorithm reduces the time that agents would otherwise waste idle because of communication delays or because their neighbors are slower. It also eliminates the need for a global clock for synchronization.
Mathematically, the algorithm involves both primal and dual variables, uses fixed stepsize parameters, and provably converges to the exact solution under a random agent assumption and both bounded and unbounded delay assumptions. When running synchronously, the algorithm performs just as well as existing competitive synchronous algorithms such as PG-EXTRA, which diverges without synchronization. Numerical experiments confirm the theoretical findings and illustrate the performance of the proposed algorithm.
In addition to the async-parallel coordinate selection rule well studied in the rst two parts of the thesis, we analyse the cyclic coordinate selection rules, where the ordering of coordinates in each cycle is arbitrary. The corresponding algorithms are fast, but their convergence is unknown in the fixed-point setting.
When T is a nonexpansive operator and has a fixed point, we show that the sequence of coordinate-update iterates converges to a fixed point under proper step sizes. This result applies to the primal-dual coordinate-update algorithms, which have wide applications to optimization problems with nonseparable nonsmooth objectives, as well as global linear constraints.
Numerical experiments are presented to compare the cyclic coordinate selection rule with the previous discussed rules.