This dissertation studies the performance and linear convergence properties of primal-dual methods for the solution of decentralized multi-agent optimization problems. Decentralized multi-agent optimization is a powerful paradigm that finds applications in diverse fields in learning and engineering design. In these setups, a network of agents is connected through some topology, and agents are allowed to share information only locally. Their overall goal is to seek the minimizer of a global optimization problem through localized interactions. In decentralized consensus problems, the agents are coupled through a common consensus variable that they need to agree upon, whereas in decentralized resource allocation problems, the agents are coupled through global affine constraints.
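For concreteness, the two problem classes described above can be written in a generic form. The notation here is an assumption made for illustration and is not taken from the dissertation: $K$ is the number of agents, $J_k$ is the local cost of agent $k$, $w$ is the common consensus variable, $w_k$ is the local variable of agent $k$, and $B_k$ and $b$ define the global affine constraint.

```latex
\begin{align}
\text{(consensus)} \quad
  & \min_{w} \ \sum_{k=1}^{K} J_k(w), \\
\text{(resource allocation)} \quad
  & \min_{w_1,\dots,w_K} \ \sum_{k=1}^{K} J_k(w_k)
    \quad \text{subject to} \quad \sum_{k=1}^{K} B_k w_k = b.
\end{align}
```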
Various decentralized consensus optimization algorithms already exist in the literature. Some methods are derived from a primal-dual perspective, while others are derived as gradient-tracking mechanisms meant to track the average of the local gradients. Among the gradient-tracking methods are the adapt-then-combine implementations motivated by diffusion strategies, which have been observed to perform better than other implementations. In this dissertation, we develop a novel adapt-then-combine primal-dual algorithmic framework that captures most state-of-the-art gradient-based methods as special cases, including all the variations of the gradient-tracking methods. We also develop a concise and novel analysis technique that establishes the linear convergence of this general framework under strongly convex objectives. Because the framework is unified, the analysis reveals important characteristics of these methods, such as their convergence rates and step-size stability ranges. Moreover, the analysis reveals how the augmented Lagrangian penalty term, which is utilized in most of these methods, affects the performance of decentralized algorithms.
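To illustrate what an adapt-then-combine gradient-tracking recursion looks like, the following is a minimal sketch of one commonly cited variant, not the general framework developed in this dissertation. Everything specific in it is an assumption chosen for illustration: the least-squares local costs, the ring topology with uniform weights, the step size, and the function name `atc_gradient_tracking`.

```python
import numpy as np


def atc_gradient_tracking(H, d, A, mu, num_iters):
    """Sketch of adapt-then-combine gradient tracking (illustrative only).

    Agent k holds the assumed local cost J_k(w) = 0.5 * ||H[k] @ w - d[k]||^2.
    A is a symmetric, doubly stochastic combination matrix.
    Returns a (K, M) array whose k-th row is the iterate of agent k.
    """
    K, _, M = H.shape

    def local_grads(W):
        # Row k is the gradient of J_k evaluated at agent k's own iterate.
        residuals = np.einsum("knm,km->kn", H, W) - d
        return np.einsum("knm,kn->km", H, residuals)

    W = np.zeros((K, M))   # local iterates, one row per agent
    G = local_grads(W)     # current local gradients
    Y = G.copy()           # trackers, initialized at the local gradients
    for _ in range(num_iters):
        # Adapt with the tracked direction, then combine with neighbors.
        W_next = A @ (W - mu * Y)
        G_next = local_grads(W_next)
        # Update the tracker with the gradient difference, then combine.
        Y = A @ (Y + G_next - G)
        W, G = W_next, G_next
    return W


# Hypothetical problem data: K agents, N samples each, M unknowns.
rng = np.random.default_rng(0)
K, N, M = 5, 20, 3
H = rng.standard_normal((K, N, M))
d = rng.standard_normal((K, N))

# Ring topology: each agent averages itself and its two neighbors.
A = np.zeros((K, K))
for k in range(K):
    for j in (k - 1, k, k + 1):
        A[k, j % K] = 1.0 / 3.0

# Conservative step size based on the largest local Lipschitz constant.
L_max = max(np.linalg.eigvalsh(H[k].T @ H[k]).max() for k in range(K))
W = atc_gradient_tracking(H, d, A, mu=0.1 / L_max, num_iters=3000)

# Centralized least-squares solution of the aggregate problem, for reference.
w_star = np.linalg.lstsq(H.reshape(K * N, M), d.reshape(K * N), rcond=None)[0]
print("max deviation from centralized solution:", np.max(np.abs(W - w_star)))
```

In this sketch each agent communicates only with its ring neighbors through `A`, yet the rows of `W` are expected to approach the centralized minimizer `w_star`, which is the kind of exact, linearly convergent behavior the text refers to under strongly convex objectives.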
Another important question that we answer is whether decentralized proximal gradient methods can achieve global linear convergence for non-smooth composite optimization. For centralized algorithms, linear convergence has been established in the presence of a non-smooth composite term. In this dissertation, we close the gap between centralized and decentralized proximal gradient algorithms and show that decentralized proximal algorithms can also achieve linear convergence in the presence of a non-smooth term. Furthermore, we show that when each agent possesses a different local non-smooth term, global linear convergence cannot be established in the worst case.
Most works that study decentralized optimization problems assume that all agents are involved in computing all variables. However, in many applications, the coupling across agents is sparse in the sense that only a few agents are involved in computing certain variables. We show how to design decentralized algorithms for sparsely coupled consensus and resource allocation problems. More importantly, we establish analytically the importance of exploiting the sparsity structure in coupled large-scale networks.