Open Access Publications from the University of California

## Learning Under Random Reshuffling and Distributed Features

• Author(s): Ying, Bicheng
It has been observed in the literature, through experimentation, that learning under random reshuffling leads to enhanced performance, not only for traditional stochastic-gradient algorithms but also for variance-reduced implementations. However, no theoretical analysis exists that explains this phenomenon well under constant step-size adaptation. The first part of this dissertation resolves this issue and establishes analytically that, under random reshuffling, convergence is guaranteed at a linear rate to a small neighborhood of the minimizer. The analysis further shows that random reshuffling outperforms uniform sampling: the iterates approach a smaller neighborhood of size $O(\mu^2)$ around the minimizer, as opposed to $O(\mu)$. An analytical expression for the steady-state mean-square error under random reshuffling is derived, which clarifies in greater detail the differences between sampling with and without replacement. The dissertation also provides an explanation for the periodic behavior observed in random reshuffling implementations.
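The contrast described above can be illustrated empirically. The following sketch (not from the dissertation; the problem setup, step size, and helper names are illustrative assumptions) runs constant step-size SGD on a noisy least-squares problem twice per seed: once sampling an epoch-long random permutation (reshuffling, without replacement) and once sampling uniformly with replacement, then compares the steady-state distance of the iterates to the empirical minimizer:

```python
import numpy as np

def run_sgd(X, y, mu, epochs, reshuffle, seed):
    """Constant step-size SGD on least squares; sampling scheme chosen per epoch."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    w = np.zeros(d)
    for _ in range(epochs):
        if reshuffle:
            idx = rng.permutation(n)            # random reshuffling: without replacement
        else:
            idx = rng.integers(0, n, size=n)    # uniform sampling: with replacement
        for i in idx:
            grad = (X[i] @ w - y[i]) * X[i]     # gradient of 0.5 * (x_i^T w - y_i)^2
            w = w - mu * grad
    return w

# Synthetic least-squares problem with noise, so a nonzero steady-state error exists.
rng = np.random.default_rng(0)
n, d = 200, 5
X = rng.standard_normal((n, d))
w_true = rng.standard_normal(d)
y = X @ w_true + 0.5 * rng.standard_normal(n)
w_hat = np.linalg.lstsq(X, y, rcond=None)[0]    # empirical minimizer

# Average the steady-state distance to the minimizer over several runs.
errs_rr, errs_us = [], []
for seed in range(10):
    w_rr = run_sgd(X, y, mu=0.01, epochs=40, reshuffle=True, seed=seed)
    w_us = run_sgd(X, y, mu=0.01, epochs=40, reshuffle=False, seed=seed)
    errs_rr.append(np.linalg.norm(w_rr - w_hat))
    errs_us.append(np.linalg.norm(w_us - w_hat))
print(np.mean(errs_rr), np.mean(errs_us))       # reshuffling typically lands closer
```

On average, the reshuffled runs settle noticeably closer to the minimizer than the with-replacement runs at the same step size, consistent with the $O(\mu^2)$ versus $O(\mu)$ neighborhood sizes; this is a numerical illustration only, not the dissertation's analysis.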