Tempered Bregman Divergence for Continuous and Discrete Time Mirror Descent and Robust Classification
- Author(s): Amid, Ehsan
- Advisor(s): Warmuth, Manfred Klaus
- et al.
Bregman divergences are an important class of divergence functions in machine learning. Many well-known updates, including gradient descent and (un)normalized exponentiated gradient, are motivated by using a Bregman divergence as the inertia term. Bregman divergences also serve as a measure of progress for online algorithms and as the training loss for classification models. In this thesis, we introduce a class of tempered Bregman divergences that includes many well-known distance measures, such as squared Euclidean distance and relative entropy, as special cases. We explore the tempered updates motivated by the new tempered Bregman divergence and develop theorems that allow us to unify these updates as gradient descent under a reparameterization. We show the application of the reparameterized updates by proving regret bounds for the special case of reparameterized exponentiated gradient. Finally, we extend the notion of a matching loss to the new tempered Bregman divergence and develop bounded classification loss functions that are significantly more robust to noise and outliers.
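
As a small illustration of the two special cases named above, the following sketch evaluates the standard (untempered) Bregman divergence D_F(u, v) = F(u) - F(v) - <grad F(v), u - v> for two textbook choices of the convex function F. It is not taken from the thesis and does not implement the tempered divergence; the function names and the example vectors are hypothetical and chosen only for illustration.

```python
import math

def bregman_divergence(F, grad_F, u, v):
    """Standard Bregman divergence D_F(u, v) = F(u) - F(v) - <grad F(v), u - v>."""
    inner = sum(g * (ui - vi) for g, ui, vi in zip(grad_F(v), u, v))
    return F(u) - F(v) - inner

# F(x) = 0.5 * ||x||^2 yields half the squared Euclidean distance.
def half_sq_norm(x):
    return 0.5 * sum(xi * xi for xi in x)

def half_sq_norm_grad(x):
    return list(x)

# F(x) = sum_i (x_i log x_i - x_i) yields the unnormalized relative entropy
# (requires strictly positive entries).
def neg_entropy(x):
    return sum(xi * math.log(xi) - xi for xi in x)

def neg_entropy_grad(x):
    return [math.log(xi) for xi in x]

# Hypothetical example vectors with positive entries.
u = [0.2, 0.3, 0.5]
v = [0.4, 0.4, 0.2]

d_euclid = bregman_divergence(half_sq_norm, half_sq_norm_grad, u, v)
d_relent = bregman_divergence(neg_entropy, neg_entropy_grad, u, v)

# Closed forms, computed directly for comparison.
direct_euclid = 0.5 * sum((ui - vi) ** 2 for ui, vi in zip(u, v))
direct_relent = sum(ui * math.log(ui / vi) - ui + vi for ui, vi in zip(u, v))

print(d_euclid, direct_euclid)
print(d_relent, direct_relent)
```

Each printed pair should agree up to floating-point error, which shows how one generic definition recovers both distance measures by changing only F.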