A study of the Exponentiated Gradient +/- algorithm for stochastic optimization of neural networks

2019

Abstract

Exponentiated Gradient +/- (EG+-) is a gradient update algorithm drawn from work by Manfred Warmuth (Kivinen and Warmuth, 1997) in the online learning setting. This thesis ports the algorithm to deep neural networks and analyzes its fitness in that context compared with current state-of-the-art gradient update methods. Existing methods employ an additive update scheme, whereby some fraction of the gradient is added to the weight values at each iteration of gradient descent. EG+- instead provides a multiplicative update scheme, whereby each weight is multiplied by a factor that depends on the gradient, and the result is then normalized to give the updated weight. EG+- is motivated by relative entropy regularization. This thesis analyzes various properties and experimental results of the algorithm in comparison with other update methods, and evaluates EG+- in the context of state-of-the-art residual networks and challenging vision problems. Three published implementations are evaluated experimentally, and the results demonstrate that EG+- performs better than SGD when there are many noisy features, and that it compares well with commonly used state-of-the-art gradient descent optimization methods. EG+- also performs better than most SGD-based optimizers under black-box adversarial attacks, with the exception of SGD without momentum, with which it performs similarly.
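
The contrast between the additive and multiplicative schemes can be illustrated with a minimal NumPy sketch. It assumes the standard EG+- formulation from Kivinen and Warmuth (1997), in which each weight is the difference of two positive parts that are scaled by exponentials of the gradient and renormalized to a fixed total. The function names, the learning rate `lr`, and the `total` weight budget are illustrative assumptions and are not taken from the thesis implementations.

```python
import numpy as np


def sgd_step(w, grad, lr):
    """Additive update: move the weights by a fraction of the gradient."""
    return w - lr * grad


def egpm_step(w_pos, w_neg, grad, lr, total=1.0):
    """One EG+- step (sketch; names and `total` are hypothetical).

    The effective weight is w = w_pos - w_neg, with both parts positive.
    Each part is multiplied by an exponential of the scaled gradient,
    then both are rescaled so that all entries sum to `total`.
    """
    r_pos = w_pos * np.exp(-lr * grad)  # shrinks where the gradient is positive
    r_neg = w_neg * np.exp(lr * grad)   # grows where the gradient is positive
    scale = total / (r_pos.sum() + r_neg.sum())
    return scale * r_pos, scale * r_neg


# Toy example: three weights, all starting at zero (w_pos == w_neg).
w_pos = np.full(3, 1.0 / 6.0)
w_neg = np.full(3, 1.0 / 6.0)
grad = np.array([1.0, 0.0, -1.0])

w_pos, w_neg = egpm_step(w_pos, w_neg, grad, lr=0.5)
w = w_pos - w_neg            # effective weights after the multiplicative step
w_sgd = sgd_step(np.zeros(3), grad, lr=0.5)  # additive step from the same start
```

Both steps move each weight against the sign of its gradient, but the EG+- parts stay strictly positive and their sum stays fixed at `total`, which is the normalization referred to above.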

UC Santa Cruz