Gupta, Shubham

Deep Learning in Rectified Gaussian Nets

2018

Advisor(s): Saul, Lawrence K.

Abstract

We introduce a new family of probabilistic models called Rectified Gaussian Nets (RGNs). RGNs can be thought of as an extension of Deep Boltzmann Machines (DBMs) in which the nodes are real-valued and non-negative instead of binary. Another distinguishing feature of RGNs is that the probability density functions $P(\mathbf{y}, \mathbf{h} \mid \mathbf{v})$ and $P(\mathbf{h} \mid \mathbf{v}, \mathbf{y})$ are log-concave, even in deep architectures. Here $\mathbf{v}$ is the real-valued input vector, bounded between 0 and 1, and $\mathbf{y}$ and $\mathbf{h}$ are the real-valued output and hidden vectors, respectively, rectified to be greater than or equal to zero. Because of this property, the most likely values of $\mathbf{y}$ and $\mathbf{h}$ conditioned on $\mathbf{v}$ can be found exactly and efficiently, so MAP estimation is tractable. The property is also useful for learning: the update rule for the network parameters resembles that of Boltzmann Machines, but certain expectations over the nodes of the RGN can be approximated by their MAP estimates, which is only a mild assumption because the posterior distribution over the nodes is provably unimodal. Hence, it is possible to train RGNs both exactly and efficiently, unlike DBMs. We also show how this model can be used for generative modeling.
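
To illustrate why log-concavity makes MAP inference tractable, the following is a minimal sketch, not the thesis's actual model. The abstract does not state the RGN energy function, so the sketch assumes a generic rectified Gaussian with energy E(h) = 0.5 hᵀAh − bᵀh over h ≥ 0, where A is positive definite. Under that assumption the negative log-density is convex, and projected gradient descent reaches the unique mode. The names `map_estimate`, `A`, and `b` are hypothetical.

```python
import numpy as np

def map_estimate(A, b, steps=2000):
    """Projected gradient descent for min 0.5*h^T A h - b^T h subject to h >= 0.

    Assumes A is symmetric positive definite, so the problem is convex
    and has a single optimum (the mode of the assumed rectified Gaussian).
    """
    # Step size 1/L, where L (largest eigenvalue of A) bounds the gradient's
    # Lipschitz constant; this guarantees convergence for convex quadratics.
    step = 1.0 / np.linalg.eigvalsh(A).max()
    h = np.zeros_like(b, dtype=float)
    for _ in range(steps):
        grad = A @ h - b                       # gradient of the energy
        h = np.maximum(h - step * grad, 0.0)   # project onto h >= 0
    return h

# Hypothetical two-node example (values chosen for illustration only).
A = np.array([[2.0, 0.5],
              [0.5, 1.0]])
b = np.array([1.0, -1.0])
h_map = map_estimate(A, b)
print(h_map)
```

In this toy case the unconstrained minimizer has a negative second coordinate, so the rectification constraint is active and the mode lies on the boundary of the non-negative orthant. Because the objective is convex, the point found is the global optimum, not just a local one.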


UC San Diego
