eScholarship
Open Access Publications from the University of California

Deep Learning in Rectified Gaussian Nets

  • Author(s): Gupta, Shubham
  • Advisor(s): Saul, Lawrence K.
  • et al.
Abstract

We introduce a new family of probabilistic models called Rectified Gaussian Nets (RGNs). RGNs can be thought of as an extension of Deep Boltzmann Machines (DBMs) in which the nodes are real-valued and non-negative rather than binary. A distinguishing feature of RGNs is that the probability density functions $P(\boldsymbol{y}, \boldsymbol{h} \mid \boldsymbol{v})$ and $P(\boldsymbol{h} \mid \boldsymbol{v}, \boldsymbol{y})$ are log-concave, even in deep architectures. Here $\boldsymbol{v}$ is the real-valued input vector, with entries bounded between 0 and 1, and $\boldsymbol{y}$ and $\boldsymbol{h}$ are the real-valued output and hidden vectors, respectively, rectified to be greater than or equal to zero. Because of this property, the most likely values of $\boldsymbol{y}$ and $\boldsymbol{h}$ conditioned on $\boldsymbol{v}$ can be found exactly and efficiently, so MAP estimation is tractable. The same property is useful for learning: the update rule for the network parameters resembles that of Boltzmann Machines, but certain expectations over the nodes of the RGN can be approximated by their MAP estimates. This is only a mild assumption, since the posterior distribution over the nodes is provably unimodal. Hence, unlike DBMs, RGNs can be trained both exactly and efficiently. We also show how the model can be used for generative modeling.
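
To make the tractability claim concrete, the following is a minimal sketch of MAP estimation in a generic rectified Gaussian, not the RGN energy itself (the abstract does not state it). It assumes a density proportional to exp(-E(h)) on h >= 0 with E(h) = 0.5 h^T A h - b^T h and A symmetric positive definite, so the density is log-concave and has a single mode. The names `map_estimate`, `A`, and `b` are hypothetical, and projected gradient descent is one standard solver chosen for illustration.

```python
def map_estimate(A, b, steps=2000):
    """Projected gradient descent for: min_h 0.5*h^T A h - b^T h, subject to h >= 0.

    A is assumed symmetric positive definite (an assumption of this sketch),
    which makes the objective convex and the constrained minimizer unique.
    """
    n = len(b)
    # Step size 1 / (largest absolute row sum); the row sum upper-bounds the
    # largest eigenvalue of a symmetric matrix, so this step is safe.
    lr = 1.0 / max(sum(abs(x) for x in row) for row in A)
    h = [0.0] * n
    for _ in range(steps):
        # Gradient of the energy: A h - b
        grad = [sum(A[i][j] * h[j] for j in range(n)) - b[i] for i in range(n)]
        # Gradient step followed by projection onto the non-negative orthant
        h = [max(0.0, h[i] - lr * grad[i]) for i in range(n)]
    return h


# Toy example: the unconstrained minimizer is [1, -1], which violates h >= 0,
# so the rectified mode lies on the boundary.
A = [[2.0, 1.0], [1.0, 2.0]]
b = [1.0, -1.0]
h_map = map_estimate(A, b)
print(h_map)  # approximately [0.5, 0.0]
```

Because the objective is convex, the iterate converges to the same mode from any starting point, which is the sense in which a unimodal posterior makes the MAP estimate easy to compute.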
