The goal of a generative model is to capture the distribution underlying the
data, typically through latent variables. After training, these variables are
often used as a new representation, more effective than the original features
in a variety of learning tasks. However, the representations constructed by
contemporary generative models are usually point-wise deterministic mappings
from the original feature space. Thus, even with representations that are robust to
class-specific transformations, statistically driven models trained on them
cannot generalize well when labeled data is scarce. Inspired by
the stochasticity of the synaptic connections in the brain, we introduce
Energy-based Stochastic Ensembles. These ensembles can learn non-deterministic
representations, i.e., mappings from the feature space to a family of
distributions in the latent space. These mappings are encoded in a distribution
over a (possibly infinite) collection of models. By conditionally sampling
models from the ensemble, we obtain multiple representations for every input
example and effectively augment the data. We propose an algorithm similar to
contrastive divergence for training restricted Boltzmann stochastic ensembles.
Finally, we demonstrate the concept of stochastic representations on a
synthetic dataset and test them in a one-shot learning scenario on
MNIST.
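
As a rough illustration of the conditional-sampling idea, the sketch below parameterizes an ensemble of RBMs by a Gaussian over weight matrices, draws several models for a single input, and collects the resulting hidden-unit activations as multiple stochastic representations of that input. The Gaussian parameterization and all names here are illustrative assumptions, not the paper's actual construction or training procedure.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical ensemble parameters: a Gaussian over RBM weight matrices.
# (Assumed form for illustration only; the paper's ensemble may differ.)
n_visible, n_hidden = 784, 64
W_mean = rng.normal(scale=0.01, size=(n_visible, n_hidden))
W_logstd = np.full((n_visible, n_hidden), -3.0)  # log-std of weight noise
b_hidden = np.zeros(n_hidden)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def sample_model():
    """Draw one RBM from the ensemble by sampling its weight matrix."""
    eps = rng.normal(size=W_mean.shape)
    return W_mean + np.exp(W_logstd) * eps

def stochastic_representations(x, n_samples=5):
    """Map one input to several latent representations, one per sampled model."""
    reps = []
    for _ in range(n_samples):
        W = sample_model()
        reps.append(sigmoid(x @ W + b_hidden))  # hidden-unit activation probabilities
    return np.stack(reps)                        # shape: (n_samples, n_hidden)

# Each labeled example yields several representations,
# effectively augmenting a scarce labeled set.
x = rng.binomial(1, 0.5, size=n_visible).astype(float)
reps = stochastic_representations(x, n_samples=5)
print(reps.shape)  # (5, 64)
```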