Fires, rainstorms or insect swarms produce natural sounds made up of rapidly occurring
acoustic events, which we call “sound textures”. Such phenomena have long been studied
by the computational audio community [MS11] and by neuroscientists. From
previous studies, it is well established that sound textures can be synthesized schematically
from statistical models with fairly good results. Here we take a novel approach based on
deep neural networks. Specifically, we use cooperative training of a descriptor and a
generator network, modeled as a convolutional neural network (ConvNet) and a deconvolutional
neural network (DeconvNet), respectively. Across several experiments, we show that
our framework can capture the essence of sound textures and synthesize identifiable natural
sounds.