A Study of Generalization in Deep Neural Networks
One of the central challenges in deep learning today is explaining the remarkable generalization ability of deep neural networks (DNNs): how they avoid the curse of dimensionality and perform exceptionally well on tasks such as computer vision, natural language processing, and, more recently, physical problems like protein folding. Various bounds on the generalization error of DNNs have been proposed; however, all of them have been shown empirically to be numerically vacuous. In this study we approach the problem of understanding the generalization of DNNs by investigating the role of different network attributes, both structural (such as width, depth, kernel parameters, and skip connections) and functional (such as intermediate feature representations and the receptive fields of CNN kernels), in affecting generalization. We present experimental results showing which of these properties influence generalization the most and which the least, and discuss the findings in relation to the theoretical bounds.