On the Crossroads of Scattering Transform and Machine Learning in Image and Signal Processing
eScholarship
Open Access Publications from the University of California

UC Davis

UC Davis Electronic Theses and Dissertations

Abstract

Convolutional neural networks (CNNs) have been effective in solving image and signal processing problems. The scattering transform (ST), in turn, mathematically formalizes some of the properties that have made CNNs successful on these problems. Its network structure is similar to that of a CNN, except that it provides interpretable ST coefficients and does not require a gigantic dataset. The ST network generates a robust representation that is stable to local deformation while keeping essential high-frequency components of an input signal, through a cascade of wavelet convolutions with nonlinear operations followed by averaging. In this dissertation, motivated by the mathematics behind analyticity and monogenicity, we propose new ST networks to solve these problems.
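The cascade described above (wavelet convolution, modulus nonlinearity, then averaging) can be sketched in a few lines of NumPy. This is only an illustrative sketch under stated assumptions, not the networks proposed in the dissertation: the Gaussian band-pass filters, the centre frequencies, and the names `gaussian_bandpass`, `lowpass`, and `scattering_1d` are all hypothetical choices made for the example.

```python
import numpy as np

def gaussian_bandpass(n, center, bandwidth):
    """Frequency response of a Gaussian band-pass filter (assumed stand-in for a wavelet)."""
    freqs = np.fft.fftfreq(n)  # frequencies in cycles per sample
    return np.exp(-0.5 * ((freqs - center) / bandwidth) ** 2)

def lowpass(n, bandwidth):
    """Frequency response of the Gaussian averaging (low-pass) filter."""
    freqs = np.fft.fftfreq(n)
    return np.exp(-0.5 * (freqs / bandwidth) ** 2)

def scattering_1d(x, centers=(0.25, 0.125, 0.0625), q=0.4, avg_bandwidth=0.01):
    """Zeroth-, first-, and second-order scattering coefficients of a 1-D signal.

    `centers` must be listed from high to low frequency; all defaults are
    hypothetical values chosen for illustration.
    """
    n = len(x)
    phi = lowpass(n, avg_bandwidth)
    psis = [gaussian_bandpass(n, c, q * c) for c in centers]

    def average(u):
        # Low-pass filtering in the frequency domain (circular convolution).
        return np.real(np.fft.ifft(np.fft.fft(u) * phi))

    s0 = average(x)
    x_hat = np.fft.fft(x)
    s1, s2 = [], []
    for i, psi1 in enumerate(psis):
        # First layer: wavelet convolution followed by the modulus.
        u1 = np.abs(np.fft.ifft(x_hat * psi1))
        s1.append(average(u1))
        u1_hat = np.fft.fft(u1)
        # Second layer: only lower-frequency wavelets are applied to the envelope.
        for psi2 in psis[i + 1:]:
            u2 = np.abs(np.fft.ifft(u1_hat * psi2))
            s2.append(average(u2))
    return s0, np.array(s1), np.array(s2)
```

Calling `scattering_1d(x)` on a signal of length `n` returns one averaged copy of the signal, one row per first-order filter, and one row per (higher, lower) frequency pair at second order. The modulus moves high-frequency energy down to low frequencies, so the final averaging retains it instead of discarding it.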

We present the novel incorporation of the generalized Morse wavelet into the 1-D ST network (Morse-STN) for music genre classification, in place of the commonly used Morlet wavelet. The reason is that the generalized Morse wavelets form a superfamily of analytic wavelets, whereas the Morlet wavelet is only approximately analytic. On the GTZAN music signal dataset, using the generalized Morse wavelet rather than the Morlet wavelet yields a significant improvement in music genre classification accuracy.
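The distinction between exactly and approximately analytic wavelets can be checked numerically: an analytic wavelet has a Fourier transform that vanishes on negative frequencies. The sketch below uses the standard frequency-domain forms of the two wavelets. The parameter values (`beta`, `gamma`, `omega0`) and the helper names are assumptions for illustration and are not taken from the dissertation.

```python
import numpy as np

def morse_hat(omega, beta=3.0, gamma=3.0):
    """Generalized Morse wavelet in the frequency domain: a * w^beta * exp(-w^gamma) for w > 0."""
    a = 2.0 * (np.e * gamma / beta) ** (beta / gamma)  # normalizes the peak value to 2
    pos = np.clip(omega, 0.0, None)  # keeps the power well defined for negative omega
    return np.where(omega > 0, a * pos ** beta * np.exp(-pos ** gamma), 0.0)

def morlet_hat(omega, omega0=2.0):
    """Morlet wavelet in the frequency domain, including the zero-mean correction term."""
    return (np.exp(-0.5 * (omega - omega0) ** 2)
            - np.exp(-0.5 * omega0 ** 2) * np.exp(-0.5 * omega ** 2))

def negative_energy_fraction(psi_hat, omega):
    """Fraction of the wavelet's energy located on negative frequencies."""
    energy = np.abs(psi_hat) ** 2
    return energy[omega < 0].sum() / energy.sum()

omega = np.linspace(-10.0, 10.0, 4001)
morse_leak = negative_energy_fraction(morse_hat(omega), omega)
morlet_leak = negative_energy_fraction(morlet_hat(omega), omega)
```

Here `morse_leak` is exactly zero by construction, while `morlet_leak` is small but nonzero. The Morlet leakage shrinks as `omega0` grows, which is why the Morlet wavelet is usually described as approximately analytic.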

A new Monogenic Wavelet Scattering Network (MWSN) is also proposed for 2-D texture image classification, replacing the 2-D Morlet wavelet used in the standard 2-D ST network. Our MWSN extracts valuable hierarchical features with interpretable ST coefficients, which help us explain the results. Experiments on the CUReT texture image database illustrate the superior performance of our MWSN over the standard STN. The improvement can be explained by the natural extension of 1-D analyticity to 2-D monogenicity.

Lastly, we apply the proposed ST networks to sonar signal classification. Mine countermeasures (MCM) are crucial for the US Navy, but they rely on accurate mine detection. We synthesize sonar signals using both Gabor and real dolphin signals as sources for sonar classification. The generated synthetic aperture sonar (SAS) signals from the underwater objects are then classified using the new ST feature extractors. The results can be explained by the ST coefficients together with simple interpretable classifiers such as logistic regression and support vector machines, whereas deep learning approaches lack interpretability.
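The extension from 1-D analyticity to 2-D monogenicity mentioned for the MWSN is commonly built on the Riesz transform, which plays the role that the Hilbert transform plays in 1-D. The following is a minimal FFT-based sketch of the monogenic signal of an image; it is not the MWSN itself, and the function names `riesz_transform` and `monogenic_amplitude` are hypothetical.

```python
import numpy as np

def riesz_transform(image):
    """Return the two Riesz components of a 2-D array via the FFT.

    The frequency-domain multipliers are -i * xi_j / |xi| for j = 1, 2.
    """
    rows, cols = image.shape
    fy = np.fft.fftfreq(rows)[:, None]
    fx = np.fft.fftfreq(cols)[None, :]
    radius = np.sqrt(fx ** 2 + fy ** 2)
    radius[0, 0] = 1.0  # avoid 0/0 at the DC term; the numerator is 0 there anyway
    f_hat = np.fft.fft2(image)
    r1 = np.real(np.fft.ifft2(-1j * fx / radius * f_hat))
    r2 = np.real(np.fft.ifft2(-1j * fy / radius * f_hat))
    return r1, r2

def monogenic_amplitude(image):
    """Local amplitude of the monogenic signal (image, R1 image, R2 image)."""
    r1, r2 = riesz_transform(image)
    return np.sqrt(image ** 2 + r1 ** 2 + r2 ** 2)

# A horizontal cosine grating: its first Riesz component is the matching sine,
# so the monogenic amplitude is constant.
x = np.arange(64)
grating = np.ones((64, 1)) * np.cos(2 * np.pi * 8 * x / 64)[None, :]
amplitude = monogenic_amplitude(grating)
```

For the grating, the Riesz transform reduces to a 1-D Hilbert transform along the horizontal axis, so the amplitude is flat in the same way that the modulus of an analytic signal gives a smooth envelope in 1-D.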
