Efficient Learning across Multiple Domains with Deep Neural Networks
- Author(s): Guo, Yunhui
- Advisor(s): Rosing, Tajana S
Learning with data from multiple domains is a longstanding topic in machine learning research. In recent years, deep neural networks (DNNs) have shown remarkable performance on a wide range of machine learning tasks. However, how to efficiently use deep neural networks to learn from multiple domains remains largely unexplored. A model that is aware of the relationships between domains can be adapted to new domains with fewer resources and achieve better performance; however, identifying and leveraging this transferable structure is challenging.
In this dissertation, we propose novel methods that enable efficient learning across multiple domains in several scenarios.

First, we address learning across two image domains with deep neural networks. We propose two adaptive methods that allow different images to fine-tune and reuse different residual blocks and convolutional filters of a pre-trained model. Experimental results show that the proposed SpotTune outperforms standard fine-tuning on 12 out of 14 datasets.

Second, we consider the case where the target domain has only a few examples per category, referred to as the cross-domain few-shot problem. We establish a new benchmark for cross-domain few-shot learning and propose a multi-model selection algorithm that achieves an average improvement of 2% over the state-of-the-art approach on this benchmark.

Third, we consider learning with multiple domains simultaneously. We propose a multi-domain learning method based on depthwise separable convolution that achieves the highest score on the Visual Decathlon Challenge while reducing the number of parameters by 50% compared with the state-of-the-art approach. We further propose an efficient multi-domain learning method for distributed training in sensor networks, which reduces communication cost by up to 53% and energy consumption by up to 67% without accuracy degradation compared with conventional approaches.

Finally, we address learning with multiple domains sequentially. We propose an algorithm called mixed stochastic gradient descent (MEGA) that allows the model to maintain performance on old domains while being trained on a new domain. MEGA achieves an average accuracy of 91.21±0.10% on Permuted MNIST, about 2% better than the previous state-of-the-art model, and an average accuracy of 66.12±1.93% on Split CIFAR, about 5% better than the state-of-the-art method.
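To illustrate why depthwise separable convolution reduces parameter counts, consider the standard factorization: a k×k convolution with C_in input and C_out output channels uses k·k·C_in·C_out weights, whereas the depthwise separable version uses k·k·C_in (depthwise) plus C_in·C_out (pointwise) weights. The sketch below uses illustrative layer sizes, not the dissertation's actual architecture:

```python
# Parameter counts for a standard convolution vs. its depthwise
# separable factorization. Kernel/channel sizes are illustrative
# assumptions, not values taken from the dissertation.

def standard_conv_params(k, c_in, c_out):
    # One k x k filter per (input channel, output channel) pair.
    return k * k * c_in * c_out

def depthwise_separable_params(k, c_in, c_out):
    # Depthwise: one k x k filter per input channel.
    # Pointwise: a 1 x 1 convolution that mixes channels.
    return k * k * c_in + c_in * c_out

k, c_in, c_out = 3, 128, 128
std = standard_conv_params(k, c_in, c_out)        # 147456
sep = depthwise_separable_params(k, c_in, c_out)  # 17536
print(f"standard: {std}, separable: {sep}, ratio: {sep / std:.3f}")
```

For a 3×3 layer with 128 channels in and out, the separable form needs roughly 12% of the parameters, which is why sharing the depthwise filters while keeping domain-specific pointwise layers is an attractive multi-domain design.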
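The core idea behind mixing gradients in the sequential setting can be sketched as combining the gradient on the new domain with a gradient computed on a small episodic memory of old-domain examples. The loss-proportional weighting below is a simplified illustration of this idea, not the exact MEGA update rule:

```python
import numpy as np

# Simplified sketch of a mixed-gradient update for continual learning.
# The loss-proportional weighting is an illustrative assumption, not
# the precise balancing scheme used by MEGA.

def mixed_gradient(g_new, g_old, loss_new, loss_old, eps=1e-8):
    # Weight each direction by its current loss, so whichever domain
    # is doing worse gets more influence on the parameter update.
    total = loss_new + loss_old + eps
    return (loss_new / total) * g_new + (loss_old / total) * g_old

g_new = np.array([1.0, 0.0])  # gradient on the new domain
g_old = np.array([0.0, 1.0])  # gradient on the episodic memory
g = mixed_gradient(g_new, g_old, loss_new=3.0, loss_old=1.0)
print(g)  # the new-domain direction dominates when its loss is higher
```

The hedged intuition is that the model descends on the new domain while the old-domain gradient term keeps it from drifting into regions that hurt previously learned domains.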