Towards efficient, effective, and robust Neural Architecture Search methods
- Wang, Ruochen
- Advisor(s): Hsieh, Cho-Jui
Abstract
Recently, Neural Architecture Search (NAS) has attracted significant attention for its potential to democratize deep learning. For a practical end-to-end deep learning platform, NAS plays a crucial role in discovering task-specific architectures based on the user's configuration (e.g., dataset, evaluation metric). Among various search paradigms, Differentiable Neural Architecture Search is one of the most popular NAS methods owing to its search efficiency and simplicity: it jointly optimizes the model weights and architecture parameters in a weight-sharing supernet via gradient-based algorithms. At the end of the search phase, the operations with the largest architecture parameters are selected to form the final architecture, under the implicit assumption that the values of the architecture parameters reflect the strength of the corresponding operations. Despite its search efficiency, the weight-sharing supernet tends to favor non-parametric operations, resulting in shallow architectures with degraded performance. We provide both theoretical and empirical analysis of the poor generalization observed in Differentiable NAS, linking this issue to the failure of magnitude-based selection. Building on this analysis, we discuss two lines of methods that greatly improve the effectiveness and robustness of Differentiable NAS: the first proposes an alternative, perturbation-based architecture selection that is shown to identify better architectures in the search space, whereas the second aligns the architecture parameters with the strength of the underlying operations. To complete the picture, an alternative paradigm to differentiable architecture search (predictor-based NAS) is also presented.
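The contrast between the two selection rules mentioned above can be illustrated with a minimal sketch, not taken from the dissertation's code: magnitude-based selection keeps the operation with the largest architecture parameter on each edge, while perturbation-based selection removes one operation at a time from the supernet and keeps the one whose removal hurts validation accuracy the most. The operation list, the `alpha` values, and the `supernet_val_acc` function below are hypothetical stand-ins (a toy scoring surrogate); in a real DARTS-style setup the latter would evaluate the trained weight-sharing supernet on held-out data, typically with a brief fine-tuning step after each discretization.

```python
# Sketch: magnitude-based vs. perturbation-based architecture selection.
# All names and values here are illustrative, not from the dissertation.
import numpy as np

OPS = ["skip_connect", "sep_conv_3x3", "max_pool_3x3", "dil_conv_5x5"]

rng = np.random.default_rng(0)
# Architecture parameters (one row per edge), as learned by differentiable NAS.
alpha = rng.normal(size=(2, len(OPS)))
# Hidden "true" operation strengths, used only by the toy surrogate below.
true_strength = rng.uniform(0.5, 1.0, size=alpha.shape)

def softmax(x):
    z = np.exp(x - x.max(axis=-1, keepdims=True))
    return z / z.sum(axis=-1, keepdims=True)

def supernet_val_acc(op_mask):
    """Hypothetical stand-in for evaluating the supernet with some
    operations masked out (op_mask[e, o] == 0 removes op o on edge e)."""
    return float((op_mask * true_strength).sum() / op_mask.size)

# --- Magnitude-based selection (standard discretization) --------------------
magnitude_choice = softmax(alpha).argmax(axis=-1)

# --- Perturbation-based selection --------------------------------------------
# On each edge, remove one operation at a time and keep the operation whose
# removal causes the largest drop in supernet validation accuracy.
full_mask = np.ones_like(alpha)
base_acc = supernet_val_acc(full_mask)
perturbation_choice = []
for e in range(alpha.shape[0]):
    drops = []
    for o in range(alpha.shape[1]):
        mask = full_mask.copy()
        mask[e, o] = 0.0
        drops.append(base_acc - supernet_val_acc(mask))
    perturbation_choice.append(int(np.argmax(drops)))

print("magnitude-based:   ", [OPS[i] for i in magnitude_choice])
print("perturbation-based:", [OPS[i] for i in perturbation_choice])
```

In this toy setting the two rules can disagree, which mirrors the abstract's point: the magnitude of an architecture parameter is not guaranteed to track how much the operation actually contributes to supernet accuracy.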