A geometric perspective on some topics in statistical learning
- Author(s): Wei, Yuting
- Advisor(s): Wainwright, Martin J
- Guntuboyina, Adityanand
- et al.
Modern science and engineering often generate data sets with a large sample size and a comparably large dimension which puts classic asymptotic theory into question in many ways. Therefore, the main focus of this thesis is to develop a fundamental understanding
of statistical procedures for estimation and hypothesis testing from a non-asymptotic point of view, where both the sample size and problem dimension grow hand in hand. A range of different problems are explored in this thesis, including work on the geometry of hypothesis testing, adaptivity to local structure in estimation, effective methods for shape-constrained problems, and early stopping with boosting algorithms.
Our treatment of these different problems shares the common theme of emphasizing the underlying geometric structure.
To be more specific, in our hypothesis testing problem, the null and alternative are specified by a pair of convex cones. This cone structure makes it possible for a sharp characterization of the behavior of Generalized Likelihood Ratio Test (GLRT) and its optimality property. The problem of planar set estimation based on noisy measurements of its support function, is a non-parametric problem in nature. It is interesting to see that estimators can be constructed such that they are more efficient in the case when the underlying set has a simpler structure, even without knowing the set beforehand. Moreover, when we consider applying boosting algorithms to estimate a function in reproducing kernel Hibert space (RKHS), the optimal stopping rule and the resulting estimator turn out to be determined by the localized complexity of the space.
These results demonstrate that, on one hand, one can benefit from respecting and making use of the underlying structure (optimal early stopping rule for different RKHS); on the other hand, some procedures (such as GLRT or local smoothing estimators) can achieve better performance when the underlying structure is simpler, without prior knowledge of the structure itself.
To evaluate the behavior of any statistical procedure, we follow the classic minimax framework and also discuss about more refined notion of local minimaxity.