eScholarship
Open Access Publications from the University of California

On robust regression with high-dimensional predictors.

  • Author(s): El Karoui, Noureddine; Bean, Derek; Bickel, Peter J; Lim, Chinghway; Yu, Bin; et al.

Published Web Location

https://statistics.berkeley.edu/sites/default/files/tech-reports/812.pdf
No data is associated with this publication.
Abstract

We study regression M-estimates in the setting where $p$, the number of covariates, and $n$, the number of observations, are both large, but $p \le n$. We find an exact stochastic representation for the distribution of $\hat{\beta} = \operatorname{argmin}_{\beta \in \mathbb{R}^p} \sum_{i=1}^{n} \rho(Y_i - X_i'\beta)$ at fixed $p$ and $n$ under various assumptions on the objective function $\rho$ and our statistical model. A scalar random variable whose deterministic limit $r_\rho(\kappa)$ can be studied when $p/n \to \kappa > 0$ plays a central role in this representation. We discover a nonlinear system of two deterministic equations that characterizes $r_\rho(\kappa)$. Interestingly, the system shows that $r_\rho(\kappa)$ depends on $\rho$ through proximal mappings of $\rho$ as well as various aspects of the statistical model underlying our study. Several surprising results emerge. In particular, we show that, when $p/n$ is large enough, least squares becomes preferable to least absolute deviations for double-exponential errors.
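
The abstract refers to proximal mappings of ρ without spelling them out. As a minimal sketch, assuming the standard convention prox_c(ρ)(x) = argmin_y { ρ(y) + (x − y)² / (2c) } (the paper's own normalization may differ), the two losses named in the abstract have well-known closed forms: scaling by 1/(1 + c) for least squares, ρ(y) = y²/2, and soft thresholding for least absolute deviations, ρ(y) = |y|. The function names below are illustrative and do not come from the paper.

```python
import numpy as np


def prox_least_squares(x, c):
    """Closed-form prox of rho(y) = y**2 / 2 with parameter c > 0."""
    return np.asarray(x, dtype=float) / (1.0 + c)


def prox_lad(x, c):
    """Closed-form prox of rho(y) = |y| (soft thresholding) with parameter c > 0."""
    x = np.asarray(x, dtype=float)
    return np.sign(x) * np.maximum(np.abs(x) - c, 0.0)


def prox_numeric(rho, x, c, half_width=10.0, num=200001):
    """Brute-force grid minimiser of rho(y) + (x - y)**2 / (2c).

    Only a sanity check for the closed forms above; it assumes the
    minimiser lies within half_width of the scalar input x.
    """
    grid = np.linspace(x - half_width, x + half_width, num)
    objective = rho(grid) + (x - grid) ** 2 / (2.0 * c)
    return grid[np.argmin(objective)]


if __name__ == "__main__":
    x, c = 3.0, 1.0
    print("least squares:", prox_least_squares(x, c), prox_numeric(lambda y: y**2 / 2, x, c))
    print("LAD:          ", prox_lad(x, c), prox_numeric(np.abs, x, c))
```

The grid search is deliberately crude. It exists only to confirm that each closed form minimises the stated objective, and it is not part of any estimation procedure described in the abstract.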
