eScholarship
Open Access Publications from the University of California

Department of Statistics, UCLA

An Introduction to Ensemble Methods for Data Analysis

Abstract

There is a growing number of new statistical procedures that Leo Breiman (2001b) has called "algorithmic". Coming from work primarily in statistics, applied mathematics, and computer science, these techniques are sometimes linked to "data mining", "machine learning", and "statistical learning". A key idea behind algorithmic methods is that there is no statistical model in the usual sense; no effort is made to represent how the data were generated. And no apologies are made for the absence of a model. Rather, there is some practical data analysis problem to solve, and it is attacked directly with procedures designed specifically for that purpose. If, for example, the goal is to determine which prison inmates are likely to engage in some form of serious misconduct while in prison (Berk and Baek, 2003), there is a classification problem to be addressed. Should the goal be to minimize some function of classification errors, procedures are applied with that minimization problem paramount. There is no need to represent how the data were generated if it is possible to accurately classify inmates by other means.
