Convexity, Classification, and Risk Bounds

Authors: Bartlett, Peter L.1; Jordan, Michael I.1; McAuliffe, Jon D.2

Source: Journal of the American Statistical Association, Volume 101, Number 473, March 2006 , pp. 138-156(19)

Publisher: American Statistical Association

Key:
Free Content - Free Content
New Content - New Content
Subscribed Content - Subscribed Content
Free Trial Content - Free Trial Content

Abstract:

Many of the classification algorithms developed in the machine learning literature, including the support vector machine and boosting, can be viewed as minimum contrast methods that minimize a convex surrogate of the 0–1 loss function. The convexity makes these algorithms computationally efficient. The use of a surrogate, however, has statistical consequences that must be balanced against the computational virtues of convexity. To study these issues, we provide a general quantitative relationship between the risk as assessed using the 0–1 loss and the risk as assessed using any nonnegative surrogate loss function. We show that this relationship gives nontrivial upper bounds on excess risk under the weakest possible condition on the loss function—that it satisfies a pointwise form of Fisher consistency for classification. The relationship is based on a simple variational transformation of the loss function that is easy to compute in many applications. We also present a refined version of this result in the case of low noise, and show that in this case, strictly convex loss functions lead to faster rates of convergence of the risk than would be implied by standard uniform convergence arguments. Finally, we present applications of our results to the estimation of convergence rates in function classes that are scaled convex hulls of a finite-dimensional base class, with a variety of commonly used loss functions.

Keywords: BOOSTING; CONVEX OPTIMIZATION; EMPIRICAL PROCESS THEORY; MACHINE LEARNING; RADEMACHER COMPLEXITY; SUPPORT VECTOR MACHINE

Document Type: Research article

DOI: 10.1198/016214505000000907

Affiliations: 1: Department of Statistics and the Computer Science Division, University of California, Berkeley, CA 94720 2: Department of Statistics, University of California, Berkeley, CA 94720; Department of Statistics, University of Pennsylvania, Philadelphia, PA 19104

* This feature is in beta and some links may initially be displayed as numbers instead of article titles. Clicking on any of the links will take you to the recommended articles, regardless of the display of the link.

The full text article is available for purchase

$23.50 plus tax

 

OR

Back to top

Key:
Free Content - Free Content
New Content - New Content
Subscribed Content - Subscribed Content
Free Trial Content - Free Trial Content
Page Help Click here for Page Help
Shopping cart
Tools
Sign in






Need to register?
Sign up here
Text size: A | A | A | A