|
NeuroCOLT
Technical Report NC-TR-96-011
Worst-case Quadratic
Loss Bounds for Prediction Using Linear Functions and Gradient Descent
Nicolò
Cesa-Bianchi
DSI
Università di Milano
Philip
M. Long
Duke University
Manfred
K. Warmuth
UC Santa Cruz
Abstract
In this paper we study the performance of gradient descent when applied
to the problem of on-line linear prediction in arbitrary inner product
spaces. We show worst-case bounds on the sum of the squared prediction
errors under various assumptions concerning the amount of a
priori information about the sequence to predict. The algorithms
we use are variants and extensions of on-line gradient descent. Whereas
our algorithms always predict using linear functions as hypotheses,
none of our results requires the data to be linearly related. In fact,
the bounds proved on the total prediction loss are typically expressed
as a function of the total loss of the best fixed linear predictor
with bounded norm. All the upper bounds are tight to within constants.
Matching lower bounds are provided in some cases. Finally, we apply
our results to the problem of on-line prediction for classes of smooth
functions.
Download Compressed
Postscript
|