The smoothing spline is a method of smoothing (fitting a smooth curve to a set of noisy observations).
Definition
Let $(x_i, Y_i)$, $i = 1, \dots, n$, be a sequence of observations, modeled by the relation $E(Y_i) = \mu(x_i)$. The smoothing spline estimate $\hat\mu$ of the function $\mu$ is defined to be the minimizer (over the class of twice differentiable functions) of
$$\sum_{i=1}^n (Y_i - \hat\mu(x_i))^2 + \lambda \int \hat\mu''(x)^2 \, dx.$$
Remarks:
- $\lambda \ge 0$ is a smoothing parameter, controlling the trade-off between fidelity to the data and roughness of the function estimate.
- The integral is evaluated over the range of the $x_i$.
- As $\lambda \to 0$ (no smoothing), the smoothing spline converges to the interpolating spline.
- As $\lambda \to \infty$ (infinite smoothing), the roughness penalty becomes paramount and the estimate converges to a linear least-squares estimate.
- The roughness penalty based on the second derivative is the most common in modern statistics literature, although the method can easily be adapted to penalties based on other derivatives.
- In early literature, with equally spaced $x_i$, second- or third-order differences were used in the penalty, rather than derivatives.
- When the sum-of-squares term is replaced by a log-likelihood, the resulting estimate is termed penalized likelihood. The smoothing spline is the special case of penalized likelihood resulting from a Gaussian likelihood.
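For equally spaced $x_i$, the second-order difference version of the criterion mentioned in the remarks can be sketched as follows. The symbol $m_i$, standing for the fitted value at $x_i$, is notation assumed here for illustration rather than taken from the early literature.

```latex
\sum_{i=1}^n (Y_i - m_i)^2
  + \lambda \sum_{i=2}^{n-1} \left( m_{i+1} - 2 m_i + m_{i-1} \right)^2
```

With unit spacing, $m_{i+1} - 2 m_i + m_{i-1}$ is the standard finite-difference approximation to the second derivative at $x_i$, so the second sum plays the role of the integral penalty.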
Derivation of the smoothing spline
It is useful to think of fitting a smoothing spline in two steps:
- First, derive the values $\hat\mu(x_i)$, $i = 1, \dots, n$.
- From these values, derive $\hat\mu(x)$ for all $x$.
Now, treat the second step first.
Given the vector $\hat m = (\hat\mu(x_1), \dots, \hat\mu(x_n))^T$ of fitted values, the sum-of-squares part of the spline criterion is fixed. It remains only to minimize $\int \hat\mu''(x)^2 \, dx$, and the minimizer is a natural cubic spline that interpolates the points $(x_i, \hat\mu(x_i))$. This interpolating spline is a linear operator, and can be written in the form
$$\hat\mu(x) = \sum_{i=1}^n \hat\mu(x_i) f_i(x),$$
where the $f_i(x)$ are a set of spline basis functions. As a result, the roughness penalty has the form
$$\int \hat\mu''(x)^2 \, dx = \hat m^T A \hat m,$$
where the elements of $A$ are $A_{ij} = \int f_i''(x) f_j''(x) \, dx$. The basis functions, and hence the matrix $A$, depend on the configuration of the predictor variables $x_i$, but not on the responses $Y_i$ or $\hat m$.
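The quadratic form of the penalty follows from substituting the basis expansion into the integral. A short sketch of that step, writing $\hat m_i$ for $\hat\mu(x_i)$:

```latex
\int \hat\mu''(x)^2 \, dx
  = \int \Bigl( \sum_{i=1}^n \hat m_i f_i''(x) \Bigr)^2 dx
  = \sum_{i=1}^n \sum_{j=1}^n \hat m_i \hat m_j \int f_i''(x) f_j''(x) \, dx
  = \hat m^T A \hat m
```

Because the entries of $A$ are integrals of products $f_i'' f_j''$, the matrix is symmetric, and because the left-hand side is never negative, it is positive semidefinite.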
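Now back to the first step. The penalized sum-of-squares can be written as
$$\|Y - \hat m\|^2 + \lambda \hat m^T A \hat m,$$
where $Y = (Y_1, \dots, Y_n)^T$. Minimizing over $\hat m$ (setting the gradient to zero gives $(I + \lambda A)\hat m = Y$) yields
$$\hat m = (I + \lambda A)^{-1} Y.$$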
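The closed form can be illustrated numerically with a minimal sketch. It does not build the spline-basis matrix A described above. As a labeled assumption, it substitutes the second-difference penalty A = D^T D, the discrete analogue for equally spaced points mentioned in the remarks. All function names and the sample data are hypothetical, and only the Python standard library is used.

```python
def second_difference_penalty(n):
    """Return A = D^T D, where D takes second differences of an n-vector.

    Assumption: this stands in for the spline penalty matrix A and is
    only appropriate for equally spaced design points.
    """
    A = [[0.0] * n for _ in range(n)]
    for r in range(n - 2):
        # Row r of D has entries (1, -2, 1) in columns r, r+1, r+2.
        cols = ((r, 1.0), (r + 1, -2.0), (r + 2, 1.0))
        for i, di in cols:
            for j, dj in cols:
                A[i][j] += di * dj
    return A


def solve(M, b):
    """Solve M z = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [bi] for row, bi in zip(M, b)]
    for k in range(n):
        p = max(range(k, n), key=lambda r: abs(aug[r][k]))
        aug[k], aug[p] = aug[p], aug[k]
        for r in range(k + 1, n):
            f = aug[r][k] / aug[k][k]
            for c in range(k, n + 1):
                aug[r][c] -= f * aug[k][c]
    z = [0.0] * n
    for k in range(n - 1, -1, -1):
        s = sum(aug[k][c] * z[c] for c in range(k + 1, n))
        z[k] = (aug[k][n] - s) / aug[k][k]
    return z


def smooth(y, lam):
    """Fitted values m solving (I + lam * A) m = y."""
    n = len(y)
    A = second_difference_penalty(n)
    M = [[(1.0 if i == j else 0.0) + lam * A[i][j] for j in range(n)]
         for i in range(n)]
    return solve(M, y)


def least_squares_line(y):
    """Ordinary least-squares line through (0, y[0]), ..., (n-1, y[n-1])."""
    n = len(y)
    xbar = (n - 1) / 2
    ybar = sum(y) / n
    sxy = sum((i - xbar) * (yi - ybar) for i, yi in enumerate(y))
    sxx = sum((i - xbar) ** 2 for i in range(n))
    return [ybar + (sxy / sxx) * (i - xbar) for i in range(n)]


# Made-up observations at equally spaced points x = 0, 1, ..., 6.
y = [0.1, 0.9, 2.3, 2.8, 4.2, 4.9, 6.1]
no_smoothing = smooth(y, 0.0)      # lam = 0: the fit reproduces the data
heavy_smoothing = smooth(y, 1e6)   # large lam: close to the fitted line
line = least_squares_line(y)
```

Under these assumptions the two limiting cases from the remarks are visible directly: `no_smoothing` matches `y`, and `heavy_smoothing` is close to `line`, since straight lines are exactly the sequences with zero second differences.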
Related methods
Smoothing splines are related to, but distinct from:
- Regression splines. In this method, the data is fitted to a set of spline basis functions with a reduced set of knots, typically by least squares. No roughness penalty is used.
- Penalized splines. This method combines the reduced knots of regression splines with the roughness penalty of smoothing splines.
Further reading
- Wahba, G. (1990). Spline Models for Observational Data. SIAM, Philadelphia.
- Green, P. J. and Silverman, B. W. (1994). Nonparametric Regression and Generalized Linear Models. CRC Press.