In mathematics and multivariate statistics, the centering matrix is a symmetric and idempotent matrix, which when multiplied with a vector has the same effect as subtracting the mean of the components of the vector from every component.
Definition
The centering matrix of size $n$ is defined as the $n$-by-$n$ matrix

$$C_n = I_n - \tfrac{1}{n}\mathbf{1}\mathbf{1}^\top$$

where $I_n$ is the identity matrix of size $n$, $\mathbf{1}$ is the column vector of $n$ ones, and $^\top$ denotes matrix transpose. For example,
$$C_1 = \begin{bmatrix} 0 \end{bmatrix},$$

$$C_2 = \left[\begin{array}{rr}
\frac{1}{2} & -\frac{1}{2} \\
-\frac{1}{2} & \frac{1}{2}
\end{array}\right],$$

$$C_3 = \left[\begin{array}{rrr}
\frac{2}{3} & -\frac{1}{3} & -\frac{1}{3} \\
-\frac{1}{3} & \frac{2}{3} & -\frac{1}{3} \\
-\frac{1}{3} & -\frac{1}{3} & \frac{2}{3}
\end{array}\right].$$
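The definition can be turned into a few lines of code. The following is a minimal sketch, not a reference implementation: the function name `centering_matrix` is an illustrative choice, and exact rational arithmetic from Python's standard `fractions` module is assumed so that the entries match the matrices above exactly.

```python
from fractions import Fraction

def centering_matrix(n):
    """Return C_n = I_n - (1/n) * 1 1^T as a list of rows (illustrative helper)."""
    return [
        [Fraction(int(i == j)) - Fraction(1, n) for j in range(n)]
        for i in range(n)
    ]

C3 = centering_matrix(3)
for row in C3:
    print([str(x) for x in row])
# ['2/3', '-1/3', '-1/3']
# ['-1/3', '2/3', '-1/3']
# ['-1/3', '-1/3', '2/3']
```

Every diagonal entry is $1 - \tfrac{1}{n}$ and every off-diagonal entry is $-\tfrac{1}{n}$, which is why the sketch needs no explicit matrix product.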
Properties
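Given a column vector $\mathbf{v}$ of size $n$, the centering property of $C_n$ can be expressed as

$$C_n \mathbf{v} = \mathbf{v} - \left(\tfrac{1}{n}\mathbf{1}^\top \mathbf{v}\right)\mathbf{1}$$

where $\tfrac{1}{n}\mathbf{1}^\top \mathbf{v}$ is the mean of the components of $\mathbf{v}$.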
$C_n$ is symmetric positive semi-definite.
$C_n$ is idempotent, so that $C_n^k = C_n$ for $k = 1, 2, \ldots$. Once the mean has been removed, it is zero, and removing it again has no effect.
$C_n$ is singular. The effects of applying the transformation cannot be reversed.
$C_n$ has the eigenvalue 1 with multiplicity $n - 1$ and the eigenvalue 0 with multiplicity 1.
$C_n$ has a nullspace of dimension 1, along the vector $\mathbf{1}$.
$C_n$ is a projection matrix. That is, $C_n \mathbf{v}$ is the orthogonal projection of $\mathbf{v}$ onto the $(n - 1)$-dimensional subspace that is orthogonal to the nullspace spanned by $\mathbf{1}$. (This is the subspace of all $n$-vectors whose components sum to zero.)
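Several of these properties can be checked numerically for a small case. The sketch below is illustrative only: the helpers `centering_matrix`, `matmul` and `matvec` and the sample vector are assumptions made for the example, and a check for one value of $n$ is not a proof.

```python
from fractions import Fraction

def centering_matrix(n):
    """C_n = I_n - (1/n) * 1 1^T as a list of rows (illustrative helper)."""
    return [[Fraction(int(i == j)) - Fraction(1, n) for j in range(n)]
            for i in range(n)]

def matmul(A, B):
    """Product of two matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

def matvec(A, v):
    """Product of a matrix with a vector given as a list."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]

n = 4
C = centering_matrix(n)
v = [Fraction(x) for x in (3, 1, 4, 8)]  # arbitrary illustrative data
mean = sum(v) / n

centered = matvec(C, v)
is_centering = centered == [x - mean for x in v]       # C v = v - mean * 1
is_symmetric = C == [list(col) for col in zip(*C)]     # C = C^T
is_idempotent = matmul(C, C) == C                      # C^2 = C
kills_ones = matvec(C, [Fraction(1)] * n) == [0] * n   # 1 lies in the nullspace
trace = sum(C[i][i] for i in range(n))                 # sum of eigenvalues, n - 1

print(is_centering, is_symmetric, is_idempotent, kills_ones, trace)
# True True True True 3
```

The trace equals $n - 1$, in agreement with the eigenvalue 1 occurring $n - 1$ times and the eigenvalue 0 once.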
Application
Although multiplication by the centering matrix is not a computationally efficient way of removing the mean from a vector, it is a convenient analytical tool that expresses mean removal succinctly. It can be used to remove the mean not only of a single vector, but also of multiple vectors stored in the rows or columns of a matrix. For an $m$-by-$n$ matrix $X$, the left multiplication $C_m X$ removes the mean from each of the $n$ columns, while the right multiplication $X C_n$ removes the mean from each of the $m$ rows.
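The difference between left and right multiplication can be seen on a small matrix. This is a sketch under stated assumptions: the 2-by-3 matrix `X` is made-up data, and `centering_matrix` and `matmul` are illustrative helpers rather than functions from any library.

```python
from fractions import Fraction

def centering_matrix(n):
    """C_n = I_n - (1/n) * 1 1^T as a list of rows (illustrative helper)."""
    return [[Fraction(int(i == j)) - Fraction(1, n) for j in range(n)]
            for i in range(n)]

def matmul(A, B):
    """Product of two matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

# Hypothetical data: m = 2 rows, n = 3 columns.
X = [[Fraction(1), Fraction(2), Fraction(6)],
     [Fraction(3), Fraction(8), Fraction(0)]]
m, n = len(X), len(X[0])

col_centered = matmul(centering_matrix(m), X)  # C_m X: each column now has mean 0
row_centered = matmul(X, centering_matrix(n))  # X C_n: each row now has mean 0

col_sums = [sum(col) for col in zip(*col_centered)]
row_sums = [sum(row) for row in row_centered]
print(col_sums, row_sums)
```

In practice one would subtract the means directly, as the text notes; the matrix products are shown only to mirror the algebra.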
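The centering matrix provides in particular a succinct way to express the scatter matrix, $S = (X - \boldsymbol{\mu}\mathbf{1}^\top)(X - \boldsymbol{\mu}\mathbf{1}^\top)^\top$, of a data sample $X$ whose $n$ columns are the observations, where $\boldsymbol{\mu} = \tfrac{1}{n} X \mathbf{1}$ is the sample mean. Because $X C_n = X - \boldsymbol{\mu}\mathbf{1}^\top$ and $C_n$ is symmetric and idempotent, the centering matrix allows us to express the scatter matrix more compactly as

$$S = X C_n (X C_n)^\top = X C_n C_n X^\top = X C_n X^\top.$$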
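As a small numerical check of the scatter-matrix identity, the sketch below compares the direct definition (subtracting the sample mean from every column and multiplying the result by its transpose) with the compact product $X C_n X^\top$. The data matrix `X` and the helpers `centering_matrix`, `matmul` and `transpose` are assumptions introduced for illustration.

```python
from fractions import Fraction

def centering_matrix(n):
    """C_n = I_n - (1/n) * 1 1^T as a list of rows (illustrative helper)."""
    return [[Fraction(int(i == j)) - Fraction(1, n) for j in range(n)]
            for i in range(n)]

def matmul(A, B):
    """Product of two matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

# Hypothetical sample: n = 3 observations of dimension 2, stored as columns.
X = [[Fraction(1), Fraction(2), Fraction(6)],
     [Fraction(3), Fraction(8), Fraction(1)]]
n = len(X[0])

# Direct definition: subtract the sample mean mu from every column.
mu = [sum(row) / n for row in X]
D = [[x - mu_i for x in row] for row, mu_i in zip(X, mu)]
S_direct = matmul(D, transpose(D))

# Compact form: S = X C_n X^T.
S_compact = matmul(matmul(X, centering_matrix(n)), transpose(X))

print(S_direct == S_compact)
# True
```

For this sample both routes give the matrix with rows (14, -11) and (-11, 26).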