Efficiency (statistics)

In statistics, efficiency is one measure of the desirability of an estimator. The efficiency of an unbiased estimator $T$ of a parameter $\theta$ is defined as

$$e(T) = \frac{1/\mathcal{I}(\theta)}{\mathrm{var}(T)}$$

where $\mathcal{I}(\theta)$ is the Fisher information of the sample. Thus $e(T)$ is the minimum possible variance for an unbiased estimator divided by the actual variance of $T$. The Cramér-Rao bound can be used to prove that $e(T) \le 1$:

$$\mathrm{var}(T) \geq \frac{1}{\mathcal{I}(\theta)}$$

and therefore, dividing both sides by $\mathrm{var}(T)$,

$$1 \geq \frac{1/\mathcal{I}(\theta)}{\mathrm{var}(T)} = e(T).$$

Efficient estimator

If an unbiased estimator $T$ of a parameter $\theta \in \Theta$ attains $e(T) = 1$ for all values of the parameter, then the estimator is called efficient.

Equivalently, the estimator achieves equality in the Cramér-Rao inequality for all $\theta \in \Theta$.

An efficient estimator is also the minimum variance unbiased estimator (MVUE). This is because an efficient estimator maintains equality in the Cramér-Rao inequality for all parameter values, which means it attains the minimum variance for all parameter values (the definition of the MVUE). The converse does not hold: the MVUE, even if it exists, is not necessarily efficient, because the minimum variance among unbiased estimators may still lie strictly above the Cramér-Rao bound.

Thus an efficient estimator need not exist, but if it does, it is the MVUE.

Asymptotic efficiency

Some estimators attain efficiency only asymptotically, as the sample size tends to infinity, and are called asymptotically efficient estimators. This can be the case for some maximum likelihood estimators, or for any estimator that attains equality in the Cramér-Rao bound asymptotically.

Examples

Consider a sample of size $N$ drawn from a normal distribution of mean $\mu$ and unit variance (i.e., $x[n] \sim \mathcal{N}(\mu, 1)$).

The sample mean, $\overline{x}$, of the sample $x[0], x[1], \ldots, x[N-1]$, defined as

$$\overline{x} = \frac{1}{N} \sum_{n=0}^{N-1} x[n]$$

has variance $\frac{1}{N}$. This is equal to the reciprocal of the Fisher information of the sample, which is $\mathcal{I}(\mu) = N$ for $N$ independent unit-variance normal observations. The sample mean therefore attains the Cramér-Rao bound and is efficient: its efficiency is unity.
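The Fisher information and the resulting efficiency of the sample mean can be checked directly. The following is a short derivation sketch, assuming the observations are independent; the symbol $\ell$ for the log-likelihood is introduced here for illustration and does not appear elsewhere in the article.

```latex
\begin{align*}
\ell(\mu) &= -\frac{N}{2}\log(2\pi) - \frac{1}{2}\sum_{n=0}^{N-1}\left(x[n]-\mu\right)^2, \\
\frac{\partial \ell}{\partial \mu} &= \sum_{n=0}^{N-1}\left(x[n]-\mu\right), \qquad
\frac{\partial^2 \ell}{\partial \mu^2} = -N, \\
\mathcal{I}(\mu) &= -\mathrm{E}\left[\frac{\partial^2 \ell}{\partial \mu^2}\right] = N, \\
\mathrm{var}\left(\overline{x}\right) &= \frac{1}{N^2}\sum_{n=0}^{N-1}\mathrm{var}\left(x[n]\right) = \frac{1}{N}, \\
e\left(\overline{x}\right) &= \frac{1/\mathcal{I}(\mu)}{\mathrm{var}\left(\overline{x}\right)} = \frac{1/N}{1/N} = 1.
\end{align*}
```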

Now consider the sample median. This is an unbiased and consistent estimator for $\mu$. For large $N$ the sample median is approximately normally distributed with mean $\mu$ and variance $\frac{\pi}{2N}$, that is, approximately $\mathcal{N}\left(\mu, \frac{\pi}{2N}\right)$. The efficiency is thus $\frac{1/N}{\pi/(2N)} = \frac{2}{\pi}$, or about 64%. Note that this is the asymptotic efficiency, that is, the efficiency in the limit as the sample size $N$ tends to infinity. For finite values of $N$ the efficiency is higher than this (for example, a sample size of 3 gives an efficiency of about 74%).
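The efficiency of the sample median can also be estimated by simulation. The following is a minimal Monte Carlo sketch using only the Python standard library; the function name `median_efficiency`, the trial count, and the seed are illustrative choices, not part of the article. It draws repeated samples from a unit-variance normal distribution, computes the variance of the resulting medians, and divides the Cramér-Rao bound 1/n by that variance.

```python
import random
import statistics


def median_efficiency(n, trials=10000, mu=0.0, seed=0):
    """Monte Carlo estimate of (1/n) / var(median) for samples of size n from N(mu, 1)."""
    rng = random.Random(seed)  # fixed seed so repeated runs give the same estimate
    medians = [
        statistics.median(rng.gauss(mu, 1.0) for _ in range(n))
        for _ in range(trials)
    ]
    # With n unit-variance normal observations the Cramér-Rao bound is 1/n.
    return (1.0 / n) / statistics.pvariance(medians)


if __name__ == "__main__":
    for n in (3, 51):
        print(n, round(median_efficiency(n), 3))
```

With enough trials, the estimate for a small sample should sit above the estimate for a larger one, and the latter should approach 2/π as n grows. The estimates carry simulation noise, so they will only match the theoretical values approximately.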

Some practitioners nevertheless prefer the sample median as an estimator of the mean, holding that the loss in efficiency is more than compensated for by its robustness, that is, its insensitivity to outliers.

Relative efficiency

If $T_1$ and $T_2$ are estimators for the parameter $\theta$, then $T_1$ is said to dominate $T_2$ if:

  1. its mean squared error (MSE) is smaller than that of $T_2$ for at least one value of $\theta$, and
  2. its MSE does not exceed that of $T_2$ for any value of $\theta$.

Formally, $T_1$ dominates $T_2$ if

$$\mathrm{E}\left[(T_1 - \theta)^2\right] \leq \mathrm{E}\left[(T_2 - \theta)^2\right]$$

holds for all $\theta$, with strict inequality holding somewhere.

The relative efficiency is defined as

$$e(T_1, T_2) = \frac{\mathrm{E}\left[(T_2 - \theta)^2\right]}{\mathrm{E}\left[(T_1 - \theta)^2\right]}$$

Although $e$ is in general a function of $\theta$, in many cases the dependence drops out; when it does, a value of $e$ greater than one indicates that $T_1$ is preferable, whatever the true value of $\theta$.
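As an illustration of a case where the dependence on the parameter drops out, the sketch below compares the sample mean with the deliberately crude estimator that uses only the first observation, for unit-variance normal data. Their MSEs are 1/n and 1, so the relative efficiency should be close to n for every value of the parameter. The helper names (`mse`, `relative_efficiency`, `first_observation`) and the simulation settings are assumptions made for this example.

```python
import random
import statistics


def mse(estimator, theta, n, trials, rng):
    """Monte Carlo mean squared error of `estimator` on samples of size n from N(theta, 1)."""
    total = 0.0
    for _ in range(trials):
        sample = [rng.gauss(theta, 1.0) for _ in range(n)]
        total += (estimator(sample) - theta) ** 2
    return total / trials


def relative_efficiency(t1, t2, theta, n=10, trials=20000, seed=0):
    """Estimate e(T1, T2) = MSE(T2) / MSE(T1) by simulation."""
    # Two generators with the same seed, so both estimators see identical samples.
    mse1 = mse(t1, theta, n, trials, random.Random(seed))
    mse2 = mse(t2, theta, n, trials, random.Random(seed))
    return mse2 / mse1


def first_observation(sample):
    """Estimator that ignores everything except the first data point."""
    return sample[0]


if __name__ == "__main__":
    for theta in (-5.0, 0.0, 5.0):
        e = relative_efficiency(statistics.fmean, first_observation, theta)
        print(theta, round(e, 2))
```

The printed ratios should be roughly equal across the three parameter values and close to the sample size, which is the situation described above: the sample mean is preferable whatever the true value of the parameter.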
