# F1 score

In statistics, the F1 score (also F-score or F-measure) is a measure of a test's accuracy. It is computed from the precision p and the recall r of the test: p is the number of correct results divided by the number of all returned results, and r is the number of correct results divided by the number of results that should have been returned. The F1 score can be interpreted as a weighted average of precision and recall (specifically, their harmonic mean); it reaches its best value at 1 and its worst at 0.

The traditional F-measure or balanced F-score (F1 score) is the harmonic mean of precision and recall:

$F_1 = 2 \cdot \frac{\mathrm{precision} \cdot \mathrm{recall}}{\mathrm{precision} + \mathrm{recall}}.$
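
As an illustration, the harmonic-mean formula can be sketched in a few lines of Python. The function name `precision_recall_f1`, the count arguments `tp`, `fp`, `fn`, and the choice to report 0.0 when a denominator is zero are assumptions made for this example, not part of the definition.

```python
def precision_recall_f1(tp: int, fp: int, fn: int) -> tuple[float, float, float]:
    """Return (precision, recall, F1) from true positive, false positive
    and false negative counts."""
    # Assumed convention: a ratio with a zero denominator is reported as 0.0.
    precision = tp / (tp + fp) if (tp + fp) > 0 else 0.0
    recall = tp / (tp + fn) if (tp + fn) > 0 else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    # Harmonic mean of precision and recall.
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# 8 correct results out of 10 returned, with 16 that should have been returned.
p, r, f1 = precision_recall_f1(tp=8, fp=2, fn=8)
print(p, r, round(f1, 4))  # prints: 0.8 0.5 0.6154
```

Because the harmonic mean is pulled toward the smaller of its two arguments, the result (about 0.62) sits below the arithmetic mean of 0.8 and 0.5 (0.65).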

The general formula for non-negative real β is:

$F_\beta = (1 + \beta^2) \cdot \frac{\mathrm{precision} \cdot \mathrm{recall}}{\beta^2 \cdot \mathrm{precision} + \mathrm{recall}}.$

In terms of type I errors (false positives) and type II errors (false negatives), the formula is:

$F_\beta = \frac{(1 + \beta^2) \cdot \mathrm{true\ positive}}{(1 + \beta^2) \cdot \mathrm{true\ positive} + \beta^2 \cdot \mathrm{false\ negative} + \mathrm{false\ positive}}.$
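
The count-based form follows from substituting the definitions of precision and recall into the general formula. A sketch of the algebra, using the abbreviations $P$ and $R$ for precision and recall and $\mathrm{TP}$, $\mathrm{FP}$, $\mathrm{FN}$ for the true positive, false positive and false negative counts (the count abbreviations are introduced here for brevity):

$P = \frac{\mathrm{TP}}{\mathrm{TP} + \mathrm{FP}}, \qquad R = \frac{\mathrm{TP}}{\mathrm{TP} + \mathrm{FN}}$

$P \cdot R = \frac{\mathrm{TP}^2}{(\mathrm{TP} + \mathrm{FP})(\mathrm{TP} + \mathrm{FN})}, \qquad \beta^2 P + R = \frac{\mathrm{TP} \left[\beta^2 (\mathrm{TP} + \mathrm{FN}) + (\mathrm{TP} + \mathrm{FP})\right]}{(\mathrm{TP} + \mathrm{FP})(\mathrm{TP} + \mathrm{FN})}$

$F_\beta = (1 + \beta^2) \cdot \frac{P \cdot R}{\beta^2 P + R} = \frac{(1 + \beta^2)\,\mathrm{TP}}{(1 + \beta^2)\,\mathrm{TP} + \beta^2\,\mathrm{FN} + \mathrm{FP}}$

The factor $\beta^2$ ends up on the false negatives because it multiplies precision in the denominator of the general formula, and the cross term it produces there is the recall denominator $\mathrm{TP} + \mathrm{FN}$.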

Two other commonly used F measures are the $F_2$ measure, which weights recall twice as much as precision, and the $F_{0.5}$ measure, which weights precision twice as much as recall.
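
To make the effect of β concrete, the following minimal Python sketch evaluates the general formula for β = 0.5, 1 and 2 on one precision/recall pair. The function name `f_beta`, the sample values, and the zero-denominator convention are assumptions chosen for illustration.

```python
def f_beta(precision: float, recall: float, beta: float = 1.0) -> float:
    """General F-measure computed from precision and recall."""
    b2 = beta ** 2
    denom = b2 * precision + recall
    # Assumed convention: report 0.0 when the denominator vanishes.
    if denom == 0:
        return 0.0
    return (1 + b2) * precision * recall / denom


# A test with low precision but perfect recall (illustrative values).
precision, recall = 0.5, 1.0
for beta in (0.5, 1.0, 2.0):
    print(f"beta={beta}: {f_beta(precision, recall, beta):.4f}")
# beta=0.5: 0.5556
# beta=1.0: 0.6667
# beta=2.0: 0.8333
```

Since recall is the stronger of the two quantities in this example, the score rises as β grows and recall is given more weight.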

The F-measure was derived by van Rijsbergen (1979) so that $F_\beta$ "measures the effectiveness of retrieval with respect to a user who attaches β times as much importance to recall as precision". It is based on van Rijsbergen's effectiveness measure $E = 1 - \frac{1}{\frac{\alpha}{P} + \frac{1-\alpha}{R}}$. Their relationship is $F_\beta = 1 - E$, where $\alpha = \frac{1}{\beta^2 + 1}$.
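
The relationship can be checked directly, taking $P$ and $R$ to be precision and recall. With $\alpha = \frac{1}{\beta^2 + 1}$, and therefore $1 - \alpha = \frac{\beta^2}{\beta^2 + 1}$:

$\frac{\alpha}{P} + \frac{1 - \alpha}{R} = \frac{1}{\beta^2 + 1} \left(\frac{1}{P} + \frac{\beta^2}{R}\right) = \frac{\beta^2 P + R}{(\beta^2 + 1)\, P\, R}$

$1 - E = \frac{1}{\frac{\alpha}{P} + \frac{1 - \alpha}{R}} = \frac{(1 + \beta^2)\, P\, R}{\beta^2 P + R} = F_\beta$

For $\beta = 1$ this gives $\alpha = \frac{1}{2}$, the case in which precision and recall are weighted equally.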

