A diversity index is a statistic intended to measure the biodiversity of an ecosystem. More generally, diversity indices can be used to assess the diversity of any population in which each member belongs to exactly one species. Estimators for diversity indices are likely to be biased, so caution is advisable when comparing similar values.
The species richness $S$ is simply the number of species present in an ecosystem. This index makes no use of relative abundances.
The species evenness describes how equally the individuals are distributed among the species, that is, how similar the relative abundances of the species are.
Simpson's diversity index
If $p_i$ is the fraction of all organisms which belong to the $i$-th species, then Simpson's diversity index is most commonly defined as the statistic

$$D = \sum_{i=1}^{S} p_i^2,$$

where $S$ is the number of species. This quantity was introduced by Edward Hugh Simpson.
If $n_i$ is the number of individuals of species $i$ which are counted, and $N$ is the total number of all individuals counted, then

$$\hat{D} = \frac{\sum_{i=1}^{S} n_i (n_i - 1)}{N (N - 1)}$$

is an estimator for Simpson's index for sampling without replacement.
Note that Simpson's index $D$ satisfies $0 \le D \le 1$, with values near zero corresponding to highly diverse or heterogeneous ecosystems and values near one corresponding to more homogeneous ecosystems. Biologists who find this confusing sometimes use $1/D$ instead; confusingly, this reciprocal quantity is also called Simpson's index. Another response is to redefine Simpson's index as

$$\tilde{D} = 1 - D = 1 - \sum_{i=1}^{S} p_i^2.$$

This quantity is called by statisticians the index of diversity.
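The variants of Simpson's index described above can be compared side by side. The following is a minimal sketch, assuming the data are given as a list of per-species counts; the function name `simpson_indices` and the dictionary keys are illustrative choices, not part of any standard library.

```python
def simpson_indices(counts):
    """Return several Simpson-type indices from per-species counts.

    `counts` is a list of non-negative integers, one per species
    (a hypothetical input format chosen for this sketch).
    """
    counts = [n for n in counts if n > 0]
    total = sum(counts)
    if total < 2:
        raise ValueError("need at least two individuals")

    proportions = [n / total for n in counts]

    # D = sum of squared proportions.
    d = sum(p * p for p in proportions)

    # Estimator for sampling without replacement:
    # sum n_i (n_i - 1) / (N (N - 1)).
    d_hat = sum(n * (n - 1) for n in counts) / (total * (total - 1))

    return {
        "D": d,
        "D_hat": d_hat,
        "reciprocal": 1 / d,   # the 1/D variant
        "complement": 1 - d,   # the 1 - D variant ("index of diversity")
    }


counts = [50, 30, 20]
result = simpson_indices(counts)
print(result)  # D is about 0.38, the complement about 0.62
```

For these counts, `D_hat` is slightly smaller than `D`, which illustrates that the two formulas agree only approximately for finite samples.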
In sociology, psychology and management studies the index is often known as Blau's Index, as it was introduced into the literature by the sociologist Peter Blau.
Shannon's diversity index
Shannon's diversity index is simply the ecologist's name for the communication entropy introduced by Claude Shannon:

$$H = -\sum_{i=1}^{S} p_i \log p_i,$$

where $p_i$ is the fraction of individuals belonging to the $i$-th species and $S$ is the number of species. With the logarithm taken to base 2, the index is measured in bits. This is by far the most widely used diversity index.

The intuitive significance of this index can be described as follows. Suppose we devise binary codewords for each species in our ecosystem, with short codewords used for the most abundant species, and longer codewords for rare species. As we walk around and observe individual organisms, we call out the corresponding codeword. This gives a binary sequence. If we have used an efficient code, we will be able to save some breath by calling out a shorter sequence than would otherwise be the case. If so, the average codeword length we call out as we wander around will be close to the Shannon diversity index measured in bits.
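The codeword picture can be checked numerically. The sketch below uses Huffman coding as one example of an efficient binary code; the text does not name a particular code, so that choice, the function names, and the example proportions are all assumptions made for illustration. For a Huffman code the average codeword length lies between the entropy in bits and the entropy plus one.

```python
import heapq
import math


def shannon_index_bits(proportions):
    """Shannon index H = -sum p_i log2 p_i, in bits."""
    return -sum(p * math.log2(p) for p in proportions if p > 0)


def huffman_code_lengths(proportions):
    """Codeword length of each species under a binary Huffman code."""
    if len(proportions) == 1:
        return [1]
    # Heap entries: (weight, tie_breaker, species indices in this subtree).
    heap = [(p, i, [i]) for i, p in enumerate(proportions)]
    heapq.heapify(heap)
    lengths = [0] * len(proportions)
    next_id = len(proportions)
    while len(heap) > 1:
        w1, _, group1 = heapq.heappop(heap)
        w2, _, group2 = heapq.heappop(heap)
        # Merging two subtrees adds one bit to every codeword inside them.
        for i in group1 + group2:
            lengths[i] += 1
        heapq.heappush(heap, (w1 + w2, next_id, group1 + group2))
        next_id += 1
    return lengths


proportions = [0.5, 0.25, 0.125, 0.125]
lengths = huffman_code_lengths(proportions)
average_length = sum(p * n for p, n in zip(proportions, lengths))
entropy_bits = shannon_index_bits(proportions)
print(lengths, average_length, entropy_bits)
```

In this example the proportions are powers of one half, so the average codeword length and the Shannon index coincide at 1.75 bits; for other abundance patterns the average length is somewhat larger than the index.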
It is possible to write down estimators which attempt to correct for bias in finite sample sizes, but this would be misleading since communication entropy does not really fit expectations based upon parametric statistics. Differences arising from using two different estimators are likely to be overwhelmed by errors arising from other sources. Current best practice tends to use bootstrapping procedures to estimate communication entropy.
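As an illustration of the bootstrapping approach mentioned above, the following sketch resamples the observed individuals with replacement and reports a percentile interval for the Shannon index. It is only one simple variant of the bootstrap; the function names, the 95% level, the number of resamples, and the example data are assumptions chosen for this sketch.

```python
import math
import random
from collections import Counter


def shannon_from_labels(labels):
    """Plug-in Shannon index (natural log) from a list of species labels."""
    total = len(labels)
    return -sum(
        (n / total) * math.log(n / total) for n in Counter(labels).values()
    )


def bootstrap_shannon(labels, n_resamples=1000, seed=0):
    """Return the plug-in estimate and a 95% percentile bootstrap interval."""
    rng = random.Random(seed)
    estimates = sorted(
        shannon_from_labels(rng.choices(labels, k=len(labels)))
        for _ in range(n_resamples)
    )
    lower = estimates[int(0.025 * n_resamples)]
    upper = estimates[int(0.975 * n_resamples) - 1]
    return shannon_from_labels(labels), (lower, upper)


# Hypothetical sample: 100 observed individuals from three species.
sample = ["a"] * 50 + ["b"] * 30 + ["c"] * 20
point, (low, high) = bootstrap_shannon(sample)
print(point, low, high)
```

The width of the interval gives a rough sense of how much two estimated index values must differ before the difference is worth interpreting.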
Shannon himself showed that his communication entropy enjoys some powerful formal properties, and furthermore, it is the unique quantity which does so. These observations are the foundation of its interpretation as a measure of statistical diversity (or "surprise", in the arena of communications). The applications of this quantity go far beyond the one discussed here; see the textbook cited below for an elementary survey of the extraordinary richness of modern information theory.
The Berger-Parker diversity index
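The Berger-Parker diversity index is simply $\max_i p_i$, the proportional abundance of the most abundant species. This is an example of an index which uses only partial information about the relative abundances of the various species in its definition.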
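A minimal sketch of this index, taking the Berger-Parker index to be the proportional abundance of the most abundant species (its usual definition) and assuming the data are per-species counts; the function name is illustrative.

```python
def berger_parker_index(counts):
    """Proportion of individuals belonging to the most abundant species."""
    total = sum(counts)
    if total <= 0:
        raise ValueError("need at least one individual")
    return max(counts) / total


# Only the largest count matters: both communities give the same value.
first = berger_parker_index([50, 30, 20])
second = berger_parker_index([50, 10, 10, 10, 10, 10])
print(first, second)
```

The two example communities differ in richness and evenness but share the same index value, which is what "partial information" means here.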
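Species richness, the Shannon index, Simpson's index, and the Berger-Parker index can all be identified as particular examples of quantities bearing a simple relation to the Rényi entropy

$$H_\alpha = \frac{1}{1 - \alpha} \log \sum_{i=1}^{S} p_i^\alpha$$

for $\alpha$ approaching $0$, $1$, $2$, and $\infty$ respectively, where $p_i$ is the fraction of individuals belonging to the $i$-th species and $S$ is the number of species.

Unfortunately, the powerful formal properties of communication entropy do not generalize to Rényi's entropy, which largely explains the much greater power and popularity of Shannon's index with respect to its competitors.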
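The relation between the Rényi entropy and the four indices can be made concrete with a short numerical check. This is a sketch using natural logarithms; the function name `renyi_entropy` and the example proportions are assumptions, and the limiting cases of order 1 and infinite order are handled explicitly because the general formula is undefined there.

```python
import math


def renyi_entropy(proportions, alpha):
    """Renyi entropy of order alpha (natural log) for species proportions."""
    p = [x for x in proportions if x > 0]
    if alpha == 1:
        # Limit alpha -> 1 is the Shannon entropy.
        return -sum(x * math.log(x) for x in p)
    if math.isinf(alpha):
        # Limit alpha -> infinity depends only on the largest proportion.
        return -math.log(max(p))
    return math.log(sum(x ** alpha for x in p)) / (1 - alpha)


proportions = [0.5, 0.3, 0.2]

richness = math.exp(renyi_entropy(proportions, 0))        # number of species
shannon = renyi_entropy(proportions, 1)                   # Shannon index
simpson = math.exp(-renyi_entropy(proportions, 2))        # sum of p_i^2
berger_parker = math.exp(-renyi_entropy(proportions, math.inf))  # max p_i
print(richness, shannon, simpson, berger_parker)
```

Each index is recovered by a simple transformation (identity, exponential, or exponential of the negative) of the Rényi entropy at the corresponding order.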
- Colinvaux, Paul A. (1973). Introduction to Ecology. Wiley. ISBN 0-471-16498-4.
- Cover, Thomas M.; and Thomas, Joy A. (1991). Elements of Information Theory. Wiley. See chapter 5 for an elaboration of coding procedures described informally above.