Definitions

Multinomial distribution

{{Probability distribution|
` pdf_image  =|`
` cdf_image  =|`
` name       =Multinomial|`
` type       =mass|`
` parameters =$n > 0$ number of trials (integer); $p_1, \ldots, p_k$ event probabilities ($\sum p_i = 1$)|`
support =$X_i \in \{0,\dots,n\}$
$\sum X_i = n$| pdf =$\frac{n!}{x_1!\cdots x_k!} p_1^{x_1} \cdots p_k^{x_k}$|
` cdf        =|`
mean =$\operatorname{E}(X_i) = np_i$|
` median     =|`
` mode       =|`
variance =$\mathrm{Var}(X_i) = n p_i (1-p_i)$
$\mathrm{Cov}(X_i,X_j) = - n p_i p_j$ ($i\neq j$)|
` skewness   =|`
` kurtosis   =|`
` entropy    =|`
mgf =$\left( \sum_{i=1}^k p_i e^{t_i} \right)^n$|
` char       =|`
conjugate =Dirichlet: $\mathrm{Dir}(\alpha+\beta)$| }}

In probability theory, the multinomial distribution is a generalization of the binomial distribution.

The binomial distribution is the probability distribution of the number of "successes" in n independent Bernoulli trials, with the same probability of "success" on each trial. In a multinomial distribution, each of n independent trials results in exactly one of a fixed finite number k of possible outcomes, with probabilities $p_1, \ldots, p_k$ (so that $p_i \ge 0$ for i = 1, ..., k and $\sum_{i=1}^k p_i = 1$). Let the random variable $X_i$ denote the number of times outcome i is observed over the n trials. Then the vector $X=(X_1,\ldots,X_k)$ follows a multinomial distribution with parameters n and p, where $p = (p_1, \ldots, p_k)$.

Specification

Probability mass function

The probability mass function of the multinomial distribution is:

$\begin{align}
f(x_1,\ldots,x_k;n,p_1,\ldots,p_k) & {} = \Pr(X_1 = x_1 \text{ and } \dots \text{ and } X_k = x_k) \\
& {} = \begin{cases} \displaystyle \frac{n!}{x_1!\cdots x_k!}\, p_1^{x_1}\cdots p_k^{x_k}, \quad & \text{when } \sum_{i=1}^k x_i=n \\ 0 & \text{otherwise,} \end{cases}
\end{align}$

for non-negative integers $x_1, \ldots, x_k$.
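As an illustration, the mass function can be evaluated directly from the formula. The sketch below is a minimal Python version; the function name `multinomial_pmf` and its argument order are assumptions made for this example and do not refer to any particular library.

```python
from math import factorial, prod

def multinomial_pmf(x, n, p):
    """Evaluate the multinomial mass function f(x_1..x_k; n, p_1..p_k).

    x: sequence of integer counts, n: number of trials,
    p: sequence of outcome probabilities (hypothetical helper for illustration).
    """
    if len(x) != len(p):
        raise ValueError("x and p must have the same length")
    # The mass is zero unless the counts are non-negative and sum to n.
    if any(xi < 0 for xi in x) or sum(x) != n:
        return 0.0
    # Multinomial coefficient n! / (x_1! ... x_k!), kept in exact integers.
    coeff = factorial(n)
    for xi in x:
        coeff //= factorial(xi)
    # Multiply by p_1^{x_1} ... p_k^{x_k}.
    return coeff * prod(pi ** xi for pi, xi in zip(p, x))
```

For instance, with three equally likely outcomes and n = 3, the call `multinomial_pmf((1, 1, 1), 3, (1/3, 1/3, 1/3))` evaluates 3!/(1!·1!·1!) · (1/3)^3 = 6/27, which is approximately 2/9.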

Properties

The expected number of times outcome i is observed over the n trials is

$\operatorname{E}(X_i) = n p_i.$

The covariance matrix is as follows. Each diagonal entry is the variance of a binomially distributed random variable, and is therefore

$\operatorname{var}(X_i)=np_i(1-p_i).$

The off-diagonal entries are the covariances:

$\operatorname{cov}(X_i,X_j)=-np_i p_j$

for i, j distinct.

All covariances are negative because, for fixed n, an increase in one component of a multinomial vector requires a decrease in another component.

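The covariance matrix is a k × k nonnegative-definite matrix of rank k − 1 (when every $p_i$ is positive), because the components are constrained to sum to n.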

The off-diagonal entries of the corresponding correlation matrix are

$\rho(X_i,X_j) = -\sqrt{\frac{p_i p_j}{ (1-p_i)(1-p_j)}}.$

Note that the sample size drops out of this expression.
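The variance, covariance and correlation formulas above can be checked numerically. The following Python sketch builds the covariance matrix from those formulas and derives a correlation from it; the function names and the example probabilities are assumptions chosen for illustration.

```python
from math import sqrt

def multinomial_cov(n, p):
    """Covariance matrix: n*p_i*(1-p_i) on the diagonal, -n*p_i*p_j elsewhere."""
    k = len(p)
    return [[n * p[i] * (1 - p[i]) if i == j else -n * p[i] * p[j]
             for j in range(k)] for i in range(k)]

def correlation(cov, i, j):
    """Correlation of components i and j computed from a covariance matrix."""
    return cov[i][j] / sqrt(cov[i][i] * cov[j][j])

p = (0.2, 0.3, 0.5)                 # example probabilities (assumed values)
small = multinomial_cov(10, p)      # n = 10
large = multinomial_cov(1000, p)    # n = 1000

# Each row sums to zero (up to rounding), so the all-ones vector lies in the
# null space and the matrix cannot have full rank.
row_sums = [sum(row) for row in small]

# Closed-form correlation, which does not involve n.
closed_form = -sqrt(p[0] * p[1] / ((1 - p[0]) * (1 - p[1])))
```

Under these assumptions, `correlation(small, 0, 1)` and `correlation(large, 0, 1)` both agree with `closed_form` up to floating-point rounding, which illustrates that the number of trials cancels.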

Each of the k components separately has a binomial distribution with parameters n and $p_i$, for the appropriate value of the subscript i.
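The binomial marginal can be verified for a small case by summing the joint mass function over the other components. The sketch below does this for k = 3 in Python; all function names and the example parameters are assumptions introduced for this illustration.

```python
from math import comb, factorial, prod

def multinomial_pmf(x, n, p):
    """Joint multinomial mass function (hypothetical helper)."""
    if any(xi < 0 for xi in x) or sum(x) != n:
        return 0.0
    coeff = factorial(n)
    for xi in x:
        coeff //= factorial(xi)
    return coeff * prod(pi ** xi for pi, xi in zip(p, x))

def marginal_first(x1, n, p):
    """P(X_1 = x1) for k = 3, obtained by summing over all feasible x2 (x3 is then fixed)."""
    return sum(multinomial_pmf((x1, x2, n - x1 - x2), n, p)
               for x2 in range(n - x1 + 1))

def binomial_pmf(x, n, q):
    """Binomial mass function with n trials and success probability q."""
    return comb(n, x) * q ** x * (1 - q) ** (n - x)

n, p = 6, (0.2, 0.5, 0.3)           # example parameters (assumed values)
# Largest gap between the summed joint mass and the binomial mass over x1 = 0..n.
max_gap = max(abs(marginal_first(x1, n, p) - binomial_pmf(x1, n, p[0]))
              for x1 in range(n + 1))
```

Here `max_gap` should be zero up to floating-point rounding, since summing out the second and third components leaves a binomial distribution with parameters n and $p_1$.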

The support of the multinomial distribution is the set $\{(n_1,\dots,n_k)\in \mathbb{N}^{k} \mid n_1+\cdots+n_k=n\}.$ Its number of elements is

$\binom{n+k-1}{k-1} = \left(\!\!\binom{k}{n}\!\!\right),$

the number of n-combinations of a multiset with k types, or multiset coefficient.
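For small n and k the support can be enumerated explicitly and its size compared with the stars-and-bars count $\binom{n+k-1}{k-1}$. The Python sketch below is one way to do so; the generator name `support` and the values n = 4, k = 3 are assumptions chosen for this example.

```python
from math import comb

def support(n, k):
    """Yield every tuple of k non-negative integers that sums to n."""
    if k == 1:
        yield (n,)
        return
    # Fix the first component, then distribute the remainder over k - 1 slots.
    for first in range(n + 1):
        for rest in support(n - first, k - 1):
            yield (first,) + rest

n, k = 4, 3                          # example values (assumed)
points = list(support(n, k))
expected_size = comb(n + k - 1, k - 1)
```

With these values both `len(points)` and `expected_size` equal 15.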

Sampling from a multinomial distribution

First, reorder the parameters $p_1, \ldots, p_k$ so that they are sorted in descending order (this only speeds up computation and is not strictly necessary). Then, for each trial, draw an auxiliary variable $x$ from a uniform $(0,1)$ distribution. The resulting outcome is the component $j = \min\left\{ j' \in \{1,\dots,k\} : \sum_{i=1}^{j'} p_i \ge x \right\}$, that is, the first index at which the cumulative probability reaches $x$. Repeating this for n trials and counting how often each outcome occurs gives one sample from the multinomial distribution.
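The procedure above can be sketched in Python as follows. This is an illustrative version only: the name `sample_multinomial` and the `rng` parameter are assumptions, the optional sorting step is omitted, and the fallback to the last outcome is an added guard against floating-point rounding in the cumulative sums.

```python
import random
from itertools import accumulate

def sample_multinomial(n, p, rng=random):
    """Draw one multinomial sample by running n independent trials.

    n: number of trials, p: outcome probabilities,
    rng: object with a random() method returning a uniform value in [0, 1).
    """
    cumulative = list(accumulate(p))     # partial sums p_1, p_1+p_2, ...
    counts = [0] * len(p)
    for _ in range(n):
        x = rng.random()                 # auxiliary uniform variable
        # First index whose cumulative probability is >= x; if rounding leaves
        # the final partial sum slightly below x, use the last outcome.
        j = next((idx for idx, c in enumerate(cumulative) if c >= x),
                 len(p) - 1)
        counts[j] += 1
    return counts

# Example usage with a seeded generator so the run is reproducible.
counts = sample_multinomial(20000, (0.5, 0.3, 0.2), random.Random(0))
```

The returned counts always sum to n, and for large n each `counts[i] / n` should be close to $p_i$. The linear scan mirrors the description in the text; sorting the probabilities in descending order beforehand would only shorten the average scan.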