
Characteristic function (probability theory)

In probability theory, the characteristic function of any random variable completely defines its probability distribution. On the real line it is given by the following formula, where X is any random variable with the distribution in question:

$\varphi_X(t) = \operatorname{E}\left(e^{itX}\right),$

where t is a real number, i is the imaginary unit, and E denotes the expected value.

If $F_X$ is the cumulative distribution function, then the characteristic function is given by the Riemann–Stieltjes integral

$\operatorname{E}\left(e^{itX}\right) = \int_{-\infty}^{\infty} e^{itx}\,dF_X(x).$

In cases in which there is a probability density function $f_X$, this becomes

$\operatorname{E}\left(e^{itX}\right) = \int_{-\infty}^{\infty} e^{itx} f_X(x)\,dx.$

If $X$ is a vector-valued random variable, one takes the argument $t$ to be a vector and $tX$ to be the dot product $t \cdot X$.

Every probability distribution on $\mathbb{R}$ or on $\mathbb{R}^n$ has a characteristic function, because one is integrating a bounded function over a space whose measure is finite, and for every characteristic function there is exactly one probability distribution.

The characteristic function of a symmetric PDF (that is, one with $p(x)=p(-x)$) is real, because the imaginary components obtained from $x>0$ cancel those from $x<0$.
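This cancellation can be verified numerically. The sketch below (an illustration, not part of the original text) integrates $e^{itx}p(x)$ for the standard normal density, a symmetric choice, and checks that the result is real and matches the known characteristic function $e^{-t^2/2}$; the grid bounds are arbitrary assumptions.

```python
import numpy as np

# Trapezoidal quadrature of e^{itx} p(x) for the standard normal density.
# Grid [-10, 10] with 20001 points is an illustrative choice.
x = np.linspace(-10, 10, 20001)
p = np.exp(-x**2 / 2) / np.sqrt(2 * np.pi)

def cf(t):
    # numerical approximation of the characteristic function at t
    return np.trapz(np.exp(1j * t * x) * p, x)

for t in (0.5, 1.0, 2.0):
    phi = cf(t)
    assert abs(phi.imag) < 1e-8                     # sine terms cancel by symmetry
    assert abs(phi.real - np.exp(-t**2 / 2)) < 1e-6  # matches e^{-t^2/2}
```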

Lévy continuity theorem

The core of the Lévy continuity theorem states that a sequence of random variables $(X_n)_{n=1}^\infty$, where each $X_n$ has characteristic function $\varphi_n$, converges in distribution towards a random variable $X$,

$X_n \xrightarrow{\mathcal{D}} X \qquad\textrm{as}\qquad n \to \infty,$

if

$\varphi_n \quad \xrightarrow{\textrm{pointwise}} \quad \varphi \qquad\textrm{as}\qquad n \to \infty$

and $\varphi(t)$ is continuous at $t=0$; in that case $\varphi$ is the characteristic function of $X$.

The Lévy continuity theorem can be used to prove the weak law of large numbers; see the proof using convergence of characteristic functions.

The inversion theorem

Moreover, there is a bijection between cumulative distribution functions and characteristic functions: two distinct probability distributions never share the same characteristic function.

Given a characteristic function φ, it is possible to reconstruct the corresponding cumulative probability distribution function F:

$F_X(y) - F_X(x) = \lim_{\tau \to +\infty} \frac{1}{2\pi} \int_{-\tau}^{+\tau} \frac{e^{-itx} - e^{-ity}}{it}\, \varphi_X(t)\, dt.$

In general this is an improper integral; the function being integrated may be only conditionally integrable rather than Lebesgue integrable, i.e. the integral of its absolute value may be infinite.

Reference: P. Lévy, Calcul des probabilités, Gauthier-Villars, Paris, 1925, p. 166.
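The inversion formula can be tested numerically. The sketch below (an illustration with arbitrarily chosen grid parameters) evaluates the truncated integral for the standard normal, whose characteristic function is $e^{-t^2/2}$, and compares it with the exact CDF difference computed from the error function.

```python
import numpy as np
from math import erf, sqrt, pi

# Midpoint-rule evaluation of the inversion integral for the standard normal.
# tau and n are illustrative choices; the midpoint grid avoids the removable
# singularity at t = 0.
x, y = -1.0, 1.0
tau, n = 40.0, 400_000
dt = 2 * tau / n
t = (np.arange(n) + 0.5) * dt - tau
phi = np.exp(-t**2 / 2)
integrand = (np.exp(-1j * t * x) - np.exp(-1j * t * y)) / (1j * t) * phi
approx = (integrand.sum() * dt / (2 * pi)).real

def Phi(z):
    # standard normal CDF via the error function
    return 0.5 * (1 + erf(z / sqrt(2)))

assert abs(approx - (Phi(y) - Phi(x))) < 1e-5
```

The integrand here simplifies to $2\sin(t)\,e^{-t^2/2}/t$, which is real and rapidly decaying, so the truncation at $\tau = 40$ costs essentially nothing.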

Bochner-Khinchin theorem

An arbitrary function $\varphi$ is a characteristic function corresponding to some probability law $\mu$ if and only if the following three conditions are satisfied:

(1) $\varphi$ is continuous

(2) $\varphi(0) = 1$

(3) $\varphi$ is a positive definite function (note that this is a complicated condition which is not equivalent to $\varphi > 0$).

Uses of characteristic functions

Because of the continuity theorem, characteristic functions are used in the most frequently seen proof of the central limit theorem. The main trick involved in making calculations with a characteristic function is recognizing the function as the characteristic function of a particular distribution.
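The characteristic-function step of that proof can be sketched numerically. The example below (an illustration; Uniform(0,1) summands and the evaluation point are assumptions, not from the text) checks that the characteristic function of the standardized sum of $n$ i.i.d. draws approaches $e^{-t^2/2}$, which one recognizes as the characteristic function of the standard normal.

```python
import numpy as np

# CF of Uniform(0,1): (e^{it} - 1) / (it); mean 1/2, variance 1/12.
mu, sigma = 0.5, np.sqrt(1.0 / 12.0)

def cf_standardized_sum(t, n):
    # CF of (sum_{i=1}^n U_i - n*mu) / (sigma * sqrt(n))
    s = t / (sigma * np.sqrt(n))
    cf_uniform = (np.exp(1j * s) - 1) / (1j * s)
    return (cf_uniform * np.exp(-1j * mu * s)) ** n  # centre, scale, multiply

t = 1.0
# For large n this is close to the standard normal CF e^{-t^2/2}.
assert abs(cf_standardized_sum(t, 10_000) - np.exp(-t**2 / 2)) < 1e-3
```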

Basic properties

Characteristic functions are particularly useful for dealing with functions of independent random variables. For example, if X1, X2, ..., Xn is a sequence of independent (and not necessarily identically distributed) random variables, and

$S_n = \sum_{i=1}^n a_i X_i,$

where the $a_i$ are constants, then the characteristic function for $S_n$ is given by

$\varphi_{S_n}(t)=\varphi_{X_1}(a_1 t)\varphi_{X_2}(a_2 t)\cdots \varphi_{X_n}(a_n t).$

In particular, $\varphi_{X+Y}(t) = \varphi_X(t)\varphi_Y(t)$. To see this, write out the definition of characteristic function:

$\varphi_{X+Y}(t)=\operatorname{E}\left(e^{it(X+Y)}\right)=\operatorname{E}\left(e^{itX}e^{itY}\right)=\operatorname{E}\left(e^{itX}\right)\operatorname{E}\left(e^{itY}\right)=\varphi_X(t)\varphi_Y(t).$

Observe that the independence of $X$ and $Y$ is required to establish the equality of the third and fourth expressions.

Another special case of interest is when $a_i=1/n$; then $S_n$ is the sample mean. In this case, writing $\overline{X}$ for the mean,

$\varphi_{\overline{X}}(t)=\left(\varphi_X(t/n)\right)^n.$
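Both identities can be checked in closed form. The sketch below uses normal random variables as an illustrative choice (the parameter values are assumptions): the CF of $N(\mu,\sigma^2)$ is $e^{i\mu t - \sigma^2 t^2/2}$, and the identities reduce to the fact that independent normals add means and variances.

```python
import numpy as np

def cf_normal(t, mu, s2):
    # characteristic function of N(mu, s2)
    return np.exp(1j * mu * t - 0.5 * s2 * t**2)

t = np.linspace(-3, 3, 13)

# varphi_{X+Y}(t) = varphi_X(t) varphi_Y(t):
# the sum of independent N(1, 0.5) and N(2, 1.5) is N(3, 2).
assert np.allclose(cf_normal(t, 3.0, 2.0),
                   cf_normal(t, 1.0, 0.5) * cf_normal(t, 2.0, 1.5))

# varphi_{Xbar}(t) = (varphi_X(t/n))^n:
# the mean of n i.i.d. N(mu, s2) draws is N(mu, s2/n).
n, mu, s2 = 8, 1.0, 0.5
assert np.allclose(cf_normal(t, mu, s2 / n), cf_normal(t / n, mu, s2) ** n)
```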

Moments

Characteristic functions can also be used to find moments of a random variable. Provided that the $n$th moment exists, the characteristic function can be differentiated $n$ times and

$\operatorname{E}\left(X^n\right) = i^{-n}\, \varphi_X^{(n)}(0) = i^{-n}\left[\frac{d^n}{dt^n}\varphi_X(t)\right]_{t=0}.$

For example, suppose $X$ has a standard Cauchy distribution. Then $\varphi_X(t)=e^{-|t|}$. This is not differentiable at $t=0$, showing that the Cauchy distribution has no expectation. Moreover, the sample mean $\overline{X}$ of $n$ independent observations has characteristic function $\varphi_{\overline{X}}(t)=\left(e^{-|t|/n}\right)^n=e^{-|t|}$, using the result from the previous section. This is the characteristic function of the standard Cauchy distribution: thus, the sample mean has the same distribution as the population itself.
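A Monte Carlo sketch of the Cauchy fact (sample sizes and the evaluation point are arbitrary choices): the empirical characteristic function of the mean of $n$ standard Cauchy draws stays near $e^{-|t|}$ rather than concentrating as it would for a distribution with finite mean.

```python
import numpy as np

rng = np.random.default_rng(0)
n, reps = 50, 100_000
# means of n = 50 standard Cauchy draws, repeated many times
means = rng.standard_cauchy((reps, n)).mean(axis=1)

t = 1.0
ecf = np.mean(np.exp(1j * t * means))  # empirical characteristic function
# Still approximately e^{-|t|}: averaging did not tighten the distribution.
assert abs(ecf - np.exp(-abs(t))) < 0.02
```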

The logarithm of a characteristic function is a cumulant generating function, which is useful for finding cumulants.

An example

The Gamma distribution with scale parameter $\theta$ and shape parameter $k$ has the characteristic function
$(1 - \theta\,i\,t)^{-k}.$
Now suppose that we have
$X \sim \Gamma(k_1,\theta) \mbox{ and } Y \sim \Gamma(k_2,\theta)$
with X and Y independent from each other, and we wish to know what the distribution of X + Y is. The characteristic functions are
$\varphi_X(t)=(1 - \theta\,i\,t)^{-k_1},\qquad \varphi_Y(t)=(1 - \theta\,i\,t)^{-k_2},$
which by independence and the basic properties of characteristic functions leads to
$\varphi_{X+Y}(t)=\varphi_X(t)\varphi_Y(t)=(1 - \theta\,i\,t)^{-k_1}(1 - \theta\,i\,t)^{-k_2}=\left(1 - \theta\,i\,t\right)^{-(k_1+k_2)}.$
This is the characteristic function of the gamma distribution with scale parameter $\theta$ and shape parameter $k_1 + k_2$, and we therefore conclude
$X+Y \sim \Gamma(k_1+k_2,\theta).$
The result can be extended to $n$ independent gamma-distributed random variables with the same scale parameter:
$\forall i \in \{1,\ldots,n\} : X_i \sim \Gamma(k_i,\theta) \qquad \Rightarrow \qquad \sum_{i=1}^n X_i \sim \Gamma\left(\sum_{i=1}^n k_i,\theta\right).$
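The conclusion can be spot-checked by simulation. The sketch below (illustrative parameter values, not from the text) compares the empirical characteristic function of $X+Y$ against the closed form $(1 - \theta\,i\,t)^{-(k_1+k_2)}$.

```python
import numpy as np

rng = np.random.default_rng(1)
k1, k2, theta, size = 2.0, 3.0, 0.5, 500_000

# numpy's gamma sampler takes (shape, scale)
s = rng.gamma(k1, theta, size) + rng.gamma(k2, theta, size)

t = 0.7
ecf = np.mean(np.exp(1j * t * s))               # empirical CF of X + Y
target = (1 - 1j * theta * t) ** (-(k1 + k2))   # CF of Gamma(k1 + k2, theta)
assert abs(ecf - target) < 0.02
```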

Multivariate characteristic functions

If $X$ is a vector-valued random variable, its characteristic function is defined as

$\varphi_X(t)=\operatorname{E}\left(e^{i\,t\cdot X}\right).$

Here, the dot signifies the vector dot product ($t$ lies in the dual space of $X$).

Example

If $X\sim N(0,\Sigma)$ is a multivariate Gaussian with zero mean, then

$\varphi_X(t)=\operatorname{E}\left(e^{i\,t\cdot X}\right) = \int_{x\in\mathbb{R}^n}\frac{1}{\left|2\pi\Sigma\right|^{1/2}}\, e^{-\frac{1}{2}x^T\Sigma^{-1}x}\, e^{i\,t\cdot x}\,dx = e^{-\frac{1}{2}t^T\Sigma t}.$
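This formula can be checked by simulation. The sketch below (the covariance matrix, sample size, and test vector $t$ are illustrative assumptions) compares the empirical characteristic function of multivariate Gaussian draws against $e^{-\frac{1}{2}t^T\Sigma t}$.

```python
import numpy as np

rng = np.random.default_rng(2)
Sigma = np.array([[2.0, 0.6],
                  [0.6, 1.0]])
X = rng.multivariate_normal(np.zeros(2), Sigma, size=400_000)

t = np.array([0.8, -0.5])
ecf = np.mean(np.exp(1j * (X @ t)))      # empirical CF at t
target = np.exp(-0.5 * t @ Sigma @ t)    # closed form e^{-t^T Sigma t / 2}
assert abs(ecf - target) < 0.02
```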

Matrix-valued random variables

If $X$ is a matrix-valued random variable, then the characteristic function is

$\varphi_X(T)=\operatorname{E}\left(e^{i\,\mathrm{Tr}(XT)}\right).$

Here $\mathrm{Tr}(\cdot)$ is the trace function and the matrix product of $X$ and $T$ is used. Note that the order of multiplication is immaterial: $XT\neq TX$ in general, but $\mathrm{Tr}(XT)=\mathrm{Tr}(TX)$.

Examples of matrix-valued distributions include the Wishart distribution.

Related concepts

Related concepts include the moment-generating function and the probability-generating function. The characteristic function exists for all probability distributions; this is not the case for the moment-generating function.

The characteristic function is closely related to the Fourier transform: the characteristic function of a probability density function $p(x)$ is the complex conjugate of the continuous Fourier transform of $p(x)$ (under the usual sign convention):

$\varphi_X(t) = \langle e^{itX} \rangle = \int_{-\infty}^{\infty} e^{itx}p(x)\,dx = \overline{\left(\int_{-\infty}^{\infty} e^{-itx}p(x)\,dx\right)} = \overline{P(t)},$

where $P(t)$ denotes the continuous Fourier transform of the probability density function $p(x)$. Likewise, $p(x)$ may be recovered from $\varphi_X(t)$ through the inverse Fourier transform:

$p(x) = \frac{1}{2\pi} \int_{-\infty}^{\infty} e^{itx} P(t)\,dt = \frac{1}{2\pi} \int_{-\infty}^{\infty} e^{itx} \overline{\varphi_X(t)}\,dt.$

Indeed, even when the random variable does not have a density, the characteristic function may be seen as the Fourier transform of the measure corresponding to the random variable.
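The conjugate relationship is easy to verify by quadrature for an asymmetric density, where it is non-trivial. The sketch below (the standard exponential density and the grid are illustrative assumptions) checks that the characteristic function $1/(1-it)$ is the complex conjugate of the Fourier transform $P(t)=1/(1+it)$.

```python
import numpy as np

# Standard exponential density p(x) = e^{-x} on x >= 0; grid is illustrative.
x = np.linspace(0, 50, 200_001)
p = np.exp(-x)

t = 1.3
P = np.trapz(np.exp(-1j * t * x) * p, x)    # Fourier transform P(t)
phi = np.trapz(np.exp(1j * t * x) * p, x)   # characteristic function

assert abs(phi - np.conj(P)) < 1e-10        # phi = conj(P)
assert abs(phi - 1 / (1 - 1j * t)) < 1e-4   # closed form 1/(1 - it)
```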

