"Generalized functions" were introduced by Sergei Sobolev in 1935. They were independently introduced in the late 1940s by Laurent Schwartz, who developed a comprehensive theory of distributions.
Basic idea
The basic idea is to identify functions with abstract linear functionals on a space of unproblematic test functions (conventional and well-behaved functions). Operators on distributions can be understood by moving them to the test function.
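Let f : R → R be a locally integrable function, and let φ : R → R be a smooth (that is, infinitely differentiable) function with compact support (i.e., identically zero outside of some bounded set). The function φ is the "test function." We then set
$$\langle f, \varphi \rangle = \int_{\mathbf{R}} f(x)\,\varphi(x)\,dx.$$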
This is a real number which linearly and continuously depends on φ. One can therefore think of the function f as a continuous linear functional on the space which consists of all the "test functions" φ.
Similarly, if P is a probability distribution on the reals and φ is a test function, then
$$\langle P, \varphi \rangle = \int_{\mathbf{R}} \varphi \, dP$$
is a real number that continuously and linearly depends on φ: probability distributions can thus also be viewed as continuous linear functionals on the space of test functions. This notion of "continuous linear functional on the space of test functions" is therefore used as the definition of a distribution.
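The pairing of a locally integrable function with a test function can be approximated numerically. The following is a minimal sketch, not a definitive implementation: the helper names `bump`, `integrate` and `pair` are illustrative assumptions, the test function is the standard bump exp(−1/(1 − x²)) on (−1, 1), and the integral is approximated by a midpoint rule. It checks that the pairing is linear in the test function.

```python
import math

def bump(x):
    """Standard test function: smooth on R, zero outside (-1, 1)."""
    d = 1.0 - x * x
    return math.exp(-1.0 / d) if d > 0.0 else 0.0

def integrate(g, a, b, n=20000):
    """Composite midpoint rule for the integral of g over [a, b]."""
    h = (b - a) / n
    return h * sum(g(a + (i + 0.5) * h) for i in range(n))

def pair(f, phi, a=-1.0, b=1.0):
    """Approximate <f, phi>; [a, b] must contain the support of phi."""
    return integrate(lambda x: f(x) * phi(x), a, b)

f = abs                                  # locally integrable, not differentiable at 0
phi1 = bump
phi2 = lambda x: math.cos(x) * bump(x)   # smooth with compact support, so also a test function

# Linearity in the test function:
# <f, 2*phi1 + 3*phi2> should equal 2*<f, phi1> + 3*<f, phi2>.
lhs = pair(f, lambda x: 2.0 * phi1(x) + 3.0 * phi2(x))
rhs = 2.0 * pair(f, phi1) + 3.0 * pair(f, phi2)
print(lhs, rhs)
```

Only the values of f on the support of φ enter the result, which is why local integrability is all that is required of f.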
Such distributions may be multiplied by real numbers and added together, so they form a real vector space. In general it is not possible to define a multiplication of two distributions, but a distribution may be multiplied by an infinitely differentiable function.
To define the derivative of a distribution, we first consider the case of a differentiable and integrable function f : R → R. If φ is a test function, then we have
$$\int_{\mathbf{R}} f'(x)\,\varphi(x)\,dx = -\int_{\mathbf{R}} f(x)\,\varphi'(x)\,dx$$
using integration by parts (note that φ is zero outside of a bounded set and that therefore no boundary values have to be taken into account). This suggests that if S is a distribution, we should define its derivative S' by
$$\langle S', \varphi \rangle = -\langle S, \varphi' \rangle.$$
It turns out that this is the proper definition: it extends the ordinary definition of the derivative, every distribution becomes infinitely differentiable, and the usual properties of derivatives hold.
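The defining identity for the derivative can be checked numerically. The sketch below is illustrative only: the helper names, the shifted bump used as the test function, and the midpoint rule are all assumptions, not part of the text. It first confirms that the rule agrees with the classical derivative for f = sin, and then applies it to f(x) = |x|, which has no classical derivative at 0 but whose distributional derivative pairs with φ like the sign function.

```python
import math

def bump(x):
    """Standard test function: smooth on R, zero outside (-1, 1)."""
    d = 1.0 - x * x
    return math.exp(-1.0 / d) if d > 0.0 else 0.0

def bump_prime(x):
    """Exact derivative of bump: bump(x) * (-2x / (1 - x^2)^2)."""
    d = 1.0 - x * x
    return bump(x) * (-2.0 * x / (d * d)) if d > 0.0 else 0.0

def integrate(g, a, b, n=30000):
    """Composite midpoint rule for the integral of g over [a, b]."""
    h = (b - a) / n
    return h * sum(g(a + (i + 0.5) * h) for i in range(n))

# Test function supported in [-0.7, 1.3], shifted so it is not symmetric about 0.
phi = lambda x: bump(x - 0.3)
dphi = lambda x: bump_prime(x - 0.3)
A, B = -1.0, 2.0   # interval containing the support of phi

# Smooth case, f = sin: <f', phi> computed classically and via -<f, phi'>.
classical = integrate(lambda x: math.cos(x) * phi(x), A, B)
weak = -integrate(lambda x: math.sin(x) * dphi(x), A, B)

# f(x) = |x|: the rule -<f, phi'> agrees with pairing sign(x) against phi.
sign = lambda x: 1.0 if x > 0 else -1.0
weak_abs = -integrate(lambda x: abs(x) * dphi(x), A, B)
sign_pairing = integrate(lambda x: sign(x) * phi(x), A, B)

print(classical, weak)
print(weak_abs, sign_pairing)
```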
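Example: The Dirac delta (so-called Dirac delta function) is the distribution defined by
$$\langle \delta, \varphi \rangle = \varphi(0).$$
It is the derivative of the Heaviside step function H: for any test function φ,
$$\langle H', \varphi \rangle = -\langle H, \varphi' \rangle = -\int_0^\infty \varphi'(x)\,dx = \varphi(0) = \langle \delta, \varphi \rangle,$$
so H' = δ. The boundary term at infinity vanishes because of compact support. Similarly, the derivative of the Dirac delta is the distribution
$$\langle \delta', \varphi \rangle = -\varphi'(0).$$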
This latter distribution is our first example of a distribution which is neither a function nor a probability distribution.
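A standard check of these definitions is that the Heaviside step function H has the Dirac delta as its distributional derivative, since −∫₀^∞ φ'(x) dx = φ(0) for every test function φ. The sketch below verifies this numerically for one test function; the helper names and the choice of a shifted bump are illustrative assumptions.

```python
import math

def bump(x):
    """Standard test function: smooth on R, zero outside (-1, 1)."""
    d = 1.0 - x * x
    return math.exp(-1.0 / d) if d > 0.0 else 0.0

def bump_prime(x):
    """Exact derivative of bump: bump(x) * (-2x / (1 - x^2)^2)."""
    d = 1.0 - x * x
    return bump(x) * (-2.0 * x / (d * d)) if d > 0.0 else 0.0

def integrate(g, a, b, n=30000):
    """Composite midpoint rule for the integral of g over [a, b]."""
    h = (b - a) / n
    return h * sum(g(a + (i + 0.5) * h) for i in range(n))

phi = lambda x: bump(x - 0.3)          # test function with phi(0) != 0 and phi'(0) != 0
dphi = lambda x: bump_prime(x - 0.3)

heaviside = lambda x: 1.0 if x >= 0 else 0.0

def delta(test):
    """Dirac delta as a functional: <delta, test> = test(0)."""
    return test(0.0)

# <H', phi> is defined as -<H, phi'>, i.e. minus the integral of phi' over [0, infinity).
H_prime_on_phi = -integrate(lambda x: heaviside(x) * dphi(x), -1.0, 2.0)

# <delta', phi> is defined as -<delta, phi'> = -phi'(0).
delta_prime_on_phi = -delta(dphi)

print(H_prime_on_phi, delta(phi))
print(delta_prime_on_phi)
```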
Test functions and distributions
In the sequel, real-valued distributions on an open subset U of R^n will be formally defined. With minor modifications, one can also define complex-valued distributions, and one can replace R^n by any (paracompact) smooth manifold.
The first object to define is the space D(U) of test functions on U. Once this is defined, it is then necessary to equip it with a topology by defining the limit of a sequence of elements of D(U). The space of distributions will then be given as the space of continuous linear functionals on D(U).
Test function space
The space D(U) of test functions on U is defined as follows. A function φ : U → R is said to have compact support if there exists a compact subset K of U such that φ(x) = 0 for all x in U \ K. The elements of D(U) are the infinitely differentiable functions φ : U → R with compact support, also known as bump functions. This is a real vector space. It can be given a topology by defining the limit of a sequence of elements of D(U). A sequence (φ_k) in D(U) is said to converge to φ ∈ D(U) if the following two conditions hold:
There is a compact set K ⊂ U containing the supports of all φ_k:
$$\bigcup_k \operatorname{supp}(\varphi_k) \subset K.$$
For each multi-index α, the sequence of partial derivatives D^α φ_k tends uniformly to D^α φ.
If (K_i) is a countable nested family of compact subsets of U whose interiors cover U, then
$$D(U) = \bigcup_i D_{K_i},$$
where D_{K_i} is the set of all smooth functions with support lying in K_i. The topology on D(U) is the final topology of the family of nested metric spaces D_{K_i}, and so D(U) is an LF-space. The topology is not metrizable by the Baire category theorem, since D(U) is the union of subspaces of the first category in D(U).
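The role of the common compact set in the definition of convergence can be illustrated with a sequence that tends to zero uniformly but whose supports drift away. In the sketch below (an illustration under stated assumptions, with hypothetical helper names), ψ_k(x) = bump(x − k)/k has sup norm e⁻¹/k, yet its pairing with the locally integrable function eˣ grows with k. Such a sequence therefore cannot be allowed to converge to 0 in D(R) if every locally integrable function is to give a continuous functional.

```python
import math

def bump(x):
    """Standard test function: smooth on R, zero outside (-1, 1)."""
    d = 1.0 - x * x
    return math.exp(-1.0 / d) if d > 0.0 else 0.0

def integrate(g, a, b, n=20000):
    """Composite midpoint rule for the integral of g over [a, b]."""
    h = (b - a) / n
    return h * sum(g(a + (i + 0.5) * h) for i in range(n))

def psi(k):
    """psi_k(x) = bump(x - k) / k, supported in [k - 1, k + 1]."""
    return lambda x: bump(x - k) / k

def sup_norm(g, a, b, n=2000):
    """Maximum of |g| over a uniform grid on [a, b]."""
    return max(abs(g(a + i * (b - a) / n)) for i in range(n + 1))

ks = [1, 2, 3, 4, 5]

# The functions shrink uniformly ...
sups = [sup_norm(psi(k), k - 1.0, k + 1.0) for k in ks]

# ... but no single compact set contains all the supports [k - 1, k + 1],
# and the pairing with f(x) = exp(x) does not tend to 0.
pairings = [
    integrate(lambda x, k=k: math.exp(x) * psi(k)(x), k - 1.0, k + 1.0)
    for k in ks
]

print(sups)
print(pairings)
```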
Distributions
A distribution on U is a linear functional S : D(U) → R (or C) such that
$$\lim_{n \to \infty} S(\varphi_n) = S\left(\lim_{n \to \infty} \varphi_n\right)$$
for any convergent sequence (φ_n) in D(U). The space of all distributions on U is denoted by D′(U). Equivalently, the vector space D′(U) is the continuous dual space of the topological vector space D(U).
The dual pairing between a distribution S in D′(U) and a test function φ in D(U) is denoted using angle brackets thus:
$$D'(U) \times D(U) \to \mathbf{R}, \qquad (S, \varphi) \mapsto \langle S, \varphi \rangle.$$
Equipped with the weak-* topology, the space D'(U) is a locally convex topological vector space. In particular, a sequence (S_k) in D'(U) converges to a distribution S if and only if
$$\langle S_k, \varphi \rangle \to \langle S, \varphi \rangle$$
for all test functions φ. This is the case if and only if S_k converges uniformly to S on all bounded subsets of D(U). (A subset E of D(U) is bounded if there exists a compact subset K of U and numbers d_n such that every φ in E has its support in K and has its n-th derivatives bounded by d_n.)
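Convergence of distributions can be illustrated with a sequence of ordinary functions that converges to the Dirac delta. The sketch below assumes the rescaled bumps f_k(x) = k·bump(kx)/C, where C is the integral of the bump, so that each f_k has integral 1 and support [−1/k, 1/k]. The names `mollifier` and `pairing` are illustrative. For a fixed test function φ, the pairings ⟨f_k, φ⟩ approach φ(0).

```python
import math

def bump(x):
    """Standard test function: smooth on R, zero outside (-1, 1)."""
    d = 1.0 - x * x
    return math.exp(-1.0 / d) if d > 0.0 else 0.0

def integrate(g, a, b, n=20000):
    """Composite midpoint rule for the integral of g over [a, b]."""
    h = (b - a) / n
    return h * sum(g(a + (i + 0.5) * h) for i in range(n))

C = integrate(bump, -1.0, 1.0)   # normalising constant, so each f_k integrates to 1

def mollifier(k):
    """f_k(x) = k * bump(k x) / C, supported in [-1/k, 1/k]."""
    return lambda x: k * bump(k * x) / C

phi = lambda x: bump(x - 0.3)    # fixed test function with phi(0) != 0

def pairing(k):
    """Approximate <f_k, phi>; it suffices to integrate over the support of f_k."""
    f_k = mollifier(k)
    return integrate(lambda x: f_k(x) * phi(x), -1.0 / k, 1.0 / k)

# Distance between <f_k, phi> and <delta, phi> = phi(0) for growing k.
errors = [abs(pairing(k) - phi(0.0)) for k in (1, 4, 16, 64)]
print(errors)
```

This checks the convergence for a single test function only; the definition requires it for every φ in D(U).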
Functions as distributions
The function ƒ : U → R is called locally integrable if it is Lebesgue integrable over every compact subset K of U. This is a large class of functions which includes all continuous functions and all L^p functions. The topology on D(U) is defined in such a fashion that any locally integrable function ƒ yields a continuous linear functional on D(U), denoted here by T_ƒ, whose value on the test function φ is given by the Lebesgue integral:
$$\langle T_f, \varphi \rangle = \int_U f\,\varphi\,dx.$$
Conventionally, one abuses notation by identifying T_ƒ with ƒ, provided no confusion can arise, and thus the pairing between ƒ and φ is often written
$$\langle f, \varphi \rangle = \langle T_f, \varphi \rangle.$$
If ƒ and g are two locally integrable functions, then the associated distributions T_ƒ and T_g are the same element of D'(U) if and only if ƒ and g are equal almost everywhere. In a similar manner, every Radon measure μ on U defines an element of D'(U) whose value on the test function φ is ∫φ dμ. As above, it is conventional to abuse notation and write the pairing between a Radon measure μ and a test function φ as ⟨μ, φ⟩.
The test functions are themselves locally integrable, and so define distributions. As such they are dense in D'(U) with respect to the topology on D'(U) in the sense that for any distribution S ∈ D'(U), there is a sequence φ_n ∈ D(U) such that
$$\langle \varphi_n, \psi \rangle \to \langle S, \psi \rangle$$
for all ψ ∈ D(U). This follows at once from the Hahn–Banach theorem, since by an elementary fact about weak topologies the dual of D'(U) with its weak-* topology is the space D(U). This can also be proven more constructively by a convolution argument.
Operations on distributions
Many operations which are defined on smooth functions with compact support can also be defined for distributions. In general, if
$$T : D(U) \to D(U)$$
is a linear mapping of vector spaces which is continuous with respect to the weak-* topology, then it is possible to extend T to a mapping
$$T : D'(U) \to D'(U)$$
by passing to the limit. (This approach works for more general non-linear mappings as well, provided they are assumed to be uniformly continuous.)
In practice, however, it is more convenient to define operations on distributions by means of the transpose (or adjoint transformation). If T : D(U) → D(U) is a continuous linear operator, then the transpose is an operator T* : D(U) → D(U) such that
$$\langle T\varphi, \psi \rangle = \langle \varphi, T^*\psi \rangle$$
for all φ, ψ ∈ D(U). If such an operator T* exists, and is continuous, then the original operator T may be extended to distributions by defining
$$\langle TS, \psi \rangle = \langle S, T^*\psi \rangle.$$
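The extension rule ⟨TS, ψ⟩ = ⟨S, T*ψ⟩ can be sketched with distributions modelled as Python callables that take a test function and return a number. The example below is an illustration under assumptions, not a definitive implementation: T is translation by a fixed distance, whose transpose is translation in the opposite direction by a change of variables, and the names `T`, `T_star`, `extend` and `DIST` are hypothetical. Applying the extended operator to the Dirac delta yields the delta concentrated at the translated point.

```python
import math

def bump(x):
    """Standard test function: smooth on R, zero outside (-1, 1)."""
    d = 1.0 - x * x
    return math.exp(-1.0 / d) if d > 0.0 else 0.0

def integrate(g, a, b, n=40000):
    """Composite midpoint rule for the integral of g over [a, b]."""
    h = (b - a) / n
    return h * sum(g(a + (i + 0.5) * h) for i in range(n))

DIST = 0.5   # translation distance (illustrative value)

def T(phi):
    """Translation operator: (T phi)(x) = phi(x - DIST)."""
    return lambda x: phi(x - DIST)

def T_star(psi):
    """Transpose of T: (T* psi)(x) = psi(x + DIST)."""
    return lambda x: psi(x + DIST)

def extend(transpose):
    """Extend an operator to distributions via <T S, psi> = <S, T* psi>."""
    def on_distributions(S):
        return lambda psi: S(transpose(psi))
    return on_distributions

phi = bump
psi = lambda x: bump((x - 0.2) / 1.5)   # another test function

# Check the transpose identity <T phi, psi> = <phi, T* psi> numerically.
T_phi = T(phi)
T_star_psi = T_star(psi)
lhs = integrate(lambda x: T_phi(x) * psi(x), -4.0, 4.0)
rhs = integrate(lambda x: phi(x) * T_star_psi(x), -4.0, 4.0)

# Extend T to distributions and apply it to the Dirac delta.
delta = lambda test: test(0.0)
shifted_delta = extend(T_star)(delta)   # acts as test -> test(DIST)

print(lhs, rhs)
print(shifted_delta(psi), psi(DIST))
```

Passing the transpose rather than T itself to `extend` mirrors the definition: the extended operator never evaluates T on a distribution directly, it only moves T* onto the test function.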
Differentiation
Suppose that T : D(U) → D(U) is given by the partial derivative
$$T\varphi = \frac{\partial \varphi}{\partial x_k}.$$
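By integration by parts, if φ and ψ are in D(U), then
$$\int_U \frac{\partial \varphi}{\partial x_k}\,\psi\,dx = -\int_U \varphi\,\frac{\partial \psi}{\partial x_k}\,dx,$$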
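so that T* = −T. This is a continuous linear transformation D(U) → D(U). So, if S ∈ D'(U) is a distribution, then the partial derivative of S with respect to the coordinate x_k is defined by the formula
$$\left\langle \frac{\partial S}{\partial x_k}, \varphi \right\rangle = -\left\langle S, \frac{\partial \varphi}{\partial x_k} \right\rangle$$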
for all test functions φ. In this way, every distribution is infinitely differentiable, and the derivative in the direction x_k is a linear operator on D′(U). In general, if α = (α_1, ..., α_n) is an arbitrary multi-index and ∂^α denotes the associated mixed partial derivative operator, the mixed partial derivative ∂^α S of the distribution S ∈ D′(U) is defined by
$$\langle \partial^\alpha S, \varphi \rangle = (-1)^{|\alpha|} \langle S, \partial^\alpha \varphi \rangle \quad \text{for all } \varphi \in D(U).$$
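The rule ⟨∂^α S, φ⟩ = (−1)^{|α|} ⟨S, ∂^α φ⟩ for mixed partial derivatives can be sketched in two variables. The example below is a rough illustration under assumptions: distributions are modelled as callables on test functions, partial derivatives of the test function are approximated by central finite differences, and the names `partial` and `derivative` are hypothetical. It evaluates ∂^α δ on a product of two shifted bumps and compares the result with the exact derivatives of the factors.

```python
import math

def bump(t):
    """Standard test function: smooth on R, zero outside (-1, 1)."""
    d = 1.0 - t * t
    return math.exp(-1.0 / d) if d > 0.0 else 0.0

def bump_prime(t):
    """Exact derivative of bump: bump(t) * (-2t / (1 - t^2)^2)."""
    d = 1.0 - t * t
    return bump(t) * (-2.0 * t / (d * d)) if d > 0.0 else 0.0

def phi(x):
    """Test function on R^2: a product of two shifted bumps."""
    return bump(x[0] - 0.3) * bump(x[1] + 0.2)

def partial(g, k, eps=1e-3):
    """Central-difference approximation of the partial derivative of g in x_k."""
    def dg(x):
        xp, xm = list(x), list(x)
        xp[k] += eps
        xm[k] -= eps
        return (g(xp) - g(xm)) / (2.0 * eps)
    return dg

def derivative(S, alpha):
    """Return the distribution d^alpha S, using (-1)^|alpha| <S, d^alpha test>."""
    sign = (-1) ** sum(alpha)
    def dS(test):
        g = test
        for k, order in enumerate(alpha):
            for _ in range(order):
                g = partial(g, k)
        return sign * S(g)
    return dS

delta = lambda test: test((0.0, 0.0))   # Dirac delta at the origin of R^2

d10 = derivative(delta, (1, 0))(phi)    # |alpha| = 1, so the sign is -1
d11 = derivative(delta, (1, 1))(phi)    # |alpha| = 2, so the sign is +1

# Exact values from the product structure of phi.
expected_10 = -bump_prime(-0.3) * bump(0.2)
expected_11 = bump_prime(-0.3) * bump_prime(0.2)

print(d10, expected_10)
print(d11, expected_11)
```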