Stochastic Calculus of Variations

Calculus of variations

Calculus of variations is a field of mathematics that deals with functionals, as opposed to ordinary calculus which deals with functions. Such functionals can for example be formed as integrals involving an unknown function and its derivatives. The interest is in extremal functions: those making the functional attain a maximum or minimum value.

Perhaps the simplest example of such a problem is to find the curve of shortest length connecting two points. If there are no constraints, the solution is obviously a straight line between the points. However, if the curve is constrained to lie on a surface in space, then the solution is less obvious, and possibly many solutions may exist. Such solutions are known as geodesics. A related problem is posed by Fermat's principle: light follows the path of shortest optical length connecting two points, where the optical length depends upon the material of the medium. One corresponding concept in mechanics is the principle of least action.

Many important problems involve functions of several variables. Solutions of boundary value problems for the Laplace equation satisfy the Dirichlet principle. Plateau's problem requires finding a surface of minimal area that spans a given contour in space: the solution or solutions may be found by dipping a wire frame in a solution of soap suds. Although such experiments are relatively easy to perform, their mathematical interpretation is far from simple: there may be more than one locally minimizing surface, and they may have non-trivial topology.

Weak and strong extrema

Recall that the supremum norm for real continuous functions on a topological space X is defined as
\|y\|_\infty = \sup\{|y(x)| : x \in X\}.

A functional J(y) defined on some appropriate space of functions V with norm ‖·‖_V is said to have a weak extremum at the point y_0 if there exists some δ > 0 such that, for all functions y with

\|y - y_0\|_V < \delta,

the difference J(y_0) - J(y) has the same sign. Typically, V is the space of r-times continuously differentiable functions on a compact subset E of the real line, with its norm given by

\|y\|_V = \sum_{n=0}^{r} \sup\{|y^{(n)}(x)| : x \in E\}.

This norm is just the sum of the supremum norms of y and its derivatives.

A functional J is said to have a strong extremum at y_0 if J(y_0) - J(y) has the same sign for all functions in a δ-neighbourhood of y_0 in the supremum norm, as opposed to whichever norm the space V may have been given. If y_0 is a strong extremum for J then it is also a weak extremum, but the converse may not hold. Finding strong extrema is usually more difficult than finding weak extrema, and in what follows it will be assumed that we are looking for weak extrema.
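The difference between the two neighbourhoods can be illustrated numerically. The following Python sketch is an illustration only: the family y_k(x) = sin(kx)/k on [0, π] and the grid-based helper `sup_norm` are assumptions chosen for the example, and sampling on a grid only approximates the true supremum.

```python
import math

def sup_norm(g, a, b, n=20001):
    """Approximate sup |g(x)| on [a, b] by sampling n equally spaced points."""
    return max(abs(g(a + (b - a) * i / (n - 1))) for i in range(n))

def norms(k, a=0.0, b=math.pi):
    """Return (sup norm, C^1 norm) of y_k(x) = sin(k x) / k on [a, b]."""
    y = lambda x: math.sin(k * x) / k
    dy = lambda x: math.cos(k * x)
    sup_y = sup_norm(y, a, b)
    return sup_y, sup_y + sup_norm(dy, a, b)

for k in (1, 10, 100):
    c0, c1 = norms(k)
    print(k, c0, c1)
```

As k grows, the supremum norm of y_k shrinks like 1/k while the C^1 norm stays at least 1, so y_k enters every supremum-norm neighbourhood of 0 without entering small C^1 neighbourhoods. This is why a strong extremum has to beat more competitors than a weak one.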

The Euler–Lagrange equation

Under ideal conditions, the maxima and minima of a given function may be located by finding the points where its derivative vanishes. By analogy, solutions of smooth variational problems may be obtained by solving the associated Euler–Lagrange equation. In order to illustrate this process, consider the problem of finding the shortest curve in the plane that connects two points (x_1, y_1) and (x_2, y_2). The arc length is given by

A[f] = \int_{x_1}^{x_2} \sqrt{1 + [f'(x)]^2} \, dx,

with

f'(x) = \frac{df}{dx},

and where y = f(x), f(x_1) = y_1 and f(x_2) = y_2. The function f should have at least one derivative for the integral to be well defined. Further, if f_0 is a local minimum and f_1 is an arbitrary function with at least one derivative that vanishes at the endpoints x_1 and x_2, then we must have

A[f_0] \le A[f_0 + \epsilon f_1]

for any number ε close to 0. Therefore, the derivative of A[f_0 + ε f_1] with respect to ε (the first variation of A) must vanish at ε = 0. Thus

\int_{x_1}^{x_2} \frac{f_0'(x) f_1'(x)}{\sqrt{1 + [f_0'(x)]^2}} \, dx = 0

for any choice of the function f_1. We may interpret this condition as the vanishing of all directional derivatives of A[f_0] in the space of differentiable functions, and this is formalized by requiring the Fréchet derivative of A to vanish at f_0. If we assume that f_0 has two continuous derivatives (or if we consider weak derivatives), then we may use integration by parts:

\int_a^b u(x) v'(x) \, dx = \left[u(x) v(x)\right]_a^b - \int_a^b u'(x) v(x) \, dx

with the substitution

u(x) = \frac{f_0'(x)}{\sqrt{1 + [f_0'(x)]^2}}, \quad v'(x) = f_1'(x),

then we have

\left[u(x) v(x)\right]_{x_1}^{x_2} - \int_{x_1}^{x_2} f_1(x) \frac{d}{dx}\left[\frac{f_0'(x)}{\sqrt{1 + [f_0'(x)]^2}}\right] dx = 0,

but the first term is zero since v(x) = f_1(x) was chosen to vanish at x_1 and x_2 where the evaluation is taken. Therefore,

\int_{x_1}^{x_2} f_1(x) \frac{d}{dx}\left[\frac{f_0'(x)}{\sqrt{1 + [f_0'(x)]^2}}\right] dx = 0

for any differentiable function f_1 that vanishes at the endpoints of the interval. This is the situation covered by the fundamental lemma of calculus of variations: suppose that

I = \int_{x_1}^{x_2} f_1(x) H(x) \, dx = 0

for every differentiable function f_1(x) that vanishes at the endpoints of the interval, where H is continuous. Since f_1(x) is an arbitrary function within the integration range, we conclude that H(x) = 0. Therefore,

\frac{d}{dx}\left[\frac{f_0'(x)}{\sqrt{1 + [f_0'(x)]^2}}\right] = 0.

It follows from this equation that the bracketed quantity is constant, so f_0' itself is constant and

\frac{d^2 f_0}{dx^2} = 0,

and hence the extremals are straight lines.
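The conclusion can be checked numerically. The Python sketch below is an illustration under stated assumptions: the endpoints (0, 0) and (1, 2), the perturbation f_1(x) = sin(πx), and the polyline helper `arc_length` are hypothetical choices, not taken from the text. It compares the length of the straight line with the lengths of nearby curves f_0 + ε f_1.

```python
import math

def arc_length(f, x1, x2, n=2000):
    """Approximate A[f] by the length of a polyline through n + 1 samples of f."""
    h = (x2 - x1) / n
    total = 0.0
    for i in range(n):
        xa = x1 + i * h
        xb = xa + h
        total += math.hypot(h, f(xb) - f(xa))
    return total

# Hypothetical endpoints for the illustration.
x1, y1, x2, y2 = 0.0, 0.0, 1.0, 2.0

def f0(x):
    """Straight line through (x1, y1) and (x2, y2)."""
    return y1 + (y2 - y1) * (x - x1) / (x2 - x1)

def f1(x):
    """Perturbation that vanishes at both endpoints."""
    return math.sin(math.pi * (x - x1) / (x2 - x1))

straight = arc_length(f0, x1, x2)
excess = [arc_length(lambda x, e=eps: f0(x) + e * f1(x), x1, x2) - straight
          for eps in (-0.2, -0.05, 0.05, 0.2)]
print(straight, excess)
```

Every entry of `excess` is positive, and the values shrink roughly like ε², which is what a vanishing first variation at ε = 0 predicts.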

A similar calculation holds in the general case where

A[f] = \int_{x_1}^{x_2} L(x, f, f') \, dx

and f is required to have two continuous derivatives. Again, we find an extremal f_0 by setting f = f_0 + ε f_1, taking the derivative with respect to ε, and setting ε = 0 at the end:

\begin{align}
\left.\frac{dA}{d\epsilon}\right|_{\epsilon = 0} & = \int_{x_1}^{x_2} \left.\frac{dL}{d\epsilon}\right|_{\epsilon = 0} dx \\
& = \int_{x_1}^{x_2} \left(\frac{\partial L}{\partial f} f_1 + \frac{\partial L}{\partial f'} f'_1\right) dx \\
& = \int_{x_1}^{x_2} \left(\frac{\partial L}{\partial f} f_1 - f_1 \frac{d}{dx}\frac{\partial L}{\partial f'}\right) dx + \left.\frac{\partial L}{\partial f'} f_1\right|_{x_1}^{x_2} \\
& = \int_{x_1}^{x_2} f_1 \left(\frac{\partial L}{\partial f} - \frac{d}{dx}\frac{\partial L}{\partial f'}\right) dx \\
& = 0,
\end{align}

where we have used the chain rule in the second line and integration by parts in the third. As before, the last term in the third line vanishes due to our choice of f_1. Finally, according to the fundamental lemma of calculus of variations, we find that the extremal f_0 satisfies the Euler–Lagrange equation

-\frac{d}{dx} \frac{\partial L}{\partial f'} + \frac{\partial L}{\partial f} = 0.

In general this gives a second-order ordinary differential equation which can be solved to obtain the extremal f. The Euler–Lagrange equation is a necessary, but not sufficient, condition for an extremal. Sufficient conditions for an extremal are discussed in the references.
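As a hedged illustration of the necessary condition, the Python sketch below uses the hypothetical Lagrangian L = f'^2/2 + f on [0, 1] with f(0) = f(1) = 0 (an assumption chosen for the example, not taken from the text). Its Euler–Lagrange equation is f'' = 1, with solution f_0(x) = (x^2 - x)/2. The code estimates the first variation by a central difference in ε, once at f_0 and once at the non-extremal function f = 0.

```python
import math

def action(f, df, n=2000):
    """Midpoint-rule approximation of A[f] = integral over [0, 1] of (f'^2 / 2 + f) dx."""
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        x = (i + 0.5) * h
        total += (0.5 * df(x) ** 2 + f(x)) * h
    return total

def first_variation(f, df, f1, df1, eps=1e-3):
    """Central-difference estimate of d/d(eps) A[f + eps * f1] at eps = 0."""
    plus = action(lambda x: f(x) + eps * f1(x), lambda x: df(x) + eps * df1(x))
    minus = action(lambda x: f(x) - eps * f1(x), lambda x: df(x) - eps * df1(x))
    return (plus - minus) / (2.0 * eps)

# Perturbation that vanishes at both endpoints, with its derivative.
f1 = lambda x: math.sin(math.pi * x)
df1 = lambda x: math.pi * math.cos(math.pi * x)

# Solution of f'' = 1 with f(0) = f(1) = 0, and its derivative.
extremal = lambda x: 0.5 * (x * x - x)
d_extremal = lambda x: x - 0.5

# Non-extremal comparison function with the same boundary values.
zero = lambda x: 0.0

dA_extremal = first_variation(extremal, d_extremal, f1, df1)
dA_zero = first_variation(zero, zero, f1, df1)
print(dA_extremal, dA_zero)
```

The first variation at the extremal is zero up to quadrature error, while at f = 0 it equals the integral of f_1, which is 2/π for this perturbation.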

The Beltrami Identity

Frequently in physical problems, it turns out that ∂L/∂x = 0. In that case, the Euler-Lagrange equation can be simplified using the Beltrami identity:

L - f' \frac{\partial L}{\partial f'} = C,

where C is a constant.
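A small numerical check may help. The Python sketch below assumes the hypothetical Lagrangian L(f, f') = f sqrt(1 + f'^2), which has no explicit dependence on x and is not taken from the text. Its Euler–Lagrange equation f f'' = 1 + f'^2 is satisfied by f(x) = cosh(x), so the Beltrami quantity should take the same value at every x along that curve.

```python
import math

def beltrami_quantity(f, df):
    """Evaluate L - f' * dL/df' for L(f, f') = f * sqrt(1 + f'^2)."""
    root = math.sqrt(1.0 + df * df)
    lagrangian = f * root
    dL_ddf = f * df / root          # partial derivative of L with respect to f'
    return lagrangian - df * dL_ddf

# Sample the extremal f(x) = cosh(x), f'(x) = sinh(x) at a few points.
values = [beltrami_quantity(math.cosh(x), math.sinh(x))
          for x in (-1.0, -0.3, 0.0, 0.5, 2.0)]
print(values)
```

All sampled values agree (here the constant is C = 1), whereas a curve that does not solve the Euler–Lagrange equation would generally give values that vary with x.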

du Bois Reymond's theorem

The discussion thus far has assumed that extremal functions possess two continuous derivatives, although the existence of the integral A requires only first derivatives of trial functions. The condition that the first variation vanish at an extremal may be regarded as a weak form of the Euler-Lagrange equation. The theorem of du Bois Reymond asserts that this weak form implies the strong form. If L has continuous first and second derivatives with respect to all of its arguments, and if

\frac{\partial^2 L}{(\partial f')^2} \ne 0,

then f_0 has two continuous derivatives, and it satisfies the Euler-Lagrange equation.

Fermat's principle

Fermat's principle states that light takes a path that (locally) minimizes the optical length between its endpoints. If the x-coordinate is chosen as the parameter along the path, and y=f(x) along the path, then the optical length is given by

A[f] = \int_{x_0}^{x_1} n(x, f(x)) \sqrt{1 + f'(x)^2} \, dx,

where the refractive index n(x,y) depends upon the material. If we try f(x) = f_0(x) + ε f_1(x) then the first variation of A (the derivative of A with respect to ε) is

\delta A[f_0, f_1] = \int_{x_0}^{x_1} \left[\frac{n(x, f_0) f_0'(x) f_1'(x)}{\sqrt{1 + f_0'(x)^2}} + n_y(x, f_0) f_1 \sqrt{1 + f_0'(x)^2}\right] dx.

After integration by parts of the first term within brackets, we obtain the Euler-Lagrange equation

-\frac{d}{dx}\left[\frac{n(x, f_0) f_0'}{\sqrt{1 + f_0'^2}}\right] + n_y(x, f_0) \sqrt{1 + f_0'(x)^2} = 0.

The light rays may be determined by integrating this equation.

Snell's law

There is a discontinuity of the refractive index when light enters or leaves a lens. Let

n(x,y) = n_- \quad \text{if} \quad x < 0,
n(x,y) = n_+ \quad \text{if} \quad x > 0,

where n_- and n_+ are constants. Then the Euler-Lagrange equation holds as before in the region where x < 0 or x > 0, and in fact the path is a straight line there, since the refractive index is constant. At x = 0, f must be continuous, but f' may be discontinuous. After integration by parts in the separate regions and using the Euler-Lagrange equations, the first variation takes the form

\delta A[f_0, f_1] = f_1(0) \left[n_- \frac{f_0'(0_-)}{\sqrt{1 + f_0'(0_-)^2}} - n_+ \frac{f_0'(0_+)}{\sqrt{1 + f_0'(0_+)^2}}\right].

The factor multiplying n_- is the sine of the angle of the incident ray with the x axis, and the factor multiplying n_+ is the sine of the angle of the refracted ray with the x axis. Snell's law for refraction requires that these terms be equal. As this calculation demonstrates, Snell's law is equivalent to the vanishing of the first variation of the optical path length.
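The equivalence can be illustrated by minimizing the optical length directly. In the Python sketch below, the refractive indices, the endpoints, and the use of a ternary search are all assumptions made for the example. The path consists of two straight segments meeting the interface x = 0 at height y, and the code searches for the y that minimizes the optical length, then compares the two sides of Snell's law.

```python
import math

n_minus, n_plus = 1.0, 1.5   # hypothetical refractive indices for x < 0 and x > 0
a = (-1.0, 0.0)              # hypothetical start point in the region x < 0
b = (1.0, 1.0)               # hypothetical end point in the region x > 0

def optical_length(y):
    """Optical length of the two-segment path a -> (0, y) -> b."""
    return (n_minus * math.hypot(a[0], y - a[1])
            + n_plus * math.hypot(b[0], b[1] - y))

# Ternary search; valid here because optical_length is convex in y.
lo, hi = a[1], b[1]
for _ in range(200):
    m1 = lo + (hi - lo) / 3.0
    m2 = hi - (hi - lo) / 3.0
    if optical_length(m1) < optical_length(m2):
        hi = m2
    else:
        lo = m1
y_star = 0.5 * (lo + hi)

# Slopes f_0'(0_-) and f_0'(0_+) of the two straight segments.
slope_minus = (y_star - a[1]) / (0.0 - a[0])
slope_plus = (b[1] - y_star) / (b[0] - 0.0)

lhs = n_minus * slope_minus / math.sqrt(1.0 + slope_minus ** 2)
rhs = n_plus * slope_plus / math.sqrt(1.0 + slope_plus ** 2)
print(y_star, lhs, rhs)
```

The two printed quantities agree to within the accuracy of the search, which is the statement that the bracket in the first variation vanishes at the minimizing path.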

Fermat's principle in three dimensions

It is expedient to use vector notation: let X = (x_1, x_2, x_3), let t be a parameter, let X(t) be the parametric representation of a curve C, and let Ẋ(t) be its tangent vector. The optical length of the curve is given by

A[C] = \int_{t_0}^{t_1} n(X) \sqrt{\dot X \cdot \dot X} \, dt.

Note that this integral is invariant with respect to changes in the parametric representation of C. The Euler-Lagrange equations for a minimizing curve have the symmetric form

\frac{d}{dt} P = \sqrt{\dot X \cdot \dot X} \, \nabla n,

where

P = \frac{n(X) \dot X}{\sqrt{\dot X \cdot \dot X}}.

It follows from the definition that P satisfies

P \cdot P = n(X)^2.

Therefore the integral may also be written as

A[C] = \int_{t_0}^{t_1} P \cdot \dot X \, dt.

This form suggests that if we can find a function ψ whose gradient is given by P, then the integral A is given by the difference of ψ at the endpoints of the interval of integration. Thus the problem of studying the curves that make the integral stationary can be related to the study of the level surfaces of ψ. In order to find such a function, we turn to the wave equation, which governs the propagation of light.

Connection with the wave equation

The wave equation for an inhomogeneous medium is

u_{tt} = c^2 \nabla \cdot \nabla u,

where c is the velocity, which generally depends upon X. Wave fronts for light are characteristic surfaces for this partial differential equation: they satisfy

\varphi_t^2 = c(X)^2 \nabla \varphi \cdot \nabla \varphi.

We may look for solutions in the form

\varphi(t, X) = t - \psi(X).

In that case, ψ satisfies

\nabla \psi \cdot \nabla \psi = n^2,

where n = 1/c. According to the theory of first order partial differential equations, if P = ∇ψ, then P satisfies

\frac{dP}{ds} = n \nabla n,

along a system of curves (the light rays) that are given by

\frac{dX}{ds} = P.

These equations for solution of a first-order partial differential equation are identical to the Euler-Lagrange equations if we make the identification

\frac{ds}{dt} = \frac{\sqrt{\dot X \cdot \dot X}}{n}.

We conclude that the function ψ is the value of the minimizing integral A as a function of the upper end point. That is, when a family of minimizing curves is constructed, the values of the optical length satisfy the characteristic equation corresponding to the wave equation. Hence, solving the associated partial differential equation of first order is equivalent to finding families of solutions of the variational problem. This is the essential content of the Hamilton-Jacobi theory, which applies to more general variational problems.

The action principle

The action was defined by Hamilton to be the time integral of the Lagrangian, L, which is defined as a difference of energies:

L = T - U,
where T is the kinetic energy of a mechanical system and U is the potential energy. Hamilton's principle (or the action principle) states that the motion of a mechanical system is such that the action integral
A[C] = \int_{t_0}^{t_1} L(X, \dot X) \, dt
is stationary with respect to variations in the path X(t). The Euler-Lagrange equations for this system are known as Lagrange's equations:
\frac{d}{dt} \frac{\partial L}{\partial \dot X} = \frac{\partial L}{\partial X},
and they are equivalent to Newton's equations of motion.

The conjugate momenta P are defined by

P = \frac{\partial L}{\partial \dot X}.
For example, if
T = \frac{1}{2} m \dot x^2,
then
P = m \dot x.
Hamiltonian mechanics results if the conjugate momenta are introduced in place of Ẋ, and the Lagrangian L is replaced by the Hamiltonian H defined by
H(X, P) = -L(X, \dot X) + P \cdot \dot X.
The Hamiltonian is the total energy of the system: H = T + U. Analogy with Fermat's principle suggests that solutions of Lagrange's equations (the particle trajectories) may be described in terms of level surfaces of some function of X. This function is a solution of the Hamilton-Jacobi equation:
\frac{\partial \psi}{\partial t} + H(X, \nabla \psi) = 0.

Functions of several variables

Variational problems that involve multiple integrals arise in numerous applications. For example, if φ(x,y) denotes the displacement of a membrane above the domain D in the x,y plane, then its potential energy is proportional to its surface area:
U[\varphi] = \iint_D \sqrt{1 + \nabla \varphi \cdot \nabla \varphi} \, dx \, dy.
Plateau's problem consists of finding a function that minimizes the surface area while assuming prescribed values on the boundary of D; the solutions are called minimal surfaces. The Euler-Lagrange equation for this problem is nonlinear:
\varphi_{xx}(1 + \varphi_y^2) + \varphi_{yy}(1 + \varphi_x^2) - 2 \varphi_x \varphi_y \varphi_{xy} = 0.
See Courant (1950) for details.

Dirichlet's principle

It is often sufficient to consider only small displacements of the membrane, whose energy difference from no displacement is approximated by
V[\varphi] = \frac{1}{2} \iint_D \nabla \varphi \cdot \nabla \varphi \, dx \, dy.
The functional V is to be minimized among all trial functions φ that assume prescribed values on the boundary of D. If u is the minimizing function and v is an arbitrary smooth function that vanishes on the boundary of D, then the first variation of V[u + ε v] must vanish:
\left.\frac{d}{d\epsilon} V[u + \epsilon v]\right|_{\epsilon = 0} = \iint_D \nabla u \cdot \nabla v \, dx \, dy = 0.
Provided that u has two derivatives, we may apply the divergence theorem to obtain
\iint_D \nabla \cdot (v \nabla u) \, dx \, dy = \iint_D \left(\nabla u \cdot \nabla v + v \nabla \cdot \nabla u\right) dx \, dy = \int_C v \frac{\partial u}{\partial n} \, ds,
where C is the boundary of D, s is arclength along C and ∂u/∂n is the normal derivative of u on C. Since v vanishes on C and the first variation vanishes, the result is
\iint_D v \nabla \cdot \nabla u \, dx \, dy = 0
for all smooth functions v that vanish on the boundary of D. The proof for the case of one dimensional integrals may be adapted to this case to show that
\nabla \cdot \nabla u = 0 \quad \text{in } D.
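The minimizing property of harmonic functions can be illustrated on the unit square. The Python sketch below is an example under stated assumptions: the harmonic function u = x^2 - y^2, the perturbation v = sin(πx) sin(πy), which vanishes on the boundary, and the midpoint-rule helper `dirichlet_energy` are all hypothetical choices. It computes V[u + ε v] - V[u] for several values of ε.

```python
import math

def dirichlet_energy(grad, n=100):
    """Midpoint-rule approximation of V = 1/2 * integral of |grad|^2 over the unit square."""
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        x = (i + 0.5) * h
        for j in range(n):
            y = (j + 0.5) * h
            gx, gy = grad(x, y)
            total += 0.5 * (gx * gx + gy * gy) * h * h
    return total

def grad_u(x, y):
    """Gradient of the harmonic function u = x^2 - y^2."""
    return 2.0 * x, -2.0 * y

def grad_v(x, y):
    """Gradient of v = sin(pi x) sin(pi y), which vanishes on the boundary."""
    return (math.pi * math.cos(math.pi * x) * math.sin(math.pi * y),
            math.pi * math.sin(math.pi * x) * math.cos(math.pi * y))

def energy_of_perturbed(eps):
    """Energy V[u + eps * v]."""
    def grad(x, y):
        ux, uy = grad_u(x, y)
        vx, vy = grad_v(x, y)
        return ux + eps * vx, uy + eps * vy
    return dirichlet_energy(grad)

base = energy_of_perturbed(0.0)
increase = {eps: energy_of_perturbed(eps) - base for eps in (-0.2, -0.1, 0.1, 0.2)}
print(base, increase)
```

The increase is positive for both signs of ε and scales like ε^2 (analytically it is ε^2 π^2 / 4 for this choice of v), so there is no term linear in ε, in agreement with the vanishing first variation.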

The difficulty with this reasoning is the assumption that the minimizing function u must have two derivatives. Riemann argued that the existence of a smooth minimizing function was assured by the connection with the physical problem: membranes do indeed assume configurations with minimal potential energy. Riemann named this idea Dirichlet's principle in honor of his teacher Dirichlet. However, Weierstrass gave an example of a variational problem with no solution: minimize

W[\varphi] = \int_{-1}^{1} (x \varphi')^2 \, dx
among all functions φ that satisfy φ(-1) = -1 and φ(1) = 1. W can be made arbitrarily small by choosing piecewise linear functions that make a transition between -1 and 1 in a small neighborhood of the origin. However, there is no function that makes W = 0, since W = 0 would force φ' = 0 for all x ≠ 0, so a continuously differentiable φ would be constant and could not satisfy both boundary conditions. The resulting controversy over the validity of Dirichlet's principle is explained in http://turnbull.mcs.st-and.ac.uk/~history/Biographies/Riemann.html . Eventually it was shown that Dirichlet's principle is valid, but it requires a sophisticated application of the regularity theory for elliptic partial differential equations; see Jost and Li-Jost (1998).
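The behaviour of Weierstrass's functional on piecewise linear trial functions can be made concrete. The Python sketch below assumes the specific ramp φ(x) = x/δ for |x| < δ and φ(x) = ±1 otherwise (one possible choice of such functions, with `weierstrass_W` a hypothetical helper name). For this ramp, a direct calculation gives W = 2δ/3.

```python
def weierstrass_W(delta, n=10000):
    """Midpoint-rule value of W for the ramp phi(x) = clamp(x / delta, -1, 1).

    phi' equals 1/delta on (-delta, delta) and 0 elsewhere, so only that
    interval contributes to the integral of (x * phi')^2.
    """
    h = 2.0 * delta / n
    total = 0.0
    for i in range(n):
        x = -delta + (i + 0.5) * h
        total += (x / delta) ** 2 * h
    return total

for delta in (0.5, 0.1, 0.01, 0.001):
    print(delta, weierstrass_W(delta), 2.0 * delta / 3.0)
```

The values decrease to 0 in proportion to δ, so the infimum of W is 0 even though no admissible function attains it.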

Generalization to other boundary value problems

A more general expression for the potential energy of a membrane is

V[\varphi] = \iint_D \left[\frac{1}{2} \nabla \varphi \cdot \nabla \varphi + f(x,y) \varphi\right] dx \, dy + \int_C \left[\frac{1}{2} \sigma(s) \varphi^2 + g(s) \varphi\right] ds.
This corresponds to an external force density f(x,y) in D, an external force g(s) on the boundary C, and elastic forces with modulus σ(s) acting on C. The function that minimizes the potential energy with no restriction on its boundary values will be denoted by u. Provided that f and g are continuous, regularity theory implies that the minimizing function u will have two derivatives. In taking the first variation, no boundary condition need be imposed on the increment v. The first variation of V[u + ε v] is given by
\iint_D \left[\nabla u \cdot \nabla v + f v\right] dx \, dy + \int_C \left[\sigma u v + g v\right] ds = 0.
If we apply the divergence theorem, the result is
\iint_D \left[-v \nabla \cdot \nabla u + v f\right] dx \, dy + \int_C v \left[\frac{\partial u}{\partial n} + \sigma u + g\right] ds = 0.
If we first set v = 0 on C, the boundary integral vanishes, and we conclude as before that
-\nabla \cdot \nabla u + f = 0
in D. Then if we allow v to assume arbitrary boundary values, this implies that u must satisfy the boundary condition
\frac{\partial u}{\partial n} + \sigma u + g = 0
on C. Note that this boundary condition is a consequence of the minimizing property of u: it is not imposed beforehand. Such conditions are called natural boundary conditions.

The preceding reasoning is not valid if σ vanishes identically on C. In such a case, we could allow a trial function φ ≡ c, where c is a constant. For such a trial function,

V[c] = c \left[\iint_D f \, dx \, dy + \int_C g \, ds\right].
By appropriate choice of c, V can assume any value unless the quantity inside the brackets vanishes. Therefore the variational problem is meaningless unless
\iint_D f \, dx \, dy + \int_C g \, ds = 0.
This condition implies that net external forces on the system are in equilibrium. If these forces are in equilibrium, then the variational problem has a solution, but it is not unique, since an arbitrary constant may be added. Further details and examples are in Courant and Hilbert (1953).

Eigenvalue problems

Both one-dimensional and multi-dimensional eigenvalue problems can be formulated as variational problems.

Sturm-Liouville problems

The Sturm-Liouville eigenvalue problem involves a general quadratic form

Q[\varphi] = \int_{x_1}^{x_2} \left[p(x) \varphi'(x)^2 + q(x) \varphi(x)^2\right] dx,
where φ is restricted to functions that satisfy the boundary conditions
\varphi(x_1) = 0, \quad \varphi(x_2) = 0.
Let R be a normalization integral
R[\varphi] = \int_{x_1}^{x_2} r(x) \varphi(x)^2 \, dx.
The functions p(x) and r(x) are required to be everywhere positive and bounded away from zero. The primary variational problem is to minimize the ratio Q/R among all φ satisfying the endpoint conditions. It is shown below that the Euler-Lagrange equation for the minimizing u is
-(p u')' + q u - \lambda r u = 0,
where λ is the quotient
\lambda = \frac{Q[u]}{R[u]}.
It can be shown (see Gelfand and Fomin 1963) that the minimizing u has two derivatives and satisfies the Euler-Lagrange equation. The associated λ will be denoted by λ_1; it is the lowest eigenvalue for this equation and boundary conditions. The associated minimizing function will be denoted by u_1(x). This variational characterization of eigenvalues leads to the Rayleigh-Ritz method: choose an approximating u as a linear combination of basis functions (for example trigonometric functions) and carry out a finite-dimensional minimization among such linear combinations. This method is often surprisingly accurate.
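A one-term version of this idea can be sketched as follows. The Python example assumes the special case p = 1, q = 0, r = 1 on [0, 1] (so the equation is -u'' = λ u with u(0) = u(1) = 0, whose lowest eigenvalue is π^2) and the polynomial trial function φ(x) = x(1 - x). Both choices, and the helper `rayleigh_quotient`, are assumptions made for illustration; a full Rayleigh-Ritz computation would minimize over several basis functions.

```python
import math

def rayleigh_quotient(phi, dphi, p, q, r, x1=0.0, x2=1.0, n=2000):
    """Midpoint-rule approximation of the ratio Q[phi] / R[phi]."""
    h = (x2 - x1) / n
    Q = 0.0
    R = 0.0
    for i in range(n):
        x = x1 + (i + 0.5) * h
        Q += (p(x) * dphi(x) ** 2 + q(x) * phi(x) ** 2) * h
        R += r(x) * phi(x) ** 2 * h
    return Q / R

one = lambda x: 1.0
zero = lambda x: 0.0

# Trial function x(1 - x), which satisfies the endpoint conditions.
estimate = rayleigh_quotient(lambda x: x * (1.0 - x),
                             lambda x: 1.0 - 2.0 * x,
                             one, zero, one)
exact = math.pi ** 2
print(estimate, exact)
```

The quotient for this trial function is 10 (exactly, apart from quadrature error), about 1.3 percent above π^2. Because λ_1 is the minimum of Q/R, any admissible trial function gives an upper bound.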

The next smallest eigenvalue and eigenfunction can be obtained by minimizing the ratio Q/R under the additional constraint

\int_{x_1}^{x_2} r(x) u_1(x) \varphi(x) \, dx = 0.
This procedure can be extended to obtain the complete sequence of eigenvalues and eigenfunctions for the problem.

The variational problem also applies to more general boundary conditions. Instead of requiring that φ vanish at the endpoints, we may not impose any condition at the endpoints, and set

Q[\varphi] = \int_{x_1}^{x_2} \left[p(x) \varphi'(x)^2 + q(x) \varphi(x)^2\right] dx + a_1 \varphi(x_1)^2 + a_2 \varphi(x_2)^2,
where a_1 and a_2 are arbitrary. If we set φ = u + ε v, the first variation for the ratio Q/R is
V_1 = \frac{2}{R[u]} \left(\int_{x_1}^{x_2} \left[p(x) u'(x) v'(x) + q(x) u(x) v(x) - \lambda r(x) u(x) v(x)\right] dx + a_1 u(x_1) v(x_1) + a_2 u(x_2) v(x_2)\right),
where λ is given by the ratio Q[u]/R[u] as previously. After integration by parts,
\frac{R[u]}{2} V_1 = \int_{x_1}^{x_2} v(x) \left[-(p u')' + q u - \lambda r u\right] dx + v(x_1) \left[-p(x_1) u'(x_1) + a_1 u(x_1)\right] + v(x_2) \left[p(x_2) u'(x_2) + a_2 u(x_2)\right].
If we first require that v vanish at the endpoints, the first variation will vanish for all such v only if
-(p u')' + q u - \lambda r u = 0 \quad \text{for} \quad x_1 < x < x_2.
If u satisfies this condition, then the first variation will vanish for arbitrary v only if
-p(x_1) u'(x_1) + a_1 u(x_1) = 0, \quad \text{and} \quad p(x_2) u'(x_2) + a_2 u(x_2) = 0.
These latter conditions are the natural boundary conditions for this problem, since they are not imposed on trial functions for the minimization, but are instead a consequence of the minimization.

Eigenvalue problems in several dimensions

Eigenvalue problems in higher dimensions are defined in analogy with the one-dimensional case. For example, given a domain D with boundary B in three dimensions we may define

Q[\varphi] = \iiint_D \left[p(X) \nabla \varphi \cdot \nabla \varphi + q(X) \varphi^2\right] dx \, dy \, dz + \iint_B \sigma(S) \varphi^2 \, dS,
and
R[\varphi] = \iiint_D r(X) \varphi(X)^2 \, dx \, dy \, dz.
Let u be the function that minimizes the quotient Q[φ]/R[φ], with no condition prescribed on the boundary B. The Euler-Lagrange equation satisfied by u is
-\nabla \cdot (p(X) \nabla u) + q(X) u - \lambda r(X) u = 0,
where
\lambda = \frac{Q[u]}{R[u]}.
The minimizing u must also satisfy the natural boundary condition
p(S) \frac{\partial u}{\partial n} + \sigma(S) u = 0
on the boundary B. This result depends upon the regularity theory for elliptic partial differential equations; see Jost and Li-Jost (1998) for details. Many extensions, including completeness results, asymptotic properties of the eigenvalues and results concerning the nodes of the eigenfunctions are in Courant and Hilbert (1953).

Reference books

  • Gelfand, I.M. and Fomin, S.V.: Calculus of Variations, Dover Publ., 2000
  • Lebedev, L.P. and Cloud, M.J.: The Calculus of Variations and Functional Analysis with Optimal Control and Applications in Mechanics, World Scientific, 2003, pages 1-98
  • Charles Fox: An Introduction to the Calculus of Variations, Dover Publ., 1987
  • Forsyth, A.R.: Calculus of Variations, Dover, 1960
  • Sagan, Hans: Introduction to the Calculus of Variations, Dover, 1992
  • Weinstock, Robert: Calculus of Variations with Applications to Physics and Engineering, Dover, 1974
  • Clegg, J.C.: Calculus of Variations, Interscience Publishers Inc., 1968
  • Courant, R.: Dirichlet's principle, conformal mapping and minimal surfaces. Interscience, 1950.
  • Courant, R. and D. Hilbert: Methods of Mathematical Physics, Vol I. Interscience Press, 1953.
  • Elsgolc, L.E.: Calculus of Variations, Pergamon Press Ltd., 1962
  • Jost, J. and X. Li-Jost: Calculus of Variations. Cambridge University Press, 1998.
