
# Chain rule

In calculus, the chain rule is a formula for the derivative of the composite of two functions.

In intuitive terms, if a variable, y, depends on a second variable, u, which in turn depends on a third variable, x, then the rate of change of y with respect to x can be computed as the rate of change of y with respect to u multiplied by the rate of change of u with respect to x.

## Informal discussion

For an explanation of notation used in this section, see Function composition.
The chain rule states that, under appropriate conditions,

$(f \circ g)'(x) = f'(g(x))\, g'(x),$

which in short form is written as

$(f \circ g)' = (f' \circ g) \cdot g'.$

Alternatively, in the Leibniz notation, the chain rule is

$\frac{dy}{dx} = \frac{dy}{du} \cdot \frac{du}{dx}.$

In integration, the counterpart to the chain rule is the substitution rule.

## Theorem

The chain rule in one variable may be stated more completely as follows. Let g be a real-valued function on (a,b) that is differentiable at c ∈ (a,b), and let f be a real-valued function defined on an interval I that contains the range of g and has g(c) as an interior point. If f is differentiable at g(c), then

• $(f \circ g)(x)$ is differentiable at x = c, and
• $(f \circ g)'(c) = f'(g(c))\, g'(c).$

## Examples

### Example I

Suppose that a mountain climber ascends at a rate of 0.5 kilometers per hour. The temperature is lower at higher elevations; suppose it decreases at a rate of 6 °C per kilometer. Multiplying 6 °C per kilometer by 0.5 kilometers per hour gives 3 °C per hour, the rate at which the temperature around the climber falls. This calculation is a typical chain rule application.
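The arithmetic above can be sketched in a few lines of Python. The two rates come from the example; the linear `elevation` and `temperature` functions and the 20 °C starting temperature are assumptions introduced only so that the composite can be evaluated directly.

```python
# Rates taken from the example.
ascent_rate_km_per_h = 0.5   # du/dx: elevation gained per hour
lapse_rate_c_per_km = -6.0   # dy/du: temperature change per kilometer climbed

# Chain rule: (°C / km) * (km / h) = °C / h
chain_rule_rate = lapse_rate_c_per_km * ascent_rate_km_per_h

# Hypothetical linear models, used only to check the product above.
def elevation(t_hours):
    return ascent_rate_km_per_h * t_hours

def temperature(h_km, base_c=20.0):  # base_c is an assumed starting temperature
    return base_c + lapse_rate_c_per_km * h_km

# Temperature change over one hour of climbing, computed from the composite.
direct_rate = temperature(elevation(1.0)) - temperature(elevation(0.0))

print(chain_rule_rate, direct_rate)
```

Both numbers are negative because the temperature falls as the climber ascends.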

### Example II

Consider the function f(x) = (x² + 1)³. Since f(x) = h(g(x)) where g(x) = x² + 1 and h(x) = x³, it follows from the chain rule that

$f'(x) = h'(g(x))\, g'(x) = 3(g(x))^2 (2x) = 3(x^2 + 1)^2 (2x) = 6x(x^2 + 1)^2.$

In order to differentiate the trigonometric function

$f(x) = \sin(x^2),$
one can write f(x) = h(g(x)) with h(x) = sin x and g(x) = x². The chain rule then yields
$f'(x) = 2x \cos(x^2),$
since h′(g(x)) = cos(x²) and g′(x) = 2x.
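As a sanity check, both derivatives in this example can be compared against a central finite difference. This is a minimal sketch: the helper name `central_difference`, the step size, and the sample points are arbitrary choices, not part of the example.

```python
import math

def central_difference(func, x, h=1e-6):
    """Approximate func'(x) with a symmetric difference quotient."""
    return (func(x + h) - func(x - h)) / (2.0 * h)

# First function of the example and its chain rule derivative.
def poly(x):
    return (x**2 + 1) ** 3

def poly_prime(x):
    return 6 * x * (x**2 + 1) ** 2

# Second function of the example and its chain rule derivative.
def sin_sq(x):
    return math.sin(x**2)

def sin_sq_prime(x):
    return 2 * x * math.cos(x**2)

for x in (-1.5, 0.0, 0.7, 2.0):
    err_poly = abs(central_difference(poly, x) - poly_prime(x))
    err_sin = abs(central_difference(sin_sq, x) - sin_sq_prime(x))
    print(f"x={x:5.2f}  poly error={err_poly:.2e}  sin error={err_sin:.2e}")
```

The printed errors should be small, limited only by truncation and rounding in the finite difference.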

### Example III

Differentiate arctan(sin x). The derivative of the arctangent function is

$\frac{d}{dx}\arctan x = \frac{1}{1+x^2}.$

Thus, by the chain rule,

$\frac{d}{dx}\arctan f(x) = \frac{f'(x)}{1+f^2(x)},$

and in particular,

$\frac{d}{dx}\arctan(\sin x) = \frac{\cos x}{1+\sin^2 x}.$
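The same derivative can be obtained mechanically by carrying a value and its derivative through each function, applying the chain rule at every step (the idea behind forward-mode automatic differentiation). The sketch below is illustrative only; the `Dual` class and the function names are hypothetical and not drawn from any particular library.

```python
import math
from dataclasses import dataclass

@dataclass(frozen=True)
class Dual:
    """A value u(x) together with its derivative u'(x) at one point."""
    value: float
    deriv: float

def sin(u: Dual) -> Dual:
    # d/dx sin(u) = cos(u) * u'
    return Dual(math.sin(u.value), math.cos(u.value) * u.deriv)

def arctan(u: Dual) -> Dual:
    # d/dx arctan(u) = u' / (1 + u^2)
    return Dual(math.atan(u.value), u.deriv / (1.0 + u.value ** 2))

def derivative_of_arctan_sin(x: float) -> float:
    seed = Dual(x, 1.0)  # dx/dx = 1
    return arctan(sin(seed)).deriv

x = 0.8
forward_mode = derivative_of_arctan_sin(x)
closed_form = math.cos(x) / (1.0 + math.sin(x) ** 2)
print(forward_mode, closed_form)
```

Each helper encodes one derivative rule, so composing the helpers composes the rules exactly as the chain rule prescribes.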

## Chain rule for several variables

The chain rule also applies to functions of more than one variable. Consider the function z = f(x, y), where x = g(t) and y = h(t). If f is differentiable and g and h are differentiable with respect to t, then
$\frac{dz}{dt} = \frac{\partial z}{\partial x}\frac{dx}{dt} + \frac{\partial z}{\partial y}\frac{dy}{dt}.$
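To make the formula concrete, the sketch below evaluates it for one assumed choice of functions, f(x, y) = x²y with x = cos t and y = sin t, and compares the result with a finite-difference derivative of z(t). The specific functions, step size, and sample point are illustrative assumptions.

```python
import math

# Assumed example functions (not from the text).
def x_of_t(t):
    return math.cos(t)

def y_of_t(t):
    return math.sin(t)

def f(x, y):
    return x**2 * y

def dz_dt_chain_rule(t):
    x, y = x_of_t(t), y_of_t(t)
    dz_dx = 2 * x * y      # partial derivative of f with respect to x
    dz_dy = x**2           # partial derivative of f with respect to y
    dx_dt = -math.sin(t)
    dy_dt = math.cos(t)
    return dz_dx * dx_dt + dz_dy * dy_dt

def dz_dt_numeric(t, h=1e-6):
    z = lambda s: f(x_of_t(s), y_of_t(s))
    return (z(t + h) - z(t - h)) / (2.0 * h)

t = 0.4
print(dz_dt_chain_rule(t), dz_dt_numeric(t))
```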

Suppose that each argument of z = f(u, v) is a two-variable function such that u = h(x, y) and v = g(x, y), and that these functions are all differentiable. Then the chain rule takes the form

$\frac{\partial z}{\partial x} = \frac{\partial z}{\partial u}\frac{\partial u}{\partial x} + \frac{\partial z}{\partial v}\frac{\partial v}{\partial x}$

$\frac{\partial z}{\partial y} = \frac{\partial z}{\partial u}\frac{\partial u}{\partial y} + \frac{\partial z}{\partial v}\frac{\partial v}{\partial y}.$

If we consider

$\vec r = (u, v)$
above as a vector function, the result can be written equivalently in vector notation as the dot product of the gradient of f and a derivative of $\vec r$:
$\frac{\partial f}{\partial x} = \vec\nabla f \cdot \frac{\partial \vec r}{\partial x}.$

More generally, for functions of vectors to vectors, the chain rule says that the Jacobian matrix of a composite function is the product of the Jacobian matrices of the two functions:

$\frac{\partial(z_1,\ldots,z_m)}{\partial(x_1,\ldots,x_p)} = \frac{\partial(z_1,\ldots,z_m)}{\partial(y_1,\ldots,y_n)} \frac{\partial(y_1,\ldots,y_n)}{\partial(x_1,\ldots,x_p)}.$
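The matrix form can be checked numerically on a small case. The sketch below assumes m = n = p = 2 and two made-up maps, `inner` and `outer`; it multiplies their analytic Jacobians and compares the product with a finite-difference Jacobian of the composite. All names and the sample point are hypothetical.

```python
import math

def inner(x):
    """y = inner(x): maps (x1, x2) to (x1*x2, x1 + x2^2)."""
    return [x[0] * x[1], x[0] + x[1] ** 2]

def inner_jacobian(x):
    return [[x[1], x[0]],
            [1.0, 2.0 * x[1]]]

def outer(y):
    """z = outer(y): maps (y1, y2) to (sin(y1), y1*y2)."""
    return [math.sin(y[0]), y[0] * y[1]]

def outer_jacobian(y):
    return [[math.cos(y[0]), 0.0],
            [y[1], y[0]]]

def matmul(a, b):
    """Plain nested-list matrix product."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def numerical_jacobian(func, x, h=1e-6):
    """Central-difference Jacobian; rows index outputs, columns index inputs."""
    n_out = len(func(x))
    jac = [[0.0] * len(x) for _ in range(n_out)]
    for j in range(len(x)):
        x_plus, x_minus = list(x), list(x)
        x_plus[j] += h
        x_minus[j] -= h
        f_plus, f_minus = func(x_plus), func(x_minus)
        for i in range(n_out):
            jac[i][j] = (f_plus[i] - f_minus[i]) / (2.0 * h)
    return jac

point = [0.5, -1.2]
# Outer Jacobian is evaluated at inner(point), then multiplied by the inner Jacobian.
chain_rule_jac = matmul(outer_jacobian(inner(point)), inner_jacobian(point))
numeric_jac = numerical_jacobian(lambda x: outer(inner(x)), point)
max_diff = max(abs(chain_rule_jac[i][j] - numeric_jac[i][j])
               for i in range(2) for j in range(2))
print(max_diff)
```

Note the order of the factors: the Jacobian of the outer map comes first and is evaluated at the image point, mirroring f′(g(x)) g′(x) in the one-variable rule.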

## Proof of the chain rule

Let f and g be functions and let x be a number such that f is differentiable at g(x) and g is differentiable at x. Then by the definition of differentiability,

$g(x+\delta) - g(x) = \delta g'(x) + \epsilon(\delta)\delta,$
where ε(δ) → 0 as δ → 0. Similarly,
$f(g(x)+\alpha) - f(g(x)) = \alpha f'(g(x)) + \eta(\alpha)\alpha,$
where η(α) → 0 as α → 0.

Now

$f(g(x+\delta)) - f(g(x)) = f(g(x) + \delta g'(x) + \epsilon(\delta)\delta) - f(g(x)) = \alpha_\delta f'(g(x)) + \eta(\alpha_\delta)\alpha_\delta,$

where

$\alpha_\delta = \delta g'(x) + \epsilon(\delta)\delta.$
Observe that as δ → 0, $\alpha_\delta/\delta \to g'(x)$ and $\alpha_\delta \to 0$, and thus $\eta(\alpha_\delta) \to 0$. It follows that
$\frac{f(g(x+\delta)) - f(g(x))}{\delta} \to g'(x) f'(g(x)) \text{ as } \delta \to 0.$
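The limits used in the proof can be observed numerically. The sketch below is an illustration rather than part of the argument: it assumes g(x) = sin x and f(u) = u³ at x = 1, and prints the ratio of the inner increment to δ alongside the difference quotient of the composite as δ shrinks.

```python
import math

# Assumed example functions (not from the proof).
def g(x):
    return math.sin(x)

def f(u):
    return u ** 3

x = 1.0
g_prime = math.cos(x)
limit = g_prime * 3 * g(x) ** 2   # g'(x) * f'(g(x))

errors = []
for k in range(1, 6):
    delta = 10.0 ** (-k)
    alpha = g(x + delta) - g(x)                       # inner increment
    quotient = (f(g(x + delta)) - f(g(x))) / delta    # composite difference quotient
    errors.append(abs(quotient - limit))
    print(f"delta={delta:.0e}  alpha/delta={alpha / delta:.6f}  "
          f"quotient={quotient:.6f}  error={errors[-1]:.2e}")
```

As δ decreases, the ratio of the inner increment to δ approaches g′(x) and the quotient approaches g′(x) f′(g(x)), which is the behavior the proof establishes in general.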

## The fundamental chain rule

The chain rule is a fundamental property of all definitions of derivative and is therefore valid in much more general contexts. For instance, if E, F and G are Banach spaces (which include Euclidean spaces) and f : E → F and g : F → G are functions, and if x is an element of E such that f is differentiable at x and g is differentiable at f(x), then the derivative (the Fréchet derivative) of the composition g ∘ f at the point x is given by

$\mathrm{D}_x(g \circ f) = \mathrm{D}_{f(x)}(g) \circ \mathrm{D}_x(f).$

Note that the derivatives here are linear maps and not numbers. If the linear maps are represented as matrices (namely Jacobians), the composition on the right-hand side turns into a matrix multiplication.

A particularly clear formulation of the chain rule can be achieved in the most general setting: let M, N and P be $C^k$ manifolds (or even Banach manifolds) and let

f : M → N and g : N → P

be differentiable maps. The derivative of f, denoted by df, is then a map from the tangent bundle of M to the tangent bundle of N, and we may write

$\mathrm{d}(g \circ f) = \mathrm{d}g \circ \mathrm{d}f.$

In this way, the formation of derivatives and tangent bundles is seen as a functor on the category of $C^\infty$ manifolds with $C^\infty$ maps as morphisms.

## Tensors and the chain rule

See tensor field for an advanced explanation of the fundamental role the chain rule plays in the geometric nature of tensors.

## Higher derivatives

Faà di Bruno's formula generalizes the chain rule to higher derivatives. The first few derivatives are
$\frac{d(f \circ g)}{dx} = \frac{df}{dg}\frac{dg}{dx}$

$\frac{d^2(f \circ g)}{dx^2} = \frac{d^2 f}{dg^2}\left(\frac{dg}{dx}\right)^2 + \frac{df}{dg}\frac{d^2 g}{dx^2}$

$\frac{d^3(f \circ g)}{dx^3} = \frac{d^3 f}{dg^3}\left(\frac{dg}{dx}\right)^3 + 3\frac{d^2 f}{dg^2}\frac{dg}{dx}\frac{d^2 g}{dx^2} + \frac{df}{dg}\frac{d^3 g}{dx^3}$

$\frac{d^4(f \circ g)}{dx^4} = \frac{d^4 f}{dg^4}\left(\frac{dg}{dx}\right)^4 + 6\frac{d^3 f}{dg^3}\left(\frac{dg}{dx}\right)^2\frac{d^2 g}{dx^2} + \frac{d^2 f}{dg^2}\left\{4\frac{dg}{dx}\frac{d^3 g}{dx^3} + 3\left(\frac{d^2 g}{dx^2}\right)^2\right\} + \frac{df}{dg}\frac{d^4 g}{dx^4}.$
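The second- and third-order formulas can be spot-checked on a case where the derivatives are known in closed form. The sketch below assumes f = sin and g(x) = x², for which differentiating sin(x²) by hand gives 2cos(x²) − 4x²sin(x²) and −12x sin(x²) − 8x³cos(x²). The function names are hypothetical.

```python
import math

def second_derivative_of_composite(f1, f2, g1, g2):
    """f1, f2: f' and f'' evaluated at g(x); g1, g2: g' and g'' evaluated at x."""
    return f2 * g1**2 + f1 * g2

def third_derivative_of_composite(f1, f2, f3, g1, g2, g3):
    """Third-order case: f''' g'^3 + 3 f'' g' g'' + f' g'''."""
    return f3 * g1**3 + 3 * f2 * g1 * g2 + f1 * g3

x = 0.9
u = x**2                                                 # g(x)
f1, f2, f3 = math.cos(u), -math.sin(u), -math.cos(u)     # derivatives of sin at u
g1, g2, g3 = 2 * x, 2.0, 0.0                             # derivatives of x^2 at x

second_formula = second_derivative_of_composite(f1, f2, g1, g2)
third_formula = third_derivative_of_composite(f1, f2, f3, g1, g2, g3)

# Closed forms obtained by differentiating sin(x^2) directly.
second_closed = 2 * math.cos(u) - 4 * x**2 * math.sin(u)
third_closed = -12 * x * math.sin(u) - 8 * x**3 * math.cos(u)

print(second_formula - second_closed, third_formula - third_closed)
```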