The lambda calculus can be thought of as an idealized, minimalistic programming language. It is capable of expressing any algorithm, and it is this fact that makes the model of functional programming an important one. Functional programs are stateless and deal exclusively with functions that accept and return data (including other functions), but they produce no side effects in 'state' and thus make no alterations to incoming data. Modern functional languages, building on the lambda calculus, include Erlang, Haskell, Lisp, ML, and Scheme.
This article deals with the "untyped lambda calculus" as originally conceived by Church. Most modern applications concern typed lambda calculi.
Informal description
In lambda calculus, every expression is a unary function, i.e. a function with only one input (known as its argument). When an expression is applied to another expression ('called' with the other expression as its argument), it returns a single value (known as its result).
Every expression is a unary function, and every argument and result is a function too. This uniformity makes lambda calculus distinctive within both computation and mathematics.
A function is anonymously defined by a lambda expression which expresses the function's action on its argument. For instance, the "add-two" function f such that f(x) = x + 2 would be expressed in lambda calculus as λ x. x + 2 (or equivalently as λ y. y + 2; the name of the formal parameter is immaterial) and the application of the function f(3) would be written as (λ x. x + 2) 3. Note that part of what makes this description "informal" is that the expression x + 2 (or even the number 2) is not part of lambda calculus; an explanation of how numbers and arithmetic can be represented in lambda calculus is below. Function application is left associative: fxy = (fx) y. Consider the function which takes a function as an argument and applies it to the number 3 as follows: λ f. f 3. This latter function could be applied to our earlier "add-two" function as follows: (λ f. f 3) (λ x. x + 2). The three expressions:
(λ f. f 3) (λ x. x + 2)
(λ x. x + 2) 3
3 + 2
are equivalent.
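The equivalence of these three expressions can be checked in any language with anonymous functions. The following is a minimal sketch in Python, whose `lambda` form stands in for λ. The names `add_two` and `apply_to_three` are illustrative and do not appear above, and Python's built-in integers and `+` stand in for numerals and arithmetic that lambda calculus itself does not provide.

```python
# λ x. x + 2
add_two = lambda x: x + 2

# λ f. f 3
apply_to_three = lambda f: f(3)

# The three expressions from the text, written out in Python.
first = apply_to_three(add_two)   # (λ f. f 3) (λ x. x + 2)
second = add_two(3)               # (λ x. x + 2) 3
third = 3 + 2                     # 3 + 2
```

All three names end up bound to the same value.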
A function of two variables is expressed in lambda calculus as a function of one argument which returns a function of one argument (see currying). For instance, the function f(x, y) = x - y would be written as λ x. λ y. x - y. A common convention is to abbreviate curried functions as, in this example, λ xy. x - y. While it is not part of the formal definition of the language,
λ x_{1}x_{2} … x_{n}. expression
is used as an abbreviation for
λ x_{1}. λ x_{2}. … λ x_{n}. expression
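Currying can be illustrated with nested one-argument functions. This is a small sketch in Python; the names `subtract` and `seven_minus` are illustrative, and Python's built-in `-` again stands in for arithmetic that is not part of lambda calculus proper.

```python
# f(x, y) = x - y as a function of one argument that returns
# a function of one argument: λ x. λ y. x - y
subtract = lambda x: lambda y: x - y

# Application is left associative: subtract 7 2 = (subtract 7) 2
result = subtract(7)(2)

# Supplying only the first argument yields an ordinary
# one-argument function, here λ y. 7 - y.
seven_minus = subtract(7)
```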
Not every lambda expression can be reduced to a definite value like the ones above; consider for instance
(λ x. xx) (λ x. xx)
or
(λ x. xxx) (λ x. xxx)
and try to visualize what happens when you start to apply the first function to its argument.
(λ x. xx) is also known as the ω combinator;
((λ x. xx) (λ x. xx))
is known as Ω,
((λ x. xxx) (λ x. xxx))
as Ω_{2}, etc.
Lambda calculus expressions may contain free variables, i.e. variables not bound by any λ. For example, the variable y is free in the expression (λ x. y), representing a function which always produces the result y. Occasionally, this necessitates the renaming of formal arguments. For example, in the formula below, the letter y is used first as a formal parameter, then as a free variable:
(λ xy. yx) (λ x. y).
To reduce the expression, we rename the bound identifier y to z so that the reduction does not mix up the names:
(λ xz. zx) (λ x. y)
The reduction is then
λ z. z (λ x. y).
If one only formalizes the notion of function application and replaces the use of lambda expressions by the use of combinators, one obtains combinatory logic.
Formal definition
Definition
Lambda expressions are composed of
variables v_{1}, v_{2}, . . . v_{n}
the abstraction symbols λ and .
parentheses ()
The set of lambda expressions, Λ, can be defined recursively:
1. If x is a variable, then x ∈ Λ
2. If x is a variable and M ∈ Λ, then (λ x. M) ∈ Λ
3. If M, N ∈ Λ, then (M N) ∈ Λ
Instances of 2 are known as abstractions and instances of 3, applications.
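The recursive definition maps directly onto a small data type with one constructor per rule. The following Python sketch is one possible representation, not a canonical one; the class names `Var`, `Lam` and `App` and the helper `show` are assumptions introduced for illustration.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Var:
    """Rule 1: a variable is a lambda expression."""
    name: str


@dataclass(frozen=True)
class Lam:
    """Rule 2: an abstraction (λ x. M)."""
    param: str
    body: "Term"


@dataclass(frozen=True)
class App:
    """Rule 3: an application (M N)."""
    func: "Term"
    arg: "Term"


Term = Union[Var, Lam, App]


def show(t: Term) -> str:
    """Render a term fully parenthesized, exactly as the rules build it."""
    if isinstance(t, Var):
        return t.name
    if isinstance(t, Lam):
        return f"(λ{t.param}.{show(t.body)})"
    return f"({show(t.func)} {show(t.arg)})"


identity = Lam("x", Var("x"))          # (λx.x)
example = App(identity, Var("y"))      # ((λx.x) y)
```

Because the classes are frozen dataclasses, two terms compare equal exactly when they have the same structure and the same variable names.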
Notation
To keep the notation of lambda expressions uncluttered, the following conventions are usually applied.
Outermost parentheses are dropped: M N instead of (M N).
Applications are assumed to be left associative: M N P means (M N) P.
The body of an abstraction extends as far right as possible: λ x. M N means λ x. (M N) and not (λ x. M) N.
A sequence of abstractions is contracted: λ x. λ y. λ z. N is abbreviated as λ x y z. N.
Free and bound variables
The abstraction operator, λ, is said to bind its variable wherever it occurs in the body of the abstraction. Variables that fall within the scope of a lambda are said to be bound. All other variables are called free. For example in the following expression y is a bound variable and x is free:
λ y . xxy
Also note that a variable binds to its "nearest" lambda. In the following expression one single occurrence of x is bound by the second lambda:
λ x . y (λ x . z x)
The set of free variables of a lambda expression, M, is denoted as FV(M) and is defined by recursion on the structure of the terms, as follows:
FV(x) = {x}, where x is a variable
FV(λ x. M) = FV(M) - {x}
FV(M N) = FV(M) ∪ FV(N)
An expression which contains no free variables is said to be closed. Closed lambda expressions are also known as combinators and are equivalent to terms in combinatory logic.
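The three equations for FV translate almost line for line into a recursive function. This Python sketch assumes a hypothetical term representation (`Var`, `Lam`, `App`) that is not part of the article; `free_vars` and `is_closed` are likewise illustrative names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Var:
    name: str


@dataclass(frozen=True)
class Lam:
    param: str
    body: object


@dataclass(frozen=True)
class App:
    func: object
    arg: object


def free_vars(t):
    """Compute FV(t) by recursion on the structure of the term."""
    if isinstance(t, Var):
        return frozenset({t.name})                    # FV(x) = {x}
    if isinstance(t, Lam):
        return free_vars(t.body) - {t.param}          # FV(λ x. M) = FV(M) - {x}
    return free_vars(t.func) | free_vars(t.arg)       # FV(M N) = FV(M) ∪ FV(N)


def is_closed(t):
    """A term is closed (a combinator) when it has no free variables."""
    return not free_vars(t)


# λ y. x x y  (x is free, y is bound)
t1 = Lam("y", App(App(Var("x"), Var("x")), Var("y")))
# λ x. y (λ x. z x)  (both occurrences of x are bound)
t2 = Lam("x", App(Var("y"), Lam("x", App(Var("z"), Var("x")))))
```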
Reduction
α-conversion
Alpha conversion allows bound variable names to be changed. For example, an alpha conversion of λx.x would be λy.y. Frequently in uses of lambda calculus, terms that differ only by alpha conversion are considered to be equivalent.
The precise rules for alpha conversion are not completely trivial. First, when alpha-converting an abstraction, the only variable occurrences that are renamed are those that are bound to the same abstraction. For example, an alpha conversion of λx.λx.x could result in λy.λx.x, but it could not result in λy.λx.y. The latter has a different meaning from the original.
Second, alpha conversion is not possible if it would result in a variable getting captured by a different abstraction. For example, if we replace x with y in λx.λy.x, we get λy.λy.y, which is not at all the same.
Substitution
Substitution, written E[V := E′], corresponds to the replacement of a variable V by expression E′ every place it is free within E. The precise definition must be careful in order to avoid accidental variable capture. For example, it is not correct for (λ x.y)[y := x] to result in (λ x.x), because the substituted x was supposed to be free but ended up being bound. The correct substitution in this case is (λ z.x), up-to α-equivalence.
Substitution on terms of the λ-calculus is defined by recursion on the structure of terms, as follows.
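x[x := N] ≡ N
y[x := N] ≡ y, if x ≠ y
(M_{1} M_{2})[x := N] ≡ (M_{1}[x := N]) (M_{2}[x := N])
(λ x. M)[x := N] ≡ λ x. M
(λ y. M)[x := N] ≡ λ y. (M[x := N]), if x ≠ y and y ∉ FV(N)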
Notice that substitution is defined uniquely up-to α-equivalence.
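Capture-avoiding substitution can be sketched as a recursive function that α-converts a bound variable whenever it would capture a free variable of the substituted term. The Python code below is one possible formulation under stated assumptions: the term classes, `free_vars`, `fresh` and the fresh-name scheme `z0, z1, ...` are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass
from itertools import count


@dataclass(frozen=True)
class Var:
    name: str


@dataclass(frozen=True)
class Lam:
    param: str
    body: object


@dataclass(frozen=True)
class App:
    func: object
    arg: object


def free_vars(t):
    if isinstance(t, Var):
        return frozenset({t.name})
    if isinstance(t, Lam):
        return free_vars(t.body) - {t.param}
    return free_vars(t.func) | free_vars(t.arg)


def fresh(avoid):
    """Return the first name of the form z0, z1, ... that is not in `avoid`."""
    for i in count():
        name = f"z{i}"
        if name not in avoid:
            return name


def subst(t, x, n):
    """Compute t[x := n], renaming bound variables to avoid capture."""
    if isinstance(t, Var):
        return n if t.name == x else t
    if isinstance(t, App):
        return App(subst(t.func, x, n), subst(t.arg, x, n))
    # t is an abstraction λ y. M
    y, m = t.param, t.body
    if y == x:
        return t                      # x is rebound here, so nothing is free to replace
    if y in free_vars(n):
        # α-convert λ y. M to λ y2. M[y := y2] before substituting
        y2 = fresh(free_vars(n) | free_vars(m) | {x})
        m = subst(m, y, Var(y2))
        y = y2
    return Lam(y, subst(m, x, n))


# (λ x. y)[y := x] must not produce λ x. x
result = subst(Lam("x", Var("y")), "y", Var("x"))
```

With this naming scheme `result` is λ z0. x, which agrees with the (λ z.x) given in the text up to α-equivalence.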
β-reduction
Beta reduction expresses the idea of function application. The beta reduction of ((λ V. E) E′) is simply E[V := E′].
η-conversion
Eta conversion expresses the idea of extensionality, which in this context is that two functions are the same if and only if they give the same result for all arguments. Eta-conversion converts between λ x. fx and f whenever x does not appear free in f.
This conversion is not always appropriate when lambda expressions are interpreted as programs. Evaluation of λ x. fx can terminate even when evaluation of f does not.
Arithmetic in lambda calculus
There are several possible ways to define the natural numbers in lambda calculus, but by far the most common are the Church numerals, which can be defined as follows:
0 := λ fx. x
1 := λ fx. fx
2 := λ fx. f (fx)
3 := λ fx. f (f (fx))
and so on. A Church numeral is a higher-order function: it takes a single-argument function f and returns another single-argument function. The Church numeral n is a function that takes a function f as argument and returns the n-th composition of f, i.e. the function f composed with itself n times. This is denoted f^{(n)} and is in fact the n-th power of f (considered as an operator); f^{(0)} is defined to be the identity function. Such repeated compositions (of a single function f) obey the laws of exponents, which is why these numerals can be used for arithmetic. Note that 1 returns f itself, i.e. it is essentially the identity function, and 0 returns the identity function regardless of f. (Also note that in Church's original lambda calculus, the formal parameter of a lambda expression was required to occur at least once in the function body, which made the above definition of 0 impossible.)
We can define a successor function, which takes a number n and returns n + 1 by adding an additional application of f:
SUCC := λ nfx. f (nfx)
Because the m-th composition of f composed with the n-th composition of f gives the m+n-th composition of f, addition can be defined as follows:
PLUS := λ mnfx. nf (mfx)
PLUS can be thought of as a function taking two natural numbers as arguments and returning a natural number; it can be verified that
PLUS 2 3 and 5
are equivalent lambda expressions. Since adding m to a number n can be accomplished by adding 1 m times, an equivalent definition is:
PLUS := λ nm. m SUCC n
Similarly, multiplication can be defined as
MULT := λ mnf . m (nf)
Alternatively
MULT := λ mn. m (PLUS n) 0,
since multiplying m and n is the same as repeating the "add n" function m times and then applying it to zero.
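The numerals and arithmetic operators above can be transcribed into Python lambdas to check them on small cases. This is a sketch rather than a faithful model of reduction: Python evaluates eagerly, and the helper `to_int` and the names `PLUS_ALT` and `MULT_ALT` (for the alternative definitions) are assumptions introduced here.

```python
# Church numerals
ZERO = lambda f: lambda x: x
ONE = lambda f: lambda x: f(x)
TWO = lambda f: lambda x: f(f(x))
THREE = lambda f: lambda x: f(f(f(x)))

# SUCC := λ nfx. f (nfx)
SUCC = lambda n: lambda f: lambda x: f(n(f)(x))

# PLUS := λ mnfx. nf (mfx)
PLUS = lambda m: lambda n: lambda f: lambda x: n(f)(m(f)(x))

# PLUS := λ nm. m SUCC n   (alternative definition)
PLUS_ALT = lambda n: lambda m: m(SUCC)(n)

# MULT := λ mnf. m (nf)
MULT = lambda m: lambda n: lambda f: m(n(f))

# MULT := λ mn. m (PLUS n) 0   (alternative definition)
MULT_ALT = lambda m: lambda n: m(PLUS(n))(ZERO)


def to_int(n):
    """Convert a Church numeral to a Python int by counting applications of f."""
    return n(lambda k: k + 1)(0)
```

For example, `to_int(PLUS(TWO)(THREE))` counts five applications of the increment function, matching the claim that PLUS 2 3 and 5 are equivalent.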
The predecessor function defined by PRED n = n - 1 for a positive integer n and PRED 0 = 0 is considerably more difficult. The formula
PRED := λ nfx. n (λ gh. h (gf)) (λ u. x) (λ u. u)
can be validated by showing inductively that if T denotes (λ gh. h (gf)), then T^{(n)}(λ u. x) = (λ h. h(f^{(n-1)}(x)) ) for n > 0. Two other definitions of PRED are given below, one using conditionals and the other using pairs. With the predecessor function, subtraction is straightforward. Defining
SUB := λ mn. n PRED m,
SUB mn yields m - n when m > n and 0 otherwise.
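The predecessor formula is easier to trust after trying it on a few numerals. The Python sketch below transcribes PRED and SUB as given; `church` and `to_int` are hypothetical conversion helpers that are not part of the calculus.

```python
ZERO = lambda f: lambda x: x
SUCC = lambda n: lambda f: lambda x: f(n(f)(x))

# PRED := λ nfx. n (λ gh. h (gf)) (λ u. x) (λ u. u)
PRED = lambda n: lambda f: lambda x: (
    n(lambda g: lambda h: h(g(f)))(lambda u: x)(lambda u: u)
)

# SUB := λ mn. n PRED m
SUB = lambda m: lambda n: n(PRED)(m)


def church(k):
    """Build the Church numeral for a non-negative Python int."""
    n = ZERO
    for _ in range(k):
        n = SUCC(n)
    return n


def to_int(n):
    """Convert a Church numeral back to a Python int."""
    return n(lambda k: k + 1)(0)
```

Subtraction is truncated: `SUB(church(3))(church(5))` yields the numeral 0 rather than a negative number.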
Logic and predicates
By convention, the following two definitions (known as Church booleans) are used for the boolean values TRUE and FALSE:
TRUE := λ xy. x
FALSE := λ xy. y
(Note that FALSE is equivalent to the Church numeral zero defined above.)
Then, with these two λ-terms, we can define some logic operators (these are just possible formulations; other expressions are equally correct):
AND := λ p q. p q p
OR := λ p q. p p q
NOT := λ p a b. p b a
IFTHENELSE := λ p. p
We are now able to compute some logic functions, for example:
AND TRUE FALSE
≡ (λ p q. p q p) TRUE FALSE →_{β} TRUE FALSE TRUE
≡ (λ x y. x) FALSE TRUE →_{β} FALSE
and we see that AND TRUE FALSE is equivalent to FALSE.
A predicate is a function which returns a boolean value. The most fundamental predicate is ISZERO which returns TRUE if its argument is the Church numeral 0, and FALSE if its argument is any other Church numeral:
ISZERO := λ n. n (λ x. FALSE) TRUE
The following predicate tests whether the first argument is less-than-or-equal-to the second:
LEQ := λ m n. ISZERO (SUB m n),
and since m = n iff LEQ m n and LEQ n m, it is straightforward to build a predicate for numerical equality.
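The booleans, logic operators and predicates above can be exercised in Python as follows. This is an illustrative sketch: `EQ` is a hypothetical name for the equality predicate the text describes, and `church` and `to_bool` are assumed helpers for converting to and from Python values.

```python
# Church booleans and logic operators
TRUE = lambda x: lambda y: x
FALSE = lambda x: lambda y: y
AND = lambda p: lambda q: p(q)(p)
OR = lambda p: lambda q: p(p)(q)
NOT = lambda p: lambda a: lambda b: p(b)(a)

# Numerals with PRED and SUB, as defined in the arithmetic section
ZERO = lambda f: lambda x: x
SUCC = lambda n: lambda f: lambda x: f(n(f)(x))
PRED = lambda n: lambda f: lambda x: (
    n(lambda g: lambda h: h(g(f)))(lambda u: x)(lambda u: u)
)
SUB = lambda m: lambda n: n(PRED)(m)

# Predicates
ISZERO = lambda n: n(lambda x: FALSE)(TRUE)
LEQ = lambda m: lambda n: ISZERO(SUB(m)(n))
# m = n iff LEQ m n and LEQ n m (the name EQ is an assumption)
EQ = lambda m: lambda n: AND(LEQ(m)(n))(LEQ(n)(m))


def church(k):
    """Build the Church numeral for a non-negative Python int."""
    n = ZERO
    for _ in range(k):
        n = SUCC(n)
    return n


def to_bool(b):
    """Convert a Church boolean to a Python bool by letting it select."""
    return b(True)(False)
```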
The availability of predicates and the above definition of TRUE and FALSE make it convenient to write "if-then-else" expressions in lambda calculus. For example, the predecessor function can be defined as
PRED := λ n. n (λ g k. ISZERO (g 1) k (PLUS (g k) 1) ) (λ v. 0) 0
which can be verified by showing inductively that n (λ g k. ISZERO (g 1) k (PLUS (g k) 1) ) (λ v. 0) is the "add n - 1" function for n > 0.
Pairs
A pair (2-tuple) can be defined in terms of TRUE and FALSE, by using the Church encoding for pairs. For example, PAIR encapsulates the pair (x,y), FIRST returns the first element of the pair, and SECOND returns the second.
PAIR := λ xyf. fxy
FIRST := λ p. p TRUE
SECOND := λ p. p FALSE
NIL := λ x. TRUE
NULL := λp. p (λx y.FALSE)
A linked list can be defined as either NIL for the empty list, or the PAIR of an element and a smaller list. The predicate NULL tests for the value NIL.
As an example of the use of pairs, the shift-and-increment function that maps (m, n) to (n, n+1) can be defined as
Φ := λ x. PAIR (SECOND x) (SUCC (SECOND x))
which allows us to give perhaps the most transparent version of the predecessor function:
PRED := λ n. FIRST (n Φ (PAIR 0 0))
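The pair encoding and the pair-based predecessor can be checked with a short Python transcription. As in the other sketches, `PHI` is simply an ASCII name for Φ, and `church`, `to_int` and `to_bool` are hypothetical helpers added for testing.

```python
TRUE = lambda x: lambda y: x
FALSE = lambda x: lambda y: y

# Pairs and lists
PAIR = lambda x: lambda y: lambda f: f(x)(y)
FIRST = lambda p: p(TRUE)
SECOND = lambda p: p(FALSE)
NIL = lambda x: TRUE
NULL = lambda p: p(lambda x: lambda y: FALSE)

ZERO = lambda f: lambda x: x
SUCC = lambda n: lambda f: lambda x: f(n(f)(x))

# Φ := λ x. PAIR (SECOND x) (SUCC (SECOND x)), mapping (m, n) to (n, n+1)
PHI = lambda x: PAIR(SECOND(x))(SUCC(SECOND(x)))

# PRED := λ n. FIRST (n Φ (PAIR 0 0))
PRED = lambda n: FIRST(n(PHI)(PAIR(ZERO)(ZERO)))


def church(k):
    """Build the Church numeral for a non-negative Python int."""
    n = ZERO
    for _ in range(k):
        n = SUCC(n)
    return n


def to_int(n):
    return n(lambda k: k + 1)(0)


def to_bool(b):
    return b(True)(False)
```

Applying Φ four times to (0, 0) gives (0, 1), (1, 2), (2, 3), (3, 4), so the first component of the result is the predecessor of 4.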
Recursion
Recursion is the definition of a function using the function itself; on the face of it, lambda calculus does not allow this. However, this impression is misleading. Consider for instance the factorial function f(n) recursively defined by
f(n) = 1, if n = 0; and n·f(n-1), if n>0.
In lambda calculus, one cannot define a function which includes itself. To get around this, one may start by defining a function, here called g, which takes a function f as an argument and returns another function that takes n as an argument:
g := λ fn. (1, if n = 0; and n·f(n-1), if n>0).
The function that g returns is either the constant 1, or n times the application of the function f to n-1. Using the ISZERO predicate, and boolean and algebraic definitions described above, the function g can be defined in lambda calculus.
However, g by itself is still not recursive; in order to use g to create the recursive factorial function, the function passed to g as f must have specific properties. Namely, the function passed as f must expand to the function g called with one argument -- and that argument must be the function that was passed as f again!
In other words, f must expand to g(f). This call to g will then expand to the above factorial function and calculate down to another level of recursion. In that expansion the function f will appear again, and will again expand to g(f) and continue the recursion. This kind of function, where f = g(f), is called a fixed-point of g, and it turns out that it can be implemented in the lambda calculus using what is known as the paradoxical operator or fixed-point operator and is represented as Y -- the Y combinator:
Y = λ g. (λ x. g (xx)) (λ x. g (xx))
In the lambda calculus, Y g is a fixed-point of g, as it expands to g (Yg). Now, to complete our recursive call to the factorial function, we would simply call g (Yg) n, where n is the number we are calculating the factorial of.
Given n = 5, for example, this expands to:
(λ n.(1, if n = 0; and n·((Y g)(n-1)), if n>0)) 5
1, if 5 = 0; and 5·(g(Y g)(5-1)), if 5>0
5·(g(Y g) 4)
5·(λ n. (1, if n = 0; and n·((Y g)(n-1)), if n>0) 4)
5·(1, if 4 = 0; and 4·(g(Y g)(4-1)), if 4>0)
5·(4·(g(Y g) 3))
5·(4·(λ n. (1, if n = 0; and n·((Y g)(n-1)), if n>0) 3))
5·(4·(1, if 3 = 0; and 3·(g(Y g)(3-1)), if 3>0))
5·(4·(3·(g(Y g) 2)))
...
And so on, evaluating the structure of the algorithm recursively. Every recursively defined function can be seen as a fixed point of some other suitable function, and therefore, using Y, every recursively defined function can be expressed as a lambda expression. In particular, we can now cleanly define the subtraction, multiplication and comparison predicate of natural numbers recursively.
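The fixed-point construction can be tried out in Python, with one caveat: Python evaluates arguments eagerly, so the Y combinator as written above would recurse forever before g is ever called. The sketch below therefore swaps in the η-expanded variant usually called the Z combinator, which delays the self-application behind an extra lambda. The name `Z` and the use of Python integers and a conditional expression in place of Church encodings are assumptions made for readability.

```python
# Z := λ g. (λ x. g (λ v. x x v)) (λ x. g (λ v. x x v))
# A call-by-value variant of Y; not the combinator given in the text.
Z = lambda g: (lambda x: g(lambda v: x(x)(v)))(lambda x: g(lambda v: x(x)(v)))

# g := λ f n. (1, if n = 0; and n·f(n-1), if n > 0)
# g itself is not recursive: it only calls the function f it is given.
g = lambda f: lambda n: 1 if n == 0 else n * f(n - 1)

# Z g is a fixed point of g, i.e. it behaves like g (Z g).
factorial = Z(g)
```

Neither `Z` nor `g` refers to itself by name, yet `factorial(5)` unfolds through the same chain of multiplications as the expansion shown above.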
Computable functions and lambda calculus
A function F: N → N of natural numbers is a computable function if and only if there exists a lambda expression f such that for every pair of x, y in N, F(x) = y if and only if fx == y, where x and y are the Church numerals corresponding to x and y, respectively. This is one of the many ways to define computability; see the Church-Turing thesis for a discussion of other approaches and their equivalence.
Undecidability of equivalence
There is no algorithm which takes as input two lambda expressions and outputs TRUE or FALSE depending on whether or not the two expressions are equivalent. This was historically the first problem for which undecidability could be proven. As is common for a proof of undecidability, the proof shows that no computable function can decide the equivalence. Church's thesis is then invoked to show that no algorithm can do so.
Church's proof first reduces the problem to determining whether a given lambda expression has a normal form. A normal form is an equivalent expression which cannot be reduced any further. Then he assumes that this predicate is computable, and can hence be expressed in lambda calculus. Building on earlier work by Kleene and constructing a Gödel numbering for lambda expressions, he constructs a lambda expression e which closely follows the proof of Gödel's first incompleteness theorem. If e is applied to its own Gödel number, a contradiction results.
Implementing the lambda calculus on a computer involves treating "functions" as first-class objects, which raises implementation issues for stack-based programming languages. This is known as the Funarg problem.
The most prominent counterparts to lambda calculus in programming are functional programming languages, which essentially implement the calculus augmented with some constants and datatypes. Lisp uses a variant of lambda notation for defining functions, but only its purely functional subset ("Pure Lisp") is really equivalent to lambda calculus.
Functional languages are not the only ones to support functions as first-class objects. Numerous imperative languages, e.g. Pascal, have long supported passing subprograms as arguments to other subprograms. In C and the C-like subset of C++ the equivalent result is obtained by passing pointers to the code of functions (subprograms). Such mechanisms are limited to subprograms written explicitly in the code, and do not directly support higher-level functions. Some imperative object-oriented languages have notations that represent functions of any order; such mechanisms are available in C++, Smalltalk and more recently in Eiffel ("agents") and C# ("delegates"). As an example, the Eiffel "inline agent" expression
A Python example of this uses the lambda form of functions:
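A minimal sketch of such an example, using a squaring function to match the C# delegate example in this section. The names `square` and `execute` are illustrative choices, not taken from the article.

```python
# An anonymous squaring function, bound to a name so it can be reused.
square = lambda x: x * x


def execute(f):
    """Call the function value f on 100 and return its result."""
    return f(100)


# The function value can be passed by name or written inline.
result = execute(square)
inline = execute(lambda x: x * x)
```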
The same holds for Smalltalk expression
A similar C++ example (using the Boost.Lambda library):
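A simple C# delegate takes a variable and returns its square. This function variable can then be passed to other methods (or function delegates):
// Declare a delegate signature (assumed from its use below:
// it takes a double and returns a double)
delegate double MathDelegate(double i);
// Create a delegate instance
MathDelegate f = delegate(double i) { return i * i; };
/* Pass the function variable 'f' to another method, executing it
and returning the result of the function
*/
double Execute(MathDelegate f)
{
    return f(100);
}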
In C# 3.0, the language gained lambda expressions in a form similar to those of Python or Lisp. A lambda expression resolves to a delegate as in the previous example, so the code above can be simplified as follows.
// Create a delegate instance
MathDelegate f = i => i * i;
Execute(f);
// or more simply put
Execute(i => i * i);
Reduction strategies
Whether a term is normalising or not, and how much work needs to be done in normalising it if it is, depends to a large extent on the reduction strategy used. The distinction between reduction strategies relates to the distinction in functional programming languages between eager evaluation and lazy evaluation.
The following uses the term 'redex', short for 'reducible expression'. For example, (λ x. M) N is a beta-redex; λ x. M x is an eta-redex if x is not free in M. The expression to which a redex reduces is called its reduct; using the previous example, the reducts of these expressions are respectively M[x:=N] and M.
Full beta reductions: Any redex can be reduced at any time. This means essentially the lack of any particular reduction strategy; with regard to reducibility, "all bets are off".
Applicative order: The rightmost, innermost redex is always reduced first. Intuitively this means a function's arguments are always reduced before the function itself. Applicative order always attempts to apply functions to normal forms, even when this is not possible.
Most programming languages (including Lisp, ML and imperative languages like C and Java) are described as "strict", meaning that functions applied to non-normalising arguments are non-normalising. This is done essentially using applicative order, call by value reduction (see below), but usually called "eager evaluation".
Normal order: The leftmost, outermost redex is always reduced first. That is, whenever possible the arguments are substituted into the body of an abstraction before the arguments are reduced.
Call by name: As normal order, but no reductions are performed inside abstractions. For example λ x.(λ x.x)x is in normal form according to this strategy, although it contains the redex (λ x.x)x.
Call by value: Only the outermost redexes are reduced: a redex is reduced only when its right hand side has reduced to a value (variable or lambda abstraction).
Call by need: As normal order, but function applications that would duplicate terms instead name the argument, which is then reduced only "when it is needed". Called in practical contexts "lazy evaluation". In implementations this "name" takes the form of a pointer, with the redex represented by a thunk.
Applicative order is not a normalising strategy. The usual counterexample is as follows: define Ω = ωω where ω = λ x. xx. This entire expression contains only one redex, namely the whole expression; its reduct is again Ω. Since this is the only available reduction, Ω has no normal form (under any evaluation strategy). Using applicative order, the expression KIΩ = (λ x y . x)(λ x.x)Ω is reduced by first reducing Ω to normal form (since it is the rightmost redex), but since Ω has no normal form, applicative order fails to find a normal form for KIΩ.
In contrast, normal order is so called because it always finds a normalising reduction if one exists. In the above example, KIΩ reduces under normal order to I, a normal form. A drawback is that redexes in the arguments may be copied, resulting in duplicated computation (for example, (λ x.xx)((λ x.x)y) reduces to ((λx.x)y)((λx.x)y) using this strategy; now there are two redexes, so full evaluation needs two more steps, but if the argument had been reduced first, there would now be none).
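Normal-order reduction can be sketched as a small interpreter that always contracts the leftmost, outermost redex. The Python code below is an illustrative implementation under stated assumptions: the term classes, the fresh-name scheme `v0, v1, ...`, and the `max_steps` cut-off (used to give up on terms such as Ω) are all choices made here, not part of the calculus.

```python
from dataclasses import dataclass
from itertools import count


@dataclass(frozen=True)
class Var:
    name: str


@dataclass(frozen=True)
class Lam:
    param: str
    body: object


@dataclass(frozen=True)
class App:
    func: object
    arg: object


def free_vars(t):
    if isinstance(t, Var):
        return frozenset({t.name})
    if isinstance(t, Lam):
        return free_vars(t.body) - {t.param}
    return free_vars(t.func) | free_vars(t.arg)


def subst(t, x, n):
    """t[x := n], renaming bound variables to avoid capture."""
    if isinstance(t, Var):
        return n if t.name == x else t
    if isinstance(t, App):
        return App(subst(t.func, x, n), subst(t.arg, x, n))
    if t.param == x:
        return t
    y, m = t.param, t.body
    if y in free_vars(n):
        avoid = free_vars(n) | free_vars(m) | {x}
        y2 = next(f"v{i}" for i in count() if f"v{i}" not in avoid)
        m, y = subst(m, y, Var(y2)), y2
    return Lam(y, subst(m, x, n))


def step_normal(t):
    """One leftmost-outermost β-step, or None if t is in normal form."""
    if isinstance(t, App):
        if isinstance(t.func, Lam):                 # t itself is the outermost redex
            return subst(t.func.body, t.func.param, t.arg)
        f = step_normal(t.func)                     # otherwise look left first
        if f is not None:
            return App(f, t.arg)
        a = step_normal(t.arg)                      # then right
        return None if a is None else App(t.func, a)
    if isinstance(t, Lam):
        b = step_normal(t.body)
        return None if b is None else Lam(t.param, b)
    return None                                     # a variable has no redex


def normalize(t, max_steps=100):
    """Reduce in normal order; return None if no normal form is reached in time."""
    for _ in range(max_steps):
        nxt = step_normal(t)
        if nxt is None:
            return t
        t = nxt
    return None


omega = Lam("x", App(Var("x"), Var("x")))
OMEGA = App(omega, omega)                           # Ω = ωω
K = Lam("x", Lam("y", Var("x")))
I = Lam("x", Var("x"))
kio = App(App(K, I), OMEGA)                         # K I Ω
```

Here `normalize(kio)` returns `I` without ever reducing inside Ω, while `normalize(OMEGA)` gives up after `max_steps` because Ω only reduces to itself.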
The advantage of applicative order is that it does not cause unnecessary computation if all arguments are used, because it never substitutes arguments containing redexes and hence never needs to copy them (which would duplicate work). In the above example, in applicative order (λ x.xx)((λ x.x)y) reduces first to (λ x.xx)y and then to the normal form yy, taking two steps instead of three.
Most purely functional programming languages (notably Miranda and its descendants, including Haskell), and the proof languages of theorem provers, use lazy evaluation, which is essentially the same as call by need. This is like normal order reduction, but call by need avoids the duplication of work inherent in normal order reduction by sharing. In the example given above, (λ x.xx)((λ x.x)y) reduces to ((λx.x)y)((λx.x)y), which has two redexes, but in call by need they are represented using the same object rather than copied, so when one is reduced the other is too.
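Sharing can be illustrated with an explicit thunk that caches its result the first time it is forced. The Python sketch below is a simplified model, not how any particular language implements laziness; the `Thunk` class, the `calls` log and the `never_needed` function (which raises instead of looping, as a stand-in for Ω) are assumptions introduced for illustration.

```python
class Thunk:
    """A delayed computation whose result is computed at most once."""

    def __init__(self, compute):
        self._compute = compute
        self._done = False
        self._value = None

    def force(self):
        if not self._done:
            self._value = self._compute()
            self._done = True
            self._compute = None      # drop the closure once evaluated
        return self._value


calls = []                            # records every evaluation of the argument


def reduce_argument():
    """Stands in for reducing the argument (λ x. x) y to y."""
    calls.append("reduced")
    return "y"


# (λ x. x x): the body uses its argument twice, but both uses
# refer to the same thunk, so the argument is reduced only once.
use_twice = lambda x: (x.force(), x.force())
shared = use_twice(Thunk(reduce_argument))


def never_needed():
    """Stands in for Ω; raises rather than looping so misuse is visible."""
    raise RuntimeError("forced an argument that is never needed")


# K I Ω under call by need: the second argument is never forced.
K = lambda x: lambda y: x.force()
result = K(Thunk(lambda: "I"))(Thunk(never_needed))
```

After this runs, `calls` holds a single entry even though the argument was used twice, which is the saving over plain normal order that the text describes.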
A note about complexity
While the idea of beta reduction seems simple enough, it is not an atomic step, in that it must have a non-trivial cost when estimating computational complexity. To be precise, one must somehow find the location of all of the occurrences of the bound variable V in the expression E, implying a time cost, or one must keep track of these locations in some way, implying a space cost. A naïve search for the locations of V in E is O(n) in the length n of E. This has led to the study of systems which use explicit substitution. Sinot's director strings offer a way of tracking the locations of free variables in expressions.
Concurrency and parallelism
The Church-Rosser property of the lambda calculus means that evaluation (β-reduction) can be carried out in any order, even concurrently. This means that various nondeterministic evaluation strategies are relevant. However, the lambda calculus does not offer any explicit constructs for parallelism. Various process calculi have been proposed as minimal languages for concurrency and distributed computation.
Semantics
The fact that lambda calculus terms act as functions on other lambda calculus terms, and even on themselves, led to questions about the semantics of the lambda calculus. Could a sensible meaning be assigned to lambda calculus terms? The natural semantics was to find a set D isomorphic to the function space D → D, of functions on itself. However, no nontrivial such D can exist, by cardinality constraints.
In the 1970s, Dana Scott showed that, if only continuous functions were considered, a set or domain D with the required property could be found, thus providing a model for the lambda calculus.
Church, Alonzo, An unsolvable problem of elementary number theory, American Journal of Mathematics, 58 (1936), pp. 345–363. This paper contains the proof that the equivalence of lambda expressions is in general not decidable.
Kleene, Stephen, A theory of positive integers in formal logic, American Journal of Mathematics, 57 (1935), pp. 153–173 and 219–244. Contains the lambda calculus definitions of several familiar functions.
Landin, Peter, A Correspondence Between ALGOL 60 and Church's Lambda-Notation, Communications of the ACM, vol. 8, no. 2 (1965), pp. 89–101. A classic paper highlighting the importance of lambda-calculus as a basis for programming languages.