Introduction

References:
Field + Harrison, Ch. 6
Stoy, Ch. 5
Glaser, Hankin + Till, Ch. 3
o Lambda calculus is a model of computation established by A. Church and
H.B. Curry in the 1930's.
o lambda calculus vs. Turing machine :

    lambda calculus                       Turing machine
    ---------------                       --------------
    functional languages,                 imperative languages,
      e.g. LISP, Scheme, etc.               e.g. Pascal, C, etc.
    better suited for proving
      properties of programs.
    parallelism.
    to compute a new function,            to compute a new function,
      write a new lambda-exp;               design a new machine;
    input + output are also               input + output are on the tape.
      lambda-exps.
o Equivalence : Any problem that can be formulated in one of these two
models can be formulated in the other.
o example of an "impure" lambda calculus term :
( from now on, '\' is used as the symbol for lambda )

         _____ function body : involves use of the
        /      formal parameter.
       vvv
    \x.x+1
    ^^
     \__ function of 1 parameter.
         formal parameter is named x.
         the function has no name.
o To compute, apply function to arguments.
                 v-- the argument
    e.g. (\x.x+a)3
         ^^^^^^^^-- the function
o Computation involves rewriting : rewrite the whole expression until it
  cannot be rewritten any further.
  - replace all occurrences of the formal parameter in the function body
    by the argument :
    e.g., (\x.x+1)3 ==> 3+1 ==> 4
  - imprecise rewriting rule :
    given a function applied to an argument,
    rewrite to : function body with all occurrences of the formal replaced
    with the argument.
o there are no primitive functions in pure lambda calculus, but there are
  built-in functions, e.g., "+", and constants, etc. in various
  "applied" lambda calculi.
The Pure Lambda Calculus

o Syntax of a lambda expression :
  (1) a variable is a lambda expression.
      ( by convention, single, lowercase letters )
  (2) if M and N are lambda expressions, then so are :
      (a) (M)
      (b) \x.M    "abstraction"   ( defines a function )
      (c) MN      "application"   ( apply a function to arguments )
  - no constants, such as integers.
  - no built-in functions, such as '+'.
  - no types.
o the syntax can be roughly represented by the following grammar :

      exp -> ID
           | ( exp )
           | \ID. exp
           | exp exp
           ;

  - application is left-associative :
    e.g. ABC = ((AB)C)
  - application has higher precedence than abstraction :
    e.g. \x.AB = \x.(AB) but not (\x.A)B
o Confused? build an abstract-syntax tree.
  e.g. (\x.\y.x)\z.z

  parse tree :
                       exp             <-- an application
                      /   \
                   exp     exp
                  / | \    /  \
                 ( exp )  \z. exp
                   /  \        |
                \x.   exp      z
                      /  \
                   \y.   exp
                          |
                          x

  abstract syntax tree :
                 apply
                /     \
           lambda      lambda
           /    \      /    \
          x   lambda  z      z
              /    \
             y      x
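As a sketch (not from the notes), the three productions of the grammar map directly onto a small abstract-syntax-tree datatype. The names `Var`/`Lam`/`App` and the printer are my own choices:

```python
# Abstract syntax trees for lambda terms: one class per production.
from dataclasses import dataclass

class Term: pass

@dataclass
class Var(Term):          # a variable
    name: str

@dataclass
class Lam(Term):          # \x.M   "abstraction"
    param: str
    body: Term

@dataclass
class App(Term):          # MN     "application"
    fn: Term
    arg: Term

def show(t: Term) -> str:
    """Pretty-print, parenthesizing just enough (application is
    left-associative, abstraction extends as far right as possible)."""
    if isinstance(t, Var):
        return t.name
    if isinstance(t, Lam):
        return f"\\{t.param}.{show(t.body)}"
    fn, arg = show(t.fn), show(t.arg)
    if isinstance(t.fn, Lam):
        fn = f"({fn})"
    if isinstance(t.arg, (Lam, App)):
        arg = f"({arg})"
    return fn + arg

# the example above: (\x.\y.x)\z.z
example = App(Lam("x", Lam("y", Var("x"))), Lam("z", Var("z")))
print(show(example))    # -> (\x.\y.x)(\z.z)
```

The printer parenthesizes the argument here because an unparenthesized `\z.z` would otherwise swallow everything to its right.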
o How to compute?
  To compute :
  (1) look for an apply node whose left child is a lambda :
      the apply's right child is the argument, the lambda's left
      child is the formal, the lambda's right child is the
      function body.
  (2) rewrite to the function body with occurrences of the formal replaced
      with the argument.

           apply               rewrite
          /     \             =========>     lambda
     lambda      lambda                      /    \
     /    \      /    \                     y      lambda
    x   lambda  z      z                           /    \
        /    \                                    z      z
       y      x

            fn. body                 fn. body
             vv                         v
  e.g. (\x.\y.xy)\z.z ===> \y.((\z.z)y) ===> \y.y
         ^       ^^^^           ^     ^
       formal    arg.        formal  arg.

  Try : ((\x.x)\y.y)z
o problems with current rewriting rules :
  (1) sometimes we don't want to replace all the occurrences of the formal
      in the body.
      e.g. (\x. x+((\x.x+1)(3)) )2
                    ^
                    don't want to replace this x!
      if we use the current rule :
      ===> 2 + ((\x.2+1)(3))
      ===> 2 + 2 + 1
      ===> 5
      the right one :
      ===> 2 + ((\x.x+1)(3))
      ===> 2 + ( 3 + 1 )
      ===> 6
  (2) consider \x.\y.x -- a 2-argument function that ignores the
      second argument and returns the first one.

      ((\x.\y.x)w)z          ((\x.\y.x)y)z
           |                      |
           v                      v
        (\y.w)z                (\y.y)z
           |                      |
           v                      v
           w                      z

      no problem             Oops! name clash!
                             the actual parameter has the same
                             name as the formal parameter.
o terminology
  - scoping : associating uses of variables with "declarations", where
    \x is a declaration of x.
  - bound variable : a variable occurrence that is associated with a lambda.
  - free variable : a variable occurrence that is NOT associated with a
    lambda.
o In a lambda expression M, variable x is bound if x is in the subtree
  of a lambda with left child = x.
  - like static scoping in Pascal.
o Precise definition of free / bound variables.
  (1) in expression x     -- x is free.
                          -- there do not exist any bound variables.
  (2) in expression \x.M  -- every free x in M becomes bound.
                          -- for every other var y, if y is free
                             in M, then it is free in \x.M;
                             if y is bound in M, then it is bound
                             in \x.M.
  (3) in expression MN    -- occurrences keep the free / bound status
                             they have in M and in N.
      note : Ident 'y' can be BOTH free and bound in expression MN,
             as each individual occurrence of y can be either free
             or bound.
o Problems with the imprecise rewriting rule :
  #1. it doesn't respect scope rules.
  #2. it permits a var that is free in the arg to become bound.
      e.g. ((\x.\y.x)y)z ===> (\y.y)z
                    ^             ^
                   free          bound
  - solution to problem #1 :
    (\x.M)N ===> M with all FREE occurrences of x replaced with N.
  - solution to problem #2 :
    use a renaming rule called alpha-reduction.
    idea : names of formals are irrelevant, so change them as needed to
           avoid capture.
o alpha-reduction is applied to abstractions, e.g. \x.M :
  it replaces x and all FREE occurrences of x in M with some other
  ident that is NOT in M.
  e.g. \x.(\y.x) -----> \z.\y.z
                 alpha
       \x.(\x.x) -----> \z.\x.x
                 alpha
o The precise rewriting rule is called beta-reduction :
  It is defined using substitution ( which includes alpha-reduction ).

    redex -- reducible expression
   vvvvvvv
   (\x.M)N -----> M[N/x]  <-- "contractum"
           beta   ^^^^^^
                  substitution : substitute N for free
                  occurrences of x in M, also avoiding
                  capture.

o Define substitution by cases on the form of M :
  (1) M : ID     ID[N/x] = if ID is x then
                               N
                           else
                               ID
  (2) M : (AB)   (AB)[N/x] = A[N/x]B[N/x]
  (3) M : \ID.B  (\ID.B)[N/x] = if ID is x then
                                    \ID.B    ( solution to problem #1 )
                                else ( ID is not x )
                                    if ID occurs free in N then
                                        choose a new ID' that
                                          - is not free in N
                                          - is not in B
                                          - is not x
                                        use ID' to alpha-reduce M, so M is
                                        now :
                                            \ID'.B[ID'/ID]
                                        let B' = B[ID'/ID] in
                                            \ID'.(B'[N/x])
                                    else ( ID is NOT free in N )
                                        \ID.(B[N/x])
  e.g.
                  v-- (2) apply
      ((\x.\y.x) y) z
       ^-- (1) apply

  (1) (\x.\y.x) y -----> (\y.x)[y/x]
                   beta
                   ( alpha-rename \y.x to \w.x to avoid capturing y )
                        ---> \w.(x[y/x])
                        ---> \w.y

  (2) (\w.y)z -----> y[z/w]
               beta
              ---> y
beta-reduction :   (lambda x.M)N --B--> M[N/x]
alpha-reduction :  lambda x.M --a--> lambda y.M[y/x], where y is a variable
                   not occurring in M

Def: Reduction (->) is a relation on terms defined by the following rules:

(1) lambda y.M -> lambda v.M[v/y], where v is a variable not occurring in M
(2) (lambda y.M)N -> M[N/y]
(3) M -> M
(4) if M -> N then PM -> PN
(5) if M -> N then MP -> NP
(6) if M -> N then lambda x.M -> lambda x.N
(7) if M -> N & N -> P then M -> P

Def: Equality (=) is a relation on terms defined by the following rules:

(1) lambda y.M = lambda v.M[v/y], where v is a variable not occurring in M
(2) (lambda y.M)N = M[N/y]
(3) M = M
(4) if M = N then PM = PN
(5) if M = N then MP = NP
(6) if M = N then lambda x.M = lambda x.N
(7) if M = N & N = P then M = P
(8) if M = N then N = M
The relation "=" is an equivalence relation. What can we say about
this relation? It could be the case that all terms are in the same
equivalence class. (The Church-Rosser Theorem will show that this
is not the case, but for the moment we will just give one piece of
"evidence" that not all terms are in the same equivalence class.)

    T = lambda x.lambda y.x
    F = lambda x.lambda y.y

Prop: If there are at least two classes of non-equal terms, then T != F.
Proof:
    Assume T = F, and let M, N lie in two non-equal classes.
    Then
        TM = FM
        TMN = FMN
        M = TMN = FMN = N
    so M = N, contradicting the choice of M and N.
o A related lambda calculus: the beta,eta-calculus
  Motivating example:
      S = \x.\y.\z.xz(yz)
      K = \x.\y.x
      I = \x.x
  Let's see if S(KI) = I :
      S(KI)
      = (\x.\y.\z.xz(yz))(KI)
      = \y.\z.((KI)z)(yz)
      = \y.\z.I(yz)
      = \y.\z.yz        ( N.F. -- normal form )
  therefore S(KI) = I is not true in the alpha,beta-lambda calculus.
  However, it is true in the lambda calculus with an additional
  eta-reduction rule, which says :
      \x.Mx -----> M    if x is not free in M
             eta
  so ...
      ...
      = \y.(\z.yz)
      = \y.y            ( eta )
      = I
  (Note: In this class, unless explicitly stated otherwise, assume
  that problems posed are for the alpha,beta-lambda calculus.)
o Expressing programming constructs in lambda calculus

o Readings : Field and Harrison 6.8
  Terminology:

      closed terms = combinators
                   = lambda expressions with no free variables

  Combinators can "only" rearrange (and duplicate) things.
  We will encode programming constructs in lambda calculus by
  encoding them as combinators.
(I) Booleans
    3 things needed :
    (i)   if P then M else N
    (ii)  true  = \x.\y.x
    (iii) false = \x.\y.y

    (i) ---> PMN
        either ...
            trueMN  ---> M
        or
            falseMN ---> N
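These Church booleans run unchanged as Python lambdas. A small sketch (Python is strict, so both branches get evaluated; that is fine for these value-only examples):

```python
# Church booleans: a boolean IS its own selection function.
true  = lambda x: lambda y: x     # \x.\y.x
false = lambda x: lambda y: y     # \x.\y.y

def if_then_else(p, m, n):
    # "if P then M else N" is encoded as the application PMN
    return p(m)(n)

print(if_then_else(true, "M", "N"))    # -> M
print(if_then_else(false, "M", "N"))   # -> N
```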
(II) Pairing (simplified records)
     <M,N> is a pairing of M and N
     3 things needed :
     (i)   pair M N       ===>  <M,N>
     (ii)  first <M,N>    ===>  M
     (iii) second <M,N>   ===>  N

     idea: trueMN ---> M , so represent <M,N> as \x.xMN
           and select with ((\x.xMN)true)

     (i) pair =  \m.\n.\x.xmn
              df

     (ii) first                    (iii) second
          first =  \y.y true             second =  \y.y false
                df                               df

     e.g. first <M,N>                e.g. second <M,N>
          = (\y.y true)(\x.xMN)           = (\y.y false)(\x.xMN)
          = (\x.xMN)true                  = (\x.xMN)false
          = trueMN = M                    = falseMN = N
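The pairing combinators, written out as a Python sketch (same encodings as above; the pair stores its components by waiting for a selector):

```python
# Church pairs: pair = \m.\n.\x.xmn, first = \y.y true, second = \y.y false
true  = lambda x: lambda y: x
false = lambda x: lambda y: y

pair   = lambda m: lambda n: lambda x: x(m)(n)
first  = lambda p: p(true)
second = lambda p: p(false)

p = pair("M")("N")
print(first(p), second(p))    # -> M N
```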
(III) we could have numbers as nested pairs :

          1 =  <....>
            df
          2 =  <....>
            df
          ...

      but we don't do that.
      instead ... Church numerals :

      notation: n     natural number
                _n_   the numeral corresponding to n
                      ( we also write f^k x for f applied k times to x )

      0 =  \f.\x.x = \fx.x
        df
      1 =  \fx.(fx)
        df
      2 =  \fx.(f(fx))
        df
      .
      .
      .
      n =  \fx.f^n x
        df
- we also need :

  (i) iszero :

      we would like ...
          (\n.n(___?___)(true)) _0_
          = _0_ (___?___)(true)
          = (\f.\x.x)(___?___)(true)
          = true

      and ...
          (\n.n(___?___)(true)) _1_
          = _1_ (___?___)(true)
          = (\f.\x.fx)(___?___)(true)
          = (___?___)(true)

      so we set (___?___) to be (\z.false), and the last line becomes
          = (\z.false)(true)
          = false

      thus ...
          iszero =  (\n.n(\z.false)(true))
                 df
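A sketch of the numerals and iszero as Python lambdas. The helper `to_int` is my own addition for reading a numeral back out (apply it to integer successor and 0); it is not part of the encoding:

```python
# Church numerals: _n_ applies its first argument n times to its second.
zero = lambda f: lambda x: x            # 0 = \f.\x.x
one  = lambda f: lambda x: f(x)         # 1 = \fx.(fx)
two  = lambda f: lambda x: f(f(x))      # 2 = \fx.(f(fx))

true  = lambda x: lambda y: x
false = lambda x: lambda y: y

iszero = lambda n: n(lambda z: false)(true)   # \n.n(\z.false)(true)

def to_int(n):
    # decode: count the applications of f
    return n(lambda k: k + 1)(0)

print(to_int(two))                        # -> 2
print(iszero(zero)("yes")("no"))          # -> yes
print(iszero(two)("yes")("no"))           # -> no
```

Applying \z.false zero times leaves true; applying it at least once collapses everything to false, which is exactly the derivation above.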
  (ii) succ :

       idea :
           succ _k_ = (\n.\f.\x. ... n ... ) (\g.\y.g^k y)

                    = \f.\x.f((\g.\y.g^k y)fx)
                             `------v------'
                                  f^k x

           ( or    = \f.\x.((\g.\y.g^k y)f(fx))
                            `------v------'
                              f^(k+1) x  )

                    = \f.\x.f^(k+1) x
                    = _k+1_

       therefore ...
           succ =  (\n.\f.\x.nf(fx))
                df
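The successor can be checked directly in Python (a sketch; `to_int` is again my own decoding helper, not part of the encoding):

```python
# succ = \n.\f.\x.nf(fx): one extra application of f, placed innermost.
zero = lambda f: lambda x: x
succ = lambda n: lambda f: lambda x: n(f)(f(x))

def to_int(n):
    return n(lambda k: k + 1)(0)

three = succ(succ(succ(zero)))
print(to_int(three))    # -> 3
```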
  (iii) pred :

        idea :
            pred _0_ = _0_

            build pairs <_n_, _n-1_> by iterating a step function,
            starting from <_0_,_0_> :

                0    <_0_,_0_>
                1    <_1_,_0_>
                2    <_2_,_1_>
                .       .
                .       .
                .       .

            then    pred _n_ = second(<_n_,_n-1_>)

        notation :  \<a,b>.(.....)    i.e.  \p.(.....)
                    where every a in (.....) stands for (first p)
                    and every b in (.....) stands for (second p)

        for all n, we seek ...
                                             n
            pred _n_ = second ( (\<a,b>.<succ a, a>)  <_0_,_0_> )

            pred =  \n.second(n (\p.pair (succ(first p)) (first p)) (pair _0_ _0_))
                 df
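The pairs-based predecessor, assembled from the earlier encodings as a Python sketch (all names as defined in the sections above; `to_int` is my own decoding helper):

```python
# Church booleans and pairs
true  = lambda x: lambda y: x
false = lambda x: lambda y: y
pair   = lambda m: lambda n: lambda x: x(m)(n)
first  = lambda p: p(true)
second = lambda p: p(false)

# Church numerals
zero = lambda f: lambda x: x
succ = lambda n: lambda f: lambda x: n(f)(f(x))

# step: <a,b> |-> <succ a, a>; n steps from <0,0> give <n, n-1>,
# so projecting out the second component yields n-1 (and pred 0 = 0).
step = lambda p: pair(succ(first(p)))(first(p))
pred = lambda n: second(n(step)(pair(zero)(zero)))

def to_int(n):
    return n(lambda k: k + 1)(0)

three = succ(succ(succ(zero)))
print(to_int(pred(three)), to_int(pred(zero)))    # -> 2 0
```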
Exponentiation in the Lambda Calculus on two nonzero arguments can be defined:

    exp =  lambda n . lambda m . m n
        df

( recall: _n_ is the numeral corresponding to n, and f^k (x) is f
  applied k times to x )

Claim: exp _n_ _m_ = _n^m_

Proof: The method will be induction on m.

Base Case: Show exp _n_ _1_ = _n_

    exp _n_ _1_ = _1_ _n_
                = (lambda f . lambda x . f x) _n_
                = lambda x . _n_ x
                = lambda x . ((lambda g . lambda y . g^n (y)) x)
                = lambda x . lambda y . x^n (y)
                = _n_

Induction Step

First establish a lemma.

Lemma: _n_ (lambda y . z^m (y)) = lambda y . z^(nm) (y)

Proof:
Base Case (n = 1):
    _1_ (lambda y . z^m (y))
        = (lambda f . lambda x . f(x)) (lambda y . z^m (y))
        = lambda x . ((lambda y . z^m (y)) x)
        = lambda x . z^m (x)
Inductive step:
    _n_ (lambda y . z^m (y))
        = (succ _n-1_) (lambda y . z^m (y))
        = (lambda n . lambda f . lambda x . f(n f x)) _n-1_ (lambda y . z^m (y))
        = lambda x . (lambda y . z^m (y)) (_n-1_ (lambda y . z^m (y)) (x))
        = lambda x . (lambda y . z^m (y)) ((lambda y . z^(m(n-1)) (y)) (x))
              (by the induction hypothesis)
        = lambda x . (lambda y . z^m (y)) (z^(m(n-1)) (x))
        = lambda x . z^m (z^(m(n-1)) (x))
        = lambda x . z^(mn) (x)
QED for lemma

Induction Step for proof of exponentiation equivalence

The induction hypothesis is:

    exp _n_ _m-1_ = _m-1_ _n_ = _n^(m-1)_

Show: exp _n_ _m_ = _m_ _n_
    = (succ _m-1_) _n_
    = ((lambda n . lambda f . lambda x . f (n f x)) _m-1_) _n_
    = lambda x . _n_ ((_m-1_ _n_) x)
    = lambda x . _n_ (_n^(m-1)_ x)              (by the induction hypothesis)
    = lambda x . _n_ ((lambda g . lambda y . g^(n^(m-1)) (y)) (x))
    = lambda x . _n_ (lambda y . x^(n^(m-1)) (y))

Using the Lemma, the _n_ can be brought into the exponent:

    = lambda x . lambda y . x^((n^(m-1)) * n) (y)
    = lambda x . lambda y . x^(n^m) (y)
    = _n^m_

QED
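The claim can be spot-checked in Python (a sketch using the encodings above; `church` and `to_int` are my own encode/decode helpers, not part of the calculus):

```python
# exp = \n.\m.mn: applying the numeral m to n composes n with itself
# m times, yielding the numeral for n^m.
zero = lambda f: lambda x: x
succ = lambda n: lambda f: lambda x: n(f)(f(x))

exp = lambda n: lambda m: m(n)

def church(k):
    """Encode a machine integer as a Church numeral."""
    c = zero
    for _ in range(k):
        c = succ(c)
    return c

def to_int(n):
    """Decode a Church numeral back to a machine integer."""
    return n(lambda k: k + 1)(0)

print(to_int(exp(church(2))(church(3))))    # -> 8, i.e. 2^3
```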

"Recursive functions" in the Lambda Calculus
See Field and Harrison Section 6.7
also Stoy pp. 71-73

Proposition:
Given an equation on lambda expressions of the form
    f = E(f)    (where E(f) is a lambda expression in which f is free)
a lambda expression can be derived for f that does not itself involve f
(i.e., a term for f in "closed form").
Example: factorial function
    fact = lambda x . if x = 0 then 1 else x * fact(x-1) fi
    fact = (lambda f . lambda x . if x = 0 then 1 else x * f(x-1) fi) fact
Let F denote the term
    lambda f . lambda x . if x = 0 then 1 else x * f(x-1) fi
Let fact^ be a lambda expression such that
fact^ = F(fact^)
fact^ is a fixed point of F.
 
Defn: x is a fixed point of f iff f(x) = x.
 
Idea: Find a combinator Y' such that Y'F is a fixed point of F. Then
fact^ = Y' F.
Goal: We will find a Y' that works for *arbitrary* F, not just the F of
the factorial example. That is, we will find a Y' such that for all F
Y'F = F(Y'F).
*
Idea: one way to have A = B is to have A > B
beta
*
Let's try Y' F > F(Y' F)
beta
Y' is "selfreproducing" (in somewhat the same way that
(lambda x . xx)(lambda x . xx) is selfreproducing.
Guess that Y' is of the form theta theta.
 
*
theta theta F > F(theta theta F)
alpha,beta
theta must be of the form lambda t . lambda f. f(t t f)
The Turing fixed-point combinator is
    Y' =  (lambda t . lambda f . f (t t f)) (lambda t . lambda f . f(t t f))
       df

Example of the Turing fixed-point combinator on fact:
fact 3 = Y' F 3
= (lambda t . lambda f . f (t t f)) (lambda t . lambda f . f(t t f)) F 3
which beta reduces to
= F (theta theta F) 3
= (lambda f . lambda x . if x = 0 then 1 else
       x * f(x-1) fi) (theta theta F) 3
which beta reduces to
= if 3 = 0 then 1 else
       3 * (theta theta F) 2 fi
which beta reduces to
= 3 * (theta theta F) (2)
which beta reduces to
= 3 * 2 * (theta theta F) (1)
which beta reduces to
= 3 * 2 * 1 * (theta theta F) (0)
which beta reduces to
= 3 * 2 * 1 * 1
which evaluates to
= 6

There are many other fixed-point combinators. The one that you will
often see discussed in books is the Curry fixed-point combinator:
    Y =  lambda f . (lambda x . f (x x)) (lambda x . f(x x))
      df
We can show that the Curry fixed-point combinator Y has the required
fixed-point property, Y M = M (Y M), as follows:
    Y M = (lambda f . (lambda x . f ( x x))(lambda x . f ( x x))) M
beta reduction on f gives
    = (lambda x . M ( x x)) (lambda x . M ( x x))
beta reduction on the first x gives
    = M ((lambda x . M ( x x)) (lambda x . M ( x x)))
    = M (Y M)
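Under Python's strict evaluation the Curry Y as written loops forever (the `x(x)` self-application is evaluated eagerly), so this sketch uses its eta-expanded variant, often called the Z combinator, where the self-application is delayed behind a `lambda v`:

```python
# Z combinator: eta-expanded Curry Y, usable in a strict language.
Z = lambda f: (lambda x: f(lambda v: x(x)(v)))(lambda x: f(lambda v: x(x)(v)))

# F from the factorial example, on ordinary Python integers:
F = lambda fact: lambda x: 1 if x == 0 else x * fact(x - 1)

fact = Z(F)
print(fact(3))    # -> 6
```

The delaying `lambda v: ... (v)` is exactly an eta-expansion of `x(x)`, so Z equals Y in the beta,eta-calculus.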

The Church-Rosser Theorem
See Rosser 1984, Section 4

Several related notions:
    alpha-reduction (a-reduction)
    beta-reduction  (B-reduction)
    A =~ B      A -> B with only alpha steps
    A ->* B     (a,B reduction in context)
    X walk Y    like A ->* B, but with special constraints
                ( of two reduction sequences from A to B,
                  one may be a walk and one not )
Thm: (Church-Rosser)
    If x0 ->* x1 and x0 ->* x2, then there exists an x3 such that
    x1 ->* x3 and x2 ->* x3 :

                 x0
               /    \
            * /      \ *
             V        V
            x1        x2
              \      /
             * \    / *
                V  V
                x3            ( there exists x3 )
Minimal cases:

    x0 = (....((lambda x1.w1) v1) ....((lambda x2.w2)v2)....)
               /                               \
              /                                 \
             V                                   V
  (...(w1[v1/x1])...((lambda x2.w2)v2)...)   (...((lambda x1.w1)v1)...w2[v2/x2]...)
              \                                 /
               \                               /
                V                             V
               (...w1[v1/x1]...w2[v2/x2]...)

The diamond property only holds above when the steps that take you apart are
non-overlapping.
Problem: (overlapping redexes)

    x0 = (lambda x ...x....x...)(...*....)    {* is a redex that reduces to y}
             /                      \
            / B-step at root         \ reduce *
           V                          V
  (...(...*....)....(...*....)...)     (lambda x ...x....x...)(...y....)
           |                              |
           | 2 steps                      | 1 B-step
           | (reduce each copy of *)      | (at the root)
           V                              V
         (...(...y....)....(...y....)...)

    {n occurrences of x in the original -> n steps on the left}
Solution to problem: introduce walk.
With single steps, instead of a nice clean diamond everything gets skewed:
closing one side of the diamond can take several steps, and each of those
steps opens a new diamond that must be closed in turn.
The walk (=>) squares everything up, and you get a tiling of diamonds:

                 x0
               //  \\
              V      V
             x1      x2
              \\    //
                V  V
                 x3

    ( each edge is a single walk, and the completed diagram for
      two walks out of x0 is tiled with such diamonds )
You can only do a certain amount of work in a walk: on every step of the
walk you reduce the size of the "web" that connects the root of the term
to the roots of "permissible" redexes (i.e., redexes at which reductions
would be permitted to extend the walk).
The size of the "web" from the root to all redexes is <= the size of x0.
Therefore the number of steps in each walk sequence must be <= the size of x0.

WARNING: The walk relation is not transitive.
    A=>B, B=>C does not imply that A=>C
    (That is, the existence of a walk sequence that transforms A to B
    and one that transforms B to C does not mean that there is a
    walk sequence that transforms A to C.)

    if A -> B then A => B
    if A -> B then A =>* B
    if A -> B then A ->* B
Goal: show the diamond property for the ->* relation.
Method: show the diamond property for =>
            |
            V (tiling)
        implies the diamond property for =>*
            |
            V
        implies the diamond property for ->*
    a walk with             X0 = M0N0
    a B-reduction                 ||
    at the root :                 VV
                            (lambda y.W1)N1
                                  |
                                  | B
                                  V
                            X1 = W1[N1/y]
Lemma: If X => Y then X[P/x] => Z, where Z is in the same alpha class as Y[P/x].
Proof idea: In the walk X[P/x] => Z, the P terms get moved about in exactly
the same fashion as the x terms in the walk X => Y.
Lemma: (diamond property for ==>)
    i.e. if x0 ===> x1 and x0 ===> x2 then
    there exists x3 such that x1 ===> x3 and x2 ===> x3.

Proof: By induction on the size of x0.

case 1: x0 is a variable
    the property is immediate ...
    x0 ===> x1 and x0 ===> x2 must be empty walks.
    choose x3 to be x0;
    x1 ===> x3 and x2 ===> x3 are also empty walks.
case 2: X0 is a lambda term of the form \x.M0.

        X0 = \x.M0
          //      \\
         V          V
    X1 = \x.M1    X2 = \x.M2

    It must be that:
          M0
        //   \\
       V       V
      M1        M2

    Because |M0| < |\x.M0|, by the induction hypothesis we have ...
          M0
        //   \\
       V       V
      M1        M2
        \\   //
         V  V
          M3

    so ...
        X0 = \x.M0
          //      \\
         V          V
    X1 = \x.M1    X2 = \x.M2
          \\      //
           V      V
    choose X3 = \x.M3
case 3: x0 is of the form M0N0.
    4 subcases :

    subcase (i) :                  M0N0
                                 //    \\
        no beta-reduction       //      \\
        at the root            V          V
                        X1 = M1N1        X2 = M2N2
                           //  \\        //  \\
                          V     \\      //     V
                       M3N1      \\    //     M3N2
                          \\      \\  //      //
                           \\      VV        //
                            \==> M3N3 <====/

        in this case we just concatenate (or interleave)
        the results of

              M0                     N0
            //  \\                 //  \\
           V      V      and      V      V
          M1      M2             N1      N2
            \\  //                 \\  //
             V V                    V V
              M3                     N3

        as these two results do not 'interfere' with each other.
    subcase (ii) : There is a beta-reduction at the root in the
                   walk M0N0 ===> X1.

                              M0N0
                            //     \\
        there is a         //       \\  <- no reduction
        beta-reduction <- //          \\   at the root
        at the root ...  V              V
                   (\x.W1)N1        X2 = M2N2
        ... and it    |
        must be the   | B
        last one.     V
                   X1 = W1[N1/x]

        from the above diagram, we can see that ...
            M0 ===> \x.W1  and  M0 ===> M2
        AND N0 ===> N1     and  N0 ===> N2

        from the induction hypothesis we have ...
              M0                     N0
            //   \\                //   \\
           V       V      AND     V       V
        \x.W1      M2            N1        N2
            \\   //                \\   //
             V  V                   V  V
           \x.W3                     N3

        thus, we have walk#1 and walk#2 ...

                   M0N0
                 //     \\
                V         \\
           (\x.W1)N1        V
              |   \\      X2 = M2N2
              | B   \\        //
              V       \\     //  walk#1
         X1 = W1[N1/x]  \\  //
              |          V V
              |      (\x.W3)N3
              |           |
              | walk#2    | B
              V           V
         choose X3 = W3[N3/x]

        ( because all occurrences of N1 ===> N3 can be performed first,
          we can combine the two reductions out of X1 into the single
          walk#2 )
    subcase (iii) :

        this is just the 'mirror image' of subcase (ii), in the sense
        that X1 and X2 are interchanged.

    subcase (iv) :
                              M0N0
                            //     \\
        reduction          //       \\        reduction
        at root ... <-    V           V   -> ... at root
                        ....         ....
                          |            |
                          V            V
                 X1 = W1[N1/x]    X2 = W2[N2/x]
                         \\          //
                          \\        //
                            V      V
                      X3 = W3[N3/x]

        proceed as in subcase (ii), except that both X1 and X2 are
        handled in the way we handled X1 in subcase (ii).
NOTE :

    - the above proof uses the following lemma :
          If X ===> Y then X[P/x] ===> z
          where z  =   Y[P/x]
                  alpha
    - In this proof, all complications due to the need
      for alpha-reductions are ignored. Technically, this
      argument only applies for a representation in which
      alpha-conversion is unnecessary, such as de Bruijn
      notation (see Problem Set #1).