Wednesday, February 21, 2007

EOPL - 2.2.2 - Exercise 2.11

Exercise 2.11 asks you to correct the definition of a substitution function on the lambda-calculus, to avoid variable capture. To do that, we need function fresh-id from the previous exercise, but alas, we can't use it as is, because it is also asked to extend the definition of a lambda-calculus expression to includes literals and primitives. Nevertheless, it's an easy exercise.

To define the new lambda-calculus expression type, we first define a type to specify primitives:

type Primitive = Sum | Mult

type LPExp = Identifier of string | Lambda of string * LPExp
| App of LPExp * LPExp | Literal of int
| Prim of Primitive * LPExp * LPExp

And here is the code for auxiliary functions and the substitution function:

let rec all_ids exp =
match exp with
Identifier i -> [i]
| Lambda (x, e) -> x :: (all_ids e)
| App (e1, e2) -> (all_ids e1) @ (all_ids e2)
| _ -> []

let fresh_id exp s =
let syms = all_ids' exp in
let rec loop n =
let sym = s + (string_of_int n) in
if List.mem sym syms then loop (n + 1) else sym in
loop 0

let rec lambda_calculus_subst exp subst_exp subst_id =
let rec subst exp =
match exp with
Identifier id when id = subst_id -> subst_exp
| Identifier id -> exp
| Lambda (id, body) when id = subst_id -> exp
| Lambda (id, body) ->
let id'=fresh_id body id in
let body'=lambda_calculus_subst body (Identifier id') id in
Lambda (id', subst body')
| App (e1, e2) -> App (subst e1, subst e2)
| Literal l -> exp
| Prim (p, e1, e2) -> Prim (p, subst e1, subst e2) in
subst exp

The substitution function always does an alpha conversion (substitution of the identifier bound by the lambda) unless the bound identifier is the same as the one being substituted for. This surely works, as alpha conversion doesn't change the meaning of a lambda expression. An alternative would be to alpha convert only when a capture would occur, but this would require a further check like

if List.mem id (all_ids body) then
// do alpha conversion
else
// just simple substitution

It would make the code bigger, but not better. Both versions require linear scans of the body, so the efficiency shouldn't differ much between them.

No comments: