If you follow programming languages or web technologies closely it's likely that you've heard of Rust. Rust is one part of a larger effort by Mozilla Research to build a new browser engine, Servo, but its value as a development tool certainly extends beyond that initial goal. In particular it has received attention for its memory model, which "encourages efficient data structures and safe concurrency patterns, forbidding invalid memory accesses that would otherwise cause segmentation faults" [1].

In this post we'll take a look at the basics of Hoare logic and an extension, Separation logic, both of which aid in reasoning about imperative program behavior and memory state. Then we'll apply those tools to examine the impact that Rust's memory ownership system has on the heap.

While I was writing this post, John Reynolds passed away. He was an incredible force in PL research, and Separation logic was one of his most important works. There's much more to it than is covered here, so if you're curious and want to learn more please explore the links in the footnotes.

## Hoare logic

In the late 1960s Tony Hoare proposed a formal system for reasoning about programs which would eventually be referred to as Hoare logic. The central feature of Hoare logic is the triple {P} C {Q}, where P and Q are predicate logic assertions, referred to as the pre- and postconditions, and C is a command (read: a program or program fragment). The idea is that, setting aside any guarantee of termination [2], if P is true before C executes then Q will be true afterward; proving such a triple establishes partial correctness. That is, it proves some property asserted by P and Q about C.

A simple example using C with the assertions in comments:

// { x == n }
x = x + 1
// { x == n + 1 }

Here the triple {P} C {Q} asserts that if x is equal to some n before x = x + 1 then x will be equal to n + 1 afterward.
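
Individual triples can also be chained together; in standard presentations this is captured by the rule of composition, which matches the postcondition of one command to the precondition of the next:

 {P} C1 {Q}    {Q} C2 {R}
 -------------------------
      {P} C1; C2 {R}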

In addition to this basic structure it's possible to define axioms for common programming constructs like assignment, branching, while loops, and for loops that allow for more general reasoning and manipulation of assertions. For assignment the axiom takes the form {P[E/V]} V=E {P}. That is, if P with E substituted for V holds before the assignment, then P should hold afterward [3].

// P[E/V]
x = x + 1
// P

Borrowing Q from the earlier example, here P is x == n + 1. Substituting x + 1 for x in P gives x + 1 == n + 1, or x == n, for the precondition.

// The result of substituting `x + 1` for `x` in `x == n + 1`:
// { x == n }
x = x + 1
// { x == n + 1 }

With the help of this and other axioms, established for each programming environment, Hoare logic allows the wielder to write specifications for programs. For most domains (especially those that my usual reader works in) the approach might be heavy-handed, but there are many domains where this type of specification is necessary. In particular it's often important to specify the behavior of a program with regard to memory.

## Separation Logic

Separation logic is an extension to Hoare logic that provides tools for specifying memory use and safety with new assertions for how a program will interact with the heap and stack [4].

The four assertions that Separation logic adds for describing the heap are:

• emp - for the empty heap. It asserts that the heap is empty, and it can be used to extend assertions about programs that don't interact with the heap.
• x |-> n - for the singleton heap. It asserts that there is a heap that contains one cell at address x with contents n.
• P * Q - a replacement for conjunction, ∧, over disjoint heaps. It asserts that if there is a heap where P holds and a separate (disjoint) heap where Q holds, both P and Q hold in the combination of those two heaps.
• P -* Q - a replacement for implication, =>. It asserts that if there is a heap in which P holds then Q will hold in the current heap extended by the heap in which P holds. An important point of clarity: P -* Q holds for the current heap and not the current heap extended by the heap in which P holds. Both connectives are given a more precise reading just after this list.
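
Concretely, writing h ⊨ P for "the assertion P holds of the heap h", a rough sketch of the semantics from Reynolds [6] reads:

h ⊨ P * Q   iff  h splits into two disjoint heaps h1 and h2 such that h1 ⊨ P and h2 ⊨ Q
h ⊨ P -* Q  iff  for every heap h' disjoint from h in which P holds, the combination of h and h' satisfies Q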

There are also some shortcuts for common heap states that are built on top of these four assertions:

• x |-> n, o, p is equivalent to x |-> n * x + 1 |-> o * x + 2 |-> p. That is, x points to a series of memory cells that can be accessed by using x and pointer arithmetic.
• x -> n is a basic pointer assertion. It is equivalent to x |-> n * true. That is, there is a singleton heap where n is the value at *x, embedded in a larger heap about which we can't make any other assertions.

Again we'll turn to C to demonstrate how these assertions fit with common programs [5].

// { emp }
int *ptr = malloc(sizeof(int));
*ptr = 5;
// { ptr |-> 5 }

The first assertion states that the heap is empty (emp). After the malloc call and assignment there exists a singleton heap with a single cell containing the value 5 that is pointed to by ptr.

// { emp }
int *ptr = malloc(sizeof(int));
*ptr = 5;
// { ptr |-> 5 }
int *ptr2 = malloc(sizeof(int));
*ptr2 = 5;
// { (ptr |-> 5) * (ptr2 |-> 5) }

Adding ptr2 means the addition of another singleton heap and the connective *. It's possible to write this as { (ptr -> 5) ∧ (ptr2 -> 5) }, but this assertion provides no information about how the heap is arranged. It simply says that there are two pointers to the value 5 somewhere in a heap. They might be pointing to the same memory location. By using the singleton heap pointer |-> and the connective *, the new assertion makes clear that the two pointers are not pointing to the same memory location.

// { emp }
int *arry = calloc(3, sizeof(int));
*arry = 1;
*(arry + 1) = 2;
*(arry + 2) = 3;
// { arry |-> 1,2,3 }

Here the comma separated list of values following the singleton pointer in { arry |-> 1,2,3 } denotes contiguous memory. It's simply a shorthand notation for the pointer arithmetic.

// { arry |-> 1 * (arry + 1) |-> 2 * (arry + 2) |-> 3 }

It's worth noting that separating implication, P -* Q, doesn't appear to have any particularly useful or clear concrete examples. This seems to be a consequence of its relationship to logical implication, in that the whole assertion is only false when Q is false. Borrowing from Reynolds [6], an assertion like x |-> 1 -* Q, for some assertion Q, can be combined with a disjoint singleton heap to show:

// { x |-> 0 * (x |-> 1 -* Q) }
*x = 1
// { Q }

The precondition here says that there are two disjoint heaps: one in which x |-> 0 holds and one in which (x |-> 1 -* Q) holds. The implication on the right says that if the second heap were extended so that x pointed to 1, Q would hold. After the assignment *x is no longer 0; instead the second heap has been extended so that *x is 1, and as a result Q holds [7].

## Rust Ownership

Rust provides two new type modifiers for dealing with pointers and memory management. Both have very specific semantics that are checked at compile time to help prevent memory leaks and invalid memory accesses.

• ~ - provides a lexically scoped allocation on the heap. That is, when the newly assigned pointer variable goes out of scope the memory is freed.
• @ - provides a garbage collected allocation on the heap. In Rust each task has its own garbage collector responsible for handling this type of heap allocation.

A few simple examples borrowed in part from Rust's tutorials will illustrate when the memory for each of these type modifiers is freed.

fn main() {
    {
        let a = ~0;
    } // a is out of scope and *a is freed
}

With the tilde, the memory at *a on the heap is freed when the variable to which it is assigned goes out of scope. Since a is declared inside an explicit block it goes out of scope at the end of the block and the associated memory is freed.

fn main() {
    let a;

    {
        let b = ~1;

        // move ownership of *b to a
        a = b;

        io::println(fmt!("%?", *b)); // error: use of moved value: `b`
        io::println(fmt!("%?", *a)); // => 1
    }
} // *a is destroyed here

When the ownership of memory is transferred between variables the compiler prevents further reference to the original owner. In this example a is the new owner and the compiler will prevent any further reference to b. This concept of single ownership is the reason that the memory can be deallocated safely when the current owner goes out of scope.

Alternatively, the @ type modifier can be used to request that the run-time manage the allocated memory on a per-task basis. This presents some interesting issues when creating pointers to memory allocated as part of a record.

struct X { f: int, g: int }

fn main() {
    let mut x = @X { f: 0, g: 1 };
    let y = &x.f;

    x = @X { f: 2, g: 3 };

    // original *x is now subject to gc

    io::println(fmt!("%?", *x)); // => {f: 2, g: 3}
    io::println(fmt!("%?", *y)); // => 0
}

Note that x is declared as a mutable variable (Rust's default is immutability). When a pointer is made to the field f with &x.f and then x is reassigned, the memory at *x would be subject to garbage collection. Luckily Rust does a bit of work for you and inserts a lexically scoped reference to the original record to prevent deallocation by the garbage collector.

struct X { f: int, g: int }

fn main() {
    let mut x = @X { f: 0, g: 1 };
    let x1 = x;
    let y = &x.f;

    x = @X { f: 2, g: 3 };

    io::println(fmt!("%?", *x)); // => {f: 2, g: 3}
    io::println(fmt!("%?", *y)); // => 0
}

You might also wonder how Rust handles references to lexically scoped record fields. In this case the compiler raises an error and (rather nicely) highlights the discrepancy in scoping expectations.

struct X { f: int, g: int }

fn main() {
    let a;

    {
        let mut b = ~X { f: 1, g: 2 };

        // borrow a pointer to the field f of *b
        a = &b.f;
    }

    // error: illegal borrow: borrowed value does not live long enough
    io::println(fmt!("%?", *a));
}

Here, a is assigned the memory location of the field f of b, but the scope of a is larger than the scope of b, which means that *b would be freed long before a goes out of scope.

## Formalizing Ownership

Rust's memory management facilities exist mostly at compile time to prevent users from shooting themselves in the foot, but it's still worth applying Separation logic to get a feel for what's happening to the heap.

fn main() {
    // { emp }
    {
        // { emp }
        let a = ~0;
        // { a |-> 0 }
        let b = ~1;
        // { a |-> 0 * b |-> 1 }
    }
    // { emp }
}

Both a and b are scoped to the explicit block and exist in disjoint parts of the heap. When they go out of scope the memory associated with both is deallocated leaving an empty heap.

fn main() {
    let a;
    // { emp }
    {
        // { emp }
        let b = ~1;
        // { b |-> 1 }
        a = b;
        // { a -> 1 ∧ b -> 1 }
    }
    // { a |-> 1 }
}

In this case, if there were a reference to b after the ownership of *b was transferred to a, the compiler would complain. Since there is no such reference, we assume that b points to the same location until the pointer is removed at the close of the explicit block.

struct X { f: int, g: int }

fn main() {
    // { emp }
    let mut x = @X { f: 0, g: 1 };
    // { x |-> 0, 1 }
    let x1 = x;
    // { x1 -> 0, 1 ∧ x -> 0, 1 }
    let y = &x1.f;
    // { x1 -> 0, 1 ∧ x -> 0, 1 ∧ y -> 0 }
    x = @X { f: 2, g: 3 };
    // { (x1 -> 0, 1 ∧ y -> 0) * x |-> 2, 3 }

    io::println(fmt!("%?", *x)); // => {f: 2, g: 3}
    io::println(fmt!("%?", *y)); // => 0
} // { emp }

Here we've re-purposed the managed memory example from earlier with the explicit addition of the reference that would otherwise be inserted by Rust to prevent GC, x1. Let's examine each expression and the associated assertions in turn.

// { emp }
let mut x = @X { f: 0, g: 1 };
// { x |-> 0, 1 }

As before the allocation of managed memory creates a singleton heap pointer to the memory containing the record values [8].

// { x |-> 0, 1 }
let x1 = x;
// { x1 -> 0, 1 ∧ x -> 0, 1 }

Adding an additional reference to that same memory means that we have two pointers to the same memory. As a result we cannot use the * connective or the singleton heap pointer |-> to represent the heap.

// { x1 -> 0, 1 ∧ x -> 0, 1 }
let y = &x1.f;
// { x1 -> 0, 1 ∧ x -> 0, 1 ∧ y -> 0 }

A new reference to the memory location of the first record field in x and x1 adds another pointer that overlaps with the existing heap. It's important to keep in mind that the basic conjunction simply says that the heaps may overlap. To the reader it may be obvious, but in terms of specifying program behavior it's a weaker assertion than the equivalent made with the * connective.

// { x1 -> 0, 1 ∧ x -> 0, 1 ∧ y -> 0 }
x = @X { f: 2, g: 3 };
// { (x1 -> 0, 1 ∧ y -> 0) * x |-> 2, 3 }

Finally with the allocation of a wholly new record and pointer for x we can employ the more powerful connective because the new record lives in a newly allocated section of memory on the heap. The remaining pointers to the original record and its first field remain ambiguous.

Not covered here for brevity's sake is an important part of Separation logic, the Frame Rule. The Frame Rule provides for rigorous local reasoning about the heap without concern for other possibly overlapping references to the same memory locations. That is, it allows each assertion to be correct in spite of the fact that the program fragments they pertain to are often operating in a larger application that manipulates the heap.
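
For reference, in Reynolds' presentation [6] the rule takes roughly this form, where R is any assertion whose free variables are not modified by C:

       {P} C {Q}
 ---------------------
   {P * R} C {Q * R}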

Also, the concept of borrowed pointers is important reading if you're interested in Rust. A common and effective memory optimization in C is to pass pointers to data structures instead of relying on the default pass-by-value semantics. Similarly one can "borrow" a pointer to a data structure in Rust, but because of the type-level restrictions it's both safer and more complex [9]. The borrowed pointers tutorial makes those complexities clear.

## Conclusion

Hopefully this post has given you an initial sense of a portion of Rust's memory management facilities and also the formalism of Separation logic.

Special thanks goes to @steveklabnik, @evanphx, and Matt Brown for reviewing various bits of my post.

### Footnotes

1. http://static.rust-lang.org/doc/0.6/tutorial.html#introduction
2. When the program fragment can be shown to terminate the triple proves total correctness.
3. There are actually two forms of the assignment axiom. The second, proposed by Floyd, is more complex but addresses issues in common imperative languages that the first cannot.
5. The examples that follow assume that malloc is always operating by allocating fresh memory not pointed to elsewhere.
6. John C. Reynolds: Separation Logic: A Logic for Shared Mutable Data Structures.
7. It certainly feels convoluted, but the separating implication is a powerful tool when reasoning about the execution of programs backwards (among other things). All the examples in this post start from some initial state and move forward following the execution of the program. Doing this can produce useless assertions if the thing you are trying to prove isn't affected by some of the program snippets. Many times it's easier to reason from the end goal, that is, from the final result of a set of commands/expressions/program snippets, and work backwards. As you can see from { x |-> _ * (x |-> 1 -* Q) } *x = 1 { Q }, the final assertion can be anything. This means that you can start with some assertion you want to show for a program and move backward "over" a memory mutation!
8. This assumes the memory allocated for a struct in Rust is sequential (for reference). It also assumes that x.f and x.g aren't explicitly scoped pointers but rather translate into an operation on the pointer x at a low level (they aren't represented in the assertions).
9. Obviously this depends on your perspective and how complex the code is that uses the pointer. That is, it may be exceptionally hard to get the memory freed properly as a result of passing around pointers, in which case the compiler might be extremely valuable when writing the Rust equivalent.

30 Apr 2013


I'd like to thank the good folks at js.la for putting on a great meetup and having me out to participate. You can see the slides at http://johnbender.us/presentation-cs-foibles.

NOTE: The larger format presented on the Vimeo site itself might be better viewing.

07 Feb 2013


System F and its variants are used frequently in the study of typed computation and an extended form, System FC, plays an important role in GHC's compilation process [1]. This post will attempt to translate the grammars of the Lambda Calculus, Simply Typed Lambda Calculus, and finally System F into a small subset of CoffeeScript while covering the basics of application and abstraction in each. Hopefully, through studying the basics here, readers who are interested in learning more about type theory and the lambda calculi will feel a little less intimidated by the notation and concepts.

## Lambda Calculus

Representing the Lambda Calculus only requires a small subset of CoffeeScript's total syntax. Let's define both grammars side by side.
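
A rough sketch of the two, with t ranging over terms and x over variables:

Lambda Calculus         CoffeeScript subset
t ::= x                 t ::= x
    | t t                   | t t
    | λx.t                  | (x) -> t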

In both cases x represents a variable and t t represents the application of one term to another. The only real difference is the lambda abstraction syntax.

In the Lambda Calculus, abstraction involves preceding or wrapping a term t with λx.t, which captures or binds the variable x inside the term t. For example, if t is λx.y, applying the abstraction λy.t creates the new term λy.λx.y (otherwise known as the K-combinator). Note that the x in λx.t shown in the grammar could be any variable, like y or z.

In CoffeeScript we can do that same thing with the term (x) -> y by applying the CoffeeScript version of abstraction, yielding (y) -> (x) -> y.
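
Applying this new term to some variable z substitutes z for the bound y; a sketch of the step in the calculus's notation:

(λy.λx.y) z → λx.z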

The application of a term with two abstractions (y and x) to a single argument results in another term. The translation to JavaScript shows that a function object is returned, ready for application. The concept of application in the Lambda Calculus is really the idea of substitution. You can see that the z is substituted for the instances of y in the body of the inner lambda term. Similarly in CoffeeScript the outer lambda puts y in scope, and all occurrences of it in the body of the lambda term will behave as z.

The notion of closed terms in the Lambda Calculus is a point of departure with CoffeeScript. A closed term has no free variables. For example, in λx.y the variable y is called free because it hasn't been bound by an enclosing abstraction. Adding an additional abstraction to get λy.λx.y captures the y. The JavaScript generated by the CoffeeScript (x) -> y will throw a reference error without a definition for y in scope, in contrast to the lambda term, which will simply discard its argument and produce the y.

## Simply Typed Lambda Calculus

The Simply Typed Lambda Calculus expands on the Lambda Calculus with type information in abstractions.
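
Roughly, the term grammar changes so that abstraction annotates its argument with a type τ:

t ::= x | t t | λx:τ.t
τ ::= τ → τ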

It adds a type requirement to the argument of abstractions. The CoffeeScript equivalent adds a function that compares a type argument to the argument term. To keep things simple, ifft will throw an exception when the types don't line up.

# If and only if (iff) the type (t) is correct
ifft = (type, obj) ->
  if !(obj instanceof type)
    throw "Type mismatch"
This definition assumes that the browser's instanceof implementation is behaving properly but certainly serves the purpose of illustration here. Assuming the type Foo is defined for the calculus and with class Foo in the CoffeeScript, we can add types to a simple example.
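
One candidate for such an example is the Foo-specific identity function, which in the calculus's notation takes and returns a Foo:

λx:Foo.x : Foo → Foo

In the CoffeeScript translation the annotation becomes a runtime check, something like (x) -> ifft(Foo, x); x.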

The types that τ can represent are an important part of the Simply Typed Lambda Calculus. Some definitions provide a set of base types like natural numbers or booleans, along with a set of base constant values that can represent those types. For natural numbers, a base constant of 0 and Church numerals allow users to represent all the natural numbers. Here we've assumed the simple case, allowing only the function type τ → τ, meaning every argument must be a lambda term.

Unfortunately even the simple case throws a rather large monkey wrench into the works. That's because there's no way to write ifft such that it can know or inspect the argument type and return type of a lambda term (JavaScript function). In spite of that, let's proceed to see what we can learn about System F.

## System F

System F adds the notion of polymorphic types to the Simply Typed Lambda Calculus. The term polymorphic carries a lot of baggage, but here it is parametric polymorphism [3]. That is, the type of a term can change depending on the setting.

System F makes two extensions. Λα.e is a type abstraction, binding the type variable α in the term e, and e [τ] is the application of a type to a term that has been wrapped by a type abstraction. Note that the abstraction uses α to represent a type variable and τ to represent types that may or may not contain type variables. Since CoffeeScript's "types" are really constructor functions, all that's needed is an enclosing lambda to pull the type argument to ifft into scope and an application to a constructor function at runtime.

For System F the type abstraction is very specific. It works at the type level [4]. In CoffeeScript, on the other hand, it's just another lambda abstraction that puts the constructor function in scope as a lambda argument. In both cases, when the term is applied to a type, a substitution of the type for the type variable α takes place, just as with regular (term) abstraction.

Again there is an important exception with regard to the types permitted in System F that makes ifft even more inadequate than it was with the Simply Typed Lambda Calculus. Not only is there no way to write ifft so that it can properly check against lambda types, but System F also defines universally quantified types. This means that a lambda can be polymorphic and operate without knowing anything about the type of its argument.

For example the identity combinator Λα.λx:α.x, or in CoffeeScript (α) -> (x) -> ifft(α, x); x, has the polymorphic type ∀α.α → α. That is, its type isn't determined until a type application takes place or it "works" for all types. The issue arises when a combinator like identity is passed as an argument to another lambda term that expects a polymorphic function. Again, ifft can't check the types of a combinator if a type has been applied and it certainly can't check that the combinator is type agnostic in its dealings (polymorphic).
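
A sketch of a type application using the identity combinator; applying it to a concrete type substitutes that type for α:

(Λα.λx:α.x) [Bool] → λx:Bool.x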

## Strong and Normal

System F has some really neat properties. Most importantly System F is strongly normalizing. That means that it always terminates and always reduces to a normal form when the terms are well typed (the types line up properly). That alone makes it worth studying.

It's also possible to make the type abstraction a bit more robust in the CoffeeScript by verifying that the argument is at least a function. A more adventuresome individual might alter CoffeeScript to annotate constructor function objects with a property that could be used to set them apart and compare them [5].

Even if we came up short of a really meaningful translation, hopefully it has demystified parts of how these calculi work. If you want to learn more I suggest reading Types and Programming Languages by Pierce (not an affiliate link).

### Footnotes

1. There are some interesting papers to read from Microsoft Research on this particular extension of System F.
2. This is a more intrinsic approach to the translation as we aren't attempting to use types to reason about the terms/grammars.
3. Specifically rank-n polymorphism since the type variable α can be replaced by a quantified type.
4. This is actually captured in the reduction rules for System F.
5. Gasp! types in CoffeeScript.

15 Jan 2013


In the previous post I presented the basics of operational semantics and showed how derivation trees can be used to differentiate two terms that are syntactically similar. This post develops the closing thoughts further with the introduction of type rules, example tools for automating evaluation and type derivation, and a concrete definition of semantic ambiguity. The primary goal is to establish the best way to detect ambiguous term pairings and then outline what will work for a tool that can be generalized beyond the CoffeeScript subset.

## Type Rules

Type rules are similar in construction to evaluation rules, consisting of a premise and a conclusion. As with evaluation rules the premise establishes the preconditions for the conclusion. Again, each rule is tagged with a name for reference, prefixed in this case with t- to distinguish them from the inference rules (e-).
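
A rough sketch of the five rules in the usual stacked notation, with T standing for a concrete type and X for a type variable:

 ----------- (t-true)
 true : Bool

 ------------ (t-false)
 false : Bool

    t : T
 ------------- (t-lambda)
  λt : X -> T

  λt : X -> T
 ------------- (t-inv)
   λt() : T

  λt : X -> T
 ------------- (t-app)
   λt t' : T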

Type rules without a premise like t-true and t-false are taken to be true out of hand. That is, the terms true and false both have the type Bool. The others are more complicated.

t-lambda illustrates how to determine the type of a lambda term like (-> true). The premise above the line states that if the subterm t has the concrete type T, then the conclusion λt has the type X -> T. Here X is a type variable because we don't know whether the lambda will be evaluated with the invocation operator () or applied to an argument. T will be concrete because it can be determined from the body of the lambda expression. For example, in (-> true) the subterm true has the type Bool so the lambda term has the type X -> Bool.

t-inv shows how to determine the type of a lambda invocation like (-> true)(). The premise states that if the lambda term has the type X -> T, the term λt() has the type T. For example, (-> true)() evaluates to true and has the type Bool. It's worth noting that X is constrained to be the Unit or empty type since no argument is used.

t-app is the type rule for lambda applications, e.g. (-> true) false. The premise says that if the lambda term λt on the left has the type X -> T the conclusion is that the application will have the result type of the lambda term. Again, the type of an application, like invocation, is only concerned with the type of the first lambda's subterm t and it ignores the type of the argument that it's applied to.

## Type Rule Stacking

This notation makes it easy to establish the type of a term by stacking the type rules on one another in the same fashion as evaluation rules. Taking a very simple example, some diagrams will illustrate how this works:

(-> true)
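
Stacking t-true on top of t-lambda gives:

    ----------- (t-true)
    true : Bool
 ----------------------- (t-lambda)
  (-> true) : X -> Bool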

This highlights how to derive the type at the bottom from its subterms. Typing the innermost subterm true with t-true can be "stacked" by using it to replace the premise of type rule t-lambda. The type derivation expands from the subterm to establish each subsequent parent term's type. Another more complex derivation:

(-> (-> false))() true
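
Laid out the same way, from the innermost subterm outward:

      ------------ (t-false)
      false : Bool
 ------------------------------------ (t-lambda)
   (-> false) : X -> Bool
 ------------------------------------ (t-lambda)
   (-> (-> false)) : X -> (X -> Bool)
 ------------------------------------ (t-inv)
   (-> (-> false))() : X -> Bool
 ------------------------------------ (t-app)
   (-> (-> false))() true : Bool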

The second subterm of the application is unimportant where the type of the term is concerned and as a result it's wholly ignored. Working on the left term, the tree extends upward until it reaches the atomic value type, false : Bool. The complexity of nested lambdas and invocation makes for a taller stack of type rules to reach the atomic false when compared with the previous example.

## Not Quite There

At this point the type rules can describe the original issue. A derivation tree based on the typing rules highlights that the term is untypable. Taking our canonical example, (-> true)() -> false:
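
A sketch of the attempt, noting that the right-hand subterm has (-> false) : X -> Bool by t-lambda:

    ----------- (t-true)
    true : Bool
 ----------------------- (t-lambda)
  (-> true) : X -> Bool
 ----------------------- (t-inv)
   (-> true)() : Bool
 ----------------------------- (???)
  (-> true)() (-> false) : ???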

Once the derivation tree reaches the outermost term it breaks. There is no type rule for the application of something with type Bool to something with type X -> Bool since t-app requires the first term have the type X -> T in its premise. It's a type error.

Previously we saw that this would result in a type error under evaluation by the CoffeeScript interpreter. We also saw that it was easy to construct a term that suffered the same semantic confusion without the type error (-> (-> true))() -> false. This issue applies to the type derivation as well.

In addition we saw that it's possible to construct terms, albeit in the boolean example language, that might produce the same value through different evaluation paths. That is, they had different derivation trees in the evaluation relation but the same evaluation result. This issue also applies to type derivations.

In both cases useful information is lost when the derivation is discarded in favor of the final value or type. The advantage with the type information is obviously that no evaluation is required to determine if two terms are "different" in some way other than their syntax. The disadvantage is that not all languages make determining type information easy.

Ultimately the type information provides a second way to differentiate syntactically similar terms. Indeed there are cases where both the evaluation and type information are necessary to distinguish terms. For example ((x, y) -> x + y)(1, 1) has a type derivation identical to ((x, y) -> x * y)(1, 2) and the same evaluation result, but it clearly takes a different evaluation path [1].

Note: The next five sections cover an implementation of a lexer, parser, evaluator and mechanisms for type/evaluation derivation. If you'd rather just read about how the generated evaluation and type derivations are used to find confusing term pairings you can skip to Detecting Ambiguity

## Happy Parsing

It's time to build something concrete from the formal notion of evaluation and types. An AST for this CoffeeScript subset will provide enough information to perform evaluation and establish derivation trees for both evaluation and types. I've chosen Haskell along with the Alex and Happy tools to implement a simple lexer and parser. As you would expect the parser grammar definition looks very similar to the grammar definition presented in the previous post:

%token
  white   { Whitespace }
  bool    { Boolean $$ }
  '()'    { Unit       }
  '->'    { Arrow      }
  '('     { LeftParen  }
  ')'     { RightParen }

Expr   : Value                      { $1 }
       | Lambda '()'                { Invoke $1 }
       | Expr white Expr            { Apply $1 $3 }

Lambda : '()' white '->' white Expr { Lambda $5 }
       | '->' white Expr            { Lambda $3 }
       | '(' Lambda ')'             { $2 }

Value  : bool                       { BooleanExpr $1 }
       | Lambda                     { $1 }

You can view the full lexer and parser implementations here.

There are two differences from the original grammar definition. Lambda terms in parentheses are just a convenience for readability. More importantly, a correction must be made to the application of two terms, allowing any term as the left side (the applicand) [2]. This enables the grammar to reproduce the original issue, since (-> true)() -> false translates to an invocation applied to a lambda term. The corrected production is the Expr white Expr alternative in the grammar above.

Also, a correction and an addition must be made to the inference rules presented in the previous post. This will ensure that any term type is permitted as the left half of an application, and that it is fully evaluated before applying it.
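
A plausible rendering of the pair, writing t → t' for a single step of evaluation and v for a value term:

    t2 → t2'
 ------------------ (e-arg-eval)
  t1 t2 → t1 t2'

    t1 → t1'
 ------------------ (e-app-eval)
   t1 v → t1' v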

Where e-arg-eval ensures that the argument of an application is fully evaluated, e-app-eval ensures that the applicand is fully evaluated before the application takes place.

## Matching Rules

The abstract representation produced by the parser is a simple tree structure built with Haskell types. Pattern matching can be used with the type and inference rules to produce evaluation and derivation results. To start let's look at a simple evaluator and derivation builder.

-- an enumeration of each inference rule
data InfRule = Inv | App | ArgEval | AppEval

The InfRule Haskell type is a simple enumeration of the tags belonging to each inference rule. e-inv corresponds to Inv and so on.

-- an intermediate form for performing derivation and evaluation
data RuleMatch = None | RuleMatch InfRule (Maybe Expr) Expr

-- match a rule and provide the relevant sub terms for action
matchRule :: Expr -> RuleMatch

Both the evaluator and the derivation builder will operate based on the inference rules that apply to each term and its subterms. The function matchRule takes an expression, Expr, and provides three pieces of information in a RuleMatch result: the inference rule that applies to the term, an optional term for the premise of an inference rule pulled from the body of the parent term, and a term for the conclusion of the inference rule also pulled from the body of the parent term. There are pattern matching definitions for each rule.

matchRule (BooleanExpr _) = None
matchRule (Lambda _)      = None

The value terms true, false and (-> x) are the base case of matchRule. That is, whenever another function requests a rule match on the value terms None is provided to signal that the term has been fully evaluated.

-- Rule: e-inv
matchRule (Invoke (Lambda t)) = RuleMatch Inv Nothing t

Invocation can only be applied to a lambda term and the result of the invocation is the lambda's subterm, e.g. (-> true)() evaluates to true. An invocation on anything else will simply drop through this match and ultimately to the catch-all error case. For example the CoffeeScript true() is invalid. Its abstract representation from the parser is Invoke (BooleanExpr True), which clearly won't match here. On a match, the RuleMatch result contains the rule tag for invocation, Inv, nothing for an inference rule premise since there isn't one for e-inv, and the subterm t for further derivation in the conclusion.

-- Rule: e-app
matchRule (Apply (Lambda t) (BooleanExpr _)) = RuleMatch App Nothing t
matchRule (Apply (Lambda t) (Lambda _))      = RuleMatch App Nothing t

Like invocation, e-app only works with lambda terms, but it carries the additional requirement that the argument be a value term. The grammar shows that the only v (value) terms are lambdas and boolean values so there's a match for those cases here. When there's a match the rule tag is App and the lambda subterm is again provided for possible further inspection/operation.

-- Rule: e-arg-eval
matchRule (Apply t i@(Invoke _))  = RuleMatch ArgEval (Just i) t
matchRule (Apply t a@(Apply _ _)) = RuleMatch ArgEval (Just a) t

-- Rule: e-app-eval
matchRule (Apply i@(Invoke _) t)  = RuleMatch AppEval (Just i) t
matchRule (Apply a@(Apply _ _) t) = RuleMatch AppEval (Just a) t

e-arg-eval and e-app-eval are more complicated than either e-inv or e-app which makes sense when comparing them as inference rules. Both e-arg-eval and e-app-eval carry a premise.

Both rules require that some evaluation take place on one of the subterms. More importantly the shape of the term remains the same. Neither e-arg-eval nor e-app-eval changes the shape of the term to which it applies, only the shape of the subterms. This is in contrast to e-inv and e-app, which discard the invocation operator and second term respectively. As a result the RuleMatch contains the subterm that needs to be evaluated further and the other subterm that remains stagnant. Note that in the function definition the e-arg-eval rule is matched first so that the e-app-eval rule can ignore the second subterm under the assumption that it's a value term (not Invoke or Apply).

matchRule t = error $ "No inference rule applies for: " ++ (show t)

Finally, in situations like true() or true (-> true) where no rule applies, an error is raised.

## Evaluating the Options

The information contained in a RuleMatch instance can be used to evaluate or derive a given term. Evaluation is a simple matter of applying the rules recursively.

-- perform a single evaluation step
eval :: Expr -> Expr
eval t =
  case (matchRule t) of
    None                             -> t
    (RuleMatch _ Nothing t1)         -> t1
    (RuleMatch ArgEval (Just t1) t2) -> Apply t2 (eval t1)
    (RuleMatch AppEval (Just t1) t2) -> Apply (eval t1) t2

eval performs a single step of evaluation according to the inference rules. The first case match returns the original term t because None is the match for fully evaluated value terms like true, false, and (-> x). The second match handles both Inv and App by returning the subterm of the invoked or applied lambda term. The matchRule function does a bit of evaluation for these two rules by stripping the applied lambda term. For example, (-> true) true and (-> true)() become true.

(RuleMatch ArgEval (Just t1) t2) -> Apply t2 (eval t1)
(RuleMatch AppEval (Just t1) t2) -> Apply (eval t1) t2

For ArgEval and its cousin AppEval the subterm that needs further evaluation gets it, and then the whole term is reassessed. The order of evaluation is preserved here by recursion. If the argument in an application needs more than one evaluation step, eval will continue to work on it until the result is returned to the original invocation. Subsequently if the applicand needs evaluation it will do the same. For example, in (-> true) (-> true)() the second term is evaluated with an Inv and then the boolean result is the argument to the first lambda term.

-- reduce an expression to a value term
fullEval :: Expr -> Expr
fullEval t =
  case (matchRule t) of
    None -> t
    _    -> fullEval $ eval t

fullEval simply applies eval to t until it reaches a value term.

## Automating Evaluation Derivation

The RuleMatch instance is primarily geared toward building derivation trees. That's why the structure appears so awkward in use with eval.

data Derivation = Empty | Derivation InfRule Derivation Derivation Expr

The Derivation data type is composed of a tag from the InfRule enumeration, one possible derivation as a premise, the final derivation as the conclusion, and the expression representing the state of evaluation at a given moment. Take the derivation tree of a simple example, (-> (-> true))() false, which is parsed to:

Apply (Invoke (Lambda (Lambda (BooleanExpr True)))) (BooleanExpr False)

In English: the application of an invocation of a lambda (with a lambda subterm) to a boolean value. The resulting tree in the original notation takes the form:
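
Roughly, with → marking a single evaluation step, the e-inv line discharges the premise of e-app-eval, and e-app then finishes the evaluation:

 (-> (-> true))() → (-> true)                (e-inv)
 ------------------------------------------- (e-app-eval)
 (-> (-> true))() false → (-> true) false

 (-> true) false → true                      (e-app)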

The Derivation instance has to work from the outside in, so it's much harder to read than the notation, but it contains the same information:

Derivation AppEval
  -- |       premise        | |   e-inv value            |
  (Derivation Inv Empty Empty (Lambda (BooleanExpr True)))

  -- |      conclusion      | |   e-app value   |
  (Derivation App Empty Empty (BooleanExpr True))

  -- |                e-app-eval value                  |
  (Apply (Lambda (BooleanExpr True)) (BooleanExpr False))

It's clear that the applicand (-> (-> true))() needs evaluation using e-app-eval before it can be applied to the argument false. The premise of e-app-eval requires that the applicand take a step, and here that means an invocation with e-inv. Finally the result of the invocation, (-> true), is applied to the false with e-app as the "conclusion" of the e-app-eval. In reality, e-app is applied to the result of the first derivation tree just as it is with the logic notation.

-- build a derivation from an expression
derive :: Expr -> Derivation
derive t =
  case (matchRule t) of
    None                         -> Empty
    (RuleMatch rule Nothing t1)  -> Derivation rule Empty (derive t1) evald
    (RuleMatch rule (Just t1) _) -> Derivation rule (derive t1) (derive evald) evald
  where evald = eval t

The derive function works in a similar fashion to eval. For a value/None result from matchRule there are no inference rules that apply. For e-inv or e-app, derive can recurse and build a derivation from the lambda's subterm. For e-arg-eval or e-app-eval the premise must be further derived and the conclusion is a derivation for the original term t with one evaluation step applied. That is, evaluating the subterm t1 once inside the original term t. The use of eval to do that may look funny but it's just a convenience.

## Automating Type Derivation

Deriving the type for a term in the CoffeeScript subset is slightly less complex than deriving the evaluation. Again, a type rule is matched to each valid AST construction.

data RuleMatch  = None | RuleMatch TypeRule Expr

-- match a rule and provide the relevant sub terms for action
matchRule :: Expr -> RuleMatch

The RuleMatch definition for types requires one less Expr. The derive and fixType definitions for types only require the first subterms in each expression. This is in contrast to eval which required both the conclusion and premise terms.

-- t-true & t-false
matchRule b@(BooleanExpr True)    = RuleMatch TrueType b
matchRule b@(BooleanExpr False)   = RuleMatch FalseType b

-- t-lambda
matchRule (Lambda t)              = RuleMatch LambdaType t

-- t-inv
matchRule (Invoke (Lambda t))     = RuleMatch Inv t

-- t-apply
matchRule (Apply t@(Lambda _) _)  = RuleMatch App t
matchRule (Apply t@(Invoke _) _)  = RuleMatch App t
matchRule (Apply t@(Apply _ _) _) = RuleMatch App t

The Apply matches capture only valid applicands and let the rest fall through to the error case. It's also worth noting that each of the Apply matches discards the argument term because it's unnecessary to the type of the expression. This fits with the definition of the type rules.

Fixing the type of a given expression is a simple recursive effort on applications. The Type data type captures both the Bool result, and the recursive Arrow type. For example (-> (-> true)) has the type Arrow (Arrow Bool).

-- the two possible types for a given expression
data Type = Bool | Arrow Type

-- a type derivation: a rule tag, an optional premise derivation, and a type
data Derivation = Empty | Derivation TypeRule Derivation Type

-- determines the type of a given expression
fixType :: Expr -> Type
fixType t =
  case (matchRule t) of
    (RuleMatch TrueType _)    -> Bool
    (RuleMatch FalseType _)   -> Bool
    (RuleMatch LambdaType t1) -> Arrow $ fixType t1
    (RuleMatch Inv t1)        -> fixType t1
    (RuleMatch App _)         -> fixType $ eval t

The type of an invocation is determined by the lambda's subterm, so matchRule provides that as t1 here for further type information. The type of an application is dependent on the type of its applicand, so we cheat a bit here and use the single-step eval to get at the result of the application.

derive :: Expr -> Derivation
derive t =
  case (matchRule t) of
    (RuleMatch TrueType _)    -> Derivation TrueType Empty $ fixType t
    (RuleMatch FalseType _)   -> Derivation FalseType Empty $ fixType t
    (RuleMatch rule t1)       -> Derivation rule (derive t1) $ fixType t

The type rules are much easier to apply; they simply descend into the terms to build up the type, providing the fixed type at each step as the conclusion. Taking the same example from the evaluation rules earlier, (-> (-> true))() false, which is parsed to:

Apply (Invoke (Lambda (Lambda (BooleanExpr True)))) (BooleanExpr False)

The type derivation using logic notation looks like:
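
Roughly, in the stacked style used earlier:

    ----------- (t-true)
    true : Bool
 ---------------------------------- (t-lambda)
  (-> true) : X -> Bool
 ---------------------------------- (t-lambda)
  (-> (-> true)) : X -> (X -> Bool)
 ---------------------------------- (t-inv)
  (-> (-> true))() : X -> Bool
 ---------------------------------- (t-app)
  (-> (-> true))() false : Bool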

The Derivation instance corresponding to the logic notation is again much larger but captures the same information (formatting added after the fact):

Derivation App
  (Derivation Inv
    (Derivation LambdaType
      (Derivation TrueType Empty Bool)
      (Arrow Bool))  -- Lambda type is X -> <subterm type>
    (Arrow Bool))    -- Inv type is its subterm's type
  Bool               -- App type is the T in the applicand's X -> T

As noted in the comments, each step in the derivation resolves the type at that step based on the type rules.

## Detecting Ambiguity

So far we've seen that it's possible to build an understanding of evaluation and typing that provides more information than just the evaluation result or the fixed type for a term. Capturing that extra information, a term can be represented by a triple (S, E, T), where S is the syntax string of the term, E is the evaluation derivation, and T is the type derivation. This triple can be used to determine whether two terms will cause confusion.

One approach is to first compare the S values for two terms and then determine if the E and T values match. Terms with "similar" S values but different E or T values might be ambiguous and could be flagged for review. Using the Levenshtein Distance to keep the calculation for similarity simple:
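
In symbols, with lev as the Levenshtein distance between two strings:

dist(s1, s2) = lev(s1, s2) / max(length(s1), length(s2))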

lev is the Levenshtein distance function and dist is just the ratio of the distance between the two strings and the maximum length of both. This is sometimes referred to as the Levenshtein Ratio. For (-> true)() -> false and (-> true) () -> false:
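
A minimal Haskell sketch of the pair, using the textbook recursive definition of lev (exponential without memoization, but fine for strings this short):

-- Levenshtein distance, naive recursive definition
lev :: String -> String -> Int
lev [] ys = length ys
lev xs [] = length xs
lev (x:xs) (y:ys)
  | x == y    = lev xs ys
  | otherwise = 1 + minimum [ lev xs (y:ys)   -- deletion
                            , lev (x:xs) ys   -- insertion
                            , lev xs ys ]     -- substitution

-- Levenshtein ratio: distance normalized by the longer string's length
dist :: String -> String -> Double
dist s1 s2 = fromIntegral (lev s1 s2) / fromIntegral (max (length s1) (length s2))

Here the distance is 1 (the single inserted space) against a maximum length of 21 characters, giving a ratio of roughly 0.048.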

A relative value for string distance that can be used as a threshold "setting" makes building a tool for automating the process easier. That is, if two terms are deemed "close enough" by virtue of their dist value falling below a predetermined threshold and they have different information in either E or T, they might be flagged [3].

## Fuzzy Search

We now have enough information to define a system that will automate the exploration of the "term space" (all term combinations), checking each generated term against the known terms for ambiguous pairings.

Storing the triple of known terms for comparison is fairly easy with the text search capabilities available in most modern databases. One might even implement the Levenshtein Distance function and use it to check a new term against known terms. It may be that a purpose built data structure for the storage and retrieval based on a text search algorithm would perform better, but a good all purpose RDBMS would be fine for a first pass.

More interesting is the generation of terms for a non-trivial language. A term generator would start with atomic terms and successively wrap them in terms defined to have subterms. That part can likely be performed with nothing more than knowledge of the grammar. There are two issues with this.

First, the complexity of many programming languages makes re-examining the same terms an enormous waste of time. Tracking the explored terms and "resuming" the exploration process would have a lot of value. Second, generating the derivations to store and compare along with the syntax is an involved effort. Again, it's easy to tag a piece of syntax with the result of execution or typing, but information is lost.

## Quick and Dirty

A less complicated representation of a term might still be effective, and could avoid extra effort required of the language creator in generating the evaluation and type derivations. For example the tuple (S, A), where S remains the syntax of the term and A is the AST representation.

( "(-> true)() -> false",
Apply (Invoke (Lambda (BooleanExpr True))) (Lambda (BooleanExpr False)) )

( "(-> true) () -> false",
Apply (Lambda (BooleanExpr True)) (Lambda (BooleanExpr False)) )

It's obvious that the abstract representations capture the issue at hand even if some information is lost [4]. Best of all, the AST for a term is available regardless of the host language, and serialization is the only extra requirement. With a term generator that works from a (E)BNF, a way to generate the AST for a term (presumably through the language parser), and a database equipped with the ability to find like terms, it seems entirely possible to alert the language creator to complex or convoluted pairings.

## Further Work

First I have to apologize for not building out a tool for generating terms or a schema for term storage. I wanted to do the automated evaluation and type derivations to get a feel for the effort involved, and the result was an exceptionally long post. If I find the time to return to this I'd like to build out the term generator and couple it with a simple database. I think that going through the process of building a BNF parser would be a lot of fun by itself.

In the course of these two posts we've seen what it looks like to formalize both the evaluation and type semantics of a simple programming language. We've also come to a relatively satisfying formalization of semantic ambiguity that could be used in conjunction with a common language definition form (BNF, EBNF) to alert a language designer of potential issues [5] [6].

### Footnotes

1. It might be that when a function identifier is the only difference between terms, here * and +, it's reasonable to ignore ambiguous terms. In this case because the total string length for both terms is small it might be that a single character difference is enough to break some arbitrary threshold. I'm leaving this for further consideration.
2. The implementation in Haskell forced these issues out into the open. I'm curious if proving progress and preservation would have pointed out the flaws in my approach (this may be obvious one way or another to a better educated reader).
3. Assuming it's possible, it's interesting to think about what the inverse result means. That is, when two terms are very syntactically different but have identical types/evaluation derivations. This might signal the two terms or the parent language as antithetical to Python's slogan of "one and only one way to do it".
4. For example, the AST doesn't capture the type of lambda form that was used. This may be useful information even if this particular example doesn't require it.
5. Though it would be infinitely more satisfying if we could build a tool based on the ideas and arrive at the same conclusions about this CoffeeScript subset and a few other BNF friendly languages.
6. It's worth pointing out that the CoffeeScript issue with lambdas and invocation has been/was known to Jeremy. It was simply a choice in favor of flexibility. I like to think that the hypothetical tool presented here would be useful in cases where ambiguous term pairings are less obvious and for people who may want less flexibility.
7. A special thanks to keyist for proofreading.

09 Jan 2013