Saturday, February 13, 2010

A Syntax Option for Project Lambda (Closures for Java)

It was noted recently on the Project Lambda mailing list that allowing arrays of function type would undermine the type system. The reason for this is a combination of Java's covariant arrays, the natural subtypes among function types (they are covariant on return type and contravariant on argument types), exception checking, and the erasure implementation of generics. We certainly can't remove covariant arrays or checked exceptions, and removing the subtype relationship among function types really reduces their utility. Unfortunately, it is almost certainly too late to reify generics too. Although you can't have an array of function type, there is no problem having, for example, an ArrayList of a function type. So while we might prefer not to have this restriction, it won't be much of a problem in practice.

There are many ways in which arrays of function type would undermine the type system, but to give you some flavor of the problem, the following is a method written assuming that arrays of function type are allowed. This method must fail in some way - either fail to compile, or throw a ClassCastException, or an ArrayStoreException, or something. In all of the implementations being considered, this method would throw IOException, even though that isn't declared in the throws clause of the method. We have undermined Java's exception checking without even a cast!

public void main(String[] args) {
    #void()[] arrayOfFunction = new #void()[1];
    Object[] objectArray = arrayOfFunction;
    objectArray[0] = #(){ throw new IOException(); };
    arrayOfFunction[0].(); // invoke the function out of the array
}

The realization that supporting arrays of function type would introduce a loophole in the type system makes it possible to consider some very nice syntax options that were previously eliminated because there was no syntactic way to write an array of function type. But if we disallow arrays of function type, those options can be considered.

My favorite of the alternatives is based on this grammar

FunctionType:
( TypeListopt Throwsopt ) -> ResultType
Expression:
LambdaExpression
LambdaExpression:
( FormalParameterListopt ) -> Expression

A function type is written with the argument types between parentheses, then a right arrow (dash greater-than), and then the result type.

The most important distinction between this proposal and the existing draft specification is that this proposal places the result type after the argument types, instead of before. The benefits of its simplicity compared to the current specification are most obvious when you combine function types and lambdas with other features.

Here is a simple example to show how the two proposals look. The example is currying. The following method takes as a parameter a function of two arguments, and returns a function of one argument that returns a function of one argument. Applying the resulting function to one value, and then the result of that to another value, should have the same effect as applying the original function to both values. To make it interesting, we make the argument and result types generic, and we allow the function to throw an exception.

Using the currently proposed syntax for Project Lambda, the code would look something like this:

static <T,U,V,X extends Throwable>
##V(U)(throws X)(T) curry(#V(T,U)(throws X) function) {
   return #(T t)(#(U u)(function.(t,u)));
}

On the other hand, with the proposal described above it looks something like this:

static <T,U,V,X extends Throwable>
(T)->(U throws X)->V curry((T,U throws X)->V function) {
   return (T t)->(U u)->function.(t,u);
}

I've intentionally selected an example that uses the new features heavily, so either may take a few moments of studying to understand. But I claim the latter is much easier to follow than the former, even once you've become familiar with the syntax.

You can easily write a function that yields an array as its result

()->int[] arrayLambda = ()->new int[]{1, 2, 3};

With this proposed syntax there is no way to write an array of functions. But that is exactly what we concluded could not be made to work within the type system anyway.