Friday, August 18, 2006

Closures for Java

I'm co-author of a draft proposal for adding support for closures to the Java programming language for the Dolphin (JDK 7) release. It was carefully designed to interoperate with the current idiom of one-method interfaces. An abbreviated version of the original proposal is reproduced below. The latest version of the proposal and a prototype can be found at http://www.javac.info/.

Gilad Bracha, Neal Gafter, James Gosling, Peter von der Ahé

Modern programming languages provide a mixture of primitives for composing programs. C#, JavaScript, Ruby, Scala, and Smalltalk (to name just a few) have direct language support for function types and inline function-valued expressions, called closures. A proposal for closures is working its way through the C++ standards committees as well. Function types provide a natural way to express some kinds of abstraction that are currently quite awkward to express in Java. For programming in the small, closures allow one to abstract an algorithm over a piece of code; that is, they allow one to more easily extract the common parts of two almost-identical pieces of code. For programming in the large, closures support APIs that express an algorithm abstracted over some computational aspect of the algorithm. We propose to add function types and closures to Java. We anticipate that the additional expressiveness of the language will simplify the use of existing APIs and enable new kinds of APIs that are currently too awkward to express using the best current idiom: interfaces and anonymous classes.

Function Types

We introduce a new syntactic form:

Type
Type ( TypeList ) { throws ThrowsTypeList }
ThrowsTypeList
Type
ThrowsTypeList
ThrowsTypeList VBAR Type
VBAR
|

These syntactic forms designate function types. A function type is a kind of reference type. A function type consists of a return type, a list of argument types, and a set of thrown exception types.

Note: the existing syntax for the throws clause in a method declaration uses a comma to separate elements of the ThrowsTypeList. For backward compatibility we continue to allow commas to separate these elements in method and function declarations, but in function types we require the use of the '|' (vertical-bar) character as a separator to resolve a true ambiguity that would arise when a function type is used in a type list. For uniformity of syntax, we also allow the vertical-bar as a separator in the throws clause of method and function definitions, and as a matter of style we recommend that new code prefer the vertical-bar.

Local Functions

In addition to function types, we introduce local functions, which are one way to introduce a name with function type:

BlockStatement
LocalFunctionDeclarationStatement
LocalFunctionDeclarationStatement
Type Identifier FormalParameters { throws ThrowsTypeList } Block

A local function declaration has the effect of declaring a final variable of function type. Local functions may not be declared with a variable argument list. Local functions may invoke themselves recursively.

Note: this syntax omits annotations, which should be allowed on local functions.

Example

Combining these forms, we can write a simple function and assign it to a local function variable:

 public static void main(String[] args) {
     int plus2(int x) { return x+2; }
     int(int) plus2b = plus2;
     System.out.println(plus2b(2));
 }

Namespaces and name lookup

The Java programming language currently maintains a strict separation between expression names and method names. These two separate namespaces allow a programmer to use the same name for a variable and for methods. Local functions and closure variables necessarily blur the distinction between these two namespaces: local functions may be used as expression values; contrariwise, variables of function type may be invoked.

A local function declaration introduces the declared name as a variable name. When searching a scope for a method name, if no methods exist with the given name then local functions and variables of the given name that are of function type are considered candidates. If more than one exists (for example, function-typed variable declarations are inherited from separate supertypes), the reference is considered ambiguous; local functions do not overload.

When searching a scope for an expression name, local functions are treated as variables. Function names and values can therefore be used like other values in a program, and can be applied using the existing invocation syntax. In addition, we allow a function to be invoked from an arbitrary (for example, parenthesized) expression:

Primary
Primary Arguments

Anonymous Functions (Closures)

We also introduce a syntactic form for constructing a function value without declaring a local function (precedence is tentative):

Expression3
Closure
Closure
FormalParameters Block
Closure
FormalParameters : Expression3

Example

We can rewrite the assignment to plus2b in the previous example using an anonymous function:

    int(int) plus2b = (int x) {return x+2; };

Or, using the short form,

    int(int) plus2b = (int x) : x+2;
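
For comparison, here is a minimal sketch of how the same plus2b value has to be written in the current idiom, using a one-method interface and an anonymous class. The names IntToInt and Plus2Demo are made up for this illustration and are not part of the proposal.

```java
// Current-idiom equivalent of: int(int) plus2b = (int x) : x+2;
// IntToInt and Plus2Demo are hypothetical names used only for this sketch.
class Plus2Demo {
    /** Stands in for the proposed function type int(int). */
    interface IntToInt {
        int apply(int x);
    }

    /** Builds the function value as an anonymous class instance. */
    static IntToInt plus2() {
        return new IntToInt() {
            public int apply(int x) { return x + 2; }
        };
    }

    public static void main(String[] args) {
        IntToInt plus2b = plus2();
        // Invocation goes through the interface method rather than plus2b(2).
        System.out.println(plus2b.apply(2));
    }
}
```

The interface declaration, the class-instance creation, and the method header are all ceremony that the closure form would leave out.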

The type of a closure

The type of a closure is inferred from its form as follows:

The argument types of a closure are the types of the declared arguments.

For the short form of a closure, the return type is the type of the expression following the colon. For the long form of a closure, if the body contains no return statement and the body cannot complete normally, the return type is the type of null. Otherwise, if the body contains no return statement, or its return statements are all without a value, the return type is void.

Otherwise, consider the set of types appearing in the return statements within the body. These types are combined from left to right using the rules of the conditional operator (JLS3 15.25) to compute a single unique type, which is the return type of the closure.
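
As a rough illustration of the conditional-operator rules this relies on, the sketch below uses today's language to show how an int operand and a double operand combine to double. By analogy, a closure body with `return 1;` on one path and `return 2.0;` on another would presumably be given the return type double. ReturnTypeDemo is a hypothetical name used only here.

```java
// Shows how the conditional operator (JLS3 15.25) combines operand types.
// ReturnTypeDemo is a hypothetical name used only for this sketch.
class ReturnTypeDemo {
    /**
     * The operands are int and double, so binary numeric promotion
     * makes the whole expression double; it is then boxed to Double.
     */
    static Object intOrDouble(boolean flag) {
        return flag ? 1 : 2.0;
    }

    public static void main(String[] args) {
        // Even when the int branch is taken, the result is a Double.
        System.out.println(intOrDouble(true).getClass().getName());
    }
}
```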

The set of thrown types of a closure is the set of checked exception types thrown by the body.

Example

The following illustrates a closure being assigned to a variable with precisely the type of the closure.

    void(int) throws InterruptedException closure =
        (int t) { Thread.sleep(t); };

Subtyping

A function type T is a subtype of function type U iff all of the following hold:

  • One of the following holds:
    • The return type of T is the same as the return type of U, or
    • Both return types are reference types and the return type of T is a subtype of the return type of U, or
    • The return type of U is void.
  • T and U have the same number of declared arguments.
  • For each corresponding argument position in the argument list of T and U, either both argument types are the same or both are reference types and the type of the argument to U is a subtype of the corresponding argument to T.
  • Every exception type in the throws of T is a subtype of some exception type in the throws of U.
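
The return and argument rules above can be approximated in the current language with a generic one-method interface and wildcards. This is only a sketch: Fn, callWithString, describe, and SubtypeDemo are hypothetical names, and the wildcards have to be written out at each use site, whereas the proposed function types would apply the rule implicitly.

```java
// Approximates the subtyping rule with a generic one-method interface.
// Fn, callWithString, describe, and SubtypeDemo are hypothetical names.
class SubtypeDemo {
    /** Stands in for the proposed function type R(A). */
    interface Fn<A, R> {
        R apply(A arg);
    }

    /**
     * Accepts any function that can take a String and return some Object.
     * "? super String" models the argument rule (U's argument type is a
     * subtype of T's) and "? extends Object" models the return rule.
     */
    static Object callWithString(Fn<? super String, ? extends Object> f, String s) {
        return f.apply(s);
    }

    /** A function of type String(Object): wider argument, narrower return. */
    static Fn<Object, String> describe() {
        return new Fn<Object, String>() {
            public String apply(Object o) { return "got " + o; }
        };
    }

    public static void main(String[] args) {
        // A String(Object) value is usable where Object(String) is expected.
        System.out.println(callWithString(describe(), "hello"));
    }
}
```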

Exception handling

The invocation of a function throws every checked exception type in the function's type.

It is a compile-time error if the body of a function can throw a checked exception type that is not a subtype of some member of the throws clause of the function.
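
A minimal sketch of how the same checking looks in the current idiom, assuming a hypothetical one-method interface ThrowingBlock whose throws clause is a type parameter. Invoking the block through runTwice (also a made-up name) forces the caller to handle the declared checked exception, much as invoking a function type would.

```java
// Models a function type's throws clause with a generic exception parameter.
// ThrowingBlock, runTwice, and ThrowsDemo are hypothetical names.
class ThrowsDemo {
    /** Stands in for the proposed function type void() throws E. */
    interface ThrowingBlock<E extends Exception> {
        void run() throws E;
    }

    /** Invoking the block throws whatever checked type its type declares. */
    static <E extends Exception> void runTwice(ThrowingBlock<E> block) throws E {
        block.run();
        block.run();
    }

    /** Returns true if runTwice propagated the block's InterruptedException. */
    static boolean propagatesInterrupted() {
        try {
            runTwice(new ThrowingBlock<InterruptedException>() {
                public void run() throws InterruptedException {
                    throw new InterruptedException("woken early");
                }
            });
            return false;
        } catch (InterruptedException e) {
            // The compiler required this handler because E was inferred
            // as InterruptedException.
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(propagatesInterrupted());
    }
}
```

A single type parameter like E can stand for only one thrown type, whereas a function type carries a whole set of them.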

Reflection

A function type inherits all the non-private methods from Object. The following methods are added to java.lang.Class to support function types:

    public final class java.lang.Class<T> ... {
        public boolean isFunction();
        public java.lang.reflect.FunctionType functionType();
        public Object invokeFunction(Object function, Object ... args)
           throws IllegalArgumentException | InvocationTargetException;
    }
    public interface java.lang.reflect.FunctionType {
        public Type returnType();
        public Type[] argumentTypes();
        public Type[] throwsTypes();
    }

Note: unlike java.lang.reflect.Method.invoke, Class.invokeFunction cannot throw IllegalAccessException, because there is no access control to enforce; the function value designates either an anonymous or local function, neither of which allows access modifiers in its declaration. Access to function values is controlled at compile-time by their scope, and at runtime by controlling the function value.
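
For contrast, here is a sketch of reflective invocation in the current idiom, where the function value is an instance of a one-method interface and the call goes through java.lang.reflect.Method.invoke. The class and method names ReflectDemo, invokeReflectively, and helloViaReflection are invented for illustration; only Method, Callable, and their existing methods are real APIs.

```java
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;
import java.util.concurrent.Callable;

// Reflective invocation in the current idiom, for contrast with the
// proposed Class.invokeFunction. ReflectDemo is a hypothetical name.
class ReflectDemo {
    /** Invokes the single method of a Callable reflectively. */
    static Object invokeReflectively(Callable<?> task)
            throws IllegalAccessException, InvocationTargetException,
                   NoSuchMethodException {
        // Look the method up on the public interface rather than on the
        // anonymous implementation class.
        Method call = Callable.class.getMethod("call");
        // Method.invoke enforces access control, hence IllegalAccessException.
        return call.invoke(task);
    }

    /** Wraps the checked reflection exceptions for a simple demonstration. */
    static String helloViaReflection() {
        Callable<String> hello = new Callable<String>() {
            public String call() { return "hello"; }
        };
        try {
            return (String) invokeReflectively(hello);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(helloViaReflection());
    }
}
```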

The type of null

We add support for null and the type of null in function types. We introduce a meaning for the keyword null as a type name; it designates the type of the expression null. A class literal for the type of null is null.class. These are necessary to allow reflection, type inference, and closure literals to work for functions that do not return normally. We also add the non-instantiable class java.lang.Null as a placeholder, and its static member field TYPE as a synonym for null.class.

Referencing names from the enclosing scope

Names that are in scope where a function or closure is defined may be referenced within the closure. We do not propose a rule that requires referenced variables be final, as is currently required for anonymous class instances. The constraint on anonymous class instances is also relaxed to allow them to reference any local variable in scope.
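
To make the existing restriction concrete, the sketch below shows the usual workaround in the current language: because an anonymous class may capture only final locals, mutable state is passed in through a final one-element array. CaptureDemo and runThreeTimes are hypothetical names. Under the relaxed rule, a plain `int count` could presumably be captured and incremented directly.

```java
// Today an anonymous class may only capture final locals, so mutable
// state is usually smuggled in through a one-element array.
// CaptureDemo is a hypothetical name used only for this sketch.
class CaptureDemo {
    /** Counts how many times the task runs by mutating captured state. */
    static int runThreeTimes() {
        final int[] count = { 0 };   // final reference, mutable contents
        Runnable increment = new Runnable() {
            public void run() { count[0]++; }
        };
        increment.run();
        increment.run();
        increment.run();
        return count[0];
    }

    public static void main(String[] args) {
        System.out.println(runThreeTimes());
    }
}
```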

Note: Some who see concurrency constructs being the closure construct's primary use prefer to either require such referenced variables be final, or that such variables be explicitly declared for sharing, perhaps by requiring them be declared volatile. We reject this proposal for a few reasons. First, concurrency has no special role in the need for closures in the Java programming language; the proposal punishes other users of the feature for the convenience of these few. Second, the proposal is non-parallel with the closest existing parallel structure: classes. There is no constraint that a method may only access, for example, volatile fields of itself or other objects or enclosing classes. If compatibility allowed us to add such a rule to Java at this time, such a rule would obviously inconvenience most programmers for very little benefit. Third, marking such variables volatile, with all the semantic meaning implied by volatile, is neither necessary nor sufficient to ensure (and hardly assists!) appropriate use in a multithreaded environment.

Non-local transfer

One purpose for closures is to allow a programmer to refactor common code into a shared utility, with the difference between the use sites being abstracted into a local function or closure. The code to be abstracted sometimes contains a break, continue, or return statement. This need not be an obstacle to the transformation. A break or continue statement appearing within a closure or local function may transfer to any matching enclosing statement provided the target of the transfer is in the same innermost ClassBody.

Because the return statement within a block of code is given new meaning when transformed by being surrounded by a closure, a different syntactic construct is required to return from an enclosing function or method. The following new form of the return statement may be used within a closure or local function to return from any enclosing (named) local function or method, provided the target of the transfer is in the same innermost ClassBody:

NamedReturnStatement
return Identifier : ;
NamedReturnStatement
return Identifier : Expression ;

No syntax is provided to return from a lexically enclosing closure. If such non-local return is required, the code should be rewritten using a local function (i.e. introducing a name) in place of the closure.

If a break statement is executed that would transfer control out of a statement that is no longer executing, or is executing in another thread, the VM throws a new unchecked exception, UnmatchedNonlocalTransfer. (I suspect we can come up with a better name). Similarly, an UnmatchedNonlocalTransfer is thrown when a continue statement attempts to complete a loop iteration that is not executing in the current thread. Finally, an UnmatchedNonlocalTransfer is thrown when a NamedReturnStatement attempts to return from a function or method invocation that is not pending in the current thread.
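
As a rough sketch of what non-local return replaces, the current language can only simulate it by unwinding with an exception. In the example below, the callback passed to a shared forEach utility needs to make the enclosing method firstWithPrefix return early. All names (NonLocalDemo, Visitor, Found, forEach, firstWithPrefix) are hypothetical, and the exception-based unwinding is a stand-in technique, not how the proposal would be implemented.

```java
import java.util.Arrays;
import java.util.List;

// Simulates a non-local return in the current language by unwinding
// with a private unchecked exception. All names here are hypothetical.
class NonLocalDemo {
    /** Stands in for the proposed function type void(String). */
    interface Visitor {
        void visit(String item);
    }

    /** Carries the result out of the callback to the enclosing method. */
    static final class Found extends RuntimeException {
        final String value;
        Found(String value) { this.value = value; }
    }

    /** The shared utility: calls the visitor for every element. */
    static void forEach(List<String> items, Visitor visitor) {
        for (String item : items) {
            visitor.visit(item);
        }
    }

    /** Returns the first item with the given prefix, or null if none. */
    static String firstWithPrefix(List<String> items, final String prefix) {
        try {
            forEach(items, new Visitor() {
                public void visit(String item) {
                    // Under the proposal this might instead read:
                    //   return firstWithPrefix : item;
                    if (item.startsWith(prefix)) throw new Found(item);
                }
            });
            return null;
        } catch (Found f) {
            return f.value;
        }
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("apple", "banana", "blueberry");
        System.out.println(firstWithPrefix(words, "b"));
    }
}
```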

Closure conversion

We propose the following closure conversion, to be applied only in those contexts where boxing currently occurs:

There is a closure conversion from every closure of type T to every interface type that has a single method with signature U such that T is a subtype of the function type corresponding to U.

We will want to generalize this rule slightly to allow the conversion when boxing or unboxing of the return type is required, e.g. to allow assigning a closure that returns int to an interface whose method returns Integer or vice versa.

Note: The current Java idiom for capturing a snippet of code requires the use of a one-method interface to represent the function type and an anonymous class instance to represent the closure:

        public interface Runnable {
           void run();
        }
        public interface API {
           void doRun(Runnable runnable);
        }
        public class Client {
            void doit(API api) {
                api.doRun(new Runnable(){
                   public void run() {
                       snippetOfCode();
                   }
                });
            }
        }

Had function types been available when this API was written, it might have been written like this:

        public interface API {
            void doRun(void() func);
        }

And the client like this:

        public class Client {
            void doit(API api) {
                api.doRun(() {snippetOfCode(); });
            }
        }

Unfortunately, compatibility prevents us from changing existing APIs. One possibility is to introduce a boxing utility method somewhere in the libraries:

        Runnable runnable(final void() func) {
            return new Runnable() {
                public void run() { func(); }
            };
        }

Allowing the client to write this:

        public class Client {
            void doit(API api) {
                api.doRun(runnable(() {snippetOfCode(); }));
            }
        }

This may be nearly good enough from the point of view of how concise the usage is, but it has one more serious drawback: every creation of a Runnable this way requires that two objects be allocated instead of one (one for the closure and one for the Runnable), and every invocation of a method constructed this way requires an extra invocation. For some applications -- for example, micro-concurrency -- this overhead may be too high to allow the use of the closure syntax with existing APIs. Moreover, the VM-level optimizations required to generate adequate code for this kind of construct are difficult and unlikely to be widely implemented soon.

The closure conversion is applied only to closures (i.e. function literals), not to arbitrary expressions of function type. This enables javac to allocate only one object, rather than both a closure and an anonymous class instance. The conversion avoids any hidden overhead at runtime. As a practical matter, javac will automatically generate code equivalent to our original client, creating an anonymous class instance in which the body of the lone method corresponds to the body of the closure.

Example

We can use the existing Executor framework to run a closure in the background:

 void sayHello(java.util.concurrent.Executor ex) {
     ex.execute((){ System.out.println("hello"); });
 }
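
For reference, this is roughly the code that the closure conversion would have javac generate for sayHello: an anonymous Runnable whose run() body is the closure body. The sketch is written in the current language so it can be compiled today. ConversionDemo and DIRECT are hypothetical names, and the version here appends to a buffer rather than printing so that the effect can be checked.

```java
import java.util.concurrent.Executor;

// Roughly what the closure conversion would generate for sayHello:
// an anonymous Runnable whose run() body is the closure body.
// ConversionDemo and DIRECT are hypothetical names for this sketch.
class ConversionDemo {
    /** An Executor that runs each task immediately on the calling thread. */
    static final Executor DIRECT = new Executor() {
        public void execute(Runnable task) { task.run(); }
    };

    /** Appends to a buffer instead of printing so the effect is observable. */
    static String sayHello(Executor ex) {
        final StringBuilder out = new StringBuilder();
        // Only one object is allocated here: the Runnable itself.
        ex.execute(new Runnable() {
            public void run() { out.append("hello"); }
        });
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(sayHello(DIRECT));
    }
}
```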

Further ideas

We are considering allowing omission of the argument list in a closure when there are no arguments. Further, we could support a sugar for calls to functions whose last argument is a zero argument closure:

 void foo(T1 p1, ..., Tn pn, R() pn+1) {...}

could be called as

 T1 a1; ... Tn an;
 foo(a1, ..., an){...};

where the call is translated to

 foo(a1, ..., an, {...});

In the special case where there is only one argument to foo, we also would allow

 foo{...}

for example

 void sayHello(java.util.concurrent.Executor ex) {
     ex.execute { System.out.println("hello"); }
 }

We are also experimenting with generalizing this to support an invocation syntax that interleaves parts of the method name and its arguments, which would allow more general user-defined control structures that look like if, if-else, do-while, and so on.

This doesn't play well with the return statement being given a new meaning within a closure; it returns from the closure instead of the enclosing method/function. Perhaps the return from a closure should be given a different syntax:

 ^ expression;

With this, we probably no longer need the nonlocal return statement.

Acknowledgments

The authors would like to thank the following people whose discussions and insight have helped us craft, refine, and improve this proposal:

Lars Bak, Cédric Beust, Joshua Bloch, Martin Buchholz, Danny Coward, Erik Ernst, Christian Plesner Hansen, Doug Lea, "crazy" Bob Lee, Martin Odersky, Tim Peierls, John Rose, Ken Russell, Mads Torgersen, Jan Vitek, and Dave Yost.

35 comments:

Marcin 'Qrczak' Kowalczyk said...

Sun finally changed its mind (bindings and assignments)? Inner classes are no longer an acceptable substitute (About Microsoft's "Delegates")? I'm glad, but it's sad that this was so late.

Tim Vernum said...

I'm not sure I understand the reasoning behind adding the invokeFunction method to Class.
Is this just a way of avoiding having a supertype for all function objects? That seems quite ugly.

It seems that I need to call:
doSomething.getClass( ).invokeFunction( doSomething, arglist );
rather than:
doSomething.invoke( arglist )

Clearly the latter is a better use of objects, why was the former chosen?


I don't see the point of the syntactical sugar in the "Further ideas" section. Does ex.execute { System.out.println("hello"); }
really provide enough value over
ex.execute ( (){ System.out.println("hello"); });
to complicate the language that much?


I don't really see the value of the non local return - do you have any interesting use cases?
I'm not sure I like that calling
String message = getMessage();
could actually force my method to return if getMessage is a function variable rather than a method call.
It greatly increases the number of possible exit points from a method - what is the intended benefit from such a complexity?

Also, the lack of continuation support seems to make this feature less useful than it could be anyway.
Sure it's nice that I can call break and continue from within my closure, but that seems like half an implementation of co-routines. If you're going to go halfway there, then surely we can get a full implementation.

Neal Gafter said...

The reason to add invokeFunction to class Class is that Class is final, and it must have reflection support for all object types. Currently it has reflection support methods for classes and interfaces, arrays, and enums, even though only one of those sets of methods is applicable at any given time. Adding a new reference type necessitates adding reflection support for it as well.

The invokeFunction method is part of class Class, not a member of the function object. To invoke a function object directly you simply place a pair of parentheses after it with the arguments between the parens. Class.invokeFunction is mainly intended for metaprogramming applications like debuggers. It isn't something you would normally see in an ordinary program.

Hanson Char said...

Since Function Type is added to support local function and closure, can all methods (at least static methods) be generalized as function type, so that even non-local (static) methods can be assigned to a function type variable, for instance ?

Eirik Maus said...

Congratulations on a well thought through proposal. I say it is very much needed, too. Closures (or "local methods" and "method parameters" to methods) are the two really useful features for experienced programmers that existed in Simula but still haven't made it to Java.

One question: Aren't non-local returns more confusing than helpful?

Why should a closure be allowed to exit not only its own method, but also the method that called it (or its definition's enclosing method two callers up)?

Closures are there to replace the somewhat cumbersome use of one-method interfaces, like Comparator, like in

void myMethod() {
....
Arrays.sort(myList, new Comparator(){public int compare(Object o1, Object o2) {
return o1.hashCode() - o2.hashCode(); // or something
}});
...
}

I understand and appreciate the need to replace the Comparator with a closure, but can there be any use case for the Comparator.compare() method to return from anything but itself (i.e. force the return of sort() or myMethod()) without the use of an exception?

I mean, java is already quite hard to learn if you haven't been following it from the beginning. We are already trying to re-train a bunch of Cobol-staff in our java enterprise projects. The more fancy or automagical behaviours you can accidentally trigger via typos, the harder that task becomes. Are those returns really necessary?

Anonymous said...

Could you expand on the uses of non-local returns? Tcl treats return/break/continue as exceptions, which always seemed the right thing to me as it allows you to use the normal exception handling machinery to control propagation.

Overall, this looks very promising though. I've long wanted a neater way to abstract over lock-handling patterns in concurrent code, and this looks like it will allow that nicely (i.e., writing my own "synchronized").

Karsten Wagner said...

I think your proposal is a very bad idea. I quote Guy Steele here who wrote:

"Please remember that the design of a programming language consists
not just in a laundry list of features, but also in making judicious
choices of what to *omit* and, more importantly, in establishing
design principles that are easy to understand."

If you add another possibility to create closures, we have several problems:
- Two different language features which provide the same abstraction: choosing between them requires thinking time from the programmers and thus reduces productivity.
- It also makes code reuse more difficult: what if one uses inner classes and another uses your approach?
- Lack of lib support: it's impossible to change the existing libs to use closures only, so the best thinkable thing would be to create additional methods with closure use for the most relevant parts of the standard lib. But even then we would have two different features for the same thing and lots of duplicate code in the standard libs.
- Lack of additional important features. If you really want to use closures well you also need multiple return types, so this has to be an additional addition to the language.


If the requirement of 'boxing' finals is really a problem for you people, why not simply add a syntax for this:

transient int x = 10;
x = 20;

The above definition would be internally rewritten to

final int[] x = new int[] { 10 };
x[0] = 20;

With this little sugar it's easy to use any variable from an inner class, and it could also be used for call by reference.

Also there could be a little sugar for the call site:

instead of

doSomething(new Closure() { int eval(int val) { return val + 1; }})

maybe allow to write something like

doSomething(new Closure()(int val) { return val + 1; })

for classes which have only a single method. This wouldn't be perfectly short, but it is also only a simple addition.

Jesse Glick said...

If you want to simplify usages such as those mentioned in

http://java.sun.com/docs/white/delegates.html

then you must make sure that closure -> inner class promotion respects generic type parameters. E.g.:

List<String> words = ...;
Arrays.sort(words, (String s1, String s2) : s1.toLowerCase().compareTo(s2.toLowerCase()));

The compiler must be able to promote int(String,String) to Comparator<String>.

Roman Elizarov said...

Neal, "non-local" return is a great feature to have (and it should be supported by local and anonymous classes, too), but it must be checked; that is, the fact that a method does any kind of non-local return must be explicitly specified in its signature or implicitly added to its signature via a type inference mechanism. It must work with generic non-local return types, too. That will be required to support non-local returns from things like executor.execute(), where you'd like to write this method in such a way as to catch a non-local return on one thread and pass it to another thread. It should be designed to avoid accidental unchecked exceptions. For example, take the code from your further examples:

Executor ex;
boolean method() { ex.execute { /* non-local */ return true; } }

Existing version of "Executor.execute" does not know anything about non-local return and this code will end up with uncaught unchecked exception in executor thread. It means, that methods like "execute" shall be designed with non-local return in mind (to pass result to a correct thread) and this fact must be properly reflected in their [generic] signature, too. If this problem is not addressed, I'd say NO to non-local returns.

Additionally, I would like to see this proposal split in two parts. In the first part, "final" constraints shall be lifted for local and anonymous classes, and non-local returns (maybe) and other syntactic sugar from your "further ideas" added for local and anonymous classes. That first part, in itself, will add considerable power to the Java language -- almost all your examples will become considerably simpler as intended. Only when you play with it (say introduce it in 1.7) and you find that it is not enough, you should consider adding a new non-orthogonal concept (functions) into the language (say in version 1.8).

Sven Mawson said...

It seems like this proposal is trying to wrap too many different additions into one. We have the functions-as-objects part, which gives us local functions, function pointers, etc, and then we have the closures part, which in and of itself is an incredibly complex feature.

Looking at the function pointer part, I believe this proposal is much more complex than it needs to be. Instead, I'd propose a much simpler syntax that just solves the immediate problem of overuse of single-method interfaces.

1) Add a "function" keyword that lets a programmer define top-level functions in a file.

2) Add syntax support for creating and passing these function objects.

As an example, here is a current-style single-method interface-as-function:

interface MyInterface {
    public int myMethod(int a, int b);
}

class Test {
    public void test() {
        doSomething(new MyInterface() {
            public int myMethod(int a, int b) {
                return a * b;
            }
        });
    }

    private void doSomething(MyInterface func) {
        System.out.println("Got " + func.myMethod(2, 3));
    }
}

Here is what this would look like with top-level method support:

function int myFunction(int a, int b);

class Test2 {
    public void test() {
        doSomething(new myFunction(a, b) {
            return a * (2 * b);
        });
    }

    private void doSomething(myFunction func) {
        System.out.println("Got " + func(2, 4));
    }
}

Now, one of the problems with this solution is that a function doesn't automagically stand in for all other functions (or single-method interfaces) with the same signature. But if you allow:

function int myFunction(int a, int b) extends bobsFunction, joesFunction;

Then you don't have a problem. Alternatively, you could do the same as the current proposal, which is to make this just work correctly, but that seems against the principle of strong-typing.


Now, onto the closures aspect. When I think of a closure, I think of a partially-bound function call. But this is trivial in a syntax where each function is strongly-typed. Once you have the function with all arguments, it automatically defines the rest of the functions you need for closures:

function int myFunction(int a, int b);

Also defines:

function Function myFunction(int a);

Where the returned function would be an anonymous function of type:

function int anonymousFunction(int b);


Anyway, I think the current proposal is much too complicated, and way too convoluted. We need something nice and simple that doesn't modify the language too much, but gives us the flexibility we need to get the job done.

John Rose said...

Hi Neal. For my part, I think Java needs (a) better closures and (b) slightly better type parameters but not (c) function types. I've just blogged on point (c). Cheers.

Neal Gafter said...

John: Your approach doesn't work if a closure throws more than one exception type.

MAEDA Atusi said...

Nitpicking comment: Closures are not anonymous functions.

In standard computer science terminology, the word closure means a function that refers to lexical variables bound by outer context.

So, in this proposal, local function can also be a closure. Please do not distort the usage of already existing (widely accepted) technical terms.

Axel Rauschmayer said...

Single-method interfaces really seem awfully close to what you propose. Isn't the problem mainly at the client (invocation) side? Then all you'd need would be some clever syntactic sugar for concisely/anonymously implementing one-method interfaces.

Where I do fully understand your desire to extend Java is that I'd also really like a simple way to abstract iteration constructs. But then your support of "break" and "continue" is too simple. I'd also like something along the line of Python's generators (=simplified coroutines) to make iterator implementation less painful.

I think it is *good* news that Sun is considering extending the language again, as long as things remain simple (I think annotations, generics and the new for loops were great successes in Java 5). Features I still sorely miss in Java are:

- Code-only multiple inheritance. We have interface-only inheritance with interfaces and interface+code single inheritance with classes. Code-only multiple inheritance seems to be missing.

- More things that should be borrowed from other languages are: multi-methods (MultiJava), open classes (MultiJava), map/list literals.

Tom said...

Can't we make the notation simpler by introducing anonymous methods?

Like this: every interface that defines just one method can be used in a short-hand notation.

For example:
interface x{ public String y(int z);}

may be used as:
new x(int z){...};

I haven't thought it all through yet, but the one-method-allows-anonymous-method idea seems interesting.

Neal Gafter said...

Tom: see my later blog post about use cases. The purpose of the proposal is not just to make it more convenient to do some things that are currently possible, it is to make it possible to do things that are currently impossible.

Anonymous said...

If you do that, shouldn't you do method invocation? The ugliest thing about one-function anonymous inner classes is the declaration of the type of the method and the method body.

I'd like something like:

void do(){
list.map(functionName(arg1));
}

@Closure
int functionName(arg1){
return arg1 * 2;
}

I don't know; people might confuse the thing with a real method invocation.
Currying is what looks exciting about this.
Pity it's probably not on the table.

Brennan said...

Hmm, I think that some people are forgetting what made Java so successful...that it was simpler and more elegant than C++ (it was only a few years ago, was it not, that Microsoft ran an ad in several computing journals bragging that their C++ compiler was greater than 90% ANSI compliant?). Some very well-intentioned people are leading Java down this same path of complexity that may well cause programmers to jump ship in the future for a more simple and elegant language. Not that I blame the people who are stewarding Java, having to sit between people who are complaining on one hand that Java is too complex (J2EE) and on the other that it doesn't have more features like continuations or closures. But I think those who are altering Java now should be very careful: whatever decisions are made now will be forever stuck in the language due to the need for backward compatibility.

Instead of simply borrowing features from other languages and grafting them onto Java (Generics comes to mind), you should re-think them in terms of how they can fit organically within the design of the language. Function pointers simply do not fit with the spirit of OOP or Java. Closures are promising, they certainly do solve certain types of problems more elegantly, but there has to be a way to make them fit better within Java. An abbreviated syntax for anonymous inner classes, perhaps? Just a thought.

Tom said...

Hear hear. That is what my suggestion was about: what are we actually solving?

Generics are required to get strong typing correct, but local functions do not make me happy.

Peter C said...

Oh, good lord, no!

I'm sure these are great features in the languages that had them by design from the start. But adding them now to Java makes the syntax hideous and the API confusing (do they provide duplicate methods using the new features, ignore them or what?). Our nice simple explicit language becomes a pastiche of all sorts of others.

Generics are a distressingly complicated addition to a previously-simple language, but I do think they just about provided enough benefit to justify it. And they helped to reinforce collections, something that is a core element of how one programs in Java. In contrast, the proposed new features are a big departure from how one currently programs in Java.

If you don't like Java, use another language. Don't mess up Java.

Avner said...

Maybe there is a way you can change existing APIs to accept function types without breaking backward compatibility. I've written a short description of my suggestion for .

Sven Efftinge said...

How about inferring the types of the arguments if possible?
I would like to write:

myListOfPersons.select((p) : p.age>=18)

instead of

myListOfPersons.select((Person p) : p.age>=18)

Anonymous said...

Can anyone prove to me mathematically the power that closures add to the Java language? If it adds no power why complicate the language?

As far as applying filters to collections of objects goes (as in the examples provided by Martin Fowler in his Closure blog), there are various nice object-oriented approaches that are both dynamic and extensible: Predicates, JXPath-based filters, etc.
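For concreteness, a Predicate-style filter of the kind mentioned above might look roughly like the following sketch. The names ElementPredicate, filter and atLeast are made up for illustration and are not taken from any particular library.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical one-method interface (an assumption, not a JDK or library type).
interface ElementPredicate<T> {
    boolean accept(T value);
}

class PredicateFilterDemo {
    // Returns the elements of input that the predicate accepts, in order.
    static <T> List<T> filter(List<T> input, ElementPredicate<T> predicate) {
        List<T> result = new ArrayList<T>();
        for (T item : input) {
            if (predicate.accept(item)) {
                result.add(item);
            }
        }
        return result;
    }

    // A reusable predicate object: accepts integers greater than or equal to limit.
    static ElementPredicate<Integer> atLeast(final int limit) {
        return new ElementPredicate<Integer>() {
            public boolean accept(Integer value) {
                return value >= limit;
            }
        };
    }

    public static void main(String[] args) {
        List<Integer> ages = Arrays.asList(12, 18, 35, 7);
        System.out.println(filter(ages, atLeast(18))); // prints [18, 35]
    }
}
```

The predicate is an ordinary object, so it can be named, stored and reused; the cost is the anonymous-class boilerplate around a one-line test.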

What reusability does an anonymous class provide? In the same vein, what reusability will an anonymous method provide?

In fact, instead of the marginal (and already available in more object oriented forms) functionality that closures provide, they also provide all the horrors of GoTo - which Java has avoided so far.

Or is this a way of implementing GoTo and introducing all the spaghetti mess of a Basic program into the J2EE environment?

Neal Gafter said...

Most programming languages are Turing-complete, so the power of closures does not arise from their ability to compute things that cannot otherwise be computed. Rather, their expressive power arises from being able to abstract over things that you simply could not abstract over before. In the case of closures, it is the ability to define APIs that abstract over an arbitrary statement or expression. Writing such APIs simply isn't possible in Java today.
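One way to see the limitation is to try writing such an API with an anonymous class. The sketch below is only an illustration, not code from the proposal; withLock and firstNegative are hypothetical names. The block passed to withLock cannot return from the enclosing method or assign to its locals, so the result has to be smuggled out through a holder array.

```java
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantLock;

class WithLockDemo {
    // Hypothetical helper: runs body while holding lock, always unlocking afterwards.
    static void withLock(Lock lock, Runnable body) {
        lock.lock();
        try {
            body.run();
        } finally {
            lock.unlock();
        }
    }

    // Returns the first negative value in values, or 0 if there is none.
    static int firstNegative(final int[] values, Lock lock) {
        // The anonymous class cannot assign to a local variable of this
        // method, so a one-element array carries the result out.
        final int[] found = {0};
        withLock(lock, new Runnable() {
            public void run() {
                for (int v : values) {
                    if (v < 0) {
                        found[0] = v;
                        return; // returns from run(), not from firstNegative
                    }
                }
            }
        });
        return found[0];
    }

    public static void main(String[] args) {
        System.out.println(firstNegative(new int[] {3, -4, -9}, new ReentrantLock())); // prints -4
    }
}
```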

Anonymous said...

Since we are agreed that most languages are Turing-complete, it should follow that the ability to define APIs that abstract over an arbitrary statement or expression can be expressed in other, more object-oriented ways. For example, all of the examples provided by Martin Fowler on his blog can be rewritten using Predicates or JXPath in a compact AND object-oriented way. I have yet to see an example that cannot be expressed in an object-oriented way with equivalent simplicity and greater reusability.

What you've said simply argues a functional programming stand-point. This is orthogonal to more important issues like consistency, readability and most importantly, maintainability.

Strongly typed languages avoid certain features otherwise available in weakly typed, dynamic scripting languages exactly for this purpose: to ensure maintainability regardless of the programmer who spews out the code. This should be clear on comparing any sizeable project in PHP with a similar-sized project in Java. The same is true for the high-level vs. low-level language argument. For example, C is an order of magnitude more maintainable than assembly and Java is an order of magnitude more maintainable than C.
Reusability, maintainability and consistency should be key considerations for any high-level language playing a part as influential as Java is playing in the enterprise application arena.

I see the move to features supported by weakly typed dynamic languages as a move away from these principles.
You are making it easier than ever before to write spaghetti code.

While closures can be used to express some things in a more terse syntax, one must remember that the use of terse syntax in a solution does not make the solution more mathematically elegant than the equivalent object-oriented solution. Nor does it necessarily improve the solution's readability or maintainability. On the contrary, the ability of closures to be used as function pointers does exactly the opposite for readability, maintainability, consistency and mathematical elegance.

Igor Sereda said...

Why not make it possible, as an option, to omit FormalParameters from closure literals? Argument types may be inferred from the expected result and argument names may be either declared in the receiving type

int(int x) func = {x + 1};

or they may have pre-defined special names (e.g. "a", "b", "c" ...)

int(int) func = {a + 1};


Otherwise you'll have the same sort of code bulkiness that is apparent when Generics make you repeat type params:

List<String> list = new ArrayList<String>();

It's not much overhead unless you have multiple type parameters with long, self-commenting class names.

Another thing of concern is the runtime impact of using closures and function types. What's bad with the anonymous classes is that they require a class for base interface, a class per anonymous usage in code and an object to be initialized.

Will function types be implemented with classes?

Will an object be created each time this code is invoked:

doIt((){System.out.println("Hello");});


Overall, I think closures are a good thing to have in Java. In our project, we use anonymous classes a lot (the number of classes is about twice the number of files), and I'm looking forward to throwing away unneeded classes and lines of code.

Home equity loan said...

Java is already quite hard to learn if you haven't been following it from the beginning. We are already trying to retrain a bunch of COBOL staff in our Java enterprise projects. The more fancy or automagical behaviours you can accidentally trigger via typos, the harder that task becomes.

Bojan Antonovic said...

I was quite happy when Sun made a minimalistic Java and removed function pointers. Closures can be simulated by implementing an interface that holds one or multiple(!) functions/methods.

Closures are not my most demanded feature for Java. I would like to have an erasure-free and covariant (!) generics implementation. Further, I would like to see conditions as a generalization of type checks to increase safety: I want to have a collection of numbers that are greater than 5 or prime, or strings with a certain pattern. A cast would be a condition check. But even more missing are const from C++ and the nonnull attribute.

Java shouldn't implement each nice idea from other languages. In particular, recent years have seen a big hype around dynamic languages, which are impossible to debug at large scale. Java should stay where it is. It can support the development of robust software with a minimal(!) change. A DSL (domain-specific language) will always be necessary. So let us have the dynamic language hype for now, and see what the best parts of it are.

In the case of doubt Java should be frozen to not become a second C++! Sun's sandbox is Groovy. Separate Java from Groovy!

Fireblaze said...

What problem do closures solve that is not possible to solve using Java 1.6?

If none, then closures should not be in Java.

Neal Gafter said...

@Fireblaze: see http://www.bejug.org/confluenceBeJUG/display/PARLEYS/Closures+for+Java

Steve Morin said...

I would love to see closures in java. Yahoooooo!!!!

David Foster said...

Most of this looks solid. One detail I have to quibble over is the introduction of vertical bars to delimit throws-types. This seems somewhat ugly and inconsistent with the look & feel of the existing mechanisms (e.g., implements TypeList, throws TypeList).

I suggest resolving ambiguity by requiring the use of parentheses around function-types if they have a throws clause with multiple elements.

Sample declarations:
--int(int) inc = (int x) { return x+1; };
--int(String) throws NumberFormatException parse = Integer.parseInt;
--(Object(String) throws IOException, ClassNotFoundException) make = (String beanName) { return Beans.instantiate(Object.class, beanName); };

Grammar changes:
--Type:
----Type ( TypeList ) { throws Type }
----( Type ( TypeList ) { throws ThrowsTypeList } )
--ThrowsTypeList:
----Type
----ThrowsTypeList , Type

On another tack, I think that non-local transfer is probably a bad idea. It seems rather complex to learn, difficult to compile, and difficult to provide tool support for. If you could provide a reference implementation of a compiler that supported this feature, however, I'd consider it.

Michael Maraist said...

point 1: lexical closures are BUGGY

point 2: Java already facilitates highly useful pseudo-closures, but it's syntactically verbose. So all we need is shortcut compilation syntax

point 3: the shortcuts can inspect the data-types of their context to reduce things down to a token + symbol-names + raw block-code. I'm referring to Anonymous class creation shortcut techniques here.

point 4: references to methods already exist as java.lang.reflect.Method.


On point 1..

for (i = 0; i < divs.length; i++)
{
divs[i].onclick(function (e) { e.id = i; } );
}

Is a BUG!!

While often surprised, I've been SAVED from bugs by being forced to produce final local values in java (just like if (x) requires a boolean expression)

for (i = 0; i < divs.length; i++){
final int id = i;
divs[i].onclick(new OnDiv() {public void run(Div d) { d.id = id; }});
}
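A self-contained version of the same idea, as a sketch: the Div and OnDiv types above are not shown, so this uses Runnable and a list instead (FinalCaptureDemo and runCallbacks are illustrative names). Each callback captures its own final copy of the loop index, so running the callbacks after the loop still yields one distinct value per iteration.

```java
import java.util.ArrayList;
import java.util.List;

class FinalCaptureDemo {
    // Creates one callback per index, runs them after the loop has ended,
    // and returns the values the callbacks observed.
    static List<Integer> runCallbacks(int count) {
        final List<Integer> seen = new ArrayList<Integer>();
        List<Runnable> callbacks = new ArrayList<Runnable>();
        for (int i = 0; i < count; i++) {
            final int id = i; // a fresh final variable for every iteration
            callbacks.add(new Runnable() {
                public void run() {
                    seen.add(id); // captures the per-iteration copy, not i
                }
            });
        }
        for (Runnable callback : callbacks) {
            callback.run();
        }
        return seen;
    }

    public static void main(String[] args) {
        System.out.println(runCallbacks(3)); // prints [0, 1, 2]
    }
}
```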

On point 2...

All this function is missing is conciseness. But in this case, we have an advantage, we know the EXACT method signature we're looking for. Why duplicate it all? We are, unfortunately missing the parameter-names, so we have to pass that in.. Also we need to disambiguate control-flow various legal operations within the function call.

I see most people here suggesting:
void(Div) : (d) return x;

The void, Div and the fact that we need a return are all redundant. We can remove all of that in the syntax.

SingleMethodAnonymousClass: "{?" arg_list ":" block_code "}" ;

MultiMethodNonOverloadedAnonymousClass: "{?*" method_block_sequence "}" ;

MultiMethodOverloadedAnonymousClass: "{?+" overloaded_method_block_sequence "}" ;

overloaded_method_block_sequence: overloaded_method_block
| overloaded_method_block_sequence overloaded_method_block
;

method_block_sequence: method_block
| method_block_sequence method_block
;

overloaded_method_block: method_sym "(" def_arg_list ")" "{" block_code "}" ;

method_block: method_sym "(" arg_list ")" "{" block_code "}" ;

arg_list: sym
| arg_list "," sym
;

def_arg_list: type sym
| def_arg_list "," type sym
;

The {? symbol is the new syntactic sugar which defines the alternate code-block flow

So for trivial single-method operations we get:

for (i=0;i< divs.length;i++) {
final int id = i;
divs[i].onclick({?d:d.id=id})
}

For more complex multi-methods like Comparator you get:
Collections.sort(list,{?* compare(o1,o2) { return o1 < o2 ? -1 : o1 == o2 ? 0 : 1; } equals(o) { return this==o; }});

This is still verbose, so I propose auto-boxing equals and hashcode for certain flagged classes. Possibly with a new annotation @AutoBoxEquals @AutoBoxHashcode. The Comparator class would do this. Then our sort becomes the nice-clean looking:

Collections.sort(list,{?o1,o2:return o1 < o2 ? -1 : o1 == o2 ? 0 : 1;});

I'm on the fence about explicit 'return'. Many scripting languages assume the last statement is used as the return value, and you only need 'return' if returning earlier (in an if-statement). The question is whether it's better to default to 'return null'/'return false'/'return 0' (i.e. a signature-appropriate return value), or to try the last statement and, if it isn't compatible with the signature, fall back to return null/return false/return 0. But this can lead to bugginess, so perhaps being slightly less flexible is best. The advantage of a default return null is that many functors don't expect a return value, but the interface wanted to be as flexible as possible (such as explicit callbacks).

Next..
I agree that a lexical closure CAN be useful, I've often had to use a boxing object or an array

int[] i = new int[] {1};

This isn't horribly bad, just visually annoying. I've typically created a generic object:

class Generic< T > { public T d; Generic(T d){this.d=d;}}

Generic< Integer > i = new Generic< Integer >(1);
i.d = xxx;

But this is even more verbose. How about an explicit class which at compile-time is converted to one of the above models, either a Generic data-holder or size-1 object-array.

int x = 5;
java.lang.Closure i = x;
i=8;

Closure myint = 5;
Closure mylong = 5L;
Closure mystr = "abc";
Closure mynum = new MyClass(..);

This replaces Closure with a compile-time best-guess of the data-type. (defaulting to type object).

Dereferences of the variable really perform:

mysym.data or mysym[0], whichever is deemed the more practical approach.

for (Closure i = 0; i < numDivs; i++)
{
div.onclick({?div:div.iholder=i;});
}

The above is obviously contrived and useless. So here's:


interface MyClass { ... public int val(); }

int findMin(MyList< MyClass > list)
{
Closure min = 0;
list.each({?o:
min = Math.min(min,o.val());
});
return min;
}
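Desugared by hand to the size-1 array model, the findMin idea can be written in current Java roughly as follows. This is a sketch under assumptions: ItemVisitor and each are hypothetical stand-ins for the list's each method, the list holds Integer values directly, and the running minimum starts at Integer.MAX_VALUE rather than 0 so that lists of positive values give a meaningful result.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical one-method callback interface (an assumption for this sketch).
interface ItemVisitor<T> {
    void visit(T item);
}

class HolderDemo {
    // Stand-in for list.each(...): calls the visitor once per element.
    static <T> void each(List<T> list, ItemVisitor<T> visitor) {
        for (T item : list) {
            visitor.visit(item);
        }
    }

    static int findMin(List<Integer> values) {
        // The one-element array plays the role of the proposed Closure
        // variable: the reference is final, the contents are mutable.
        final int[] min = {Integer.MAX_VALUE};
        each(values, new ItemVisitor<Integer>() {
            public void visit(Integer value) {
                min[0] = Math.min(min[0], value);
            }
        });
        return min[0];
    }

    public static void main(String[] args) {
        System.out.println(findMin(Arrays.asList(4, 2, 9))); // prints 2
    }
}
```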

Such easy lambda functions (arguably less syntax than Python, Lisp, etc., since no keyword is needed). We could extend Collections to contain

interface java.util.Mutator< T > {
void mutate(T data);
}

interface java.util.Filter< I, O > {
O filter(I data);
}

Then Collection has:
class Collection< T > {
...
< O > Collection< O > filter(Filter< T,O > f);
void foreach(Mutator< T > m);
}


So
Filter< Integer,Integer > createIncFilter() {
return {?i:return i+1;}
}

Or simply
Filter< Integer,Integer > f = {?i:return i+1;}

Comparator< Integer > c = {?o1,o2:return
o1 < o2 ? -1 : o1 == o2 ? 0 : 1; };

Comparator< Integer > c1 = {?* compare(o1,o2){return compare1(o1,o2);}
equals(o){return equals1(o);}};

Closure closed=false;
Closure flushed=false;
Closure sb = new StringBuilder();
Writer w = {?+
close(){closed=true;}
flush(){flushed=true;}
write(char[] cbuf,int off,int len) {
sb.append(cbuf, off, len);
}
};
w.write("hello");
w.close();
if (closed) { return sb.toString(); }


It's arguable whether using {?+ should force ALL method parameters to be explicit v.s. only requiring ambiguous ones. I don't think it would be that difficult to separately detect data-types for individual methods. Obviously the return type isn't necessary for extended methods. Though you wouldn't be able to create new methods. I think if you're really trying to produce a new method, you should use the existing anonymous inner class format - as it doesn't really save you too much typing otherwise.

On point 4...

I don't see why

y = f(x);
is that much shorter than:
y = f.invoke(x);

Technically, you can use 'java.lang.reflect.Method' to get an assignable pointer to existing code:

Method createMethod() throws NoSuchMethodException {
Object o = new Object() {
public static Object callback(Object o) { /* my special code here */ return null; } };

return o.getClass().getDeclaredMethod("callback",Object.class);
}
Object res = createMethod().invoke(null, o);

But you need to explicitly know the full signature for this to work. Further, the above required a static method and thus [pseudo-]closures don't work.
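A version of the reflection approach that compiles and runs might look like the sketch below. It assumes a named class with a static method instead of an anonymous one; ReflectionDemo, callback and call are illustrative names. Note that the lookup needs the exact parameter types and that the result comes back as an untyped Object.

```java
import java.lang.reflect.Method;

class ReflectionDemo {
    // The target method: static, so no receiver object is needed.
    public static Object callback(Object o) {
        return "called with " + o;
    }

    // Looks up callback by name and full signature, then invokes it.
    static Object call(Object arg) {
        try {
            Method m = ReflectionDemo.class.getDeclaredMethod("callback", Object.class);
            // The receiver argument is ignored for static methods, so null is fine.
            return m.invoke(null, arg);
        } catch (ReflectiveOperationException e) {
            // Lookup and invocation failures only show up at run time.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(call("x")); // prints called with x
    }
}
```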

It would be much easier to use something common in AOP frameworks:

interface MethodInvoker {
Object invoke(Object ... args);
}


Closure c;
MethodClosure mc = {?a,b:/* do stuff with a,b,c */ }
Object res = mc.invoke(a,b);
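With today's syntax, the same MethodInvoker interface can be implemented by an anonymous class, which already captures final locals. This is a rough sketch; MethodInvokerDemo and adder are made-up names, and the casts show the typing that is lost by passing everything as Object.

```java
// Same shape as the interface above, repeated so the sketch is self-contained.
interface MethodInvoker {
    Object invoke(Object... args);
}

class MethodInvokerDemo {
    // Returns an invoker that adds its two Integer arguments to the captured c.
    static MethodInvoker adder(final int c) {
        return new MethodInvoker() {
            public Object invoke(Object... args) {
                // Arguments arrive untyped, so each one must be cast.
                int a = (Integer) args[0];
                int b = (Integer) args[1];
                return a + b + c;
            }
        };
    }

    public static void main(String[] args) {
        MethodInvoker mc = adder(10);
        System.out.println(mc.invoke(1, 2)); // prints 13
    }
}
```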

For an arbitrary function pointer, it's still a little bloated:

final Foo foo;
MethodClosure mc = {?:foo.doX();};
mc.invoke();

Here I'm assuming an implicit return null; Since return foo.doX() would be an error if it returned void.

It is slightly less performant because it's double-indirection of the method call. Theoretically Method.invoke triggers compilation optimizations, but I doubt it makes any difference.

Neal Gafter said...

@Michael Maraist:

re point 1: the current BGGA prototype requires a diagnostic in these situations now, so I no longer believe this is an issue.

re points 2+3: If you try to complete your proposal as a specification, you'll have a hard time specifying the meaning of names within the body of the closure. That's because the compiler has no way of knowing what names are inherited from the supertypes. The type of the closure needs to be computed before it can be used in overload resolution, but in your approach overload resolution needs to be done so you know what type of closure you're semantically analyzing. Thus, this approach doesn't work.

re point 4: reflection is untyped and slow. Closures should be fully typed (including checked exceptions) and fast.

sourcerror said...

"Further, I would like to see conditions as a generalization of type checks to increase safety: I want to have a collection of numbers that are greater than 5 or prime, or strings with a certain pattern."

Actually, this is what closures were invented for.