Monday, February 05, 2007

Closures Spec Update (v0.5)

This post discusses a draft proposal for adding support for closures to the Java programming language for the Dolphin (JDK 7) release. It was carefully designed to interoperate with the current idiom of one-method interfaces. The latest version of the proposal and a prototype can be found at http://www.javac.info/.

We've just updated the Closures for Java specification, bringing it to v0.5. There are two significant changes:

  1. We've dropped the nominal version of the specification. We are no longer maintaining parallel versions of the specification (with and without function types) because the most significant concerns regarding function types were resolved in earlier revisions of the spec.
  2. We added support for user-defined looping APIs. I wrote about this in October 2006, but did not integrate that into the spec until now.
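
As a rough illustration of what a user-defined looping API means, here is a minimal sketch written with the lambda syntax that Java SE 8 later shipped, not with the syntax of this proposal. The class name LoopingApiSketch and the helper eachEntry are hypothetical names chosen for the example. Unlike the proposal's control invocation statements, a lambda body here cannot use break or continue against the library-defined loop.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.BiConsumer;

final class LoopingApiSketch {
    // Hypothetical library method: the iteration strategy lives here,
    // while the caller supplies only the loop body.
    static <K, V> void eachEntry(Map<K, V> map, BiConsumer<? super K, ? super V> body) {
        for (Map.Entry<K, V> entry : map.entrySet()) {
            body.accept(entry.getKey(), entry.getValue());
        }
    }

    public static void main(String[] args) {
        Map<String, Integer> ages = new LinkedHashMap<>();
        ages.put("alice", 30);
        ages.put("bob", 25);
        // The lambda plays the role of the loop body.
        eachEntry(ages, (name, age) -> System.out.println(name + ":" + age));
    }
}
```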

There is now a two-hour version of my Closures for Java talk on video. It is the same as the one-hour version but with questions and answers both during and after the talk.

25 comments:

Fman said...

Great Job !!!!

When will the prototype be available?
I think we are looking for it so badly

axel said...

I assume there's a typo in the "Control Invocation Syntax" section. The second form should not have "for_opt".

I like the "for" methods. Along with the RestrictedClosures, it's a step up from "closures can do everything" to a more differentiated view of their usage patterns. Should there also be a keyword or interface for blocks that will be executed at most once, or exactly once?

And a small gripe: "catch (IOException ex) {}" is certainly not exemplary code. If the block completed normally, the caller should be told about the exception during cleanup. Otherwise, it should at least be logged somewhere.

This leads to the question of how to combine framework exceptions and user exceptions - maybe "throws IOException, E", where E is a type parameter, but if IOException is a subtype of the concrete E, how would the caller tell them apart?

Ben said...

A question about something in the spec:

void sayHello(java.util.concurrent.Executor ex) {
ex.execute({=> System.out.println("hello"); });
}

Is it legal to write this as

void sayHello(java.util.concurrent.Executor ex) {
ex.execute({=> System.out.println("hello") });
}

(i.e. without the ';' after the println call)?

Both Runnable#run and PrintStream#println return void - is this compatible?

A second question: Would it cause problems (in a technical sense - ignoring whether you would consider it a desirable thing) to allow the following:

1) Allow the types of parameters for closure literals to be inferred from their declared type.

eg.

{ int, int => int } add = { i, j => i + j };

and similarly for closures declared as function parameters.

2) Allow the braces to be omitted in the case where there is only an expression in the body of the closure literal, eg

{ int, int => int } add = i, j => i + j;

or for function parameters,

static Iterable<T> filter(Iterable<T> source, { T=> boolean } predicate);
List<String> strings = ...;
Iterable<String> filtered = filter(strings, s => s.length > 1);

Neal Gafter said...

@Axel: no, that is not a typo. You can have a loop with no loop variables. You might call such a loop "for nTimes".

@Ben: no, you are not allowed a void result expression. I don't know about your second question.
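
For comparison only: the lambda expressions that Java SE 8 eventually shipped differ from this proposal on both of Ben's points. The sketch below assumes Java 8 or later. The class name SayHelloSketch and the StringBuilder parameter are hypothetical, added so that the effect can be observed.

```java
import java.util.concurrent.Executor;
import java.util.function.BinaryOperator;

final class SayHelloSketch {
    // An expression-bodied lambda may be a method call even when the target
    // method (Runnable.run) returns void; any result is simply discarded.
    static void sayHello(Executor ex, StringBuilder out) {
        ex.execute(() -> out.append("hello"));
    }

    // Parameter types are inferred from the target type, and no braces are
    // needed around a single-expression body.
    static final BinaryOperator<Integer> ADD = (i, j) -> i + j;

    public static void main(String[] args) {
        StringBuilder out = new StringBuilder();
        Executor direct = Runnable::run; // runs the task on the calling thread
        sayHello(direct, out);
        System.out.println(out + " " + ADD.apply(2, 3));
    }
}
```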

axel said...

*ouch* so that's what "_opt" means :-)

Another try: Except for the "legacy" executor, all the examples in the spec use static methods for "for" and "with...". Is this only for library compatibility, or would you actually suggest static methods as the better way to implement control flow abstractions (maybe for optimization reasons, or for backward syntax compatibility?)

I think the abbreviated syntax looks a bit strange when the expression list is empty:

lock.withLock(){=>
System.out.println("hello");
}

makeReader().with(FileReader in:)
makeWriter().with(FileWriter out:) {
// code using in and out
}

The clusters of consonants "(){=>" and ":){" actually make this less attractive than the unabbreviated form, once you get used to the double parens:

lock.withLock({=> System.out.println("hello") })

makeReader().withClose({ FileReader in =>
makeWriter().withClose({ FileWriter out =>
// code using in and out
}) })

Object-oriented dispatch plus closures - can we agree we all like it? :-)

Neal Gafter said...

@axel: there is no fat arrow in the abbreviated syntax. It is not backward compatible to add methods to an existing interface (it breaks existing implementations of the old version).

Tom Palmer said...

I would love to see Ben's "1)" type inference of parameters. I'm not as keen on "2)" optional braces. While C# does both, I'm not so convinced that leaving out the braces is a good balance for Java in terms of brevity vs. clarity.

Stefan Schulz said...

I completely missed the loop abstraction prior to your proposal update. In general it seems a good and simple idea. I assume that using the for keyword in a declaration requires some loop construct within the declared method and causes a compile-time error if there isn't one. If not, what would be the outcome of continue? Same as break?

As the proposal states, break breaks out of the closure. What about the target of a continue, though? Where does it return to? Imagine the invoker of the closure having nested loops. Does it come back to the innermost loop, while break simply breaks out of them all? I think this is hard to understand and needs a closer look, especially as the invoker of a closure is unable to safely handle a continue, i.e., it must always invoke a closure keeping in mind that a continue may occur.

The advantage of the loop abstraction is its neatness, reusing a known loop keyword binding break and continue from the caller's point of view.
The disadvantage of the loop abstraction is that the statement has no control over what happens when break and continue are used, which is somewhat inconsistent with throwing exceptions (or will closures forbid catching exceptions that are declared to be thrown?).

Overall, I welcome the consolidation of the proposal. Just today, I had several situations where I thought that closures would have saved me from twisting my code and passing piles of references between method calls. Thumbs up for closures.

Neal Gafter said...

@stefan: in the nested statement of a control invocation statement with the "for" keyword, "break" breaks from the control invocation statement, and "continue" breaks from the controlled statement. It's in the spec. These nest just like they do with loops.

Stefan Schulz said...

Sorry, my question wasn't clear. I did not mean nested loops in the statement but in the closure invoking method. E.g.:
<K> int for eachColumn(List<List<K>> matrix, {K=>void} block) {
    int count = 0;
    for (List<K> row : matrix) {
        for (K column : row) {
            block.invoke(column);
            count++;
        }
    }
    return count;
}
What would the following have as possible outcomes:
int count = for eachColumn(String column : myStringMatrix) {
    if ("end".equals(column)) break;
    if (column.startsWith("com.sun.")) continue;
    System.out.println(column);
}
The first question would be, what count will be after a break or a final continue. Maybe there is a restriction on loop abstractions to only allow for void as return value (or maybe for should be seen as a loop indicator implying void return). One could surely come up with another example that holds a state that finally gets assigned to a given parameter instead of a final return, resulting in a similar problem.
The second question is, where does the continue go. I assume to the inner for looping on columns while the break would break the for eachColumn completely. But this is not visible to the programmer of the for eachColumn as the invoke() statement can break the program flow as a continue statement may do, so he has to keep that in mind.

Neal Gafter said...

@stefan: the control invocation statement is a statement, not an expression.

A break or continue statement breaks or continues from the nearest LEXICALLY ENCLOSING matching statement. An intervening stack frame with a loop in it is not lexically enclosing.
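
For contrast, the lambdas that Java SE 8 later shipped have no non-local break or continue at all, so a library method has to encode the early exit itself. The following is a minimal sketch under that assumption, not the semantics of this proposal. EachColumnSketch and the convention that the block returns false to stop are hypothetical choices made for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;

final class EachColumnSketch {
    // Hypothetical analogue of eachColumn. The block returns false to ask for
    // a "break"; returning true early from the lambda acts like "continue".
    static <K> int eachColumn(List<List<K>> matrix, Predicate<? super K> block) {
        int count = 0;
        for (List<K> row : matrix) {
            for (K column : row) {
                if (!block.test(column)) {
                    return count; // leave both loops; the method still controls its result
                }
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        List<List<String>> matrix = Arrays.asList(
                Arrays.asList("a", "com.sun.x"),
                Arrays.asList("end", "z"));
        List<String> seen = new ArrayList<>();
        int count = eachColumn(matrix, column -> {
            if ("end".equals(column)) return false;           // like break
            if (column.startsWith("com.sun.")) return true;   // like continue
            seen.add(column);
            return true;
        });
        System.out.println(seen + " " + count);
    }
}
```

In this variant the intervening stack frame does see the early exit, which is exactly what the lexical rule above avoids.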

axel said...

In the artima interview, you say about the combination of OO dispatch and closures, the "each" method: "I don't think there is any point in doing it", "I don't think it does anything for you."

Here's something it might do: If Hotspot inlines the implementation-specific "each" method together with the restricted closure, it will make your code run faster, because no state-machine-based Iterator object has to be allocated.

There's a reason why e.g. ArrayList redefines "indexOf" and "contains".

I understand that java.util collections cannot be changed. But given the chance to develop a new framework, is there a reason NOT to go for dynamic dispatch?

Stefan Schulz said...

Hm, you are right. for is defined for control invocation syntax only. I was mentally copying from your with(lock) example, which also returns the result of the closure's invocation while still being used in a control invocation scenario. Hence, I assume the example is wrong or at least misleading.
This still does not solve the problem of implicit continue ability of an invoke call, adopting the example:
<K> void for eachColumn(List<List<K>> matrix, AtomicInteger ac, {K=>void} block) {
    int count = 0;
    for (List<K> row : matrix) {
        for (K column : row) {
            block.invoke(column);
            count++;
        }
    }
    ac.set(count);
}
Unless I am still mistaken about what a continue within the block will cause. Maybe I am.

Brian said...

Neal,
Would it be possible, using your closure proposal, to create a switch statement that accepts Objects? I have read the spec for 0.5 and am not sure how you would do it with closures, but then again I am not a language designer, so it is easier for me to learn from examples than from language specs. From a developer perspective, what I envision is the following:

objectSwitch(someString){

objectCase("String1"){ doStringOne(); }

objectCase("String2"){ doStringTwo(); }

objectCaseDefault(){ doDefault(); }


}

Is that possible? Can you have nested closures like that and somehow get "someString" into the objectCase without an explicit pass?

If it is not possible to formulate an objectSwitch using that method is there a different way to achieve the same functionality... besides if/else :-)

Stefan Schulz said...

@Brian
If I am not mistaken, one could realize an object-based switch using the argument syntax (not the control invocation syntax), even type-safely, as follows:
static <T> void oSwitch(T o, {T=>{=>boolean}} ... cases) {
    for ({T=>{=>boolean}} c : cases) {
        {=>boolean} block = c.invoke(o);
        if (block != null) {
            if (block.invoke()) {
                break;
            }
        }
    }
}
static <T> {T=>{=>boolean}} oCase(T key, {=>boolean} block) {
    return {T o => key.equals(o) ? block : null};
}
{T=>{=>void}} oDefault({=>boolean} block) {
    return {T o => block};
}
...
int number = 0;
oSwitch(aString,
    oCase("one", {=> number = 1; true}),
    oCase("zwei", {=> false}), // fallthrough
    oCase("two", {=> number = 2; true}),
    oDefault({=> number = -1; true})
);
As one cannot use break here, I simulated it by having cases returning a boolean value indicating to break (true) or to fallthrough (false).

Stefan Schulz said...

I got oDefault wrong. Corrected version:
static <T> {T=>{=>boolean}} oDefault({=>boolean} block) {
    return {T o => block};
}
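
For readers who want to run the idea, here is a minimal port of the corrected version to the lambda syntax that Java SE 8 later shipped. It is a sketch, not the proposal's syntax: Function and BooleanSupplier stand in for the function types, the class name ObjectSwitchSketch and the helper classify are hypothetical, and a one-element array replaces the mutable local because Java 8 lambdas can only capture effectively final variables.

```java
import java.util.function.BooleanSupplier;
import java.util.function.Function;

final class ObjectSwitchSketch {
    // Each case maps the switched value to a block, or to null if it does not match.
    @SafeVarargs
    static <T> void oSwitch(T value, Function<T, BooleanSupplier>... cases) {
        for (Function<T, BooleanSupplier> c : cases) {
            BooleanSupplier block = c.apply(value);
            if (block != null && block.getAsBoolean()) {
                break; // the block returned true: stop, like "break" in a switch
            }
        }
    }

    static <T> Function<T, BooleanSupplier> oCase(T key, BooleanSupplier block) {
        return v -> key.equals(v) ? block : null;
    }

    static <T> Function<T, BooleanSupplier> oDefault(BooleanSupplier block) {
        return v -> block; // matches every value
    }

    // Hypothetical usage mirroring the example above.
    static int classify(String s) {
        int[] number = {0};
        oSwitch(s,
                oCase("one", () -> { number[0] = 1; return true; }),
                oCase("zwei", () -> false), // returns false: later cases are still tried
                oCase("two", () -> { number[0] = 2; return true; }),
                oDefault(() -> { number[0] = -1; return true; }));
        return number[0];
    }

    public static void main(String[] args) {
        System.out.println(classify("one") + " " + classify("two") + " " + classify("zwei"));
    }
}
```

Note that in this port a case returning false does not run the next case's block unconditionally. Later cases are still matched against the value, so "zwei" ends up in the default branch.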

Paul said...

Hi Neal,

Couple of questions about the 0.5 closures spec:

1) Would it be possible, or not too much of a headache, to allow closure literal conversion to a type that is an abstract class with a single abstract method, and not just an interface with a single declared method?

2) In the example containing the interface IntFunction, must the interface's method be called "invoke"? If not, I would suggest renaming it to something like "evaluate", so as to emphasize that closure literals can map to arbitrary interface types with a single method. I understand that "invoke" is the name of the method on interfaces generated from function type syntactic forms...

Tom Palmer said...

Actually, why not change this:

for_opt Primary '(' FormalParameters ':' ExpressionList_opt ')' Statement

to this:

for_opt Primary '(' FormalParameters (':' ExpressionList)_opt ')' Statement

If that made sense. Meaning, make the ':' optional. If we insist on having types in front of all the vars in Java, then take advantage of that. The whitespace between type name and var name would make it clear whether this was an inbound parameter or outbound arg.

And then you do get prettier things like:

file.open(Reader reader) {
// do stuff
}

Tom Palmer said...

As far as readability goes, I think losing the ':' for no args is status quo in Java already with catch clauses. We say this without getting confused:

catch (Exception e) {...

Requiring the colon could be seen as being less Java-like.

Jochen "blackdrag" Theodorou said...

Maybe there is a syntactic reason I don't see for this, but in your talk you had this example:

List< Integer > list = ...;
Integer sum =
Collections.reduce(list, {Integer x, Integer y => x+y});

why is it not possible to do this in the control statement style? For example like:

List< Integer > list = ...;
Integer sum = Collections.reduce(list) {Integer x, Integer y => x+y}

I am well aware that I can't call this a statement then, but have to call it an expression. But OK, let us extend the question: why can't I have control expressions? This way I avoid the smiley style too.

I think the break/continue mechanism isn't influenced by it, because there is no literal loop statement. Or do I miss something here?

From your talk I know you dislike the control statement style, but many are used to this style and the other way looks quite alien to me.

And may I suggest to use -> instead of =>, because if Java will ever gain type inference, then you will have less to change. But I think I already suggested that.

Tom Palmer said...

Jochen, the problem is that you'd have to say this:

Integer sum = Collections.reduce(list) {Integer x, Integer y => x+y};

Especially if you wanted to support things like this:

Integer sumOfOdds = Collections.handyList(list).filter {Integer i => i % 2 == 1}.reduce(list) {Integer x, Integer y => x + y};

And then you'd need semicolons on your blocks:

using (something) {
//...
};

And that would make blocks uglier. I think the direction they are going with the syntax (standard expressions and also abbreviated blocks) is fine.

In Java, endlines aren't significant like they are in Ruby.

Side question, what does "=>" have to do with type inference? Just curious. Meanwhile, I like the look of it for closures.
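
For comparison, the sum and sum-of-odds chains from this exchange can be written with the java.util.stream API that Java SE 8 later shipped. This is a sketch assuming Java 8 or later, not the syntax of the proposal; the class name ReduceSketch and its method names are hypothetical.

```java
import java.util.Arrays;
import java.util.List;

final class ReduceSketch {
    // Fold the list with addition, starting from the identity value 0.
    static int sum(List<Integer> list) {
        return list.stream().reduce(0, (x, y) -> x + y);
    }

    // Keep only values with remainder 1 (positive odd numbers), then fold as above.
    static int sumOfOdds(List<Integer> list) {
        return list.stream()
                .filter(i -> i % 2 == 1)
                .reduce(0, (x, y) -> x + y);
    }

    public static void main(String[] args) {
        List<Integer> list = Arrays.asList(1, 2, 3, 4, 5);
        System.out.println(sum(list) + " " + sumOfOdds(list));
    }
}
```

The lambdas are ordinary method arguments here, so each chain is a single expression and ends with the usual statement semicolon.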

Tom Palmer said...

Just noticed that {=> ...} instead of {...} for block-ish syntax could be the distinguisher for whether a semicolon is needed. That is, the equals-arrow would trigger the difference. Starts getting sort of complicated, but maybe could work.

Side note, there's not even really a need to add "filter()", "reduce()", and friends to List (even if we aren't adding extension methods). We might not necessarily want the same kind of list anyway. Having them just as methods on ArrayList would be good enough to get started and would clarify what type you end up with. (Now, adding the same for Iterables/Iterators would be handy for lazy evaluation - so maybe HandyIterable/HandyIterator classes - picking better names of course.)

Jochen "blackdrag" Theodorou said...

From experience I can say there is no problem for the parser to recognize a closure when => is used in the block. => is valid only there; the comparison operator would be >=.

What might be a problem is a block without =>, like foo() {bar();}

Possible problems are arrays, anonymous inner classes and normal blocks. So let us go through...

Normal blocks are no problem, because it would be then foo(); {bar();} instead.

Arrays... foo = new Object[]{bar();} Doesn't look like much of a problem to me. Both blocks look the same, but the compiler knows what can follow, and a closure block cannot appear here. It might be a question of readability, of course. Well, and there is no type for the closure conversion!

So it would be nice to be able to do:
foo = new Object[]{{bar();},{bar2();}}
But it might not really be needed. (using => makes it much more explicit of course)

Anonymous inner classes...
foo = new Object(){bar();} is not valid syntax. bar() is recognized as the start of a method definition. So unless they want to have such closures for "new" I don't think there is a problem for the parser.

Coming back to the => variant, such a "new" might also complicate the parser, since the parser must match the type and the parameter name before being able to say we are actually in a closure. But that is not so different from working out whether something is a field or a method definition.

So no, I don't expect problems in the parser for this.

If I needed to specify the return type, then the whole construct would become useless, but again I don't think there is a real problem when you don't specify the type. No type inference other than what is already used for methods is needed in this case, I think. And if someone is evil enough to define methods taking closures with the same parameter types but different return types... well, he deserves to get an error when using the closure ;)

Clackwell said...

Neal,

with all due respect, do you ever consider the consequences for complexity and maintenance of such language features?

What happened to keeping it simple?

Do we really need language features that add another level of indirection for the human brain to try to comprehend and keep in mind to do the necessary maintenance on a given piece of code? Is the gain in flexibility really worth it for the majority of the Java target audience?

Do you really think Java developers are clever enough not to horribly abuse/misuse those features?



With best regards

Clackwell

Neal Gafter said...

@Clackwell: to answer your five questions (respectively):

Yes.

Closures will significantly simplify code that would otherwise have to be very complex in order to accomplish the same thing with existing language features.

Closures remove a level of indirection compared to solutions using existing constructs. (did you stop beating your wife?)

Yes.

and Yes.