Monday, December 18, 2006

Closures Talk and Spec Update

This post discusses a draft proposal for adding support for closures to the Java programming language for the Dolphin (JDK 7) release. It was carefully designed to interoperate with the current idiom of one-method interfaces. The latest version of the proposal and a prototype can be found at http://www.javac.info/.

My 2006 JavaPolis talk, "Closures for Java", has now been published as slides synchronized with a soundtrack, and can be viewed online at the new JavaPolis website, Parleys.com. If you're still wondering why all the fuss about closures, I recommend you listen to the talk.

We've also just published an update to the specification, version 0.4. The specification seems to be settling down quite a bit, as the changes are minor:

  • The throws clause of a function type is now placed between the curly braces.
  • The description of function types has been rewritten to emphasize more clearly that function types are interface types rather than a separate extension to the type system.
  • The closure conversion can now convert a closure that has no result expression to an interface type whose function returns java.lang.Void. This change helps support completion transparency. A completion transparent method is one written such that the compiler can infer that an invocation completes normally iff the closure passed to it completes normally.
  • Some examples have been modified to be completion transparent.
  • null is now a subtype of Unreachable, and a number of small related changes. Thanks to Rémi Forax for pointing out the issues.

I hope the talk, which is less technical than the specification, makes it easier for you to evaluate the proposal and compare it to the alternatives.

46 comments:

Marius said...

Here is an example from the closures spec:

public static <T,throws E extends Exception>
T withLock(Lock lock, {=>T throws E} block) throws E {
lock.lock();
try {
return block.invoke();
} finally {
lock.unlock();
}
}

Would it be syntactically correct to just call the block (with no invoke method)?

public static <T,throws E extends Exception>
T withLock(Lock lock, {=>T throws E} block) throws E {
lock.lock();
try {
return block();
} finally {
lock.unlock();
}
}

Neal Gafter said...

marius: no, you must use the method name when invoking a method from an interface. The lookup rules to allow what you suggest were in a previous version of the spec and added unnecessary complexity to the spec.
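A rough analogue for readers who want to run something: the proposal's {=>T throws E} syntax is not accepted by a standard javac, but the shape of withLock can be approximated in standard Java (8 or later) with a hand-written one-method interface. This is only a sketch under that assumption, not the prototype's translation; the names Block and WithLockSketch are made up for illustration. As in the answer above, the block is called through its interface method, block.invoke().

```java
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantLock;

class WithLockSketch {
    // Hypothetical stand-in for the function type {=>T throws E}.
    @FunctionalInterface
    interface Block<T, E extends Exception> {
        T invoke() throws E;
    }

    // Runs the block while holding the lock; whatever the block may throw
    // (the type parameter E) is propagated to the caller.
    static <T, E extends Exception> T withLock(Lock lock, Block<T, E> block) throws E {
        lock.lock();
        try {
            return block.invoke(); // the interface method is named explicitly
        } finally {
            lock.unlock();
        }
    }

    public static void main(String[] args) {
        ReentrantLock lock = new ReentrantLock();
        int result = withLock(lock, () -> 40 + 2);
        System.out.println(result);          // prints 42
        System.out.println(lock.isLocked()); // prints false
    }
}
```

Because the lambda throws no checked exception, the call in main needs no try/catch.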

Marius said...

Thanks for your quick answer :)

I believe I understand the rationale behind this, but it still shouldn't be too difficult for the compiler to figure out that it is a closure we're trying to call and then call invoke under the hood...

Maybe it's just me but it seems more natural to call a closure with no invoke method.

So if we have :

{int=>int} plus2 = {int x => x+2};

how do we call the closure ?

1. int k = plus2(3);
2. int k = plus2.invoke(3);

plus2 is a function type and #1 seems more natural to me at least.

djthomp said...

While #1 seems more natural to me as well, it leads to difficulty with regards to function namespace and variable namespace. Consider the following:

int plus2(int x) {
   /* do stuff, and return an int */
}

{int=>int} plus2 = {int x => x+2};

int k = plus2(3);

The question would be: to which of the above definitions of plus2 does plus2(3) refer? Now, this is something that's solvable; I suppose you could throw a compile-time error about an ambiguous statement. In the end though, I'd prefer #2, since it doesn't even let this problem happen.
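The namespace point can be seen in standard Java, where a method and a field may share a name. The sketch below assumes Java 8 or later and uses java.util.function.IntUnaryOperator in place of the proposed {int=>int}; the class name is made up, and the field deliberately adds 20 rather than 2 so the two can be told apart.

```java
import java.util.function.IntUnaryOperator;

class NamespaceSketch {
    // A method named plus2 ...
    static int plus2(int x) {
        return x + 2;
    }

    // ... and a field with the same name. Methods and variables live in
    // separate namespaces, so this is legal Java.
    static final IntUnaryOperator plus2 = x -> x + 20;

    public static void main(String[] args) {
        System.out.println(plus2(3));            // the method: prints 5
        System.out.println(plus2.applyAsInt(3)); // the field: prints 23
    }
}
```

With an explicit method name on the function object, plus2(3) can only mean the method, which is the property option #2 preserves.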

Tom Palmer said...

Looking great in the current rev, by the way. I hope it gets into Java 7. For now, I'm of the crowd that agrees it will be easier for both newbies and old-timers alike. No need to learn the depths of the spec just to use the features. Closures can make the right way the easy way, and that's good for solid code.

Tom Palmer said...

I just imagined some examples of trying to implement LINQ with this closures proposal. I see some awkwardness in syntax and some potential major pain with Exceptions. I'll try to blog about it soon if I can work through the details nicely.

Ladislav Thon said...

Personally, I don't like this block.invoke() syntax much – seems to me like the invoke() method fell from heaven, appeared from nowhere.

I would like the block() syntax better as well – the problem described by djthomp should definitely lead to compiler error.

But OK, let's move on. Let me make one suggestion: in Ruby one can write block.call() or &block, if I remember well. Why not have something similar in Java? I think the @ sign should do well (both block.invoke() and @block would be possible).

AlBlue said...

Whilst I support the idea of anonymous functions in Java as a good/necessary thing, I don't believe that you should be confusing the term 'closure' with that of an anonymous code block. A closure is specifically a binding of variables, not the syntactical structure of whatever contains those variables. In most cases, it's because of misinterpreting the original LISP example, which used the term 'closure' to mean the binding of the local value to the anonymous function, and not the anonymous function itself.

There's more debate at http://www.eclipsezone.com/eclipse/forums/t86911.html and other comments from others in the formal language community who would prefer a clearer distinction between anonymous code blocks that can refer to variables in the local scope, and the 'closures' that are created from these constructs at run-time.

Mileta said...

Also hope it gets in Dolphin, but functional types version :)

Anonymous said...

Example. The variable declaration

{int,String=>Number throws IOException} xyzzy;

is translated into

interface Closure1<R,A2,throws E> { // system-generated
R invoke(int x1, A2 x2) throws E;
}
Closure1<? extends Number,? super String,null> xyzzy;

it should be
Closure1<? extends Number,? super String,IOException> xyzzy;
right ??

Marius said...

Personally I would love to see a "syntactic sugar" for closures invocation. So besides block.invoke(), a shorthand form such as block() or whatever ... would be really nice.

According to: http://groovy.codehaus.org/Closures

"Closures may be invoked using one of two mechanisms. The explicit mechanism is to use the call() method:

closureVar.call();

You may also use the implicit nameless invocation approach:

closureVar();"

Marius said...

I was wondering if a method can return a function type(probably not ...). But still let's say:

class A{


public static {int,int=>int} doSomething(){
// actually do something ...
return {int x, int y => x*y};
}

public static void main(String... args) {

{int, int => int} func = doSomething();

func.invoke(3,4);

// or better yet: func(3, 4);


}
}

... but then again this is probably more a topic for higher order functions and not closures ...

Any thoughts?

Anonymous said...

@Marius,

As long as a closure is an interface or another Java type, I think it can be used as a return type ~
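For comparison, here is how the same idea looks in standard Java 8 or later, where a function object is just an instance of an interface and can be returned like any other value. This is a sketch, not the proposal's syntax: java.util.function.IntBinaryOperator stands in for {int,int=>int}, and the class name is invented.

```java
import java.util.function.IntBinaryOperator;

class ReturnFunctionSketch {
    // Returns a function object; the return type is an ordinary interface type.
    static IntBinaryOperator doSomething() {
        return (x, y) -> x * y;
    }

    public static void main(String[] args) {
        IntBinaryOperator func = doSomething();
        System.out.println(func.applyAsInt(3, 4)); // prints 12
    }
}
```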

Anonymous said...

I must say that before your talk in Antwerp I was a bit confused as to how closures would fit into Java. The talk cleared that point up, but it still left many questions unanswered. For example, you talk about how "this" *must* work as expected. The scoping rules in Java (and indeed languages like Java) wouldn't allow for "this" to work as expected without some trick similar to using final with AICs.

More importantly, your interview with Bill Venners where you discuss implementation of return was most revealing. I would hope that there is a better way for return to work than having the closure throw an exception. Throwing exceptions is too disruptive to the execution environment to be using them for unexceptional cases such as returning from a method call.

Kind regards,
Kirk Pepperdine
www.javaperformancetuning.com
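To make the concern concrete, here is one way a "return from the enclosing method" can be emulated with an exception in standard Java (8 or later). It is only a sketch of the general technique and makes no claim about how the prototype implements return; ReturnSignal, each and firstLongerThan are invented names. The signal is created without a stack trace, which removes one cost of throwing, though the stack is still unwound.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Consumer;

class NonLocalReturnSketch {
    // Control-flow signal carrying the value to return. Suppression and
    // stack-trace recording are disabled via the 4-argument constructor.
    static final class ReturnSignal extends RuntimeException {
        final Object value;

        ReturnSignal(Object value) {
            super(null, null, false, false);
            this.value = value;
        }
    }

    // A plain looping method that knows nothing about early exit.
    static <T> void each(List<T> items, Consumer<T> body) {
        for (T item : items) {
            body.accept(item);
        }
    }

    // The throw inside the lambda plays the role of "return w" from this method.
    static String firstLongerThan(List<String> words, int n) {
        try {
            each(words, w -> {
                if (w.length() > n) {
                    throw new ReturnSignal(w);
                }
            });
        } catch (ReturnSignal signal) {
            return (String) signal.value;
        }
        return null; // no word was long enough
    }

    public static void main(String[] args) {
        System.out.println(firstLongerThan(Arrays.asList("a", "bbb", "cccc"), 1)); // prints bbb
    }
}
```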

beppegg said...

I'm wondering about the real usefulness of closures...

writing {int x => a + x}, with "a" as a free variable bound at runtime according to the context, seems to me a bit like falling back to "global variables".

It breaks encapsulation (the closure's writer has to know that there will be an integer variable called "a") and increases coupling (the closure's users have to define an integer variable named "a" in order to call the closure).

This seems to me to contradict Java's philosophy of simplicity, which led us to not having multiple inheritance...

Neal Gafter said...

beppegg: closures are lexically scoped, not dynamically scoped.

beppegg said...

what's exactly the difference, with respect to my previous considerations (encapsulation breaking and coupling)?

I think I've missed some piece somewhere! :D

Thanks

Tom Palmer said...

I keep not getting to my detailed analysis, so here's my quick concern on Exceptions and closures. Say I want to save a closure for later, as in:

new TreeSet<String>({String a, String b => b.compareTo(a)})

Comparator doesn't allow checked exceptions, and happily so in this case. There's really no way to explicitly catch them later should they be thrown. The callback is performed on later methods, not necessarily right here.

I don't see any way to accommodate this in the current spec if I want to be able to catch checked exceptions. Maybe I'm missing it, though. (Side comment, I don't necessarily think that this issue should be addressed. The exception support is already too much boilerplate and complication, I think. In my own code, I'd likely just try to avoid things by avoiding checked exceptions where possible.)

What are your thoughts on this?

Neal Gafter said...

tom: the checked exception types that can be thrown by a closure are part of the interface. In the case of a Comparator, you would get a compile-time error if you tried to use a closure that could throw any checked exceptions. In short, closures do not compromise exception checking.
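The same principle can be tried out in standard Java 8 or later. In the sketch below, the hypothetical interface Filter declares its exception as a type parameter, so a checked exception thrown by the function passed in becomes part of what filteredBy throws and must be handled by the caller. A java.util.Comparator, by contrast, declares no checked exceptions, so a lambda that throws IOException would be rejected at compile time when used as one. All names here are made up for illustration.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class CheckedFilterSketch {
    // Hypothetical stand-in for {T => boolean throws E}.
    interface Filter<T, E extends Exception> {
        boolean invoke(T value) throws E;
    }

    // The filter's exception type E is part of this method's signature.
    static <T, E extends Exception> List<T> filteredBy(List<T> src, Filter<T, E> filter) throws E {
        List<T> out = new ArrayList<>();
        for (T item : src) {
            if (filter.invoke(item)) {
                out.add(item);
            }
        }
        return out;
    }

    // A predicate that may throw a checked exception.
    static boolean nonEmpty(String s) throws IOException {
        if (s == null) {
            throw new IOException("null element");
        }
        return !s.isEmpty();
    }

    // The compiler infers E = IOException here, so the catch is required.
    static String demo(List<String> input) {
        try {
            return filteredBy(input, CheckedFilterSketch::nonEmpty).toString();
        } catch (IOException e) {
            return "failed: " + e.getMessage();
        }
    }

    public static void main(String[] args) {
        System.out.println(demo(Arrays.asList("a", "", "b"))); // prints [a, b]
        System.out.println(demo(Arrays.asList("a", null)));    // prints failed: null element
    }
}
```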

Tom Palmer said...

Right. I meant that if you wanted to make your own and you wanted the checked exceptions later. How do you save that E in <throws E> for later invocations if you want? I guess you could always construct a new object as the result of a method and leave that Exception in the object type, as in:

public <throws E> AlmostIterable<T, throws E> filteredBy({T => boolean throws E} filter) {
...
}

I say AlmostIterable since Iterables only have one type parameter, and this would need two.

Tom Palmer said...

beppegg, the closures spec isn't any different from anonymous inner classes today in this regard, except for the not-always-final-ness of local variables. As another example, variables inside a loop today can already see variables outside the loop. This is a lot like that, too.

By lexically vs. dynamically scoped, Neal Gafter meant that you only see the vars in your own file in your own block. It's not like request parameters in Servlets/JSPs that get passed around dynamically from page to page as you include them.

Tom Palmer said...

Also any chance of type inference like this (note no "int x" on the right):

{int=>int} plus2 = {x => x+2};

In the extreme case, seems like you could even leave out the "{" and "}":

{int=>int} plus2 = x => x+2;

That's C#-like, but I'm willing to leave Java a bit more explicit than that. The types on several parameters get very burdensome. The curlies aren't so bad and make it more obvious what's going on.

And I'll try to stop commenting for a while again.

Calum MacLean said...

Unless I'm misunderstanding it, the "with({T=>R throws E} block, T t)" example from the specification has the parameters the wrong way round - i.e. the closure should be the last parameter in the method.

Calum MacLean said...

I'm not clear about the usage of RestrictedClosure.
For example, for EventQueue.invokeAndWait() and EventQueue.invokeLater() - you'd want invokeLater() to be restricted.
However, as both methods use Runnable, how would this be possible?

Also, if your method just had a function type as its parameter, would it be possible to make it restricted?

Calum MacLean said...

I'm not sure about the proposed "control invocation syntax" - I find it pretty difficult to grok. Maybe it will improve with time, but I'm not too sure about that...

What's the motivation for reversing the lexical order of the formals and the expressions_opt? I'm left trying to work out what goes where. "with(A a : b) { C; }" becomes "with(b, { A a => C; })" - i.e. on expansion the order of the syntactic entities is changed.

I'm guessing that it could be motivated by Java 5.0's foreach. This is read as "for each Element element in collection". Maybe "with(FileReader in : makeReader())" is supposed to be read "with FileReader in as makeReader()" (or similar). I can see some logic in that, though I'm not too convinced.
It seems to me that this would only apply to methods with 2 parameters and the second parameter is a block, and either:
a) the block has no arguments (the lock example); or
b) the first method argument is always passed as the only parameter to the block (the with example).
For all other usages of this syntax - i.e. other types of method parameters - then the syntax seems to me to serve to confuse rather than anything else. For example, if there are 2 block parameters and only 1 other method parameter, or 1 block parameter and 2 other method parameters, then I'm not sure if the syntax makes sense at all.

I appreciate that there's maybe an effort being made to make this syntax look more statement-like rather than method call-like.
However, I think the effect is that it's not at all clear what is going on when you actually see some code - the trade-offs are too great.

I think it would be better to have syntax more like "with(makeReader()) { FileReader in => ... }" - i.e. if the last argument is a block, it can be moved after the closing parentheses of the method call.
I think that this is a lot simpler and more transparent, with minimal effect on the calls to "with".

Finally, I'm not sure about the advisability of the last example, with the nested "with"s. For me at least, it took a while to work out what was actually going on!

Fatih Coskun said...

Will the following code compile?

{=>void} block = {=> foo();};
new Thread(block).start();

The closure will be converted to a system-provided interface. The constructor expects the Runnable interface. Will the system-provided interface be compatible to the Runnable interface in this case?

Fatih Coskun said...

@Calum Maclean

Think of a foreach Method for Maps. With the shorthand syntax the code would look like:

foreach(K key, V value : myMap) {
doSomething(key, value);
}

This is an example with 2 block-parameters and one other method argument. The syntax in my opinion makes much sense in this case.

The only strange code I can think of is the following:

foo(T t : ) {
bar();
}

There is one block-parameter and no other method argument in this case. I would prefer not to use the shorthand syntax here. Remember you can always choose the normal closure syntax:

foo({T t => bar();});

Fatih Coskun said...

@beppegg

The closure's users do not need to define the variable a. This variable is defined in the same place (lexically) as the closure was defined in the code. Both the closure and all lexically bound variables are defined by the same programmer at the same place.


int a = 5;
{int => int} myClosure = {int i => i+a;};

In this example the closure has access to the local variables in its context, which in this case is only a. The users of the closure do not need to know which variables are in the context of the closure, and they do not need to provide them.
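Lexical scoping can be checked with a small experiment in standard Java 8 or later. The sketch below uses a lambda and java.util.function.IntUnaryOperator rather than the proposed syntax, and the names are invented. Note one difference from the proposal: a Java lambda can only capture locals that are effectively final.

```java
import java.util.function.IntUnaryOperator;

class LexicalScopeSketch {
    // The closure's free variable a is resolved here, where the closure is written.
    static IntUnaryOperator makeClosure() {
        int a = 5;
        return i -> i + a;
    }

    // The caller's own variable named a is unrelated and has no effect.
    static int useClosure(IntUnaryOperator closure) {
        int a = 1000; // ignored by the closure
        return closure.applyAsInt(1);
    }

    public static void main(String[] args) {
        System.out.println(useClosure(makeClosure())); // prints 6, not 1001
    }
}
```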

Neal Gafter said...

Fatih Coskun: No that will not compile, because the variable "block" is not of a closure type. Rather, it is of an interface type (aka a function type). Only closure literals are subject to the closure conversion.
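Standard Java 8 and later behaves analogously, which may help in picturing the rule. In the sketch below (invented names, lambdas instead of the proposed syntax), a variable of one interface type is not accepted where another interface type is expected, even if both have the same shape; only a fresh lambda or method reference gets converted to the target type.

```java
import java.util.concurrent.atomic.AtomicInteger;

class ConversionSketch {
    // Hypothetical stand-in for the function type {=>void}.
    interface Block {
        void invoke();
    }

    static int runOnThread() {
        AtomicInteger calls = new AtomicInteger();
        Block block = () -> calls.incrementAndGet();

        // new Thread(block) would not compile: a Block is not a Runnable.
        // The method reference is a new function expression, so it is
        // converted to Runnable.
        Thread thread = new Thread(block::invoke);
        thread.start();
        try {
            thread.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return calls.get();
    }

    public static void main(String[] args) {
        System.out.println(runOnThread()); // prints 1
    }
}
```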

Fatih Coskun said...

I see. This is a downside of the current proposal. But it is not a big one. I still prefer the functional version of the proposal (including this downside) over the nominal version. I would love to see closures in Java7. In fact, if closures wont be included in Java7, I wont be able to use Java anymore. :D

Ray Cromwell said...

Neal,
What's the downside to making closures converted to legacy single-method interfaces be automatically considered restricted? It seems a lot of the opposition surrounds cases where a non-local return occurs from inside old non-closure-aware code. It's not clear to me that this represents a danger any more than throwing unchecked exceptions from inside an anonymous inner class; however, some critics seem to be complaining about unintentional bugs where a user, used to programming anonymous inner classes, uses "return" out of force of habit and ends up unintentionally exiting from the scope where the closure literal was defined.

I guess the presumed anti-use case is something like:

LegacyThirdPartyCollections.qsort(S x, T y: array)
{
return /* ooops! */x.compareTo(y);
}

Another, more funky, way to resolve it is to make the control invocation syntax not merely syntactic sugar, but actually semantically relevant: if you put the closure literal outside the parameter list of the function, you can have non-local return, but if it is inside the parameter list, it is restricted by default.

I don't really favor this tho.

Feng said...

I agree with Ray's suggestion.

Control flow closure is the most valuable feature in this proposal, but it's the feature most of java developers are afraid of (hopefully in the beginning only). As Ray mentioned, Having restricted closure as default will help catching errors early without retrofitting it to all legacy APIs (there are a lot, most of them were not designed with control closure in mind anyway).

Tagging control closure with an explicit type emphasizes the strong invariant it requires for API designers to API users. Also, number of legacy libraries that need to be retrofitted to could be less (just my guess) than retrofitting RestrictedClosure.

I'd like to see the rationale if this alternative has been considered.

Regards,

Feng Hou

Neal Gafter said...

The marker interface must be used to indicate the restriction, not the unrestriction, because of subtype substitutability. An instance of a restricted interface type is also an instance of the unrestricted interface type, even though the instance doesn't happen to use any of the features not allowed in the subtype. Making the marker interface work the other way around breaks subtype substitutability, and therefore does not effectively check the desired constraints.

Neal Gafter said...

One more thing: adding the marker base interface to existing interfaces and classes is backward compatible. As a practical matter we would retrofit many interfaces in the JDK at the same time closures are added, so the restrictions would be effective, for example, in swing callbacks.

Ray Cromwell said...

Neal,
The backwards compatibility issue isn't that you can't retrofit the JDK, it's the oodles of third party libraries and frameworks that would need to be retrofited. The criticism from the opposition is that unrestricted closures might get passed unintentionally to third party code bases which can't handle them.

Is restricted being a subtype of unrestricted a requirement? Why not have both be a subtype of another marker?

I suppose an alternate solution would be to allow metadata to inform the compiler that all single method interfaces within a jar, package, or classpath should be considered restricted, and/or have the classloader retrofit them automatically.

Then one could either notate this in the JAR manifest, or write something like "javac -restricted com.foo.blah" or "javac -restrictedpath foo.jar", so people building against old libraries could isolate them. Another option would be to bump the class file version, and then specify that any classes loaded with a lower version number are automatically retrofited.

Ray Cromwell said...

BTW Neal, have you considered Scala's trick of having nullary function parameter declarations cause the compiler to treat expressions passed as arguments as implicit closures?

See my post here: http://www.javalobby.org/java/forums/t87811.html?start=10#92122339

or

http://scala.epfl.ch/intro/targettyping.html

Neal Gafter said...

ray: requiring a marker interface before allowing any closure conversion doesn't work, as a function type would have to be one or the other.

the metadata solution doesn't work because metadata aren't supposed to affect the language. The flags solution doesn't work either; flags aren't supposed to affect the language, except to the extent that they are used to select which language is being compiled.

Ray Cromwell said...

Neal,
I'm not suggesting requiring a marker interface before allowing any closure conversion, I was suggesting that all existing SAMs (single method interfaces) be considered by default, implicitly restricted. Thus, it would be a compile error to do something like:

ThirdPartyCollections.sort(String a, String b : list) { break; }

but not

ThirdPartyCollections.sort(String a, String b : list) { a.compareTo(b); }


Anyone who wishes to accept unrestricted closures would not define an interface accepting a SAM, but one that accepts a function type, e.g.:

class ClosureAwareThirdPartyCollection
{
public static <T> void sort(Collection<T> col, { T, T => int } comparator) { ... }
// rather than

public static <T> void sort(Collection<T> col, Comparator<T> cmp) { ... }
}

Or, they could create an interface which extends UnrestrictedClosure.

I think what you're saying, is that this would make the following problematic:

{ String, String => int } cmp = {String a, String b => break; }
// at this point, cmp already holds an unrestricted closure

// but my proposal says this line should be a compile error, if the sort() method doesn't have an Unrestricted marker or use a function type declaration for the closure parameter.
ThirdPartyCollections.sort(list, cmp);

I confess tho, I can't see why. If retrofitting the entire JDK with "extends RestrictedClosure" where needed is acceptable, then why wouldn't a rule that says "every single method interface that does not extend UnrestrictedClosure, by default, extends RestrictedClosure" be acceptable too? I don't see the difference between manual retrofitting and automatic implicit retrofitting.

Now, if the objection is that it's not nice for the JVM/javac to silently add additional superinterfaces, I would point out that this is already the case for Java arrays, which implicitly implement Cloneable and Serializable. javac could simply consider the absence of an UnrestrictedClosure marker on a SAM to be an implicit RestrictedClosure.


The automatic conversion to SAMs is a useful feature, but it seems to be bothering some people that they can have unrestricted closures passed to their legacy methods masquerading as SAMs.

The most obvious example being the numerous listener methods (not just in the JDK) in which a non-local return would cancel event propagation to other listeners. Now IMHO, this may actually be a desirable property in some circumstances, but there you go.

I think you'll find that people might develop retrofitting classloaders anyway to wall off legacy code, so I think it is useful to hear out the complaints and see if there isn't some way to address it.

This issue seems to be a sticking point for many people.

Feng Hou said...

Conceptually, api designed for unrestricted closure appears to declare a stronger guarantee to api users than restricted ones, which is to say "I promise the closure argument will be invoked while control bindings' call stack still exists, thus non-local transfer is allowed and compiler don't have to restrict it." It seems counter-intuitive to present this guarantee via subtyping. Marking this contract explicitly still feels better than the other way around.

Another comment on restricted closure conversion: what's the rationale for the final variable restriction? The proposal only gives motivation for the ban on non-local transfer. The restriction seems to exist artificially, for compatibility with inner classes. The downside is that it makes nested closures somewhat ugly (requiring variable aliasing) if an inner closure references a variable that the outer closure needs to assign.

I personally think this proposal brings very good value to java, and have great hope on it. However, RestrictedClosure type is one place where I don't feel it does its job very well. It feels half way to either ends. On one hand, it's quite different from inner class (retrofitting type to make it safe for async use, different "this" scope, ), on the other hand, not consistent enough with full closure either (why enforcing two unrelated restrictions by one marker type). I believe making closure compatible with inner class for migration safety is good, but blurring it with inner class seems unnecessary compromise.

Sorry if I didn't make sense anywhere above.

Regards,
Feng

Feng Hou said...

Instead of extending marker interfaces, is it possible to make use of UnmatchedNonlocalTransfer (unchecked exception) to codify the non-local transfer invariant into function type itself?

Not sure if subtype substitutability is still an issue, but the following certainly looks very readable and clear in the user's eyes.

// true control-flow closure
{ => void throws UnmatchedNonlocalTransfer} f = { => return; }
interface I { void meth() throws UnmatchedNonlocalTransfer; }
I i = { => return; }

// This is OK as well.
{ => void throws UnmatchedNonlocalTransfer} f = { => System.out.println();}

// compile error
{ => void } f = { => return; }
ActionListener l = { => return; }

Neal Gafter said...

Feng: You really don't want UnmatchedNonlocalTransfer to be a checked exception type, and if it isn't checked, then this construct has no semantic force because the exception signature isn't enforced among subtypes.

M. said...

I have a useful/useless use case: just to convert a checked exception to unchecked without the try/catch.

Something like:

I_dun_care(logger){
//IO
// SQL
}

In this case all Exceptions are caught and logged using the logger.

Ha, a better name of I_dun_care should be like unsafe, log, catchAll :)
unsafe() {
//call my poor RMI methods ;)
}

~~ Brilliant ~~
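A sketch of such a helper in standard Java 8 or later, using a lambda where the proposal would use the control invocation syntax. The names CatchAllSketch, ThrowingBlock and catchAll are invented, and java.util.logging is only an assumed choice of logger. The helper returns whether the block completed normally.

```java
import java.io.IOException;
import java.util.logging.Level;
import java.util.logging.Logger;

class CatchAllSketch {
    // Hypothetical block type that is allowed to throw any exception.
    interface ThrowingBlock {
        void invoke() throws Exception;
    }

    // Runs the block; any exception is logged instead of propagated.
    static boolean catchAll(Logger logger, ThrowingBlock block) {
        try {
            block.invoke();
            return true;
        } catch (Exception e) {
            logger.log(Level.WARNING, "block failed", e);
            return false;
        }
    }

    public static void main(String[] args) {
        Logger logger = Logger.getLogger("CatchAllSketch");
        // No try/catch needed at the call site, even for a checked exception.
        boolean completed = catchAll(logger, () -> {
            throw new IOException("simulated I/O failure");
        });
        System.out.println(completed); // prints false
    }
}
```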

Daniel Yokomizo said...

Something very useful that's not in the scope of the closure proposal is coroutine-like behavior found in languages like Sather, CLU, Python or even C# 2.0. I would like to write a method for looping that allowed me to use a closure as its body but be able to yield intermediate results from the closure and resume it from the method. So we could write a loop method taking an Iterator<T> and a {T=>R} closure, and returning an Iterator<R>. Note that such a generalized loop would be able to do not just simple mappings but also accumulate intermediate values inside the closure (i.e. working as a generalized fold).

This feature is needed anytime you can express your algorithm as a pipeline of instructions. In my case I had a method giving me an Iterator<Record> iterator(Reader) method (i.e. records inside a file) which I should accumulate and return partial results.
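Part of this can be approximated without coroutines. The sketch below, in standard Java 8 or later, wraps an Iterator<T> so that the body runs lazily, once per element, as the result is consumed, and lets the body accumulate state between calls. It does not suspend and resume the body in the middle, so it is not a real yield. The names LoopSketch, loop and runningTotals are invented, and java.util.function.Function stands in for {T=>R}.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.function.Function;

class LoopSketch {
    // Returns an iterator that applies body to each source element on demand.
    static <T, R> Iterator<R> loop(Iterator<T> source, Function<T, R> body) {
        return new Iterator<R>() {
            @Override
            public boolean hasNext() {
                return source.hasNext();
            }

            @Override
            public R next() {
                return body.apply(source.next());
            }
        };
    }

    // The body keeps a running total between calls, so each element of the
    // result is a partial result of a fold over the input.
    static List<Integer> runningTotals(List<Integer> values) {
        int[] total = {0}; // mutable cell, since lambdas cannot assign to locals
        Iterator<Integer> partials = loop(values.iterator(), v -> total[0] += v);
        List<Integer> out = new ArrayList<>();
        while (partials.hasNext()) {
            out.add(partials.next());
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(runningTotals(Arrays.asList(1, 2, 3, 4))); // prints [1, 3, 6, 10]
    }
}
```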

stefan schulz said...

Maybe late, but I really enjoyed the presentation you published on Closures. Having blocks is one of the main things I really miss since entering the Java world from Smalltalk (the other is metaclasses).

Nonetheless, a first minor comment to your proposal is my dislike for the fat-arrow syntax being introduced, especially as the function type definition inverses the order of a signature (i.e., method parameters before return type). An easy change could be adapted from the for-each notion using
{ int : String, int throws NumberFormatException }
or an anonymous signature
{ int (String, int) throws NumberFormatException }
Defining the Closure could use the same notion:
{ (String string, int base) return Integer.valueOf(string, base); }
or something like
{ String string, int base; return Integer.valueOf(string, base); }

The anonymous signature actually could be an idea to omit the invoke method from a type declaration. Introducing the Method Type / Function Type with a special character (as @ is used for annotations) would make it easy for the compiler to know what the interface is about. E.g.:
public #interface ToIntFunction {
int (String, int) throws NumberFormatException
}
which could then be used like
int with(Lock lock, String string, ToIntFunction toInteger) {
lock.lock();
try {
return toInteger(string, 10);
} finally {
lock.unlock();
}
}

Another thing, that puzzled me, is the short-hand version for calling a method that has a Closure type as parameter, what if one has two Closure typed parameters? For example:
void foreach(List<String> strings, { String => } foreachBlock, { => } inBetweenBlock) {...}
In the current proposal one would have to use the inline-variant to hand over both Closures, as two following Blocks would be difficult to read.
According to Java syntax, I would actually expect something like:
foreach (listOfStrings) { aString =>
builder.add(aString);
} inBetween {
builder.add(", ");
}
Maybe, one could come up with some advanced method definition syntax, like:
void foreach(List<String> strings, { String => } foreachBlock) inBetween({ => } inBetweenBlock) {..}
which also gives the opportunity to provide additional arguments to the second block.

However, I hope to see Closures in Java7. It's time for context-enclosing anonymous functions as opposed to rather context-free anonymous classes.

Cheers.

Neal Gafter said...

stefan: excellent comments. The only thing I would object to is omission of the .invoke on the invocation of the closure. We had that in an earlier version of the spec and it makes the lookup rules rather messy. The other suggestions are worth considering and are topics I would expect a JSR covering this feature to consider.

Ben Lings said...

Tom said:

> In the extreme case, seems like you could even
> leave out the "{" and "}":
>
> {int=>int} plus2 = x => x+2;

I think this could be very useful (presumably only allowing a single statement) -- especially when used as function arguments.

I've been looking at how LINQ-like query operators could be implemented in Java. Doing this would allow the following, which (in my opinion) conveys its intention to the reader more clearly than the equivalent including the {} and type declarations would.

List<String> src = Arrays.asList("a", "bbb", "c");

List<String> dest = from(src)
.where(s => s.length() > 1)
.select(s => s.toUpperCase())
.toList();

cf.

List<String> dest = from(src)
.where({String s => s.length() > 1})
.select({String s => s.toUpperCase()})
.toList();

Where something like the following is defined:
class Selectable<T> {
static <T> Selectable<T> from(Iterable<T> src);
Selectable<T> where({ T => boolean });
<R> Selectable<R> select({ T => R });
List<T> toList();
}
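For anyone wanting to experiment, here is one possible runnable rendering of that outline in standard Java 8 or later. It is a guess at an implementation, not the code behind the comment above: java.util.function.Predicate and Function stand in for { T => boolean } and { T => R }, and each operator is evaluated eagerly for simplicity.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;
import java.util.function.Predicate;

class Selectable<T> {
    private final List<T> items;

    private Selectable(List<T> items) {
        this.items = items;
    }

    // Copies the source so later operators never modify the caller's data.
    static <T> Selectable<T> from(Iterable<T> src) {
        List<T> copy = new ArrayList<>();
        for (T item : src) {
            copy.add(item);
        }
        return new Selectable<>(copy);
    }

    // Keeps only the elements for which the predicate holds.
    Selectable<T> where(Predicate<T> keep) {
        List<T> out = new ArrayList<>();
        for (T item : items) {
            if (keep.test(item)) {
                out.add(item);
            }
        }
        return new Selectable<>(out);
    }

    // Maps every element to a new value, possibly of another type.
    <R> Selectable<R> select(Function<T, R> mapper) {
        List<R> out = new ArrayList<>();
        for (T item : items) {
            out.add(mapper.apply(item));
        }
        return new Selectable<>(out);
    }

    List<T> toList() {
        return new ArrayList<>(items);
    }

    public static void main(String[] args) {
        List<String> src = Arrays.asList("a", "bbb", "c");
        List<String> dest = from(src)
                .where(s -> s.length() > 1)
                .select(s -> s.toUpperCase())
                .toList();
        System.out.println(dest); // prints [BBB]
    }
}
```

The lambda parameter types are inferred here, which is the brevity the comment above is asking for.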