Use cases for closures
Some people have been reacting to this proposal as if it represents Sun's plans for the future of the language. At this stage it is just some ideas being drawn up to address a number of requirements in a clean and uniform framework. Other people are working on alternative proposals, and I expect you'll see those soon. It is too early to place bets on what, if anything, Sun will do in this space for JDK7.
With the insights described in my last blog entry, we no longer need two separate syntactic forms for closures. We also don't need the "labelled" nonlocal return statement. We're updating the draft spec with this and other changes, making it more precise, and adding a bit of rationale. I hope to augment the brief rationale that will appear in the draft with further explanation here.
Closures, Inner Classes, and Concurrency
Our "solution" to concurrency issues in the original version of the proposal was quite unsatisfying. Part of the problem was that we could see no way to satisfy the requirements of that community while preserving some of the most useful features of closures and without violating the language design principles. Since then, we've been reexamining the issue, and I think we've found a very tidy solution.
Asynchronous use cases
To understand the solution, it is useful to divide the use cases for closures into two very distinct categories. The first category, which I'll call asynchronous, consists of those in which the user-provided code is not called immediately by the API, but rather is saved to be executed from another thread or at a later time. This includes asynchronous notification frameworks, the concurrency framework's executors, and timer tasks. Generally speaking these fall into a pattern in which the caller of the API controls what will be done, but the API controls when it will be done. These kinds of Java APIs are widespread in practice, and are often expressed using single-method interfaces or abstract classes. Because the code passed into the API is invoked when its context no longer exists, it is generally inappropriate to allow nonlocal control-flow. Because it may be invoked from another thread, it is often inappropriate to allow such code to access mutable local state.
The closure proposal attempts to address this category of use case by providing the closure conversion, which converts the user-written chunk of code into an anonymous inner class.
To make this very concrete, let's see how this would affect the way you can submit a task to a java.util.concurrent.Executor. The way you'd write it today (and this way of writing it will always be available) is this:

    void sayHelloInAnotherThread(Executor ex) {
        ex.execute(new Runnable() {
            public void run() {
                System.out.println("hello");
            }
        });
    }

Now, that's not too bad, but using closures you can write the following, without any change to the executor APIs:

    void sayHelloInAnotherThread(Executor ex) {
        ex.execute(() { System.out.println("hello"); });
    }

This is a little better, though not necessarily enough by itself to justify closures. This works because although Executor.execute takes a Runnable as an argument, and we wrote a closure, the closure conversion converts it to a Runnable. Essentially, the closure conversion builds an anonymous instance of Runnable whose run method just invokes the closure. The closure conversion is implemented at compile-time, generating exactly the same code as in the first case, so there is no runtime overhead.
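As a rough illustration of what the closure conversion amounts to, the sketch below spells it out by hand in plain Java. The Block interface and the toRunnable helper are hypothetical stand-ins invented for this example (the proposal does this step in the compiler and has no such helper), and the same-thread executor is only there to keep the output deterministic.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Executor;

class ClosureConversionSketch {
    /** Hypothetical stand-in for a closure that takes no arguments and returns nothing. */
    interface Block {
        void invoke();
    }

    /** Hand-written analogue of the closure conversion: a Runnable whose run method just invokes the block. */
    static Runnable toRunnable(final Block block) {
        return new Runnable() {
            public void run() {
                block.invoke();
            }
        };
    }

    /** Submits a block to an executor that runs each task immediately on the calling thread. */
    static List<String> demo() {
        final List<String> output = new ArrayList<String>();
        Executor direct = new Executor() {
            public void execute(Runnable task) {
                task.run();
            }
        };
        Block hello = new Block() {
            public void invoke() {
                output.add("hello");
            }
        };
        // The executor API is unchanged: it still receives an ordinary Runnable.
        direct.execute(toRunnable(hello));
        return output;
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints [hello]
    }
}
```

The point of the sketch is only that Executor.execute never sees anything but a Runnable; the extra object allocated by toRunnable here is an artifact of doing the conversion by hand.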
If we adopt an alternative invocation syntax that we're working on to allow an abbreviated form of invocation statement for methods involving closures (we're still working on the specification), you would be able to write it something like this:

    void sayHelloInAnotherThread(Executor ex) {
        ex.execute() { System.out.println("hello"); }
    }

If this syntax doesn't mean anything to you, just imagine that the block appearing at the end of the invocation of ex.execute is turned into a closure and then passed as the last argument (the only one, in this case) to execute. That's basically what the abbreviated invocation syntax does.
This is a significant improvement compared to what you have to write today. You could think of these use cases as a shorthand for creating anonymous class instances. The analogy isn't exact, as certain details necessarily differ to get the language construct to be well defined. Most importantly, the binding of the names inside the block of code has to be based on the scope in which the code appears, which in this case is the method sayHelloInAnotherThread, not on the scope of the anonymous class that will be created. Why? Well, if the meaning of a name appearing in a block of code changed when you used it in code passed in to a method like this, then it would be a fruitful source of Java Programming Puzzles. That is especially true if the scope from which the names are inherited isn't even mentioned in the code. It gets worse when you consider that the method name to which the closure is being passed might be overloaded, so you can't tell when you're typechecking the block which interface you are building.
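The kind of puzzle in question can already be reproduced with anonymous classes. In the minimal sketch below (all names are made up for illustration), the same call to toString() means two different things depending on whether it sits in a plain block or in the body of an anonymous Runnable.

```java
class ScopePuzzleSketch {
    @Override
    public String toString() {
        return "outer";
    }

    /** Inside an anonymous class, toString() binds to the anonymous class's own method. */
    String viaAnonymousClass() {
        final String[] result = new String[1];
        Runnable r = new Runnable() {
            @Override
            public String toString() {
                return "inner";
            }

            public void run() {
                result[0] = toString(); // resolved in the scope of the Runnable
            }
        };
        r.run();
        return result[0];
    }

    /** Inside an ordinary block, toString() binds to the enclosing instance's method. */
    String viaPlainBlock() {
        String result;
        {
            result = toString(); // resolved in the scope of ScopePuzzleSketch
        }
        return result;
    }

    public static void main(String[] args) {
        ScopePuzzleSketch s = new ScopePuzzleSketch();
        System.out.println(s.viaAnonymousClass()); // prints inner
        System.out.println(s.viaPlainBlock());     // prints outer
    }
}
```

If a closure's names were resolved the way the anonymous class resolves them, wrapping a block in a closure could silently change which method it calls.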
Some people believe that what you can do with interfaces and classes is enough, and would prefer to see no more (and possibly less) than some abbreviated syntax for writing anonymous class creation expressions. The most significant difference between that approach and closures appears in the other category of use cases.
Synchronous use cases
The other category I'll call the synchronous use cases, in which the closures you pass to an API element are invoked by your own thread strictly before that API element returns control to you. There aren't many of these APIs appearing in the JDK, usually because such APIs are rather awkward to write. Anonymous class instances sometimes satisfy the requirement for this category of use cases, and when they don't, programmers resort to various contortions to get their job done, or simply leave the job half done. Let's see how far we can get, though, using only interfaces for this category. I'll give a single example, again motivated by the concurrency framework.
The Java synchronized statement is convenient and easy to use, but for many purposes has been superseded by the Lock interface in the concurrency framework. Here is how you currently use a Lock to synchronize a snippet of code:

    void sayHelloWhileHoldingLock(Lock lock) {
        lock.lock();
        try {
            System.out.println("hello");
        } finally {
            lock.unlock();
        }
    }

This syntax is tedious and distracting. One would prefer to write something like

    void sayHelloWhileHoldingLock(Lock lock) {
        withLock(lock) {
            System.out.println("hello");
        }
    }

Let's try to write this using only interfaces and classes, and perhaps taking advantage of the syntactic abbreviation that we might think of as being implemented by translation into inner classes. The first thing we'll need is an interface that represents the block of code that we're passing in. java.lang.Runnable might just do it. So our first try at writing this method might look something like this:

    void withLock(Lock lock, Runnable runnable) {
        lock.lock();
        try {
            runnable.run();
        } finally {
            lock.unlock();
        }
    }

This works for some uses. It allows us to write the sayHelloWhileHoldingLock method, above. So far so good. What happens if we try to write other clients using this?
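For reference, here is roughly what the sayHelloWhileHoldingLock call site looks like once the block is written out as an inner class. This is a minimal sketch: the demo method and the use of ReentrantLock to observe the lock state are additions for illustration only.

```java
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantLock;

class WithLockRunnableSketch {
    /** The Runnable-based withLock from the text. */
    static void withLock(Lock lock, Runnable runnable) {
        lock.lock();
        try {
            runnable.run();
        } finally {
            lock.unlock();
        }
    }

    /** Returns { lock held inside the block, lock held after withLock returns }. */
    static boolean[] demo() {
        final ReentrantLock lock = new ReentrantLock();
        final boolean[] held = new boolean[2];
        withLock(lock, new Runnable() {
            public void run() {
                System.out.println("hello");
                held[0] = lock.isHeldByCurrentThread();
            }
        });
        held[1] = lock.isHeldByCurrentThread();
        return held;
    }

    public static void main(String[] args) {
        boolean[] held = demo();
        System.out.println(held[0] + " " + held[1]); // prints true false
    }
}
```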
Suppose we have the following method that uses old-style locks, and we want to refactor it to use a java.util.concurrent lock.

    void callBigHonkingWhileHoldingLock(Object lock) throws CheckedException {
        synchronized (lock) {
            bigHonkingMethod(); // throws CheckedException
        }
    }

The refactored version would presumably look something like this:

    void callBigHonkingWhileHoldingLock(Lock lock) throws CheckedException {
        withLock (lock) {
            bigHonkingMethod(); // throws CheckedException
        }
    }

It would be nice if this just worked, but it doesn't quite. The reason is that the run method of Runnable isn't declared to throw any exception types, and specifically it doesn't throw CheckedException. If we make something almost like Runnable but in which the run method throws Exception, that doesn't quite work either because the implementation of withLock would have to declare that it throws Exception too, and we aren't catching it in callBigHonkingWhileHoldingLock.
The immediate problem is that the withLock method doesn't provide exception transparency, which means that the method throws the same exception type as the method passed in to it. In order to get that, we'd have to make a version of Runnable that can also throw some unspecified exception type. We can do that with generics:

    interface RunnableWithException<E extends Exception> {
        public void run() throws E;
    }

Now we can make the withLock method exception-transparent:

    <E extends Exception> void withLock(Lock lock, RunnableWithException<E> runnable) throws E {
        lock.lock();
        try {
            runnable.run();
        } finally {
            lock.unlock();
        }
    }

Assuming the closure conversion can automatically figure out that it should be creating an instance of RunnableWithException<CheckedException>, which it should, this works just fine.
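A compilable sketch of the exception-transparent version, with the closure written out by hand as an anonymous class. RunnableWithException is declared here as an interface with a bounded type parameter so that it compiles; CheckedException and bigHonkingMethod are hypothetical stand-ins for the names used in the text, and bigHonkingMethod is assumed to always fail so the propagation is visible.

```java
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantLock;

class ExceptionTransparencySketch {
    /** Hypothetical checked exception standing in for CheckedException. */
    static class CheckedException extends Exception {
        CheckedException(String message) {
            super(message);
        }
    }

    /** A Runnable-like interface whose run method may throw E. */
    interface RunnableWithException<E extends Exception> {
        void run() throws E;
    }

    /** Throws exactly the exception type that the block is declared to throw. */
    static <E extends Exception> void withLock(Lock lock, RunnableWithException<E> runnable) throws E {
        lock.lock();
        try {
            runnable.run();
        } finally {
            lock.unlock();
        }
    }

    /** Hypothetical stand-in: assumed to always throw, for demonstration. */
    static void bigHonkingMethod() throws CheckedException {
        throw new CheckedException("honk");
    }

    static void callBigHonkingWhileHoldingLock(Lock lock) throws CheckedException {
        // E is inferred as CheckedException from the argument type.
        withLock(lock, new RunnableWithException<CheckedException>() {
            public void run() throws CheckedException {
                bigHonkingMethod();
            }
        });
    }

    static String demo() {
        ReentrantLock lock = new ReentrantLock();
        try {
            callBigHonkingWhileHoldingLock(lock);
            return "no exception";
        } catch (CheckedException ex) {
            return ex.getMessage() + ", locked=" + lock.isLocked();
        }
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints honk, locked=false
    }
}
```

The caller only has to declare or catch CheckedException, not Exception, and the lock is released before the exception reaches the caller.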
Things are a bit more complicated when the thing you're running can throw two checked exception types; you really want to write a version of withLock that doesn't care how many exceptions are thrown in the block; they are all just propagated out. Let's take it as a given that the generic type system can be extended to do that (I don't think it's too difficult). Now we have a version of withLock that is completely exception-transparent, and can be invoked on a block of code that throws any set of exception types, which will have to be declared or caught by the caller of withLock. So far so good.
Let's take another slightly more complex example of a method that we might want to refactor to use the new locks:

    boolean containsFred(Object lock, List<String> c) throws CheckedException {
        synchronized (lock) {
            for (String s : c) {
                if (s.toLowerCase().equals("fred")) return true;
            }
        }
        return false;
    }

The refactored version would presumably look something like this:

    boolean containsFred(Lock lock, List<String> c) throws CheckedException {
        withLock (lock) {
            for (String s : c) {
                if (s.toLowerCase().equals("fred")) return true;
            }
            return false;
        }
    }

But this won't compile. The first problem is that the variable c isn't final, but it is being accessed within an inner class. We can easily fix that in this case by making c final because the variable isn't assigned anywhere. If the variable were assigned somewhere (but not in the block), we could create a final local variable to hold a copy of the variable's value where it is used, and use that other variable inside the block.
There is a worse problem: the return true statement inside the block returns from the RunnableWithException.run method, not from containsFred, and the run method returns void. We could solve that, perhaps, by making a new version of the Runnable-like interface that returns a boolean. Or better yet, perhaps we could add another generic type parameter to indicate the return type of the interface's method, and have the withLock method return the value to its caller that was returned by the block.
So far so good... well, not quite as good as one would hope, because the refactoring isn't straightforward and requires serious thought from the programmer. Did you notice that the final return statement had to move inside the block? But Java programmers generally speaking are smart people, and they can work out how to do this. I'm sure they enjoy (as much as I do) spending their time puzzling out how to coerce the language into doing what they want.
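Here is one way the version with a return-type parameter might look when spelled out with an anonymous class. It is a sketch under assumptions: the interface is called RunnableWithExceptionAndReturn, the type parameter names R and E are made up, and because nothing in this particular block throws a checked exception, E is instantiated as RuntimeException and the throws clause on containsFred is dropped.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantLock;

class ReturnValueSketch {
    /** A Runnable-like interface that returns R and may throw E. */
    interface RunnableWithExceptionAndReturn<R, E extends Exception> {
        R run() throws E;
    }

    /** Hands the block's result back to the caller of withLock. */
    static <R, E extends Exception> R withLock(Lock lock, RunnableWithExceptionAndReturn<R, E> block) throws E {
        lock.lock();
        try {
            return block.run();
        } finally {
            lock.unlock();
        }
    }

    /** The parameter c must be final so that the inner class can read it. */
    static boolean containsFred(Lock lock, final List<String> c) {
        return withLock(lock, new RunnableWithExceptionAndReturn<Boolean, RuntimeException>() {
            public Boolean run() {
                for (String s : c) {
                    if (s.toLowerCase().equals("fred")) return true;
                }
                // The trailing "return false" had to move inside the block.
                return false;
            }
        });
    }

    public static void main(String[] args) {
        Lock lock = new ReentrantLock();
        System.out.println(containsFred(lock, Arrays.asList("Barney", "FRED"))); // prints true
        System.out.println(containsFred(lock, Arrays.asList("Wilma")));          // prints false
    }
}
```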
Next let's try to refactor the following method to use new locks:

    int countBeforeFred(Object lock, List<String> c) throws CheckedException {
        synchronized (lock) {
            int count = 0;
            for (String s : c) {
                if (s.toLowerCase().equals("fred")) return count;
                count++;
            }
        }
        reportMissingFred(c);
    }

Performing a straightforward refactoring of this code to use a method like withLock runs into a number of problems. Let's look at the snippet of code that would become the body of the method RunnableWithExceptionAndReturn.run

    {
        int count = 0;
        for (String s : c) {
            if (s.toLowerCase().equals("fred")) return count;
            count++;
        }
    }

This can't possibly be the body of any method: it returns a value in the middle of a loop, but it also falls off the end of its execution. In order to refactor this method to use something like withLock, we have to start reorganizing the code to avoid these problems. We could move reportMissingFred(c); into the synchronized block, but that changes the behavior of the program. Another idea is to use a boolean state value that is returned to the caller in addition to the count, to tell the caller whether or not the block fell off the end, but where would we put it? We can't make it a local variable in countBeforeFred because the block can only use final variables. Perhaps we could use a ThreadLocal. We could put a length-one final boolean array in the local scope of countBeforeFred, and keep the boolean value in element zero.
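To show what the length-one array contortion looks like in practice, here is a minimal sketch. Several things are assumptions made only so the example compiles and runs: reportMissingFred simply counts its calls and records whether the lock was held, countBeforeFred returns -1 after reporting, and the lock is a field rather than a parameter.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantLock;

class FlagArraySketch {
    interface RunnableWithExceptionAndReturn<R, E extends Exception> {
        R run() throws E;
    }

    static <R, E extends Exception> R withLock(Lock lock, RunnableWithExceptionAndReturn<R, E> block) throws E {
        lock.lock();
        try {
            return block.run();
        } finally {
            lock.unlock();
        }
    }

    private final ReentrantLock lock = new ReentrantLock();
    int missingReports = 0;
    boolean reportedWhileLocked = false;

    /** Hypothetical stand-in: records that it was called and whether the lock was held. */
    void reportMissingFred(List<String> c) {
        missingReports++;
        reportedWhileLocked = lock.isHeldByCurrentThread();
    }

    int countBeforeFred(final List<String> c) {
        // A final one-element array smuggles a mutable flag into the inner class.
        final boolean[] fellOffEnd = { false };
        int result = withLock(lock, new RunnableWithExceptionAndReturn<Integer, RuntimeException>() {
            public Integer run() {
                int count = 0;
                for (String s : c) {
                    if (s.toLowerCase().equals("fred")) return count;
                    count++;
                }
                fellOffEnd[0] = true;
                return -1; // dummy value; the caller checks the flag instead
            }
        });
        if (!fellOffEnd[0]) return result;
        reportMissingFred(c); // still runs outside the lock, as in the original
        return -1;            // assumed result when Fred is missing
    }

    public static void main(String[] args) {
        FlagArraySketch sketch = new FlagArraySketch();
        System.out.println(sketch.countBeforeFred(Arrays.asList("a", "b", "Fred", "c"))); // prints 2
        System.out.println(sketch.countBeforeFred(Arrays.asList("a")));                   // prints -1
        System.out.println(sketch.missingReports + " " + sketch.reportedWhileLocked);     // prints 1 false
    }
}
```

The behavior is preserved, but three lines of logic now carry a flag, a dummy return value, and a post-check that the reader has to untangle.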
If the body of countBeforeFred were any more complicated in its control structure, for example containing a break from inside the block to outside it, this kind of refactoring might become prohibitively complex. Realistically, we would just give up and stop trying to use withLock at all, effectively inlining it. After all, it is only a few lines of locking code, and it's an idiom that programmers should be expected to learn, right? The idiom is even documented in the javadoc of the Lock interface.
We've only touched the surface, though. The method withLock is perhaps simple enough that you won't mind having to inline it except in the simplest of circumstances when it does you the least good. You might feel some discomfort at repeating code, especially such boilerplate code, but there really isn't any good alternative, and after all it really isn't that much code.
However, some synchronous APIs are naturally more complex, and it doesn't take much more complexity in the API before it simply doesn't make sense for the caller to repeat code from the implementation of the API to avoid these problems. Consider, for example, an API that automatically closes your streams for you. Instead of writing

    {
        Stream s = null;
        try {
            s = openStream();
            yourCodeHere();
        } finally {
            try {
                if (s != null) s.close();
            } catch (IOException ex) {}
        }
    }

You would write an invocation of a library method closeAtEnd to get the same effect:

    {
        closeAtEnd(Stream s : openStream()) {
            yourCodeHere();
        }
    }

You would probably go to much greater lengths to avoid inlining the API method closeAtEnd than withLock, but the same problems are unavoidable. These are just two specific examples of particular APIs of the synchronous closure variety; if the API implementation is much more complicated, or contains details in its implementation that callers should not be aware of, the problem becomes acute because by-hand inlining is no longer an option. If you've ever used java.security.AccessController.doPrivileged, for example, you know the pain.
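For comparison, an interface-based closeAtEnd can be sketched in the same style as the interface-based withLock. Everything here beyond the name closeAtEnd is an assumption for illustration: the StreamBlock interface, the use of java.io.Closeable in place of the unspecified Stream type, and the RecordingStream test double. Unlike the hand-written version, the caller opens the stream before the call.

```java
import java.io.Closeable;
import java.io.IOException;

class CloseAtEndSketch {
    /** Hypothetical block type: receives the stream and may throw E. */
    interface StreamBlock<S, E extends Exception> {
        void run(S stream) throws E;
    }

    /** Runs the block, then closes the stream, ignoring any IOException from close. */
    static <S extends Closeable, E extends Exception> void closeAtEnd(S stream, StreamBlock<S, E> block) throws E {
        try {
            block.run(stream);
        } finally {
            try {
                if (stream != null) stream.close();
            } catch (IOException ex) {
                // ignored, as in the hand-written version
            }
        }
    }

    /** Test double that records reads and whether it was closed. */
    static class RecordingStream implements Closeable {
        boolean closed = false;
        int reads = 0;

        void read() throws IOException {
            if (closed) throw new IOException("stream closed");
            reads++;
        }

        public void close() {
            closed = true;
        }
    }

    static RecordingStream demo() {
        RecordingStream s = new RecordingStream();
        try {
            closeAtEnd(s, new StreamBlock<RecordingStream, IOException>() {
                public void run(RecordingStream stream) throws IOException {
                    stream.read();
                }
            });
        } catch (IOException ex) {
            throw new IllegalStateException(ex);
        }
        return s;
    }

    public static void main(String[] args) {
        RecordingStream s = demo();
        System.out.println(s.reads + " " + s.closed); // prints 1 true
    }
}
```

This version inherits every limitation already seen with withLock: the block cannot return from the caller, break out of an enclosing loop, or assign to the caller's local variables.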
While we've shown how to achieve exception transparency, we haven't quite achieved control transparency, which means that control constructs like break, continue, and return have the same meaning when enclosed by a closure. The idea is that you should be able to wrap a block of code in an invocation of withLock, closeAtEnd, or any other API that takes a block of code, and the meaning of the control constructs appearing in that code should be the same. Ideally, the only difference is that the block is executed while a lock is held, or a stream is closed after the block, or more generally whatever is specified and implemented by the API. You should be able to break from the block to get out of a loop or switch statement that might enclose the invocation of withLock or closeAtEnd. We've looked at examples where return is problematic, but the same problems occur for other control constructs. Similarly, and for the same reasons, in synchronous APIs you want to be able to refer to and modify local variables in the enclosing scope; because the block is executed synchronously and by the same thread, there are no more concurrency issues than there are for an ordinary block statement.
If you believe that these two methods - closeAtEnd and withLock - are the only synchronous APIs worth having, now or ever, you might be tempted to add some very specific language features to Java to get precisely the desired functionality, and dismiss the rest of our closure proposal. We believe that would be a mistake, because we believe that synchronous closures would greatly simplify the lives of programmers. There are not many APIs of this sort today, mainly because they can't be made to work very well with the language in its current state.
The problem to be solved for synchronous closures is that the programmer should not have to transform or contort his code at all in order to wrap it in a closure to pass it to one of these APIs.
Bridging the Chasm
We're left with two seemingly irreconcilable sets of requirements. Asynchronous closures should not be allowed to refer to local variables in the enclosing scope, do not support nonlocal control-flow, do not require exception transparency, and generally speaking are done today using interfaces or abstract classes. Synchronous closures, on the other hand, should be allowed to refer to locals in enclosing scopes, should support nonlocal control-flow, and may require exception transparency, but generally speaking don't appear in the Java APIs because they are not expressible. These use cases would seem to have so little in common that one might not expect a single language construct to satisfy both sets of requirements. But we believe a small modification to the currently published proposal, in addition to the change described in my previous blog post, satisfies both sets of requirements. This is a very tidy result, because having a single uniform language feature is very much preferable to having two.
The change is very simple, but the change is not to the specification of function types or closure expressions, it is to the specification of the closure conversion, which allows you to write a closure - a block of code - where a one-method interface is expected. We add the additional restrictions that a closure to which the closure conversion is applied must not reference any non-final local variables from enclosing scopes, and must not use break, continue, or return to transfer control out of the closure, or the program is in error. These are the same restrictions you have now with anonymous inner classes.
How does this solve both sets of problems? Well, first consider the asynchronous use cases. These are expressed using interfaces, and therefore in order to use these APIs with the closure syntax, the block must be subjected to the closure conversion. Consequently, if there is any use of a nonlocal non-final variable, the compiler must reject the program. Similarly, the compiler must reject the program if there is nonlocal control-flow. The things that you will be able to do are essentially the same as what you can do today, but with a much simpler syntax and clearer semantics.
However, APIs for the synchronous use cases have largely not been written yet, so they can be written using the new function types. A closure is an expression of function type, so the closure conversion is not required, and its restrictions do not apply. Consequently, the block is allowed to use nonlocal non-final variables, and can use nonlocal control-flow. The block has essentially the same meaning as it would if it were not wrapped by a closure. In the rare cases that an existing API is expressed using an interface but is a synchronous use case - for example java.security.PrivilegedExceptionAction - the library writer can write a small utility method that converts a closure to an implementation of the interface, enabling the client to take advantage of the full generality of the feature. To implement withLock using closures, with full exception- and control-transparency, the JDK authors could write this:

    public static <E extends Exception> void withLock(Lock lock, void()throws E block) throws E {
        lock.lock();
        try {
            block();
        } finally {
            lock.unlock();
        }
    }

The method closeAtEnd is similarly straightforward.
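The small utility method mentioned above, for an existing interface-based synchronous API such as java.security.PrivilegedExceptionAction, might look something like the following. Since function types are not available in current Java, the Block interface stands in for one, and both Block and asPrivilegedAction are hypothetical names. The demo calls run directly rather than going through AccessController.doPrivileged, to keep the example free of security configuration.

```java
import java.security.PrivilegedExceptionAction;

class PrivilegedAdapterSketch {
    /** Hypothetical stand-in for a function type returning T and throwing E. */
    interface Block<T, E extends Exception> {
        T invoke() throws E;
    }

    /** Wraps a block in an implementation of the existing one-method interface. */
    static <T, E extends Exception> PrivilegedExceptionAction<T> asPrivilegedAction(final Block<T, E> block) {
        return new PrivilegedExceptionAction<T>() {
            public T run() throws Exception {
                return block.invoke();
            }
        };
    }

    static String demo() {
        PrivilegedExceptionAction<String> action =
            asPrivilegedAction(new Block<String, RuntimeException>() {
                public String invoke() {
                    return "done";
                }
            });
        try {
            // In real use the action would be handed to AccessController.doPrivileged.
            return action.run();
        } catch (Exception ex) {
            throw new IllegalStateException(ex);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints done
    }
}
```

With real function types the adapter's argument would be a closure, so the caller's block could keep its nonlocal control flow and access to local variables; the anonymous Block used here cannot show that part.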
Once you have support for synchronous closures, however, there are many opportunities for the addition of extremely useful APIs to the JDK as library methods, such as withLock, closeAtEnd, and a method that simplifies the use of java.security.AccessController.doPrivileged. More importantly, synchronous closures enable Java programmers - not the JDK authors, but ordinary Java programmers - to freely refactor and abstract over pieces of their code in simple ways that are not currently possible for seemingly arbitrary reasons.
Refactoring
This isn't just about writing and reading programs, it is about refactoring programs too. Your ability to factor common code by moving it into a separate "routine" is currently limited by how the block passed by the caller can interact with the caller. Closures, and synchronous closures in particular, simply remove that limit. When a program is in need of a small refactoring, you are free to factor just the code in common, rather than revisiting the design of the basic data structures and organization of your application. That was the point of my post What's the Point of Closures. Some people missed the point, and thought the program was merely flawed from the beginning and should have been rewritten. Yes, it was flawed, as are most programs in the real world. Not every programmer can get everything right the first time, or can rewrite thousands of lines to solve a localized issue. The whole point of such examples of refactoring is that the original program is flawed in some way. If the program were not flawed then refactoring might not be required. The refactorings supported by closures enable the programmer to get everything right in the end by small refinements to the program, step by step.
And more...
I believe that the additional expressiveness of the language with closures will enable people to discover and use new and powerful kinds of design patterns to organize their code in ways not possible today, making things simple that currently seem complex. I have some more ideas in this direction, which I'll write about at another time when I've had more sleep.