Friday, August 25, 2006

Use cases for closures

Some people have been reacting to this proposal as if it represents Sun's plans for the future of the language. At this stage it is just some ideas being drawn up to address a number of requirements in a clean and uniform framework. Other people are working on alternative proposals, and I expect you'll see those soon. It is too early to place bets on what, if anything, Sun will do in this space for JDK7.

With the insights described in my last blog entry, we no longer need two separate syntactic forms for closures. We also don't need the "labelled" nonlocal return statement. We're updating the draft spec with this and other changes, making it more precise, and adding a bit of rationale. I hope to augment the brief rationale that will appear in the draft with further explanation here.

Closures, Inner Classes, and Concurrency

Our "solution" to concurrency issues in the original version of the proposal was quite unsatisfying. Part of the problem was that we could see no way to satisfy the requirements of that community while preserving some of the most useful features of closures and without violating the language design principles. Since then, we've been reexamining the issue, and I think we've found a very tidy solution.

Asynchronous use cases

To understand the solution, it is useful to divide the use cases for closures into two very distinct categories. The first category, which I'll call asynchronous, consists of those in which the user-provided code is not called immediately by the API, but rather is saved to be executed from another thread or at a later time. This includes asynchronous notification frameworks, the concurrency framework's executors, and timer tasks. Generally speaking these fall into a pattern in which the caller of the API controls what will be done, but the API controls when it will be done. These kinds of Java APIs are widespread in practice, and are often expressed using single-method interfaces or abstract classes. Because the code passed into the API is invoked when its context no longer exists, it is generally inappropriate to allow nonlocal control-flow. Because it may be invoked from another thread, it is often inappropriate to allow such code to access mutable local state.
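The what/when split can be sketched with a toy API. This is only an illustration under my own assumptions, not anything from the JDK: DeferredTasks, submit, runAll, and registerGreeting are hypothetical names. The caller hands over a Runnable and returns; the task runs only later, after the invocation that created it has finished, which is why a return or break from the task into its creator would have nowhere to go.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical toy API (not part of the JDK): stores tasks now, runs them later.
final class DeferredTasks {
    private final List<Runnable> pending = new ArrayList<Runnable>();

    // The caller decides WHAT will be done...
    void submit(Runnable task) {
        pending.add(task);
    }

    // ...but the API decides WHEN it is done. Returns the number of tasks run.
    int runAll() {
        int ran = 0;
        for (Runnable task : pending) {
            task.run();
            ran++;
        }
        pending.clear();
        return ran;
    }

    // Registers a task and returns before the task has run.
    static void registerGreeting(DeferredTasks tasks, final StringBuilder log) {
        tasks.submit(new Runnable() {
            public void run() {
                // By the time this runs, registerGreeting has already returned.
                log.append("hello");
            }
        });
        log.append("registered;");
    }

    public static void main(String[] args) {
        DeferredTasks tasks = new DeferredTasks();
        StringBuilder log = new StringBuilder();
        registerGreeting(tasks, log);
        tasks.runAll();
        System.out.println(log); // prints "registered;hello"
    }
}
```

Note that the anonymous class can only see log because it is declared final, which is the restriction on local state mentioned above.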

The closure proposal attempts to address this category of use case by providing the closure conversion, which converts the user-written chunk of code into an anonymous inner class.

To make this very concrete, let's see how this would affect the way you can submit a task to a java.util.concurrent.Executor. The way you'd write it today (and this way of writing it will always be available) is this:

void sayHelloInAnotherThread(Executor ex) {
    ex.execute(new Runnable() {
        public void run() {
            System.out.println("hello");
        }
    });
}

Now, that's not too bad, but using closures you can write the following, without any change to the executor APIs:

void sayHelloInAnotherThread(Executor ex) {
    ex.execute(() {
        System.out.println("hello");
    });
}

This is a little better, though not necessarily enough by itself to justify closures. This works because although Executor.execute takes a Runnable as an argument, and we wrote a closure, the closure conversion converts it to a Runnable. Essentially, the closure conversion builds an anonymous instance of Runnable whose run method just invokes the closure. The closure conversion is implemented at compile-time, generating exactly the same code as in the first case, so there is no runtime overhead.

If we adopt an alternative invocation syntax that we're working on to allow an abbreviated form of invocation statement for methods involving closures (we're still working on the specification), you would be able to write it something like this:

void sayHelloInAnotherThread(Executor ex) {
    ex.execute() {
        System.out.println("hello");
    }
}

If this syntax doesn't mean anything to you, just imagine that the block appearing at the end of the invocation of ex.execute is turned into a closure and then passed as the last argument (the only one, in this case) to execute. That's basically what the abbreviated invocation syntax does.

This is a significant improvement compared to what you have to write today. You could think of these use cases as a shorthand for creating anonymous class instances. The analogy isn't exact, as certain details necessarily differ to get the language construct to be well defined. Most importantly, the binding of the names inside the block of code has to be based on the scope in which the code appears, which in this case is the method sayHelloInAnotherThread, not on the scope of the anonymous class that will be created. Why? Well, if the meaning of a name appearing in a block of code changes when you use it in code passed in to a method like this, then it would be a fruitful source of Java Programming Puzzles. That is especially true if the scope from which the names are inherited isn't even mentioned in the code. It gets worse when you consider that the method name to which the closure is being passed might be overloaded, so you can't tell when you're typechecking the block which interface you are building.

Some people believe that what you can do with interfaces and classes is enough, and would prefer to see no more (and possibly less) than some abbreviated syntax for writing anonymous class creation expressions. The most significant difference between that approach and closures appears in the other category of use cases.

Synchronous use cases

The other category I'll call the synchronous use cases, in which the closures you pass to an API element are invoked by your own thread strictly before that API element returns control to you. There aren't many of these APIs appearing in the JDK, usually because such APIs are rather awkward to write. Anonymous class instances sometimes satisfy the requirement for this category of use cases, and when they don't programmers resort to various contortions to get their job done, or simply leave the job half done. Let's see how far we can get, though, using only interfaces for this category. I'll give a single example, again motivated by the concurrency framework.

The Java synchronized statement is convenient and easy to use, but for many purposes has been superseded by the Lock interface in the concurrency framework. Here is how you currently use a Lock to synchronize a snippet of code:


void sayHelloWhileHoldingLock(Lock lock) {
    lock.lock();
    try {
        System.out.println("hello");
    } finally {
        lock.unlock();
    }
}

This syntax is tedious and distracting. One would prefer to write something like

void sayHelloWhileHoldingLock(Lock lock) {
    withLock(lock) {
        System.out.println("hello");
    }
}

Let's try to write this using only interfaces and classes, and perhaps taking advantage of the syntactic abbreviation that we might think of as being implemented by translation into inner classes. The first thing we'll need is an interface that represents the block of code that we're passing in. java.lang.Runnable might just do it. So our first try at writing this method might look something like this:

void withLock(Lock lock, Runnable runnable) {
    lock.lock();
    try {
        runnable.run();
    } finally {
        lock.unlock();
    }
}

This works for some uses. It allows us to write the sayHelloWhileHoldingLock method, above. So far so good. What happens if we try to write other clients using this?

Suppose we have the following method that uses old-style locks, and we want to refactor it to use a java.util.concurrent lock.

void callBigHonkingWhileHoldingLock(Object lock) throws CheckedException {
    synchronized (lock) {
        bigHonkingMethod(); // throws CheckedException
    }
}

The refactored version would presumably look something like this:

void callBigHonkingWhileHoldingLock(Lock lock) throws CheckedException {
    withLock (lock) {
        bigHonkingMethod(); // throws CheckedException
    }
}

It would be nice if this just worked, but it doesn't quite. The reason is that the run method of Runnable isn't declared to throw any exception types, and specifically it doesn't throw CheckedException. If we make something almost like Runnable but in which the run method throws Exception, that doesn't quite work either because the implementation of withLock would have to declare that it throws Exception too, and we aren't catching it in callBigHonkingWhileHoldingLock.

The immediate problem is that the withLock method doesn't provide exception transparency, which means that the method throws the same exception type as the method passed in to it. In order to get that, we'd have to make a version of Runnable that can also throw some unspecified exception type. We can do that with generics:

interface RunnableWithException<E extends Exception> {
    void run() throws E;
}

Now we can make the withLock method exception-transparent:

<E extends Exception> void withLock(Lock lock, RunnableWithException<E> runnable) throws E {
    lock.lock();
    try {
        runnable.run();
    } finally {
        lock.unlock();
    }
}

Assuming the closure conversion can automatically figure out that it should be creating an instance of RunnableWithException<CheckedException>, which it should, this works just fine.
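For comparison, here is a sketch of what a call site looks like with today's language, where the anonymous class has to be written out by hand. The names ExceptionTransparentLock and the stand-in bodies of CheckedException and bigHonkingMethod are my assumptions for the sake of a complete example; only the shape of withLock and RunnableWithException comes from the discussion above.

```java
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantLock;

final class ExceptionTransparentLock {
    // Stand-in for the CheckedException used in the examples (assumed definition).
    static class CheckedException extends Exception {
        CheckedException(String message) {
            super(message);
        }
    }

    // A Runnable-like interface whose method may throw a caller-chosen exception type.
    interface RunnableWithException<E extends Exception> {
        void run() throws E;
    }

    // Throws exactly the exception type E declared by the block it runs.
    static <E extends Exception> void withLock(Lock lock, RunnableWithException<E> runnable) throws E {
        lock.lock();
        try {
            runnable.run();
        } finally {
            lock.unlock();
        }
    }

    // Stand-in body (assumption): always fails so the exception path is visible.
    static void bigHonkingMethod() throws CheckedException {
        throw new CheckedException("honk");
    }

    // E is inferred as CheckedException, so that is all this method must declare.
    static void callBigHonkingWhileHoldingLock(Lock lock) throws CheckedException {
        withLock(lock, new RunnableWithException<CheckedException>() {
            public void run() throws CheckedException {
                bigHonkingMethod();
            }
        });
    }

    public static void main(String[] args) {
        try {
            callBigHonkingWhileHoldingLock(new ReentrantLock());
        } catch (CheckedException e) {
            System.out.println("caught: " + e.getMessage()); // prints "caught: honk"
        }
    }
}
```

The caller catches CheckedException rather than Exception, and the lock is released on the way out because of the finally clause.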

Things are a bit more complicated when the thing you're running can throw two checked exception types; you really want to write a version of withLock that doesn't care how many exceptions are thrown in the block; they are all just propagated out. Let's take it as a given that the generic type system can be extended to do that (I don't think it's too difficult). Now we have a version of withLock that is completely exception-transparent, and can be invoked on a block of code that throws any set of exception types, which will have to be declared or caught by the caller of withLock. So far so good.

Let's take another slightly more complex example of a method that we might want to refactor to use the new locks:


boolean containsFred(Object lock, List<String> c) throws CheckedException {
    synchronized (lock) {
        for (String s : c) {
            if (s.toLowerCase().equals("fred")) return true;
        }
    }
    return false;
}

The refactored version would presumably look something like this:

boolean containsFred(Lock lock, List<String> c) throws CheckedException {
    withLock (lock) {
        for (String s : c) {
            if (s.toLowerCase().equals("fred")) return true;
        }
        return false;
    }
}

But this won't compile. The first problem is that the variable c isn't final, but it is being accessed within an inner class. We can easily fix that in this case by making c final because the variable isn't assigned anywhere. If the variable were assigned somewhere (but not in the block), we could create a final local variable to hold a copy of the variable's value where it is used, and use that other variable inside the block.

There is a worse problem: the return true statement inside the block is returning from the RunnableWithException.run method, not containsFred, but the run method returns void. We could solve that, perhaps, by making a new version of a Runnable-like interface that returns a boolean. Or better yet perhaps we could add another generic type parameter to indicate the return type of the interface's method, and have the withLock method return the value to its caller that was returned by the block.
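Here is a minimal sketch of that second idea, written with an explicit anonymous class. The interface name BlockWithResult and the class name ReturningLock are hypothetical, and I've used RuntimeException as the exception type argument because this particular block throws nothing checked.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantLock;

final class ReturningLock {
    // Hypothetical interface: a block that yields a result of type R and may throw E.
    interface BlockWithResult<R, E extends Exception> {
        R run() throws E;
    }

    // Runs the block while holding the lock and hands the block's result to the caller.
    static <R, E extends Exception> R withLock(Lock lock, BlockWithResult<R, E> block) throws E {
        lock.lock();
        try {
            return block.run();
        } finally {
            lock.unlock();
        }
    }

    // The parameter c must be final to be visible inside the anonymous class.
    static boolean containsFred(Lock lock, final List<String> c) {
        return withLock(lock, new BlockWithResult<Boolean, RuntimeException>() {
            public Boolean run() {
                for (String s : c) {
                    // This return leaves run(), not containsFred.
                    if (s.toLowerCase().equals("fred")) return true;
                }
                // The trailing "return false" had to move inside the block.
                return false;
            }
        });
    }

    public static void main(String[] args) {
        Lock lock = new ReentrantLock();
        System.out.println(containsFred(lock, Arrays.asList("Wilma", "FRED"))); // prints "true"
        System.out.println(containsFred(lock, Arrays.asList("Barney")));        // prints "false"
    }
}
```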

So far so good... well, not quite as good as one would hope, because the refactoring isn't straightforward and requires serious thought from the programmer. Did you notice that the final return statement had to move inside the block? But Java programmers generally speaking are smart people, and they can work out how to do this. I'm sure they enjoy (as much as I do) spending their time puzzling out how to coerce the language into doing what they want.

Next let's try to refactor the following method to use new locks:

int countBeforeFred(Object lock, List<String> c) throws CheckedException {
    synchronized (lock) {
        int count = 0;
        for (String s : c) {
            if (s.toLowerCase().equals("fred")) return count;
            count++;
        }
    }
    reportMissingFred(c);
}

Performing a straightforward refactoring of this code to use a method like withLock runs into a number of problems. Let's look at the snippet of code that would become the body of the method RunnableWithExceptionAndReturn.run:

{
    int count = 0;
    for (String s : c) {
        if (s.toLowerCase().equals("fred")) return count;
        count++;
    }
}

This can't possibly be the body of any method: it returns a value in the middle of a loop, but it also falls off the end of its execution. In order to refactor this method to use something like withLock, we have to start reorganizing the code to avoid these problems. We could move reportMissingFred(c); into the synchronized block, but that changes the behavior of the program. Another idea is to use a boolean state value that is returned to the caller in addition to the count, to tell the caller whether or not the block fell off the end, but where would we put it? We can't make it a local variable in countBeforeFred because the block can only use final variables. Perhaps we could use a ThreadLocal. Or we could put a length-one final boolean array in the local scope of countBeforeFred, and keep the boolean value in element zero.
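To show how contorted this gets, here is a sketch of the length-one array workaround. It assumes a result-returning variant of withLock (the BlockWithResult interface is hypothetical), and it assumes that reportMissingFred returns an int, which the original method does not spell out; the MISSING value is likewise made up for the example.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantLock;

final class CountBeforeFredWorkaround {
    // Hypothetical interface: a block that yields a result of type R and may throw E.
    interface BlockWithResult<R, E extends Exception> {
        R run() throws E;
    }

    static <R, E extends Exception> R withLock(Lock lock, BlockWithResult<R, E> block) throws E {
        lock.lock();
        try {
            return block.run();
        } finally {
            lock.unlock();
        }
    }

    // Assumption: reportMissingFred yields an int; -1 is an arbitrary stand-in value.
    static final int MISSING = -1;

    static int reportMissingFred(List<String> c) {
        return MISSING;
    }

    static int countBeforeFred(Lock lock, final List<String> c) {
        // The array reference is final, but its single element can still be assigned.
        final boolean[] fellOffEnd = { false };
        Integer result = withLock(lock, new BlockWithResult<Integer, RuntimeException>() {
            public Integer run() {
                int count = 0;
                for (String s : c) {
                    if (s.toLowerCase().equals("fred")) return count;
                    count++;
                }
                fellOffEnd[0] = true; // tell the caller the block fell off the end
                return 0;             // dummy value, ignored by the caller
            }
        });
        if (fellOffEnd[0]) {
            // As in the original method, this runs after the lock has been released.
            return reportMissingFred(c);
        }
        return result;
    }

    public static void main(String[] args) {
        Lock lock = new ReentrantLock();
        System.out.println(countBeforeFred(lock, Arrays.asList("barney", "Fred", "wilma"))); // prints "1"
        System.out.println(countBeforeFred(lock, Arrays.asList("barney")));                  // prints "-1"
    }
}
```

It works, but the flag, the dummy return value, and the extra test after the call are all noise that the synchronized version never needed.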

If the body of countBeforeFred were even slightly more complicated in its control structure, for example containing a break from inside the block to outside it, this kind of refactoring might become prohibitively complex. Realistically, we would just give up and stop trying to use withLock at all, effectively inlining it. After all, it is only a few lines of locking code, and it's an idiom that programmers should be expected to learn, right? The idiom is even documented in the javadoc of the Lock interface.

We've only touched the surface, though. The method withLock is perhaps simple enough that you won't mind having to inline it except in the simplest of circumstances when it does you the least good. You might feel some discomfort at repeating code, especially such boilerplate code, but there really isn't any good alternative, and after all it really isn't that much code.

However, some synchronous APIs are naturally more complex, and it doesn't take much more complexity in the API before it simply doesn't make sense for the caller to repeat code from the implementation of the API to avoid these problems. Consider, for example, an API that automatically closes your streams for you. Instead of writing

{
    Stream s = null;
    try {
        s = openStream();
        yourCodeHere();
    } finally {
        try {
            if (s != null) s.close();
        } catch (IOException ex) {}
    }
}

You would write an invocation of a library method closeAtEnd to get the same effect:

{
    closeAtEnd(Stream s : openStream()) {
        yourCodeHere();
    }
}

You would probably go to much greater lengths to avoid inlining the API method closeAtEnd than withLock, but the same problems are unavoidable. These are just two specific examples of particular APIs of the synchronous closure variety; if the API implementation is much more complicated, or contains details in its implementation that callers should not be aware of, the problem becomes acute because by-hand inlining is no longer an option. If you've ever used java.security.AccessController.doPrivileged, for example, you know the pain.
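As a rough sketch of what closeAtEnd looks like when built from interfaces alone, consider the following. The names CloseAtEndSketch, StreamBlock, and firstByte are hypothetical, and I've used java.io.Closeable in place of the unspecified Stream type. Notice that the caller again needs a length-one array just to get a value out of the block.

```java
import java.io.ByteArrayInputStream;
import java.io.Closeable;
import java.io.IOException;
import java.io.InputStream;

final class CloseAtEndSketch {
    // Hypothetical interface: a block that uses a stream of type S and may throw E.
    interface StreamBlock<S extends Closeable, E extends Exception> {
        void run(S stream) throws E;
    }

    // Runs the block, then closes the stream, ignoring any failure from close().
    static <S extends Closeable, E extends Exception> void closeAtEnd(S stream, StreamBlock<S, E> block) throws E {
        try {
            block.run(stream);
        } finally {
            try {
                if (stream != null) stream.close();
            } catch (IOException ex) {
                // Deliberately ignored, as in the handwritten idiom.
            }
        }
    }

    // A client: reads one byte and smuggles it out of the block through an array.
    static int firstByte(final InputStream in) throws IOException {
        final int[] result = new int[1];
        closeAtEnd(in, new StreamBlock<InputStream, IOException>() {
            public void run(InputStream s) throws IOException {
                result[0] = s.read();
            }
        });
        return result[0];
    }

    public static void main(String[] args) throws IOException {
        InputStream in = new ByteArrayInputStream(new byte[] { 42 });
        System.out.println(firstByte(in)); // prints "42"
    }
}
```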

While we've shown how to achieve exception transparency, we haven't quite achieved control transparency, which means that control constructs like break, continue, and return have the same meaning when enclosed by a closure. The idea is that you should be able to wrap a block of code in an invocation of withLock, closeAtEnd, or any other API that takes a block of code, and the meaning of the control constructs appearing in that code should be the same. Ideally, the only difference is that the block is executed while a lock is held, or a stream is closed after the block, or more generally whatever is specified and implemented by the API. You should be able to break from the block to get out of a loop or switch statement that might enclose the invocation of withLock or closeAtEnd. We've looked at examples where return is problematic, but the same problems occur for other control constructs. Similarly, and for the same reasons, in synchronous APIs you want to be able to refer to and modify local variables in the enclosing scope; because the block is executed synchronously and by the same thread, there are no more concurrency issues than there are for an ordinary block statement.

If you believe that these two methods - closeAtEnd and withLock - are the only synchronous APIs worth having, now or ever, you might be tempted to add some very specific language features to Java to get precisely the desired functionality, and dismiss the rest of our closure proposal. We believe that would be a mistake, because we believe that synchronous closures would greatly simplify the lives of programmers. There are not many APIs of this sort today, mainly because they can't be made to work very well with the language in its current state.

The problem to be solved for synchronous closures is that the programmer should not have to transform or contort his code at all in order to wrap it in a closure to pass it to one of these APIs.

Bridging the Chasm

We're left with two seemingly irreconcilable sets of requirements. Asynchronous closures should not be allowed to refer to local variables in the enclosing scope, do not support nonlocal control-flow, do not require exception transparency, and generally speaking are done today using interfaces or abstract classes. Synchronous closures, on the other hand, should be allowed to refer to locals in enclosing scopes, should support nonlocal control-flow, and may require exception transparency, but generally speaking don't appear in the Java APIs because they are not expressible. These use cases would seem to have so little in common that one might not expect a single language construct to satisfy both sets of requirements. But we believe a small modification to the currently published proposal, in addition to the change described in my previous blog post, satisfies both sets of requirements. This is a very tidy result, because having a single uniform language feature is very much preferable to having two.

The change is very simple, but the change is not to the specification of function types or closure expressions, it is to the specification of the closure conversion, which allows you to write a closure - a block of code - where a one-method interface is expected. We add the additional restrictions that a closure to which the closure conversion is applied must not reference any non-final local variables from enclosing scopes, and must not use break, continue, or return to transfer control out of the closure, or the program is in error. These are the same restrictions you have now with anonymous inner classes.

How does this solve both sets of problems? Well, first consider the asynchronous use cases. These are expressed using interfaces, and therefore in order to use these APIs with the closure syntax, the block must be subjected to the closure conversion. Consequently, if there is any use of a nonlocal non-final variable, the compiler must reject the program. Similarly, the compiler must reject the program if there is nonlocal control-flow. The things that you will be able to do are essentially the same as what you can do today, but with a much simpler syntax and clearer semantics.

However, APIs for the synchronous use cases have largely not been written yet, so they can be written using the new function types. A closure is an expression of function type, so the closure conversion is not required, and its restrictions do not apply. Consequently, the block is allowed to use nonlocal non-final variables, and can use nonlocal control-flow. The block has essentially the same meaning as it would if it were not wrapped by a closure. In the rare cases that an existing API is expressed using an interface but is a synchronous use case - for example java.security.PrivilegedExceptionAction - the library writer can write a small utility method that converts a closure to an implementation of the interface, enabling the client to take advantage of the full generality of the feature. To implement withLock using closures, with full exception- and control-transparency, the JDK authors could write this:

public static <E extends Exception>
void withLock(Lock lock, void()throws E block) throws E {
    lock.lock();
    try {
        block();
    } finally {
        lock.unlock();
    }
}

The method closeAtEnd is similarly straightforward.

Once you have support for synchronous closures, however, there are many opportunities for the addition of extremely useful APIs to the JDK as library methods, such as withLock, closeAtEnd, and a method that simplifies the use of java.security.AccessController.doPrivileged. More importantly, synchronous closures enable Java programmers - not the JDK authors, but ordinary Java programmers - to freely refactor and abstract over pieces of their code in simple ways that are not currently possible for seemingly arbitrary reasons.

Refactoring

This isn't just about writing and reading programs, it is about refactoring programs too. Your ability to factor common code by moving it into a separate "routine" is currently limited by how the block passed by the caller can interact with the caller. Closures, and synchronous closures in particular, simply remove that limit. When a program is in need of a small refactoring, you are free to factor just the code in common, rather than revisiting the design of the basic data structures and organization of your application. That was the point of my post What's the Point of Closures. Some people missed the point, and thought the program was merely flawed from the beginning and should have been rewritten. Yes, it was flawed, as are most programs in the real world. Not every programmer can get everything right the first time, or can rewrite thousands of lines to solve a localized issue. The whole point of such examples of refactoring is that the original program is flawed in some way. If the program were not flawed then refactoring might not be required. The refactorings supported by closures enable the programmer to get everything right in the end by small refinements to the program, step by step.

And more...

I believe that the additional expressiveness of the language with closures will enable people to discover and use new and powerful kinds of design patterns to organize their code in ways not possible today, making things simple that currently seem complex. I have some more ideas in this direction, which I'll write about at another time when I've had more sleep.

33 comments:

Jochen "blackdrag" Theodorou said...

A question here... "We add the additional restrictions that a closure to which the closure conversion is applied must not reference any nonlocal non-final variables, and must not use break, continue, or return to transfer control out of the closure, or the program is in error."

nonlocal nonfinal? Nonlocal means I can't use fields, right? Isn't that even more restricting than inner classes, where I can use fields of the enclosing class?

And then about closure conversion. I name closures inline-closures whenever the closure is directly given to a method as attached block. So am I right that only this inline form would be capable of being converted to implement a one method interface? Which for example would mean I can't use the same closure for more than one actionPerformed. Am I right here?

If no, then I missed important points in your blog. If yes, then this means I won't be able to make an API working on closures if at any point such interfaces are involved, and that is most likely. Or would I wrap the closure in an inner class then? If I want to avoid that, I would have to use the interfaces again. And not only that, because of the usage of checked exceptions I may have to define multiple interfaces just to cover the exception cases. That is really not very appealing. And even worse.

But exceptions are a general problem. Think of an iteration method using an inline closure. Maybe I iterate over a bunch of files where I make a backup of the file, read the file as xml, transform the file, overwrite the backup, copy the backup over the old file. Naturally this means that I might have to handle some very different Exceptions. And my only possibilities now are that I handle them all at the place they occur (inside the closures) or wrap them in a RuntimeException and catch them later? What method do I have in mind here, you may ask? Well (sorry, the blog doesn't allow me to use the pre tag):

public class IteratorHelper {
    public static void forLoop(List l, Runnable closure) {
        for (Object o: l) closure(o);
    }
}

This method tells us the closure will not throw any checked exception. So what do I do if I want to use a closure that throws one? I need to write a new interface taking "void(Object) throws Exception"? Do I really have to overload the method 3 or 4 times to cover exceptions? The really bad thing then would be, that these overloaded methods would not be able to share the same implementation.

And if I do not use conversion - if I operate on closure types? Do I then still have to overload the method to cover the exception cases? But ok, that might be a general problem with checked Exceptions, even with inner classes.

Ah right, the non-local-returns! So you want to disallow them in case of inlined closures? I am speaking about inline closures only here, because if you get the closures from somewhere else, then it is stored in a variable before and you won't be able to make a difference between the closures. I mean the compiler can't tell us about a closure it doesn't know, so there can't be a conversion of a closure given as parameter in a method to a one-method-interface avoiding for example "return". Ok, you could make them different by giving them different types - what ever that would be. If not it means that the interface is dominant, because if I want to call a method taking an interface with one method using a closures and the closure itself comes as method parameter, then I have to use the interface type instead of any closures type to still be able to get the conversion I need.

Please, don't think I am against closures. I am using them very much in Groovy. But while Groovy is running on the JVM, it doesn't force type or exception checks. And in case of closures not even the number of parameters is checked during compile time. This is surely a nogo for Java.

But well.. it's a draft. There is maybe more time needed to get a really nice version. Not only to read in source at the point the closure is created, but also at the point the closure is used. And of course being able to write really general programs.

Anonymous said...

Thanks, Neal, for this blog. I think I begin to see the point of closures... I often write very similar code (like the typical stream-closing you mention) and it's true that it's difficult to refactor it in a shorter way (avoiding writing the same again and again).

I like the idea of having closures in JDK7, but I must admit that some parts of the syntax are a bit ugly for me, like the declaration of a closure "a_return_type (a_param_type,another_param_type) throws an_Exception" as another parameter of a method that uses a closure. It reminds me of a C function pointer declaration. Of course we need type verification, so the types and throws declaration should be declared, but I see it a bit ugly.

If we can have a somehow clearer syntax for this, it would be ideal for me, but I know it's difficult, I can't make a proposal myself now... :-(

Regards,


Xavi

Neal Gafter said...

I hear you, Xavi. We'll work on it.

Anonymous said...

Hello Neal and thanks for the food at the Google party during J1. :)

If the "final" restriction was lifted on local variables for anonymous inner classes and type safe function references were added to Java, wouldn't that solve the vast amount of use cases that closures are set out to address? And this without changing the language much. References to functions are simpler to understand for average Java developers IMO.

Cheers,
Mikael Grev

Neal Gafter said...

No, Mikael, that still wouldn't provide exception transparency or control transparency, which are the kinds of things you need for many kinds of synchronous use cases.

Anonymous said...

What would prevent someone from trying to create a new asynchronous API that used the function type rather than an interface with a single method? Is it just assumed people creating such APIs would have to know not to do this?

Brian

Neal Gafter said...

Sorry, blackdrag, my wording wasn't clear. It should say "a closure to which the closure conversion is applied must not reference any non-final local variables from enclosing scopes". I fixed the wording.

You could use two closures to implement a two-method interface by writing a little helper method that takes two closures and creates an anonymous class.

As for the issues with exceptions, that is currently a limitation of interfaces that closures don't make any worse or better. To make your method exception transparent, you'll have to make the method generic and give it a type parameter for the exception, and use function types, as I showed how to do. You can't really do it correctly with interfaces. However, we'll see what we can do about that. Check back in about two weeks.

Howard Lovatt said...

Your example is easy to code with inner classes. Put your effort into support for a more functional style of programming with a functional style collections library, with select, reject, etc., stuff like a Tuple class, and shorter syntax, e.g.:

http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6389769

Your example with inner classes is:

interface Block<E extends Exception> { void e() throws E; }

public static <E extends Exception> void withLock(Lock lock, Block<E> block) throws E {
    lock.lock();
    try {
        block.e();
    } finally {
        lock.unlock();
    }
}

You would return count, in the extended example, using a tuple class. If the extra syntax of using a tuple class is considered a problem, then add auto boxing to tuples (this is what C# 3.0 does).

Perhaps a new mantra could be: Closures = Poor Man's Inner Class

Neal Gafter said...

Howard: your suggestion doesn't work if the block throws more than one checked exception.

Howard Lovatt said...

RE Multiple Exceptions.

You don't even need the exception as a generic parameter. For example you can use the existing Callable interface:

static < V > V withLock( final Lock lock, final Callable< V > block ) throws Exception {
    lock.lock();
    V value;
    try {
        value = block.call();
    } finally {
        lock.unlock();
    }
    return value;
}

static int countBeforeFred( final Lock lock, final List< String > c ) throws Exception {
    final Integer count = withLock( lock, new Callable< Integer >() {
        public Integer call() throws Exception {
            int value = 0;
            for ( final String s : c ) {
                if ( s.toLowerCase().equals( "fred" ) ) return value;
                value++;
            }
            return null;
        }
    } );
    return count == null ? reportMissingFred( c ) : count;
}

static void twoExceptions() throws Exception, RuntimeException {
    final Lock lock = new ReentrantLock();
    withLock( lock, new Callable< Void >() {
        public Void call() throws Exception {
            if ( Math.random() > 0.5 ) throw new RuntimeException();
            throw new Exception();
        }
    } );
}

public static void main( final String[] notUsed ) {
    for ( int i = 0; i < 10; i++ ) {
        try {
            twoExceptions();
        } catch ( RuntimeException notUsed2 ) {
            System.out.println( "Runtime" );
        } catch ( Exception notUsed2 ) {
            System.out.println( "Exception" );
        }
    }
}

With some syntactic sugar this would present as well as the proposed closures and would have the advantage of not introducing a new but partially overlapping concept.

Neal Gafter said...

Howard: the point of exception transparency is that the withLock method throws precisely the set of exceptions thrown by the block of code. Your solution causes the generic Exception to be thrown, forcing the caller to catch Exception rather than just the things thrown by the block. That undermines the benefits of compile-time exception checking.

I believe there is a solution to this problem, but yours isn't it.
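To make the exception-transparency point concrete, here is a minimal sketch, assuming a hypothetical single-method interface Block with one exception type parameter (the names Block, run, and demo are illustrative and not from the proposal). Because withLock rethrows exactly E, the caller catches only IOException rather than the generic Exception that the Callable version forces on it.

```java
import java.io.IOException;
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantLock;

final class ExceptionTransparencySketch {
    /** Hypothetical interface: E is the single checked exception the body may throw. */
    interface Block<E extends Exception> {
        void run() throws E;
    }

    /** Declares exactly E, so the caller's catch clauses mirror the block's throws clause. */
    static <E extends Exception> void withLock(Lock lock, Block<E> block) throws E {
        lock.lock();
        try {
            block.run();
        } finally {
            lock.unlock();
        }
    }

    /** Returns "io" if the block failed with IOException, otherwise "ok". */
    static String demo(final boolean fail) {
        final Lock lock = new ReentrantLock();
        try {
            withLock(lock, new Block<IOException>() {
                public void run() throws IOException {
                    if (fail) throw new IOException("simulated failure");
                }
            });
            return "ok";
        } catch (IOException e) { // only IOException needs handling here, not Exception
            return "io";
        }
    }

    public static void main(String[] args) {
        System.out.println(demo(false));
        System.out.println(demo(true));
    }
}
```

This only works while the block throws a single checked exception type, which is the limitation raised earlier in the thread.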

Tom Palmer said...

Please no function types. And just let us throw whatever exceptions we want, too. That solves all the exception problems. (There's a reason RuntimeExceptions are getting more popular.)

Synchronous vs. asynchronous and flow control statements are still fun, though.

Neal Gafter said...

Howard: this technique can't be used to make a method, like withLock, that is exception transparent unless the caller is restricted to some specific fixed number of exception types.
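The fixed-arity restriction can be illustrated with a sketch that assumes a hypothetical interface Block2 with two exception type parameters (Block2 and demo are made-up names). It is transparent for exactly two checked exceptions; a block that throws three would need a separate Block3 and another withLock overload.

```java
import java.io.IOException;
import java.util.concurrent.TimeoutException;
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantLock;

final class TwoExceptionSketch {
    /** Hypothetical interface with room for exactly two checked exception types. */
    interface Block2<E1 extends Exception, E2 extends Exception> {
        void run() throws E1, E2;
    }

    /** Transparent for two exception types, but the arity is baked into the signature. */
    static <E1 extends Exception, E2 extends Exception>
    void withLock(Lock lock, Block2<E1, E2> block) throws E1, E2 {
        lock.lock();
        try {
            block.run();
        } finally {
            lock.unlock();
        }
    }

    /** mode 1 throws IOException, mode 2 throws TimeoutException, anything else succeeds. */
    static String demo(final int mode) {
        try {
            withLock(new ReentrantLock(), new Block2<IOException, TimeoutException>() {
                public void run() throws IOException, TimeoutException {
                    if (mode == 1) throw new IOException("simulated");
                    if (mode == 2) throw new TimeoutException("simulated");
                }
            });
            return "ok";
        } catch (IOException e) {
            return "io";
        } catch (TimeoutException e) {
            return "timeout";
        }
    }

    public static void main(String[] args) {
        for (int mode = 0; mode <= 2; mode++) {
            System.out.println(demo(mode));
        }
    }
}
```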

Neal Gafter said...

Tom Palmer: I think a solution along the lines you suggest is possible. Check back in two weeks.

Neal Gafter said...

Or rather, I think a solution along the lines you suggest but also providing exception transparency is possible.

Tom Palmer said...

I think I read your post more carefully now. Please don't use function types to distinguish sync vs. async. Please don't include function types at all.

It should be possible to distinguish sync vs. async based on usage syntax alone. That is, if you really care to make this distinction, then allow sync to look like blocks, but don't allow async to look like blocks. There's still no need for function types. Closure conversion really should address all needs. (And how do you define semantics for function types? Or see Tom Hawtin's blog comment on the pain of JavaDoc for function types.)

Also, please let async reference non-final outer locals. That's okay. It's the break/continue/return and so on that don't apply.

In any case, thanks for bringing the discussion public even though it's created such a boiling pot. And I agree that convenient closures are very important to solid and readable code. And that it's best not to assume we make special case blocks in advance for everything.

Oh, and I'll check back in two weeks, too. Thanks.

Tom Palmer said...

And sorry to go on so long, but there's one more fun use case. Swing's invokeAndWait(). That is, the point is to run the code on a separate thread, but it also runs synchronously in the sense that the current thread hangs until the other thread finishes.

I could see value in the ability to express this as a block and provide flow control and exception handling. I think this use case should also be under consideration (even if it can't be solved well enough).

And on that subject, I guess I need to study people's threading concerns with non-final vars better.

Stephen Colebourne said...

Firstly, a quick thanks for having this debate in public. It's already clear that some views and opinions are being taken onboard by the group. Secondly, thanks for going back to explain Tennent's Principle and to identify the two separate use cases more clearly. In fact, this post is probably the most useful yet. Rather than post a long response here, I've blogged at my blog.

Howard Lovatt said...

@Neal Re. Checked Exceptions

I think the 80:20 rule is applicable here: for 20% of the effort you can get 80% of the benefit. I.e., just stick with the closure throwing RuntimeExceptions, exactly one Exception in its throws clause, or limit it to throwing, say, up to 5 Exceptions in its throws clause. That covers virtually all cases and is easy.

However, you seem to want to cover absolutely every case, even the rare and theoretical ones. In that case, why not extend generic syntax so that if the last generic argument is a Throwable or something derived from Throwable, then an ellipsis can be used as with varargs? E.g.:

interface F< V, E extends Throwable... > { V e() throws E; }

That is a simple solution with many possible use cases and not limited to just closures.

Neal Gafter said...

Howard: "varargs" generics don't work because, within the generic class, there is no way to refer to the type arguments after the trailing one. If something in this space is workable, it had better be the disjunction type, but disjunction types must be severely restricted in order not to break generic type inference. I'll blog more on this topic in about two weeks.

Neal Gafter said...

Stephen: a closure has precisely the same meaning no matter which kind of API it is passed to. Only in one case certain things that you try to do will be rejected by the compiler, in the other case allowed.

Stephen Colebourne said...

Neal, maybe it was late, but your reply merely emphasises the point I was making. You said "a closure has precisely the same meaning no matter which kind of API it is passed to", but then contradict that with "Only in one case certain things that you try to do will be rejected by the compiler, in the other case allowed".

This dichotomy is central to my argument. The proposal currently tries to pretend that this is one nice neat language extension, but it isn't. There are two new syntaxes being added, one where one list of things is valid, and another where a different list of things is valid. The fact that they will both look similar at a glance will make it especially confusing.

In particular, consider a *user* of closures. They won't be able to tell just from reading the 'client' code which type it is, sync or async, and hence what changes they can or can't make. And if that's not a bad language smell I don't know what is!

Neal Gafter said...

Stephen: the reader won't need to know what kind of use the closure is to know what the code means. If it is legal, it can mean only one thing. A reference to a variable name not defined within the closure, for example, always designates a variable by that name in a scope lexically enclosing the closure. Similarly for method names.

If the reader needs to understand more about the semantics in the context of the method being called, the reader has to refer to the documentation for the method being called. That has always been the case, and the introduction of closures wouldn't change it.

Eugene Vigdorchik said...

What I cannot understand is how you are going to deal with mixing synchronous and asynchronous closure usages, i.e. if the user calls f(void ()) passing a closure that mutates enclosing local state or performs a nonlocal transfer, but in f some asynchronous API is called. Is it allowed to perform closure conversion on parameter closures?

Neal Gafter said...

eugene: closure conversion only applies to closure expressions, not to any expression of function type.

Eugene Vigdorchik said...

Neal, then you will lose the transparency to extract variable/method from any expression/set of statements without breaking the code...

Neal Gafter said...

Eugene: You can still keep transparency to extract variable/method; when performing the refactoring you just make the type of the variable or method parameter the interface type required by the API instead of the function type, so that the closure conversion occurs at the site where the closure is written.

Anonymous said...

Asynchronous vs. Synchronous? As a Lisp programmer, I get the distinct impression that you may have confused the functionality of closures with the functionality of macros.

Java inner classes, when the outer local variables they access are declared as final single-element arrays, have precisely the same semantic functionality as Common Lisp closures. Your Asynchronous example is the classic example of use of such closures, in Lisp and other dynamic languages, and also in Java. And yet you're stating that inner classes aren't as powerful as "closures" because they can't be used in "synchronous" cases. Call me befuddled, because Lisp closures can't be used there either.

I think you have confused closures with macros. Closures are a semantic mechanism. Macros are syntactic manipulation. You're trying to implement in functions (with your withLock() and later auto-stream-closing examples) textbook examples, in Lisp anyway, of things that can't be properly implemented with functions. And it appears you've rediscovered the gotcha without realizing the solution. These things are trivially and cleanly implemented with macro expansion. Closures don't really have much to do with it, IMHO.
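The single-element-array idiom mentioned in the comment above can be sketched as follows. The Visitor interface and the forEach helper are hypothetical names introduced only for this illustration; the point is that the array reference is final while its one element remains mutable from inside the anonymous inner class.

```java
import java.util.Arrays;
import java.util.List;

final class ArrayCellSketch {
    /** Hypothetical callback interface for this illustration. */
    interface Visitor<T> {
        void visit(T item);
    }

    /** Hypothetical helper that calls the visitor once per element, synchronously. */
    static <T> void forEach(List<T> items, Visitor<T> visitor) {
        for (T item : items) {
            visitor.visit(item);
        }
    }

    /** Counts words of at least minLength characters by mutating a captured array cell. */
    static int countLongWords(List<String> words, final int minLength) {
        final int[] count = {0}; // final reference, mutable element
        forEach(words, new Visitor<String>() {
            public void visit(String word) {
                if (word.length() >= minLength) {
                    count[0]++; // the inner class updates the enclosing method's state
                }
            }
        });
        return count[0];
    }

    public static void main(String[] args) {
        System.out.println(countLongWords(Arrays.asList("a", "fred", "closure"), 4));
    }
}
```

What this idiom does not give the inner class is nonlocal control flow: a return inside visit returns from visit, not from countLongWords.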

Berin Loritsch said...

Closures are a great way to handle the Visitor pattern. Many of the languages that support closures right now don't necessarily provide any guidance on what to expect with multithreaded code, the assumption always being that they are executed synchronously.

I would even go so far as to suggest that closures should be implemented that way, and put the onus of thread safety on the person using them. If you refer to variables outside the closure, inside a particular method, you do so at your own risk if the call is being done asynchronously.

Neal Gafter said...

Berin: like many other people, you misrepresent the visitor pattern. It is all about double dispatch. See the GOF book.

In any case, closures, like methods and classes and other language constructs, can be used in a single-threaded or concurrent context, and it is up to the API author to provide any guarantees about synchronization or thread safety of the API. There is no reason to tie closures to purely-concurrent or purely-sequential use cases.

Lasse said...

Does anything prevent us from capturing a (synchronous) closure, e.g., by storing it in a final variable and referencing that from inside an instance of an inner class?

I.e., are closures storable?

If they are, and can thereby survive the scope they are created in, when will an error happen? Immediately when the closure is invoked, or when it tries to do something that requires being synchronous (accessing local state or doing non-local control flow operations)?

/Lasse 'call by name by any other name ... :)'

Neal Gafter said...

@Lasse: the specification provides that a runtime exception is thrown in this case. Fortunately, the problem is very easy to diagnose, as the closure is lexically written inside of a method that lets it escape.

Neal Gafter said...

@Lasse: let me be more specific. Captured variables survive - that is, their lifetime lasts as long as any code that can refer to them. The only runtime issue that can happen is when a closure attempts to return (or break or continue) from a method that is no longer running.
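The lifetime rule for captured variables can be approximated in today's Java with an inner class, as a rough sketch only: the proposed closure syntax is not used here, and the Counter interface and makeCounter method are hypothetical names. The captured cell stays alive after makeCounter has returned because the returned object still refers to it. The nonlocal return case has no inner-class equivalent and is not shown.

```java
final class CapturedLifetimeSketch {
    /** Hypothetical interface for this illustration. */
    interface Counter {
        int next();
    }

    /** Returns an object that keeps using a local variable of this method. */
    static Counter makeCounter() {
        final int[] value = {0}; // captured; outlives this method invocation
        return new Counter() {
            public int next() {
                return ++value[0];
            }
        };
    }

    /** Calls the counter three times after makeCounter has already returned. */
    static int callThreeTimes() {
        Counter counter = makeCounter();
        counter.next();
        counter.next();
        return counter.next();
    }

    public static void main(String[] args) {
        System.out.println(callThreeTimes());
    }
}
```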