Monday, December 18, 2006

Closures Talk and Spec Update

This post discusses a draft proposal for adding support for closures to the Java programming language for the Dolphin (JDK 7) release. It was carefully designed to interoperate with the current idiom of one-method interfaces. The latest version of the proposal and a prototype can be found at http://www.javac.info/.

My 2006 JavaPolis talk Closures for Java has now been published - slides synchronized with a soundtrack - and can be viewed online at the new Javapolis website, Parleys.com. If you're still wondering why all the fuss about closures, I recommend you listen to the talk.

We've also just published an update to the specification, version 0.4 . The specification seems to be setling down quite a bit, as the changes are minor:

  • The throws clause of a function type is now placed between the curly braces.
  • The description of function types has been rewritten to emphasize more clearly that function types are interface types rather than a separate extension to the type system.
  • The closure conversion can now convert a closure that has no result expression to an interface type whose function returns java.lang.Void. This change helps support completion transparency. A completion transparent method is one written such that the compiler can infer that an invocation completes normally iff the closure passed to it completes normally.
  • Some examples have been modified to be completion transparent.
  • null is now a subtype of Unreachable, and a number of small related changes. Thanks to Rémi Forax for pointing out the issues.

I hope the talk, which is less technical than the specification, makes it easier for you to evaluate the proposal and compare it to the alternatives.

Friday, December 15, 2006

Javapolis 2006

I'm on my way back from Javapolis 2006, unfortunately missing the third day of the conference. One of the innovations this year was a bank of ten whiteboards where people brainstormed about the future of the Java language and platform. I had a camera with me but I realized too late that I should have taken snapshots of the contents of all those boards before I left. I hope someone else will. The only snapshots I took were the vote tallies that I discuss below.

There were three issues discussed on those boards that I'd like to share with you.

Closures

The first board contained a discussion of closures, including an informal vote of people "in favor of" and "against" adding support for closures. I gave a talk about closures yesterday afternoon, which explained the goals and design criteria, showed how they fit quite smoothly into the existing language and APIs, and more importantly explained in detail why anonymous instance creation expressions do not solve the problems closures were designed to solve. Before my talk about closures, the vote was about 55% in favor and 45% against. After my talk the vote was 71% in favor and 29% against. I don't think any "against" votes were added to the tally after my talk. There was also a BOF (Birds-Of-a-Feather) session in the evening discussing closures. Of the 20 or so attendees none admitted being against adding closures. I'm sure the fact that the BOF was scheduled opposite a free showing of Casino Royale didn't help, but I had hoped to hear from opponents more about their concerns. We discussed a number of issues and concerns, most of which had been discussed on my blog at one time or another.

One issue that I discussed with a few individuals earlier and then at the BOF was the idea of adding a statement whose purpose is to yield a result from the surrounding closure, which you could use instead of or in addition to providing a result as the last expression in the closure. It turns out that adding this feature make closures no longer suitable for expressing control abstractions. In other words, adding this feature to the language would make closures less useful! This is a very counterintuitive result. I first understood the issue by analyzing the construct using Tennent's Correspondence Principle, but it is much easier for most people to understand when the result is presented as specific examples that fail to work as expected. For now I'll leave this as an exercise to the reader, but I'll probably write more about it later. Incidentally, I believe the folks who designed Groovy got closures wrong for exactly this reason.

There was a video made of my talk that will be posted to the Javapolis website. Ted Neward also interviewed me on video, and Bill Venners interviewed me on audio. As soon as these are available on the web I'll blog pointers to them.

Native XML Support

Mark Reinhold gave a talk on adding native (i.e. language-level) support for XML into Java. Though they were not presented as such, some people prefer to think of the proposal as separable into a language extension part and an API part. The proposed APIs appear to be an improvement over the existing alternatives. However, the language extension for writing XML literals appears to be only marginally more convenient than the XML construction techniques provided by libraries in JDOM. I personally would like to see the new APIs pursued but XML creation provided in the JDOM way. Mark took a vote by show of hands on how people felt about the two issues, but I couldn't see the tally. There was also an informal tally about adding native XML support on one of the whiteboards. The result was 29% in favor and 71% against.

Constructor Invocation for Generics

Another much-discussed issue appeared on one of the whiteboards: the verbosity of creating instances of generic types. This is typical:

Map<Pair<String,Integer>,Node> map = new HashMap<Pair<String,Integer>,Node>();

The problem here is that the type parameters have to be repeated twice. One common "workaround" to this problem is to always create your generic objects using static factories, but the language should not force you to do that. A number of different syntax forms have been suggested for fixing this:

var map = new HashMap<Pair<String,Integer>,Node>();

This unfortuately requires the addition of a new keyword. Another:

final map = new HashMap<Pair<String,Integer>,Node>();

This reuses an existing keyword, but at the same time it also makes the variable final. Another variation on this idea:

map := new HashMap<Pair<String,Integer>,Node>();

In my opinion these three forms all suffer the same flaw: they place the type parameters in the wrong place. Since the variable is to be used later in the program, the programmer presumably wants control over its type and the reader wants the type to be clear from the variable declaration. In this case you probably want the variable to be a Map and not a HashMap. An idea that addresses my concern is:

Map<Pair<String,Integer>,Node> map = new HashMap();

Unfortunately, this is currently legal syntax that creates a raw HashMap. I don't know if it is possible to change the meaning of this construct without breaking backward compatibility. Another possibility:

Map<Pair<String,Integer>,Node> map = new HashMap<>();

You can see clearly by the presence of the empty angle brackets that the type parameters have been omitted where the compiler is asked to infer them. Of the alternatives, this is my favorite. I don't think it will be too hard to implement in javac using the same techniques that work for static factories of generic types.

Tuesday, December 05, 2006

Type Literals

Yesterday I wrote about super type tokens, which is an API that acts like class literals but with two advantages over class literals: they work with generic types, and they provide full generic type information. The disadvantages are that they are more verbose than class literals, and (being a different type than java.lang.Class) they are incompatible with APIs that use type tokens. It turns out that it is possible to get the best of both worlds by adding super type tokens to the language. I would call the resulting construct type literals.

The basic idea would be to retrofit java.lang.reflect.Type with generics, in the same way that java.lang.Class was retrofitted with generics in JDK5. Then the type of List<String>.class would be Type<List<String>>, and the structure of the resulting object would contain all of the generic type information that a Type is capable of expressing. The code generated by javac could very well use the trick outlined in my previous blog post, though I expect there are much more efficient ways of doing it. You would then be able to write

Type<List<String>> x = List<String>.class; 

The "type literal" on the right is currently a syntax error, but this language extension would give meaning to the syntax. How does this fit with the existing meaning of class literals? Very nicely. The type java.lang.Class already extends java.lang.reflect.Type, so a class literal of type Class<String> could be assigned to a variable of type Type<String>. In other words, the following would be legal:

Type<String> x = String.class;  

Adding support for type literals to the language and retrofitting java.lang.reflect.Type would simplify the use of the THC pattern with generic types and make existing type-token-based APIs interoperate nicely with the pattern.

Monday, December 04, 2006

Super Type Tokens

When we added generics to Java in JDK5, I changed the class java.lang.Class to become a generic type. For example, the type of String.class is now Class<String>. Gilad Bracha coined the term type tokens for this. My intent was to enable a particular style of API, which Joshua Bloch calls the THC, or Typesafe Heterogenous Container pattern. For some examples of where this is used see the APIs for annotations:

public <A extends Annotation> A java.lang.Class.getAnnotation(Class<A> annotationClass)

My earliest use of this feature (like my earliest use of all recent Java language features) appears in the compiler for the Java programming language (javac), in this case as a utility called Context, and you can find the code in the open source version. It was a utility that allowed the compiler to be written as a bunch of separate classes that all refer to each other, and solved the hard problem of getting all the parts created in an order such that they can be initialized with references to each other. The utility is also used to replace pieces of the compiler, for example to make related tools like javadoc and apt, the Annotation Processing Tool, and for testing. Today I would describe the utility as a simple dependency injection framework, but that wasn't a popular buzzword at the time.

Here is a simple but complete example of an API that uses type tokens in the THC pattern, from Josh's 2006 JavaOne talk:

public class Favorites {
    private Map<Class<?>, Object> favorites =
        new HashMap<Class<?>, Object>();
    public <T> void setFavorite(Class<T> klass, T thing) {
        favorites.put(klass, thing);
    }
    public <T> T getFavorite(Class<T> klass) {
        return klass.cast(favorites.get(klass));
    }
    public static void main(String[] args) {
        Favorites f = new Favorites();
        f.setFavorite(String.class, "Java");
        f.setFavorite(Integer.class, 0xcafebabe);
        String s = f.getFavorite(String.class);
        int i = f.getFavorite(Integer.class);
    }
}

A Favorites object acts as a typesafe map from type tokens to instances of the type. The main program in this snippet adds a favorite String and a favorite Integer, which are later taken out. The interesting thing about this pattern is that a single Favorites object can be used to hold things of many (i.e. heterogenous) types but in a typesafe way, in contrast to the usual kind of map in which the values are all of the same static type (i.e. homogenous). When you get your favorite String, it is of type String and you don't have to cast it.

There is a limitation to this pattern. Erasure rears its ugly head:

Favorites:15: illegal start of expression
f.setFavorite(List<String>.class, Collections.emptyList());
                          ^

You can't add your favorite List<String> to a Favorites because you simply can't make a type token for a generic type. This design limitation is one that a number of people have been running into lately, most recently Ted Neward. "Crazy" Bob Lee also asked me how to solve a related problem in a dependency injection framework he is developing. The short answer is that you can't do it using type tokens.

On Friday I realized you can solve these problems without using type tokens at all, using a library. I wish I had realized this three years ago; perhaps there was no need to put support for type tokens directly in the language. I call the new idea super type tokens. In its simplest form it looks like this:

public abstract class TypeReference<T> {}

The abstract qualifier is intentional. It forces clients to subclass this in order to create a new instance of TypeReference. You make a super type token for List<String> like this:

TypeReference<List<String>> x = new TypeReference<List<String>>() {};

Not quite as convenient as writing List<String>.class, but this isn't too bad. It turns out that you can use a super type token to do nearly everything you can do with a type token, and more. The object that is created on the right-hand-side is an anonymous class, and using reflection you can get its interface type, including generic type parameters. Josh calls this pattern "Gafter's Gadget". Bob Lee elaborated on this idea as follows:

import java.lang.reflect.Constructor;
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.ParameterizedType;
import java.lang.reflect.Type;
import java.util.ArrayList;
import java.util.List;

/**
 * References a generic type.
 *
 * @author crazybob@google.com (Bob Lee)
 */
public abstract class TypeReference<T> {

    private final Type type;
    private volatile Constructor<?> constructor;

    protected TypeReference() {
        Type superclass = getClass().getGenericSuperclass();
        if (superclass instanceof Class) {
            throw new RuntimeException("Missing type parameter.");
        }
        this.type = ((ParameterizedType) superclass).getActualTypeArguments()[0];
    }

    /**
     * Instantiates a new instance of {@code T} using the default, no-arg
     * constructor.
     */
    @SuppressWarnings("unchecked")
    public T newInstance()
            throws NoSuchMethodException, IllegalAccessException,
                   InvocationTargetException, InstantiationException {
        if (constructor == null) {
            Class<?> rawType = type instanceof Class<?>
                ? (Class<?>) type
                : (Class<?>) ((ParameterizedType) type).getRawType();
            constructor = rawType.getConstructor();
        }
        return (T) constructor.newInstance();
    }

    /**
     * Gets the referenced type.
     */
    public Type getType() {
        return this.type;
    }

    public static void main(String[] args) throws Exception {
        List<String> l1 = new TypeReference<ArrayList<String>>() {}.newInstance();
        List l2 = new TypeReference<ArrayList>() {}.newInstance();
    }
}

This pattern can be used to solve Ted Neward's problem, and most problems where you would otherwise use type tokens but you need to support generic types as well as reifiable types. Although this isn't much more than a generic factory interface, the automatic hook into the rich generic reflection system is more than you can get with simple class literals. With a few more bells and whistles (toString, hashCode, equals, etc) I think this is a worthy candidate for inclusion in the JDK.