Thursday, April 9, 2009

Who's Afraid of Generic Variance?

Abstract


This post is an in-depth exploration of generic variance in the ECMA 335 standard and the REALLY BIG PROBLEM therewith. See the previous post on generic variance for an introduction to the topic.


The Punchline


The ECMA 335 Standard, 4th Edition, allows for nondeterministic execution. This means that in certain situations involving generic variance, the specification does not provide sufficient guidance to resolve ambiguities. As you may imagine, nondeterminism is a Very Bad Thing in computing (usually). It means that your program may run one way on .NET 3.5 and another way on .NET 4 and a third way if Richard Stallman saw his shadow in the morning.


The Solution


The ECMA 335 committee is hoping to resolve this problem in the 5th edition. Unfortunately, no clear solutions have revealed themselves. There are a number of possible approaches, all of which have pros and cons. I will be describing the problems and their possible solutions in this post. Feel free to weigh in on the matter. A robust discussion will aid the 335 group in their decision.


The Implementation Selection Problem


There are a number of increasingly pointy corner cases, all variations on the same theme: which of multiple implementations does the runtime select for execution? I will describe these corner cases in order of ascending pointiness. For each case, I will propose a deterministic solution and then demonstrate how that solution fails to address the next corner case. (Note: the deterministic solutions I propose are merely examples. Other solutions could be used.)


Overview


Before getting into the corner cases, I will provide a broad overview of the problem. Ambiguous situations arise when there are multiple implementations of one generic interface (with different type arguments). The generic interface must have some variant generic type parameter.

Vocab

The implemented types are the closed generic interface types that are actually implemented in the type hierarchy of the object in question.

The execution type is the closed generic interface type of the location to which the object is assigned. Calls made to members of this type must be resolved to an implementation among the implemented types.

An exact implementation is the implementation of an implemented type whose generic type arguments precisely match those of the execution type. For example, if type A implements the generic interface I<String>, and we assign some A to an I<String> location, then its implementation for that interface is exact.

A variant implementation is the implementation of an implemented type whose generic type arguments do not precisely match those of the execution type, but are legal variants thereof. For example, if type A implements the covariant generic interface I<T> as I<String>, and we assign some A to an I<Object> location, then its implementation for that interface is variant.
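
To make the vocabulary concrete, here is a minimal sketch, assuming C# 4-style variance syntax (the names mirror the prose above):

using System;

interface I<out T> {
    T Next();
}

class A : I<String> {
    String I<String>.Next() { return "next"; }
}

// Test code
// Exact: the location's type argument matches the implemented type's exactly.
I<String> exact = new A();

// Variant: I<String> does not match I<Object> exactly, but because T is
// covariant, I<String> is a legal variant of I<Object>.
I<Object> variant = new A();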


Case 0

Nothing to See Here

As a starting point, let us consider the following non-variant scenario which is NOT ambiguous.

interface I<T> {
    T Next();
}

class A {}

class X : I<A> {
    A I<A>.Next() {...}
}

class Y : X, I<A> {
    A I<A>.Next() {...}
}

// Test code
I<A> i = new Y();
A someA = i.Next();

Problem
There are two implementations of I<A>: one in X and one in Y. Which is used for the call to I<A>.Next()?

Solution
The implementation in type Y is used. As I said, this is not actually a problematic situation. I illustrate this case to demonstrate one of the ways the spec currently resolves potential ambiguities. In this case, the implementation in the most derived class is used. Y is more derived than X, therefore its implementation is used.
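
For the curious, here is a runnable version of this case, with Console output added as my own instrumentation:

using System;

interface I<T> {
    T Next();
}

class A {}

class X : I<A> {
    A I<A>.Next() { Console.WriteLine("X's implementation"); return new A(); }
}

class Y : X, I<A> {
    A I<A>.Next() { Console.WriteLine("Y's implementation"); return new A(); }
}

static class Program {
    static void Main() {
        I<A> i = new Y();
        A someA = i.Next(); // prints "Y's implementation"
    }
}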


Case 1

Easy Pickins

interface I<out T> {
    T Next();
}

class A {}

class B : A {}

class X : I<A> {
    A I<A>.Next() {...}
}

class Y : X, I<B> {
    B I<B>.Next() {...}
}

// Test code
I<A> i = new Y();
A someA = i.Next();

Problem
There are two implementations which suffice for I<A>: the I<A> exact implementation in X, and the I<B> variant implementation in Y. Which does the runtime select?

Solution
There are two possible solutions. We could adopt the aforementioned "most derived implementation wins" rule, in which case the runtime would pick the variant implementation in Y.

The other solution is to favor exact implementations over variant implementations, in which case the runtime would pick the exact but less derived implementation in X.

For the sake of argument, let's adopt the second solution: the runtime will select the most-derived exact implementation if one exists.
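
If you want to see what a particular runtime actually does here, the following sketch probes the choice. It requires a variance-capable compiler and runtime (C# 4 on .NET 4), and the Console output is my own instrumentation:

using System;

interface I<out T> {
    T Next();
}

class A {}

class B : A {}

class X : I<A> {
    A I<A>.Next() { Console.WriteLine("exact I<A> in X"); return new A(); }
}

class Y : X, I<B> {
    B I<B>.Next() { Console.WriteLine("variant I<B> in Y"); return new B(); }
}

static class Program {
    static void Main() {
        I<A> i = new Y();
        A someA = i.Next(); // which line prints is precisely what the spec leaves open
    }
}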


Case 2

The Decider

interface I<out T> {
    T Next();
}

class A {}

class B : A {}

class C : B {}

class X : I<B> {
    B I<B>.Next() {...}
}

class Y : X, I<C> {
    C I<C>.Next() {...}
}

// Test code
I<A> i = new Y();
A someA = i.Next();

Problem
There is no exact implementation of I<A>. There are two variant implementations: I<B> in X and I<C> in Y. Which does the runtime select?

Solution
As before, there are two options. We could choose the most derived implementation, in which case the runtime would select the I<C> implementation in Y.

The other option is to select the variant implementation with the least "distance" to the execution type. We are attempting to select an implementation for I<A>, and we have variant implementations I<B> and I<C>. B is nearer to A in the inheritance chain than C is, so we say that I<B> has less "distance" to I<A> than I<C> does, making it arguably the more relevant substitute. By this rule, we would select the less distant but less derived I<B> implementation in X.

There are problems with the second option. What if there are multiple type parameters, each with different distances? How do we calculate the total distance of the type? Such a solution could require a rather complex set of rules, complicating the specification and compliant runtimes. And at the end of the day, it is dubious whether "distance" is a meaningful selection criterion.
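
To make the "distance" idea concrete in the single-parameter case, here is a hypothetical helper. Distance is an illustrative name, not a runtime API, and it only walks class hierarchies:

using System;

static class VarianceMath {
    // Number of base-class steps from 'candidate' up to 'target',
    // or -1 if 'target' is not an ancestor of 'candidate'.
    public static int Distance(Type candidate, Type target) {
        int steps = 0;
        for (Type t = candidate; t != null; t = t.BaseType) {
            if (t == target) return steps;
            steps++;
        }
        return -1;
    }
}

// Distance(typeof(B), typeof(A)) == 1 and Distance(typeof(C), typeof(A)) == 2,
// so I<B> is "less distant" from I<A> than I<C> is.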

For the sake of argument, let us go with the first option: the runtime will select the most derived variant implementation in the absence of an exact implementation.


Case 3

Ambiguity is Scary

interface I<out T> {
    T Next();
}

class A {}

class B : A {}

class C : B {}

class X : I<B>, I<C> {
    B I<B>.Next() {...}
    C I<C>.Next() {...}
}

// Test code
I<A> i = new X();
A someA = i.Next();

Problem
There is no exact implementation for I<A> and there are two variant implementations, I<B> and I<C>, both defined in the same type: X. Which does the runtime select?

Solution
We are now approaching the point at which any selection rule is largely arbitrary. We can revisit the "least distant" option, though its problems are no fewer here. It is also unclear whether such selection behavior would be obvious to the user, who may not have taken "distance" into consideration when designing their types. On the other hand, the user who wrote this code clearly failed to take a number of things into consideration, and it is now the runtime's job to make the best possible choice. Anything is better than nondeterminism.

For the sake of argument, let us say that we select the variant implementation with the least distant type argument. If there are multiple type parameters, we take the distance from the first non-identical set of arguments.
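
To illustrate the "first non-identical set of arguments" tiebreak, here is a hypothetical multi-parameter case (the empty interface body just keeps the sketch short):

interface I<out T1, out T2> {}

class A {}

class B : A {}

class C : B {}

// Both implemented types are variant implementations of I<A, A>.
// First arguments: B vs. B -- identical, so they do not decide.
// Second arguments: B (distance 1 from A) vs. C (distance 2 from A).
// The least-distance rule therefore selects the I<B, B> implementation.
class X : I<B, B>, I<B, C> {}

// Test code
I<A, A> i = new X();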


Case 4

Sophie's Choice

interface I<out T> {
    T Next();
}

class A {}

class B : A {}

class C : A {}

class X : I<B>, I<C> {
    B I<B>.Next() {...}
    C I<C>.Next() {...}
}

// Test code
I<A> i = new X();
A someA = i.Next();

Problem
There is no exact implementation of I<A>. There are two variant implementations: I<B> and I<C>, both defined in type X. B and C are equidistant from A. What would Jesus do?

Solution
Select I<B>. Because B comes before C in the alphabet.

If you think that's heinously arbitrary, you're right! But remember our motto: anything is better than nondeterminism.

[UPDATE]
Several people have proposed selecting based on the order in which the implementations appear in the code. This would select I<B> because it is defined first. As I said, my solutions are merely example rules.


Recap


To review, our selection rules are:
  1. The most derived exact implementation wins

  2. Failing an exact implementation, the most derived variant implementation wins

  3. If there are multiple equally-derived variant implementations and no exact implementation, the implementation with the least "distance" wins. If there are multiple type parameters, the distance is taken from the first non-identical set of arguments.

  4. If there are multiple equally-derived equidistant variant implementations and no exact implementation, the implementation with alphabetical priority wins. If there are multiple type parameters, the alphabetical priority is taken from the first non-identical set of arguments.
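
For concreteness, here is a hypothetical sketch of these rules as a ranking function. Candidate, Pick, and Distance are illustrative names rather than proposed APIs, and the sketch assumes a single generic type parameter:

using System;
using System.Collections.Generic;
using System.Linq;

class Candidate {
    public Type Interface;      // the closed implemented interface, e.g. typeof(I<B>)
    public int DeclaringDepth;  // how derived the declaring class is (higher = more derived)
}

static class Selector {
    public static Candidate Pick(Type requested, List<Candidate> candidates) {
        return candidates
            .OrderByDescending(c => c.Interface == requested ? 1 : 0)   // Rules 1-2: exact beats variant
            .ThenByDescending(c => c.DeclaringDepth)                    // Rules 1-2: most derived wins
            .ThenBy(c => Distance(Arg(c.Interface), Arg(requested)))    // Rule 3: least distance wins
            .ThenBy(c => Arg(c.Interface).Name, StringComparer.Ordinal) // Rule 4: alphabetical priority
            .First();
    }

    static Type Arg(Type closed) { return closed.GetGenericArguments()[0]; }

    // Base-class steps from 'candidate' up to 'target'; MaxValue if not an ancestor.
    static int Distance(Type candidate, Type target) {
        int steps = 0;
        for (Type t = candidate; t != null; t = t.BaseType) {
            if (t == target) return steps;
            steps++;
        }
        return int.MaxValue;
    }
}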



How Arbitrary Is Too Arbitrary?


As the cases get cornerier, the solutions get arbitrarier. Somewhere around case 3, the arbitrariness begins to offend the delicate sensibilities. There are alternatives to arbitrary selection which I will describe in a moment but I would like to first emphasize the virtue of arbitrariness: it is better than nondeterminism. ANYTHING is better than nondeterminism. Eeny meeny miny moe is a step up from the current spec. If no other solution can be decided upon, egregious arbitrariness is still better than what we have.

As we consider alternatives to arbitrary selection, consider the question, "how arbitrary is too arbitrary?" For which of the above cases is one of the below alternatives a superior option? Bear it in mind as we press ahead...


Exceptions


Runtime exceptions are an alternative to arbitrary implementation selection. If an ambiguous situation arises, an exception pops up. The benefit of exceptions is that they let the developer know that there is an ambiguous situation, whereas arbitrary selection may simply lead to unexpected and difficult-to-debug behavior. There are two exceptional approaches:

Exception On Call

When an ambiguous call is made, throw a runtime exception. For example:

interface I<out T> {
    T Next();
}

class A {}

class B : A {}

class C : A {}

class X : I<B>, I<C> {
    B I<B>.Next() {...}
    C I<C>.Next() {...}
}

// Test code
I<A> i = new X();
A someA = i.Next(); // AMBIGUOUS CALL EXCEPTION HAPPENS HERE

This has the benefit of only throwing an exception if the ambiguity is actually going to be a problem. The major issue is that exceptions can be thrown from innocent code. If you pass an X to some library function which takes an I<A>, that library will blissfully call I<A>.Next() and trigger the exception. The stack trace will implicate the library, but the culprit is the X type definition.
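
To illustrate, here is a sketch of that innocent-library scenario, reusing the types above; Library, Consume, and the exception name are hypothetical:

static class Library {
    // A perfectly reasonable library method...
    public static A Consume(I<A> source) {
        return source.Next(); // ...yet the hypothetical AmbiguousCallException surfaces here
    }
}

// Test code
A someA = Library.Consume(new X()); // the stack trace points into Library; the culprit is X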

Exception On Assignment

When an object is assigned to an ambiguous interface, throw a runtime exception. For example:

interface I<out T> {
    T Next();
}

class A {}

class B : A {}

class C : A {}

class X : I<B>, I<C> {
    B I<B>.Next() {...}
    C I<C>.Next() {...}
}

// Test code
I<A> i = new X(); // AMBIGUOUS ASSIGNMENT EXCEPTION HAPPENS HERE
A someA = i.Next();

This approach runs the risk of unnecessary exceptions. If an ambiguous call is never made, an ambiguous assignment is not dangerous. This approach also allows exceptions to be thrown from innocent code. If some library takes an I<B>, an X can be passed without ambiguity. The library is then at liberty to assign the I<B> to an I<A>, resulting in an exception. The library is not at fault, but the stack trace implies otherwise.
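
Again as a sketch, reusing the types above, with Library and the exception name hypothetical:

static class Library {
    public static A Consume(I<B> source) {
        I<A> widened = source; // the hypothetical AmbiguousAssignmentException surfaces here
        return widened.Next();
    }
}

// Test code
A someA = Library.Consume(new X()); // X is a perfectly unambiguous I<B>, yet the library throws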


Invalid Ambiguity


Once you have answered the question "how arbitrary is too arbitrary," take everything on the "too arbitrary" side of the fence and make it against the rules. For example, if you conclude that there is no reasonable way of selecting between two variant implementations defined in the same type, then it would be invalid to define a type which implements multiple interfaces that are variants of a common interface.

This has the downside of invalidating currently valid CLI images. I don't know to what degree forward compatibility is a goal for the ECMA 335 spec.


Language-Level Enforcement


Irrespective of the runtime's solution to this problem, languages can impose independent restrictions on ambiguous type design. For example, C# could make it a compilation error for a type to implement multiple interfaces that are variants of a common interface, even if such a type is valid for the runtime. If all major languages enforce unambiguous type design, the pressure on the runtime to avoid arbitrariness is lessened since the dangerously arbitrary rules will not affect any code written in a major language. The only people vulnerable to the arbitrary rules are ostensibly savvy enough to be trusted with understanding obscure runtime behaviors.


The Least Worst Solution


There is clearly no silver bullet for the problem. The least worst solution will be a blend of several approaches. The appropriate blend will depend upon a number of things such as the importance of the forward compatibility of CLI images. The ultimate solution must address two concerns: the elimination of nondeterminism and the prevention of developer confusion. The same implementation must deterministically execute every time, and it must be clear to developers which implementation that will be.


My Thoughts


The implementation selection rules should begin as follows:
  1. The most derived exact implementation wins.

  2. In the absence of an exact implementation, the most derived variant implementation wins.

Situations not addressed by these rules are considered "ambiguous."

As a solution to ambiguous situations, exceptions serve neither of the stated goals. They achieve deterministic failure rather than deterministic execution and their stack traces obfuscate the root of the problem. A sufficiently descriptive error message could aid developer comprehension but it would be far too easy for a stack trace from "someone else's code" to become a bug report to the wrong people or an ignored problem. I think exceptions are the worst worst solution.

Language-level ambiguity preclusion is a very good idea, but adding a compiler error for ambiguous type design could break existing code when existing interfaces are made variant. For example, consider this C#:

class A {}

class B : A {}

class C : A {}

class X : IEnumerable<B>, IEnumerable<C> {...}

This is currently a legal C# class definition. However, with the release of .NET 4, the IEnumerable<T> interface will be made covariant, making the above class vulnerable to ambiguity. Making ambiguous type design a warning rather than an error would solve this problem, but I think it is worth breaking existing code in the interest of drawing developers' attention to the new ambiguity.
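
To see where the new ambiguity would bite, consider this sketch, assuming the covariant IEnumerable<out T> of .NET 4:

using System.Collections.Generic;

// Test code
IEnumerable<A> items = new X(); // legal once IEnumerable<T> is covariant
foreach (A a in items) {}       // which GetEnumerator() implementation runs?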

All variance-capable .NET languages should add errors for ambiguous type design.

Which brings us to the question of the runtime's ambiguity handling mechanism. The two best options are:
  1. Make ambiguities illegal.

  2. Add an arbitrary implementation selection rule: alphabetical priority of type arguments.

Option 1 achieves both objectives of a solution: it guarantees deterministic execution and ensures developer comprehension by proscribing ambiguous circumstances altogether. However, it also breaks the forward compatibility of existing binaries.

Option 2 guarantees deterministic execution but does not serve developer comprehension (no developer wants to consider the alphabetical priority of type arguments when designing their types). In conjunction with language-level enforcement, however, no developer should ever suffer this confusion. And this option affords all currently valid CLI images continued validity.

So the key question is, how important is compatibility?

I am tending toward option 1, but I think there is room for debate.


What Do You Think?


Comment away.


Next Time...


If all that weren't enough, there is a problem with variancifying existing types (such as turning IEnumerable<T> into IEnumerable<out T>) but I will save that for another post.


Thanks


Thanks goes to Dr. Nigel Perry on the ECMA 334 and 335 standards committee for working with me on this.