Tuesday, June 30, 2009

Variance, Thy Name is Ambiguity

Previously
On This Blog...

"I love you, Generic Variance, and I want your babies RIGHT NOW!"

"I think there's something you should know about Generic Variance..."

"I can change him!"


And now, the thrilling continuation...

I've just sent my recommendation to the ECMA 335 committee regarding the generic variance problem. I present it here for your reading pleasure:


Quick Recap


The following is an example of an ambiguous circumstance involving generic variance, the very sort over which we have all lost so much sleep:

.class interface abstract I<+T> {
    .method public abstract virtual instance !T Foo ()
}

.class A {}
.class B extends A {}
.class C extends A {}

.class X implements I<class B>, I<class C> {
    .method virtual instance class B I[B].Foo () { .override I<class B>::Foo }
    .method virtual instance class C I[C].Foo () { .override I<class C>::Foo }
}

// Meanwhile, in some unsuspecting method...
I<A> i = new X ();
A a = i.Foo (); // AMBIGUITY!


Give a Runtime A Bone


To disambiguate such situations, we introduce a new custom attribute in the BCL. For the sake of example, let's call it System.PreferredImplementationAttribute. The PreferredImplementationAttribute is applied to a type and indicates which implementation should be selected by the runtime to resolve variance ambiguities. Our above definition of the type X would now look like this:

.class X implements I<class B>, I<class C> {
    .custom instance void System.PreferredImplementationAttribute::.ctor (class System.Type) = { type(I<class C>) }
    .method virtual instance class B I[B].Foo () { .override I<class B>::Foo }
    .method virtual instance class C I[C].Foo () { .override I<class C>::Foo }
}


New Rules


With the addition of this attribute, the runtime requires that any type defined in an assembly targeting the 335 5th edition runtime which implements multiple interfaces that are variants of a common generic interface MUST specify ONE AND ONLY ONE PerferredImplementationAttribute for EACH of the potentially ambiguous common interfaces, and that each such specification of a PerferredImplementationAttribute must reference an interface implemented by the type that is a legal variant of the ambiguous common interface. In other words, all possible ambiguities MUST be disambiguated by the use of PreferredImplementationAttribute custom attributes. If a type does not satisfy these rules, the runtime MUST throw a System.TypeLoadException.

As this rule only applies to assemblies targeting the new version of the runtime, old images will continue to execute without issue. If the committee prefers, the resolution of ambiguities in old types may remain unspecified, or alphabetical priority could be codified in the spec to standardize such behavior. I would be fine leaving it unspecified.


Custom Attributes vs. Metadata


Ideally, I feel disambiguation information belongs in the type metadata structure rather than a custom attribute. If the committee feels that amending the metadata specification is tenable, I would recommend doing so (though I don't have any thoughts at this time on the exact logical or physical nature of such an amendment). If, on the other hand, changing the metadata spec at this point in the game is not feasible, then a custom attribute will just have to do. I see the addition of one custom attribute type to the Base Class Library as entirely justified.


An Aside to Our Friends on the 334 Committee


As a note to language designers targeting the runtime, I personally would consider it obnoxious if developers where burdened with the manual application of such a custom attribute. C# and other languages would do well to prohibit the direct use of the custom attribute, favoring instead a special syntax to denote the preferred implementation (the "default" keyword comes to mind in the case of C#). If this committee changes the type metadata spec to include preferred implementation information (and does not introduce a custom attribute type for that purpose), then special language syntaxes will be necessary.


An Alternative


In the interest of completeness, I will describe an alternate (if similar) approach to the ambiguity resolution problem. Rather than annotate types to indicate which of their interface implementations will satisfy ambiguous calls, the preferred implementation could be denoted on a per-member basis. Referring again to our original type X, this solution would modify that type thusly:

.class X implements I<class B>, I<class C> {
    .method virtual instance class B I[B].Foo () { .override I<class B>::Foo }
    .method virtual instance class C I[C].Foo () {
        .override I<class C>::Foo
        .custom instance void System.PreferredImplementationAttribute::.ctor ()
    }
}

The member I[C].Foo is annotated with the System.PreferredImplementationAttribute, indicating that it will be selected by the runtime to fulfill otherwise ambiguous calls to I<T>.Foo. Note that in this solution the constructor to the PerferredImplementationAttribute type is parameterless. The runtime ensures that for EACH of the members of an interface which is the common variant of two or more of the interfaces implemented by a type, ONE AND ONLY ONE of the implementations for that member is flagged as "preferred."

Per-member preference definition affords developers more control but costs runtime implementers time, effort, and simplicity. I also don't envision many scenarios when developers would desire per-member control over implementation preference. I personally find this approach less tasteful than the per-interface solution but I mention it here, as I said, for completeness.


One More Thing...


There remains a situation on which there are varied opinions:

.class interface abstract I<+T> {
    .method public abstract virtual instance !T Foo ()
}

.class A {}
.class B extends A {}

.class X implements I<class A> {
    .method virtual instance class A I[A].Foo () { .override I<class A>::Foo }
}

.class Y extends X implements I<class B> {
    .method virtual instance class B I[B].Foo () { .override I<class B>::Foo }
}

// Meanwhile, in some unsuspecting method...
I<A> i = new Y ();
A a = i.Foo ();

In this situation I<A>::Foo is called on an object of type Y. There is an implementation of I<A>::Foo in Y's type hierarchy (X::I[A].Foo), but there is also an available implementation which is a legal variant of I<A> in Y itself (Y:I[B].Foo). Does the runtime favor the exact implementation, or the more derived variant implementation? I don't have strong feelings on the matter, but my slight preference is for favoring the exact implementation.

The runtime is deciding on behalf of the developer which implementation is most appropriate. It could be argued that an exact implementation, wherever it is to be found the type hierarchy, is more appropriate than a variant implementation.

Also - and this is an implementation detail which should not outweigh other considerations but may be useful to keep in mind if all other things are equal - Mono stores a type's implemented interfaces in a binary tree, meaning that finding an exact implementation is an O(log n) worst-case operation, whereas finding a legal variant interface among a type's implemented interfaces is an O(n) worst-case operation (all interfaces must be examined to see if a legal variant exists among them). I haven't heard of any way to do O(log n) (or better) lookup of variants. With such popular types as IEnumerable`1 becoming variant, the superior time complexity could make a difference.