Tuesday, August 10, 2010

Disoriented

The Tyranny of Terminology would have us discussing Object Orientation and Functional Programming, calling our code by its Patterns and Paradigms. I am going to plead ignorance of these epistemologies and just show you some C#. Feel free to know for yourself what it Really Means.

Code like this was written for Mono.Upnp. Expect news there soon.

Our story begins here:

abstract class ContentDirectory

"ContentDirectory" is UPnP parlance for a directory of content. (See the ContentDirectory:1 service template PDF. Or better still, don't.) The CD is a hierarchy of objects. Objects have descriptive class names like "object.item.audioItem.musicTrack" and "object.container.systemFolder". That gives you some idea, yes?

The CD has methods like Browse and Search. Here is the signature for Search:

string Search (string containerId,
               string searchCriteria,
               int startIndex,
               int requestCount,
               out int numberReturned,
               out int totalMatches)


I will pause to explain the return value: it is a string of XML describing the result set.

Now, in a source file not far away...

class InMemoryContentDirectory : ContentDirectory

It may not shock you to learn that an InMemoryContentDirectory keeps an in-memory collection of its object hierarchy. It has a method called GetChildren. Here is the signature:

IEnumerable<CDObject> GetChildren (string containerId)

Explanatory pause #2: CDObject is short for "Content Directory Object"; it is the root type of objects in the CD.

If you were me, then this is how you would implement InMemoryContentDirectory.Search:

totalMatches = 0;
var results = new List<CDObject> (requestCount);
foreach (var child in GetChildren (containerId)) {
    if (IsMatch (child, searchCriteria)) {
        totalMatches++;
        if (totalMatches > startIndex &&
            results.Count < requestCount)
        {
            results.Add (child);
        }
    }
}
numberReturned = results.Count;
return Serialize (results);

I may or may not have promised a theory-free post (I don't really remember now), but in any event, this is standard Object Oriented Programming fare. To state the obvious:
  • You create a list to hold the results.
  • You iterate through the children of the subject container.
  • For each match, you increment totalMatches.
  • If you have not reached the startIndex, you keep going.
  • Otherwise, unless you have already reached the requestCount, you add the matching object to the results list.
  • You return the serialized results list.
    • Which iterates through the list and serializes each object to XML, returning the whole XML string.
I have highlighted the problems with your solution in salmon. You are allocating a new List<CDObject> to hold objects which already exist in a collection somewhere. Also, you are iterating through the results twice: first in your pass through the container's children and then again to serialize them.

If you were still me then this just won't do. LINQ should come to mind but have I got a fun surprise for you: Mono.Upnp targets the .NET 2.0 profile. So how might you Query the Language without fancy INtegration?

IEnumerable<CDObject> Search (
    string containerId,
    string searchCriteria,
    int startIndex,
    int requestCount)
{
    var count = 0;
    foreach (var child in GetChildren (containerId)) {
        if (IsMatch (child, searchCriteria)) {
            count++;
            if (count > startIndex &&
                count - startIndex <= requestCount)
            {
                yield return child;
            }
        }
    }
}


string Search (string containerId,
               string searchCriteria,
               int startIndex,
               int requestCount,
               out int numberReturned,
               out int totalMatches)
{
    var serializer = new Serializer ();
    var results = Search (
        containerId, searchCriteria,
        startIndex, requestCount);
    var xml = serializer.Serialize (results);
    numberReturned = serializer.NumberReturned;
    totalMatches = 0;
    return xml;
}
    No more List<CDObject> and we only iterate through the results once (during Serializer.Serialize). The serializer counts the results for us during its iteration and exposes Serializer.NumberReturned.
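For the curious, here is a minimal, self-contained sketch of that arrangement. It is not the Mono.Upnp code: CountingSerializer, Page, and the use of plain strings in place of CDObject are all assumptions made for illustration, and the "XML" is not escaped.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

// Hypothetical serializer: counts items while it walks the sequence once.
class CountingSerializer
{
    public int NumberReturned { get; private set; }

    public string Serialize (IEnumerable<string> results)
    {
        var xml = new StringBuilder ("<results>");
        foreach (var result in results) {
            xml.Append ("<item>").Append (result).Append ("</item>");
            NumberReturned++;
        }
        return xml.Append ("</results>").ToString ();
    }
}

static class Demo
{
    // Generator: nothing in here runs until somebody enumerates the result.
    public static IEnumerable<string> Page (IEnumerable<string> children,
                                            Predicate<string> isMatch,
                                            int startIndex,
                                            int requestCount)
    {
        var count = 0;
        foreach (var child in children) {
            if (isMatch (child)) {
                count++;
                // count is 1-based here, so the page covers
                // startIndex + 1 through startIndex + requestCount.
                if (count > startIndex &&
                    count - startIndex <= requestCount)
                {
                    yield return child;
                }
            }
        }
    }

    static void Main ()
    {
        var children = new[] { "a1", "b", "a2", "a3" };
        var serializer = new CountingSerializer ();
        var results = Page (children, c => c.StartsWith ("a"), 0, 2);
        Console.WriteLine (serializer.NumberReturned); // 0, nothing enumerated yet
        Console.WriteLine (serializer.Serialize (results));
        Console.WriteLine (serializer.NumberReturned); // 2
    }
}
```

The first WriteLine is the interesting one: until Serialize pulls on the sequence, no child has been examined at all.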

    There is one big problem: totalMatches will always be 0. We know what the correct value should be (it is the count variable in our generator), but we have no way to get it out: generator methods cannot have by-reference parameters (a.k.a. "out" parameters).

    To make this solution work, we could return something fancier than plain old IEnumerable<CDObject> which would expose count through a property; let's call it TotalMatchesCount. But we could not use generators; we would have to implement IEnumerator<CDObject> by hand just like'n Ye Olde Days.

    A final caveat: TotalMatchesCount would only have the correct value after we iterate through the results in Serializer.Serialize, just as with Serializer.NumberReturned.
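A rough sketch of what such a hand-written sequence might look like follows. The SearchResults class, its constructor parameters, and the stand-in CDObject are assumptions for illustration, not Mono.Upnp types.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

// Hypothetical stand-in for the post's CDObject.
class CDObject
{
    public readonly string Title;
    public CDObject (string title) { Title = title; }
}

// Single-use sequence that is its own enumerator. It keeps counting
// matches after the page is full so the total comes out right.
class SearchResults : IEnumerable<CDObject>, IEnumerator<CDObject>
{
    readonly IEnumerator<CDObject> children;
    readonly Predicate<CDObject> isMatch;
    readonly int startIndex;
    readonly int requestCount;
    int returned;
    CDObject current;

    public SearchResults (IEnumerable<CDObject> children,
                          Predicate<CDObject> isMatch,
                          int startIndex,
                          int requestCount)
    {
        this.children = children.GetEnumerator ();
        this.isMatch = isMatch;
        this.startIndex = startIndex;
        this.requestCount = requestCount;
    }

    // Only trustworthy once MoveNext has returned false.
    public int TotalMatchesCount { get; private set; }

    public CDObject Current { get { return current; } }
    object IEnumerator.Current { get { return current; } }

    public bool MoveNext ()
    {
        while (children.MoveNext ()) {
            if (!isMatch (children.Current)) {
                continue;
            }
            TotalMatchesCount++;
            if (TotalMatchesCount > startIndex && returned < requestCount) {
                returned++;
                current = children.Current;
                return true;
            }
        }
        return false;
    }

    public void Reset () { throw new NotSupportedException (); }
    public void Dispose () { children.Dispose (); }

    public IEnumerator<CDObject> GetEnumerator () { return this; }
    IEnumerator IEnumerable.GetEnumerator () { return this; }
}

static class Demo
{
    // Returns { total before iterating, items returned, total after }.
    public static int[] Run ()
    {
        var all = new List<CDObject> ();
        foreach (var title in new[] { "a1", "b", "a2", "a3", "a4" }) {
            all.Add (new CDObject (title));
        }
        var results = new SearchResults (
            all, o => o.Title.StartsWith ("a"), 1, 2);
        var before = results.TotalMatchesCount;
        var returned = 0;
        foreach (var result in results) {
            returned++;
        }
        return new[] { before, returned, results.TotalMatchesCount };
    }

    static void Main ()
    {
        Console.WriteLine (string.Join (", ",
            Array.ConvertAll (Run (), n => n.ToString ()))); // 0, 2, 4
    }
}
```

Note the ordering hazard the caveat describes: read TotalMatchesCount before the loop and you get 0.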

    This approach frankly sucks. Alright you/me, show me your teeth!

    abstract void VisitChildren (string containerId,
                                 Action<CDObject> visitor);

    string Search (string containerId,
                   string searchCriteria,
                   int startIndex,
                   int requestCount,
                       out int numberReturned,
                   out int totalMatches)
    {
        var total = 0;
        var count = 0;
        var serializer = new Serializer ();
        VisitChildren (containerId, child => {
                if (IsMatch (child, searchCriteria)) {
                    total++;
                    if (total > startIndex &&
                    count < requestCount)
                {
                    serializer.OnResult (child);
                    count++;
                }
            }
        });
        numberReturned = count;
        totalMatches = total;
        return serializer.OnDone ();
    }

    No more IEnumerable<CDObject> GetChildren, and the implementation lives in the abstract ContentDirectory class where it works with any sort of subclass: in-memory, db-backed, web service, &c.
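To see the shape end to end, here is a self-contained sketch under some loud assumptions: ContentDirectorySketch and InMemorySketch are made-up names, objects are plain strings, matching is a substring test, and the serializer is a StringBuilder that does no XML escaping.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

abstract class ContentDirectorySketch
{
    // Subclasses push each child of the container into the visitor.
    protected abstract void VisitChildren (string containerId,
                                           Action<string> visitor);

    public string Search (string containerId,
                          string searchCriteria,
                          int startIndex,
                          int requestCount,
                          out int numberReturned,
                          out int totalMatches)
    {
        // Locals, because a lambda cannot capture out parameters.
        var total = 0;
        var count = 0;
        var xml = new StringBuilder ("<results>");
        VisitChildren (containerId, child => {
            if (child.Contains (searchCriteria)) {
                total++;
                if (total > startIndex && count < requestCount) {
                    xml.Append ("<item>").Append (child).Append ("</item>");
                    count++;
                }
            }
        });
        numberReturned = count;
        totalMatches = total;
        return xml.Append ("</results>").ToString ();
    }
}

class InMemorySketch : ContentDirectorySketch
{
    readonly Dictionary<string, List<string>> containers =
        new Dictionary<string, List<string>> ();

    public void Add (string containerId, string title)
    {
        List<string> children;
        if (!containers.TryGetValue (containerId, out children)) {
            children = new List<string> ();
            containers[containerId] = children;
        }
        children.Add (title);
    }

    protected override void VisitChildren (string containerId,
                                           Action<string> visitor)
    {
        List<string> children;
        if (containers.TryGetValue (containerId, out children)) {
            foreach (var child in children) {
                visitor (child);
            }
        }
    }
}

static class Demo
{
    static void Main ()
    {
        var cd = new InMemorySketch ();
        cd.Add ("0", "Blue Train");
        cd.Add ("0", "Kind of Blue");
        cd.Add ("0", "Giant Steps");
        cd.Add ("0", "Blue in Green");

        int returned, total;
        var xml = cd.Search ("0", "Blue", 1, 1, out returned, out total);
        Console.WriteLine (xml);      // <results><item>Kind of Blue</item></results>
        Console.WriteLine (returned); // 1
        Console.WriteLine (total);    // 3
    }
}
```

One pass, no intermediate list, and both counts are ready the moment Search returns.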

    As an exercise I want you to invent a name for this pattern which rhymes with neither "shmisitor" nor "shmobserver." Bonus points for double entendres. Then I want you to imagine a world without return values. Get back to me when your mind is blown.

    Tuesday, July 28, 2009

    Mono.Upnp Dance Party

    So it's been a while since mention was made of a certain UPnP library. What happened? First, I had various other things to do. Second, I decided to do two or three major refactorings, ditching a lot of code. Third, I moved development to github.

    What's the status?
    The status? The status, you ask?! THIS is the status! If you can't see, I am pointing at my TV. My TV which is connected to my PS3. My PS3 which is playing music from my laptop computer with WIRELESS NETWORKING! Yes friends, tonight at last, Mono.Upnp and the PS3 are doing the DANCE OF LOVE. I plug, it plays. Universally. About ten minutes ago I finally tracked down the typo responsible for a day's worth of debugging and let me tell you, Starfucker never sounded so good (and they already sound so good anyway, seriously, you should listen to them).

    What now?
    I've kept pretty quiet about the whole project because I wanted to lay all the groundwork before making too much noise. There is still work to be done on the core of the library, but now that it's working I'll start sharing more frequent updates. You can follow the project on github if you want commit-by-commit news.

    Can I help?
    Sure! But helping might be a little tricky. The solution only loads in MonoDevelop SVN, and there are certain necessary BCL fixes that require Mono from SVN too (one of them isn't even committed yet). It's not quite "checkout, compile, run," but if you're interested in helping out, I will be more than happy to get you up to speed. I wrote a TODO on the github wiki today with some stuff that needs doing. Testing is also something I will need help on. I don't have access to an XBox 360 anymore, so I'm going to need help on that front. As the library and the tools evolve, we'll need to test with as many devices as we can.

    Yeah!
    Yeah indeed! NOW DANCE!

    Thursday, July 23, 2009

    C#er

    Was chillin' with the impish abock last weekend when, all of a hullabaloo, he geniused something wonderful.

    "Behold!" he cried:

    var button = new Button {
        Label = "Push Me",
        Relief = ReliefStyle.None
    };
    button.Clicked += (o, a) => Console.WriteLine ("ouch!");


    To which I replied, "?"

    "Watch..." said he:

    var button = new Button {
        Label = "Push Me",
        Relief = ReliefStyle.None,
        Clicked +=> Console.WriteLine ("ouch!")
    };


    "?!" came my response.

    "Is not it better?"

    "Yes," quoth I, "but gentle abock, this wundercode... it doth not compile!"

    "... YET!"

    Well friends, yet is over. I am here today to tell you that yes, IT DOTH COMPILE. This is what you get when Scott forgets to pull the git repos for his real projects before a plane flight: unsolicited language features. And there are other goodies:

    As with anonymous methods via the delegate keyword, you may omit the parameters to a lambda if you aren't going to use them. This is also helpful when the delegate type has no parameters. For example:

    Func<string> myFunc = () => "blarg";


    Just look at those parentheses! Chillin' there all higgledy piggledy. They look like some unseemly ASCII art. But now, presto chango:

    Func<string> myFunc => "blarg";


    See what I did there? That's called an assignment arrow. It is better. Don't argue with me, because you're wrong.

    For my next trick, you can do the same kind of thing with lambdas and event handler registration.

    myButton.Clicked +=> Console.WriteLine ("higgledy piggledy");


    Because who ever uses the EventHandler arguments? A big, fat nobody, that's who.
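For comparison, this is roughly what an unpatched compiler already lets you get away with. The sketch uses a made-up Button class rather than a real toolkit type; only the event matters.

```csharp
using System;

// Hypothetical stand-in for a toolkit button.
class Button
{
    public event EventHandler Clicked;

    public void Click ()
    {
        var handler = Clicked;
        if (handler != null) {
            handler (this, EventArgs.Empty);
        }
    }
}

static class Demo
{
    public static int Run ()
    {
        var clicks = 0;
        var button = new Button ();

        // Lambda: both parameters must be named, used or not.
        button.Clicked += (o, a) => clicks++;

        // Anonymous method: the parameter list can be dropped entirely.
        button.Clicked += delegate { clicks++; };

        button.Click ();
        return clicks;
    }

    static void Main ()
    {
        Console.WriteLine (Run ()); // 2
    }
}
```

The delegate form is the precedent the +=> syntax borrows from: same omission, fewer keystrokes.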

    Last but not least, you can now do all of this plus regular event handler registration inside of object initializers. abocks around the world rejoice!

    There Is No Syntax Without Corner Cases


    So there is at least one possible ambiguity with this new syntax:

    class Foo {
        public void Add (Action<string> action) { ... }
        public Action<string> Bar { get; set; }
    }

    // Meanwhile, in some unsuspecting method:
    var foo = new Foo {
        Bar => Console.WriteLine ("HELP ME!")
    };


    Question: Is that an object initialization, or a collection initialization?

    Answer: It's ambiguous.

    Solution: It's an object initialization. If you want it to be a collection initialization, throw some parentheses around "Bar." This would be a good candidate for a compiler warning. And if you want to make it an unambiguous object initialization, you could do:

    var foo = new Foo {
        Bar = (_) => Console.WriteLine (
            "What does this ASCII art even mean?")
    };


    Patch


    The patch for all of this is available here. Apply to mcs, recompile, then use gmcs.exe passing -langversion:future.

    Future


    There has been on-again-off-again talk about adding non-standard language features to the C# compiler under the guard of -langversion:future. The main concern voiced is the ability to maintain such extensions. I will definitely discuss this patch with Marek and co. to see about landing it in mainline. I'll keep you up to date.

    Are You Bock Enough?


    In the meantime, I call upon manly man Aaron Bockover to make the only manly choice available: fork C# and ship the compiler. Because you're not really a serious media player until you have your own special language.

    Thursday, July 16, 2009

    Casting Call

    Type safety only gets you so far; eventually you have to cast. There are three features in the C# language which address typing: the unary cast operator and the binary "as" and "is" operators. I see people misuse these operators all the time, so here for your records are the official Best Ways to use each.

    If you want to check the type of an object and do not care about using the object as that type, use the "is" operator. For example:

    if (thing is MyType) {
        // do something which doesn't involve thing
    }

    If you want to check the type of an object and then use that object as that type, use the "as" operator and then check for null. For example:

    var my_type_thing = thing as MyType;
    if (my_type_thing != null) {
        // do something with my_type_thing
    }

    This only works for reference types since value types cannot be null. For value types, use the "is" and cast operators. For example:

    if (thing is MyValueType) {
        var my_value_type_thing = (MyValueType)thing;
        // do something with my_value_type_thing
    }

    If you know for a fact that an object is some type, use the cast operator. For example:

    var my_type_thing = (MyType)thing;
    // do something with my_type_thing

    These patterns minimize the operations performed by the runtime. This wisdom comes by way of Marek, who educated me on this a while ago. Please pass it on.
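The four idioms, gathered into one compilable sketch. The Casting class and its method names are invented for the example; string and int stand in for MyType and MyValueType.

```csharp
using System;

static class Casting
{
    // "is": only the type matters, the typed value is never needed.
    public static bool IsText (object thing)
    {
        return thing is string;
    }

    // "as" plus a null check: one type test, then use the typed reference.
    public static int LengthOrMinusOne (object thing)
    {
        var text = thing as string;
        return text != null ? text.Length : -1;
    }

    // Value type: "is" followed by a cast, which unboxes.
    public static int DoubledOrZero (object thing)
    {
        if (thing is int) {
            return (int)thing * 2;
        }
        return 0;
    }

    // Known type: cast directly. A wrong guess throws InvalidCastException.
    public static string MustBeText (object thing)
    {
        return (string)thing;
    }

    static void Main ()
    {
        Console.WriteLine (IsText ("hello"));         // True
        Console.WriteLine (LengthOrMinusOne ("abc")); // 3
        Console.WriteLine (LengthOrMinusOne (42));    // -1
        Console.WriteLine (DoubledOrZero (21));       // 42
        Console.WriteLine (MustBeText ("sure"));      // sure
    }
}
```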

    Thursday, July 9, 2009

    Dear LazyMarket

    Are you hiring? Do you know someone who is hiring? Well you're in luck! Because none other than yours truly is looking for a job. If you're interested in how great I am, send an email to lunchtimemama@gmail.com and I'll get you a copy of my resume. I look forward to hearing from you...

    Tuesday, June 30, 2009

    Variance, Thy Name is Ambiguity

    Previously
    On This Blog...

    "I love you, Generic Variance, and I want your babies RIGHT NOW!"

    "I think there's something you should know about Generic Variance..."

    "I can change him!"


    And now, the thrilling continuation...

    I've just sent my recommendation to the ECMA 335 committee regarding the generic variance problem. I present it here for your reading pleasure:


    Quick Recap


    The following is an example of an ambiguous circumstance involving generic variance, the very sort over which we have all lost so much sleep:

    .class interface abstract I<+T> {
        .method public abstract virtual instance !T Foo ()
    }

    .class A {}
    .class B extends A {}
    .class C extends A {}

    .class X implements I<class B>, I<class C> {
        .method virtual instance class B I[B].Foo () { .override I<class B>::Foo }
        .method virtual instance class C I[C].Foo () { .override I<class C>::Foo }
    }

    // Meanwhile, in some unsuspecting method...
    I<A> i = new X ();
    A a = i.Foo (); // AMBIGUITY!
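
For readers who would rather squint at C# than IL, here is an approximate equivalent. It assumes a C# 4 compiler (for the "out" variance annotation), and the Name property is added only so the outcome can be observed. Which implementation actually runs is precisely the open question, so the sketch makes no claim about it.

```csharp
using System;

interface I<out T>
{
    T Foo ();
}

class A { public virtual string Name { get { return "A"; } } }
class B : A { public override string Name { get { return "B"; } } }
class C : A { public override string Name { get { return "C"; } } }

class X : I<B>, I<C>
{
    B I<B>.Foo () { return new B (); }
    C I<C>.Foo () { return new C (); }
}

static class Demo
{
    public static string Run ()
    {
        I<A> i = new X (); // both I<B> and I<C> are convertible to I<A>
        A a = i.Foo ();    // the runtime has to pick one of the two
        return a.Name;
    }

    static void Main ()
    {
        // Prints B or C, depending on how the runtime breaks the tie.
        Console.WriteLine (Run ());
    }
}
```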


    Give a Runtime A Bone


    To disambiguate such situations, we introduce a new custom attribute in the BCL. For the sake of example, let's call it System.PreferredImplementationAttribute. The PreferredImplementationAttribute is applied to a type and indicates which implementation should be selected by the runtime to resolve variance ambiguities. Our above definition of the type X would now look like this:

    .class X implements I<class B>, I<class C> {
        .custom instance void System.PreferredImplementationAttribute::.ctor (class System.Type) = { type(I<class C>) }
        .method virtual instance class B I[B].Foo () { .override I<class B>::Foo }
        .method virtual instance class C I[C].Foo () { .override I<class C>::Foo }
    }


    New Rules


    With the addition of this attribute, the runtime requires that any type defined in an assembly targeting the 335 5th edition runtime which implements multiple interfaces that are variants of a common generic interface MUST specify ONE AND ONLY ONE PreferredImplementationAttribute for EACH of the potentially ambiguous common interfaces, and that each such specification of a PreferredImplementationAttribute must reference an interface implemented by the type that is a legal variant of the ambiguous common interface. In other words, all possible ambiguities MUST be disambiguated by the use of PreferredImplementationAttribute custom attributes. If a type does not satisfy these rules, the runtime MUST throw a System.TypeLoadException.

    As this rule only applies to assemblies targeting the new version of the runtime, old images will continue to execute without issue. If the committee prefers, the resolution of ambiguities in old types may remain unspecified, or alphabetical priority could be codified in the spec to standardize such behavior. I would be fine leaving it unspecified.


    Custom Attributes vs. Metadata


    Ideally, I feel disambiguation information belongs in the type metadata structure rather than a custom attribute. If the committee feels that amending the metadata specification is tenable, I would recommend doing so (though I don't have any thoughts at this time on the exact logical or physical nature of such an amendment). If, on the other hand, changing the metadata spec at this point in the game is not feasible, then a custom attribute will just have to do. I see the addition of one custom attribute type to the Base Class Library as entirely justified.


    An Aside to Our Friends on the 334 Committee


    As a note to language designers targeting the runtime, I personally would consider it obnoxious if developers were burdened with the manual application of such a custom attribute. C# and other languages would do well to prohibit the direct use of the custom attribute, favoring instead a special syntax to denote the preferred implementation (the "default" keyword comes to mind in the case of C#). If this committee changes the type metadata spec to include preferred implementation information (and does not introduce a custom attribute type for that purpose), then special language syntaxes will be necessary.


    An Alternative


    In the interest of completeness, I will describe an alternate (if similar) approach to the ambiguity resolution problem. Rather than annotate types to indicate which of their interface implementations will satisfy ambiguous calls, the preferred implementation could be denoted on a per-member basis. Referring again to our original type X, this solution would modify that type thusly:

    .class X implements I<class B>, I<class C> {
        .method virtual instance class B I[B].Foo () { .override I<class B>::Foo }
        .method virtual instance class C I[C].Foo () {
            .override I<class C>::Foo
            .custom instance void System.PreferredImplementationAttribute::.ctor ()
        }
    }

    The member I[C].Foo is annotated with the System.PreferredImplementationAttribute, indicating that it will be selected by the runtime to fulfill otherwise ambiguous calls to I<T>.Foo. Note that in this solution the constructor to the PreferredImplementationAttribute type is parameterless. The runtime ensures that for EACH of the members of an interface which is the common variant of two or more of the interfaces implemented by a type, ONE AND ONLY ONE of the implementations for that member is flagged as "preferred."

    Per-member preference definition affords developers more control but costs runtime implementers time, effort, and simplicity. I also don't envision many scenarios when developers would desire per-member control over implementation preference. I personally find this approach less tasteful than the per-interface solution but I mention it here, as I said, for completeness.


    One More Thing...


    There remains a situation on which there are varied opinions:

    .class interface abstract I<+T> {
        .method public abstract virtual instance !T Foo ()
    }

    .class A {}
    .class B extends A {}

    .class X implements I<class A> {
        .method virtual instance class A I[A].Foo () { .override I<class A>::Foo }
    }

    .class Y extends X implements I<class B> {
        .method virtual instance class B I[B].Foo () { .override I<class B>::Foo }
    }

    // Meanwhile, in some unsuspecting method...
    I<A> i = new Y ();
    A a = i.Foo ();

    In this situation I<A>::Foo is called on an object of type Y. There is an implementation of I<A>::Foo in Y's type hierarchy (X::I[A].Foo), but there is also an available implementation which is a legal variant of I<A> in Y itself (Y::I[B].Foo). Does the runtime favor the exact implementation, or the more derived variant implementation? I don't have strong feelings on the matter, but my slight preference is for favoring the exact implementation.

    The runtime is deciding on behalf of the developer which implementation is most appropriate. It could be argued that an exact implementation, wherever it is to be found in the type hierarchy, is more appropriate than a variant implementation.

    Also - and this is an implementation detail which should not outweigh other considerations but may be useful to keep in mind if all other things are equal - Mono stores a type's implemented interfaces in a binary tree, meaning that finding an exact implementation is an O(log n) worst-case operation, whereas finding a legal variant interface among a type's implemented interfaces is an O(n) worst-case operation (all interfaces must be examined to see if a legal variant exists among them). I haven't heard of any way to do O(log n) (or better) lookup of variants. With such popular types as IEnumerable`1 becoming variant, the superior time complexity could make a difference.

    Saturday, May 16, 2009

    Further Generic Variance Thoughts

    I was writing a type today that implements both IDictionary<TKey, TValue> and ICollection<TValue>. These interfaces require the implementation of both IEnumerable<TValue> and IEnumerable<KeyValuePair<TKey, TValue>>. In .NET 4, the IEnumerable<T> type will be covariant. This exposes my type to potential ambiguity if it is assigned to a location of type IEnumerable<object> (see the previous post for details). If I were to follow my own advice and forbid the implementation of multiple interfaces which are variants of a single common interface, this type would be illegal. So on further reflection, I have decided to amend my opinion thusly: If there are multiple interface implementations which are variants of a common interface, then there must be implicit implementations of all of the potentially ambiguous members. These public members are then selected by the runtime to satisfy otherwise ambiguous calls. The implicit member implementations need not all be for the same interface. For example, if we have some interface IFoo<out T> with members T Bar(); and T Bat(); and we have some type which implements both IFoo<string> and IFoo<Uri>, it could have the members public string Bar(){} and public Uri Bat(){}. Any call to IFoo<object>.Bar() on an object of this type will execute the IFoo<string> implementation, and IFoo<object>.Bat() will execute the Uri implementation.

    I believe that this restriction should be enforced at least at the language level (for all variant-capable languages targeting .NET), if not at the runtime level: all potentially ambiguous members must have public implementations. This resolves the ambiguity in a logical way, allows for more complex type design (which, as in the case of my type today, is desirable), and gives developers the ability to control which implementation will be selected. I think it is a Good Thing.
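
The type described above can be sketched as follows, assuming a C# 4 compiler. The class name Both, the returned strings, and the example.com URIs are invented for illustration. The sketch shows only the shape of the type; the rule that ambiguous IFoo<object> calls would land on the public members is the proposal, not something this code demonstrates.

```csharp
using System;

interface IFoo<out T>
{
    T Bar ();
    T Bat ();
}

class Both : IFoo<string>, IFoo<Uri>
{
    // Public (implicit) implementations: under the proposal these would
    // be the ones chosen for otherwise ambiguous IFoo<object> calls.
    public string Bar () { return "bar from IFoo<string>"; }
    public Uri Bat () { return new Uri ("http://example.com/bat"); }

    // The other two members stay explicit.
    string IFoo<string>.Bat () { return "bat from IFoo<string>"; }
    Uri IFoo<Uri>.Bar () { return new Uri ("http://example.com/bar"); }
}

static class Demo
{
    static void Main ()
    {
        var both = new Both ();
        Console.WriteLine (both.Bar ());              // bar from IFoo<string>
        Console.WriteLine (both.Bat ().AbsolutePath); // /bat

        // The explicit members are still reachable through their interfaces.
        Console.WriteLine (((IFoo<string>)both).Bat ());           // bat from IFoo<string>
        Console.WriteLine (((IFoo<Uri>)both).Bar ().AbsolutePath); // /bar
    }
}
```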