Tuesday, August 10, 2010

Disoriented

The Tyranny of Terminology would have us discussing Object Orientation and Functional Programming, calling our code by its Patterns and Paradigms. I am going to plead ignorance of these epistemologies and just show you some C#. Feel free to know for yourself what it Really Means.

Code like this was written for Mono.Upnp. Expect news there soon.

Our story begins here:

abstract class ContentDirectory

"ContentDirectory" is UPnP parlance for a directory of content. (See the ContentDirectory:1 service template PDF. Or better still, don't.) The CD is a hierarchy of objects. Objects have descriptive class names like "object.item.audioItem.musicTrack" and "object.container.systemFolder". That gives you some idea, yes?

The CD has methods like Browse and Search. Here is the signature for Search:

string Search (string containerId,
               string searchCriteria,
               int startIndex,
               int requestCount,
               out int numberReturned,
               out int totalMatches)


I will pause to explain the return value: it is a string of XML describing the result set.

Now, in a source file not far away...

class InMemoryContentDirectory : ContentDirectory

It may not shock you to learn that an InMemoryContentDirectory keeps an in-memory collection of its object hierarchy. It has a method called GetChildren. Here is the signature:

IEnumerable<CDObject> GetChildren (string containerId)

Explanatory pause #2: CDObject is short for "Content Directory Object"; it is the root type of objects in the CD.

If you were me, then this is how you would implement InMemoryContentDirectory.Search:

totalMatches = 0;
var results = new List<CDObject> (requestCount);
foreach (var child in GetChildren (containerId)) {
    if (IsMatch (child, searchCriteria)) {
        totalMatches++;
        if (totalMatches > startIndex &&
            results.Count < requestCount)
        {
            results.Add (child);
        }
    }
}
numberReturned = results.Count;
return Serialize (results);

I may or may not have promised a theory-free post (I don't really remember now), but in any event, this is standard Object Oriented Programming fare. To state the obvious:
  • You create a list to hold the results.
  • You iterate through the children of the subject container.
  • For each match, you increment totalMatches.
  • If you have not reached the startIndex, you keep going.
  • Otherwise, unless you have already collected requestCount results, you add the matching object to the results list.
  • You return the serialized results list.
    • Which iterates through the list and serializes each object to XML, returning the whole XML string.
I have highlighted the problems with your solution in salmon. You are allocating a new List<CDObject> to hold objects which already exist in a collection somewhere. Also, you are iterating through the results twice: first in your pass through the container's children and then again to serialize them.

If you were still me then this just won't do. LINQ should come to mind but have I got a fun surprise for you: Mono.Upnp targets the .NET 2.0 profile. So how might you Query the Language without fancy INtegration?

IEnumerable<CDObject> Search (
    string containerId,
    string searchCriteria,
    int startIndex,
    int requestCount)
{
    var count = 0;
    foreach (var child in GetChildren (containerId)) {
        if (IsMatch (child, searchCriteria)) {
            count++;
            // Yield matches numbered startIndex + 1 through
            // startIndex + requestCount (count is 1-based here).
            if (count > startIndex &&
                count - startIndex <= requestCount)
            {
                yield return child;
            }
        }
    }
}


string Search (string containerId,
               string searchCriteria,
               int startIndex,
               int requestCount,
               out int numberReturned,
               out int totalMatches)
{
    var serializer = new Serializer ();
    var results = Search (
        containerId, searchCriteria,
        startIndex, requestCount);
    var xml = serializer.Serialize (results);
    numberReturned = serializer.NumberReturned;
    totalMatches = 0;
    return xml;
}
No more List<CDObject> and we only iterate through the results once (during Serializer.Serialize). The serializer counts the results for us during its iteration and exposes Serializer.NumberReturned.
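For the curious, here is a minimal sketch of what such a counting serializer might look like. It is not the Mono.Upnp serializer, which this post does not show: the CDObject field, the XML element names, and the lack of XML escaping are all assumptions made to keep the example short.

```csharp
using System.Collections.Generic;
using System.Text;

// Hypothetical stand-in for the post's CDObject.
class CDObject
{
    public string Id;
}

// Hypothetical serializer: it walks the sequence exactly once,
// writing XML and counting the objects as it goes.
class Serializer
{
    int number_returned;

    // Only meaningful after Serialize has run.
    public int NumberReturned {
        get { return number_returned; }
    }

    public string Serialize (IEnumerable<CDObject> results)
    {
        number_returned = 0;
        // Element names are invented; escaping is omitted for brevity.
        var builder = new StringBuilder ("<results>");
        foreach (var result in results) {
            builder.Append ("<object id=\"").Append (result.Id).Append ("\"/>");
            number_returned++;
        }
        builder.Append ("</results>");
        return builder.ToString ();
    }
}
```

Because Serialize takes an IEnumerable<CDObject>, handing it the generator above means the children are filtered, counted, and written in a single pass.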

There is one big problem: totalMatches will always be 0. We know what the correct value should be (it is the final value of the count variable in our generator), but we have no way to get it out: generator methods cannot have by-reference parameters ("ref" or "out" parameters).

To make this solution work, we could return something fancier than plain old IEnumerable<CDObject> which would expose count through a property; let's call it TotalMatchesCount. But we could not use generators; we would have to implement IEnumerator<CDObject> by hand just like'n Ye Olde Days.

A final caveat: TotalMatchesCount would only have the correct value after we iterate through the results in Serializer.Serialize, just as with Serializer.NumberReturned.
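To make the cost concrete, here is a rough sketch of that hand-rolled enumerator. Everything except the TotalMatchesCount name is an assumption: the SearchResults<T> type, its constructor, and the Predicate<T> standing in for IsMatch are invented, and the type is generic only so that the example stands on its own.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

// Hypothetical result type: a lazy, paged sequence that also
// reports how many items matched in total.
class SearchResults<T> : IEnumerable<T>
{
    readonly IEnumerable<T> children;
    readonly Predicate<T> is_match;
    readonly int start_index;
    readonly int request_count;
    int total_matches_count;

    public SearchResults (IEnumerable<T> children, Predicate<T> isMatch,
                          int startIndex, int requestCount)
    {
        this.children = children;
        this.is_match = isMatch;
        this.start_index = startIndex;
        this.request_count = requestCount;
    }

    // Only correct once the sequence has been iterated to the end.
    public int TotalMatchesCount {
        get { return total_matches_count; }
    }

    public IEnumerator<T> GetEnumerator ()
    {
        total_matches_count = 0;
        return new Enumerator (this);
    }

    IEnumerator IEnumerable.GetEnumerator ()
    {
        return GetEnumerator ();
    }

    class Enumerator : IEnumerator<T>
    {
        readonly SearchResults<T> results;
        readonly IEnumerator<T> source;
        T current;
        int returned;

        public Enumerator (SearchResults<T> results)
        {
            this.results = results;
            this.source = results.children.GetEnumerator ();
        }

        public T Current { get { return current; } }
        object IEnumerator.Current { get { return current; } }

        public bool MoveNext ()
        {
            // Keep scanning after the page is full so that the
            // total match count ends up correct.
            while (source.MoveNext ()) {
                if (!results.is_match (source.Current)) {
                    continue;
                }
                results.total_matches_count++;
                if (results.total_matches_count > results.start_index &&
                    returned < results.request_count)
                {
                    current = source.Current;
                    returned++;
                    return true;
                }
            }
            return false;
        }

        public void Reset () { throw new NotSupportedException (); }
        public void Dispose () { source.Dispose (); }
    }
}
```

That is roughly seventy lines to replace a ten-line generator, and a caller who reads TotalMatchesCount before draining the sequence still gets the wrong answer.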

This approach frankly sucks. Alright you/me, show me your teeth!

abstract void VisitChildren (string containerId,
                             Action<CDObject> visitor);

string Search (string containerId,
               string searchCriteria,
               int startIndex,
               int requestCount,
               out int numberReturned,
               out int totalMatches)
{
    var total = 0;
    var count = 0;
    var serializer = new Serializer ();
    VisitChildren (containerId, child => {
        if (IsMatch (child, searchCriteria)) {
            total++;
            if (total > startIndex &&
                count < requestCount)
            {
                serializer.OnResult (child);
                count++;
            }
        }
    });
    numberReturned = count;
    totalMatches = total;
    return serializer.OnDone ();
}

No more IEnumerable<CDObject> GetChildren, and the implementation lives in the abstract ContentDirectory class where it works with any sort of subclass: in-memory, db-backed, web service, &c.
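Here is one way the pieces might fit together for the in-memory case. Treat it as a sketch under loud assumptions: the Title field, the substring-based IsMatch, the Add method, the dictionary storage, and the bodies of Serializer.OnResult and Serializer.OnDone are all made up for illustration (and XML escaping is skipped). Only the shape of VisitChildren and Search comes from the code above.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

// Hypothetical stand-in for the post's CDObject.
class CDObject
{
    public readonly string Title;
    public CDObject (string title) { Title = title; }
}

// Hypothetical push-style serializer: it is handed results one at a time.
class Serializer
{
    readonly StringBuilder builder = new StringBuilder ("<results>");

    public void OnResult (CDObject result)
    {
        // Invented element name; escaping omitted for brevity.
        builder.Append ("<object title=\"").Append (result.Title).Append ("\"/>");
    }

    public string OnDone ()
    {
        return builder.Append ("</results>").ToString ();
    }
}

abstract class ContentDirectory
{
    protected abstract void VisitChildren (string containerId,
                                           Action<CDObject> visitor);

    // Assumption: a match is a plain substring test on the title.
    static bool IsMatch (CDObject obj, string searchCriteria)
    {
        return obj.Title.Contains (searchCriteria);
    }

    public string Search (string containerId, string searchCriteria,
                          int startIndex, int requestCount,
                          out int numberReturned, out int totalMatches)
    {
        var total = 0;
        var count = 0;
        var serializer = new Serializer ();
        VisitChildren (containerId, child => {
            if (IsMatch (child, searchCriteria)) {
                total++;
                if (total > startIndex && count < requestCount) {
                    serializer.OnResult (child);
                    count++;
                }
            }
        });
        numberReturned = count;
        totalMatches = total;
        return serializer.OnDone ();
    }
}

class InMemoryContentDirectory : ContentDirectory
{
    readonly Dictionary<string, List<CDObject>> containers =
        new Dictionary<string, List<CDObject>> ();

    public void Add (string containerId, CDObject child)
    {
        List<CDObject> children;
        if (!containers.TryGetValue (containerId, out children)) {
            children = new List<CDObject> ();
            containers[containerId] = children;
        }
        children.Add (child);
    }

    // The subclass only knows how to walk its storage; paging,
    // counting, and serializing all stay in the base class.
    protected override void VisitChildren (string containerId,
                                           Action<CDObject> visitor)
    {
        List<CDObject> children;
        if (containers.TryGetValue (containerId, out children)) {
            foreach (var child in children) {
                visitor (child);
            }
        }
    }
}
```

A db-backed subclass would presumably override VisitChildren to call the visitor once per row and never touch Search at all.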

As an exercise I want you to invent a name for this pattern which rhymes with neither "shmisitor" nor "shmobserver." Bonus points for double entendres. Then I want you to imagine a world without return values. Get back to me when your mind is blown.