Generics and Delegates in C#

Published by Marco on

Updated by Marco on

The term DRY (Don’t Repeat Yourself) has become more and more popular lately as a design principle. The idea is nothing new: it is the main principle underlying object-oriented programming. As OO programmers, we’ve gotten used to using inheritance and polymorphism to encapsulate concepts. Until recently, however, languages like C# and Java have had only very limited support for re-using functionality across larger swathes of code.[1] To illustrate this, let’s take a look at a simple class with a descendant as well as some code that deals with lists of these objects and their properties.

Let’s start with some basic definitions[2]:

class Pet
{
  public string Name
  {
    get { return _Name; }
  }
  public bool IsHouseTrained
  {
    get { return _IsHouseTrained; }
  }

  private string _Name;
  private bool _IsHouseTrained = true;
}

class Dog : Pet
{
  public void Bark() {}
}

class Owner
{
  public IList<Pet> Pets
  {
    get { return _Pets; }
  }

  private IList<Pet> _Pets = new List<Pet>();
}

This is basically boilerplate for articles about inheritance, so let’s move on to working with these classes. Imagine that the Owner wants to find all pets named “Fido”:

IList<Pet> FindPetsNamedFido()
{
  IList<Pet> result = new List<Pet>();
  foreach (Pet p in Pets)
  {
    if (p.Name == "Fido")
    {
      result.Add(p);
    }
  }
  return result;
}

Again, no surprises yet. This is a standard loop in C#, using the foreach construct and generics to loop through the list in a type-safe manner. Applying the DRY principle, however, we see that we’re going to end up writing a lot of these loops, especially if we offer a lot of different ways of analyzing data in the list of pets. Essentially, the code above is a completely standard loop except for the condition: the (p.Name == "Fido") part. We can then imagine a function with the following form:

IList<Pet> FindPets(??? condition)
{
  IList<Pet> result = new List<Pet>();
  foreach (Pet p in Pets)
  {
    if (condition(p))
    {
      result.Add(p);
    }
  }
  return result;
}

Introducing Delegates

Now we need to figure out what type condition has. From the function body, we see that it takes a parameter of type Pet and returns a bool value. In C#, a type that describes a function signature is called a delegate and is declared with the keyword of the same name; for the signature above, we write:

delegate bool MatchesCondition(Pet item);

As mentioned above, the return type is a bool, the single parameter is of type Pet, and the delegate is identified by the name MatchesCondition. The name of the parameter is purely for documentation. We can then rewrite the function signature above using the delegate we just defined:

IList<Pet> FindPets(MatchesCondition condition) {…}

We’ve managed to move the looping code for many common situations into a shared method. Now, how do we use it? We originally wanted to find all pets named “Fido”, so we need to define a function that does just that, matching the function signature defined by MatchesCondition:

bool IsNamedFido(Pet p)
{
  return p.Name == "Fido";
}

In this fashion, we can write any number of methods that check various conditions on Pets. To use one of these methods, we simply pass it to the shared FindPets method, like this:

IList<Pet> petsNamedFido = FindPets(IsNamedFido);
IList<Pet> petsNamedRex = FindPets(IsNamedRex);
IList<Pet> houseTrainedPets = FindPets(IsHouseTrained);
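To see the pieces working together, here is a minimal, self-contained sketch of the delegate-based version. The Pet constructor, the static IsNamedFido and the DelegateDemo class are assumptions added only so that the example can be compiled and run on its own; they are not part of the definitions above.

```csharp
using System;
using System.Collections.Generic;

// Delegate type from the text: any method taking a Pet and returning bool.
public delegate bool MatchesCondition(Pet item);

public class Pet
{
  // Assumption: a constructor, so that the sample can create named pets.
  public Pet(string name) { _Name = name; }
  public string Name { get { return _Name; } }
  private string _Name;
}

public class Owner
{
  public IList<Pet> Pets { get { return _Pets; } }

  // The shared loop; only the condition varies from call to call.
  public IList<Pet> FindPets(MatchesCondition condition)
  {
    IList<Pet> result = new List<Pet>();
    foreach (Pet p in Pets)
    {
      if (condition(p))
      {
        result.Add(p);
      }
    }
    return result;
  }

  // Assumption: static and public so the demo class below can pass it in.
  public static bool IsNamedFido(Pet p) { return p.Name == "Fido"; }

  private IList<Pet> _Pets = new List<Pet>();
}

public static class DelegateDemo
{
  public static void Main()
  {
    Owner owner = new Owner();
    owner.Pets.Add(new Pet("Fido"));
    owner.Pets.Add(new Pet("Rex"));
    owner.Pets.Add(new Pet("Fido"));

    // The method name alone is converted to a MatchesCondition delegate.
    IList<Pet> fidos = owner.FindPets(Owner.IsNamedFido);
    Console.WriteLine(fidos.Count); // prints 2
  }
}
```

The method group Owner.IsNamedFido is converted to a MatchesCondition instance automatically; no explicit new MatchesCondition(...) is required.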

Anonymous Methods

This is better than the previous situation, in which we would have repeated the loop again and again, but we can do better. The problem with this solution is that it tends to clutter the class (Owner in this case) with many little methods that are useful only in conjunction with FindPets. Even if the methods are private, it’s a shame to have to declare a full-fledged method just to hand over a piece of code to be called. The C# designers thought so too, so they added anonymous methods, which have a parameter list and a body, but no name. Using anonymous methods, we can replace the methods IsNamedFido, IsNamedRex and IsHouseTrained with the following code:

IList<Pet> petsNamedFido = FindPets(delegate(Pet p) { return p.Name == "Fido"; });
IList<Pet> petsNamedRex = FindPets(delegate(Pet p) { return p.Name == "Rex"; });
IList<Pet> houseTrainedPets = FindPets(delegate(Pet p) { return p.IsHouseTrained; });

Again, the keyword delegate introduces a parameter list and body for the anonymous method.

Generic Functions

All of the code above uses the generic IList and List classes. None of the looping code in FindPets is dependent on the type of the list element except for the condition. It would be really nice if we could re-use this code not just for Pets, but for any collection of elements. Generic functions to the rescue. A generic function has one or more generic parameters, which can be used throughout the parameter list and implementation body. The first step in making FindPets fully generic is to change the definition of MatchesCondition:

delegate bool MatchesCondition<T>(T item);

As with a generic class, the delegate’s generic parameters appear within angle brackets after the identifier; in this case, the single generic parameter is named T. T has also replaced Pet as the type of the parameter. In order to finish making FindPets fully generic, we’ll have to pass it a list to work with (right now it always uses Pets) and change the name, so as to avoid confusion:

IList<T> FindItems<T>(IList<T> list, MatchesCondition<T> condition)
{
  IList<T> result = new List<T>();
  foreach (T item in list)
  {
    if (condition(item))
    {
      result.Add(item);
    }
  }
  return result;
}

We’re not quite done yet, though. If you look closely at the function body, all it does is enumerate the items in the parameter list. Therefore, we can loosen the type-constraint of the parameter from IList to IEnumerable, so that it can be called with any collection from all of .NET.

IList<T> FindItems<T>(IEnumerable<T> list, MatchesCondition<T> condition) {…}

And … we’re done. Fully generic! Let’s see how that looks using the examples from above:

IList<Pet> petsNamedFido = FindItems<Pet>(Pets, delegate(Pet p) { return p.Name == "Fido"; });
IList<Pet> petsNamedRex = FindItems<Pet>(Pets, delegate(Pet p) { return p.Name == "Rex"; });
IList<Pet> houseTrainedPets = FindItems<Pet>(Pets, delegate(Pet p) { return p.IsHouseTrained; });

Though we’ve lost something in legibility, we’ve gained quite a bit in re-use. Imagine now that an Owner also has a list of Vehicles, a list of Properties and a list of Relatives. You only have to write the conditions themselves and you can search any type of container for items matching any condition … all in a statically type-safe manner:

IList<Pet> petsNamedFido = FindItems<Pet>(Pets, delegate(Pet p) { return p.Name == "Fido"; });
IList<Vehicle> redCars = FindItems<Vehicle>(Vehicles, delegate(Vehicle v) { return (v is Car) && (((Car)v).Color == Red); });
IList<Property> bigLand = FindItems<Property>(Properties, delegate(Property p) { return p.Acreage >= 1000; });
IList<Relative> deadBeats = FindItems<Relative>(Relatives, delegate(Relative r) { return r.MoneyOwed > 0; });
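The following sketch shows the fully generic version running against two unrelated collection types, an array and a Queue, neither of which has anything to do with Pet. The Finder and GenericDemo class names are assumptions chosen for this example.

```csharp
using System;
using System.Collections.Generic;

public delegate bool MatchesCondition<T>(T item);

public static class Finder
{
  // Works with anything enumerable, since the body only iterates.
  public static IList<T> FindItems<T>(IEnumerable<T> list, MatchesCondition<T> condition)
  {
    IList<T> result = new List<T>();
    foreach (T item in list)
    {
      if (condition(item))
      {
        result.Add(item);
      }
    }
    return result;
  }
}

public static class GenericDemo
{
  public static void Main()
  {
    // Arrays implement IEnumerable<T>, so they can be searched directly.
    int[] numbers = { 1, 2, 3, 4, 5, 6 };
    IList<int> evens = Finder.FindItems<int>(numbers, delegate(int n) { return n % 2 == 0; });

    // So does Queue<T>, which offers no search method of its own.
    Queue<string> names = new Queue<string>();
    names.Enqueue("Fido");
    names.Enqueue("Rex");
    IList<string> shortNames = Finder.FindItems<string>(names, delegate(string s) { return s.Length <= 3; });

    Console.WriteLine(evens.Count);      // prints 3
    Console.WriteLine(shortNames.Count); // prints 1
  }
}
```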

Note: C# 2.0 offers this functionality in the .NET library for both the List and Array classes. In the official version, MatchesCondition is called Predicate and FindItems is called FindAll. It is not known why these functions don’t apply to all collections, as illustrated in our example.
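For comparison, here is a short sketch of the library equivalents mentioned in the note: List<T>.FindAll and Array.FindAll, both of which take a Predicate<T>. The BuiltInDemo class and its ShortNames helper are assumptions made for this example.

```csharp
using System;
using System.Collections.Generic;

public static class BuiltInDemo
{
  // Hypothetical helper wrapping List<T>.FindAll with an anonymous method.
  public static List<string> ShortNames(List<string> names)
  {
    return names.FindAll(delegate(string s) { return s.Length <= 3; });
  }

  public static void Main()
  {
    List<string> names = new List<string>();
    names.Add("Fido");
    names.Add("Rex");
    names.Add("Bo");
    List<string> shortNames = ShortNames(names);

    // The static Array.FindAll does the same job for arrays.
    int[] numbers = { 1, 2, 3, 4, 5, 6 };
    int[] evens = Array.FindAll<int>(numbers, delegate(int n) { return n % 2 == 0; });

    Console.WriteLine(shortNames.Count); // prints 2
    Console.WriteLine(evens.Length);     // prints 3
  }
}
```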

Extension Methods

Can we do something about the legibility of the solution from the last section? In C# 2.0, we’ve reached the end of the line. If you’ve been following the development of “Orcas” and C# 3.0 (which ships with .NET 3.5), you might have heard of extension methods[3], which allow you to extend existing classes with new functions without inheriting from them. Let’s extend any IEnumerable with our find function:

public static class MyVeryOwnExtensions
{
    public static IList<T> FindItems<T>(this IEnumerable<T> list, MatchesCondition<T> condition)
    {
      // implementation from above
    }
}

The keyword this in the parameter list above indicates to the compiler that FindItems is an extension method for the type following it: IEnumerable<T>. Now, we can call FindItems with a bit more legibility and clarity, dropping both the generic parameter and the actual argument (Pet and Pets, respectively) and replacing them with a method call on Pets directly.

IList<Pet> petsNamedFido = Pets.FindItems(delegate(Pet p) { return p.Name == "Fido"; });
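Spelled out in full, the extension method might look like the sketch below. It needs a C# 3.0 compiler. The list of strings stands in for the list of pets, and the ExtensionDemo class is an assumption added so that the sample runs on its own.

```csharp
using System;
using System.Collections.Generic;

public delegate bool MatchesCondition<T>(T item);

public static class MyVeryOwnExtensions
{
  // "this" on the first parameter makes FindItems callable as list.FindItems(...).
  public static IList<T> FindItems<T>(this IEnumerable<T> list, MatchesCondition<T> condition)
  {
    IList<T> result = new List<T>();
    foreach (T item in list)
    {
      if (condition(item))
      {
        result.Add(item);
      }
    }
    return result;
  }
}

public static class ExtensionDemo
{
  public static void Main()
  {
    List<string> petNames = new List<string>();
    petNames.Add("Fido");
    petNames.Add("Rex");

    // T is inferred as string from the collection the method is called on.
    IList<string> fidos = petNames.FindItems(delegate(string s) { return s == "Fido"; });
    Console.WriteLine(fidos.Count); // prints 1
  }
}
```

The extension method remains an ordinary static method, so MyVeryOwnExtensions.FindItems(petNames, ...) is still a valid way to call it.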

Contravariance

For brevity’s sake, the examples in this section assume use of the extension method defined above. To use the examples with C# 2.0, simply rewrite them to use the non-extended syntax.

We use anonymous methods to avoid declaring methods that will be used for one-off calculations. However, larger methods or methods that are reused throughout a class properly belong to the class as full-fledged methods. At the top, we defined a descendant of the Pet class called Dog. Imagine that each Owner has not only a list of Pets, but also a list of Dogs. Then we’d like to bring back our IsNamedFido method in order to be able to apply it against both lists (copied from above):

bool IsNamedFido(Pet p)
{
  return p.Name == "Fido";
}

Now we can use this method to test against lists of pets or lists of dogs:

IList<Pet> petsNamedFido = Pets.FindItems(IsNamedFido);
IList<Dog> dogsNamedFido = Dogs.FindItems(IsNamedFido);

The example above illustrates an interesting property of delegates, called contravariance. Because of this property, we can use IsNamedFido—which takes a parameter of type Pet—when calling FindItems<Dog>. That means that IsNamedFido can be used with any list containing objects descended from Pet. Unfortunately, contravariance only applies in this very special case; the type of dogsNamedFido cannot be IList<Pet> because IList<Dog> does not conform to IList<Pet>.[4]

However, this courtesy extends only to named methods. If we wanted to replace the reference to IsNamedFido with an anonymous method, we’d be forced to specify the exact type for the parameter, as shown below:

IList<Dog> dogsNamedFido = Dogs.FindItems(delegate(Dog d) { return d.Name == "Fido"; });

Using Pet as the parameter type in the anonymous method does not compile, even though it is simply an in-place reformulation of the previous example. Enforcing the constraint here does not restrict the expressiveness of the language in any way, but it’s interesting to note that the compiler relaxes the rule against contravariance only when it absolutely has to.
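The two cases can be compared side by side in the following sketch. It uses the non-extension form of FindItems with explicit type arguments so that it also compiles under C# 2.0. The constructors, the Finder class and the ContravarianceDemo class are assumptions made for this example.

```csharp
using System;
using System.Collections.Generic;

public delegate bool MatchesCondition<T>(T item);

public class Pet
{
  // Assumption: a constructor, so that the sample can create named pets.
  public Pet(string name) { _Name = name; }
  public string Name { get { return _Name; } }
  private string _Name;
}

public class Dog : Pet
{
  public Dog(string name) : base(name) { }
}

public static class Finder
{
  public static IList<T> FindItems<T>(IEnumerable<T> list, MatchesCondition<T> condition)
  {
    IList<T> result = new List<T>();
    foreach (T item in list)
    {
      if (condition(item))
      {
        result.Add(item);
      }
    }
    return result;
  }
}

public static class ContravarianceDemo
{
  // Declared for Pet, yet usable wherever a MatchesCondition<Dog> is expected.
  public static bool IsNamedFido(Pet p) { return p.Name == "Fido"; }

  public static void Main()
  {
    List<Dog> dogs = new List<Dog>();
    dogs.Add(new Dog("Fido"));
    dogs.Add(new Dog("Rex"));

    // Named method: the Pet parameter is accepted for a list of Dogs.
    IList<Dog> fidos = Finder.FindItems<Dog>(dogs, IsNamedFido);

    // Anonymous method: the parameter type must be exactly Dog.
    IList<Dog> rexes = Finder.FindItems<Dog>(dogs, delegate(Dog d) { return d.Name == "Rex"; });

    Console.WriteLine(fidos.Count); // prints 1
    Console.WriteLine(rexes.Count); // prints 1
  }
}
```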

Closures

In the previous section, we created a method, IsNamedFido, instead of using an anonymous method to avoid duplicate code. In that spirit, suppose we further believe that having a name-checking function that checks a constant is also not generalized enough[5]. Suppose we write the following function instead:

bool IsNamed(Pet p, string name)
{
  return p.Name == name;
}

Unfortunately, there is no way to call this method directly because it takes two parameters and doesn’t match the signature of MatchesCondition (and even contravariance won’t save us). You can, however, drop back to using a combination of the defined method and an anonymous method:

IList<Pet> petsNamedFido = Pets.FindItems(delegate (Pet p) { return IsNamed(p, "Fido"); });

This version is a good deal less legible, but serves to show how you can at least pack most of the functionality away into a named method, repeating as little as possible in the anonymous method. Even if the anonymous method uses local or instance variables, those are packed up with the delegate. Note that C# captures the variables themselves rather than copies of their values, so the delegate sees whatever values the variables hold at the time it is called.
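It is worth checking exactly what an anonymous method captures. The sketch below changes a local variable after the delegate has been created and then calls the delegate again; in C#, the delegate observes the change, because the variable itself is captured. The ClosureDemo class and its Run method are assumptions made for this example.

```csharp
using System;

public delegate bool MatchesCondition<T>(T item);

public static class ClosureDemo
{
  // Returns the delegate's answer before and after the captured variable changes.
  public static bool[] Run()
  {
    string wanted = "Fido";
    MatchesCondition<string> matches = delegate(string s) { return s == wanted; };

    bool before = matches("Fido"); // wanted is still "Fido"
    wanted = "Rex";                // modify the captured variable
    bool after = matches("Fido");  // the delegate now compares against "Rex"

    return new bool[] { before, after };
  }

  public static void Main()
  {
    bool[] results = Run();
    Console.WriteLine(results[0]); // prints True
    Console.WriteLine(results[1]); // prints False
  }
}
```

To freeze a value at creation time, copy it into a fresh local variable that nothing else modifies and capture that copy instead.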

For comparison, Java does not support proper closures, requiring final hacks and creation of anonymous classes in order to perform the task outlined above. Various proposals aim to extend Java in this direction, but, as of version 6, none have yet found their way into the language specification.

Agents

On a final note, it would be nice to have a cleaner notation for formulating the method call above—in which additional parameters to a function must be collected manually into an anonymous method. The Eiffel programming language offers such an alternative, calling their delegates agents instead[6]. The conformance rules for agents for a method signature like MatchesCondition<T> are different, requiring not that the signature match perfectly, but only that all non-conforming parameters be provided at the time the agent is created.

Eiffel uses question marks to indicate where actual arguments are to be mapped to the agent, so in pseudo-C# syntax, the method call above would be written as:

IList<Pet> petsNamedFido = Pets.FindItems(agent IsNamed(?, "Fido"));

This is much more concise and expressive than the C# version. It differs enough from an actual function call—through the rather obvious and syntax-highlightable keyword, agent—but not so much as to suggest an entirely different mechanism. The developer is made aware that it’s not a regular method call, but a delayed one. C# could easily implement such a feature as pure syntactic sugar, compiling the agent expression to the previous formulation automatically. Perhaps in C# 4.0?
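Until such syntax exists, one way to approximate an agent in C# is a small factory method that binds the extra argument and returns a delegate with the remaining parameter left open. The sketch below is only an illustration of that idea: the Conditions class, its Named method, the Pet constructor and the Finder and AgentDemo classes are all assumptions made for this example.

```csharp
using System;
using System.Collections.Generic;

public delegate bool MatchesCondition<T>(T item);

public class Pet
{
  // Assumption: a constructor, so that the sample can create named pets.
  public Pet(string name) { _Name = name; }
  public string Name { get { return _Name; } }
  private string _Name;
}

public static class Conditions
{
  // Hypothetical helper: fixes "name" now and leaves the Pet argument open,
  // much like agent IsNamed(?, "Fido") in the pseudo-syntax above.
  public static MatchesCondition<Pet> Named(string name)
  {
    return delegate(Pet p) { return p.Name == name; };
  }
}

public static class Finder
{
  public static IList<T> FindItems<T>(IEnumerable<T> list, MatchesCondition<T> condition)
  {
    IList<T> result = new List<T>();
    foreach (T item in list)
    {
      if (condition(item))
      {
        result.Add(item);
      }
    }
    return result;
  }
}

public static class AgentDemo
{
  public static void Main()
  {
    List<Pet> pets = new List<Pet>();
    pets.Add(new Pet("Fido"));
    pets.Add(new Pet("Rex"));

    IList<Pet> fidos = Finder.FindItems<Pet>(pets, Conditions.Named("Fido"));
    Console.WriteLine(fidos.Count); // prints 1
  }
}
```

The cost is one factory method per condition, which is exactly the boilerplate that a built-in agent syntax would remove.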

All in all, though, C#’s support for generics and closures and DRY programming is eminently useful and looks only to improve in upcoming versions like C# 3.0, which arrives alongside LINQ and introduces inferred typing, a mechanism that will improve legibility and expressiveness dramatically.


[1] This article covers ways of statically checking code validity, so dynamically typed languages, like Smalltalk, Ruby or Python, while providing the same functionality, don’t apply because they can’t verify correctness at compile-time. On the other hand, there are languages—like Eiffel, which has had generics from the very beginning, but never really caught on (though it now runs under .NET) or C++, which has the powerful STL, but is horrifically complex for general use—which have offered some or all of the features discussed in this article for quite some time now.
[2] The notation is C# 2.0, which does not yet support automatic properties.
[3] As described in New “Orcas” Language Feature: Extension Methods by Scott Guthrie
[4] This reduces the expressiveness of the language, but C# forbids this because it cannot statically prevent incorrect objects from being added to the resulting list. Building on the example above, if we assume a class Cat also descended from Pet, it would then be possible to do the following:

IList<Pet> dogsNamedFido = Dogs.FindItems(IsNamedFido);
dogsNamedFido.Add(new Cat());

This would cause a run-time error because the actual instance attached to dogsNamedFido can only contain Dogs. Instead of adding run-time checking for this special case and enhancing the expressiveness of the language—as Eiffel or Scala, for example, do—C# forbids it entirely, as does Java.
 

[5] For the irony-impaired: yes, that was sarcasm.
[6] For more information on the Eiffel feature, see Agents in the online manual.

For further information, the articles, Generic type parameter variance in the CLR and Using ConvertAll to Imitate Native Covariance/Contravariance in C# Generics, are also useful. For more information on closures in C#, see C#: Anonymous methods are not closures and The Power of Closures in C#.