Blogging my way through "C# in depth": Advanced Generics and Comparison with C++ and Java

In this post I will touch on some aspects of advanced generics mentioned in the book. The book covers some other aspects as well, like reflection, which I don't consider important enough to cover here.
First, we will discuss static fields and generics. Then we will continue with how the JIT handles generics, and finally we'll close with a comparison of C# generics with their counterparts in C++ and Java.

Static fields defined in Generics

Static fields defined in a class do not belong to any specific object of the class but to the type itself. This means that when you declare a static field, no matter how many objects you create, there is only one instance of that field, and it is shared between all objects of the class.

Now the question is: if we have a generic type that has some static fields, should these fields be shared between the different type arguments T, or should each closed type have its own?

The answer becomes obvious if we look at the class we used in the previous post:

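public class CustomList<T> : IEquatable<CustomList<T>> where T : IEqualityComparer<T>
{
    private T[] list;
    // Static field: every closed CustomList<T> type gets its own copy.
    private static IEqualityComparer<T> comparer = EqualityComparer<T>.Default;

    // ... (rest of the class as in the previous post)
}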

If the static member comparer were shared between types, all sorts of problems would occur, since a comparer for one T cannot be used for another. So of course, independent static members are the right choice. This means that for Class<T>, every closed type has its own set of static members: Class<string> and Class<int> have different and independent static fields.
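The independence of static fields can be seen with a small sketch. Counter<T> and StaticFieldDemo are hypothetical names made up for this illustration and are not from the book; the only assumption is that nothing else touches the counters before Run() is called.

```csharp
using System;

// Hypothetical generic type used only to illustrate per-closed-type statics.
public static class Counter<T>
{
    // One copy of this field exists for each closed type:
    // Counter<int>.count and Counter<string>.count are unrelated.
    private static int count;

    public static int Increment()
    {
        count++;
        return count;
    }
}

public static class StaticFieldDemo
{
    // Returns { value of the int counter, value of the string counter }.
    public static int[] Run()
    {
        Counter<int>.Increment();
        Counter<int>.Increment();
        int strings = Counter<string>.Increment(); // first touch of Counter<string>
        int ints = Counter<int>.Increment();       // third touch of Counter<int>
        return new[] { ints, strings };
    }
}
```

On a first call, Run() should give 3 for the int counter and 1 for the string counter, because incrementing Counter<int> never affects Counter<string>.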

Generic iteration (IEnumerator<T>)

Chances are you have used this interface many times without knowing it, for example whenever you traversed a generic collection in a foreach statement. For a type to be iterable in a foreach statement, it normally implements the IEnumerable interface. That interface's GetEnumerator() method returns an IEnumerator, which exposes MoveNext() and Current. What does this mean for generics? If our generic class implements IEnumerable, we can iterate over it successfully, but there is a catch. A closer look at IEnumerator shows that we have to implement this property:

Object Current { get; }

This is where the problem arises. This property has to return the current element of the collection being iterated. Our generic type holds a collection of T objects, and returning one through this property means converting it to Object (and boxing it if T is a value type)! Isn't avoiding that the reason we got generics in the first place: strong typing and no extra boxing/unboxing?
Well, the answer is the IEnumerator<T> interface. Most collection interfaces have generic counterparts, and IEnumerator<T> adds a strongly typed Current for use with generics. This is not as smooth as you'd think, though: due to a design decision in the framework, IEnumerator<T> extends the non-generic IEnumerator interface. This means that any class implementing IEnumerator<T> also has to implement IEnumerator! So in order to implement the IEnumerator<T> interface we have to implement:

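// Explicit implementation of the non-generic property; forwards to the typed one.
object IEnumerator.Current
{
    get { return Current; }
}

// Strongly typed property required by IEnumerator<T>.
// "current" is a placeholder field holding the element at the iterator's position.
public T Current { get { return current; } }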

This design decision seems to be due to backward compatibility with code written before generics existed (C# 1.0). As you can see, two properties with the same name and different return types can coexist if the non-generic one is implemented explicitly and simply forwards to the generic version.
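To put the pieces together, here is a minimal sketch of a generic collection with a hand-written enumerator. SimpleList<T> and its nested Enumerator class are hypothetical names invented for this example (they are not the CustomList<T> from the previous post), and the sketch ignores concerns such as modification during iteration.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

// Hypothetical collection used only for this sketch.
public class SimpleList<T> : IEnumerable<T>
{
    private readonly T[] items;

    public SimpleList(params T[] items)
    {
        this.items = items;
    }

    // Strongly typed enumerator, the one foreach picks up.
    public IEnumerator<T> GetEnumerator()
    {
        return new Enumerator(items);
    }

    // IEnumerable<T> extends IEnumerable, so the non-generic method is required too.
    IEnumerator IEnumerable.GetEnumerator()
    {
        return GetEnumerator();
    }

    private sealed class Enumerator : IEnumerator<T>
    {
        private readonly T[] items;
        private int position = -1; // before the first element

        public Enumerator(T[] items)
        {
            this.items = items;
        }

        // Generic version: no cast and no boxing.
        public T Current
        {
            get { return items[position]; }
        }

        // Non-generic version, implemented explicitly and forwarding to the typed one.
        object IEnumerator.Current
        {
            get { return Current; }
        }

        public bool MoveNext()
        {
            if (position < items.Length - 1)
            {
                position++;
                return true;
            }
            return false;
        }

        public void Reset()
        {
            position = -1;
        }

        // IEnumerator<T> also extends IDisposable; nothing to release here.
        public void Dispose()
        {
        }
    }
}
```

With this in place, a statement such as foreach (int x in new SimpleList<int>(1, 2, 3)) goes through the typed Current, while code written against the non-generic IEnumerable still works through the explicit implementation.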

C# Generics vs. Counterparts in C++ and Java

In C++, generics exist as templates. A template behaves somewhat like a sophisticated macro: the type parameter is a placeholder, and at compile time the compiler generates a separate copy of the code for each set of type arguments the template is used with. For example, in the code below T is replaced once by long and once by int.

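template<class T>
T min(T first, T second)
{
    // Compiles for any T that supports operator<; no constraint is declared.
    return first < second ? first : second;
}

int main(int argc, char *argv[])
{
    long l1 = 2, l2 = 4;
    int i1 = 3, i2 = 5;
    long min1 = min<long>(l1, l2);  // instantiates min with T = long
    int min2 = min<int>(i1, i2);    // instantiates min with T = int
}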

This means C++ needs no constraints on the type parameter for compile-time checks to be possible: any type can be used in that position, and when the template is instantiated the compiler checks whether every operation used is available for that type. This adds a lot of flexibility. For instance, it allows operators to be used on values of the type parameter, like the < comparison in min above. That is not possible in C#, where there is no constraint that enforces the availability of a certain operator overload. It also means the C++ compiler can optimize each instantiation for the specific types used.
.NET, however, has an optimization that C++ does not: code sharing for generics. If a generic is used in 5 different places in the code with different type arguments, the IL does not contain 5 different variations, only one generic definition that every use refers to. The JIT then creates as many native variations as are needed at execution time. The native code is shared between closed types whose type arguments are reference types and is not shared for value types. The reason is that a reference always has the same size (4 bytes on a 32-bit CLR, 8 bytes on a 64-bit one), whereas value types can have various sizes (int, long, structs, ...).
Lastly, C++ also allows template arguments that are not types at all: values such as integer constants and pointers to functions can be passed as template arguments too.
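Coming back to the operator limitation: a rough C# counterpart of the C++ min template has to express the comparison through a constraint instead of the < operator. The sketch below assumes IComparable<T> is an acceptable stand-in for the comparison; GenericMath is a hypothetical class name used only here.

```csharp
using System;

public static class GenericMath
{
    // "first < second" would not compile for an unconstrained T,
    // so the comparison goes through the IComparable<T> constraint instead.
    public static T Min<T>(T first, T second) where T : IComparable<T>
    {
        return first.CompareTo(second) < 0 ? first : second;
    }
}
```

Calls such as GenericMath.Min(3, 5) or GenericMath.Min(4L, 2L) let the compiler infer T as int or long. Unlike the C++ version, a type that defines operator< but does not implement IComparable<T> is rejected at compile time.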
I mentioned the concept of "variance" in .NET in previous posts, although I have not gone into detail as to what it is. I may do an entire post on the subject later on, but for now I'll assume that you know about the concept. Before C# 4.0, generics are strictly invariant, meaning that, for example, you cannot take a List<String> and treat it as a List<Object>. The same limitation exists in C++.
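A small sketch can make the invariance concrete. VarianceDemo is a hypothetical class written for this post; the second method assumes a C# 4.0 or later compiler, where IEnumerable<T> is declared covariant.

```csharp
using System.Collections.Generic;

public static class VarianceDemo
{
    // Checks at run time whether a List<string> can be viewed as a List<object>.
    public static bool ListOfStringIsListOfObject()
    {
        object strings = new List<string> { "a", "b" };
        // List<object> objects = new List<string>(); // would not compile: List<T> is invariant
        return strings is List<object>;
    }

    // Assumes C# 4.0 or later: IEnumerable<out T> is covariant, so this assignment compiles.
    public static int CountAsObjects()
    {
        IEnumerable<object> objects = new List<string> { "a", "b" };
        int count = 0;
        foreach (object o in objects)
        {
            count++;
        }
        return count;
    }
}
```

The first method returns false, since List<string> and List<object> are unrelated closed types. The second returns 2, because the covariant interface lets the same list be read as a sequence of objects.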
In contrast to C++, Java's generics are weaker than C#'s. Basically, Java bytecode does not know about generics at all! Generic types in Java are converted by the compiler to their non-generic equivalents (type erasure), with the necessary casts inserted. We still get compile-time type checking with them. Another very annoying limitation is that Java's built-in primitive types cannot be used as type arguments! You end up having to use the boxed versions (List<Integer> rather than List<int>), which is inefficient. One feature that Java has and that C# lacked before version 4.0, however, is generic variance (C# 4.0 introduced some variance for generics, which we'll discuss later on). Java supports generic variance using wildcards.
With this, we're finally done with generics! This has been a very long topic, and we have barely scratched the surface. Although not everything was covered here, you now know enough to delve into the details of the language specification if you're so inclined. If you have had enough already, don't worry: you will rarely need to go any more advanced than what has been discussed here.

Saturday, 10 November 2012
