Tuesday, 17 September 2013

LINQ

In this post I will talk about LINQ as a whole. There are books written about this subject and you can't really explain it all in one post; it would need its own series of posts. I haven't looked through all the nooks and crannies of it either, and you don't really have to in order to be productive with it. What you have to know is how and when to use it and, more importantly, how to use it efficiently.
This is basically where all the things we talked about in the previous posts come together. Ideas and features like lambda expressions, delegates, deferred execution and extension methods all go hand in hand to give us what we will see below.

In order for us to get started, we need a set of classes to actually run the queries on. The following shows the classes that we will be using. We will fill in the objects with some sample data using the object initializers we learned about in previous posts:

class Book
{
    public string Name { get; set; }
    public string Genre { get; set; }
    public decimal Price { get; set; }
    public override string ToString()
    {
        return Name;
    }
}
class Library
{
    public string Name { get; set; }
    public List<Book> AllBooks { get; set; }
    public List<Member> AllMembers { get; set; }
    public override string ToString()
    {
        return Name;
    }
}
class Member
{
    public string Name { get; set; }
    public string Address { get; set; }
    public short Age { get; set; }
    public override string ToString()
    {
        return Name;
    }
}
class Reservation
{
    public Library InLibrary { get; set; }
    public DateTime StartDate { get; set; }
    public DateTime EndDate { get; set; }
    public Book ReservedBook { get; set; }
    public Member MemberWhoTookIt { get; set; }
}
...
..
.
Library newLibrary = new Library
{
    Name = "Biggest Library Ever!!",
    AllBooks = new List<Book>
    {
        new Book { Name = "When I was at the gym", Genre = "Horror", Price=1000},
        new Book { Name = "Hills can run", Genre = "Comedy", Price = 200},
        new Book { Name = "Inspector Bandex", Genre = "Romantic", Price = 500}
    },
    AllMembers = new List<Member>
    {
        new Member { Name = "John", Address = "Saturn", Age = 23},
        new Member { Name = "Joon", Address = "Sun", Age = 1500},
        new Member { Name = "Jeen", Address = "Moon", Age = 2}
    }
};

List<Reservation> allReservations = new List<Reservation>();
allReservations.Add(new Reservation
{
    InLibrary = newLibrary,
    // DateTime.MinValue serves as a fixed, known start date that the queries below can match on.
    StartDate = DateTime.MinValue,
    EndDate = DateTime.MinValue.Add(new TimeSpan(2, 0, 0, 0)),
    MemberWhoTookIt =
        newLibrary.AllMembers.First(member => member.Name.Equals("John")),
    ReservedBook =
        newLibrary.AllBooks.First(book => book.Name.Equals("Inspector Bandex"))
});

allReservations.Add(new Reservation
{
    InLibrary = newLibrary,
    StartDate = DateTime.MinValue,
    EndDate = DateTime.MinValue.Add(new TimeSpan(1, 0, 0, 0)),
    MemberWhoTookIt =
        newLibrary.AllMembers.First(member => member.Name.Equals("Jeen")),
    // Look the book up by genre: no book is named "Romantic", so matching on Name would make First() throw.
    ReservedBook =
        newLibrary.AllBooks.First(book => book.Genre.Equals("Romantic"))
});

public static void PrintContent(object obj)
{
    Console.WriteLine("--------");
    PropertyInfo[] properties = 
        obj.GetType().GetProperties(BindingFlags.Instance | BindingFlags.Public);
    foreach(var property in properties)
        Console.WriteLine(String.Format("{0} : {1}", 
                          property.Name, 
                          property.GetValue(obj)));
}

As you see above, I have created a Library class that holds certain members and books. The Reservation class holds the reservations for the library, its members and its books. I have also added a PrintContent method which uses reflection to print out all the public properties of the object it receives. Don't worry about how PrintContent works, as this post is not really about reflection. I will go ahead and write the first query on this setup:

var query = from reservation in allReservations
            where reservation.StartDate.Equals(DateTime.MinValue)
            select reservation;

foreach (var item in query)
    PrintContent(item);
            
Console.ReadKey();

The code above would have the following output:

--------
InLibrary : Biggest Library Ever!!
StartDate : 0001-01-01 12:00:00 AM
EndDate : 0001-01-03 12:00:00 AM
ReservedBook : Inspector Bandex
MemberWhoTookIt : John
--------
InLibrary : Biggest Library Ever!!
StartDate : 0001-01-01 12:00:00 AM
EndDate : 0001-01-02 12:00:00 AM
ReservedBook : Inspector Bandex
MemberWhoTookIt : Jeen

You can definitely tell what the query is doing by looking at the outcome and by basically reading it! Just read it as plain English: I'm trying to find all reservations in the allReservations collection where the StartDate of the reservation has a certain value. Now you may wonder what the "select" clause is for. We will get to that soon. But first let's see what exactly happens when you write those lines of code.

Firstly, you will notice that I have used the "var" keyword. This keyword, if you remember, asks the compiler to infer the type itself. Let's just say for now that finding the actual type of a LINQ query can sometimes be complicated and, more importantly, it's usually not much of a concern. That is why you would usually see the var keyword used instead of the actual type the query returns.

Another point to remember is that the LINQ query we have written and assigned to the query local variable is just a description of operations to be performed later. In other words, by assigning it to the variable we are not doing any processing on the collection yet. For LINQ to Objects, which is what we are using here, the query is represented as a chain of iterator objects that hold on to the delegates created from our lambda expressions, and that chain only does its work when the query is actually enumerated (remember deferred execution and the yield return statement?). Expression trees come into play with IQueryable providers such as LINQ to SQL, where the query has to be analyzed and translated instead of executed directly. The process of transforming the query expression into method calls is completely mechanical, in the sense that the compiler doesn't try to do any kind of optimization at this point. The statement is simply transformed into a series of method calls. For example, the above statement would be translated to the following:

var query = from reservation in allReservations
            where reservation.StartDate.Equals(DateTime.MinValue)
            select reservation;

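// The trailing .Select(reservation => reservation) is shown to make the shape of the
// translation visible. The compiler actually omits such an identity projection when it
// follows another clause, so the generated code ends at Where(...).
allReservations.Where(reservation 
                          => 
                      reservation.StartDate.Equals(DateTime.MinValue))
               .Select(reservation => reservation)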
               
Now if you are familiar with SQL and you had a nagging feeling as to why this query is written in reverse (instead of SELECT * FROM ...), here is the answer: as you can see, the order in which we have written the query is the same order in which it is translated into method calls.

Now you may be wondering when exactly the processing starts then. It actually starts the first time the query is enumerated. That could be by calling MoveNext() and Current on an enumerator (either implicitly in a foreach statement or explicitly), or by calling a method that accepts an IEnumerable<T> and calls these methods for us.
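
To see this deferral for yourself, here is a small sketch. The numbers list and the DeferredDemo class are made up for the illustration and are not part of the library setup above. The element added after the query is defined still shows up in the result, because nothing is filtered until ToList() enumerates the query:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DeferredDemo
{
    // Builds a query, changes the source afterwards, and only then enumerates it.
    public static List<int> Run()
    {
        var numbers = new List<int> { 1, 2, 3 };

        // Nothing is filtered here; the query only remembers what to do.
        var query = from n in numbers
                    where n > 1
                    select n;

        // Added after the query was defined, but before it is enumerated.
        numbers.Add(4);

        // ToList() enumerates the query, so the filter runs now.
        return query.ToList();
    }

    public static void Main()
    {
        foreach (var n in Run())
            Console.WriteLine(n); // prints 2, 3 and 4 on separate lines
    }
}
```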

By now you have probably guessed that the result of most LINQ operators (where, select, etc.) should be IEnumerable<T>. That is true! There is another return type called IQueryable<T> which we will analyze in the next posts. For now, what we know is that LINQ to Objects receives a collection that implements IEnumerable<T> as input, and in a sequence of operators this input is passed along from one to the next, always in some form of IEnumerable<T>. But let's delve a little bit deeper and check what the order of these operations is.

Deferred execution in LINQ has a behavior that may seem strange at first. Look at the translation from the query expression to the method calls above. Although the methods are chained in that order, execution is driven from the last operator in the chain. When the consumer asks for the first result, it asks the sequence returned by Select. Select asks the operator before it (here Where) for its first element, and Where in turn asks its own source. That source is the object that implements IEnumerable<T>, so it just yields its first element. The element then passes back through each operator in the chain until the last one yields the output element to the consumer of the query. This is called streaming. Most LINQ operators are streaming operators. Some, however, need the entire sequence before they can produce their first result. OrderBy and Reverse, for example, "buffer" the data instead of streaming it. This is also why you should be careful where in the query you use these operators. It is usually better to place them after a where clause, so that the buffering is done on fewer items.
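
Here is a small sketch that makes the streaming order visible. StreamingDemo and its log are hypothetical helpers of mine; the lambdas write to the log on purpose so we can see when each operator touches each element:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class StreamingDemo
{
    // Records the order in which Where, Select and the consumer see each element.
    public static List<string> Trace()
    {
        var log = new List<string>();

        var query = Enumerable.Range(1, 3)
            .Where(n => { log.Add("Where " + n); return n != 2; })
            .Select(n => { log.Add("Select " + n); return n * 10; });

        // The lambdas above only start running once this loop pulls elements.
        foreach (var item in query)
            log.Add("Consumed " + item);

        return log;
    }

    public static void Main()
    {
        foreach (var line in Trace())
            Console.WriteLine(line);
    }
}
```

The log reads: Where 1, Select 1, Consumed 10, Where 2, Where 3, Select 3, Consumed 30. Each element travels through the whole chain before the next one is even looked at.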

A Side Note

There are many more operators in LINQ, of which I will name a few in passing. These all make sense if you have some experience with SQL. These include:
  • OrderBy: This operator orders the data in ascending order. If you are ordering the elements by more than one criterion you can use ThenBy in combination with OrderBy. In order to sort in descending order you use the OrderByDescending and ThenByDescending operators.
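
As a quick sketch of ordering by more than one key (the three books and their prices below are sample data I made up for this snippet), the orderby clause here sorts by genre first and then by price, most expensive first. The compiler turns it into OrderBy(...).ThenByDescending(...):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class OrderingDemo
{
    // Returns the book names ordered by genre, then by price from high to low.
    public static List<string> SortedNames()
    {
        var books = new[]
        {
            new { Name = "Cheap Thrills", Genre = "Horror", Price = 100m },
            new { Name = "Laugh Lines",   Genre = "Comedy", Price = 200m },
            new { Name = "Night Shift",   Genre = "Horror", Price = 300m }
        };

        // Same as: books.OrderBy(b => b.Genre).ThenByDescending(b => b.Price).Select(b => b.Name)
        var query = from b in books
                    orderby b.Genre, b.Price descending
                    select b.Name;

        return query.ToList();
    }

    public static void Main()
    {
        foreach (var name in SortedNames())
            Console.WriteLine(name); // Laugh Lines, Night Shift, Cheap Thrills
    }
}
```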

Joins

LINQ is here to bring us a structured query language that we can use to query objects. What is a query language without joins? Joins are mainly used in relational models, where the only way of reaching referenced data in another table is to join on it. In languages like C#, however, we use references to reach that data. Even so, there are still scenarios in which joins let us find the objects of interest in a well expressed way. In this section I will talk about three main join types: inner joins, outer joins and cross joins, all of which are available to us in LINQ.

Let's start with the inner join. In this type of join, an element from the left sequence and an element from the right sequence must have matching keys in order for a row to be generated in the result. This means that if an element exists in the left sequence and has no match in the right sequence, the result will not have a row for it, and the same is true for the right sequence. The syntax of an inner join in LINQ is as follows:

var joinQuery = from member in newLibrary.AllMembers
                where member.Age > 20
                join reservation in allReservations
                on member.Name equals reservation.MemberWhoTookIt.Name
                select member;
foreach (var item in joinQuery)
    PrintContent(item);

Here I've joined the AllMembers list with the allReservations list on the member's name. Also, the left or outer sequence (AllMembers) is filtered before the join. Had we wanted to filter the sequence on the right, the query would have been more complicated:

var joinQuery = from member in newLibrary.AllMembers
                where member.Age > 20
                join reservation in (
                                    // The subquery gets its own range variable name (r) so it
                                    // can't be confused with the join's reservation variable.
                                    from r in allReservations
                                    where r.StartDate == DateTime.MinValue
                                    select r
                                    )
                on member.Name equals reservation.MemberWhoTookIt.Name
                select member;

Pay attention to the subquery that is used here. It is also worth mentioning that since the left sequence is streamed and the right sequence is buffered for key lookups, it is a better idea to put the sequence with more elements on the left.

Outer joins are needed when every element of the left sequence should show up in the result. Meaning that regardless of whether a matching element exists in the right sequence, we want a row representing each element of the left sequence. In LINQ this is done with a group join (join ... into), which pairs each left element with the, possibly empty, sequence of its matches. The syntax is as follows:

var joinQuery = from member in newLibrary.AllMembers
                join reservation in allReservations
                on member.Name equals reservation.MemberWhoTookIt.Name
                into joinedElementsFromRight
                select new { Member = member, 
                             MembersReservations = joinedElementsFromRight };
foreach (var item in joinQuery)
{
    PrintContent(item.Member);
    foreach(var reservation in item.MembersReservations)
        PrintContent(reservation);
}
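
If you would rather have one flat row per member/reservation pair, the way a SQL LEFT OUTER JOIN gives you, a common pattern is to follow the group join with a second from clause over DefaultIfEmpty(). The sketch below uses simplified stand-in data (plain names instead of the Member and Reservation classes) to stay self-contained:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class OuterJoinDemo
{
    // Produces one row per member; members without a reservation still get a row.
    public static List<string> Rows()
    {
        var members = new[] { "John", "Joon", "Jeen" };
        var reservations = new[]
        {
            new { Member = "John", Book = "Inspector Bandex" },
            new { Member = "Jeen", Book = "Inspector Bandex" }
        };

        var query = from member in members
                    join reservation in reservations
                        on member equals reservation.Member
                        into memberReservations
                    // DefaultIfEmpty() yields a single null when a member has no matches.
                    from r in memberReservations.DefaultIfEmpty()
                    select member + " -> " + (r == null ? "(nothing reserved)" : r.Book);

        return query.ToList();
    }

    public static void Main()
    {
        foreach (var row in Rows())
            Console.WriteLine(row);
    }
}
```

Here Joon has no reservation, yet still appears in the result with the "(nothing reserved)" placeholder.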

The last type of join is the cross join. This is the same as the Cartesian product of the elements of the two sequences. There is no matching done. But since in this type of join the right sequence is streamed as well, and is evaluated once per left element, this join can be used quite elegantly to produce a product in which the elements of the right sequence depend on the current element of the left sequence. See the example below:

var query = from num1 in Enumerable.Range(1, 10)
            from num2 in Enumerable.Range(1, num1)
            select new { Left = num1, Right = num2 };

Run this query and check out the result. As you'd expect, just like a nested loop, the join yields a different number of elements from the right sequence for each item of the left sequence. It is worth repeating that in this join the right sequence is not buffered; it is streamed as well. This means that this join can be quite useful for unknown or endless streams of data, as only one item is fetched and processed at any given time.
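
For the curious, multiple from clauses are translated into a call to SelectMany. The sketch below writes a smaller version of the query by hand (a range of 1 to 3 instead of 1 to 10, and strings instead of an anonymous type, both my own simplifications):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class CrossJoinDemo
{
    // Pairs every left number with the numbers from 1 up to itself.
    public static List<string> Pairs()
    {
        var query = Enumerable.Range(1, 3)
            .SelectMany(left => Enumerable.Range(1, left),      // right sequence depends on left
                        (left, right) => left + "-" + right);   // projection of each pair

        return query.ToList();
    }

    public static void Main()
    {
        foreach (var pair in Pairs())
            Console.WriteLine(pair); // 1-1, 2-1, 2-2, 3-1, 3-2, 3-3
    }
}
```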

Group By

This is the last operator that Jon has covered in the book. It is a very important operator indeed, and in my experience it is used much more than join. This operator groups the elements of the sequence by a key. The query yields a sequence of IGrouping<TKey, TElement> objects; IGrouping<TKey, TElement> extends IEnumerable<TElement> and adds a single Key property shared by all the elements of the group. Here is an example:

var groupQuery = from reservation in allReservations
                 group reservation by reservation.MemberWhoTookIt;
foreach (var item in groupQuery)
{
    PrintContent(item.Key);
    foreach(var reservation in item)
        PrintContent(reservation);
}

There you are! We have now covered LINQ's most used operators, and with that all the new features added in C# 3. Now it's time to delve deeper. In the next post I will talk about what is actually going on behind the scenes with the C# compiler. Until then, ciao.

Wednesday, 13 March 2013

Extension Methods

This post tells you all you have to know about extension methods. I'll look into why they were introduced, how to declare and use them, and finally we'll explore some of the extension methods that make LINQ possible. Let's get to it then.
I personally think that extension methods are a double edged sword. They can both add readability to code and make it obscure at the same time. As the name suggests, they extend the functionality of a class. You may now be thinking: wait, didn't we use inheritance for that purpose?
The answer is: well, maybe... it all depends. Inheritance is used when there is a form of specialization going on and added state is required for all the objects of the extended class. There are times when you don't really need to add the functionality to the class itself since you're not specializing anything. Nonetheless, this would rarely keep you from extending that type if you own the code. But there are times when that is simply not an option.
For example, when you don't own the class's code or you are adding functionality to an interface, you can't really make any changes to legacy code. At these times you may add a static method and just pass the object in to the method. Although this technique is widely used, it doesn't feel very object oriented when you look at the code, since the functionality is not called on the object itself.
In the case of interfaces, when you add a method to an interface you have basically broken all legacy code that implements that interface. All those classes won't build anymore until they implement the new method. This is due to the rule that each implementation of an interface has to implement all the members declared in that interface.
Finally, the changes that LINQ needed are mostly to interfaces. This is why extension methods were added to the language. Yes, the language only. As you have probably guessed by now, the changes are yet again only syntactic sugar added by the compiler. A call to an extension method is compiled into a call to an ordinary static method that receives the object as its first argument. The syntax just makes it look like the type now has the functionality.
Declaring extension methods is almost too simple to forget! All you have to do is add the extended type as the first parameter of a static method and put the "this" keyword before that type:

    
public static int MultiplyBy(this int extendedType, int multiplyBy)
{
    return extendedType * multiplyBy;
}

Now every integer has a "MultiplyBy" method. It is truly as easy as that. There are some limitations as to where the extension method can be declared. The class in which the extension method is implemented should be non-nested, non-generic and static.
Extension methods are resolved as follows:
The compiler first looks for an instance method that is applicable to the call. If no such instance method is found in the type, all imported namespaces are searched for compatible extension methods. Compatible extension methods are those whose first parameter has exactly the type of the expression or a type that the expression can be implicitly converted to. Note that if an extension method has the same signature as an instance method, the extension method is never called through the instance syntax. Also, no warning is given by the compiler when this happens.
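
A small sketch of that last rule. Greeter, GreeterExtensions and ResolutionDemo are hypothetical types I made up for this example:

```csharp
using System;

public class Greeter
{
    public string Greet() { return "instance"; }
}

public static class GreeterExtensions
{
    // Same name and signature as the instance method: g.Greet() never binds to this one.
    public static string Greet(this Greeter greeter) { return "extension"; }

    // No instance method is called Shout, so g.Shout() binds to this extension.
    public static string Shout(this Greeter greeter)
    {
        return greeter.Greet().ToUpperInvariant();
    }
}

public static class ResolutionDemo
{
    public static void Main()
    {
        var g = new Greeter();
        Console.WriteLine(g.Greet());                   // instance
        Console.WriteLine(GreeterExtensions.Greet(g));  // extension (only reachable as a static call)
        Console.WriteLine(g.Shout());                   // INSTANCE
    }
}
```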
Now we look at another piece of code:
    
// Named NullSafeEquals rather than Equals: object already has an instance
// Equals(object) method, which would always win over an extension method.
public static bool NullSafeEquals(this object obj1, object obj2)
{
    if(obj1 == null)
        return (obj1 == obj2);
    else
        return obj1.Equals(obj2);
}

Object obj1 = null;
Object obj2 = new Object();

Console.WriteLine(obj1.NullSafeEquals(obj2));

What do you think the outcome of the code above is? Does it compile? Do we get a run-time error? Can we call methods on a reference that is null (obj1)? We sure were not able to up to now. The code above compiles, runs without issue and prints False. The reason is that you can actually call extension methods on null references! If you think about it, it kind of makes sense. After all, we talked about extension methods being syntactic sugar on top of the language. So what is actually happening is that the compiler calls a static method and passes the object as an argument. There is no problem with an argument being null now, is there? Note that this only works because no instance method matches the call; had the extension method been named Equals, the instance method would have been chosen and the call would have thrown a NullReferenceException.
That's all there is to say on extension methods. Next we will look at the extension methods added to .Net 3.5.

IEnumerable<T> Extension Methods

This interface is one of those extended with extension methods in .Net 3.5. It gained a host of extension methods that work on the sequence of Ts yielded by the type. Filtering, aggregation, projection, search and grouping are just a few of the uses of these added methods. It is really fun to play around with them. I can't really look at them all in this post, and I haven't even used all of them, but I'll go through filtering with Where, projection with Select and some grouping.

Where<T> extension method

If you're familiar with SQL you can think of this method as a counterpart to the where clause in a select statement. If you aren't familiar with SQL then an example should make it clear:

Enumerable.Range(1, 10).Where(x => x < 5);

In the example above, first the static method Range of the static Enumerable class is used to get the numbers from 1 to 10. Notice that the return value of this method is an IEnumerable<int>, and that the sequence is produced lazily, using deferred execution. The Where extension method is then combined with a lambda expression to get all the numbers in that range that are smaller than 5. This is really where everything we have talked about in the previous posts comes together. In that one line of code a lot of powerful concepts can be seen: iterator blocks, static classes, generics, two phase type inference and lambda expressions. These have all resulted in a lazy filtering of the elements of a collection without any effort on your part. You may be thinking that Where must have a very complicated implementation. But it turns out that this is far from the truth. You can implement your own Where method with just a few lines of code, as below:



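public static IEnumerable<T> Where<T>(this IEnumerable<T> sequence, Func<T, bool> predicate)
{
    // Because this method is an iterator block, these checks only run when
    // the result is first enumerated, not when Where is called.
    if (sequence == null || predicate == null)
        throw new ArgumentNullException();

    // Yield only the elements that satisfy the predicate, one at a time.
    foreach (T element in sequence)
    {
        if(predicate(element))
            yield return element;
    }
}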

There is nothing to stop us from using a method group or an existing delegate instance instead of the lambda expression, but when you are chaining these methods together you rarely end up doing something that a lambda expression cannot handle.
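
To back that up, here is a small sketch that passes the same filter in three ways to the framework's own Enumerable.Where. IsSmall and the PredicateForms class are names I made up for the example:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class PredicateForms
{
    // An ordinary named method that can be passed as a method group.
    public static bool IsSmall(int x) { return x < 5; }

    public static List<int> WithLambda()
    {
        return Enumerable.Range(1, 10).Where(x => x < 5).ToList();
    }

    public static List<int> WithMethodGroup()
    {
        return Enumerable.Range(1, 10).Where(IsSmall).ToList();
    }

    public static List<int> WithDelegate()
    {
        // A delegate instance created from an anonymous method.
        Func<int, bool> predicate = delegate(int x) { return x < 5; };
        return Enumerable.Range(1, 10).Where(predicate).ToList();
    }

    public static void Main()
    {
        // All three calls produce the numbers 1 to 4.
        Console.WriteLine(WithLambda().Count);
        Console.WriteLine(WithMethodGroup().Count);
        Console.WriteLine(WithDelegate().Count);
    }
}
```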

Select<T> extension method

This method is also called the projection method. Basically, it receives an input and projects that object into another object, either by selecting a part of it or by adding to it. For example, you may have objects that carry around a lot of information and you may want to populate a grid view with them. One easy way of populating a grid view is to assign a collection to its DataSource property. Since you usually don't want to show the entire object, you can select just the pieces you need into a new collection and assign that to the property. In the inverse case, you can create an anonymous type that contains the original value along with extra data:

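// The result is not named x (it would clash with the lambda parameter); 1.0 / x avoids integer division.
var inverses = Enumerable.Range(1, 10).Select(x => new { Number = x, Inverse = 1.0 / x });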

Here I have created a new type that contains both the number and its inverse.

GroupBy<T> extension method

The GroupBy extension method has many overloads and is a very powerful tool. This extension method groups the elements of the sequence according to a key which is produced by the first argument. The return value of this extension method is IEnumerable<IGrouping<TKey, TElement>>. The IGrouping<TKey, TElement> type extends IEnumerable<TElement> and adds a "Key" property. It is now kind of clear what this type is there for. The GroupBy method returns an enumerable sequence of IGroupings, each of which has a key and can itself be enumerated. The simplest form can be seen in the example below.
var persons = new [] {
                        new { Name = "John", Age = "23"},
                        new { Name = "Mary", Age = "21"},
                        new { Name = "Joan", Age = "24"},
                        new { Name = "Tom", Age = "24"},
                        new { Name = "Hank", Age = "22"},
                        new { Name = "Steve", Age = "22"},
                        new { Name = "Bella", Age = "22"},
                    };

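// Passing the query itself to Console.WriteLine would only print its type name,
// so walk the groups and their members instead.
foreach (var ageGroup in persons.GroupBy(p => p.Age).OrderBy(g => g.Key))
{
    Console.WriteLine("Age {0}:", ageGroup.Key);
    foreach (var person in ageGroup)
        Console.WriteLine("  " + person.Name);
}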

In order to order a sequence by more than one field you can easily add a .ThenBy() after the OrderBy() in the chain. Something to note here, though, is that the order of the original sequence is not changed by these calls. Most LINQ operators have been made side effect free: they leave the source untouched and return a new sequence that reflects the change.
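
A tiny sketch of that point, using a made-up list of numbers and a hypothetical NoSideEffectsDemo class. After sorting, the original list still has its original order:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class NoSideEffectsDemo
{
    // Returns the source and the sorted result side by side as text.
    public static string Run()
    {
        var numbers = new List<int> { 3, 1, 2 };

        // OrderBy builds a new sequence; numbers itself is never reordered.
        var sorted = numbers.OrderBy(n => n).ToList();

        return string.Join(",", numbers) + " | " + string.Join(",", sorted);
    }

    public static void Main()
    {
        Console.WriteLine(Run()); // 3,1,2 | 1,2,3
    }
}
```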
There are some points to remember when deciding to write your own extension methods. Just like implicit typing, there are pros and cons to defining extension methods. You have to realize that the code you write in a group development environment is different from code you write for yourself. You now have to know your audience and your maintainers. You don't want to surprise any of the people working with the code. Extension methods can be really confusing for someone who's not familiar with them. They can also cause all kinds of problems. For example, if you define an extension method and an instance method with the same name and signature is added to the framework type in the next version, your calls will silently bind to the new method, and your code's behavior will change unless you're lucky enough that it does exactly the same thing. You can also cause a lot of problems if you don't define the extension methods in the right namespace. You definitely don't want to get all the extension methods of IEnumerable<T> in your IntelliSense suggestions if you're not using them, do you? Just think long and hard about defining your own extension methods, and put them in the right namespace with the right name. It may be worth standardizing the naming scheme of your extension methods so that every person in the group knows when they're calling one.
Well... this is about it! Guess what? You now know LINQ! You just don't know that you do yet. LINQ query expressions are just syntactic sugar over all we have learned so far. They only allow a more familiar syntax for queries. Everything will fall into place with the next post, which is officially about LINQ and covers query expressions and LINQ to Objects.
Stay tuned!

Monday, 11 March 2013

Lambda Expressions and Expression Trees

This chapter again is all about delegates. It might be a good idea to review delegates in previous posts if you think you don't remember much from them. Lambda expressions are here to make it easier and more straightforward than ever to create delegates.

Lambda Expressions

Lambda expressions are so called because lambda calculus in computer science and math deals with the definition and manipulation of functions. Other than the name, I haven't seen lambda calculus come up anywhere else in this context, so don't get intimidated by it.
Since lambda expressions can be considered a special case of anonymous methods and we have already covered those, I'm going to jump right into the syntax. There are many different allowed forms of syntax for lambda expressions, and that is due to their wide use across different scenarios. As we get closer to LINQ, you'll see that they are used all over the place. Of course, as I said, when a delegate is what you need they can always be replaced by the more general anonymous methods. Here's a general form for lambda expressions:

        (explicit or implicit list of input arguments) => {code block}

As you see, the most general form is quite similar to anonymous methods, with the small difference of having the new => symbol and no mention of the delegate keyword. You can explicitly type the input arguments or have the compiler infer their types. Another shortcut in the syntax, and the more important one, is that you can put a single expression in place of the {code block}. Just like with anonymous methods, the block needs to return a value if the delegate type has a return type. When there is only a single expression, the return value is the value of that expression. For example, in the lambda expression below, the return value is a bool that tells whether the person's name is "Joe":
    
    List<Person> persons = new List<Person>(); 
    List<Person> joes = persons.FindAll(p => p.Name == "Joe");

In the above example, the lambda expression has one input parameter (p), which is implicitly of type Person. The "=>" operator should be read as "goes to" when reading lambda expressions. So in this example, p goes to p.Name == "Joe", which is true when the hypothetical person's name is Joe. What the above statement accomplishes, then, is searching all elements of the list and returning a new list with every element that matches the criteria. As we know, FindAll expects the generic delegate Predicate<T>. The lambda expression is implicitly converted to a delegate instance and called for each element by the FindAll method. It is important to know that lambda expressions by themselves don't have a type. This is due to the fact that they can be converted to both delegate instances and expression trees, which we'll cover in the next section.
You can imagine all the other forms that a lambda expression can take by starting from an anonymous method and removing or adding types, parentheses, etc. But the most popular form, and the one that you will be using most, is the one we just covered.
Captured variables are handled the same way as in anonymous methods. You have to be careful about closures here, just like we discussed in the post about anonymous methods.
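
To make those other forms concrete, here is the same predicate written five ways, from the anonymous method down to the shortest lambda. The LambdaForms class is a hypothetical holder for the example, and I use plain strings instead of a Person class to keep it self-contained:

```csharp
using System;
using System.Collections.Generic;

public static class LambdaForms
{
    // Returns one predicate in five equivalent spellings.
    public static List<Predicate<string>> AllForms()
    {
        return new List<Predicate<string>>
        {
            delegate(string s) { return s == "Joe"; },  // anonymous method
            (string s) => { return s == "Joe"; },       // explicit type, code block
            (s) => { return s == "Joe"; },              // inferred type, code block
            (s) => s == "Joe",                          // inferred type, single expression
            s => s == "Joe"                             // parentheses dropped
        };
    }

    public static void Main()
    {
        foreach (var form in AllForms())
            Console.WriteLine(form("Joe") + " " + form("Ann")); // True False, five times
    }
}
```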

Expression Trees

As their name suggests, expression trees are trees that have expressions as their nodes. Each node contains an expression which is itself an operand (a child node) of another expression. Expression trees and lambda expressions are at the heart of LINQ. It is important to know why we need expression trees in the first place. 
LINQ is here to streamline the process of querying objects, databases, XML documents, etc. It is here to provide one language and one set of operators to be applied uniformly across all these inherently different data sources. It plays the same role that SQL plays for databases and that VMs play in cross platform development. Normally, in order to query each specific data source you have to write queries in its own language. For example, the syntax to query a database using SQL is different from the query language used for XML (XQuery). In order to use the same syntax for all these different platforms, one can define a common series of operations possible on each platform and then translate these common operations for each respective data source. In order to do so, the program that is written using the common operations has to be analyzed and translated. This means that the code itself is the data input to the translator.
The concept of using code as data, then, is key to creating such a mechanism. Expression trees do just that. They enable us to represent a piece of code in a tree data structure which can later be parsed and analyzed. In .Net 3.5, the Expression class provides all the functionality to do this. This class allows representation of actual C# code in a data structure which can then be analyzed in the same or another program (in-process vs out-of-process). Expression trees in .Net 3.5 only cover a certain set of operations. In order to gain complete control over dynamically generated code one still has to use the CodeDom (a library to create language-independent dynamically generated code in .Net).

In the System.Linq.Expressions namespace, each class extending the abstract Expression base class has two main properties associated with it:
  • Type: This property is the actual .Net type of the expression. Kind of like the return type of a function.
  • NodeType: The node type is selected from the ExpressionType enumeration in the same namespace. As we said before, all the different kinds of expressions derive from the abstract Expression base class. This is all fine and jolly, but what should we do with all these different kinds of expressions that share a common structure? We definitely don't want to end up with a huge inheritance chain. Here the design decision is to extend the hierarchy not in depth, but in breadth. Each general kind of expression, such as a binary expression, is represented by one class (BinaryExpression), and the different operations are told apart by their node type. All the supported node types, including those for the binary expression class, are defined in that enumeration.
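
As a quick peek at those two properties, the sketch below builds two binary expressions with the Expression factory methods and prints their NodeType and Type. NodeTypeDemo and BuildSum are names I made up; the factory methods themselves are the framework's:

```csharp
using System;
using System.Linq.Expressions;

public static class NodeTypeDemo
{
    // Builds the expression tree for "2 + 3".
    public static BinaryExpression BuildSum()
    {
        return Expression.Add(Expression.Constant(2), Expression.Constant(3));
    }

    public static void Main()
    {
        BinaryExpression sum = BuildSum();
        Console.WriteLine(sum.NodeType);       // Add
        Console.WriteLine(sum.Type);           // System.Int32
        Console.WriteLine(sum.Left.NodeType);  // Constant

        // Same class (BinaryExpression), different node type and result type.
        BinaryExpression comparison = Expression.LessThan(sum, Expression.Constant(10));
        Console.WriteLine(comparison.NodeType); // LessThan
        Console.WriteLine(comparison.Type);     // System.Boolean

        // Wrap the tree in a parameterless lambda and compile it to run it.
        Func<bool> compiled = Expression.Lambda<Func<bool>>(comparison).Compile();
        Console.WriteLine(compiled());          // True
    }
}
```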
There is really no easy way to explain how expression trees work other than with an example. So here is an example of an expression tree that writes an output to the console.

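    // Requires: using System; using System.Linq.Expressions; using System.Reflection;
    // Look up Console.WriteLine(string) through reflection.
    // (typeof(Console) gives the same Type without a string lookup.)
    Type console = Type.GetType("System.Console");
    MethodInfo method = console.GetMethod("WriteLine", new[] { typeof(String) });

    // The single parameter of the lambda we are building.
    ParameterExpression lambdaParam = Expression.Parameter(typeof(String));
    // The target is null since WriteLine is a static method.
    Expression methodCall = Expression.Call(null, method, new[] { lambdaParam });

    // Wrap the call in a lambda, compile it into a delegate and invoke it.
    var rootExpression = Expression.Lambda<Action<string>>(methodCall, new[] { lambdaParam });
    var compiledMethod = rootExpression.Compile();
    compiledMethod("This was generated using an expression tree");

    Console.ReadKey(true);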

I realize that the above code can be hard to grasp at first but bear with me for a little bit longer and soon you'll be able to come back to this code and understand it completely. In the above example, you can see how the Expression class provides you with factory methods to create expressions. Here we needed to create a single statement that would output a string to the standard output. If you think about this in .Net terms, this means calling the "WriteLine" method of the Console type.
In order to translate this into the expression tree world, you first have to dissect the statement and see what the elements of the code are and how they are represented in the System.Linq.Expressions namespace. Chances are that they may not be represented at all! After all, expression trees are not here to replace the CodeDom. At least not yet. Anyways, you need to specify a method call, the method used in this call, the target of the call (the instance object) and the parameters.
How do we provide information about a type or method? The System.Reflection namespace of course. Discussing what reflection is and how it is used is outside the scope of this article. In short, we are able to specify the fully qualified name of a type or the name of a method and get the .Net framework to search for and find what we are talking about. This allows types to be specified with strings which are looked up at run-time. If you've never heard of reflection before, this should feel very messy and counter-intuitive from an object oriented point of view, but it adds a lot of flexibility and it's useful.
Getting back to the issue at hand, we use reflection to obtain information on the type and the method, both of which we identify with a string. What's left is creating the right kind of expression for each element, and then we have described everything there is to know about the statement! In the end we wrap the tree in an Expression<TDelegate> and ask it to compile itself.
Now things get a little bit confusing here due to the naming used in the inheritance chain. So far we know that different kinds of expressions inherit from a base class called Expression: classes like BinaryExpression, ConditionalExpression, LambdaExpression and so forth. There is also a special generic class that extends LambdaExpression, called Expression<TDelegate>. This class represents a strongly typed lambda expression and has a Compile() method which creates an actual delegate whose type is specified by the TDelegate type parameter. In the example the type of the delegate is Action<String>.
The .Net library provides some premade and ready to use delegate types. This is quite handy as they are generic and have all but abolished the need to declare your own delegate types. Here we needed a delegate type that accepts a string as a parameter and doesn't return anything. Action<T> and its brothers (Action<T1, T2>, Action<T1, T2, T3>, etc) provide the types for void methods that receive 1 to 16 parameters (in .Net 4.0), and the non-generic Action covers methods with no parameters. What are we going to do if the method has a return type? There are 17 generic Func delegates declared for that too, taking 0 to 16 parameters. Anyways, we use the Action<String> delegate as the type parameter of Expression<TDelegate> and call the Compile method. This method returns an actual delegate that can now be called like any other! We have just dynamically created a delegate at run-time!
That looked like a lot of work for a rather simple task and indeed it is! The good news is that the C# compiler converts lambda expressions, in their simplest form, to expression trees automatically. You just have to assign a lambda expression to an Expression<TDelegate> variable and the compiler does the rest. This can be seen in the example below:
    
    Expression<Action<String>> expression = (p) => Console.WriteLine(p);
    expression.Compile()("Hello");

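More complex lambda expressions which contain loops, condition blocks or any other statement body (even a single return statement) can not be converted by the compiler this way. In .Net 3.5 the expression tree API could not represent them at all. In .Net 4.0 the API gained blocks, loops, assignments and so on since they were needed by the DLR, but such trees still have to be built by hand with the factory methods, because the C# compiler still only converts lambdas whose body is a single expression.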
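To make the .Net 4.0 additions a bit more concrete, here is a minimal sketch that hand-builds a lambda with a statement body using Expression.Block, Expression.Variable and Expression.Assign. The names BlockDemo and BuildDoublePlusOne are made up for this example.

```csharp
using System;
using System.Linq.Expressions;

static class BlockDemo
{
    // Hand-builds the equivalent of: x => { int t = x * 2; t += 1; return t; }
    public static Func<int, int> BuildDoublePlusOne()
    {
        ParameterExpression x = Expression.Parameter(typeof(int), "x");
        ParameterExpression t = Expression.Variable(typeof(int), "t");

        BlockExpression body = Expression.Block(
            new[] { t },   // local variables of the block
            Expression.Assign(t, Expression.Multiply(x, Expression.Constant(2))),
            Expression.AddAssign(t, Expression.Constant(1)),
            t);            // the last expression is the value of the block

        return Expression.Lambda<Func<int, int>>(body, x).Compile();
    }

    static void Main()
    {
        Func<int, int> doublePlusOne = BuildDoublePlusOne();
        Console.WriteLine(doublePlusOne(5));   // 11

        // The compiler refuses to do the same conversion for us:
        // Expression<Func<int, int>> tree = x => { return x * 2; };   // compile error
    }
}
```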
The rest of the chapter in the book deals with the new rules on overload resolution and type inference. In C# 2.0 each argument was considered independently and not much inference was going on. In C# 3.0 with the introduction of implicitly typed variables, lambda expressions and LINQ, type inference is basically going on everywhere. The rules are complicated but the gist is that in C# 3.0 type inference and overload resolution are a collaborative effort among the different arguments, meaning that each argument can now add some information to the inference process. This is needed to resolve some of the method group and lambda expression conversions. I will not delve into any details on this. In case you are interested you can check out the language specification which contains the rules, or read the chapter on this in the book which goes into it in more detail.
Next up are extension methods and then we have basically covered the language side of LINQ.

Monday, 4 February 2013

Starting Coding in C# 3.0

From Chapter 8 onwards we will introduce all the bits and pieces that eventually go hand-in-hand to make LINQ possible. In this chapter we will look at Automatic Properties, Implicitly typed local variables, Object and collection initializers, Implicitly typed arrays and Anonymous Types.

In some ways from this post on we will pave the way into LINQ. Basically after we cover anonymous types (this post), expression trees and lambda expressions (next post) and extension methods (post after next) we will have covered all the bits and pieces that enable LINQ. What is left is only introducing the syntax of query expressions and we're done! So basically think of this post and the later ones as learning the underpinnings of LINQ. What is really interesting about this approach is that not only will you learn LINQ and see its power, but when it is introduced you will know how it actually works underneath. This should clear up the confusion about how to optimize your queries, when to use LINQ and so on.

Automatic Properties

Automatic Properties is a feature that is easy to learn and implement. It is one of those features that you will use a lot once you learn it. I have. Remember all that extra code you had to write just to enable encapsulated access to your private fields? Although it was only a few lines of code, it created unneeded clutter when you implemented a few trivial properties. By trivial I mean properties that lack any kind of validation or logging and are just simple access providers. You can now create this type of property with one line of code:

C# 2.0:
private int ID;
public int GetID
{
    get
    {
         return ID;
    }
    set
    {
        ID = value;
    }
}

C# 3.0:
public int GetID {get; set;}

You can use access modifiers on the getter/setter and make the automatic property static. Wondering what a static automatic property may be used for? How about a private setter and a public getter used to implement a Singleton object? You would need a static automatic property for that as seen below:

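public class Singleton
{
    // A static constructor cannot have an access modifier.
    static Singleton()
    {
        SingletonInstance = new Singleton();
    }
    // Private instance constructor so that nobody else can create instances.
    private Singleton() {}
    // Only the accessor that is more restrictive than the property gets a modifier.
    public static Singleton SingletonInstance {get; private set;}
}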

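Notice that automatic properties do nothing for thread safety. The CLR makes sure that the static constructor above runs only once, but any other code that reads and writes an automatic property from several threads still has to synchronize on its own. Actually automatic properties don't have anything to do with the CLR and are just some syntactic sugar added to the language by the nice people designing the C# compiler.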

Implicitly typed local variables

In the first few posts on C# we covered the fact that the C# type system is static, explicit and safe. This is due to the fact that each and every variable has a known explicit type which remains the same throughout its lifetime. It is also safe as opposed to the type system in C++. This has been entirely true up to C# 3.0. In C# 3.0 the concept of implicitly typed local variables is introduced. Notice that the fact that C#'s type system is static and safe remains true until C# 4.0, where the static part is also challenged.
In order to define an implicitly typed local variable in C# one has to use the "var" keyword in place of the type name in the declaration of a local variable:

1) Type varWithType = new Type();
2) var varWithType = new Type();
3) varWithType = new AnotherType();
There are a few things to take away from the above example. The first line of code defines a variable with the explicit type "Type". In the second line the same variable is defined with the var keyword instead. The most important thing to realize here is that the compiled versions of the first and second lines are exactly the same thing. This means that we have been rather careless in saying that C# now allows implicitly typed local variables, as if the variable had no fixed type. If that was true then we could have line 3 right after line 2. But this results in a compile error, because varWithType is still statically of type "Type". There are also some limitations on where the "var" keyword can be used. The limitations are as follows:

  • It can't be used for static or instance fields. The variable has to be a local variable.
  • The variable defined should be initialized as part of the declaration.
  • It cannot be initialized with null.
  • You can't have multiple declarations in the same line when declaring one with the var keyword.
  • The initialization expression cannot be an anonymous function (anonymous methods and lambda expressions) or a method group (can you guess why?).
In some of the cases above, the "var" keyword can be used if a cast tells the compiler which type we want, but this defeats the purpose of using var!
With great power comes great responsibility. It is important to know that although you can now define all your local variables as implicitly typed, it doesn't really mean that you should. As we'll see later on in this post the var keyword is there to fit in the bigger picture of anonymous types, which are there to account for the even bigger picture which is LINQ. So unless you are using an anonymous type, where you can't name the type, or are using LINQ, there is not much need to declare things implicitly, unless you have a good reason to do so. It all comes down to how you want to present your code to its maintainers. Do you want to emphasize the algorithm rather than the extra fluff of types and variables? Go ahead and use var a lot. Do you want them to be able to tell which types you're using because they're significant? Try not to use var at all. Balance is key.
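The rules above can be sketched in a few lines. The class VarDemo and its members are made up for this illustration, and the commented-out lines are the ones the compiler would reject.

```csharp
using System;
using System.Collections.Generic;

static class VarDemo
{
    // Returns the run-time type of a variable declared with var.
    public static Type InferredType()
    {
        var names = new List<string>();   // same as: List<string> names = new List<string>();
        return names.GetType();
    }

    static void Main()
    {
        var count = 10;                               // inferred as int
        var lookup = new Dictionary<string, int>();   // saves repeating the long type name
        lookup["ten"] = count;

        // Each of the following lines would be a compile error:
        // var nothing = null;        // there is no type to infer from null
        // var a = 1, b = 2;          // more than one variable in the declaration
        // var square = x => x * x;   // this lambda has no type of its own
        // count = "ten";             // count is an int, once and for all

        Func<int, int> square = x => x * x;           // the delegate type has to be spelled out
        Console.WriteLine(square(count));             // 100
        Console.WriteLine(InferredType() == typeof(List<string>));   // True
    }
}
```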

Object, Collection and Array Initializers

In C# 3.0 there is another set of features introduced as syntactic sugar to further streamline the creation of objects. This again fits into the bigger picture of LINQ. I will give an example of their use all in one snippet as I think they are easy enough to be learned all at once.

   class Article
   {
      public Article(string title) { Title = title; }
      public Article() {}

      public string Title {get; set;}

      List<Line> lines = new List<Line>();
      public List<Line> Lines { get{ return lines; } }
   }
   class Line
   {
      public string[] Words {get; set;}
      public Line() {}
   }

   ...

   Article sampleArticleObject = new Article
   {
      Title = "Banana Joe was announced as PM for Canada",
      // Lines has no setter: the collection initializer calls Add on the existing list.
      Lines =
      {
         new Line { Words = new [] {"First", "line", "of", "article"} },
         new Line { Words = new [] {"Second", "line", "of", "article"} },
         new Line { Words = new [] {"Last", "line", "of", "article"} }
      }
   };

There are a couple of things to mention here about each of the new features. We'll go through them one by one:

  • Object initializers: These allow you to initialize the object inline with its creation. The important enhancement here is not only the number of lines you'll save but the fact that creation and initialization are now a single expression. This means that you can use this feature to create an object, initialize it and pass it as an argument all at the same time, although this is not really recommended since, as always, features that allow better readability can have quite the opposite effect when used incorrectly. The second thing to notice here is the amount of extra fluff removed with this feature. The actual data is much more emphasized compared to the situation where each member is assigned through a property access on a separate line. Now the entire object and its contained data are initialized in a single expression. Another thing to note is the added flexibility of mixing and matching constructors and initializers. This effectively allows passing some values to the constructor (if one exists) and setting the rest in the initialization block. Object initializers allow the initialization of "embedded objects" as well. This means that if an object contains a reference to another member object, one can initialize the child while the container is being initialized, using the same syntax.
  • Collection initializers: As seen above, collections like List can now be initialized inline as well. The story here is a little more interesting however, since different collections have Add() methods with different signatures. Here the C# team had to make a design decision between academic purity and flexibility. They went with flexibility. For a collection to be initialized inline it has to have an Add method which is called for each element specified in the initialization block. One way to guarantee this is to force the class being initialized to implement the ICollection interface. This was the case in the draft of the C# 3.0 language specification. On the other hand, this puts a lot of limitations on the implementing classes. For example the Dictionary class would have needed an Add method with only one parameter. This led to a change of strategy. The team decided to allow any Add method with any signature. The only requirement now was for the class to have a method named "Add". To make sure that the class is indeed a collection, the only thing left to require was IEnumerable, which has to be implemented by the class for the inline initialization to work. Interestingly enough, as Jon explains, this interface is never actually used by the code generated for the initialization block.
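To see the "any Add method will do" rule at work, here is a small sketch with a made-up collection type (PriceList is hypothetical, not a framework class). It implements IEnumerable and has a two-parameter Add, which is enough for the collection initializer syntax to compile.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

// Hypothetical collection type: it implements IEnumerable and has an Add method
// with two parameters, which is all the compiler asks for.
class PriceList : IEnumerable
{
    private readonly List<KeyValuePair<string, decimal>> entries =
        new List<KeyValuePair<string, decimal>>();

    public int Count { get { return entries.Count; } }

    public void Add(string name, decimal price)
    {
        entries.Add(new KeyValuePair<string, decimal>(name, price));
    }

    public IEnumerator GetEnumerator() { return entries.GetEnumerator(); }
}

static class CollectionInitializerDemo
{
    public static PriceList BuildPrices()
    {
        // Each { ... } entry is compiled into a call to Add(string, decimal).
        return new PriceList { { "Book", 10m }, { "Pen", 2m } };
    }

    static void Main()
    {
        // Dictionary<TKey, TValue> works the same way through Add(key, value).
        var ages = new Dictionary<string, int> { { "Joe", 30 }, { "Ann", 25 } };
        Console.WriteLine(ages["Ann"]);           // 25
        Console.WriteLine(BuildPrices().Count);   // 2
    }
}
```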

Anonymous Types

It is rather pointless to talk about anonymous types without being able to show you how they make life so much easier when used in LINQ. But for now think of a scenario where you query a database for some data. You usually select certain columns from a table and return the results. What if the same scenario happened with objects?
As the name suggests, anonymous types allow the creation of objects without explicitly declaring a new type for them. Strange? Not really, here is an example:

    var anonType = new { Name = "Joe", Address="Banana Street" };

As you can see the syntax is somewhat similar to what we have seen so far with object and collection initializers. Of course this can be combined with them as well. Also you don't really have to initialize anonymous types using only constants. The values assigned to the properties could be expressions or method calls too. Take note that we have no way of naming the type of the anonType variable here! So the only keyword that we CAN use is var. Now in order to select only certain properties from an object we can do something like this:
    Person p = new Person();
    var anonType = new[] 
    {
       new { Name = p.Name, Address= p.Address },
       new { Name = p.Name + " Big Joe", Address= p.Address + " back door" }
    };

What I've done here is define an array of anonymous types. This shows that the two anonymous objects declared above must have the same actual type! Otherwise how could they be put in the same array? Indeed, the C# compiler considers two anonymous types with the same number of properties, with the same names and types in the same order, to be the same type. If you change any of the order, types or names, the type is different. I really don't want to talk about anonymous types much longer as we will see them all over the place from now on and in LINQ. Even if you feel like you don't have a handle on them yet, they will become second nature as we move more into LINQ. Next post is on Expression Trees and Lambda Expressions. Those are my favorite additions in C# 3.0, followed closely by extension methods which are the post after next. Stay tuned.
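The type identity rule is easy to check for yourself. The sketch below (AnonymousTypeDemo and SameType are made-up names) compares the run-time types of a few anonymous objects.

```csharp
using System;

static class AnonymousTypeDemo
{
    // True when both objects are instances of the same compiler-generated type.
    public static bool SameType(object first, object second)
    {
        return first.GetType() == second.GetType();
    }

    static void Main()
    {
        var joe = new { Name = "Joe", Address = "Banana Street" };
        var ann = new { Name = "Ann", Address = "Apple Road" };
        var swapped = new { Address = "Apple Road", Name = "Ann" };

        Console.WriteLine(SameType(joe, ann));       // True: same names, types and order
        Console.WriteLine(SameType(ann, swapped));   // False: the property order differs
    }
}
```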

Sunday, 27 January 2013

Finishing up on C# 2.0 and moving to 3.0

Hello again,

It's been a while since I posted. The new year's holidays are to blame ! With a rather late happy new year to everyone we'll begin.

In this post, I will be covering Chapter 7 of the book. Chapter 7 finishes up on all the new features in C# 2.0. I won't be covering Chapter 7 in detail since the features are not necessarily used that often. On the other hand, it's worth knowing they are there so that when you get into a situation where they can be useful, you can go and read in detail how to use them. So far we have introduced 4 major features of C# 2.0: generics, delegate improvements, nullable types and iterator blocks. Here is a list of some other features worth mentioning with a brief overview:

  • Partial Types: If you have ever used the VS designer to design forms, you have noticed a method in one of your project's classes called InitializeComponent(). You should never have changed the content of this method as it was auto generated by the designer. Having an auto generated part of the code and manually entered code in the same file is dirty. It would be much better for the designer to write to a single file and rewrite it whenever it wishes, and for the developer to write in a separate file and not have to worry about the auto-generated sections. This is where partial types come into play. They give us the ability to define functionality for a single entity in separate files. At the time of compilation, the compiler merges all these files into one and creates the class. The list of uses goes on in the book: unit tests, dividing a bloated class into smaller functional units, etc. In C# 3.0 we get partial methods. These methods can be declared in the auto-generated part, and then the manually written part can define functionality for that hook, which is going to be executed as if it was implemented in the auto-generated part. Before partial methods the auto-generated class had to define an event and the manually written one would have to subscribe to it. Partial methods are much more elegant since the hooks that are not used by the manually written file are removed during compilation and you don't get a bunch of unused event publishers.
  • Static Classes: This feature was introduced to clarify the use of utility classes in the project. Utility classes are commonplace among projects. Before, you had to declare a utility class like so:
        public sealed class UtilityClass
        {
            private UtilityClass() {}
            public static void Method1()
            {
               ...
            }
            public static void Method2()
            {
               ...
            }
        }
        
    The private constructor was there to keep others from instantiating your class. Why put it there in the first place? Because the C# compiler by default supplies a public parameterless constructor for a class that doesn't declare a constructor. It doesn't do that for a class that has declared one though. That's why we declare one but make it private and empty. The sealed keyword is also there to keep others from inheriting from this class, since there is nothing to specialize when all its members are static. Of course this is all fine and it works. But defining an empty parameterless private constructor is ugly. Also, we can't really keep someone from using this class as a type, and a statement like UtilityClass a = null; would compile without any errors. We want the compiler to do some compile-time checking for us in this case and not allow such code to compile. This is exactly what the static keyword in the class definition does:
        public static class UtilityClass
        {
            public static void Method1()
            {
               ...
            }
            public static void Method2()
            {
               ...
            }
        }
        
    You no longer need to specify the constructor or the sealed keyword. The compiler would also make sure that the class is used properly.
  • Different access modifiers for property getter/setter: Before, we were forced to have the same access level for both the setter and the getter of a property. It is not really uncommon to want to be able to change a property from within the class itself but not from the world outside. This was not really possible, at least not without circumventing it with an instance method. In C# 2.0 you can now declare a private setter and a public getter to accomplish the task.
  • Namespace aliases: This feature is very useful if you want to use different versions of a type or are using two types that have the same qualified name. You can assign different aliases to their namespace when importing them and use the alias to reference them. For more details on this read section 7.4 of the book.
  • Pragma Directives: These are preprocessor directives, just like #if, #elif, etc which allow conditional compilation of the code. There are two pragma directives that work with the Microsoft C# compiler (for other compilers like Mono consult the latest documentation). The two pragma directives are warning and checksum. Warning is used to suppress warnings generated by the compiler and checksum is used in ASP.net to detect the right source code to debug.
  • Fixed-size buffers in unsafe code: These are used to represent unsafe arrays of fixed size. They are mentioned here only for the sake of completeness.
  • Friend Assemblies: This feature allows you to set an assembly as a friend of another source assembly. This means that the friend assembly has access to all internal members of the source assembly. The only use that Jon thought of is in unit testing, where you usually have to set your members to public just to test them. This is kind of ugly because you may forget to set them back. By setting the test assembly as a friend, the test classes are able to test the source classes (notice that they still don't have access to private members).
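A few of the features from the list above fit together in one small sketch: a partial class with a partial method hook, properties with a private setter, and a pragma that silences one warning. The Customer class and all of its members are made up for this illustration, and both halves are shown in one file only to keep the example short.

```csharp
using System;

// "Generated" half of the class: declares a hook and calls it.
partial class Customer
{
#pragma warning disable 0169   // suppress "field is never used" for the declaration below
    private int reservedForDesigner;
#pragma warning restore 0169

    public string Name { get; private set; }         // public getter, private setter
    public string LastChange { get; private set; }

    partial void OnNameChanged(string newName);      // partial method: declaration without a body

    public void Rename(string newName)
    {
        Name = newName;
        OnNameChanged(newName);   // compiled away entirely if nobody implements the hook
    }
}

// Hand-written half, normally in a separate file: implements the hook.
partial class Customer
{
    partial void OnNameChanged(string newName)
    {
        LastChange = "renamed to " + newName;
    }
}

static class PartialDemo
{
    static void Main()
    {
        Customer customer = new Customer();
        customer.Rename("Joe");
        Console.WriteLine(customer.LastChange);   // renamed to Joe
        // customer.Name = "Ann";                 // compile error: the setter is private
    }
}
```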
This concludes covering of Chapter 7. I will cover Chapter 8 in the next post.

Saturday, 15 December 2012

Iterator Blocks

Hi Again,

The Iterator design pattern used in object oriented design is a pattern that aims to separate the container to be iterated over from the specific algorithm that is used for the iteration. The Iterator Pattern designates the classes to use to be able to access elements of different containers through a common interface. The implementation of foreach and the IEnumerable, IEnumerator pair (and their generic counterparts) is an example of this pattern in C#. Most mainstream languages such as C#, Python, C++ and Java support this pattern in one form or another.
The implementation of iterators in C# 1.0 required creating a type to implement the IEnumerable interface, which would create another type that implemented the IEnumerator interface. This requires implementing GetEnumerator(), MoveNext() and Current. This of course was too much of a hassle to go through in order to allow lazy access to your container class's elements. In C# 2.0 a new concept called iterator blocks is introduced which makes it quite easy to implement this pattern. I must admit that I think this is a rather bizarre implementation of the pattern and goes against some of what the average developer already knows. Closures and anonymous methods were rather natural to me but these are just weird! Even so, soon you'll find that all these pieces fall into place and make the implementation of powerful libraries like LINQ so easy.

Simple Iterator blocks

Let's assume that I want to implement the IEnumerable interface for a class that contains the numbers in a certain range. I want to be able to access these numbers lazily with an iterator. In C# 1.0 we needed to add another type to supply the enumerator for us and we also had to implement all those mentioned methods, take care of the state of the iterator manually and increment the position manually as well. In C# 2.0 all this can be done with the following piece of code. In other words this is all you have to write in order to implement the iterator model:

public IEnumerator GetEnumerator()
{
    for(int i = 0; i < <Collection>.Length; i++)
        yield return <Collection>[i];
}
In this example I have assumed that the container class that is implementing this interface holds its collection as an array. <Collection> is the placeholder for the array's name. The only difference we see here from the old C# 1.0 syntax is the yield return statement. Effectively what this method does is return the ith element of the collection each time MoveNext() is called on the enumerator. But you may ask where is this enumerator? Where is this state saved? Who knows how far we've gone in the collection, and how? Yes I know. It seems rather bizarre, but this method is a special method now since it contains an iterator block. The method is no longer executed sequentially! What the compiler does when it finds an iterator block in a method is create a custom nested type to hold all the information (current position, last value yielded, whether the end of the collection was reached, etc). This nested type is actually a state machine which moves forward on every call to MoveNext() until you reach the end of the collection, after which MoveNext() returns false. This solution works because in C# a nested type has access to even the private members of the enclosing type.
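For a complete version of the snippet above, here is a minimal sketch with the placeholder filled in. The NumberRange class and its values array are made-up names for the container described at the start of this section.

```csharp
using System;
using System.Collections;

// Hypothetical container holding the numbers start, start + 1, ... in an array.
class NumberRange : IEnumerable
{
    private readonly int[] values;

    public NumberRange(int start, int count)
    {
        values = new int[count];
        for (int i = 0; i < count; i++)
            values[i] = start + i;
    }

    // The whole iterator: the compiler generates the IEnumerator class for us.
    public IEnumerator GetEnumerator()
    {
        for (int i = 0; i < values.Length; i++)
            yield return values[i];
    }
}

static class NumberRangeDemo
{
    public static int Sum(IEnumerable numbers)
    {
        int total = 0;
        foreach (int n in numbers)   // foreach drives MoveNext() and Current
            total += n;
        return total;
    }

    static void Main()
    {
        Console.WriteLine(Sum(new NumberRange(5, 3)));   // 5 + 6 + 7 = 18
    }
}
```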
In order to visualize how this state machine's execution translates into execution in code you have to think of this like so:

  • The method is only called when the first call to MoveNext() is made on the iterator and not when the enumerator is created.
  • After the method is called, execution continues from the top to the first yield return statement (remember that the only allowed return types for a method that contains an iterator block are IEnumerator, IEnumerable and their generic counterparts).
  • From this point the method freezes, meaning that execution halts until the next call to MoveNext(). When the next call to this method is made on the iterator, execution resumes right after the yield return statement.
The iteration stops when the loop ends and the method terminates normally or when a yield break statement is executed. It is important to remind you again that the method cannot be declared to return any type other than those mentioned. Also, the type of the value after yield return has to be object when we're implementing a non-generic iterator and T when we are implementing IEnumerable<T>.
A better way to understand this flow is to implement one yourself and just print before and after calls to MoveNext() and Current to see which part of the code above is executed. If you do so you'll see that there are two very important things that we have to remember when working with iterator blocks: 
  • Firstly, as mentioned before, none of our code is executed until the first call to MoveNext(), so never ever put input validation or code that has to be executed immediately in the method implementing the iterator block. This is not an issue when you are implementing the IEnumerable interface for a class, since GetEnumerator() takes no parameters, but you may not always implement that interface. You can use an iterator block in any method that returns IEnumerable, without implementing IEnumerable in the class. In that case your method may accept input parameters and it may seem perfectly ok to do input validation there. But this would cause big debugging problems since the validation doesn't run when the iterator is created, only when it is first used.
  • Secondly it is important to know that none of our code will ever be executed when Current is accessed on the iterator. That value is simply stored in the nested type created for us by the compiler and reading it doesn't need any execution of our code.
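One common way around the validation problem is to split the method in two: a normal method that validates right away and an iterator block that does the lazy part. The sketch below shows this, with made-up names (Countdown, From, FromImpl), and also prints a message so you can see when the iterator body really starts.

```csharp
using System;
using System.Collections.Generic;

static class Countdown
{
    // Eager part: not an iterator block, so the validation runs at call time.
    public static IEnumerable<int> From(int start)
    {
        if (start < 0)
            throw new ArgumentOutOfRangeException("start");
        return FromImpl(start);
    }

    // Lazy part: nothing in here runs until the first MoveNext().
    private static IEnumerable<int> FromImpl(int start)
    {
        Console.WriteLine("iterator body started");
        for (int i = start; i > 0; i--)
            yield return i;
    }

    static void Main()
    {
        IEnumerable<int> numbers = From(2);              // validation has already run here
        IEnumerator<int> iterator = numbers.GetEnumerator();
        Console.WriteLine("before MoveNext");            // printed first
        iterator.MoveNext();                             // now prints "iterator body started"
        Console.WriteLine(iterator.Current);             // 2, without running any of our code
    }
}
```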

Finally Blocks

A yield return statement can not be used inside a try block that is paired with a catch block. But it can be used inside a try block that is paired with a finally block. It is important for an iterator to have a Dispose() method to release any allocated resources after its execution. In order to release resources whether a yield break statement was met, the iterator exited normally or the caller stopped iterating early, we can wrap the body of the iterator in a try block with a finally block that gets executed no matter what once we are done with the iterator.
The foreach programming construct already has this mechanism built in. Meaning that it would call the Dispose() method on the iterator that it's using. Calling the Dispose() method on the iterator that is implemented using iterator blocks would call its finally block. 
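Here is a small sketch of that behaviour, with made-up names (FinallyDemo, Numbers, Run). The loop stops early, and the finally block still runs because foreach disposes the iterator.

```csharp
using System;
using System.Collections.Generic;

static class FinallyDemo
{
    // Iterator whose cleanup lives in a finally block.
    public static IEnumerable<int> Numbers(List<string> log)
    {
        try
        {
            yield return 1;
            yield return 2;
            yield return 3;
        }
        finally
        {
            log.Add("finally");
        }
    }

    public static List<string> Run()
    {
        List<string> log = new List<string>();
        foreach (int n in Numbers(log))
        {
            log.Add("got " + n);
            if (n == 2)
                break;   // foreach calls Dispose() here, which runs the finally block
        }
        return log;
    }

    static void Main()
    {
        Console.WriteLine(string.Join(", ", Run()));   // got 1, got 2, finally
    }
}
```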

Example From the Book

Although I didn't want to use examples exactly as they are in the book, I have no choice but to do so in this case. This example is just too cool to leave behind. No worries though, this chapter is a free sample chapter anyways.

public static IEnumerable<string> ReadLines(Func<TextReader> provider)
{
    using(TextReader reader = provider())
    {
        string line;
        while((line = reader.ReadLine()) != null)
            yield return line;
    }
}

So in the above example, we are receiving a generic delegate as an argument. Func<TResult> is a generic delegate that takes no parameters and returns a value of type TResult. Here the provider delegate points to the method to call to get the proper text reader with the right encoding. Since we call the provider ourselves, we also own the reader it returns and can dispose of it ourselves. Also the lines in the file are iterated lazily, which matters if we are working with big files. This example brings together the use of delegates and iterator blocks. Actually there is an easy way to create different providers in a different method and have them call this method. That example would include anonymous methods and the concept of closures as well, which is all we have been talking about. I'll leave that to you as an exercise.
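If you want to check your answer to the exercise, one possible shape is sketched below. The wrapping class LineReader and the second overload are my own additions for illustration; the anonymous method captures filename and encoding in a closure, and a StringReader is used in Main only to keep the example self-contained.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text;

static class LineReader
{
    // The method from the example above, unchanged.
    public static IEnumerable<string> ReadLines(Func<TextReader> provider)
    {
        using (TextReader reader = provider())
        {
            string line;
            while ((line = reader.ReadLine()) != null)
                yield return line;
        }
    }

    // Overload for files: the anonymous method captures filename and encoding
    // (a closure) and is only invoked once iteration actually starts.
    public static IEnumerable<string> ReadLines(string filename, Encoding encoding)
    {
        return ReadLines(delegate { return new StreamReader(filename, encoding); });
    }

    static void Main()
    {
        // An in-memory provider instead of a real file.
        foreach (string line in ReadLines(delegate { return new StringReader("first\nsecond"); }))
            Console.WriteLine(line);   // prints "first", then "second"
    }
}
```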
As seen from the previous example this is where everything is starting to fall into place. We now have the power of delegates and anonymous methods and we can also use iterator blocks to access containers lazily with much less effort. In the next post I will look at chapter 7 of the book which concludes C# 2.0's latest features and paves the road to enter into the world of C# 3.0. Stay tuned !

Sunday, 2 December 2012

Delegates and Anonymous Methods

Hi Again !

I have to tell you that today's topic is super cool. It is the stepping stone to anonymous methods and the idiomatic C# 3.0 constructs. In this post I'll cover chapter 5 from the book, which is titled "Fast-tracked delegates". I absolutely love the new functional approach in C#. It doesn't always produce the most readable code (especially if you put everything on one line), but I love the fact that you get so much flexibility and power with just a few lines of code, and we have not even gotten to LINQ yet !

Let's start with C# 1.0 yet again. In C# 1.0, whenever you want to create a delegate you first have to define a delegate type, which consists of the name of the new type and the signature of the methods that can be called through it. Then you have to instantiate that type. The declaration looks like this:

public delegate void DelegateType();
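To round this out, here is a minimal sketch of the instantiation and invocation steps in C# 1.0 style. The DelegateBasics class, the SayHello method and the CallCount field are made up for illustration.

```csharp
using System;

public delegate void DelegateType();

class DelegateBasics
{
    // Hypothetical counter, only here so the effect of the call is observable.
    public static int CallCount;

    // Hypothetical method whose signature matches DelegateType.
    public static void SayHello()
    {
        CallCount++;
        Console.WriteLine("Hello from a delegate");
    }

    public static void Run()
    {
        // C# 1.0 style: the delegate type is named explicitly on instantiation.
        DelegateType instance = new DelegateType(SayHello);
        instance();   // invokes SayHello through the delegate
    }

    static void Main()
    {
        Run();
    }
}
```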

This is all good. Everything seems normal since we are declaring a new type and then instantiating it. But sometimes the C# 1.0 approach to delegates is both restrictive and hard to read. This shows when we have a lot of event subscriptions in our code. In that case we have to keep instantiating the EventHandler delegate type and passing a method group name to it. Can't the compiler just infer the delegate type from the event the handler is assigned to? Here is what the C# 1.0 version looks like:
this.checkBox1.CheckedChanged += new System.EventHandler(this.checkBox1_CheckedChanged);
this.button1.Click += new System.EventHandler(this.button1_Click);
this.checkBox1.KeyPress += new System.Windows.Forms.KeyPressEventHandler(this.checkBox1_KeyPress);
The part of the code where the name of the method to be executed is mentioned is called a method group. It is called so because the name may refer to several overloads of the method. In C# 1.0 there is no guesswork about which of these overloads is used for the delegate, since not only does the delegate type have to be mentioned but the signatures have to match exactly (no delegate variance). As can be seen from the example, each handler is explicitly created as a KeyPressEventHandler or an EventHandler. The issue is that the KeyPress event of the CheckBox control only accepts delegates of type KeyPressEventHandler anyway, so spelling out the type to create the delegate is redundant. Indeed, in C# 2.0 we can omit the delegate type and have the compiler work out which delegate type is needed. This is an implicit conversion from a method group to a delegate:
this.checkBox1.KeyPress += this.checkBox1_KeyPress;
This implicit conversion also comes with the added capability of variance. Just as in method overload resolution, the parameter types are checked to choose the proper overload. We have talked about variance and its existence or non-existence in various parts of the language before. In this case we can use parameter contravariance and return type covariance with our delegates. Contravariance means that the delegate type may declare a parameter of a derived type while the method we assign to it takes a less derived type (a base class) as that parameter. Covariance means that we can assign a method that returns a more derived type than the delegate's signature stipulates. What happens to the type information in these cases? The method with the less derived parameter only sees the base class, and the caller of the covariant delegate only sees the declared (base) return type, so in both cases the information specific to the derived type is lost unless you cast. An example of contravariance can be seen below:
static void DoSameThing(Object sender, EventArgs e)
{
    Console.WriteLine("I'm not doing anything useful");
}

Form form = new Form();
form.Click += DoSameThing;
form.KeyPress += DoSameThing;
form.MouseClick += DoSameThing;
As can be seen above, we are now able to assign a method whose parameter is less derived than the one the delegate declares to all of these event handlers. This is useful if you want to perform a general-purpose task no matter which event is raised, although the method loses the specific information that the derived type carries. The covariance example could look like so:
    public delegate A sampleDelegate();
    public class A
    {
        public void Hi()
        {
            Console.WriteLine("A");
        }
    }
    public class B : A
    {
        new public void Hi()
        {
            Console.WriteLine("B");
        }
    }

    public class RunExample
    {
        public B getSomeB()
        {
            return new B();
        }

        public void run()
        {
            sampleDelegate ourDelegate = getSomeB;
            getSomeB().Hi();
            ourDelegate().Hi();
        }
    }
    /*
    Outputs:
    B
    A
    */
As we noted earlier, although getSomeB() returns a B object, calling it through the delegate gives us a reference typed as A. The call is legal, but B's own members (here, its version of Hi) are no longer accessible without a cast.
The addition of delegate variance in C# 2.0 was a breaking change, since some existing code now behaves differently. An example scenario is shown below:

public delegate void generalDelegate(BufferedStream sr);

public class parentClass
{
    public void DoSomethingWithBuffer(BufferedStream sr)
    {
        Console.WriteLine("Did something in parentClass");
    }
}
public class derivedClass : parentClass
{
    public void DoSomethingWithBuffer(Stream sr)
    {
        Console.WriteLine("Did something in derivedClass");
    }
}
...
derivedClass c = new derivedClass();
generalDelegate gd = new generalDelegate(c.DoSomethingWithBuffer);
gd(new BufferedStream(new MemoryStream()));

In the above example the method called by gd would be the parent class's in C# 1.0 and the derived class's in C# 2.0. Although this is a breaking change, I would say it is unusual for a derived class to declare a more general parameter than its parent anyway. The derived class is there to specialize the base class's methods.

Anonymous Methods

Okay so here is where the fun begins. Anonymous methods are a way of writing a delegate's code inline. Let's say we have a list of student objects (List<Student>). Each student has a name, and let's say we want to get the students whose name starts with 'A'. There are many ways to go about this of course. The more straightforward way is to iterate through the list and filter it according to the predicate, but we can do this in a more elegant way using delegates. The generic List<T> class in .NET supplies a FindAll method with the following signature:

public List<T> FindAll(
 Predicate<T> match
)
This method accepts a Predicate<T> generic delegate and returns a List<T> of all the elements that matched the predicate. One way to use this method is to write a method that has the Predicate<T> delegate's signature and pass the method's name to FindAll using an implicit method group conversion. This, although doable, is not very elegant. The method could be doing something very trivial, and introducing a new method that only makes sense in this scope adds noise to IntelliSense and basically gets in the way. Fortunately anonymous methods come to the rescue here:

List<Student> filteredList = studentList.FindAll(
                                delegate(Student std){ 
                                    return std.Name.StartsWith("A"); 
                                });
This code is so readable and concise, and consequently so appealing to me, that I have actually gone to great lengths to keep myself from using anonymous methods in every single scenario where they are applicable. Okay so let's get into the detail of things. What exactly is an anonymous method ? Is it really a method ? What does it mean to return from an anonymous method ?
The answer to the questions above is: almost yes. You can do almost anything in an anonymous method that you can do in a normal method. For example you can have loops and local variables, you can return, etc. Behind the scenes the compiler actually creates a method and uses it as the target of a delegate instance. Usually this method is created inside the same class and is given a name such as <Main>b__0 (the exact form depends on the compiler), which is called an unspeakable name. The names are made this way because they are not valid C# identifiers, so they can never conflict with names in your own code. You can use ILSpy to see your anonymous methods in the IL after compilation (.NET Reflector is not free anymore and should be avoided).
Note that when you specify the return statement in the anonymous method you are truly returning from that method not the enclosing method. It is easy to get those two mixed up.
Okay, two things remain in this section and I will just mention them without going into detail. Firstly, anonymous methods are not contravariant: if you specify a parameter list, its parameter types have to match the expected delegate type exactly. Secondly, the parameter list (the parentheses after the delegate keyword) can be omitted altogether if the body doesn't need the parameters and the compiler can still resolve the delegate type unambiguously.
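A short sketch of both forms, using the standard EventHandler delegate so it runs without any UI. The ParameterListDemo class and its Calls counter are hypothetical names used only for this illustration.

```csharp
using System;

class ParameterListDemo
{
    // Hypothetical counter so the effect of each handler can be checked.
    public static int Calls;

    public static void Run()
    {
        // EventHandler expects (object sender, EventArgs e). Neither is used
        // in the body, so the parameter list is left out entirely.
        EventHandler shortForm = delegate
        {
            Calls++;
            Console.WriteLine("Handled without naming the parameters");
        };
        shortForm(null, EventArgs.Empty);

        // With an explicit parameter list, the types must match the
        // delegate's parameter types exactly.
        EventHandler explicitForm = delegate(object sender, EventArgs e)
        {
            Calls++;
            Console.WriteLine("Handled by " + (sender ?? "nobody"));
        };
        explicitForm("a test sender", EventArgs.Empty);
    }

    static void Main()
    {
        Run();
    }
}
```

The short form stops working when the compiler has more than one delegate type to choose from. The classic case is new Thread(delegate { ... }), where both ThreadStart and ParameterizedThreadStart would fit, so you have to spell out the parameter list to pick one.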

Closures

This is where the true power of anonymous methods is revealed. Closures can be confusing for some and second nature to others. I found them quite straightforward, so hopefully you will too. Closures are absolutely crucial to lambda expressions and LINQ. Jon warns readers to make sure they are awake and have some time to spend on the section since it could get confusing. But don't be alarmed, there are countless other articles on the internet about them if you find the topic hard to grasp here.
Put in simple terms, a closure is a function that is able to interact with an environment beyond the parameters supplied to it. Let's make this abstract definition a little more concrete, but before that we need to define two types of variables:

  • Outer Variables: These are variables that have an anonymous method declared in their scope. 
  • Captured [Outer] Variables: These types of variables are outer variables that are used inside the anonymous method.
To go back to the definition of closures, the anonymous method is the function and the captured variables are the environment beyond its own that it interacts with.

void SampleMethod()
{
    string capturedVariable = "test";   // used inside the anonymous method
    int outerVariable = 3;              // in scope, but never captured

    MethodInvoker ourDelegate = delegate()
    {
        string variable = "amazing";
        Console.WriteLine("This is an " + variable + " " + capturedVariable);
    };
    ourDelegate();
}
This code should be blowing your mind right now ! Or maybe not ? The fact that we were able to use the captured variable as if it was declared inside the anonymous method seems really strange, and it should go against your previous knowledge of methods. After all, methods are normally only allowed to interact with the parameters that are passed to them, and maybe also the this reference in an instance method, but surely not with an environment beyond their own.
It is important to note two things at this point. First, the anonymous method is not called when it is defined. When we create the MethodInvoker delegate above we are not executing the anonymous method, so any captured variable that the method changes is not touched until the delegate is invoked. Second, the captured variable used inside the anonymous method is the very same variable that is used everywhere else in the enclosing method, not a copy of it.
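Both points can be checked with a small sketch. The CapturedVariableDemo class is a hypothetical name, and MethodInvoker is redeclared locally as a stand-in for the Windows Forms type so the code compiles on its own.

```csharp
using System;

// Stand-in for System.Windows.Forms.MethodInvoker so the sketch
// compiles without a Windows Forms reference.
delegate void MethodInvoker();

class CapturedVariableDemo
{
    public static string Run()
    {
        string captured = "before delegate creation";

        MethodInvoker show = delegate
        {
            // Reads and then changes the enclosing method's variable.
            Console.WriteLine("Inside:  " + captured);
            captured = "changed by delegate";
        };

        // Creating the delegate has not run its body yet, so this
        // assignment is what the body will see when it finally runs.
        captured = "changed before invocation";
        show();

        // The change made inside the delegate is visible out here.
        Console.WriteLine("Outside: " + captured);
        return captured;
    }

    static void Main()
    {
        Run();
    }
}
```

The first line printed shows "changed before invocation" and the second shows "changed by delegate", which is only possible if both sides are working with one and the same variable.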
Now why should we use captured variables and why are they useful ? Well, remember our example with the student names above ? We had to hard-code the character that the name started with. With captured variables we can now have a method that accepts this character as a parameter and then captures it in an anonymous method. I will leave the details of this approach to you.
So far so good. You may be wondering at this point what the fuss is about, since there is nothing terribly complex about captured variables so far. But here is where things get a little strange nonetheless ! What if I told you that local variables in a method that are captured by an anonymous method can live on after the method has returned ? Pretty crazy isn't it ! It's hard enough to parse that sentence, let alone understand it. I will give an example of what this means, and the repercussions of such behavior should be evident by the time we are through.
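For reference, one way the student filter could look once the prefix is captured instead of hard-coded. This is a sketch under assumptions: the Student class here is a minimal stand-in with just a Name, and FindByPrefix and StudentFilter are hypothetical names.

```csharp
using System;
using System.Collections.Generic;

// Minimal stand-in for the Student type mentioned in the text.
class Student
{
    public string Name;

    public Student(string name)
    {
        Name = name;
    }
}

class StudentFilter
{
    // Hypothetical helper: prefix is an ordinary parameter that the
    // anonymous method captures, so nothing is hard-coded any more.
    public static List<Student> FindByPrefix(List<Student> students, string prefix)
    {
        return students.FindAll(delegate(Student std)
        {
            return std.Name.StartsWith(prefix, StringComparison.Ordinal);
        });
    }

    static void Main()
    {
        List<Student> studentList = new List<Student>();
        studentList.Add(new Student("Alice"));
        studentList.Add(new Student("Bob"));
        studentList.Add(new Student("Adam"));

        foreach (Student s in FindByPrefix(studentList, "A"))
        {
            Console.WriteLine(s.Name);   // Alice, then Adam
        }
    }
}
```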

public MethodInvoker GetDelegate()
{
    int localVar = 10;

    MethodInvoker returningDelegate = delegate{
                                         Console.WriteLine(localVar);
                                         localVar ++;
                                               };
    return returningDelegate;
}
...
MethodInvoker y = GetDelegate();
y();

So what will happen in the above snippet when we call y() ? If localVar was not a captured variable we would expect it to be destroyed when the method returned. After all, local variables normally live on the stack, and when the method returns and its stack frame is popped the variable is gone. But the fact is that localVar is not actually stored on the stack; it is stored in an instance of a compiler-generated class that lives on the heap. The GetDelegate method and the anonymous method both hold a reference to that instance and access the variable through it. So the call prints 10, and calling y() again would print 11. Captured variables live at least as long as any delegate instance that references them.
There is more: captured variables can actually be shared among several delegates that reference them ! The key thing to remember is that a separate instance of a variable is captured each time the variable is instantiated, and a variable is said to be instantiated whenever execution enters the scope in which it is declared. So in the example below the index variable (i) of the for statement is instantiated only once and is shared by all the delegates (as the list variable would be if it were captured), while the counter variable is not shared, since it is declared inside the body of the for statement and is hence instantiated anew in each iteration.
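To make the "class on the heap" idea more tangible, here is a hand-written approximation of what the compiler does with GetDelegate. It is only a sketch: the real generated class and method have unspeakable names, and GetDelegateClosure, AnonymousMethod and ClosureLoweringDemo are invented here. MethodInvoker is redeclared locally as a stand-in for the Windows Forms type.

```csharp
using System;

// Stand-in for System.Windows.Forms.MethodInvoker.
delegate void MethodInvoker();

// Hypothetical name; the compiler would generate an unspeakable one.
class GetDelegateClosure
{
    // The captured local variable becomes a field on a heap object.
    public int localVar;

    // The body of the anonymous method becomes an instance method.
    public void AnonymousMethod()
    {
        Console.WriteLine(localVar);
        localVar++;
    }
}

class ClosureLoweringDemo
{
    public static MethodInvoker GetDelegate()
    {
        // Instead of a stack slot, the method allocates the closure
        // object and works with the variable through it.
        GetDelegateClosure closure = new GetDelegateClosure();
        closure.localVar = 10;

        // The delegate keeps the closure object alive after we return.
        return new MethodInvoker(closure.AnonymousMethod);
    }

    static void Main()
    {
        MethodInvoker y = GetDelegate();
        y();   // prints 10
        y();   // prints 11
    }
}
```

Seen this way there is nothing magical going on: the delegate holds a reference to an ordinary object, and the "local variable" is just a field on it.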
List<MethodInvoker> list = new List<MethodInvoker>();

for(int i = 0; i < 10; i++)
{
    int counter = i * 4;
    list.Add(delegate
             {
                Console.WriteLine("Counter: " + counter + " Index: " + i);
                counter++;
             });
}
foreach(MethodInvoker method in list)
{
    method();
}
...
list[0]();
list[0]();

If you run the code above you'd see that each delegate has its own counter variable: the foreach loop prints the counters 0, 4, 8 and so on up to 36. The index, on the other hand, is shared, and every line prints Index: 10 because the for loop has already finished by the time any of the delegates runs. The two extra calls to list[0] print the counters 1 and 2, which shows that each delegate's counter also persists between calls.
Finally, to sum up, the rule of thumb with captured variables is to avoid scenarios that make the code too complex to understand. Mixing shared and distinct variables can make the code very hard to read and its results hard to predict. But as I showed you before, closures can be powerful tools when used properly. Hopefully this post has impressed upon you the power and beauty of anonymous methods and will give you the push to use them every now and then when the circumstances are right.
The next topic will be iterator blocks. I will try to get around to doing a post on them next week. For now, be well and try to stay warm !