Sunday, 2 December 2012

Delegates and Anonymous Methods

Hi Again !

I have to tell you that today's topic is super cool. It is the stepping stone to anonymous methods and the idiomatic C# 3.0 constructs. In this post I'll cover chapter 5 of the book, which is titled "Fast-tracked delegates". I absolutely love the new functional approach in C#. It doesn't always produce the most readable code (especially if you put everything on one line), but I love the fact that you get so much flexibility and power with just a few lines of code, and we have not even gotten to LINQ yet!

Let's start with C# 1.0 yet again. In C# 1.0, whenever you want to create a delegate you first have to declare a delegate type, which consists of the name of the new type and the signature of the methods that can be called through it. Then you have to instantiate that type explicitly. The declaration looks like this:

public delegate void DelegateType();
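To make the instantiation step concrete, here is a minimal sketch in C# 1.0 style. The class DelegateDemo and the method SayHello are made-up names for illustration; only DelegateType comes from the declaration above.

```csharp
using System;

public delegate void DelegateType();

public static class DelegateDemo
{
    // Counts invocations so the effect of calling the delegate is observable.
    public static int CallCount = 0;

    // Hypothetical target method; its signature matches DelegateType.
    public static void SayHello()
    {
        CallCount++;
        Console.WriteLine("Hello from a delegate");
    }

    public static void Run()
    {
        // C# 1.0: the delegate type has to be named explicitly with 'new'.
        DelegateType d = new DelegateType(SayHello);
        d();          // invokes SayHello
        d.Invoke();   // the same call, written out explicitly
    }
}
```

Calling DelegateDemo.Run() invokes SayHello twice, once through each call syntax.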

This is all good. Everything seems normal, since we are declaring a new type and then instantiating it. But C# 1.0's approach to delegates can be both restrictive and hard to read, and this shows when we have a lot of event subscriptions in our code. In that case we have to keep instantiating the EventHandler delegate type and passing a method group name to it. Can't the compiler just infer the delegate type from the event the handler is being assigned to? Here is what the C# 1.0 version looks like:
this.checkBox1.CheckedChanged += new System.EventHandler(this.checkBox1_CheckedChanged);
this.button1.Click += new System.EventHandler(this.button1_Click);
this.checkBox1.KeyPress += new System.Windows.Forms.KeyPressEventHandler(this.checkBox1_KeyPress);
The part of the code that names the method to be executed is called a method group. It is called that because the name may refer to several overloads. In C# 1.0 there is no guesswork about which overload is used for the delegate, since not only does the delegate type have to be mentioned but the signatures have to match exactly (no delegate variance). As can be seen from the example, each delegate is explicitly created as a KeyPressEventHandler or an EventHandler. The thing is that the KeyPress event of the CheckBox control only accepts delegates of type KeyPressEventHandler anyway, so naming the type again when creating the delegate is redundant. Indeed, in C# 2.0 we can omit the delegate type and let the compiler work out which one is needed. This is an implicit conversion from a method group to a delegate:
this.checkBox1.KeyPress += this.checkBox1_KeyPress;
This implicit conversion also comes with the added capability of variance. Just as in overload resolution, the parameter types are checked to choose the proper overload. We have talked about variance and its existence or non-existence in various parts of the language before. In this case we are able to use parameter contravariance and return type covariance with our delegates. The former means that the delegate may be declared with a derived parameter type while the method group we assign to it takes a less derived type (a base class) as its parameter. The latter means that the method may return a more derived type than the delegate signature stipulates. Now what happens to the static type of the return value, or of the parameter that is now less derived? You basically lose the compile-time information associated with the derived type and you're stuck with the base class. An example of contravariance can be seen below:
static void DoSameThing(Object sender, EventArgs e)
{
    Console.WriteLine("I'm not doing anything useful");
}

Form form = new Form();
form.Click += DoSameThing;
form.KeyPress += DoSameThing;
form.MouseClick += DoSameThing;
As can be seen above, we are now able to take a method whose parameter is less derived than the one the delegate type declares (EventArgs instead of KeyPressEventArgs or MouseEventArgs) and assign it to all three event handlers. This can be useful if you want to do general-purpose work no matter which event is raised, since with this approach you lose the specific information that the derived type carries. A covariance example could look like this:
    public delegate A sampleDelegate();
    public class A
    {
        public void Hi()
        {
            Console.WriteLine("A");
        }
    }
    public class B : A
    {
        new public void Hi()
        {
            Console.WriteLine("B");
        }
    }

    public class RunExample
    {
        public B getSomeB()
        {
            return new B();
        }

        public void run()
        {
            sampleDelegate ourDelegate = getSomeB;
            getSomeB().Hi();
            ourDelegate().Hi();
        }
    }
    /*
    Outputs:
    B
    A
    */
As we noted earlier, although getSomeB() returns a B object, calling it through the delegate gives us a reference typed as A. The call is legal, but because Hi() is hidden with new rather than overridden, the A version runs, and B's members are no longer reachable without a cast.
The addition of delegate variance in C# 2.0 was a breaking change, since some existing code now behaves differently. An example scenario is shown below:
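The point about losing the derived type can be checked with a small variation of the example. This is only a sketch: Hi() returns a string here instead of printing, so the result is easy to inspect, and CovarianceDemo, ViaDelegate and ViaCast are names I made up.

```csharp
using System;

public class A
{
    public string Hi() { return "A"; }
}

public class B : A
{
    // 'new' hides A.Hi instead of overriding it.
    public new string Hi() { return "B"; }
}

public delegate A SampleDelegate();

public static class CovarianceDemo
{
    public static B GetSomeB() { return new B(); }

    public static string ViaDelegate()
    {
        // Return type covariance: a method returning B fits a delegate returning A.
        SampleDelegate d = GetSomeB;
        return d().Hi();        // static type is A, so A.Hi is chosen
    }

    public static string ViaCast()
    {
        SampleDelegate d = GetSomeB;
        return ((B)d()).Hi();   // the object is still a B, a cast gets B.Hi back
    }
}
```

ViaDelegate() returns "A" and ViaCast() returns "B", so what is lost is only the compile-time type, not the object itself.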

public delegate void generalDelegate(BufferedStream sr);

public class parentClass
{
    public void DoSomethingWithBuffer(BufferedStream sr)
    {
        Console.WriteLine("Did something in parentClass");
    }
}
public class derivedClass : parentClass
{
    public void DoSomethingWithBuffer(Stream sr)
    {
        Console.WriteLine("Did something in derivedClass");
    }
}
...
derivedClass c = new derivedClass();
generalDelegate gd = new generalDelegate(c.DoSomethingWithBuffer);
gd(new BufferedStream(new MemoryStream())); // BufferedStream needs an underlying stream

In the above example the method called by gd is the parent class's in C# 1.0 and the derived class's in C# 2.0. Although this is a breaking change, I would say it is unusual for a derived class to declare a method with a more general parameter than its parent's anyway. The derived class is usually there to specialize the base class's methods.

Anonymous Methods

Okay, so here is where the fun begins. Anonymous methods are a way of writing a delegate's code inline, right where the delegate is needed. Let's say we have a list of student objects (List<Student>). Each student has a name, and let's say we want to get the students whose names start with 'A'. There are many ways to go about this of course. The most straightforward way is to iterate through the list and filter it according to the predicate by hand, but we can do this in a more elegant way using delegates. The generic List<T> class in .NET supplies a FindAll method with the following signature:

public List<T> FindAll(
 Predicate<T> match
)
This method accepts a Predicate<T> generic delegate and returns a List<T> of all the elements that matched the predicate. One way to use this method is to write a method that has the Predicate<T> delegate's signature and pass that method's name to FindAll using an implicit method group conversion. This, although doable, is not very elegant. The method could be doing something very trivial, and introducing a new method that only makes sense in this one scope adds noise to IntelliSense and basically gets in the way. Fortunately anonymous methods come to the rescue here:

List<Student> filteredList = studentList.FindAll(
                                delegate(Student std){ 
                                    return std.Name.StartsWith("A"); 
                                });
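For comparison, here are the named-method version and the anonymous-method version side by side. This is a sketch under assumptions: the post only says that a student has a name, so the Student class below and the StudentFilters helper are hypothetical.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical Student type with just a name.
public class Student
{
    private readonly string name;
    public Student(string name) { this.name = name; }
    public string Name { get { return name; } }
}

public static class StudentFilters
{
    // A named method whose signature matches Predicate<Student>.
    private static bool NameStartsWithA(Student std)
    {
        return std.Name.StartsWith("A");
    }

    public static List<Student> WithNamedMethod(List<Student> students)
    {
        // Implicit method group conversion to Predicate<Student>.
        return students.FindAll(NameStartsWithA);
    }

    public static List<Student> WithAnonymousMethod(List<Student> students)
    {
        // Same predicate, written inline as an anonymous method.
        return students.FindAll(delegate(Student std)
        {
            return std.Name.StartsWith("A");
        });
    }
}
```

Both methods return the same students. The difference is only where the predicate lives: as an extra member of the class, or right at the call site.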
This code is so readable and concise, and consequently so appealing to me, that I have actually had to go to great lengths to keep myself from using anonymous methods in every single scenario where they are applicable. Okay, so let's get into the details. What exactly is an anonymous method? Is it really a method? What does it mean to return from an anonymous method?
The short answer is: almost. You can do almost anything in an anonymous method that you can do in a normal method. For example you can have loops and local variables, you can return, and so on. Behind the scenes the compiler actually creates a real method and makes it the target of the delegate instance. Usually this method is created inside the same class and is given a name along the lines of <Main>b__0, which is called an unspeakable name. Such names are not valid C# identifiers, so they can never conflict with names in your own code, and the exact naming scheme is a compiler implementation detail. You can use ILSpy to see your anonymous methods in the compiled IL (.NET Reflector is not free anymore, so I would avoid it).
Note that when you specify the return statement in the anonymous method you are truly returning from that method not the enclosing method. It is easy to get those two mixed up.
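A tiny sketch of that point, with made-up names (ReturnDemo, isEven): the return inside the anonymous method hands a value back to whoever invokes the delegate, and the enclosing method simply carries on.

```csharp
using System;

public static class ReturnDemo
{
    public static string Run()
    {
        Predicate<int> isEven = delegate(int n)
        {
            return n % 2 == 0;   // returns from the anonymous method only
        };

        // Execution continues here; the 'return' above did not leave Run.
        bool result = isEven(4);
        return "isEven(4) = " + result;
    }
}
```

Run() returns "isEven(4) = True". Note also that the anonymous method returns a bool while Run returns a string, which makes it clear that the two return statements belong to different methods.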
Okay, there are two things that remain in this section and I will just mention them without getting into the details. Firstly, contravariance does not apply to anonymous methods: if you specify a parameter list, the parameter types have to match the expected delegate type exactly. Secondly, the parameter list after the delegate keyword can be omitted entirely if the body doesn't use the parameters and leaving them out doesn't make the choice of delegate type ambiguous.
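Both forms can be sketched with the standard EventHandler delegate. The class ParameterlessDemo and the Calls counter are only there for illustration.

```csharp
using System;

public static class ParameterlessDemo
{
    public static int Calls = 0;

    public static void Run()
    {
        // Parameter list omitted: this still converts to EventHandler,
        // which takes (object, EventArgs), because the body ignores both.
        EventHandler shortForm = delegate { Calls++; };
        shortForm(null, EventArgs.Empty);

        // Parameter list given: the types must match EventHandler exactly.
        EventHandler fullForm = delegate(object sender, EventArgs e) { Calls++; };
        fullForm(null, EventArgs.Empty);
    }
}
```

The short form stops working when the compiler has more than one candidate delegate type to pick from. As far as I know the classic case is the Thread constructor, which accepts both ThreadStart and ParameterizedThreadStart, so a bare delegate { ... } is ambiguous there.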

Closures

This is where the true power of anonymous methods is revealed. Closures can be confusing for some and second nature to others. I found them quite straightforward, so hopefully you will too. Closures are absolutely crucial to lambda expressions and LINQ. Jon warns readers to make sure they are awake and have some time to spend on the section, since it can get confusing. But don't be alarmed: there are countless other articles on the internet about closures if you find the topic hard to grasp here.
A closure, put in simple terms, is a function that is able to interact with an environment beyond the parameters supplied to it. Let's make this abstract definition a little more concrete, but before that we need to define two types of variables:

  • Outer Variables: These are variables that have an anonymous method declared in their scope. 
  • Captured [Outer] Variables: These types of variables are outer variables that are used inside the anonymous method.
To go back to the definition of closures, the anonymous method is the function and the captured variables are the environment beyond its own parameters that it interacts with.

void SampleMethod()
{
    string capturedVariable = "test";
    int outerVariable = 3;   // an outer variable, but never captured

    MethodInvoker ourDelegate = delegate()
    {
        string variable = "amazing";   // ordinary local of the anonymous method
        Console.WriteLine("This is an " + variable + " " + capturedVariable);
    };
    ourDelegate();
}
This code should be blowing your mind right now! Or maybe not? The fact that we were able to use the captured variable as if it were declared inside the anonymous method seems really strange, and it should go against your previous knowledge of methods. After all, methods are only allowed to interact with the parameters that are passed to them. Maybe also with the this reference in an instance method. But surely not with an environment beyond their own.
It is important to note two things at this point. First, the anonymous method is not called when it is defined. When we create the MethodInvoker delegate above we are not executing the anonymous method, so any captured variable that the method changes is not touched until the delegate is invoked. Second, the captured variable used inside the anonymous method is the very same variable that is used in the rest of the enclosing method, not a copy of it.
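Both observations can be seen in one short sketch. SimpleAction is a stand-in delegate type I declare here so the example doesn't need a reference to Windows Forms for MethodInvoker, and CaptureDemo is likewise a made-up name.

```csharp
using System;

// Stand-in for MethodInvoker, declared locally to keep the sketch self-contained.
public delegate void SimpleAction();

public static class CaptureDemo
{
    public static string Run()
    {
        string captured = "before";
        string log = "";

        // Creating the delegate does not run the body.
        SimpleAction show = delegate
        {
            log += "[inside: " + captured + "]";
            captured = "changed inside";
        };

        // Changing the variable afterwards is visible inside the delegate...
        captured = "changed outside";
        show();

        // ...and the delegate's own change is visible out here.
        log += "[after: " + captured + "]";
        return log;
    }
}
```

Run() returns "[inside: changed outside][after: changed inside]". The value "before" never shows up, because the body only runs when show() is invoked, and both sides are reading and writing one and the same variable.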
Now why should we use captured variables and why are they useful? Well, remember our example with the student names above? We had to hard-code the character that the name started with. With captured variables we can now write a method that accepts this character as a parameter and captures it in an anonymous method. I will leave the details of this approach to you.
So far so good. You may be wondering at this point what the fuss is about, since there is nothing terribly complex about captured variables so far. But here is where things get a little strange, to say the least! What if I told you that local variables in a method that are captured by an anonymous method can live on after the method has returned? Pretty crazy, isn't it! It's hard enough to parse that sentence, let alone understand it. I will give an example of what this means, and the repercussions of such behavior should be evident by the time we are through.
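In case you want something to compare your own attempt against, here is one possible sketch. The Student class and the FindByPrefix method are assumptions of mine, and I take a string prefix rather than a single character so that it can go straight into StartsWith.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical Student type with just a name.
public class Student
{
    private readonly string name;
    public Student(string name) { this.name = name; }
    public string Name { get { return name; } }
}

public static class StudentSearch
{
    public static List<Student> FindByPrefix(List<Student> students, string prefix)
    {
        // 'prefix' is a parameter of the enclosing method and is captured
        // by the anonymous method, so nothing is hard-coded any more.
        return students.FindAll(delegate(Student std)
        {
            return std.Name.StartsWith(prefix);
        });
    }
}
```

With a list containing Alice, Bob and Anna, FindByPrefix(students, "A") gives back Alice and Anna, and FindByPrefix(students, "B") gives back only Bob.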

public MethodInvoker GetDelegate()
{
    int localVar = 10;

    MethodInvoker returningDelegate = delegate{
                                         Console.WriteLine(localVar);
                                         localVar ++;
                                               };
    return returningDelegate;
}
...
MethodInvoker y = GetDelegate();
y();

So what will happen in the above snippet when we call y()? If localVar were not a captured variable we would expect it to be destroyed when the method returned. After all, local variables normally live on the stack, and when the function returns and the stack frame is popped the variable is gone. But the fact is that localVar is not actually on the stack: the compiler stores it in a field of a generated class whose instance lives on the heap. The GetDelegate method and the anonymous method both hold a reference to an instance of that class and access the variable through it. So the call prints 10, and the variable is then incremented to 11. Captured variables live at least as long as any delegate instance that references them.
There is more: captured variables can actually be shared among several delegates that reference them! The key thing to remember is that a separate instance of a variable is captured each time the variable is instantiated. A variable is said to be instantiated whenever execution enters the scope in which it is declared. So in the example below the index variable (i) of the for statement is instantiated only once and is shared by all the delegates, while the counter variable is not shared, since it is declared inside the loop body and is hence instantiated on every iteration. (The list variable is an outer variable too, but no anonymous method uses it, so it is never captured.)
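To get a feel for what happens behind the scenes, here is a sketch that puts the anonymous method next to a hand-written approximation of the class the compiler generates. The real generated class has an unspeakable name and its exact shape is an implementation detail, so Closure, NextValue and LifetimeDemo below are purely my own stand-ins. The delegate returns the value instead of printing it so that it is easy to inspect.

```csharp
using System;

public delegate int NextValue();

public static class LifetimeDemo
{
    // Version with a captured local, as in the snippet above.
    public static NextValue GetDelegate()
    {
        int localVar = 10;
        return delegate { return localVar++; };
    }

    // Hand-written approximation of the compiler-generated class:
    // the "local" becomes a field on a heap object.
    private sealed class Closure
    {
        public int localVar;
        public int Next() { return localVar++; }
    }

    public static NextValue GetDelegateByHand()
    {
        Closure closure = new Closure();
        closure.localVar = 10;
        // The delegate keeps 'closure' alive after this method returns.
        return new NextValue(closure.Next);
    }
}
```

Calling the delegate returned by either method twice gives 10 and then 11, long after GetDelegate has returned. A second call to GetDelegate starts again at 10, because each call creates a fresh instance of the variable.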
List<MethodInvoker> list = new List<MethodInvoker>();

for(int i = 0; i < 10; i++)
{
    int counter = i * 4;
    list.Add(delegate
             {
                Console.WriteLine("Counter: " + counter + " Index: " + i);
                counter++;
             });
}
foreach(MethodInvoker method in list)
{
    method();
}
...
list[0]();
list[0]();

If you run the code above you'll see that each anonymous method gets its own counter variable while the index variable is shared among all of them. Since the delegates only run after the loop has finished, every line reports an index of 10. The counters start at 0, 4, 8 and so on, and the two extra calls to list[0] print counters of 1 and 2, because that delegate's own counter keeps its value between calls.
Finally, to sum up, the rule of thumb in using captured variables is that you should avoid scenarios that make the code too complex to understand. Mixing shared and distinct variables can make the code very hard to read and the results hard to predict. But as I showed you before, closures can be a powerful tool when used properly. Hopefully this post has impressed upon you the power and beauty of anonymous methods and will give you the push to use them every now and then when the circumstances are right.
The next topic will be iterator blocks. I will try to get around to doing a post on them sometime next week. For now, be well and try to stay warm!
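Here is a smaller variation of the same loop that collects its lines in a list instead of printing them, which makes the sharing easier to check. Reporter and LoopCaptureDemo are made-up names, and the loop runs only three times to keep the result short.

```csharp
using System;
using System.Collections.Generic;

public delegate string Reporter();

public static class LoopCaptureDemo
{
    public static List<string> Run()
    {
        List<Reporter> list = new List<Reporter>();

        for (int i = 0; i < 3; i++)
        {
            int counter = i * 4;   // a new instance on every iteration
            list.Add(delegate
            {
                // 'i' is the single shared loop variable,
                // 'counter' belongs to this delegate alone.
                string line = "Counter: " + counter + " Index: " + i;
                counter++;
                return line;
            });
        }

        List<string> output = new List<string>();
        foreach (Reporter reporter in list)
        {
            output.Add(reporter());
        }
        output.Add(list[0]());   // the first delegate again
        return output;
    }
}
```

Run() returns "Counter: 0 Index: 3", "Counter: 4 Index: 3", "Counter: 8 Index: 3" and finally "Counter: 1 Index: 3". Every line shows index 3 because all three delegates read the one shared i after the loop has ended, while each counter advances independently.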
