Monday, 4 February 2013

Starting Coding in C# 3.0

From Chapter 8 onwards we will introduce all the bits and pieces that eventually come together to make LINQ possible. In this chapter we will look at automatic properties, implicitly typed local variables, object and collection initializers, implicitly typed arrays and anonymous types.

In some ways, from this post on we will pave the way to LINQ. After we cover anonymous types (this post), expression trees and lambda expressions (next post) and extension methods (post after next), we will have covered all the bits and pieces that enable LINQ. What is left is only introducing the syntax of query expressions, and we're done! So think of this post and the later ones as learning the underpinnings of LINQ. What is really interesting about this approach is that not only will you learn LINQ and see its power, but by the time it is introduced you will know how it actually works underneath. That should clear up much of the confusion about how to optimize your queries, when to use LINQ and so on.

Automatic Properties

Automatic properties are a feature that is easy to learn and use. It is one of those features that you will use a lot once you learn it. I have. Remember all that extra code you had to write just to enable encapsulated access to your private fields? Although it was only a few lines of code, it created unneeded clutter when you implemented a few trivial properties. By trivial I mean properties that lack any kind of validation or logging and simply provide access to a field. You can now create this type of property with one line of code:

C# 2.0:
private int ID;
public int GetID
{
    get
    {
         return ID;
    }
    set
    {
        ID = value;
    }
}

C# 3.0:
public int GetID { get; set; }

You can put an access modifier on the getter or the setter, and you can make the automatic property static. Wondering what a static automatic property may be used for? How about combining a public getter with a private setter and implementing a Singleton object? You would need a static automatic property for that, as seen below:

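public class Singleton
{
    // A static constructor takes no access modifier.
    static Singleton()
    {
        SingletonInstance = new Singleton();
    }
    // Private instance constructor so that nobody else can create instances.
    private Singleton() {}
    // Only one accessor may carry a modifier, and it must be more restrictive than the property.
    public static Singleton SingletonInstance { get; private set; }
}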

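Notice that the automatic property itself adds no thread safety: the generated getter and setter are plain, unsynchronized accessors. (In this particular example the instance is created in the static constructor, which the runtime runs only once per type.) In fact, automatic properties have nothing to do with the CLR. They are just syntactic sugar added to the language by the nice people designing the C# compiler, which generates the hidden backing field for you.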
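To see the "syntactic sugar" claim for yourself, here is a minimal sketch that uses reflection to list the hidden field behind an automatic property. The Customer class and the AutoPropertyDemo helper are made up for this illustration, and the exact name of the generated field is a compiler implementation detail, so don't rely on it.

```csharp
using System;
using System.Reflection;
using System.Runtime.CompilerServices;

// Hypothetical class used only for this illustration.
public class Customer
{
    // Automatic property: the compiler supplies the hidden backing field.
    public int Id { get; private set; }

    public Customer(int id) { Id = id; }
}

public static class AutoPropertyDemo
{
    // Returns the private instance fields that exist on Customer.
    public static FieldInfo[] HiddenFields()
    {
        return typeof(Customer).GetFields(BindingFlags.Instance | BindingFlags.NonPublic);
    }

    public static void Main()
    {
        foreach (FieldInfo field in HiddenFields())
        {
            // The backing field is marked as compiler generated.
            bool generated = field.IsDefined(typeof(CompilerGeneratedAttribute), false);
            Console.WriteLine("{0} (compiler generated: {1})", field.Name, generated);
        }
    }
}
```

We never declared a field in Customer, yet one shows up at runtime. That is all an automatic property is: an ordinary property plus a field you cannot name from your own code.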

Implicitly typed local variables

In the first few posts on C# we covered the fact that the C# type system is static, explicit and safe. This is because each and every variable has a known, explicitly stated type which remains the same throughout its lifetime. It is also safe, as opposed to the type system in C++. This has been entirely true up to C# 3.0. C# 3.0 introduces implicitly typed local variables, which challenge the "explicit" part. Notice that C#'s type system stays static and safe until C# 4.0, where the static part is challenged as well.
To define an implicitly typed local variable in C#, use the "var" keyword in place of the type name in the declaration of a local variable:

1) Type varWithType = new Type();
2) var varWithType = new Type();
3) var varWithType = new anotherType();
There are a few things to take away from the above example. The first line defines a variable with the explicit type "Type". The second line defines the same variable implicitly, using the var keyword. The most important thing to realize here is that the first and second lines compile to exactly the same thing: the compiler infers the type from the initializer, and the variable is still statically typed. So "implicitly typed" should not be read as "untyped". If the variable really had no fixed type, we could follow line 2 with line 3 and give the same variable a value of another type, but this results in a compile error. There are also some limitations on where the "var" keyword can be used. The limitations are as follows:

  • It can't be used for static or instance fields. The variable must be a local variable.
  • The variable must be initialized as part of the declaration.
  • The initialization expression cannot be null.
  • You can't declare multiple variables in the same statement when using the var keyword.
  • The initialization expression cannot be an anonymous function (anonymous methods and lambda expressions) or a method group (can you guess why?).
In some of the cases above, the "var" keyword can still be used if a cast tells the compiler which type we want, but that defeats the purpose of using var!
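The rules above can be sketched in a few lines. This is only an illustration: the VarDemo class and its variable names are made up, and the declarations that would not compile are left as comments so that the rest of the sample runs.

```csharp
using System;
using System.Collections.Generic;

public static class VarDemo
{
    public static void Main()
    {
        var count = 42;                    // inferred as int
        var names = new List<string>();    // inferred as List<string>
        names.Add("Joe");

        // The inferred types are ordinary static types.
        Console.WriteLine(count.GetType() == typeof(int));            // True
        Console.WriteLine(names.GetType() == typeof(List<string>));   // True

        // count = "forty-two";      // compile error: count is an int for its whole lifetime
        // var noInitializer;        // compile error: there is nothing to infer the type from
        // var nothing = null;       // compile error: null has no type of its own
        // var a = 1, b = 2;         // compile error: only one variable per var declaration
        // var square = x => x * x;  // compile error: the lambda has no type of its own

        // A cast supplies the missing type, at which point var saves us nothing.
        var text = (string)null;
        var squareFn = (Func<int, int>)(x => x * x);
        Console.WriteLine(text == null);   // True
        Console.WriteLine(squareFn(4));    // 16
    }
}
```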
With great power comes great responsibility. Although you can now define all your local variables as implicitly typed, it doesn't mean that you should. As we'll see later in this post, the var keyword is there to fit into the bigger picture of anonymous types, which in turn serve the bigger picture that is LINQ. So unless you are using an anonymous type, where you cannot name the type, or are using LINQ, there is not much need to declare things implicitly unless you have a good reason to do so. It all comes down to how you want to present your code to its maintainers. Do you want to emphasize the algorithm rather than the extra fluff of types and variables? Go ahead and use var a lot. Do you want them to be able to tell which types you're using because they're significant? Try not to use var at all. Balance is key.

Object, Collection and Array Initializers

C# 3.0 introduces another set of features as syntactic sugar to further streamline the creation of objects. This again fits into the bigger picture of LINQ. I will give an example of their use all in one snippet, as I think they are easy enough to be learned all at once.

   class Article
   {
      public Article(string title) { Title = title; }
      public Article() {}

      public string Title { get; set; }

      List<Line> lines = new List<Line>();
      public List<Line> Lines { get { return lines; } }
   }
   class Line
   {
      public string[] Words { get; set; }
      public Line() {}
   }

   ...

   // Object initializer: sets Title and fills the embedded Lines collection.
   Article sampleArticleObject = new Article
   {
      Title = "Banana Joe was announced as PM for Canada",
      // Collection initializer: each element is added to the existing list.
      Lines =
      {
         // Implicitly typed arrays: new[] infers string[] from the elements.
         new Line { Words = new[] {"First", "line", "of", "article"} },
         new Line { Words = new[] {"Second", "line", "of", "article"} },
         new Line { Words = new[] {"Last", "line", "of", "article"} }
      }
   };

There are a couple of things to mention here about each of the new features. We'll go through them one by one:

  • Object initializers: These allow you to initialize the object inline with its declaration. The important enhancement here is not only the number of lines you'll save but the fact that initialization and creation are now a single expression. This means that you can create an object and initialize it right where you pass it to a method, although this is not really recommended since, as always, features that allow better readability can cause quite the opposite effect when used incorrectly. The second thing to notice is the amount of extra fluff this feature removes. The actual data is much more emphasized compared to the situation where each member is assigned through a property access on a separate line. Now the entire object and its contained data are initialized in a single expression. Another thing to note is the added flexibility of mixing and matching constructors and initializers. This effectively allows passing some parameters to the constructor (if one exists) and setting the rest in the initialization block. Object initializers allow the creation of "embedded objects" as well. This means that if an object contains a reference to another member object, one can initialize the child while the container is being initialized, using the same syntax.
  • Collection initializers: As seen above, collections like List can now be initialized inline with the declaration as well. The story here is a little more interesting, however, since each collection usually has different Add() methods with different signatures. Here the C# team had to make a design decision between academic purity and flexibility. They went with flexibility. For a collection to be initialized inline, it has to have an Add method that can be called for each element specified in the initialization block. One way to enable this is to force the class being initialized to implement the ICollection interface. This was the case in the draft of the C# 3.0 language specification. On the other hand, this puts a lot of limitation on the implementing classes. For example, the Dictionary class would have had to provide an Add method with only one parameter. This led to a change of strategy. The team decided to allow any Add method with any signature. The only requirement now was for the class to have a method named "Add". To make sure that the class is indeed a collection, the one remaining requirement is that it implements IEnumerable for the inline initialization to work. Interestingly enough, as Jon explains, it seems like the interface itself is never actually used in the implementation of initialization blocks at all.
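The "any Add method plus IEnumerable" rule can be sketched with a small made-up collection. Guests and CollectionInitializerDemo are hypothetical names used only for this illustration. Dictionary<TKey, TValue> is the real framework class, whose Add method takes a key and a value.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

// Hypothetical collection: implements IEnumerable and offers two Add overloads.
public class Guests : IEnumerable
{
    private readonly List<string> entries = new List<string>();

    public void Add(string name) { entries.Add(name); }
    public void Add(string name, int age) { entries.Add(name + " (" + age + ")"); }

    public int Count { get { return entries.Count; } }

    public IEnumerator GetEnumerator() { return entries.GetEnumerator(); }
}

public static class CollectionInitializerDemo
{
    public static Guests BuildGuests()
    {
        // Each element becomes a call to a matching Add overload.
        return new Guests
        {
            "Joe",            // calls Add("Joe")
            { "Anna", 30 }    // calls Add("Anna", 30)
        };
    }

    public static Dictionary<string, int> BuildAges()
    {
        // Dictionary's Add takes two arguments, hence the nested braces.
        return new Dictionary<string, int>
        {
            { "Joe", 41 },
            { "Anna", 30 }
        };
    }

    public static void Main()
    {
        foreach (object guest in BuildGuests())
            Console.WriteLine(guest);             // Joe, then Anna (30)
        Console.WriteLine(BuildAges()["Anna"]);   // 30
    }
}
```

If you remove the IEnumerable implementation from Guests, the initializer stops compiling even though both Add methods are still there. That is the "is it really a collection?" check described above.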

Anonymous Types

It is rather pointless to talk about anonymous types without being able to show you how they make life so much easier when used in LINQ. But for now think of a scenario where you query a database for some data. You usually select certain columns from a table and return the results. What if the same scenario happened with objects?
As the name suggests, anonymous types allow you to create objects of a new type without declaring a named type for them. Strange? Not really, here is an example:

    var anonType = new { Name = "Joe", Address="Banana Street" };

As you can see, the syntax is somewhat similar to what we have seen so far with object and collection initializers. Of course this can be combined with them as well. Also, you don't have to initialize anonymous types using only constants. The values on the right-hand side of each assignment can be expressions or method calls too. Take note that we cannot name the type of the anonType variable here! So the only keyword that we CAN use is var. Now, in order to select only certain properties from an object, we can do something like this:
    Person p = new Person();
    var anonType = new[] 
    {
       new { Name = p.Name, Address= p.Address },
       new { Name = p.Name + " Big Joe", Address= p.Address + " back door" }
    };

What I've done here is define an array of anonymous types. This shows that the two anonymous objects declared above must have the same actual type! Otherwise, how could they be stored in the same array? Indeed, the C# compiler considers two anonymous types with the same number of properties, with the same names and types, in the same order, to be the same type. If you change any of the order, type or name elements, the type is different.

I really don't want to talk about anonymous types much longer, as we will see them all over the place from now on and in LINQ. Even if you feel like you don't have a handle on them yet, they will become second nature as we move further into LINQ. The next post is on Expression Trees and Lambda Expressions. Those are my favorite additions in C# 3.0, followed closely by extension methods, which are the subject of the post after next. Stay tuned.
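As a parting sketch, the type identity rule for anonymous types described above can be checked at runtime. AnonymousTypeDemo and its method names are made up for this illustration.

```csharp
using System;

public static class AnonymousTypeDemo
{
    // Same property names, types and order: the compiler reuses one type.
    public static bool SameShapeSharesType()
    {
        var first = new { Name = "Joe", Address = "Banana Street" };
        var second = new { Name = "Anna", Address = "Apple Road" };
        return first.GetType() == second.GetType();
    }

    // Same properties in a different order: the compiler creates a second type.
    public static bool ReorderedSharesType()
    {
        var first = new { Name = "Joe", Address = "Banana Street" };
        var swapped = new { Address = "Banana Street", Name = "Joe" };
        return first.GetType() == swapped.GetType();
    }

    public static void Main()
    {
        Console.WriteLine(SameShapeSharesType());   // True
        Console.WriteLine(ReorderedSharesType());   // False
    }
}
```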