This is where everything we talked about in the previous posts comes together. Ideas and features like Lambda Expressions, Delegates, Deferred Execution and Extension Methods all go hand in hand to give us what we will see below.
To get started, we need a set of classes to run the queries on. The following shows the classes we will be using. We will fill in the objects with some sample data using the object initializers we learned about in previous posts:
class Book
{
public string Name { get; set; }
public string Genre { get; set; }
public decimal Price { get; set; }
public override string ToString()
{
return Name;
}
}
class Library
{
public string Name { get; set; }
public List<Book> AllBooks { get; set; }
public List<Member> AllMembers { get; set; }
public override string ToString()
{
return Name;
}
}
class Member
{
public string Name { get; set; }
public string Address { get; set; }
public short Age { get; set; }
public override string ToString()
{
return Name;
}
}
class Reservation
{
public Library InLibrary { get; set; }
public DateTime StartDate { get; set; }
public DateTime EndDate { get; set; }
public Book ReservedBook { get; set; }
public Member MemberWhoTookIt { get; set; }
}
...
..
.
Library newLibrary = new Library
{
Name = "Biggest Library Ever!!",
AllBooks = new List<Book>
{
new Book { Name = "When I was at the gym", Genre = "Horror", Price=1000},
new Book { Name = "Hills can run", Genre = "Comedy", Price = 200},
new Book { Name = "Inspector Bandex", Genre = "Romantic", Price = 500}
},
AllMembers = new List<Member>
{
new Member { Name = "John", Address = "Saturn", Age = 23},
new Member { Name = "Joon", Address = "Sun", Age = 1500},
new Member { Name = "Jeen", Address = "Moon", Age = 2}
}
};
List<Reservation> allReservations = new List<Reservation>();
// StartDate is set to DateTime.MinValue so that the first query below has something to match.
allReservations.Add(new Reservation
{
InLibrary = newLibrary,
StartDate = DateTime.MinValue,
EndDate = DateTime.MinValue.Add(new TimeSpan(2, 0, 0, 0)),
MemberWhoTookIt =
newLibrary.AllMembers.First(member
=>
member.Name.Equals("John")),
ReservedBook =
newLibrary.AllBooks.First(book
=>
book.Name.Equals("Inspector Bandex"))
});
allReservations.Add(new Reservation
{
InLibrary = newLibrary,
StartDate = DateTime.MinValue,
EndDate = DateTime.MinValue.Add(new TimeSpan(1, 0, 0, 0)),
MemberWhoTookIt =
newLibrary.AllMembers.First(member
=>
member.Name.Equals("Jeen")),
// Look the book up by genre; no book is named "Romantic", so a lookup by Name would throw.
ReservedBook = newLibrary.AllBooks.First(book
=>
book.Genre.Equals("Romantic"))
});
public static void PrintContent(object obj)
{
Console.WriteLine("--------");
PropertyInfo[] properties =
obj.GetType().GetProperties(BindingFlags.Instance | BindingFlags.Public);
foreach(var property in properties)
Console.WriteLine(String.Format("{0} : {1}",
property.Name,
property.GetValue(obj)));
}
As you can see above, I have created a Library class that holds members and books. A separate Reservation class holds the reservations for the library, its members and its books. I have also added a PrintContent method which uses reflection to print all the public properties of the object it receives. Don't worry about how PrintContent works, as this post is not really about reflection. Let's go ahead and write the first query on this setup:
var query = from reservation in allReservations
where reservation.StartDate.Equals(DateTime.MinValue)
select reservation;
foreach (var item in query)
PrintContent(item);
Console.ReadKey();
The code above would have the following output:
--------
InLibrary : Biggest Library Ever!!
StartDate : 0001-01-01 12:00:00 AM
EndDate : 0001-01-03 12:00:00 AM
ReservedBook : Inspector Bandex
MemberWhoTookIt : John
--------
InLibrary : Biggest Library Ever!!
StartDate : 0001-01-01 12:00:00 AM
EndDate : 0001-01-02 12:00:00 AM
ReservedBook : Inspector Bandex
MemberWhoTookIt : Jeen
You can tell what the query is doing by looking at the output, or simply by reading it! Just read it as plain English: I'm trying to find all reservations in the allReservations collection where the StartDate of the reservation has a certain value. Now you may wonder what the "select" clause is for. We will get to that soon. But first let's see what exactly happens when you write those lines of code.
Firstly, you will notice that I have used the "var" keyword. If you remember, this keyword asks the compiler to infer the type itself. Let's just say for now that working out the actual type of a LINQ query can sometimes be complicated and, more importantly, it is usually not much of a concern. That is why you will usually see the var keyword used instead of the actual type the query returns.
Another point to remember is that the LINQ query we have written and assigned to the query local variable is just a description of operations to be performed later. In other words, assigning it to the variable does not process the collection yet. Instead, the compiler translates the query expression into a chain of extension method calls that take delegates built from our lambda expressions. Each of these calls returns an object that remembers its source and its delegate and only does the work when it is enumerated (remember deferred execution and the yield return statement?). Expression trees come into play with IQueryable-based providers rather than with LINQ to Objects. The process of transforming the query expression into method calls is completely mechanical, in the sense that the compiler doesn't try to do any kind of optimization at this point. For example, the above statement would be translated to the following:
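// The query expression:
var query = from reservation in allReservations
where reservation.StartDate.Equals(DateTime.MinValue)
select reservation;
// is translated into method calls along these lines:
var translatedQuery = allReservations.Where(reservation
=>
reservation.StartDate.Equals(DateTime.MinValue))
.Select(reservation => reservation);
// Note: because "select reservation" is a trivial projection that follows a where clause,
// the compiler actually leaves the Select call out; it is shown here to mirror the query.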
Now if you are familiar with SQL and you had a nagging feeling as to why this query is written in reverse (instead of SELECT * FROM ...), I can tell you why. As you can see, the order in which we have written the query is the same order in which it is translated into code. Now you may be wondering when exactly the processing starts then!? It actually starts the first time the query is enumerated. That could be by calling MoveNext() and Current on an enumerator (either implicitly in a foreach statement or explicitly on an enumerator we obtained ourselves), or it could be by calling a method that accepts an IEnumerable<T> and calls these members for us.
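To make the timing concrete, here is a minimal sketch of my own (the DeferredExecutionDemo class and its log list are made up for illustration, they are not part of the library sample). It records when the filter lambda actually runs relative to the point where the query is defined:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DeferredExecutionDemo
{
    // Returns a log of events so that their order can be inspected.
    public static List<string> Run()
    {
        var log = new List<string>();
        var numbers = new List<int> { 1, 2, 3 };

        // Defining the query does not run the lambda.
        var query = numbers.Where(n =>
        {
            log.Add("testing " + n);
            return n > 1;
        });
        log.Add("query defined");

        // The source is changed before enumeration; the query still sees the new item.
        numbers.Add(4);

        // foreach drives MoveNext()/Current, and only now does the filter run.
        foreach (var n in query)
            log.Add("got " + n);

        return log;
    }

    public static void Main()
    {
        foreach (var line in Run())
            Console.WriteLine(line);
    }
}
```

The first entry in the log is "query defined", and the element added after the query was defined still shows up in the results, which is exactly what deferred execution implies.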
By now you have probably guessed that the return value of most LINQ operators (where, select, etc.) is IEnumerable<T>. That is true! There is another return type called IQueryable which we will look at in the next posts. For now, what we know is that LINQ to Objects receives a collection that implements IEnumerable<T> as input, and this input is passed along a sequence of operators from one to the next, always in some form of IEnumerable<T>. But let's delve a little deeper and check the order in which these operations run.
Deferred execution in LINQ has a behavior that may seem strange at first. Look at the translation from the query expression to the method calls above. Although the methods are chained in that order, execution is driven from the end of the chain. When the consumer asks for the first element, the request goes to the last operator (here Select). Select asks the operator before it (here Where) for an element, and Where in turn asks its own source. That source is an object which implements IEnumerable<T>, so it just yields its first element. The element then flows back through Where and Select and, if it passes the filter, is yielded to the consumer of the LINQ query. This is called streaming: elements travel through the whole chain one at a time. Most LINQ operators are streaming operators. Some operators, however, need the entire sequence before they can produce their first result. OrderBy and Reverse are of this kind: they "buffer" the data instead of streaming it. This is also why you should be careful about where in the query you use them. For example, it is usually better to place them after a where clause, so that the buffering is done on fewer items.
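The difference between streaming and buffering is easy to see by logging inside the lambdas. The following is only a sketch under my own assumptions (the StreamingDemo class and the sample numbers are invented for this post); it records the order of calls with and without an OrderBy in the middle of the chain:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class StreamingDemo
{
    // Where and Select only: each element travels through the whole chain on its own.
    public static List<string> Streamed()
    {
        var log = new List<string>();
        var source = new[] { 3, 1, 2 };

        var query = source
            .Where(n => { log.Add("where " + n); return n > 1; })
            .Select(n => { log.Add("select " + n); return n * 10; });

        foreach (var n in query)
            log.Add("consumed " + n);
        return log;
    }

    // OrderBy in the middle: everything before it runs to completion first.
    public static List<string> Buffered()
    {
        var log = new List<string>();
        var source = new[] { 3, 1, 2 };

        var query = source
            .Where(n => { log.Add("where " + n); return n > 1; })
            .OrderBy(n => n)
            .Select(n => { log.Add("select " + n); return n * 10; });

        foreach (var n in query)
            log.Add("consumed " + n);
        return log;
    }

    public static void Main()
    {
        Console.WriteLine(string.Join(" | ", Streamed().ToArray()));
        Console.WriteLine(string.Join(" | ", Buffered().ToArray()));
    }
}
```

In the streamed version the "where", "select" and "consumed" entries interleave element by element. In the buffered version all three "where" entries come first, because OrderBy has to see every element before it can hand the smallest one to Select.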
A Side Note
There are many more operators in LINQ, of which I will name a few in passing. These all make sense if you have some experience with SQL. These include:
- OrderBy: This operator orders the data in ascending order. If you are ordering the elements by more than one criterion, you can follow OrderBy with the ThenBy operator. To sort in descending order, use the OrderByDescending and ThenByDescending operators.
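As a quick illustration of ordering by two criteria, here is a small sketch. The SortableBook class is a cut-down stand-in for the Book class above, and the "Second comedy" entry is a hypothetical extra book I added so that two books share a genre:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class SortableBook
{
    public string Name { get; set; }
    public string Genre { get; set; }
    public decimal Price { get; set; }
}

public static class OrderingDemo
{
    public static List<SortableBook> SampleBooks()
    {
        return new List<SortableBook>
        {
            new SortableBook { Name = "When I was at the gym", Genre = "Horror", Price = 1000 },
            new SortableBook { Name = "Hills can run", Genre = "Comedy", Price = 200 },
            new SortableBook { Name = "Inspector Bandex", Genre = "Romantic", Price = 500 },
            // Hypothetical extra book, not part of the original sample data.
            new SortableBook { Name = "Second comedy", Genre = "Comedy", Price = 300 }
        };
    }

    // Query syntax: genre ascending, then price descending within each genre.
    public static List<string> WithQuerySyntax()
    {
        var query = from book in SampleBooks()
                    orderby book.Genre, book.Price descending
                    select book.Name;
        return query.ToList();
    }

    // The same ordering written with the extension methods directly.
    public static List<string> WithMethodSyntax()
    {
        return SampleBooks()
            .OrderBy(book => book.Genre)
            .ThenByDescending(book => book.Price)
            .Select(book => book.Name)
            .ToList();
    }

    public static void Main()
    {
        foreach (var name in WithQuerySyntax())
            Console.WriteLine(name);
    }
}
```

Both versions return the two comedies first (the more expensive one leading), followed by the horror book and then the romantic one.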
Joins
LINQ is here to bring us a structured query language that we can use to query objects. What is a query language without joins? Joins are mainly used in relational models, where the only way of reaching referenced data in another table is to join the two tables on that data. In languages like C#, however, we use references to reach the piece of data. Even so, there are still scenarios in which we can use joins to find objects of interest in a well expressed form. In this section I will talk about three main join types, "Inner Joins", "Outer Joins" and "Cross Joins", all of which are available to us in LINQ.
Let's start with the Inner Join. In this type of join, a row is generated in the result only when both the left and the right sequences contain an element with the key being joined on. This means that an element that exists in the left group but has no match in the right group produces no row in the result, and the same is true for the right group. The syntax of an inner join in LINQ is as follows:
var joinQuery = from member in newLibrary.AllMembers
where member.Age > 20
join reservation in allReservations
on member.Name equals reservation.MemberWhoTookIt.Name
select member;
foreach (var item in joinQuery)
PrintContent(item);
Here I've joined the AllMembers list with the allReservations list on the member's name. Note that the left, or outer, sequence (AllMembers) is filtered before the join. Had we wanted to filter the sequence on the right as well, the query would have been more complicated:
var joinQuery = from member in newLibrary.AllMembers
where member.Age > 20
join reservation in (
from reservation in allReservations
where reservation.StartDate == DateTime.MinValue
select reservation
)
on member.Name equals reservation.MemberWhoTookIt.Name
select member;
Pay attention to the subquery that is used here. It is also worth mentioning that since the left sequence is streamed and the right sequence is buffered for key lookups, it is a better idea to put the sequence with more elements on the left.
Outer Joins are needed when we want every element of the left group to be represented in the result, regardless of whether an equivalent element exists in the right group. In LINQ this is written with join ... into, which is formally called a group join: each element of the left group is paired with the (possibly empty) group of matching elements from the right. This join's syntax is as follows:
var joinQuery = from member in newLibrary.AllMembers
join reservation in allReservations
on member.Name equals reservation.MemberWhoTookIt.Name
into joinedElementsFromRight
select new { Member = member,
MembersReservations = joinedElementsFromRight };
foreach (var item in joinQuery)
{
PrintContent(item.Member);
foreach(var reservation in item.MembersReservations)
PrintContent(reservation);
}
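The query above gives one result per member with a nested group of reservations. If you want flat rows in the style of a SQL LEFT OUTER JOIN, a common pattern is to follow the group join with a second from clause over DefaultIfEmpty(). Here is a minimal sketch of that pattern; the JoinMember and JoinReservation classes are simplified stand-ins that I made up for this example rather than the classes defined earlier:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class JoinMember
{
    public string Name { get; set; }
}

public class JoinReservation
{
    public string MemberName { get; set; }
    public string BookName { get; set; }
}

public static class LeftOuterJoinDemo
{
    public static List<string> Run()
    {
        var members = new List<JoinMember>
        {
            new JoinMember { Name = "John" },
            new JoinMember { Name = "Joon" },
            new JoinMember { Name = "Jeen" }
        };
        var reservations = new List<JoinReservation>
        {
            new JoinReservation { MemberName = "John", BookName = "Inspector Bandex" },
            new JoinReservation { MemberName = "Jeen", BookName = "Inspector Bandex" }
        };

        var query = from member in members
                    join reservation in reservations
                        on member.Name equals reservation.MemberName
                        into memberReservations
                    // DefaultIfEmpty() yields a single null when the group is empty,
                    // so members without reservations still produce one row.
                    from matched in memberReservations.DefaultIfEmpty()
                    select member.Name + " -> " +
                           (matched == null ? "(none)" : matched.BookName);

        return query.ToList();
    }

    public static void Main()
    {
        foreach (var row in Run())
            Console.WriteLine(row);
    }
}
```

Joon has no reservation, yet still appears in the result with "(none)" in place of a book.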
The last type of join is the Cross Join. This is the same as the Cartesian product of all the elements of the two groups; there is no matching done. But since the right group is streamed as well in this type of join, it can be used quite elegantly to produce a product in which the elements of the right group depend on the current element of the left group. See the example below:
var query = from num1 in Enumerable.Range(1, 10)
from num2 in Enumerable.Range(1, num1)
select new { Left = num1, Right = num2 };
Run this query and check out the result. As you'd expect, just like a nested loop, the join yields a different number of elements from the right sequence for each item of the left sequence. It is worth repeating that in this join the right sequence is not buffered; it is streamed as well. This means that this join can be quite useful for unknown or endless streams of data, as only one item is fetched and processed at any given time.
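Behind the scenes, two from clauses in a row become a call to SelectMany. The sketch below (CrossJoinDemo is just a name I picked, and the range is shortened to 3 to keep the output small) writes the same query both ways so you can compare them:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class CrossJoinDemo
{
    // Two from clauses, as in the query above.
    public static List<string> WithQuerySyntax()
    {
        var query = from num1 in Enumerable.Range(1, 3)
                    from num2 in Enumerable.Range(1, num1)
                    select num1 + "," + num2;
        return query.ToList();
    }

    // The equivalent SelectMany call: the first lambda produces the right-hand
    // sequence for each left element, the second combines each pair.
    public static List<string> WithSelectMany()
    {
        return Enumerable.Range(1, 3)
            .SelectMany(num1 => Enumerable.Range(1, num1),
                        (num1, num2) => num1 + "," + num2)
            .ToList();
    }

    public static void Main()
    {
        foreach (var pair in WithSelectMany())
            Console.WriteLine(pair);
    }
}
```

Both methods produce the same six pairs: one pair for 1, two for 2 and three for 3, which is the nested-loop shape described above.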
Group By
This is the last operator that Jon has covered in the book. It is a very important operator indeed and, in my experience, used much more than Join. This operator groups the elements of a sequence by a key. The result is a sequence of IGrouping<TKey, TElement> objects. IGrouping<TKey, TElement> extends IEnumerable<TElement> and adds a Key property, so each group is itself a sequence of elements that share one key. Here is an example:
var groupQuery = from reservation in allReservations
group reservation by reservation.MemberWhoTookIt;
foreach (var item in groupQuery)
{
PrintContent(item.Key);
foreach(var reservation in item)
PrintContent(reservation);
}
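The group clause can also project each element before it is placed in its group. Here is a small sketch that groups book names by genre; the anonymous-type books are stand-ins for the Book class, and "Second comedy" is again a hypothetical extra entry of mine so that one group has more than one element:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class GroupingDemo
{
    public static List<string> Run()
    {
        var books = new[]
        {
            new { Name = "When I was at the gym", Genre = "Horror" },
            new { Name = "Hills can run", Genre = "Comedy" },
            new { Name = "Inspector Bandex", Genre = "Romantic" },
            // Hypothetical extra book, not part of the original sample data.
            new { Name = "Second comedy", Genre = "Comedy" }
        };

        // Each group is an IGrouping<string, string>: Key is the genre,
        // and the elements are the projected book names.
        var byGenre = from book in books
                      group book.Name by book.Genre;

        var lines = new List<string>();
        foreach (var genreGroup in byGenre)
            lines.Add(genreGroup.Key + ": " + string.Join(", ", genreGroup.ToArray()));
        return lines;
    }

    public static void Main()
    {
        foreach (var line in Run())
            Console.WriteLine(line);
    }
}
```

The groups come out in the order in which each key was first seen in the source, so the horror group is listed before the comedy group here.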
There you are! We have now covered LINQ's most used operators, and with them all the new features added in C# 3. Now it's time to delve deeper. In the next post I will talk about what is actually going on behind the scenes with the C# compiler. Until then, ciao.