One of the first things I looked for in the DataSet after I moved to .NET was the equivalent of the Filter property of the ADO Recordset. I relied on such functionality a lot, and in many cases I still do. It took just a short time for me to find out that you have to use the RowFilter property of a DataView over one of the DataSet's tables. The syntax in both cases is similar: you construct an SQL-like string (similar to what would go into the WHERE clause of a SELECT query). With this, you can get all the rows from a table and then, as the user enters some filter criteria, narrow the results down to what matches the user's filter.
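For readers who have not used RowFilter, here is a minimal C# sketch of the idea. The table, its column names and the sample rows are made up for illustration; only DataTable, DataView and RowFilter come from the framework.

```csharp
using System;
using System.Data;

public static class RowFilterSketch
{
    // Builds a small in-memory table; columns and rows are illustrative only.
    public static DataTable BuildProducts()
    {
        var table = new DataTable("Products");
        table.Columns.Add("ProductName", typeof(string));
        table.Columns.Add("UnitPrice", typeof(decimal));
        table.Rows.Add("Chai", 18m);
        table.Rows.Add("Chang", 19m);
        table.Rows.Add("Tofu", 23.25m);
        return table;
    }

    // Returns a view that exposes only the rows whose name starts with the prefix.
    public static DataView FilterByNamePrefix(DataTable table, string prefix)
    {
        var view = new DataView(table);
        // Double any single quotes so the user's text cannot end the string literal.
        // Wildcard characters in the prefix are not escaped in this sketch.
        view.RowFilter = "ProductName LIKE '" + prefix.Replace("'", "''") + "%'";
        return view;
    }

    public static void Main()
    {
        DataView view = FilterByNamePrefix(BuildProducts(), "Ch");
        foreach (DataRowView row in view)
        {
            Console.WriteLine(row["ProductName"]);
        }
    }
}
```

The underlying table keeps all of its rows; clearing RowFilter (setting it to an empty string) shows everything again.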
Before I go into this, I must say it isn’t advisable to get all the rows from every table; you should get only what you need. But there are some cases where it isn’t too expensive to get all the rows, typically for lookup fields (think Account Types, States, Cities etc.), and where it could be inefficient to repeatedly go to the database for just what you want. After my first .NET project I started to look for better ways of achieving this, or at least ways to do so without relying on a DataTable, DataSet or DataView.
One such way is using delegates against methods you can find on some of the IEnumerable(Of T)/IEnumerable<T> classes, such as List(Of T)/List<T>. The delegate in question is Predicate(Of T)/Predicate<T>. Anyone who has done some PROLOG before could be thinking, "Hmm, predicates, hey?" These are not quite the same thing. The predicate delegate is simple enough: it takes an argument of type T and returns a Boolean value. The list will call your method once for each element it needs to test when searching for something.
There are two methods I will cover here: Find() and FindAll(). Find returns only one item, the first item in your List that matches your criteria. FindAll is what you would use when filtering; it returns everything that is a match. Before LINQ came about, and if you weren’t using anonymous methods in C#, you had to write a full method body for this comparison. That is the technique I will show first. Don’t fret if you don’t see your favorite language; my samples are always in both C# and VB, though my articles may have only VB because that is my favorite language.
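To make the difference between the two methods concrete, here is a small self-contained C# sketch. The Product class below is a hypothetical stand-in, not the one from the download, and the hard-coded "Ch" prefix is only there to keep the example short.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for a product class; not the one from the download.
public class Product
{
    public string ProductName;
    public Product(string productName) { ProductName = productName; }
}

public static class PredicateSketch
{
    // Matches the Predicate<Product> signature: takes a Product, returns bool.
    public static bool StartsWithCh(Product prod)
    {
        return prod.ProductName.StartsWith("Ch", StringComparison.InvariantCultureIgnoreCase);
    }

    public static List<Product> SampleProducts()
    {
        return new List<Product>
        {
            new Product("Tofu"),
            new Product("Chai"),
            new Product("Chang"),
            new Product("Ikura")
        };
    }

    public static void Main()
    {
        List<Product> products = SampleProducts();

        // Find stops at the first element for which the predicate returns true.
        Product first = products.Find(StartsWithCh);

        // FindAll tests every element and returns a new list with all the matches.
        List<Product> all = products.FindAll(StartsWithCh);

        Console.WriteLine(first.ProductName); // Chai
        Console.WriteLine(all.Count);         // 2
    }
}
```

Since C# 2.0 the compiler converts a method name to the matching delegate type, so passing StartsWithCh directly is equivalent to writing new Predicate<Product>(StartsWithCh).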
Full Method Body
Private Function CompareName(ByVal prod As Filter.Common.Product) As Boolean
    'an empty filter matches every product
    If MethodNameFilter.Text = "" Then Return True
    'do the filter...
    Return prod.ProductName.StartsWith(MethodNameFilter.Text, _
        StringComparison.InvariantCultureIgnoreCase)
End Function
The code above is an example of what a VB method that conforms to Predicate(Of Filter.Common.Product) looks like. In this case, I am looking for products with a name starting with the text inside a textbox named MethodNameFilter. The method is called like so:
filtered = filtered.FindAll(AddressOf CompareName)
In C#, the equivalent code to call the method is slightly different:
filtered = filtered.FindAll(new Predicate<Filter.Common.Product>(CompareName));
VB also allows you to write code like the C# version, where you create an instance of Predicate(Of T) for some T and then specify the method. But it has the more convenient syntax of just saying AddressOf; the compiler then works out which delegate to use and can verify the method signature as you type.
By specifying the method in the call to FindAll(), you are telling the List (in this case I used List<T>) to use the specified method to qualify any objects for the new filtered list. I used the variable name filtered to store my copy of the filtered down list, while preserving a copy of the original list in case the user cleared the filter text. Here is a screenshot to show you what it looks like at runtime.
I allow the user to specify any combination of two filter criteria: Name and Supplier for the Method body tab, and Name and Category for the Lambda expression tab. In both cases I then filter the list down to show only what is a match. As you can see from the screenshot, there is an auto-complete list for the Supplier textbox. I included code to create that in the attached download to show how you could assist a user. The Category textbox also has one. I've had to do that with info that was not normalised in the database: I get a list of all existing values for a column and then suggest to the user, as they are typing, what values to pick from.
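One way to combine two criteria while keeping the original list intact is sketched below in C#. This is an illustration only: the Product and ProductFilter classes, their member names and the sample rows are assumptions, and the two string fields stand in for the text of the two textboxes.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for the article's product class.
public class Product
{
    public string ProductName;
    public string SupplierName;
    public Product(string productName, string supplierName)
    {
        ProductName = productName;
        SupplierName = supplierName;
    }
}

// Hypothetical helper holding the two filter values.
public class ProductFilter
{
    public string NameText = "";
    public string SupplierText = "";

    private bool MatchesName(Product prod)
    {
        if (NameText == "") return true; // an empty filter matches everything
        return prod.ProductName.StartsWith(NameText, StringComparison.InvariantCultureIgnoreCase);
    }

    private bool MatchesSupplier(Product prod)
    {
        if (SupplierText == "") return true;
        return prod.SupplierName.StartsWith(SupplierText, StringComparison.InvariantCultureIgnoreCase);
    }

    // FindAll always returns a new list, so 'original' is never modified
    // and can be filtered again when the user changes or clears the text.
    public List<Product> Apply(List<Product> original)
    {
        return original.FindAll(MatchesName).FindAll(MatchesSupplier);
    }
}

public static class FilterDemo
{
    public static List<Product> SampleProducts()
    {
        return new List<Product>
        {
            new Product("Chai", "Exotic Liquids"),
            new Product("Chang", "Exotic Liquids"),
            new Product("Chef Anton's Gumbo Mix", "New Orleans Cajun Delights"),
            new Product("Ikura", "Tokyo Traders")
        };
    }

    public static void Main()
    {
        List<Product> original = SampleProducts();
        ProductFilter filter = new ProductFilter();
        filter.NameText = "ch";
        filter.SupplierText = "exotic";

        List<Product> filtered = filter.Apply(original);
        Console.WriteLine(filtered.Count); // 2
        Console.WriteLine(original.Count); // 4
    }
}
```

Starting each pass from the original list means that shortening or clearing the filter text brings back rows that an earlier, narrower filter had excluded.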
Using a Lambda Expression
Doing the same thing in VB using a lambda expression is like this:
filtered = filtered.FindAll(Function(prod) _
And in C#,
filtered = filtered.FindAll(prod =>
In both cases, I rely on the type inference capabilities of both compilers to avoid typing a lot of code; you can see an example of how to do so without Local Type Inference in the comments in the attached download.
These lambdas do the same thing as the method body, and they both match the signature of the Predicate(Of T)/Predicate<T> delegate.
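As a rough, self-contained C# sketch of what such a lambda can look like, with and without type inference: the Product class and the text parameter are assumptions standing in for the article's class and textbox, so this is not the code from the download.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for the article's product class.
public class Product
{
    public string ProductName;
    public Product(string productName) { ProductName = productName; }
}

public static class LambdaSketch
{
    // The compiler infers that 'prod' is a Product from List<Product>.FindAll.
    public static List<Product> FilterInferred(List<Product> products, string text)
    {
        return products.FindAll(prod =>
            text == "" ||
            prod.ProductName.StartsWith(text, StringComparison.InvariantCultureIgnoreCase));
    }

    // The same predicate with every type spelled out.
    public static List<Product> FilterExplicit(List<Product> products, string text)
    {
        Predicate<Product> match = (Product prod) =>
            text == "" ||
            prod.ProductName.StartsWith(text, StringComparison.InvariantCultureIgnoreCase);
        return products.FindAll(match);
    }

    public static void Main()
    {
        List<Product> products = new List<Product>
        {
            new Product("Chai"),
            new Product("Chang"),
            new Product("Tofu")
        };

        Console.WriteLine(FilterInferred(products, "ch").Count); // 2
        Console.WriteLine(FilterExplicit(products, "").Count);   // 3
    }
}
```

Unlike a named method that reads a textbox directly, the lambda captures the text variable from the enclosing method, which keeps the filter value next to the call that uses it.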
Searching for a specific item
The above examples covered how to use this technique to get all matching items. There are times, however, when you may want to get only one item. The List class comes with a corresponding Find() method that accepts a Predicate(Of T) argument and returns a single result of type T. The return value is Nothing (null in C#) if there is no match; for a value type it is the default value of T. If more than one item satisfies the criteria, the method returns only the first one. The method bodies and lambdas look exactly the same as before, because it is the same type of comparison using the very same delegate, so I will not include the code here.
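The no-match case is worth a quick look because it behaves differently for reference types and value types. The following C# sketch uses made-up sample data and hypothetical helper names.

```csharp
using System;
using System.Collections.Generic;

public static class NoMatchSketch
{
    // Returns the first name with the given prefix, or null when nothing matches.
    public static string FindName(List<string> names, string prefix)
    {
        return names.Find(n => n.StartsWith(prefix, StringComparison.OrdinalIgnoreCase));
    }

    // Returns the first number above the limit, or default(int), which is 0.
    public static int FindAbove(List<int> numbers, int limit)
    {
        return numbers.Find(n => n > limit);
    }

    public static void Main()
    {
        List<string> names = new List<string> { "Chai", "Chang", "Tofu" };
        string missing = FindName(names, "Z");
        Console.WriteLine(missing == null ? "no match" : missing); // no match

        List<int> numbers = new List<int> { 1, 2, 3 };
        Console.WriteLine(FindAbove(numbers, 10));       // 0
        // A returned 0 is ambiguous for value types, so check with Exists first.
        Console.WriteLine(numbers.Exists(n => n > 10));  // False
    }
}
```

For a list of value types, Exists (or FindIndex) tells you whether anything matched, which Find alone cannot do when the default value is also a legitimate element.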
The data in this case, as you can see from the screenshot, is from the Northwind sample database. Just in case someone out there has not installed the database, I included an XML file with the data that a shared library (shared between the VB and C# windows applications) reads to construct the list. I read the data using LINQ to XML in VB with the very sexy XML literals. I also included the project that created this XML file, and you can run it by right-clicking on it, selecting Debug, and then Start New Instance. It runs, creates the file and loads it in IE, or your default XML viewer. You may have to change your connection string to read from your copy of Northwind.
Other List(Of T) methods that use the Predicate(Of T) delegate are FindIndex, FindLast, FindLastIndex, RemoveAll and TrueForAll. FindIndex and FindLastIndex return the zero-based index of the first or last matching element respectively, or -1 if nothing matches. FindLast is the counterpart of Find; it returns the last matching element in the list. RemoveAll removes every element matched by the predicate you have specified. Finally, TrueForAll returns true if all items in the list match the given criteria. It calls the predicate for each element in order and stops processing the first time false is returned.
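A short C# sketch of these methods on a list of integers follows. The numbers and the IsEven helper are arbitrary choices for illustration.

```csharp
using System;
using System.Collections.Generic;

public static class OtherPredicateMethods
{
    // A predicate over int: true for even numbers.
    public static bool IsEven(int n)
    {
        return n % 2 == 0;
    }

    public static List<int> SampleNumbers()
    {
        return new List<int> { 4, 9, 16, 25, 30 };
    }

    public static void Main()
    {
        List<int> numbers = SampleNumbers();

        Console.WriteLine(numbers.FindIndex(IsEven));        // 0  (the 4)
        Console.WriteLine(numbers.FindLastIndex(IsEven));    // 4  (the 30)
        Console.WriteLine(numbers.FindLast(IsEven));         // 30
        Console.WriteLine(numbers.TrueForAll(n => n > 0));   // True
        Console.WriteLine(numbers.TrueForAll(IsEven));       // False, stops at 9

        // RemoveAll changes the list in place and returns how many items it removed.
        int removed = numbers.RemoveAll(IsEven);
        Console.WriteLine(removed);        // 3
        Console.WriteLine(numbers.Count);  // 2 (9 and 25 remain)
    }
}
```

Note that RemoveAll is the only method here that modifies the list, so run it against a copy if you still need the original.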
Have a look at the attached download project if you wish to see more about how to use this technique in your own code. Hope this helps, and happy coding.
I have spent the last couple of weeks going over an application I wrote 5 years ago (the one I referred to in my garbage collection blog). There is nothing wrong with this application, but I wanted to review some of the decisions I made then in light of what I know now, so that I can see how far I've come and how I can make it perform better. I should have learnt enough about performance in the 5 years since to improve on it.
The reason I picked this particular application is that it was my first production application in .NET (started in April 2004, though I'd been using .NET since August/September 2003 – can't quite remember). It turns out that this application was a key stepping stone in my .NET education, as I learned quite a lot, and not just about .NET. One of the notes I included in the code is about patterns. Before starting on this project, I, like many people I still meet, knew very little about design and code patterns. My journey into studying patterns started when, after the first release, I was reviewing some of the requirements against how the users used the system.
One particular requirement kept gnawing at me. There was some data that had to be cached for regular access by the users so that they didn’t have to wait for it. Customers could walk in and ask random questions, and the users had to be on hand to answer these quickly in order to serve the next one in line. Different parts of the system itself had to refer to some of this info regularly based on customer activity on the hardware mentioned in the GC blog, and feedback had to be provided to the customer immediately by way of an LCD device.
Following the instructions in the design spec, I loaded this info upfront so that by the time the user was presented with the UI, the data would be available. After the first few days of watching users in different roles interact with this software, I realized that not all roles needed all the information as the spec had said. Also, some of the information could be first required late in the day. This meant that I was loading and holding on to information that may never be used in the context of a particular user. I began thinking about how to deal with this.
Some of this data I stored in simple collections, and the rest in datasets. I used datasets a lot in this application because a lot of the users would need the data displayed in a grid with a mechanism to filter it, and the only way I could filter data in memory at the time, without running a refined database search, was using a DataView against a dataset/datatable. I re-wrote the custom collections to inherit from a base class that I wrote that served as a cache with a sliding timeout period. The client code would no longer search the database and populate the collections, but would just go to the collections for the info they required.
My client provided telecommunication services to their customers, and there was a need to provide international dial codes to allow the customers to make pre-paid calls to different parts of the world. Some roles, however, did not need this information at all, or would only need it when making a report, which could be once a week or once a month. The initial strategy would get the data as soon as the application started in Sub Main (void Main() for C# developers). After a little refactoring, the startup code created empty instances of the collections. The data was then loaded the first time any client code referred to it. For data that would not be accessed again for the rest of the day, a sliding timeout was activated at this point and moved forward 10 minutes at each access.
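The load-on-first-access behaviour with a sliding timeout can be sketched in C# roughly as follows. This is not the base class from the application: the SlidingCache name, its members and the injected clock are all assumptions, and it is not thread-safe. Expired data is only released when ReleaseIfExpired is called, which a timer could do periodically.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical lazy-loading cache with a sliding timeout (illustrative only).
public class SlidingCache<T>
{
    private readonly Func<List<T>> loader;   // e.g. a database query
    private readonly TimeSpan timeout;
    private readonly Func<DateTime> clock;   // injected so the sketch is testable
    private List<T> items;                   // null until first access
    private DateTime expiresAt;

    public int LoadCount { get; private set; }
    public bool IsLoaded { get { return items != null; } }

    public SlidingCache(Func<List<T>> loader, TimeSpan timeout, Func<DateTime> clock)
    {
        this.loader = loader;
        this.timeout = timeout;
        this.clock = clock;
    }

    public List<T> Items
    {
        get
        {
            DateTime now = clock();
            if (items == null || now >= expiresAt)
            {
                items = loader();   // nothing is loaded until someone asks
                LoadCount++;
            }
            expiresAt = now + timeout; // every access slides the expiry forward
            return items;
        }
    }

    // Drops the data if it has not been touched within the timeout.
    public void ReleaseIfExpired()
    {
        if (items != null && clock() >= expiresAt) items = null;
    }
}

public static class SlidingCacheDemo
{
    public static void Main()
    {
        DateTime now = new DateTime(2009, 1, 1, 9, 0, 0);
        var dialCodes = new SlidingCache<string>(
            () => new List<string> { "44", "1", "27" },   // made-up sample data
            TimeSpan.FromMinutes(10),
            () => now);

        Console.WriteLine(dialCodes.IsLoaded);     // False
        Console.WriteLine(dialCodes.Items.Count);  // 3
        now = now.AddMinutes(11);
        dialCodes.ReleaseIfExpired();
        Console.WriteLine(dialCodes.IsLoaded);     // False
    }
}
```

Passing the clock in as a delegate is purely a convenience for demonstrating the timeout without waiting 10 minutes; production code could read DateTime.UtcNow directly.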
I called this the "Delayed Reading Strategy", and I described it in a discussion with someone from Ukraine I met on a programmer's forum. He then gave me a link to what is now a vital resource for me, the Microsoft Patterns and Practices team website. I discovered this was known as the lazy-load pattern, and through some reading material I found on MSDN and other places, I learned a lot of patterns that I used to improve the performance of this application. The result was that after about two weeks of use, users suddenly experienced faster start-up times because nothing was loaded until it was first needed. Other applications also worked better while this application was running, because it now required less memory. Anything it didn’t access or refer to in 10 minutes would be disposed of to free the RAM.
There were also two tables whose runtime collections were non-deterministic. There was never a time I would need to load all the rows, nor would there be a need to say "get me all the rows where such a field is equal to blah". The collections would be created based on user and customer activity against specific rows in the table. The corresponding collection would then build itself by adding a row/object to the internal list the first time it was referred to specifically.
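A collection that builds itself one row at a time might look something like the C# sketch below. The OnDemandCollection name, the fetchRow delegate and the sample values are hypothetical; in a real application fetchRow would query the database for a single row by its key.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical collection that loads each row only when it is first asked for.
public class OnDemandCollection<TKey, TItem>
{
    private readonly Dictionary<TKey, TItem> loaded = new Dictionary<TKey, TItem>();
    private readonly Func<TKey, TItem> fetchRow; // e.g. SELECT ... WHERE id = @key

    public OnDemandCollection(Func<TKey, TItem> fetchRow)
    {
        this.fetchRow = fetchRow;
    }

    // Number of rows pulled in so far.
    public int Count { get { return loaded.Count; } }

    public TItem Get(TKey key)
    {
        TItem item;
        if (!loaded.TryGetValue(key, out item))
        {
            item = fetchRow(key);   // first reference: go to the source once
            loaded.Add(key, item);  // later references are served from memory
        }
        return item;
    }
}

public static class OnDemandDemo
{
    public static void Main()
    {
        int fetches = 0;
        var customers = new OnDemandCollection<int, string>(id =>
        {
            fetches++;
            return "Customer " + id; // stands in for a database lookup
        });

        Console.WriteLine(customers.Get(42)); // Customer 42
        Console.WriteLine(customers.Get(42)); // Customer 42
        Console.WriteLine(fetches);           // 1
        Console.WriteLine(customers.Count);   // 1
    }
}
```

The collection only ever holds the rows that activity has actually touched, so its size tracks usage rather than the size of the table.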
I now do a lot of mentoring of students and interns in some companies, and I find that a lot of them are unaware of code patterns. One thing I tell them when introducing patterns is that they are a very easy and very important way of reducing the bug count of an application, because you can approach problems in a predictable way. I've seen some weird code over the years to implement things like the singleton pattern, software factories and extensibility, all because the programmers were not aware of design and code patterns.
A year later, when I decided to specialize in integrating systems, I started to study integration patterns, and there is a good e-book available from the patterns and practices team called Integration Patterns, ISBN 0-7356-1850-X. I found this book very valuable and instrumental in flattening my learning curve. If anyone out there hasn't ventured into patterns, or hasn't heard of them, head over to the Patterns and Practices team site, or download some of their screencasts, and you will be amazed at what you learn. This is a topic on which there is a lot of material, so I will not post examples right now, but I am currently finishing off a few articles to post over the next couple of weeks that touch on some basic and some not so basic patterns to get people started.
In conclusion, I will point out what I can do to improve my application. Having been performance conscious even back then, the main thing I can do is to change some of the DataSet code to use other structures, like custom collections. There are other ways I now know to filter through a collection in .NET without the overhead of hanging on to a DataSet. My next blog will cover some of these ways, which include using predicates with lambda expressions or full-blown methods satisfying the delegate signature against an IEnumerable(Of T)/IEnumerable<T> collection. I will also write about the biggest problem I faced once the service this application was providing became popular and it had a heavy load I couldn't simulate on my development machine. Happy coding.