Paging is not so easy as most developers think it is
A lot of websites use somekind of paging mechanish to limit the results shown. Either to save bandwidth where results are served in pages or because search criteria should be made more specific by not having paging but just limit the maximum records returned.
The datagrid supports paging but this paging mechanish first fetches all records from the database and then storing the datatable in the viewstate and only showing a subset of that data to the user.
Depending on the type of data either a table or index scan will be performed by the database. If you have a filter/where clause then hopefully you will have a matching index that is also ordered as expected.
- A lot of network traphic between webserver and database
- Large webpages because of the viewstate
A workaround to limit the viewstate is by reloading the dataset at each page hit and disabling viewstate on the datagrid.
Instead of letting the database return all records we ask it to return only a subset of records. Most applications work with a current page and a page size. So when your page size is 10 and the current page is 2 then rows 20-29 should returned. Some databases have native support for paging like MySql with its LIMIT keyword but paging really is hell with older Microsoft SQL Server versions and with SQL 2005 you can easily hack paging in a query by using the ROW_NUMBER() function and AFAIK SQL 2008 does not add functionality to make paging just as easy as MySql's LIMIT statement.
So what you actually want is to not be bothered with such code and the easiest way to do that is by either making use of linq to sql, the entity framework, nhibernate or any other database abstraction layer.
NHibernate for example has the SetMaxResults and the SetFirstResult methods. These are so easy to use to fetch a certain 'page'.
Linq to sql
Linq to sql has the similar methods Skip and Take and work in the same way as NHibernates methods.
27 IQueryable<T> source;
- Quite fast if good indexed are available in the database.
- Only the needed data is tranfered from the database to the webserver
- Paging properties must be passed from front-end to back-end
- Cannot use most ASP.NET controls that support paging
But even this can have performance issues. Imagine a table that a contains billions of records. A query that performs an index scan will be slow too! So you have to be sure that your query will use one or more indexes but that it will not be performing an index scan. For example when you have a LIKE '%x%' statement. Depending on the database you use this can even result in a table scan.
When for example you are viewing a log that results in paging a log table then it is highly imaginable that log rows are being added while you are browsing the log. So when you are watching page 0 and waiting for a couple of seconds and then viewing page two could result in even newer data being shown than on page 0 because data has been added. This is due to the results not being cached and the results ordered descending on creation data (or identity).
What you would like is that when you view page 0 that page 1 will contain older data (in this log example). Here paging like described previously will not work.
The first (0) page query would be something like:
top 20 id, title, status
Then the query for the second page (1) would be needing the id value of the LAST row of the previous query result
top 20 id, title, status
The solution here is not making use of a "page" but only a "next" set. This is exactly what you need in most situations and if you are not sure about this then I can refer to an excellent application that (I think) is making use of this mechanism, Google Reader. You scroll down and then data is fetched at the same time and all previous downloaded items are not removed.
- Easy to implement
- Page result is what you would expect
- Sorting will always result in viewing the result at the first 'page'
- Requires unique indexes / combined columns
Supporting non unique indexes / order by clause
So what if your table does not have a unique index for the where ORDER clause that you specified? Imagine a result where the filter column is a "name", the page size is 10 but you have 16 people sharing the same name. This would result in a page only showing the first 10 people and the next page showing people with another name. The viewer will miss 6 people! It is this reason that we want to see all ties in the result.
It could be that you do not want to extend the ORDER BY because that could result in a table scan because the ORDER BY needs data not available in the index.
This can be solved by making use of the sql keyword WITH TIES. This makes sure that the result will always contain all rows matching the where in combination with the order clause.
top 10 WITH TIES name, dob, county, city, accountnr
This query will result in a page containing at least 16 people in the situation I described above. The page will be larger then 10 but this will not be a problem in an application that fetches data like google reader.
- Very fast and most efficient usage of indexes.
- There it not a standard SQL syntax. Most vendors support similar behaviour (see http://troels.arvin.dk/db/rdbms/).
- Not supported natively by frameworks like NHibernate, Linq to sql, EF