Do more with a DataTable and improve performance

Jul 31, 2006

There are some features in the System.Data.DataTable class that a lot of developers don’t utilize. I base that statement on different code samples I’ve seen on blogs and article bases during the last couple of years. Some of these features can improve the performance.

Calculated columns

First of all, I’ll create a DataTable manually, even though it is more likely to be created from querying a database.

DataTable dt = new DataTable();
dt.Columns.Add("Name", typeof(string));
dt.Columns.Add("Price", typeof(double));
dt.Columns.Add("ItemsInStock", typeof(double));

Imaging that there is 100 rows in that DataTable and you now want to calculate total price of all item currently in stock. The calculation is Price*ItemsInStock. What I see in a lot of code samples is that this column is calculated in the database by a SQL statement like this:

SELECT name, price, itemsinstock, (price*itemsinstock) AS stockprice FROM products”

The overhead in letting the database do the calculation is not that much in this particular example, because it is a simple multiplication of two rows. It could easily be more complicated than this example. The thing is, that .NET performs these kinds of calculation much more efficient than a database and that’s why we would like .NET to do them.

The DataTable class supports on-the-fly calculated columns and they are perfect to use in the example. Just add another column to the DataTable and give it a calculation formula.

dt.Columns.Add("StockPrice", typeof(double), "Price*ItemsInStock");

The calculation expression ("Price*ItemsInStock") can also use predefined functions like an if-statement.

"IIF(ItemsInStock = 0, 100, PricePrice*ItemsInStock)"

There a many different functions to use in the calculation expression.

Auto increment

Let’s say you want to bind the DataTable to a DataGrid in an ASP.NET page and that you want a column to display the row number. This can be done by adding a column to  the DataTable that has enabled the AutoIncrement property.

DataColumn col = new DataColumn("#", typeof(int));
col.AutoIncrement = true;
col.AutoIncrementSeed = 1;
dt.Columns.Add(col);

Now you have a column named “#” that contains the row number.

Querying the DataTable

You can query a DataTable in different ways in order to find the row you need. If you want all the rows in the DataTable that matches a search expression then you would use the Select method.

DataRow[] rows = dt.Select("Price > 159");

The Select method returns a DataRow array you can loop through like you normally would loop through all the rows in the DataTable.

foreach (DataRow row in rows)
{
   DoSomeThing();
}

If you just want a single row based on the DataTable’s primary key, then you have to let the DataTable know which of the columns is the primary key.

dt.PrimaryKey = new DataColumn[] { dt.Columns["#"]};

When you have defined the DataTable’s primary key, you can now query directly for that key and get the whole row returned by using the Find method.

DataRow oneRow = dt.Rows.Find("19");

This method is faster than the Select method. If there is no row with the primary key value of “19”, the Find method returns null. So, before you use the returned DataRow, you probably want to check if the row exist first.

if (oneRow != null)
{
   DoSomething();
}

Column totals

You decide to add totals to the footer row of the DataGrid and therefore needs to sum the integer type columns. You can do that very easy with the Compute method.

dt.Compute("sum(price)", null)

Or, put a filter on

dt.Compute("sum(price)", "price > 40")

The DataTable class is very powerful and can improve the performance by removing calculations to .NET instead of doing them on the database. The different ways to query the rows are also very impressive and flexible and that makes the DataTable a serious in-memory database.

* $4.95/month BlogEngine.net Hosting – Click Here!

Comments (8) -

Jesper
Jesper
8/1/2006 6:46:11 AM #

I admit that I'm new to .Net - I've only just started programming C#, so I might be wrong, but what about indexes? Databases have quite a few huge advantages and one of them is indexes. Finding data is so much faster when you have the RDBMS taking care of a B+-tree; especially when you are working with 100.000+ rows, as I usually am.

You write "that a lot of developers doesn’t utilize" - I think it's "don't"; not "doesn't".

 Mads Kristensen
Mads Kristensen
8/1/2006 7:22:18 AM #

Jesper, not all DataTables are filled with data from a database (XML, CSV etc.) and you often want to keep a database table in memory for different reasons and if the table is relatively small. That's when the true power of the calculated columns shows. It is also very fast to query that in-memory DataTable by using the primary key and then saving the trip to the data source, which probably will be slower.

I have corrected my spelling, thanks for letting me know Smile

Jesper
Jesper
8/1/2006 7:53:14 AM #

That's true - instead of using an array or a dictionary, you'll have a lot more flexibility using a datatable created on the fly. I've used that extensively in VB and VBA over the years, but I've never used calculated columns. Always found it a lot easier to simply make the calculations in the code for easier debugging, but I think I'll try your way and find out if it works better. In my quest to conquer C#, accessing databases is one of the key issues I haven't gotten around to yet.

 Michal Talaga
Michal Talaga
8/1/2006 10:15:58 PM #

I find working with DataTables, DataSets and the like a pain in the a*s. I love when everything is compile-time checked and I have a real separation of the domain model of my application from the database structure.

Boromir
Boromir
8/3/2006 8:55:37 PM #

"The thing is, that .NET performs these kinds of calculation much more efficient than a database and that’s why we would like .NET to do them. "

Hmm! - I beg to differ... Databases are especially good at crunching numbers. Have you put this to the test?

 Mads Kristensen
Mads Kristensen
8/4/2006 7:15:50 AM #

Boromir: Yes, I have tested it and it is very much faster actually. But it depends on the complexity of the calculation. I've converted a lot of CASE/WHEN and REPLACE function to .NET and it performs better than on SQL Server 2005. The other thing about it is, that I try not to put to much strain on the database, because it's allready very strained as it is. If-statements also performs better as a calculated column in a DataTable.

But you are right, there is not much gained by doing simple calculations and number crunching.

Boris Yeltsin
Boris Yeltsin
8/4/2006 11:43:41 AM #

The big advantage is that the web front end is a lot easier and cheaper to scale than the database, so the more processing you can move from the DB to the web server, the better!

 atul
atul
12/26/2006 2:02:17 AM #

Nice article, Please check you site in IE 7, you may need to change some CSS i believe

Comments are closed

About the author

Mads Kristensen

Mads Kristensen
Program Manager at the Microsoft Web Platform team and founder of BlogEngine.NET.

More...

Month List

Disclaimer

The opinions expressed herein are my own personal opinions and do not represent my employer’s view in any way.