Spread the Word! SQL User-Defined Scalar Functions are Evil!

While at DDD8a ‘Modern.NET’ recently, I attended the session "Things you should know about SQL as a developer” session hosted by Simon Sabin.

While I’m not DBA-level when it comes to SQL Server, I thought I knew most of the performance issues in SQL and how to avoid them. Turns out (happily) that I was wrong, and it’s given me an almost ‘free’ performance benefit.

The culprit is SQL User Defined Scalar functions. I use only a few of these, but they are very useful. For example, when working with dates in data, I often want to group rows by a month rather than the date and time value.

Indeed it’s really weird to my mind that this fairly common pattern isn’t implemented in T-SQL – or indeed even in the .NET framework! It’s not hard to create your own of course, you can use the year and month value to return a date for the first of the month. In .NET I have a DateExtensions library that has System.DateTime extension methods such as .MonthOnly, .LastMonth and .NextMonth – which is so heavily used that I now include the namespace at project level.

So for five years I’ve had a SQL scalar function dbo.MonthOnly() which takes a SQL DateTime and returns the first day of the date’s month. This is really useful with LINQ to SQL as this function can be integrated into LINQ queries, so that the grouping takes place on the server and not the client, reducing the amount of data passing from the server to the client.

Realisation Dawns

So listening to Simon talking about these useful functions and saying they are a performance nightmare was a bit of a surprise. Turns out these functions cannot be parallelised, so any query using them is automatically throttled. Secondly they are not compiled in the same way as stored procedures, so there is an interpretation hit. You can read more about it on his post here.

The solution says Simon is to use .NET CLR functions and having tried out the performance difference I can testify that the difference is amazing.

Example:

I have a database table of phone call records, with a SQL datetime column that stores and full date and time value. If I wanted to summarise by day (i.e. remove the time component of each call date) I would use my scalar DateOnly function. This table has just over 900,000 rows of real data in it.

Test

TSQL scalar function

CLR function

Group by DateOnly()

3 seconds

sub-1 second

Group by MonthOnly()

10 seconds

sub-1 second

 

I have to say that writing the SQL CLR functions was terribly easy, and (of course) I also created some unit tests for these so I had 100% confidence in the results I was getting.

Many thanks Simon!

Advertisements

One thought on “Spread the Word! SQL User-Defined Scalar Functions are Evil!

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s