Querying XML fields in LINQ to SQL

A while back I wrote about the limitations of LINQ to SQL when querying data in XML columns. It was frustrating but I hadn’t figured out an elegant solution at the time.

Then at the end of 2010 I attended a DDD8a session which re-opened my eyes to SQL CLR functions and how much better these were than SQL scalar functions.

And today I linked the two problems in my head – the solution is right there! Write some SQLCLR functions which allow you to query an XML column at the server, rather than having to retrieve the full XML value and search it on the client.

Here is the VB.NET code I wrote as a first test to get the first element (or child element) of a given name and return the value.

    'Requires Imports System.Data.SqlTypes, System.Linq and System.Xml.Linq
    'Get value of an Xml column element (including child elements)
    <Microsoft.SqlServer.Server.SqlFunction()> _
    Public Shared Function XmlGetElementValue(ByVal column As SqlXml,
                                              ByVal elementName As SqlString) As SqlString
        'Check for nulls
        If column.IsNull Then Return SqlString.Null
        If elementName.IsNull Then Return SqlString.Null

        'Create an XElement object from the XML
        Dim e As XElement = XElement.Parse(column.Value)
        'Try to find an element of given name
        Dim found = e.Descendants(elementName.Value).FirstOrDefault
        If found IsNot Nothing Then
            'return value
            Return New SqlString(found.Value)
        Else
            Return SqlString.Null
        End If
    End Function
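
To call a function like this from a LINQ to SQL query, it has to be deployed to the database and mapped on the DataContext. What follows is only a rough sketch of such a mapping, under some assumptions: the function has been registered in SQL Server as dbo.XmlGetElementValue, the designer-generated context is called MyDataContext, and there is an Orders table with an OrderID column and an XML column called Details. All of those names are hypothetical.

    Imports System.Data.Linq
    Imports System.Data.Linq.Mapping
    Imports System.Reflection
    Imports System.Xml.Linq

    'Hypothetical partial class extending a designer-generated DataContext
    Partial Public Class MyDataContext

        'Maps this method onto the scalar SQL function (assumed name: dbo.XmlGetElementValue).
        'IsComposable:=True marks it as a function that can be used inside a query.
        <FunctionAttribute(Name:="dbo.XmlGetElementValue", IsComposable:=True)> _
        Public Function XmlGetElementValue(
                <ParameterAttribute(DbType:="Xml")> ByVal column As XElement,
                <ParameterAttribute(DbType:="NVarChar(4000)")> ByVal elementName As String) As String
            Return CType(Me.ExecuteMethodCall(Me, CType(MethodInfo.GetCurrentMethod(), MethodInfo),
                                              column, elementName).ReturnValue, String)
        End Function

    End Class

With that in place, a query along these lines should filter on the element value at the server rather than pulling the XML back to the client (the element name and value are made up for illustration):

    'db is an instance of the hypothetical MyDataContext
    Dim matches = From o In db.Orders
                  Where db.XmlGetElementValue(o.Details, "Postcode") = "AB1 2CD"
                  Select o.OrderID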

LINQ to SQL bug

Came across an annoying bug in LINQ to SQL today. I have a process that uploads orders to a supplier service via the web. The process used to work fine before I moved to LINQ to SQL, but now for some reason it wasn’t updating the order state.

I checked and rechecked my code – I had a loop that ran

 order.SetStatusAsSent()

Stepping through the code I could see this method contained the simple code

  Me.OrderStateID = 2 '2 represents the order state 'sent'
  Me.OtherInfo = "Order sent"

The problem was that it wasn’t saving the OrderStateID change. I had a .SubmitChanges statement right after the loop, but examining the generated SQL I saw that it only updated the OtherInfo field.

I eventually figured out (after some headscratching) that because I had loaded the related OrderState object using DataLoadOptions (OrderStateID is a foreign key field), LINQ to SQL was ignoring the change to the ID.

I changed the code to load the related object thus:

  Dim newState = db.OrderStates.Single(Function(s) s.OrderStateID = 2)
  Me.OrderState = newState
  Me.OtherInfo = "Order sent"

Now the code works as expected. A nasty one!

LINQ to SQL FAQ – Inserts and Database Defaults

It’s Not My DeFault

LINQ to SQL does not support SQL default values on columns. For example, many of my tables have a "CreatedDate" column which has the SQL default value of GetDate(). The column does not allow NULL values, so the property on the LINQ to SQL class is a DateTime rather than a nullable DateTime? value.

The result is that, since LINQ to SQL does not know about default values, it tries to insert the value of the underlying field even if it has not been set, for example a DateTime that is essentially "zero" (DateTime.MinValue). This results in a SQL error, since DateTime.MinValue (1 January 0001) is earlier than the smallest value a SQL Server datetime column can hold (1 January 1753).
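
The range mismatch is easy to see in a small console program. This is a minimal sketch that assumes the column uses the SQL Server datetime type; the module name is made up.

    Imports System
    Imports System.Data.SqlTypes

    Module DefaultDateDemo
        Sub Main()
            'A DateTime that was never assigned holds DateTime.MinValue (1 January 0001)
            Dim unset As DateTime = Nothing

            'The SQL Server datetime type starts at 1 January 1753
            Dim earliestSqlDate As DateTime = SqlDateTime.MinValue.Value

            Console.WriteLine(unset = DateTime.MinValue)   'True
            Console.WriteLine(unset < earliestSqlDate)     'True, so the value is out of range
        End Sub
    End Module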

Modifying the DBML

One suggested solution is to set IsDbGenerated = "true" on the columns with defaults. This means that LINQ to SQL will not attempt to insert a value from an object, and will always use the database value. This works, but it also means that the field is now effectively read-only, as any attempt to edit it will result in a LINQ error.

A second problem with this solution is that if you refresh your model from the database (I have some code that does this), the manual setting will be overwritten. This makes it hard to keep the setting and the code stable.

OnCreated solution

I used to think an alternative approach I found was suitable: using the extensibility method OnCreated in any object with a default. You simply set the default value there so that any new instance has it set, e.g.

Private Sub OnCreated()
    Me.CreatedDate = Now
End Sub

This sets the value when a new object is created as LINQ calls OnCreated from the constructor. And when LINQ loads an object from the database the .CreatedDate is overwritten with the actual value.

That seemed to work quite well but I came across a problem. I had a form databound to a LinqDataSource control. The CreatedDate value was not in the form, so in theory it should be untouched. However, when LinqDataSource does an update, it creates an empty instance of your object, then copies the bound properties from the form into this object. It then compares the original object (loaded from the database) with this new object, modifies the original’s values to match, and then updates.

In doing this it thinks that CreatedDate has been modified, because OnCreated gave the new instance a CreatedDate of the current time, which differs from the stored value. So the "default" field values get modified unintentionally.

Insert{Entity} solution

The solution I eventually found was to use the Insert{Entity} extensibility method, which lives in the DataContext. LINQ to SQL calls it for each pending insert of that entity type when .SubmitChanges is called. So for a class Customer we have

Private Sub InsertCustomer(instance As Customer)
    'Only supply the default if no value has been set
    If instance.CreatedDate = DateTime.MinValue Then
        instance.CreatedDate = Now
    End If
    'Tell the DataContext to execute the insert
    ExecuteDynamicInsert(instance)
End Sub

This will first check that the CreatedDate has a "blank" value, and will set the value if this is the case. This is only called for Inserts so will not affect edits or other operations. This supports both the manual setting of a value, and also duplicates the "default" behaviour that I previously had SQL Server perform.

Don’t forget the ExecuteDynamicInsert(instance) bit at the end – the Insert{Entity} method overrides the default LINQ to SQL DataContext insert so you have to explicitly insert the new record yourself.

LINQ to SQL FAQ – Beware of the Trigger

I love triggers (and tiggers) but they can have nasty side-effects.

My current app has an issue logger with an Issues table and an IssueActivity sub-table. When trying to insert a new activity on an issue I was getting a "Row not found or changed" exception. Since both tables had timestamp columns it was really puzzling. I wrote a unit test, and traced the SQL being generated. Everything looked okay in the SQL, but the code failed when it ran.

Finally I stepped through the code, comparing what the in-memory objects held with what the database version said, and noticed that the timestamp on the Issue changed after I inserted the IssueActivity.

The cause was an insert trigger on IssueActivity that updated Issue.ModifiedDate. This meant the timestamp in the database was modified, so the LINQ in-memory version was now out of date. LINQ to SQL was, essentially, doing its job.

The simple solution was to use db.Refresh(..) on the issue to tell it to reload the changed values.
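
As a rough sketch of how that might look in practice (the variable names, the IssueActivities table property and the choice of RefreshMode are all assumptions for illustration):

    'Requires Imports System.Data.Linq (for RefreshMode)
    'Assumed: db is the DataContext, issue is the already-loaded Issue,
    'and activity is the new IssueActivity
    db.IssueActivities.InsertOnSubmit(activity)
    db.SubmitChanges()   'the insert trigger fires here and touches the parent Issue row

    'Reload the Issue so its in-memory timestamp matches the database again
    db.Refresh(RefreshMode.OverwriteCurrentValues, issue)

Note that OverwriteCurrentValues discards any unsaved changes on the refreshed object, so it is safest to call it when the Issue has nothing pending.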

LINQ to SQL FAQ – mapping functions and methods

When trying to write a LINQ to SQL query you have to be mindful of what is supported as a translation from VB/C# into SQL. For example you can specify somestring.Trim() and this is translated into LTRIM(RTRIM(SOMESTRING)).

A lot of the time a suck-it-and-see approach is okay, but it can introduce runtime errors into programs. Fortunately there is a reference guide to help at http://msdn.microsoft.com/en-us/library/bb386970.aspx 

One interesting find here is the SqlMethods class. This lets you call specific SQL functions that have no direct CLR equivalent for LINQ to SQL to translate.

Most of these are DateDiff computations, which let you compute the difference between two date values directly in SQL.
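
For instance, a query like the following should have the day difference computed by SQL Server rather than on the client. It is only a sketch: db.Orders and its CreatedDate column are hypothetical names.

    'Requires Imports System.Data.Linq.SqlClient (for SqlMethods)
    'Orders created within the last seven days, compared at the server
    Dim lastWeek = From o In db.Orders
                   Where SqlMethods.DateDiffDay(o.CreatedDate, DateTime.Now) <= 7
                   Select o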