In my article, Missing CLR Part 2, I called mdq.XmlTransform the “baddest CLR ever” because it adds a ton of new functionality to your SQL Server instance. In this article I will demonstrate some of that functionality. For many, the concept of XML transforms is new so I’ll keep it basic. The purpose of this article is not to sell you on XML transforms but rather to inform you that this technology is available. Hopefully, as more people see the potential of mdq.XmlTransform, more developers will take some of these ideas and run with them.
XSLT is about transforming Data
The conventional wisdom is that XSLT is used to transform XML into something else (such as a new XML file or text). This is only part of the truth. XSLT is for transforming data into data; that’s what it does. This means that you are not limited to XML as input or as output. XSLT can receive text, numbers (of any kind), dates, XML files, XML fragments, delimited strings, XHTML, HTML, etc, then transform that data, and then return text, numbers, dates, XML, HTML, or pretty much anything else you need.
mdq.XmlTransform can do math, parse strings, perform date calculations and almost anything else SQL can do to data. XSLT solves many complex XML problems better than T-SQL's limited XPath & XQuery functionality. It can also solve many problems SQL cannot solve yet and, unlike any Microsoft language, you can create new XSLT functions without creating new CLRs or compiling more code (as would be the case with LINQ and F#).
A quick note about performance
For many this is a new topic so I am not going to discuss performance considerations yet. There is plenty I can do to make the test code below run much faster but, for the sake of simplicity, I will table that for a future post.
Transforming XML Into SQL
In Setting up mdq.XmlTransform I showed you how to use XSLT to calulate a factorial. I did this to demonstrate recursion and computability using XSLT. Today I will show you the type of task that XSLT does better than regex, padindex or anyting else: transforming XML into SQL.
To demonstrate, I grabbed the HTML code from this Onion article and structured it similar (kinda) to an RSS XML feed (namespaces and CDATA removed for simplicity). The resulting XML code looks like this:
I also dirtied up the HTML code to look like the nonsense I often have to deal with (lot's stuff like brackets ([ ]), % signs and other things that make this tricky for T-SQL and Regular Expressions.)
Let's say you are asked to normalize the XML data posted above. In, other words, the requirement is to take that XML data, strip the tags and then insert the data into a couple tables like so:
How would you do that?
This would be extremely difficult if not impossible using anything else yet it's quite simple using XSLT. First we create two XSLT functions - one to strip the tags, another to clean up the white space like this:
Once my XSLT functions were ready I would pass them to mdq.xmlTransform like this:
... and that's it.
One thing to note is that I used a mdq.split, a CLR splitter function. You don't need a CLR splitter for this; that was the one I was using when I tested this code.
But what about portability?!?!
That's that best part. The XSLT functions I wrote were developed using XSLT 1.0. Not some "flavor" of XSLT, not T-XSLT, XSLT.NET-sharp, PL/XML or DBTransforms2 (I made those up). The XML format is version 1.0. People have been mastering XSLT 1.0 since last century (1999) because it works, it's proven and isn't going anywhere. Yeah, it's a little tricky but this opensource, non-proprietary functional coding goodness is available to you and your SQL instance at no cost and without compiling a new DLL or any new SQL objects. I have done transforms in MYSQL and Oracle using code I originally wrote for SQL Server and did not need to change a single line. That's what portability sounds like to me.
mdq.xmlTransform is a powerful tool; in this post I demonstrated how to use it to transform XML into relational data. As I said earlier, XSLT is about transforming data not just XML. If you don't like XML then think of mdq.xmlTransform as a way to turn your XML into SQL, CSV or anything else. XSLT 3.0 has great support for JSON. More about that in future posts. Thanks for reading!
P.S. Two Declarative programming languages, 0 loops!
Last Updated: 4/27/2013 (Posted, re-posted with typo's fixed and extra code)