Craig Rowe

Techlead / Developer

5th November 2009

Xml Explicit Always Sucks

  1. 1) Simple Database
  2. 2) Syntax Recap
  3. The Ordering
  4. Desired Output
  5. Table resultset
  6. Final Thoughts

SQL Server's Xml Explicit has always managed to annoy me. It's one of those things you don't do often enough to have it solidified in your mind... So here's a quick example to refresh both yours and my mind:

1) Simple Database

Contributor
PK ContributorId int
Name nvarchar(50)
Nickname nvarchar(50)
ContributorEmail
PK ContributorEmailId int
FK ContributorId int
Email nvarchar (1024)
ContributorUrl
PK ContributorUrlId int
FK ContributorId int
Url nvarchar (2048)

2) Syntax Recap

Most examples talk about the resultset order being important, and we'll address that, but lets recap the syntax first. A simple query for XML explicit could be:

            SELECT 1 as Tag,
              NULL as Parent,
              NULL AS [contributors!1],
              NULL As [contributor!2!id],
              NULL As [contributor!2!name],
              NULL as [contributor!2!nickname],
            FROM Contributor As contributor
            UNION
            SELECT 2 As Tag,
              1 as Parent,
              NULL AS [contributors!1],
              contributor.ContributorId As [contributor!2!id],
              contributor.Name As [contributor!2!name],
              contributor.Nickname as [contributor!2!nickname],
            FROM Contributor As contributor
            
            -- Outputs (without For Xml Explicit)
            -- ========
            -- Tag  Parent
            -- 1	  NULL	NULL	NULL	NULL	  NULL	NULL	NULL	NULL	NULL
            -- 2	  1	    NULL	1	    C Rowe	Craig	NULL	NULL	NULL	NULL
            -- 2	  1	    NULL	2	    M Pike	Matt	NULL	NULL	NULL	NULL
          
fig. 2.0

Each query being unioned represents a single element in the eventual xml heirachy. 'Tag' and 'Parent' are mandatory for SQL Server to identify the structure you intend to create for your xml document. The other items in the select list are used for each node name, it's attributes and child nodes. All nodes and attributes need to be represented in the full resultset so they each need to be referenced in all unioned queries (even as null on the nodes that don't require them).

The names of the resultset columns are used to parse the normal table output into an XML document. In brief:

  • The names are enclosed in square brackets.
  • The name of the element is first, separated by an exclamation mark, followed by the node number.
  • This is then optionally followed by another separator before the attribute name.
  • The attribute name can optionally be followed by !element to force it to show as a child element rather than an attribute or !hide to hide from results (useful for sorting).
  • !!cdata can be used to declare an item as cdata.

So, in fig 2.0 above, Tag 1 has no parent and it's element name is 'contributors'. A second tag is declared as 'contributor' with three attributes id, name and nickname. These results are nested so tag 2 is within tag 1 (if there were a tag 3 it would be within tag 2).

The Ordering

The important thing to ensure is that as you look down the resultset the row above the current row either has the same Parent value (is a sibling element) or has as its Tag value the Parent value of below (is the parent node). An incorrect order will result in the following error:

Parent tag ID is not among the open tags.

Desired Output

The XML we want from the database is (comments indicate tag numbers in relation to the query below):

            <contributors> <!-- This is tag #1 -->
              <contributor id="1" name="C Rowe" nickname="Craig"> <!-- This is tag #2 -->
                <urls> <!-- This is tag #3 -->
                  <url>http://cargowire.net</url> <!-- This is tag #5 -->
                </urls>
                <emails> <!-- This is tag #4 -->
                  <email>[email protected]</email> <!-- This is tag #6 -->
                  <email>[email protected]</email> <!-- This is tag #6 -->
                </emails>
              </contributor>
              <contributor id="2" name="M Pike" nickname="Matt" /> <!-- This is tag #2 -->
            </contributors>
          
fig. 1.0

Notice the node numbering increments for each differently named node and for each indentation i.e. no tab indent has one distinct element name (contributors) this is node 1. Single tab indent has one distinct name (contributor) and so is node 2. At two levels of indent 'urls' and 'emails' become node 3 and 4 respectively leaving three indents in 'url' and 'email' to be 5 and 6.

Query

The query below can be used on the above database to display the desired output. There may well be a more succinct way of doing this, but I like the readability.

            -- Section 1: Outer select allows for ordering at end
            SELECT Tag, Parent,
              [contributors!1],
              [contributor!2!id],
              [contributor!2!name],
              [contributor!2!nickname],
              null as [emails!3],
              null as [urls!4],
              [email!5],
              [url!6]
            FROM (
            -- end section 2
              -- Section 2: Create the overall parent element named 'contributors'
              -- This has null values for all other fields as it has no attributes
              SELECT 1 as Tag,
                NULL as Parent,
                NULL AS [contributors!1],
                NULL As [contributor!2!id],
                NULL As [contributor!2!name],
                NULL as [contributor!2!nickname],
                NULL As [emails!3],
                NULL As [urls!4],
                null AS [email!5],
                null AS [url!6]
              FROM Contributor As contributor
              -- end section 2
              UNION
              -- Section 3: Select all the contributor nodes with attributes
              -- for id, name and nickname
              SELECT 2 As Tag,
                1 as Parent,
                NULL AS [contributors!1],
                contributor.ContributorId As [contributor!2!id],
                contributor.Name As [contributor!2!name],
                contributor.Nickname as [contributor!2!nickname],
                NULL As [emails!3],
                NULL As [urls!4],
                NULL As [email!5],
                null As [url!6]
              FROM Contributor As contributor
              -- end section 3
              UNION
              -- Section 4: Create 'emails' container nodes, inner join to ContributorEmail to ensure
              -- no 'emails' container if no 'email' nodes to fill it later
              SELECT 3 As Tag,
                2 as Parent,
                NULL AS [contributors!1],
                contributor.ContributorId As [contributor!2!id],
                contributor.Name As [contributor!2!name],
                contributor.Nickname as [contributor!2!nickname],
                'emails' As [emails!3],
                NULL As [urls!4],
                NULL As [email!5],
                null As [urls!6]
              FROM Contributor As contributor
                INNER join ContributorEmail AS email on Email.ContributorId = contributor.ContributorId
                -- end section 4
              UNION
              -- Section 5: Create 'urls' container nodes, inner join to ContributorUrl to ensure
              -- no 'urls' container if no 'url' nodes to fill it later
              SELECT 4 AS Tag,
                2 as Parent,
                NULL AS [contributors!1],
                contributor.ContributorId As [contributor!2!id],
                contributor.Name As [contributor!2!name],
                contributor.Nickname as [contributor!2!nickname],
                NULL As [emails!3],
                'urls' As [urls!4],
                NULL As [email!5],
                null As [url!6]
              FROM Contributor As contributor
                INNER join ContributorUrl As url on Url.ContributorId = contributor.ContributorId
               -- end section 5
              UNION
              -- Section 6: Select all email nodes with the 'emails' tag as parent
              SELECT 5 AS Tag,
                3 as Parent,
                NULL AS [contributors!1],
                contributor.ContributorId As [contributor!2!id],
                contributor.Name As [contributor!2!name],
                contributor.Nickname as [contributor!2!nickname],
                'emails' As [emails!3],
                NULL As [urls!4],
                email.Email As [email!5],
                NULL As [url!6]
              FROM Contributor As contributor
                INNER join ContributorEmail AS email on Email.ContributorId = contributor.ContributorId
              -- end section 6
              UNION
              -- Section 7: Select all url nodes with the 'urls' tag as parent
              SELECT 6 AS Tag,
                4 as Parent,
                NULL AS [contributors!1],
                contributor.ContributorId As [contributor!2!id],
                contributor.Name As [contributor!2!name],
                contributor.Nickname as [contributor!2!nickname],
                NULL As [emails!3],
                'urls' As [urls!4],
                null As [email!5],
                url.url As [url!6]
              FROM Contributor As contributor
                INNER join ContributorUrl As url on Url.ContributorId = contributor.ContributorId
              -- end section 7
            ) as Contributors
            -- Section 8: By Ordering by contributor id followed by emails and urls we ensure
            -- the correct order to avoid the 'parent tag is not among the open tags' problem
            ORDER BY [Contributor!2!id], Contributors.[emails!3], Contributors.[urls!4]
            FOR XML EXPLICIT
          
fig. 3.0

Table resultset

Note the ordering of the Tag and Parent columns and the way the 'emails' and 'urls' columns can be used to assist this ordering
Tag Parent [contributors!1] [contributor !2!id] [contributor !2!name] [contributor !2!nickname] [emails!3] [urls!4] [email!5] [url!6]
1 NULL NULL NULL NULL NULL NULL NULL NULL NULL
2 1 NULL 1 C Rowe Craig NULL NULL NULL NULL
4 2 NULL 1 C Rowe Craig NULL ('urls') NULL NULL NULL
6 4 NULL 1 C Rowe Craig NULL ('urls') NULL NULL http://cargowire.net
3 2 NULL 1 C Rowe Craig NULL ('emails') NULL NULL NULL
5 3 NULL 1 C Rowe Craig NULL ('emails') NULL [email protected] NULL
5 3 NULL 1 C Rowe Craig NULL ('emails') NULL [email protected] NULL
2 1 NULL 2 M Pike Matt NULL NULL NULL NULL

Final Thoughts

Xml Explicit is particularly powerful and is often the easiest course of action if a service is designed with Xml in mind. However it can lead to large complex queries. Users should consider the use of temporary tables if multiple unions occur on laborious queries.

All article content is licenced under a Creative Commons Attribution-Noncommercial Licence.