2012-06-18

How to simulate the MySQL group_concat function in MS SQL Server

The issue: I am writing a web app to track agreements where each agreement can have one, or multiple,  account numbers associated with them, like so:

Agreement No. 1: Account Numbers = XYZ-123-123-123; XYZ-123-123-456; XYZ-123-123-789
Agreement No. 2:  Account Numbers = XYZ-123-123-333
Agreement No. 3: Account Numbers = XYZ-123-123-123; XYZ-123-123-789


I quickly decided to store the account numbers in a separate table so that each agreement could have a variable number of account numbers associated with it:

Agreement_Account_Numbers table
Agreement_ID   Account_Number
1                         XYZ-123-123-123
1                         XYZ-123-123-456
1                         XYZ-123-123-789
2                         XYZ-123-123-333
3                         XYZ-123-123-123
3                         XYZ-123-123-789

But then I ran into the problem of how to display all of an agreements account numbers in a table listing a number of agreements. I wanted an SQL query that would return one agreement per line, with all of the account numbers for each agreement concatenated in a single field.  In MySQL this would be easy: do JOIN query with the agreements table linked by Agreement_ID to the  on the agreement ID and then GROUP BY Agreement_ID and use the group_concat function to concatenate the Account_Number field.  However, for this project I have to work with MS SQL Server and it apparently does not have any such function.  Googling around led me to this excellent question on stackoverflow:

Simulating group_concat MySQL function in MS SQL Server 2005?

It took me a while to sort out which answer to this question was easiest to implement, so I am writing down what worked for me for my own future reference.  Here is what I ended up with:

 SELECT
     Agreement_ID,
     STUFF(
         (SELECT '; ' + Account_Number
          FROM agreement_account_numbers
          WHERE agreement_id = a.agreement_id
          FOR XML PATH (''))
          , 1, 1, '')  AS Account_Numbers
FROM Agreement_Account_Numbers AS a
GROUP BY agreement_id

The result of this query is:

Agreement_ID      Account_Numbers
1                           XYZ-123-123-123; XYZ-123-123-456; XYZ-123-123-789
2                           XYZ-123-123-333
3                           XYZ-123-123-123; XYZ-123-123-789

This query uses a sub-select clause to retrieve all of the account numbers for each agreement by linking the Agreement_Account_Numbers table back to itself (by giving the table the alias "a") and then putting sub-select results in a string using the MS SQL Server specific command FOR XML PATH ('').  Then the STUFF function replaces the first character in the sub-select result string with a zero length string ('') to strip off the leading semi-colon that would otherwise be in front of the first account number.

The command is actually FOR XML with the PATH option specified, and then the tag to surround each row specified to be nothing ('').  I tried figure out exactly now FOR XML works, and what the PATH option means, but as usual I found the MS documentation to be completely cryptic so I just accepted that it works.