Saturday, October 29, 2016

SQL Server GUID and UuidCreateSequential sorting

Recently I worked on an issue which was related with SQL Server uniqueidentifier columns. Almost all tables in our system used a uniqueidentifier column as the primary key and clustered index, but we didn't use NEWSEQUENTIALID as the default value for the column. Instead we was using UuidCreateSequential() to generate sequential IDs in the .NET code. However, the generated Guids were not really properly sorted in SQL Server. I started to look into how these sorting algorithms work.

1. SQL Server Guid sorting

I created a list of Guids in SQL Server and this is the sorting result, from less to more:

01000000-0000-0000-0000-000000000000
00010000-0000-0000-0000-000000000000
00000100-0000-0000-0000-000000000000
00000001-0000-0000-0000-000000000000
00000000-0100-0000-0000-000000000000
00000000-0001-0000-0000-000000000000
00000000-0000-0100-0000-000000000000
00000000-0000-0001-0000-000000000000
00000000-0000-0000-0001-000000000000
00000000-0000-0000-0100-000000000000
00000000-0000-0000-0000-000000000001
00000000-0000-0000-0000-000000000100
00000000-0000-0000-0000-000000010000
00000000-0000-0000-0000-000001000000
00000000-0000-0000-0000-000100000000
00000000-0000-0000-0000-010000000000

Alberto Ferrari's great post How are GUIDs sorted by SQL Server listed the sorting order. I think the opposite way, i.e. from the most important to least important, can help me understand the order better. When we compare things, the logic way is comparing from the most important to the least important. So I slightly modified the order as this:

  • A..F are evaluated in left to right order and are the most important, then
  • 8..9 are evaluated in left to right order, then
  • 6..7 are evaluated in right to left order, then
  • 4..5 are evaluated in right to left order, then
  • 0..3 are evaluated in right to left order and are the less important

2. UuidCreateSequential() Generated Guid sequence

I wrote a test program to generate a series of Guids with UuidCreateSequential(), and here are some examples:

afee48fe-9abe-11e6-9697-005056c00008
afee48ff-9abe-11e6-9697-005056c00008
afee4900-9abe-11e6-9697-005056c00008
afee4901-9abe-11e6-9697-005056c00008

fe89491d-9adc-11e6-9697-005056c00008
fe89491e-9adc-11e6-9697-005056c00008
052ccf38-9add-11e6-9697-005056c00008
052ccf39-9add-11e6-9697-005056c00008

We can see in bytes 0..3 and 4..5, the significance is from left to right, and the right is less important. Since it was difficult to generate Guids to reflect other bytes, I referenced some posts, Unraveling the mysteries of NEWSEQUENTIALID and How to generate sequential GUIDs for SQL Server in .NET, to get the conclusions that the sequence is like the following:

  • A..F are evaluated in left to right order and are the most important, then
  • 8..9 are evaluated in left to right order, then
  • 6..7 are evaluated in left to right order, then
  • 4..5 are evaluated in left to right order, then
  • 0..3 are evaluated in left to right order and are the less important
So basically in each group, the order always is from left to right, from most important to least important.

3. Guid comparison in .NET code

When I worked on getting the sequence generated by UuidCreateSequential(), I thought I could use Guid.CompareTo() to compare a list of Guids.  I used the Bubble Sort to sort the list and got the following order, from small to big:

00000000-0000-0000-0000-000000000001
00000000-0000-0000-0000-000000000100
00000000-0000-0000-0000-000000010000
00000000-0000-0000-0000-000001000000
00000000-0000-0000-0000-000100000000
00000000-0000-0000-0000-010000000000
00000000-0000-0000-0001-000000000000
00000000-0000-0000-0100-000000000000
00000000-0000-0001-0000-000000000000
00000000-0000-0100-0000-000000000000
00000000-0001-0000-0000-000000000000
00000000-0100-0000-0000-000000000000
00000001-0000-0000-0000-000000000000
00000100-0000-0000-0000-000000000000
00010000-0000-0000-0000-000000000000
01000000-0000-0000-0000-000000000000

To surprise me, this order is totally different with previous two.  It seems just simply compare byte by byte, from left to right, from more important to less important. We can check this MSDN to see the explanation.  This means when we compare 2 Guids generated by UuidCreateSequential(), the later one could be less than the former one.  It looks weird to me.  The simple conclusion to me is that we cannot use Guid.CompareTo() to decide the sequence generated by UuidCreateSequential().

4. Generate sequential GUIDs for SQL Server

After understanding these different sorting, it is easy to generate sequential IDs for SQL Server in .NET code. The post How to generate sequential GUIDs for SQL Server in .NET has the C# code.  Basically we just need to switch 0..3, 4..5 and 6..7 bytes and regenerate the Guid.

No comments: