Newsequentialid error validating


Pulling index stats for all four tables gives us:

name                             Fragmentation Percent   fill_factor   page_count
PK__MyGuidSe__3214EC276477ECF3   0.71                    0             3096
PK__MyGuid__3214EC275FB337D6     99.08                   0             4343
PK__MyInt__3214EC275812160E      0.44                    0             1608
PK__MyBigInt__3214EC275BE2A6F2   0.48                    0             2101

There's a tiny amount of fragmentation on the INT and BIGINT tables, and a little bit more for UNIQUEIDENTIFIER when NEWSEQUENTIALID() is used, but as you can see, the MyGuid table (which uses NEWID() as the column default value) is almost 100% fragmented. Fragmentation slows down the retrieval of data from the table, meaning the SQL query engine has more work to do when it needs to scan the index.

Why so fragmented? NEWID() produces a pseudo-random GUID which is pretty much unpredictable, and 100% non-sequential. By their very nature, non-sequential rows are inserted into the middle of an index, whereas sequential rows are generally appended to the end of an index.

Index fragmentation occurs when new data has to go into the middle of an existing index and the page it belongs on has no space left, so that page is split. Pages can be inserted or moved around for lots of different reasons, but the most common are inserting new records, and updating indexed columns in existing records. NEWSEQUENTIALID() inherently helps to keep indices contiguous, filling data and index pages fully before creating a new page.

However, if your reason for using a GUID is that you want a unique identifier that's difficult to guess, then you're out of luck: unfortunately, NEWSEQUENTIALID() generated identifiers are predictable.

So then, to make it überfair, let's take the MyGuidSeq table and work out how much more storage it uses than the MyBigInt table: 24,986 - 16,904 = 8,082 KB extra storage per 1 million rows at its best, 21,824 KB at its worst. Not much on its own, but consider that a table is likely to be much larger than that, and using the column as a foreign key in other tables means your storage requirements will rapidly expand.
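To make the page-split argument above a bit more concrete, here is a minimal Python sketch of a toy clustered index. It is not how SQL Server's storage engine actually works: the page capacity, the `simulate` function and the simplified "fragmentation" measure (the share of pages whose logical successor is not the next physically allocated page) are all assumptions made for illustration. Integers stand in for sequential keys, and random 128-bit integers stand in for NEWID() values.

```python
import bisect
import random

PAGE_CAPACITY = 100  # rows per leaf page; an arbitrary toy value


def simulate(keys):
    """Insert unique keys into a toy index; return (page_count, fragmentation_pct)."""
    pages = [[]]              # leaf pages in logical (key) order, each a sorted list
    physical = [0]            # allocation order of each page, parallel to `pages`
    lows = [float("-inf")]    # lowest key each page accepts, parallel to `pages`
    next_physical = 1

    for key in keys:
        i = bisect.bisect_right(lows, key) - 1   # page that should hold this key
        page = pages[i]
        if len(page) >= PAGE_CAPACITY:
            appending = i == len(pages) - 1 and key > page[-1]
            if appending:
                # Key goes past the end of the index: start a fresh page.
                new_page, new_low = [], key
            else:
                # Key lands inside a full page: split it roughly in half.
                mid = len(page) // 2
                new_page = page[mid:]
                del page[mid:]
                new_low = new_page[0]
            pages.insert(i + 1, new_page)
            physical.insert(i + 1, next_physical)
            lows.insert(i + 1, new_low)
            next_physical += 1
            if key >= new_low:
                page = new_page
        bisect.insort(page, key)

    # Count logical neighbours that are not physical neighbours.
    breaks = sum(1 for a, b in zip(physical, physical[1:]) if b != a + 1)
    return len(pages), 100.0 * breaks / len(pages)


if __name__ == "__main__":
    rows = 100_000
    rng = random.Random(42)
    print("sequential:", simulate(range(rows)))
    print("random:    ", simulate(rng.getrandbits(128) for _ in range(rows)))
```

In this model the sequential run fills every page completely and reports no fragmentation, while the random run should end up with noticeably more, half-empty pages and nearly all of them out of physical order. That is the same pattern as the MyGuid row in the table, although the exact numbers are not comparable.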

In fact, I see this more often than you would expect, and misconfigured UNIQUEIDENTIFIER columns can create "hidden" problems that can be difficult to discover and/or rectify, depending on the SQL experience throughout your team.

As it stands, this doesn't mean too much, as you'd want to avoid getting to 99% fragmentation anyway by managing any indices that become more than 5% fragmented (but that's a whole other article).

So, I also ran the above test with 5% and 30% fragmentation values, and both got to 324 records before becoming fragmented.

However, because NEWSEQUENTIALID() is reset when Windows is restarted (and can subsequently start at a lower GUID seed value than the last seed), it stands to reason that NEWSEQUENTIALID() can still cause inserts into the middle of an index, fragmenting the index more quickly than an INT or BIGINT would.
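The restart scenario can be sketched in a few lines of Python. This is a simplified illustration only: plain integers stand in for sequential GUIDs, and the key ranges and the `count_mid_index_inserts` helper are hypothetical, not anything SQL Server exposes.

```python
import bisect


def count_mid_index_inserts(keys):
    """Count inserts that land before an existing key rather than at the end."""
    index = []
    mid_inserts = 0
    for key in keys:
        pos = bisect.bisect_left(index, key)
        if pos < len(index):      # something larger is already in the index
            mid_inserts += 1
        index.insert(pos, key)
    return mid_inserts


# Hypothetical key streams: 500 rows before a restart, 500 rows after it.
before_restart = list(range(1_000, 1_500))
restart_higher = list(range(2_000, 2_500))   # new seed above the old range
restart_lower = list(range(0, 500))          # new seed below the old range

if __name__ == "__main__":
    print(count_mid_index_inserts(before_restart + restart_higher))  # 0
    print(count_mid_index_inserts(before_restart + restart_lower))   # 500
```

When the new seed is higher, every row is still appended to the end. When it is lower, every row generated after the restart has to go in front of existing rows, which is exactly the kind of insert that leads to page splits.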

Also worth mentioning: if you ever update a UNIQUEIDENTIFIER column that is indexed, even if you use NEWSEQUENTIALID(), fragmentation will build up more quickly over time.

