GUID is not Always GOOD !!!!!!! (a true RTFM story)

A collegue of mine asked me some questions about a clustered index. He was preparing for a SQL Exam and he had trouble understaning some concepts. He had trouble with the concepts not because they were too difficult for him, but they were completely opposite to what he experienced in everyday programming life !!!!

To make a long story short.

A lot of programmers have a GUID as a Primary key.

In the Books online that come with SQL Server there is the following statement :

PRIMARY KEY constraints create clustered indexes automatically if no clustered index already exists on the table and a nonclustered index is not specified when you create the PRIMARY KEY constraint

So if there is no other index on a table, the primary key will be clustered. (so far nothing wrong) But if you have a GUID as a PK then let's have a look at what the books online say about the UniqueIdentifier.........

The uniqueidentifier data type has several disadvantages:

  1. The values are long and obscure. This makes them difficult for users to type correctly, and more difficult for users to remember.
  2. The values are random and cannot accept any patterns that may make them more meaningful to users.
  3. There is no way to determine the sequence in which uniqueidentifier values were generated. They are not suited for existing applications that depend on incrementing key values serially.
  4. At 16 bytes, the uniqueidentifier data type is relatively large compared to other data types such as 4-byte integers. This means indexes built using uniqueidentifier keys may be relatively slower than implementing the indexes using an int key. Consider using the IDENTITY property when global uniqueness is not necessary, or when having a serially incrementing key is desirable.

Besides That. There is also another Disadvantage on having a GUID as the primary key. The data type is relatively large......(also from the books online)

Wide keys , The key values from the clustered index are used by all nonclustered indexes as lookup keys and therefore are stored in each nonclustered index leaf entry.

So every record refers to it's position in the table with the GUID. Since a guid is 16 bytes and an ID is 4 the index with the GUID is much larger and will there therefore be slower.....

Hope this is usefull for someone......


Patrick

 

 

Published Mon, Mar 15 2004 11:36 AM by Patrick Wellink
Filed under:

Comments

# re: GUID is not Always GOOD !!!!!!! (a true RTFM story)

Monday, March 15, 2004 1:07 PM by Patrick Wellink
Great summary, but I think that having GUIDs as a key have some great advantages as well, like:

- garanteed uniqueness across applications/servers
- support for key initialisation by the client (something you can't do with identities).

I wouldn't like to give that up to get smaller indexes. How often is bare-bone performance the one and most important required feature of a system anyway? Not often, I think.

Anyway, what was the question that started this discussion?

# re: GUID is not Always GOOD !!!!!!! (a true RTFM story)

Monday, March 15, 2004 1:22 PM by Patrick Wellink
Carlo.......

About Statement nr 1
----------------------

- garanteed uniqueness across applications/servers

Don't Worry let MS-SQL take care of that. Or do you mean a Replicated Scenario.... Because this is one of the few scenario's where guid's are ok to use as a PK.

Statement Nr 2
_-----------------

- Support for key initialisation by the client

This is something you don't need. Have the sproc to create a record return the scope_identity() and everything should work fine.

Sure sure when you use dynamic SQL it could be a little tricky....

BUT THIS IS SOMETHING THAT SHOULD BE AVOIDED. Always use stored procedures to execute an insert.

Let the database worry about ID's and integity. Choosing a GUID will work for a small application. But when performance is an issue FORGET IT.

Choosing a GUID as PK can have SERIOUS implications in large tables. ( A simple insert that takes 30 seconds )

So forget the GUID and do it the proper way.

# re: GUID is not Always GOOD !!!!!!! (a true RTFM story)

Monday, March 15, 2004 2:45 PM by Patrick Wellink
Sorry? Forget about GUIDs? I couldn't disagree more.

I do agree that you should take care when using GUIDs, like you should take care when using any technology. But by declaring GUIDs dead, you miss the point here.

It's like saying that starting too many threads will slowdown the OS and then declare multi-threading a bad technology.

I think that great programmers use common sense to decide instead of this kind of generic rules based only on raw performance. Sometimes it's better to use GUIDs, sometimes it's not. Not all tables will grow to over a million rows.

BTW: It's quite a statement to say that just the choice of key type will lead up to 30 second inserts. Where did you find/experience this figures?

# re: GUID is not Always GOOD !!!!!!! (a true RTFM story)

Monday, March 15, 2004 3:29 PM by Patrick Wellink
Well try it yourself. Create a table with a couple of fields (20 or 30 or so) and create a clustered index on the GUID.

Now start inserting records..........
Not just 10 or 20 but 100.000 at a time and watch performance degrade.....

And yes. Also for small files try to avoid a guid...

For example We create a table Countries with three fields

1 GUID
2 ISO CODE
3 DESCIPTION

There is nothing wrong with this file whatsoever I completely agree with you. But now we have a customer table with 3 million records that has a countrycode.....

By using the GUID we have increased the table (on disk) with 3.000.000 customers * 16 bytes = 48 Million bytes

By using an ID the table would be 3.000.000 * 4 bytes = 12 Million Bytes

So a difference of 36 Million Bytes.

And that is only for one field..... Suppose there is also a language table, a currency table etc......

I am not declaring the GUID dead. I only tell people to use it when there is a Reason for it.....

And Performance problems always come after the programmers are gone and the files start to grow.

# re: GUID is not Always GOOD !!!!!!! (a true RTFM story)

Monday, March 15, 2004 3:50 PM by Patrick Wellink
Okay, I completely agree with you there.

You can argue about the meaning of winning 36 MB when a regular harddisk is 20 GB or over, but that's too easy ;-). When you expect 1 million rows, you should better take this into consideration.

But you just can't wave away the benefits of a GUID, both for me as a developer as for you as a DBA.

For me as a developer it's very handy to know an entities key at the moment it's created, and not have to wait untill it's persisted to the database. A GUID gives you that possibility.

For you as a DBA, you must have been in a situation where you wanted to migrate a portion of staging data to production with a new release and you couldn't do it right away because keys overlapped. At least happened to me a couple of times in the last few years.

These are things that define the overall maintainability of the system and I think that is as important as it's performance.

Saying that, I think we agree more then I initially thought, which is good to know... ;-)

# re: GUID is not Always GOOD !!!!!!! (a true RTFM story)

Monday, March 15, 2004 3:51 PM by Patrick Wellink
PROOF.

Run 1000 inserts on an identical table only the ID column Differs. And in the small test the difference was almost 5 seconds.....

ok here are the results :
------------- -----------
DURATION GUID 14500
----------- -----------
DURATION ID 9890

Don't believe me ????

try it yourself......

SET NOCOUNT ON

DECLARE @START DATETIME
DECLARE @END DATETIME
DECLARE @COUNTER INT

CREATE TABLE TESTTABLEGUID
(
ID UNIQUEIDENTIFIER DEFAULT NEWID() PRIMARY KEY,
FIELD1 CHAR(200) DEFAULT NEWID(),
FIELD2 CHAR(200) DEFAULT NEWID(),
FIELD3 CHAR(200) DEFAULT NEWID(),
FIELD4 CHAR(200) DEFAULT NEWID(),
FIELD5 CHAR(200) DEFAULT NEWID(),
FIELD6 CHAR(200) DEFAULT NEWID(),
FIELD7 CHAR(200) DEFAULT NEWID(),
FIELD8 CHAR(200) DEFAULT NEWID(),
FIELD9 CHAR(200) DEFAULT NEWID(),
FIELD10 CHAR(200) DEFAULT NEWID(),
FIELD11 CHAR(200) DEFAULT NEWID(),
FIELD12 CHAR(200) DEFAULT NEWID(),
FIELD13 CHAR(200) DEFAULT NEWID(),
FIELD14 CHAR(200) DEFAULT NEWID(),
FIELD15 VARCHAR(200) DEFAULT NEWID()
)



SET @START = GETDATE()
SET @COUNTER = 1
WHILE @COUNTER < 1000
BEGIN
INSERT TESTTABLEGUID DEFAULT VALUES
SET @COUNTER = @COUNTER + 1
END
SET @END = GETDATE()
SELECT 'DURATION GUID',DATEDIFF(MS,@START,@END)
DROP TABLE TESTTABLEGUID
-- IDENTICAL CODE BUT NOW WITH AN ID
DECLARE @START1 DATETIME
DECLARE @END1 DATETIME


CREATE TABLE TESTTABLEID
(
ID INT IDENTITY PRIMARY KEY,
FIELD1 CHAR(200) DEFAULT NEWID(),
FIELD2 CHAR(200) DEFAULT NEWID(),
FIELD3 CHAR(200) DEFAULT NEWID(),
FIELD4 CHAR(200) DEFAULT NEWID(),
FIELD5 CHAR(200) DEFAULT NEWID(),
FIELD6 CHAR(200) DEFAULT NEWID(),
FIELD7 CHAR(200) DEFAULT NEWID(),
FIELD8 CHAR(200) DEFAULT NEWID(),
FIELD9 CHAR(200) DEFAULT NEWID(),
FIELD10 CHAR(200) DEFAULT NEWID(),
FIELD11 CHAR(200) DEFAULT NEWID(),
FIELD12 CHAR(200) DEFAULT NEWID(),
FIELD13 CHAR(200) DEFAULT NEWID(),
FIELD14 CHAR(200) DEFAULT NEWID(),
FIELD15 VARCHAR(200) DEFAULT NEWID()
)



SET @START1 = GETDATE()
SET @COUNTER = 1
WHILE @COUNTER < 1000
BEGIN
INSERT TESTTABLEID DEFAULT VALUES
SET @COUNTER = @COUNTER + 1
END
SET @END1 = GETDATE()
SELECT 'DURATION ID',DATEDIFF(MS,@START1,@END1)
DROP TABLE TESTTABLEID

# re: GUID is not Always GOOD !!!!!!! (a true RTFM story)

Monday, March 15, 2004 3:56 PM by Patrick Wellink
Carlo one more comment :

Quote : For me as a developer it's very handy to know an entities key at the moment it's created, and not have to wait untill it's persisted to the database. A GUID gives you that possibility.


THAT IS THE PROBLEM. You create a key and you don't want to wait until SQL has persisted the key...... INTEGRITY.......

The only way to know for SURE the key can be used is if it is persisted in the database. Otherwise the stuff can become corrupted.

And after the little test i hope you understand that me as a DBA am not happy with poor performing databases..... Better a good solid design from the start that will perform always.

Unfortunately when there is a Oracle DB involved everybody screams they need a DBA but with SQL everybody seems to know how things work.

# re: GUID is not Always GOOD !!!!!!! (a true RTFM story)

Monday, March 15, 2004 4:50 PM by Patrick Wellink
O jee, this can lead into an endless discussion ;-). But anyway, a little reaction on your note.

Integrity is a crucial thing, but that's exactly the reason why I want to know the key upfront. Because with that I can persist all related records in one batch in one transaction, instead of round-tripping the server for every record. I guess that even IMPROVES performance by shortening the life-time of the transaction. Of course, there are ways to get around the round-tripping, like using stored procedures, but that often gets complicated, especially in master-detail kind of scenario's.

But your performance figures are very impressive. You made your point about the GUIDs performance very clear and I now also feel one should take care about when to use GUIDs. Good job.

# re: GUID is not Always GOOD !!!!!!! (a true RTFM story)

Monday, March 15, 2004 4:55 PM by Patrick Wellink
Thanks carlo.

This is exactly why I wrote the article. Everybody uses the guids because they are so easy to use.

Sure you can use them, but with care and only when needed, not by default.

And indeed I will not discuss integrity with you.

# re: GUID is not Always GOOD !!!!!!! (a true RTFM story)

Tuesday, March 30, 2004 9:51 AM by Patrick Wellink
I read this article just some minutes ago. Interesting topic that never ends. The comment of Carlo is what I would have written. But then my ramblings.. WHY would you put a clustered index on a PK? Choose a column where costly queries would benefit from. The insertdate for statistics for example or if data can be grouped it could be such an attribute like city, country or the combination gender with date of birth :)

Almost all projects I started were first using int/identities as PK's but most database were getting a replication treatment after some time. Using guid's as PK's is a wise decision when you know that database availability and growth are important issues and you don't know what kind of fail-over scenario will be used.

Using guid's has nothing to do with easier development in for example the middletier. I can only see a clear advantage for the dba with a minimal performanceloss. As said earlier the PK is 4 times bigger but this disadvantage does not compare to most total record lengths of tables that have lots of modifications compared to current hardware specs and prices, maintainability and performance gain in the applicationlogic.

# re: GUID is not Always GOOD !!!!!!! (a true RTFM story)

Tuesday, March 30, 2004 11:36 AM by Patrick Wellink
If you create a table in SQL and you assign it a primary key, this becomes the default clustered index.

You have to change it in nonclustered.

The examples you give, CITY and Country are NO candidates at all for a CLUSTERED index. And indeed Only when you replicate you could use a GUID in a table.

And have you tried the query example ?. If it doe'n't make a big difference just increase the loop to 100.000 and see performance degrade.

The performance Loss is not minimal.

And if you know what you are doing you can use guids. (only in a replicated scenario). But unfortunately lot's of people design a table in the Enterprise Manager, they select a UniqeIdentifier as the primary key and then it becomes automatically a clustered index.

I hope this answers your question, Why put a Clustered index on a GUID.

# re: GUID is not Always GOOD !!!!!!! (a true RTFM story)

Tuesday, March 30, 2004 6:29 PM by Patrick Wellink
Why wouldn't city and country not be a perfect candidate for a clustered index? It all depends of the costs of an insert againt the win of disk I/O with a select statement.

Ofcourse you will want columns in a clustered index that preferably won't change in time. But that is just preferable in most databases.

What I ment is that using guids as primarykey WITHOUT using it in an clusteredindex (because that would result in random datapage inserts thus resulting in huge performanceloss) relative to it's 4x size of a normal int does not result in a bad performance. You cannot test this in a queryanalyzer session just on SQL server because you must do these kind of test from a normal application that also generates the guid.

The performance loss is in most situation's neglectable because of application logic and network i/o.

# re: GUID is not Always GOOD !!!!!!! (a true RTFM story)

Wednesday, March 31, 2004 8:49 AM by Patrick Wellink
No CITY and Country would neve be good clustered indexes.

As a rule of thumb, every table should have a clustered index. Generally, but not always, the clustered index should be on a column that monotonically increases--such as an identity column, or some other column where the value is increasing--and is unique. In many cases, the primary key is the ideal column for a clustered index.


City and country both share the same caracteristics. 'They don't Increase monotonically' and 'are not unique' that is why I would never create a clustered index on them.

Second....

I have seen the performance hit on a logging table. The logging table would grow to over 1.5 million records. The last insert took over 30 seconds and caused a Timeout.

So you really have to know hat you are doing if you are playing around with guids. If you do, it is ok to use them.


# Stirring up the GUID discussion

Tuesday, April 13, 2004 1:51 PM by TrackBack

# GUID is not always GOOD! (a true RTFM story)

Thursday, April 15, 2004 4:04 PM by TrackBack
GUID is not always GOOD! (a true RTFM story)

# re: GUID is not Always GOOD !!!!!!! (a true RTFM story)

Friday, May 14, 2004 10:05 PM by Patrick Wellink
This is currently a heated discussion within our organization. I personally do not care if my PK is a guid or autonumber. I DO care about generating this ID in the middle tier instead of at the database level. Can you please elaborate on why this is a bad practice and should be avoided? I know it revolves around database integrity, but I can remember the exact issue. Any help or other refrences would be greatly appreciated.

# re: GUID is not Always GOOD !!!!!!! (a true RTFM story)

Monday, May 17, 2004 10:52 AM by Patrick Wellink
Well I think it is better to let the database decide what a key is than decide yourself.

But ok here is the underlying logic....

Since the database decides if a record is valid, only the database can assure me that a key is valid. And a key is only valid if a record was succellfully committed....

By assuming that a GUID is unique you think you have solved the problem of creating keys....

But a key is only valid if it is accepted by the database... (and commited) So this involves a round trip to make sure it is accepted.

Since you are doing a roundtrip anyway why not get the pk back from the database in that same roundtrip.

I know this is not true for specific situations, but this is the general rule

# re: GUID is not Always GOOD !!!!!!! (a true RTFM story)

Monday, May 17, 2004 4:24 PM by Patrick Wellink
Patrick, thanks for the response, it has been very helpful. I too am also strongly opposed to the use of GUID's as a priamry key. As you have pointed out the disadvanteges far outweigh the advantages of global uniqueness. (Even though this too can easily be accounted for with ints as you have pointed out.) Here is my exact problem, our team is convinced that a round trip can be avoided through the use of guids and an isolation level of serializeable. We will be using serializeable transactions for both reads and writes. In this scenario, is there anything else you can think of that would be determental to data integrity? What about cascading updates and deletes? I just need some sort of concrete evidence to turn the tides and convince everyone once and for all. This thread has been extremely helpful and has made a huge impact in our organization. I would like to thank everyone who has responded.

# re: GUID is not Always GOOD !!!!!!! (a true RTFM story)

Monday, May 17, 2004 4:42 PM by Patrick Wellink
Run the query !!!!!

If you don't see a difference your machine is to fast, just increase the numbers a little

This whole discussion was about performance.

The reason why you shouldn't use GUID's is performance...

And I mean performance in life situations with lotsa records....

Sure sure in a test environment everything works fine, until you start inserting a serious amount of records.

Success

# re: GUID is not Always GOOD !!!!!!! (a true RTFM story)

Monday, May 24, 2004 11:56 AM by Patrick Wellink
There seems to be some misunderstanding about advantages of clustering here - with a clustered index the index is the data so once the page entry of the index is found that page has the data - there are OFTEN times when a non-unique key is the best for clustering - e.g. take an app where you have a number of client's and their purchase history - chances are you will want to pull back full purchase history for each client - so on the purchase history table the id of the client would be the one to cluster - keeping all data for same client together in the same data pages so once SQL Server has retrieved a page most of the required data is there in the same page

# re: GUID is not Always GOOD !!!!!!! (a true RTFM story)

Friday, June 04, 2004 8:49 PM by Patrick Wellink
One thing that I found VERY interesting wasn't the bulk insert times, but rather comparing the search times on large tables with guids vs ints. Making all things equal by creating non-clustered indexes on all tables, I found that the search were between 35-40% faster by using the integer index!!! This test involved 2 sets of 4 tables with 10,000 rows each. One set was for integer primary keys, another for guids. The first table represents an "accounts" table, and the other three were joined in one-one relationships based on the primary key.

Here is the modified script taken from the previous example.

SET NOCOUNT ON

-- IDENTICAL CODE WITH AN ID

DECLARE @START1 DATETIME
DECLARE @END1 DATETIME
DECLARE @COUNTER1 INT
DECLARE @ACCOUNTID INT

CREATE TABLE ACCOUNTIDTBL
(
ID INT IDENTITY PRIMARY KEY NONCLUSTERED,
ACCOUNTCREATEDATE DATETIME DEFAULT GETDATE(),
ACCOUNTFIELD1 VARCHAR(200) DEFAULT NEWID(),
ACCOUNTFIELD2 VARCHAR(200) DEFAULT NEWID(),
ACCOUNTFIELD3 VARCHAR(200) DEFAULT NEWID(),
ACCOUNTFIELD4 VARCHAR(200) DEFAULT NEWID(),
ACCOUNTFIELD5 VARCHAR(200) DEFAULT NEWID(),
ACCOUNTFIELD6 VARCHAR(200) DEFAULT NEWID(),
ACCOUNTFIELD7 VARCHAR(200) DEFAULT NEWID(),
ACCOUNTFIELD8 VARCHAR(200) DEFAULT NEWID(),
ACCOUNTFIELD9 VARCHAR(200) DEFAULT NEWID(),
ACCOUNTFIELD10 VARCHAR(200) DEFAULT NEWID(),
ACCOUNTFIELD11 VARCHAR(200) DEFAULT NEWID(),
ACCOUNTFIELD12 VARCHAR(200) DEFAULT NEWID(),
ACCOUNTFIELD13 VARCHAR(200) DEFAULT NEWID(),
ACCOUNTFIELD14 VARCHAR(200) DEFAULT NEWID(),
ACCOUNTFIELD15 VARCHAR(200) DEFAULT NEWID()
)
CREATE TABLE TABLE1IDTBL
(
ID INT IDENTITY PRIMARY KEY NONCLUSTERED,
ACCOUNTID INT,
TABLE1CREATEDATE DATETIME DEFAULT GETDATE(),
TABLE1FIELD1 VARCHAR(200) DEFAULT NEWID(),
TABLE1FIELD2 VARCHAR(200) DEFAULT NEWID(),
TABLE1FIELD3 VARCHAR(200) DEFAULT NEWID(),
TABLE1FIELD4 VARCHAR(200) DEFAULT NEWID(),
TABLE1FIELD5 VARCHAR(200) DEFAULT NEWID(),
TABLE1FIELD6 VARCHAR(200) DEFAULT NEWID(),
TABLE1FIELD7 VARCHAR(200) DEFAULT NEWID(),
TABLE1FIELD8 VARCHAR(200) DEFAULT NEWID(),
TABLE1FIELD9 VARCHAR(200) DEFAULT NEWID(),
TABLE1FIELD10 VARCHAR(200) DEFAULT NEWID(),
TABLE1FIELD11 VARCHAR(200) DEFAULT NEWID(),
TABLE1FIELD12 VARCHAR(200) DEFAULT NEWID(),
TABLE1FIELD13 VARCHAR(200) DEFAULT NEWID(),
TABLE1FIELD14 VARCHAR(200) DEFAULT NEWID(),
TABLE1FIELD15 VARCHAR(200) DEFAULT NEWID()
)
CREATE TABLE TABLE2IDTBL
(
ID INT IDENTITY PRIMARY KEY NONCLUSTERED,
ACCOUNTID INT,
TABLE2CREATEDATE DATETIME DEFAULT GETDATE(),
TABLE2FIELD1 VARCHAR(200) DEFAULT NEWID(),
TABLE2FIELD2 VARCHAR(200) DEFAULT NEWID(),
TABLE2FIELD3 VARCHAR(200) DEFAULT NEWID(),
TABLE2FIELD4 VARCHAR(200) DEFAULT NEWID(),
TABLE2FIELD5 VARCHAR(200) DEFAULT NEWID(),
TABLE2FIELD6 VARCHAR(200) DEFAULT NEWID(),
TABLE2FIELD7 VARCHAR(200) DEFAULT NEWID(),
TABLE2FIELD8 VARCHAR(200) DEFAULT NEWID(),
TABLE2FIELD9 VARCHAR(200) DEFAULT NEWID(),
TABLE2FIELD10 VARCHAR(200) DEFAULT NEWID(),
TABLE2FIELD11 VARCHAR(200) DEFAULT NEWID(),
TABLE2FIELD12 VARCHAR(200) DEFAULT NEWID(),
TABLE2FIELD13 VARCHAR(200) DEFAULT NEWID(),
TABLE2FIELD14 VARCHAR(200) DEFAULT NEWID(),
TABLE2FIELD15 VARCHAR(200) DEFAULT NEWID()
)
CREATE TABLE TABLE3IDTBL
(
ID INT IDENTITY PRIMARY KEY NONCLUSTERED,
ACCOUNTID INT,
TABLE3CREATEDATE DATETIME DEFAULT GETDATE(),
TABLE3FIELD1 VARCHAR(200) DEFAULT NEWID(),
TABLE3FIELD2 VARCHAR(200) DEFAULT NEWID(),
TABLE3FIELD3 VARCHAR(200) DEFAULT NEWID(),
TABLE3FIELD4 VARCHAR(200) DEFAULT NEWID(),
TABLE3FIELD5 VARCHAR(200) DEFAULT NEWID(),
TABLE3FIELD6 VARCHAR(200) DEFAULT NEWID(),
TABLE3FIELD7 VARCHAR(200) DEFAULT NEWID(),
TABLE3FIELD8 VARCHAR(200) DEFAULT NEWID(),
TABLE3FIELD9 VARCHAR(200) DEFAULT NEWID(),
TABLE3FIELD10 VARCHAR(200) DEFAULT NEWID(),
TABLE3FIELD11 VARCHAR(200) DEFAULT NEWID(),
TABLE3FIELD12 VARCHAR(200) DEFAULT NEWID(),
TABLE3FIELD13 VARCHAR(200) DEFAULT NEWID(),
TABLE3FIELD14 VARCHAR(200) DEFAULT NEWID(),
TABLE3FIELD15 VARCHAR(200) DEFAULT NEWID()
)

SET @START1 = GETDATE()
SET @COUNTER1 = 1
WHILE @COUNTER1 < 10000
BEGIN
INSERT ACCOUNTIDTBL DEFAULT VALUES
SET @ACCOUNTID = @@IDENTITY
INSERT TABLE1IDTBL (ACCOUNTID) VALUES (@ACCOUNTID)
INSERT TABLE2IDTBL (ACCOUNTID) VALUES (@ACCOUNTID)
INSERT TABLE3IDTBL (ACCOUNTID) VALUES (@ACCOUNTID)
SET @COUNTER1 = @COUNTER1 + 1
END
SET @END1 = GETDATE()
SELECT 'INSERT DURATION TIME USING ID',DATEDIFF(MS,@START1,@END1)

--End of insert, now do select benchmarking
SET @START1 = GETDATE()

SELECT *
FROM ACCOUNTIDTBL INNER JOIN
TABLE1IDTBL ON ACCOUNTIDTBL.ID = TABLE1IDTBL.ACCOUNTID INNER JOIN
TABLE2IDTBL ON ACCOUNTIDTBL.ID = TABLE2IDTBL.ACCOUNTID INNER JOIN
TABLE3IDTBL ON ACCOUNTIDTBL.ID = TABLE3IDTBL.ACCOUNTID
WHERE (ACCOUNTIDTBL.ID = @ACCOUNTID)
SET @END1 = GETDATE()
SELECT 'SELECT DURATION TIME USING ID',DATEDIFF(MS,@START1,@END1)



-- IDENTICAL CODE BUT NOW WITH A GUID

DECLARE @START DATETIME
DECLARE @END DATETIME
DECLARE @COUNTER INT
DECLARE @ACCOUNTGUID UNIQUEIDENTIFIER

CREATE TABLE ACCOUNTGUIDTBL
(
ID UNIQUEIDENTIFIER DEFAULT NEWID()PRIMARY KEY NONCLUSTERED,
ACCOUNTGUID UNIQUEIDENTIFIER,
ACCOUNTCREATEDATE DATETIME DEFAULT GETDATE(),
ACCOUNTFIELD1 VARCHAR(200) DEFAULT NEWID(),
ACCOUNTFIELD2 VARCHAR(200) DEFAULT NEWID(),
ACCOUNTFIELD3 VARCHAR(200) DEFAULT NEWID(),
ACCOUNTFIELD4 VARCHAR(200) DEFAULT NEWID(),
ACCOUNTFIELD5 VARCHAR(200) DEFAULT NEWID(),
ACCOUNTFIELD6 VARCHAR(200) DEFAULT NEWID(),
ACCOUNTFIELD7 VARCHAR(200) DEFAULT NEWID(),
ACCOUNTFIELD8 VARCHAR(200) DEFAULT NEWID(),
ACCOUNTFIELD9 VARCHAR(200) DEFAULT NEWID(),
ACCOUNTFIELD10 VARCHAR(200) DEFAULT NEWID(),
ACCOUNTFIELD11 VARCHAR(200) DEFAULT NEWID(),
ACCOUNTFIELD12 VARCHAR(200) DEFAULT NEWID(),
ACCOUNTFIELD13 VARCHAR(200) DEFAULT NEWID(),
ACCOUNTFIELD14 VARCHAR(200) DEFAULT NEWID(),
ACCOUNTFIELD15 VARCHAR(200) DEFAULT NEWID()
)

CREATE TABLE TABLE1GUIDTBL
(
ID UNIQUEIDENTIFIER DEFAULT NEWID()PRIMARY KEY NONCLUSTERED,
ACCOUNTGUID UNIQUEIDENTIFIER,
TABLE1CREATEDATE DATETIME DEFAULT GETDATE(),
TABLE1FIELD1 VARCHAR(200) DEFAULT NEWID(),
TABLE1FIELD2 VARCHAR(200) DEFAULT NEWID(),
TABLE1FIELD3 VARCHAR(200) DEFAULT NEWID(),
TABLE1FIELD4 VARCHAR(200) DEFAULT NEWID(),
TABLE1FIELD5 VARCHAR(200) DEFAULT NEWID(),
TABLE1FIELD6 VARCHAR(200) DEFAULT NEWID(),
TABLE1FIELD7 VARCHAR(200) DEFAULT NEWID(),
TABLE1FIELD8 VARCHAR(200) DEFAULT NEWID(),
TABLE1FIELD9 VARCHAR(200) DEFAULT NEWID(),
TABLE1FIELD10 VARCHAR(200) DEFAULT NEWID(),
TABLE1FIELD11 VARCHAR(200) DEFAULT NEWID(),
TABLE1FIELD12 VARCHAR(200) DEFAULT NEWID(),
TABLE1FIELD13 VARCHAR(200) DEFAULT NEWID(),
TABLE1FIELD14 VARCHAR(200) DEFAULT NEWID(),
TABLE1FIELD15 VARCHAR(200) DEFAULT NEWID()
)

CREATE TABLE TABLE2GUIDTBL
(
ID UNIQUEIDENTIFIER DEFAULT NEWID()PRIMARY KEY NONCLUSTERED,
ACCOUNTGUID UNIQUEIDENTIFIER,
TABLE2CREATEDATE DATETIME DEFAULT GETDATE(),
TABLE2FIELD1 VARCHAR(200) DEFAULT NEWID(),
TABLE2IELD2 VARCHAR(200) DEFAULT NEWID(),
TABLE2FIELD3 VARCHAR(200) DEFAULT NEWID(),
TABLE2FIELD4 VARCHAR(200) DEFAULT NEWID(),
TABLE2FIELD5 VARCHAR(200) DEFAULT NEWID(),
TABLE2FIELD6 VARCHAR(200) DEFAULT NEWID(),
TABLE2FIELD7 VARCHAR(200) DEFAULT NEWID(),
TABLE2FIELD8 VARCHAR(200) DEFAULT NEWID(),
TABLE2FIELD9 VARCHAR(200) DEFAULT NEWID(),
TABLE2FIELD10 VARCHAR(200) DEFAULT NEWID(),
TABLE2FIELD11 VARCHAR(200) DEFAULT NEWID(),
TABLE2FIELD12 VARCHAR(200) DEFAULT NEWID(),
TABLE2FIELD13 VARCHAR(200) DEFAULT NEWID(),
TABLE2FIELD14 VARCHAR(200) DEFAULT NEWID(),
TABLE2FIELD15 VARCHAR(200) DEFAULT NEWID()
)

CREATE TABLE TABLE3GUIDTBL
(
ID UNIQUEIDENTIFIER DEFAULT NEWID()PRIMARY KEY NONCLUSTERED,
ACCOUNTGUID UNIQUEIDENTIFIER,
TABLE3CREATEDATE DATETIME DEFAULT GETDATE(),
TABLE3FIELD1 VARCHAR(200) DEFAULT NEWID(),
TABLE3FIELD2 VARCHAR(200) DEFAULT NEWID(),
TABLE3FIELD3 VARCHAR(200) DEFAULT NEWID(),
TABLE3FIELD4 VARCHAR(200) DEFAULT NEWID(),
TABLE3FIELD5 VARCHAR(200) DEFAULT NEWID(),
TABLE3FIELD6 VARCHAR(200) DEFAULT NEWID(),
TABLE3FIELD7 VARCHAR(200) DEFAULT NEWID(),
TABLE3FIELD8 VARCHAR(200) DEFAULT NEWID(),
TABLE3FIELD9 VARCHAR(200) DEFAULT NEWID(),
TABLE3FIELD10 VARCHAR(200) DEFAULT NEWID(),
TABLE3FIELD11 VARCHAR(200) DEFAULT NEWID(),
TABLE3FIELD12 VARCHAR(200) DEFAULT NEWID(),
TABLE3FIELD13 VARCHAR(200) DEFAULT NEWID(),
TABLE3FIELD14 VARCHAR(200) DEFAULT NEWID(),
TABLE3FIELD15 VARCHAR(200) DEFAULT NEWID()
)

SET @START = GETDATE()
SET @COUNTER = 1
WHILE @COUNTER < 10000
BEGIN

SET @ACCOUNTGUID = NEWID()
INSERT ACCOUNTGUIDTBL(ACCOUNTGUID)VALUES(@ACCOUNTGUID)
INSERT TABLE1GUIDTBL(ACCOUNTGUID)VALUES(@ACCOUNTGUID)
INSERT TABLE2GUIDTBL(ACCOUNTGUID)VALUES(@ACCOUNTGUID)
INSERT TABLE3GUIDTBL(ACCOUNTGUID)VALUES(@ACCOUNTGUID)
SET @COUNTER = @COUNTER + 1
END
SET @END = GETDATE()
SELECT 'INSERT DURATION TIME USING GUID',DATEDIFF(MS,@START,@END)

--End of insert, now do select benchmarking
SET @START = GETDATE()

SELECT *
FROM ACCOUNTGUIDTBL INNER JOIN
TABLE1GUIDTBL ON ACCOUNTGUIDTBL.ACCOUNTGUID = TABLE1GUIDTBL.ACCOUNTGUID INNER JOIN
TABLE2GUIDTBL ON ACCOUNTGUIDTBL.ACCOUNTGUID = TABLE2GUIDTBL.ACCOUNTGUID INNER JOIN
TABLE3GUIDTBL ON ACCOUNTGUIDTBL.ACCOUNTGUID = TABLE3GUIDTBL.ACCOUNTGUID
WHERE (ACCOUNTGUIDTBL.ACCOUNTGUID = @ACCOUNTGUID)
SET @END = GETDATE()
SELECT 'SELECT DURATION TIME USING GUID',DATEDIFF(MS,@START,@END)

-- DROP TABLE ACCOUNTGUIDTBL
-- DROP TABLE TABLE1GUIDTBL
-- DROP TABLE TABLE2GUIDTBL
-- DROP TABLE TABLE3GUIDTBL
-- DROP TABLE ACCOUNTIDTBL
-- DROP TABLE TABLE1IDTBL
-- DROP TABLE TABLE2IDTBL
-- DROP TABLE TABLE3IDTBL

# re: GUID is not Always GOOD !!!!!!! (a true RTFM story)

Monday, June 07, 2004 8:14 AM by Patrick Wellink
Thanks Chris,

Another reason to leave the guid.

GUID is not always bad, sometimes you have to but you need to have a real strong argument to use one.

# Reread post about GUID's

Monday, June 07, 2004 8:23 AM by TrackBack

# re: GUID is not Always GOOD !!!!!!! (a true RTFM story)

Friday, July 23, 2004 11:42 PM by Patrick Wellink
Patrick,

I am about to use GUIDs since I have to support an online/offline briefcase model.

GUID is the only way I can create data in the offline mode without a trip to the server for a PK. The other option is to code a complex "pending" local database to run parallel with the server-approved local database or some-such malarky. Tell me if there are other solutions you have seen that are possible.

(There is enough data on the clients to have necessitated MSDE. I don't plan to use Sql Server's merge replication feature, though I am aware that it works and uses GUIDs. Our system is more real-time when online than merge-replication can offer. I have some data that is read-only for the remote users, and other that they must author -- use GUID only for the data they must author or standardize with GUIDs everywhere? I have control of schema.)

In my case, GUIDs seem the best choice.
Also, the data volume is not large whatsoever.

Any last suggestions, blessings, wardings?

Very good thread; a meaningful debate.

# re: GUID is not Always GOOD !!!!!!! (a true RTFM story)

Friday, July 23, 2004 11:43 PM by Patrick Wellink
Patrick,

I am about to use GUIDs since I have to support an online/offline briefcase model.

GUID is the only way I can create data in the offline mode without a trip to the server for a PK. The other option is to code a complex "pending" local database to run parallel with the server-approved local database or some-such malarky. Tell me if there are other solutions you have seen that are possible.

(There is enough data on the clients to have necessitated MSDE. I don't plan to use Sql Server's merge replication feature, though I am aware that it works and uses GUIDs. Our system is more real-time when online than merge-replication can offer. I have some data that is read-only for the remote users, and other that they must author -- use GUID only for the data they must author or standardize with GUIDs everywhere? I have control of schema.)

In my case, GUIDs seem the best choice.
Also, the data volume is not large whatsoever.

Any last suggestions, blessings, wardings?

Very good thread; a meaningful debate.

# re: GUID is not Always GOOD !!!!!!! (a true RTFM story)

Monday, July 26, 2004 7:33 AM by Patrick Wellink
Well if you feel you have to use GUIDs, use them, But one of the issue's in the debate was, to only use them when you really have to. So the second part of your question is a definite no. (should we use them everywhere)

In the scenario you describe I would probably use a GUID as well.

# re: GUID is not Always GOOD !!!!!!! (a true RTFM story)

Thursday, September 02, 2004 12:22 AM by Patrick Wellink
Hi John,

Try using COMB GUIDs as described in
http://www.informit.com/articles/article.asp?p=25862&redir=1

It may help in increasing performance.

Could you let me know what replication technology you are using if you are not using Merge Replication...just curious.

Niben

# re: GUID is not Always GOOD !!!!!!! (a true RTFM story)

Thursday, September 02, 2004 9:18 AM by Patrick Wellink

Well I had a read of the article, interesting but it is still VERY undesirable to have them as a clustered primary key.

# re: GUID is not Always GOOD !!!!!!! (a true RTFM story)

Thursday, September 02, 2004 6:19 PM by Patrick Wellink
Patrick, why is it undesirable to have as clustered pk?

# re: GUID is not Always GOOD !!!!!!! (a true RTFM story)

Thursday, September 02, 2004 6:59 PM by Patrick Wellink
Also, any thoughts about GUID as non-clustered index. It seems to work pretty well for me.

# re: GUID is not Always GOOD !!!!!!! (a true RTFM story)

Friday, September 03, 2004 9:10 AM by Patrick Wellink
Well Try it yourself....

Create a table with a GUID as a PK
Insert 2 milion records to it in batches of 10.000

Note the time it takes to insert the records.

you will see a real SUBSTANTIAL increase in time.

To get you started here is a sample table :

SET NOCOUNT ON

DECLARE @START DATETIME
DECLARE @END DATETIME
DECLARE @COUNTER INT

CREATE TABLE TESTTABLEGUID
(
ID UNIQUEIDENTIFIER DEFAULT NEWID() PRIMARY KEY,
FIELD1 CHAR(200) DEFAULT NEWID(),
FIELD2 CHAR(200) DEFAULT NEWID(),
FIELD3 CHAR(200) DEFAULT NEWID(),
FIELD4 CHAR(200) DEFAULT NEWID(),
FIELD5 CHAR(200) DEFAULT NEWID(),
FIELD6 CHAR(200) DEFAULT NEWID(),
FIELD7 CHAR(200) DEFAULT NEWID(),
FIELD8 CHAR(200) DEFAULT NEWID(),
FIELD9 CHAR(200) DEFAULT NEWID(),
FIELD10 CHAR(200) DEFAULT NEWID(),
FIELD11 CHAR(200) DEFAULT NEWID(),
FIELD12 CHAR(200) DEFAULT NEWID(),
FIELD13 CHAR(200) DEFAULT NEWID(),
FIELD14 CHAR(200) DEFAULT NEWID(),
FIELD15 VARCHAR(200) DEFAULT NEWID()
)



SET @START = GETDATE()
SET @COUNTER = 1
WHILE @COUNTER < 10000
BEGIN
INSERT TESTTABLEGUID DEFAULT VALUES
SET @COUNTER = @COUNTER + 1
END
SET @END = GETDATE()
SELECT 'DURATION GUID',DATEDIFF(MS,@START,@END)
DROP TABLE TESTTABLEGUID


# re: GUID is not Always GOOD !!!!!!! (a true RTFM story)

Friday, September 03, 2004 9:12 AM by Patrick Wellink
Oh and I forgot to tell yopu to try it with an ID as a PK too, you will see there is no performance loss at all....

# OH MAN .......A SECRET ABOUT GUID

Wednesday, December 29, 2004 4:26 PM by Patrick Wellink
GUID's are life savers.

1.RAM is gonna be cheap

2.Harddisk is gonna be cheap.

******

we will be fired to write a "spagetti code".

WE NEED INTEGRITY

:=)

Code Name: write less ,DO MORE

# re: GUID is not Always GOOD !!!!!!! (a true RTFM story)

Monday, January 03, 2005 8:56 AM by Patrick Wellink
Well always nice to know that you know it better than the SQL-Server Books online....

Guess you never really programmed a big database...

And you mention ram and HD is gonna be cheap....

But you forget to mention the most important thing....

P E R F O R M A N C E

And let me tell you that changing a table here and there is far cheaper than buying a new Server.
Furthermore, you can't win the performance you loose with hardware if the table design is bad.

Faster hardware can increase performance by a factor 2 or so but designing your tables differently can increase performance by 30.000 times....

# re: GUID is not Always GOOD !!!!!!! (a true RTFM story)

Friday, April 29, 2005 3:17 PM by Patrick Wellink

there is a link somewhere on the internet to this discussion....

its here : http://dotnetjunkies.com/weblog/joeolsen/archive/2005/01/12/43722.aspx?Pending=true

# re: GUID is not Always GOOD !!!!!!! (a true RTFM story)

Sunday, May 01, 2005 7:06 PM by Patrick Wellink
Patrick, your point is well taken that GUIDs present a considerable performance issue--and all the cheap RAM and cheap disk space in the world won't help you at I/O time... the real performance hit comes every time you have to suck that ballooned data to or from the disk... all else being equal, it takes twice as long to load 200 bytes of data as it does 100 bytes.

My dilemma: While I am all about carrying the most cost-effective datatypes, I am building a large multi-site application where site autonomy is essential. I have been considering merge replication, but the performance nut in me isn't excited about the cost of repeating GUIDs hundreds of thousands of times over (in foreign key columns, for example).

We need a solution that allows any site to master new records independent of the others, but that will replicate changes to other sites when convenient.

My only thought is to build a mechanism to allocate blocks of integers to the various sites... or acquiesce and use GUIDs...

Even should I do some sophisticated managing of integers, I cannot use out-of-the-box SQL Merge Replication without including the GUID fields in my tables... arrgh!

Any observations on how you'd approach this problem?

# re: GUID is not Always GOOD !!!!!!! (a true RTFM story)

Monday, May 02, 2005 8:20 AM by Patrick Wellink
Well if you really have to do merge replication, adding a GUID is a logical step.

You could indeed have a bigint and then allocate blocsks of integers to site's but i have no experience with that. (sorry)

However, this is just the place where you would introduce a guid.

# re: GUID is not Always GOOD !!!!!!! (a true RTFM story)

Friday, June 03, 2005 7:16 PM by Patrick Wellink
GUIDs are useful if you dont know SQL. Talented db engineers have been doing just fine without them for years. Its really sad to read about how "easy" GUIDs will make db development... and how performance is a non-issue because "RAM is cheap" and space is plenty.

That all may be true... but Im a little pissed that no one seems to care about developing quality code. Anyone can build a crappy database and get by. Using a GUID is one way to ensure that your key is unique... building a sound database system is anotherl.

If its 100 ms faster, then its 100 ms better. Using a GUID over an INT because its easy is like fishing for 3 lb trout with 300 lb test line. Sure, its easier... but wheres the sport? Wheres the style.

Long live the INT!

Suck it.

# re: GUID is not Always GOOD !!!!!!! (a true RTFM story)

Monday, June 13, 2005 10:35 PM by Patrick Wellink
One aspect of a GUID over an INT that I have not seen discussed is the security aspect of a GUID. Having a constantly incrementing key be visible to users, as in a web based scenario where the ID of a record is often passed back and forth, is a security risk. Even if the value is encrypted, just the fact that the key increments and is "guessable" provides a hacker an easier method of determining how to break the key. A random GUID is much much harder to crack.

Just my 2 cents with another piece of info to help people make their decisions.

# re: GUID is not Always GOOD !!!!!!! (a true RTFM story)

Tuesday, June 14, 2005 4:19 PM by Patrick Wellink

What has this to do with good database design....

Where is the leak if you use parametherized queries ???

I dont get it this is only true if you use inline dynamic SQL.....

and if you do that you could as wel program a guid for everything........

Hyperlinks etc could still use GUIDs ..... I never said there shouldn't be any GUIDS.

But NEVER AS A PK.....
AND NEVER with a clusered index !!!!!!!!



# Inserts with guids have become faster... (in SQL2005 that is)

Friday, April 28, 2006 7:07 AM by Patrick Wellink
Ok,
I am not a real fan of the GUID datatyoe in SQL server cause some people will create clustered indexes...

# re: GUID can be good but should never be clustered

Monday, August 07, 2006 10:48 AM by Ross
Though this is an old thread, I am compelled to add to it for anyone that does a web search and reads through attempting to make a decision on GUID vs. INT.

I'd like to summarize all of the above into a simple, concise logic fork as a guideline to GUID vs. INT (disregarding natural keys).

1. For a non-replicated, non-clustered database, a clustered INT PK will most often be the better choice, giving performance and smaller data file sizes.

2. For a replicated, clustered, or otherwise distributed database where synchronization, data merging, etc is needed, a NON-CLUSTERED GUID PK will grant reliable record uniqueness, providing data integrity and simple data merging and updating.

That's it. Apply this simple fork to each table and you will achieve an acceptable solution for your needs.

One cautionary caveat, however: Look to the future. Maybe you requirements now don't include taking a client system off-line to go remote, then come back and merge changes, but what about in a year or two when you want to release version 2? I can say from personal experience that designing and implementing an upgrade strategy to convert your complex INT-based database to be GUID-based to support the data merge is no trivial task.