Wilson Kutegeka

Microsoft MVP - Visual Basic www.clinicmaster.net

Good Database Table Design Practice– Managing table unique identifier (Primary key)

Every table should have a primary key, that’s according to First Normal Form which dictates that all rows in a table should be uniquely identified.

MS SQL Server allows us to define a numeric column as an IDENTITY column, which automatically generates a unique value for each row. Alternatively, one can use NEWID() to generate a random unique value for each row. These types of values, when used as keys, are what are known as Surrogate Keys.

A Surrogate Key in a database is considered to be a unique identifier for a row in the table that is not derived from the data itself. Most times, these keys exist for internal use only as permanent record identifiers and shouldn't be shown to users.

Now consider a situation where by our customers or the end users of our applications are accustomed to referring to Orders by a unique reference number such as ‘1234’. Thus acclimated, they start to expect that the next invoice created will be ‘1235’. If the order jumps to ‘1238’, for example, they start to worry as to what happened to '1236' and ‘1237’. Next thing you know, they're going to ask us to fill in the sequence gaps. Oooch... now they want to append the reference number with the last two digits of year of the order (‘07-1237’) and also suffix it with the customer initials (‘07-1237-KW’). They go on and on. Definitely surrogate key can’t work here.

Perhaps one can think of avoiding surrogate keys by using composite keys (multi column primary keys). Consider a situation where by you have a table Visits that has columns Visit Number and Patient Number as composite keys with many objects referencing it (Visits). Well referencing Visits through Visit Number and Patient Number is just pain for me. The truth is composite keys make any kind of Object/Relational mapping and persistence in general harder.  Life is so much easier with surrogate keys compared to composite keys.

The rule of thumb I use is simple, each of your tables should have a natural key that means something to the user, and can uniquely identify each row in the table. At best, the key should be automatically generated and may be replaced by the user. e.g. if Patient Number is ’07-1234-JK’ table Visits could have Visit numbers such as ’07-1234-JK-001’, ’07-1234-JK-002’ and ’07-1234-JK-003’, Visit Number being the only primary key for Visits and automatically generated.

To accomplish the above,

  1.  Add an int column to your table that will store the incremental number.
  2.  Add a Natural Key column, preferably a varchar.

i.e. for Visits,  add columns among others the following

  • VisitID (preferably of int datatype). I call this a helper column that’s internal
  • VisitNo (preferably of varchar(20) datatype). This is the primary key
  • PatientNo
  • etc

There after determine automatically the next VisitID, generate the VisitNo by padding the VisitID with the padding length of your choice, concatenate it with PatientNo, and present it to the user, and if the user wants, she can replace it.

In my applications, I have a table that manages all Auto Numbers; a couple of functions and at times triggers to ensure that helper’s column sequence is not altered.

Create table code

create table AutoNumbers
(ObjectName varchar(40) not null
constraint fkObjectNameAutoNumbers references AccessObjects(ObjectName),
AutoColumnName varchar(60) not null ,
constraint pkObjectNameAutoColumnName primary key(ObjectName,AutoColumnName),
HelperColumnName varchar(60) not null,
AutoColumnLEN tinyint not null,
PaddingCHAR char(1) not null,
PaddingLEN tinyint not null ,
SeparatorCHAR char(1) not null,
SeparatorPositions varchar(20), -- coma separated
StartValue int not null,
Increment smallint not null constraint ckIncrement check (Increment >= 0),
AllowJumpTo bit not null constraint dfAllowJumpTo default 0,
JumpToValue int not null constraint dfJumpToValue default 0
)
go

Explanation

ObjectName:- referrers to name of table to have the natural key e.g. Visits
AutoColumnName:- the automatically generated natural key e.g VisitNo
HelperColumnName:- the helper column that stores the increment e.g VisitID.
AutoColumnLEN:- key column length e.g. 11 for ’07-1234-JK-003’ (- not stored)
PaddingCHAR:- the padding character e.g 0  as seen in 003 last above
PaddingLEN:- padding length e.g 3 that leads to 003 in our example
SeparatorCHAR:- separator character such as ‘-’ that is inserted for better viewing
SeparatorPositions:- positions where the separator character will be displayed e.g at ‘2,7,10’ in our example
StartValue:- start value say 1 to start at 001
Increment:- The increment value say 1
AllowJumpTo:- If set to true, the Auto Number can jump to the jump to value if greater than Start Value.
JumpToValue:- The value to jump to if allow jump to value is set to true

The following is a SQL function that will help to retrieve the next ID for your Natural key column.

--------Function Get Next AutoNumber--------------------------------------------

if exists (select * from sysobjects where name = 'GetNextAutoNumber')
 drop function GetNextAutoNumber
go

create function GetNextAutoNumber(@ObjectName as varchar(40), @AutoColumnName varchar(60), @ID int) returns int
with encryption as

begin
 declare @NewID int
 declare @SeparatorPositions varchar(20)
 declare @StartValue int
 declare @Increment smallint
 declare @AllowJumpTo bit
 declare @JumpToValue int

 set @SeparatorPositions = (select SeparatorPositions from AutoNumbers
       where ObjectName = @ObjectName and AutoColumnName = @AutoColumnName)

 set @StartValue = (select StartValue from AutoNumbers
      where ObjectName = @ObjectName and AutoColumnName = @AutoColumnName)

 set @Increment = (select Increment from AutoNumbers
     where ObjectName = @ObjectName and AutoColumnName = @AutoColumnName)

 set @AllowJumpTo = (select AllowJumpTo from AutoNumbers
     where ObjectName = @ObjectName and AutoColumnName = @AutoColumnName)

 set @JumpToValue = (select JumpToValue from AutoNumbers
     where ObjectName = @ObjectName and AutoColumnName = @AutoColumnName)

 if (@AllowJumpTo = 1) and (@JumpToValue > @ID) and (@JumpToValue > @StartValue)
 begin
  set @NewID = @JumpToValue + @Increment
 end
 else
 begin
  set @NewID = isnull(@ID, @StartValue) + @Increment
 end
 
return @NewID

end
go

 When Inserting Visits, you can have edit your InsertVisits stored procedure to look as follows

create proc uspInsertVisits(
@VisitNo varchar(20),
@PatientNo varchar(20),
….
)as

declare @VisitID int


begin


set @VisitID = (select max(VisitID) from Visits where PatientNo = @PatientNo)
set @VisitID = dbo.GetNextAutoNumber('Visits', 'VisitNo', @VisitID)

insert into Visits
(VisitID ,VisitNo ,PatientNo …)
 values
(@VisitID ,@VisitNo ,@PatientNo…)

return 0
end
go


The function below will submit the Next Auto Number to your code that you can pad, format, merge with Patient No and display it in the entry box for the user to accept or replace.

create proc uspGetNextVisitID(
@PatientNo varchar(20) ,
@VisitID int = null output
)as

set @VisitID = (select max(VisitID) from Visits where PatientNo = @PatientNo)
set @VisitID = dbo.GetNextAutoNumber('Visits', 'VisitNo', @VisitID)

return 0

Let me know what you think

Comments

Ramon Smits said:

I've learned it as having a technical primary key and a functional primary key. Both can be the same but most of the time they do not as having a functional key as these are often bad for performance.

I would not store such sequence logic in the database any more as this is business logic and most applications these days have that available in a different tier.

# April 22, 2009 11:10 AM

Wilson Kutegeka said:

I am not very sure if I understand your comment very well, but yes, it depends upon the architecture.

If one of the condition(s) in my architecture is that I only connect to and disconnect from the DB only when I want to save new or update/delete only what I’ve retried i.e. not first present the 40 million records to my business layer to merely update or add one entry. I don’t see my self not putting the sequence in the DB

And well performance can be an issue at time but with so much processing power available on our servers nowadays, elegancy will just override it (to me anyway).

Otherwise I appreciate your comment

# April 22, 2009 4:26 PM
Leave a Comment

(required) 

(required) 

(optional)

(required) 


Please add 4 and 5 and type the answer here: