Bulk delete log entries from a database table
At my current assignment, we have a number of applications that log extensively into a SQL server 2000 database table. These applications log thousands of rows per day, so you can imagine that this table contains huge numbers of rows after only a few weeks.
We were asked to come up with a clean up procedure that will clean the log entries without locking to many rows at a time. We came up with a stored procedure which will be scheduled to run somewhere around midnight, when no one is working here. The table itself could be declared as follows:
CREATE TABLE [dbo].[LogInformation] (
[LogInformationId] [bigint] IDENTITY (1, 1) NOT NULL ,
[LogDate] [datetime] NOT NULL ,
[LogText] [varbinary] (50) NOT NULL
) ON [PRIMARY]
GO
ALTER TABLE [dbo].[LogInformation] WITH NOCHECK ADD
CONSTRAINT [PK_LogInformation] PRIMARY KEY CLUSTERED
(
[LogInformationId]
) ON [PRIMARY]
GO
ALTER TABLE [dbo].[LogInformation] WITH NOCHECK ADD
CONSTRAINT [DF_LogInformation_LogDate] DEFAULT (getdate()) FOR [LogDate]
GO
There is more information in the table, but I stripped it to what is used in the stored procedure we created. The following code shows this stored procedure:
CREATE PROCEDURE CleanLog
@daysToKeep int = 30, -- Leave 30 days of log untouched
@maxTotRows int = 100000, -- Delete no more than 100000 rows in total
@maxBatchRows int = 1000 -- Delete max. 1000 rows each batch
AS
BEGIN
DECLARE @batchRows int -- Deleted number of rows in current batch
DECLARE @maxID int -- Maximum PK-id to be deleted in the LogTable
-- Get the biggest PK-id within the criteria
SELECT @maxid = Max(LogInformationId)
FROM LogInformation (nolock)
WHERE LogDate > dateadd(day, @daysToKeep * -1, GetDate())
IF (NOT @maxID is NULL)
BEGIN
SET ROWCOUNT @maxBatchRows -- delimit rows per Batch
SET @batchRows = 1
-- keep deleting rows until: there are no more rows to delete OR
-- the maximum number of deleted rows is reached.
WHILE (@batchRows > 0) AND (@maxTotRows > 0)
BEGIN
IF (@maxTotRows > @maxBatchRows)
BEGIN
SET ROWCOUNT @maxTotRows
END
DELETE FROM LogInformation
WHERE LogInformationId <= @maxID
SET @batchRows = @@rowcount
SET @maxTotRows = @maxTotRows - @batchRows
END
END
END
It works fine, and as far as we can tell, the impact on database performance while the procedure is running is minimal. We came up with this solution because this way we don't need temp tables or cursors, and we do not create giant transactions that lock thousands of rows.
But maybe somebody out there has a better solution?