High concurrency SELECT / UPDATE procedure

Mau C

unread,

Feb 10, 2011, 5:15:41 AM2/10/11

to

Hi,
I'm messing with this scenario:
- SQL Server 2005 (it should work on 2008 too)
- high concurrency jdbc client connections.

The code below is a "simplified" exctract of the real application to
make easier the example which is like a job scheduler or, if you prefer,
a "process starter" :

Requirements (steps)
1) it takes the first 10 available jobs (row=job)
2) it marks "taken" those rows and pass them to the external jdbc client
(as resultset)
3) the external client processes those jobs and, when each job is
completed, deletes the corresponding row.

The SQL code below refers to the 1) and 2) step only:

CREATE TABLE [tmp_table] (
id int IDENTITY(1,1) NOT NULL,
field1 varchar(45) NOT NULL DEFAULT '',
field2 varchar(45) NOT NULL DEFAULT '',
status smallint NOT NULL DEFAULT '0', -- 0 free, 1 busy
PRIMARY KEY (id)
)

PROCEDURE [pSelectTmpTable]
AS
BEGIN
SET NOCOUNT ON;
BEGIN TRANSACTION;
SELECT TOP 10 * FROM [tmp_table] WITH (XLOCK,ROWLOCK) WHERE status=0;
UPDATE [tmp_table] SET status=1 WHERE id IN (SELECT TOP 10 id FROM
[tmp_table] WHERE status=0);
COMMIT TRANSACTION;
COMMIT;
END

Does it the right way to manipulate records in high concurrency context ?
Do particularly the clauses (XLOCK,ROWLOCK) properly work under those
requirements ?

Thanks in advance
Best regards,
M.

Erland Sommarskog

unread,

Feb 11, 2011, 6:04:42 PM2/11/11

to

Mau C (nob...@hotmail.com) writes:
>
> PROCEDURE [pSelectTmpTable]
> AS
> BEGIN
> SET NOCOUNT ON;
> BEGIN TRANSACTION;
> SELECT TOP 10 * FROM [tmp_table] WITH (XLOCK,ROWLOCK) WHERE status=0;
> UPDATE [tmp_table] SET status=1 WHERE id IN (SELECT TOP 10 id FROM
> [tmp_table] WHERE status=0);
> COMMIT TRANSACTION;
> COMMIT;
> END
>
> Does it the right way to manipulate records in high concurrency context ?
> Do particularly the clauses (XLOCK,ROWLOCK) properly work under those
> requirements ?

There is a potential flaw here. Since there is no ORDER BY here, there is
no guarantee that the two SELECT TOP will produce the same rows. So that's
the first thing to fix, and an ORDER BY clause which sorts the rows on
something which is unique.

Then again you could use the TOP and OUTPUT clauses to your advantage:

UPDATE TOP (10) tmp_table
SET status = 1
OUTPUT inserted.col1, inserted.col2, ...
WHERE status = 0

--
Erland Sommarskog, SQL Server MVP, esq...@sommarskog.se

Links for SQL Server Books Online:
SQL 2008: http://msdn.microsoft.com/en-us/sqlserver/cc514207.aspx
SQL 2005: http://msdn.microsoft.com/en-us/sqlserver/bb895970.aspx

Mau C

unread,

Feb 14, 2011, 2:45:40 AM2/14/11

to

Il 12/02/2011 0.04, Erland Sommarskog ha scritto:
> Then again you could use the TOP and OUTPUT clauses to your advantage:
>
> UPDATE TOP (10) tmp_table
> SET status = 1
> OUTPUT inserted.col1, inserted.col2, ...
> WHERE status = 0

I didn't know this clause and it is very useful for my purposes.
Due to the "order by" issue which may cause performance lacks (even if
there's an index), I've decided to modify the architecture as:

1) @my_id = @@SPID
2) UPDATE SET status = @my_id
3) retreive updated rows
4) DELETE WHERE status = @my_id
(in the same stored)

In the previous architecture the DELETE was delayed after each job run,
but I've seen there's no really any reason to do it.

Thank you for you support.

M.

Erland Sommarskog

unread,

Feb 14, 2011, 5:19:02 PM2/14/11

to

Mau C (nob...@hotmail.com) writes:
> I didn't know this clause and it is very useful for my purposes.
> Due to the "order by" issue which may cause performance lacks (even if
> there's an index), I've decided to modify the architecture as:
>
> 1) @my_id = @@SPID
> 2) UPDATE SET status = @my_id
> 3) retreive updated rows
> 4) DELETE WHERE status = @my_id
> (in the same stored)
>
> In the previous architecture the DELETE was delayed after each job run,
> but I've seen there's no really any reason to do it.

Just make sure that you the OUTPUT clause, so that you operate on the
same rows all the way through.

Mau C

unread,

Feb 15, 2011, 2:55:15 AM2/15/11

to

Il 14/02/2011 23.19, Erland Sommarskog ha scritto:
> Just make sure that you the OUTPUT clause, so that you operate on the
> same rows all the way through.

With OUTPUT clause the UPDATE statement becomes useless. I need to
extract the TOP(@BufferSize) rows/jobs from the table and delete them...
So DELETE .. OUTPUT deleted.col1, deleted.col2,... works out properly as
I wanted.
Preliminary tests give good results.

Regards,
M.