Unexplained SQL Server 2000 slow execution

Stephen Howe

unread,

Dec 29, 2009, 6:20:18 PM12/29/09

to

Hi

Using SQL Server 2000 Professional. SP4 I think
I have 2 databases, an older version and a newer version with a few modifications
Using Query Analyser with execution plans to analyse what is going on.

One looks like

>>
CREATE TABLE [dbo].[Tbl1] (
[ID1] [int] NOT NULL ,
[Date] [int] NOT NULL ,
[Fld1] [tinyint] NOT NULL
) ON [PRIMARY]

CREATE TABLE [dbo].[Tbl2] (
[ID1] [int] NOT NULL ,
[ID2] [smallint] NOT NULL ,
[Fld1] [int] NOT NULL ,
[Fld2] [smallint] NOT NULL ,
[Fld3] [tinyint] NOT NULL ,
[Fld4] [int] NOT NULL
) ON [PRIMARY]

CREATE TABLE [dbo].[Tbl3] (
[ID2] [smallint] NOT NULL ,
[Fld1] [int] NOT NULL ,
[Fld2] [int] NOT NULL ,
[Fld3] [tinyint] NOT NULL ,
[Desc] [char] (50) COLLATE SQL_Latin1_General_CP1_CI_AS NOT NULL ,
[Fld4] [smallint] NOT NULL ,
[Fld5] [smallint] NOT NULL
) ON [PRIMARY]

ALTER TABLE [dbo].[Tbl1] WITH NOCHECK ADD
CONSTRAINT [PK_Tbl1] PRIMARY KEY CLUSTERED
( [ID1]
) WITH FILLFACTOR = 90 ON [PRIMARY]

ALTER TABLE [dbo].[Tbl2] WITH NOCHECK ADD
CONSTRAINT [PK_Tbl2] PRIMARY KEY CLUSTERED
( [ID1],
[ID2],
[Fld1],
[Fld2],
[Fld3]
) WITH FILLFACTOR = 90 ON [PRIMARY]

ALTER TABLE [dbo].[Tbl3] WITH NOCHECK ADD
CONSTRAINT [PK_Tbl3l] PRIMARY KEY CLUSTERED
( [ID2]
) WITH FILLFACTOR = 90 ON [PRIMARY]

CREATE INDEX [IX_Tbl1] ON [dbo].[Tbl1]([Date]) WITH FILLFACTOR = 90 ON [PRIMARY]

CREATE INDEX [IX_Tbl2] ON [dbo].[Tbl2]([Fld4]) WITH FILLFACTOR = 90 ON [PRIMARY]

ALTER TABLE [dbo].[Tbl3] ADD
CONSTRAINT [IX_Tbl3] UNIQUE NONCLUSTERED
( [Fld1],
[Fld2],
[Fld3]
) WITH FILLFACTOR = 90 ON [PRIMARY]
>>

and when I do

SELECT DISTINCT b.* FROM Tbl3 b
JOIN Tbl2 s ON s.ID2=b.ID2
JOIN Tbl1 sl ON s.ID1=sl.ID1
WHERE [date]= 40159
ORDER BY b.Fld1,b.Fld2,b.Fld3

it is fast. I see in the execution plan that the joins are 2 Nested Loops.
Tbl1 fetches back 31035 rows. An index seek has been used and IX_Tbl1 has been used.
Time taken is about 28 seconds.

However, a newer version of the database looks like

>>
CREATE TABLE [dbo].[Tbl1] (
[ID1] [int] NOT NULL ,
[Date] [int] NOT NULL ,
[Fld1] [smallint] NOT NULL
) ON [PRIMARY]

CREATE TABLE [dbo].[Tbl2] (
[ID1] [int] NOT NULL ,
[ID2] [smallint] NOT NULL ,
[Fld1] [int] NOT NULL ,
[Fld2] [tinyint] NOT NULL ,
[Fld3] [tinyint] NOT NULL
) ON [PRIMARY]

CREATE TABLE [dbo].[Tbl3] (
[ID2] [smallint] NOT NULL ,
[Fld1] [int] NOT NULL ,
[Fld2] [int] NOT NULL ,
[Fld3] [int] NOT NULL ,
[Fld4] [tinyint] NOT NULL ,
[Desc] [char] (50) COLLATE SQL_Latin1_General_CP1_CI_AS NOT NULL ,
[Fld5] [int] NOT NULL
) ON [PRIMARY]

ALTER TABLE [dbo].[Tbl1] WITH NOCHECK ADD
CONSTRAINT [PK_Tbl1] PRIMARY KEY CLUSTERED
( [ID1]
) WITH FILLFACTOR = 90 ON [PRIMARY]

ALTER TABLE [dbo].[Tbl2] WITH NOCHECK ADD
CONSTRAINT [PK_Tbl2] PRIMARY KEY CLUSTERED
( [ID1],
[ID2],
[Fld1],
[Fld2],
[Fld3]
) WITH FILLFACTOR = 90 ON [PRIMARY]

ALTER TABLE [dbo].[Tbl3l] WITH NOCHECK ADD
CONSTRAINT [PK_Tbl3l] PRIMARY KEY CLUSTERED
( [ID2]
) WITH FILLFACTOR = 90 ON [PRIMARY]

CREATE INDEX [IX_Tbl1] ON [dbo].[Tbl1]([Date]) WITH FILLFACTOR = 90 ON [PRIMARY]

CREATE INDEX [IX_Tbl2] ON [dbo].[Tbl2]([Fld5]) WITH FILLFACTOR = 90 ON [PRIMARY]

ALTER TABLE [dbo].[Tbl3] ADD
CONSTRAINT [IX_Tbl3] UNIQUE NONCLUSTERED
( [Fld1],
[Fld2],
[Fld3],
[Fld4]
) WITH FILLFACTOR = 90 ON [PRIMARY]
>>

but now when I do

SELECT DISTINCT b.* FROM Tbl3 b
JOIN Tbl2 s ON s.ID2=b.ID2
JOIN Tbl1 sl ON s.ID1=sl.ID1
WHERE [date]= 40159
ORDER BY b.Fld1,b.Fld2,b.Fld3,b.Fld4

there are still 2 Nested loops
Tbl1 fetches back 40,336,827 rows.
It is using IX_Tbl1. Time taken is 3 minutes 37 seconds.
Why is this so much slower than the 1st database?

If I do

SELECT COUNT(*) FROM Tbl1 WHERE [Date]=40159

I see 14604 rows. So why does QA fetch back 40,336,827 rows ???

Something strange is going on but I cant see what.

Thanks

Stephen Howe

Uri Dimant

unread,

Dec 30, 2009, 3:43:56 AM12/30/09

to

Are you sure that both databases are identical I mean in terms of the data?

"Stephen Howe" <sjhoweATdialDOTpipexDOTcom> wrote in message
news:151lj5dft391j4s1a...@4ax.com...

Stephen Howe

unread,

Dec 30, 2009, 7:12:33 AM12/30/09

to

On Wed, 30 Dec 2009 10:43:56 +0200, "Uri Dimant" <ur...@iscar.co.il> wrote:

>Are you sure that both databases are identical I mean in terms of the data?

Hi Uri

They are not identical. They are nearly identical.

Each database has data from an industry-standard body in the UK, it is just that database 1 is the old contract, database 2 is
the new contract and at the moment, parallel runs of data for November-December are being done.
Eventually the old contract data will cease on 31st December 2009, the new contract data will start 1st January 2010

But both have the same volume of data, those tables I quoted have a few extra non-Key fields I omitted (all are tinyint's,
smallint's or integer's). I dont understand the difference at the moment.

FWIW, I have been using SQL Server 2000 for 5 years, our IT department is due to move the servers onto SQL Server 2005.
Cant happen soon enough.

In those 5 years, SQL Server 2000 has proved reliable except on 1 occasion.
I had a triple-JOIN which normally takes 8 seconds to produce 1 day of data and it ballooned up into 68 minutes.
QA showed me that due to a long IN statement in a WHERE clause, the Optimiser had chosen, inappropriately, to use LOOP joins.
200 million rows were considered. I forced a MERGE JOIN through a hint and 68 minutes became 8 seconds again.
This is the only time SQL Server 2000 got it wrong.

Now I am wondering if I am facing the same type of issue.

Thanks

Stephen Howe

Gert-Jan Strik

unread,

Dec 30, 2009, 7:23:03 AM12/30/09

to

Stephen,

There are two obvious query plans for this statement:
1. starting with Tbl3
2. starting with Tbl1

The Tbl3 in your second script is wider, which makes it more expensive
to access (more estimated I/O). Therefore, the optimizer may have
decided to switch from approach [1] to approach [2] or vice versa.

Maybe you should make sure your statistics are up to date (when in
doubt, run UPDATE STATISTICS ... WITH FULLSCAN).

Also, you should consider creating an index on Tbl2(ID2)

--
Gert-Jan
SQL Server MVP

Erland Sommarskog

unread,

Dec 30, 2009, 5:57:09 PM12/30/09

to

Stephen Howe (sjhoweATdialDOTpipexDOTcom) writes:
> it is fast. I see in the execution plan that the joins are 2 Nested
> Loops. Tbl1 fetches back 31035 rows. An index seek has been used and
> IX_Tbl1 has been used.
> Time taken is about 28 seconds.

> but now when I do

>...

>
> there are still 2 Nested loops
> Tbl1 fetches back 40,336,827 rows.
> It is using IX_Tbl1. Time taken is 3 minutes 37 seconds.
> Why is this so much slower than the 1st database?
>
> If I do
>
> SELECT COUNT(*) FROM Tbl1 WHERE [Date]=40159
>
> I see 14604 rows. So why does QA fetch back 40,336,827 rows ???
>
> Something strange is going on but I cant see what.

Obviously, the more interesting question is why you get so many rows
back. Execution time is secondary. Returning 40 millions rows is not
done in an instant.

Since I don't know the tables, I can't tell why you get differnet
results, but a start is to run:

SELECT COUNT(*) FROM Tbl3 b

JOIN Tbl2 s ON s.ID2=b.ID2
JOIN Tbl1 sl ON s.ID1=sl.ID1
WHERE [date]= 40159

in both databases. Then run SELECT COUNT(*) FROM Tbl3 and so on.

--
Erland Sommarskog, SQL Server MVP, esq...@sommarskog.se

Links for SQL Server Books Online:
SQL 2008: http://msdn.microsoft.com/en-us/sqlserver/cc514207.aspx
SQL 2005: http://msdn.microsoft.com/en-us/sqlserver/bb895970.aspx
SQL 2000: http://www.microsoft.com/sql/prodinfo/previousversions/books.mspx

Stephen Howe

unread,

Dec 31, 2009, 7:29:34 AM12/31/09

to

On Wed, 30 Dec 2009 22:57:09 +0000 (UTC), Erland Sommarskog <esq...@sommarskog.se> wrote:

> SELECT COUNT(*) FROM Tbl3 b
> JOIN Tbl2 s ON s.ID2=b.ID2
> JOIN Tbl1 sl ON s.ID1=sl.ID1
> WHERE [date]= 40159

Hi Erland

On database 1 that returns 229506 rows.
On database 2 that returns 103448 rows.

> SELECT COUNT(*) FROM Tbl1 WHERE [date]= 40159

On database 1 that returns 31035 rows.
On database 2 that returns 14604 rows.

> SELECT COUNT(*) FROM Tbl2 WHERE ID1 IN (SELECT ID1 FROM Tbl1 WHERE [date]= 40159)

On database 1 that returns 229506 rows.
On database 2 that returns 103449 rows.

> SELECT COUNT(*) FROM Tbl3

On database 1 that returns 2212 rows.
On database 2 that returns 3073 rows.

The funny thing is when I execute both queries together in QA, it says

Query 1: Query cost (relative to the batch) : 1.20% (Database 1)
Query 2: Query cost (relative to the batch) : 98.80% (Database 2)

I would expect cost to be closer to 50%-50%, not weighted in this manner.
Where things really differ on looking at the execution plan is Tbl2.

For Query 1, it is doing a Clustered Index Seek for Tbl2 and a LOOP JOIN with Tbl1, only getting back records where ID1 is the
same.
For Query 2, it is doing a Clustered Index Scan for Table 2 and a MERGE JOIN with Tbl1.

The problem seems to be Tbl2.
For Query 2, 91% of the cost is spread between the Clustered Index Scan of Tbl2 (65%) and the MERGE JOIN (26%).
For Query 1, 21% of the cost is spread between the Clustered Index Seek of Tbl2 (18%) and the LOOP JOIN (3%).

Cheers and thanks for help (everyone)

Stephen Howe

Gert-Jan Strik

unread,

Dec 31, 2009, 8:08:59 AM12/31/09

to

Stephen Howe wrote:
> The problem seems to be Tbl2.

Right, that is why I advised you to create an index on Tbl2(ID2).
Currently, the optimizer doesn't have a very efficient way of going from
Tbl3 to Tbl2, short of an index scan (which is what you currently get).

Stephen Howe

unread,

Dec 31, 2009, 8:45:29 AM12/31/09

to

Hi Gert

But it is part of the Primary Key, even if the second part.
I see what you are saying.
QA reveals that for Query 2 (Database 2), 91% of the cost is spent joining Tbl1 and Tbl2, only 1% is spent JOINing and reading
Tbl3.

I have just added this. Even stranger.
Query 2 is now using this index to Index Scan read Tbl2. That does not makes sense as only ID1 links Tbl1 and Tbl2

Stephen

Gert-Jan Strik

unread,

Dec 31, 2009, 9:55:01 AM12/31/09

to

Stephen Howe wrote:
>
> On Thu, 31 Dec 2009 14:08:59 +0100, Gert-Jan Strik <sorrytoomuc...@xs4all.nl> wrote:
>
> >Stephen Howe wrote:
> >> The problem seems to be Tbl2.
> >
> >Right, that is why I advised you to create an index on Tbl2(ID2).
> >Currently, the optimizer doesn't have a very efficient way of going from
> >Tbl3 to Tbl2, short of an index scan (which is what you currently get).
>
> Hi Gert
>
> But it is part of the Primary Key, even if the second part.

Yes, but order in an index is very important. The storage engine cannot
seek on the second column (of the first column's value is unknown or
undetermined).

> I see what you are saying.
> QA reveals that for Query 2 (Database 2), 91% of the cost is spent joining Tbl1 and Tbl2, only 1% is spent JOINing and reading
> Tbl3.
>
> I have just added this. Even stranger.
> Query 2 is now using this index to Index Scan read Tbl2. That does not makes sense as only ID1 links Tbl1 and Tbl2
>
> Stephen

Every nonclustered index uses the clustered index key to refer to the
actual row. In other words, the clustered index key is included in each
nonclustered index. So for your query - since it uses only clustered
index columns - any nonclustered index on Tbl2 will cover your query for
Tbl2.

So apparently, the optimizer still chose to scan the index (instead of
seek it). This implies that the optimizer thinks that there are so many
matches between Tbl2 and Tbl1 that it is the best approach to process
all rows (of Tbl2). Apparently, your query is not very selective.

Are you sure you want this query's result? Anyway, maybe you get better
performance if you rewrite it to:

SELECT *
FROM Tbl3 b
WHERE EXISTS (
SELECT *
FROM Tbl2 s

JOIN Tbl1 sl ON s.ID1=sl.ID1

WHERE s.ID2=b.ID2
AND [date]= 40159
)
ORDER BY b.Fld1,b.Fld2,b.Fld3,b.Fld4

Stephen Howe

unread,

Dec 31, 2009, 11:11:40 AM12/31/09

to

>So apparently, the optimizer still chose to scan the index (instead of
>seek it). This implies that the optimizer thinks that there are so many
>matches between Tbl2 and Tbl1 that it is the best approach to process
>all rows (of Tbl2). Apparently, your query is not very selective.
>
>Are you sure you want this query's result? Anyway, maybe you get better
>performance if you rewrite it to:
>
> SELECT *
> FROM Tbl3 b
> WHERE EXISTS (
> SELECT *
> FROM Tbl2 s
> JOIN Tbl1 sl ON s.ID1=sl.ID1
> WHERE s.ID2=b.ID2
> AND [date]= 40159
> )
> ORDER BY b.Fld1,b.Fld2,b.Fld3,b.Fld4

I did rewrite.

What I dont understand is that for this day, Database 1 has twice the numbers of row on Tbl1 and Tbl2 than Database 2, yet when
I execute both queries one after another in QA, Database 1 takes 1.63% of the cost, Database 2 takes 98.37% of the time.

The database that has twice the data, takes 1.63% of the cost.
And the optimiser does a Clustered Index Seek on Tbl2.

All very strange.

Stephen Howe

Erland Sommarskog

unread,

Dec 31, 2009, 2:49:30 PM12/31/09

to

Stephen Howe (sjhoweATdialDOTpipexDOTcom) writes:
> On Wed, 30 Dec 2009 22:57:09 +0000 (UTC), Erland Sommarskog
<esq...@sommarskog.se> wrote:
>
>> SELECT COUNT(*) FROM Tbl3 b
>> JOIN Tbl2 s ON s.ID2=b.ID2
>> JOIN Tbl1 sl ON s.ID1=sl.ID1
>> WHERE [date]= 40159
>
> Hi Erland
>
> On database 1 that returns 229506 rows.
> On database 2 that returns 103448 rows.

But in your previous post, you said the query returned 40 million rows?

Looking back, I see now that you said: "Tbl1 fetches back 40,336,827 rows."
Did you mean to say that the estimated number of rows from tbl1 is
40 million?

I'm sorry if I'm picky, but it's difficult to assist if I get incorrect
data to work with.

>> SELECT COUNT(*) FROM Tbl1 WHERE [date]= 40159
>
> On database 1 that returns 31035 rows.
> On database 2 that returns 14604 rows.

>...

>> SELECT COUNT(*) FROM Tbl3
>
> On database 1 that returns 2212 rows.
> On database 2 that returns 3073 rows.

OK, it seems best to start with this table, provided that there is a
good index, and I see that you added one on Gert-Jan's suggestion.

> Query 2 is now using this index to Index Scan read Tbl2. That does not
> makes sense as only ID1 links Tbl1 and Tbl2

If it's part of a merge or hash join it makes sense.

> The funny thing is when I execute both queries together in QA, it says
>
> Query 1: Query cost (relative to the batch) : 1.20% (Database 1)
> Query 2: Query cost (relative to the batch) : 98.80% (Database 2)

Don't pay much attention to these percentages. They are estimates and
are often out of touch.

If you want to compare execition times, do this:

DECLARE @d datetime
SELECT @d = getdate()
SELECT .... INTO #temp <yourquery>
PRINT 'This query took ' + lrtim(str(datediff(ms, @d, getdate()))) + ' ms.'
DROP TABLE #temp

By selecting into temp table, you factor out the time it takes for
the client to receive the data.