Appropriate Query Optimization Technique(s) Sought for the Following Case

Don Li

unread,

Nov 24, 2007, 9:22:13 PM11/24/07

to

Hi,

Env: MS SQL Server 2000

DB Info (sorry no DDL nor sample data):
tblA has 147249 rows -- clustered index on pk (one key of
datatype(int)) and
has two clumns, both are being used in joins;

intersecTbl4AB has 207016 rows -- clustered index on two fks and
this intersection table has six colums but only the two fks are being
used for joins;

tblB has 117597 rows -- clustered index on pk (one key of
datatype(bigint)) and
has four columns but only its key are being used for joins

A complex query involving the above the three tables includes inner
and outer joins, aggregate, sorting, predicate, math function, derived
table etc.
On the first run, the query takes about 4200ms to finish;
after some research on index optimization, I provided some index hint,
then the query runs at
about 3000ms. (That was yesterday).
Just now I realized that MS SQL Server 2000 is quite "intelligent", I
think it saves search terms
into cache because the second time search of the a same term is much
much faster. Now, a couple of questions:

a) if I construct a long long list of "common" terms and
programmatically let the sql server to cache them would it speed up
the overal query performance in my case? (Or it may depend on the
quality of the "common" terms?) and if your answer is yes (supposedly
you've been there, do you have to know where I could find such a
"common" term list, for everyday life or the general public?)

b) what other techniques out there to speed up the above described
query? Bring it down to 1000ms would be most desirable.

Thanks.

Erland Sommarskog

unread,

Nov 25, 2007, 4:20:09 AM11/25/07

to

Don Li (tatat...@gmail.com) writes:
> On the first run, the query takes about 4200ms to finish;
> after some research on index optimization, I provided some index hint,
> then the query runs at
> about 3000ms. (That was yesterday).
> Just now I realized that MS SQL Server 2000 is quite "intelligent", I
> think it saves search terms
> into cache because the second time search of the a same term is much
> much faster. Now, a couple of questions:

SQL Server caches:

1) Execution plans.
2) Data.

It does not cache invidiual search terms if that is what you have in
mind. If you submit one query, and then tweak that just a little bit,
you will have no benefit of that the plan for the first query is in
cache.

However, you will have benefit of that first query dragged data into
the cache. And this is something you need to watch out, as it may
distort your measurement of execution time. If you need to press the
execution time to 1000 ms, you must first determine whether that is
1000 ms with the data in cache, or with the data on disk.

If you think it's safe to assume that the table will almost always be
in cache run the same query several times, to make sure that nothing is
read from disk, and discard the first measurement.

If you what to measure worst case, issue DBCC DROPCLEANBUFFERS between
each run to flush the cache. (But don't do this on a production server!)

> b) what other techniques out there to speed up the above described
> query? Bring it down to 1000ms would be most desirable.

Without knowledge about the tables and the query I cannot but decline.

--
Erland Sommarskog, SQL Server MVP, esq...@sommarskog.se

Books Online for SQL Server 2005 at
http://www.microsoft.com/technet/prodtechnol/sql/2005/downloads/books.mspx
Books Online for SQL Server 2000 at
http://www.microsoft.com/sql/prodinfo/previousversions/books.mspx

Gert-Jan Strik

unread,

Nov 25, 2007, 7:51:24 AM11/25/07

to

Don,

For performance questions, the database structure is very important.
With the information you provided, it is hard to say if all relevant
information is posted.

So I can only post a few generic recommentations here:
- make sure you have a properly defined foreign key between
intersecTbl4AB and tblB. If no columns are selected from tblB, and no
filtering is done based on tblB, then this foreign key constraint can
eliminate this table's access for this query

- if only the primary key columns are used in intersecTbl4AB, then make
sure this table has nonclustered indexes for both column orders. IOW,
try adding an index on intersecTbl4AB(fkB, fkA). Also, add a
nonclustered index on top of the clustered index. IOW, try adding a
nonclustered index on intersecTbl4AB(fkA, fkB). You can later drop the
indexes that the query does not use (check the query plan for that)

- avoid expressions in the join predicates, and use only columns
(whenever possible) without calculations. Make sure the columns are of
the same data type (and length)

- avoid scalar UDFs

HTH,
Gert-Jan

Don Li

unread,

Nov 25, 2007, 9:21:06 PM11/25/07

to

Thank you, Gert-Jan and Erland, I did the following for

>make
> sure this table has nonclustered indexes for both column orders.

for my intersecTbl4AB. Also, I found the data type mismatch for a
column, which was odd... fixed.
Now, the query runs like a rocket :)

Once again I greatly appreciate it, you both are the men of MS SQL
Server xxxx :)

Don

> > Thanks.- Hide quoted text -
>
> - Show quoted text -