We're having the general problem that -- in one customer's database --
the performance of the queries varies widely. I've found one example
which is easy to explain and which can be reproduced reliably:
SQL A: SELECT * FROM table WHERE a = 1234 AND b <> 0
returns one record (by coincidence, for this particular choice of
field "a"; it could be more) and is very fast (< 1s). On the other
hand
SQL B: SELECT TOP 1 * FROM table WHERE a = 1234 AND b <> 0
ORDER BY primaryKeyOfTable DESC
returns the same record, but is very slow (~ 17s).
Apparently, SQL Server is choosing a very bad query plan here: SQL
Server is doing "smart stuff", like Index Seek, Hash Match, etc. for
SQL A, but for SQL B it's just making a slow Clustered Index Scan (the
table has ~600.000 records, BTW). Changing "WHERE a = 1234" to "WHERE
a = 1235" causes it to use a much better (and much faster) query plan.
This lead me to believe that it's a problem with outdated statistics.
However, neither "UPDATE STATISTICS table" nor "UPDATE STATISTICS
table WITH FULLSCAN" improve the situation.
Of course, the obvious question is: What *can* I do to ensure that SQL
Server chooses a better query plan? (And please don't tell me to
hard-code query hints into hundreds of SQL queries or to do a manual
"TOP 1" in business logic on the client...)
Greetings,
Heinzi
PS: This is a simplified example. We don't use "SELECT *". It also
means that I'm more interested in gerneric ways to improve query
performance rather than in optimizing this simple toy example.
PPS: Of course, fields "a" and "b" are both indexed.
PPPS: Removing the "TOP 1" in SQL B also makes the query very fast (<
1s), obviously with the same result set.
--
Erland Sommarskog, SQL Server MVP, esq...@sommarskog.se
Links for SQL Server Books Online:
SQL 2008: http://msdn.microsoft.com/en-us/sqlserver/cc514207.aspx
SQL 2005: http://msdn.microsoft.com/en-us/sqlserver/bb895970.aspx
SQL 2000: http://www.microsoft.com/sql/prodinfo/previousversions/books.mspx
"Heinrich Moser" <use...@heinzi.at> wrote in message
news:87ljr4u...@msgid.heinzi.at...
Erland Sommarskog <esq...@sommarskog.se> writes:
> Heinrich Moser (use...@heinzi.at) writes:
>> We're having the general problem that -- in one customer's database --
>> the performance of the queries varies widely. I've found one example
>> which is easy to explain and which can be reproduced reliably:
>>
>> SQL A: SELECT * FROM table WHERE a = 1234 AND b <> 0
>>
>> returns one record (by coincidence, for this particular choice of
>> field "a"; it could be more) and is very fast (< 1s). On the other
>> hand
>>
>> SQL B: SELECT TOP 1 * FROM table WHERE a = 1234 AND b <> 0
>> ORDER BY primaryKeyOfTable DESC
>>
>> returns the same record, but is very slow (~ 17s).
>>
>> Apparently, SQL Server is choosing a very bad query plan here: SQL
>> Server is doing "smart stuff", like Index Seek, Hash Match, etc. for
>> SQL A, but for SQL B it's just making a slow Clustered Index Scan (the
>> table has ~600.000 records, BTW). Changing "WHERE a = 1234" to "WHERE
>> a = 1235" causes it to use a much better (and much faster) query plan.
>> This lead me to believe that it's a problem with outdated statistics.
>> However, neither "UPDATE STATISTICS table" nor "UPDATE STATISTICS
>> table WITH FULLSCAN" improve the situation.
>
> Did you look at the output from DBCC SHOW_STATIISTCS? Seems like 1234
> and 1235 are in different buckets in the histogram.
Indeed, they are.
Nevertheless, I'm still a bit lost here. How does this information
help me improve the performance of a = 1234? I've tried adding a
statistic containing both fields a and b (and running update
statistics fullscan afterwards), but this does not change anything --
the query plan is still terrible.
Greetings,
Heinzi
Hi!
"Uri Dimant" <ur...@iscar.co.il> writes:
> "Heinrich Moser" <use...@heinzi.at> wrote:
>> We're having the general problem that -- in one customer's database --
>> the performance of the queries varies widely. I've found one example
>> which is easy to explain and which can be reproduced reliably:
>>
>> SQL A: SELECT * FROM table WHERE a = 1234 AND b <> 0
>>
>> returns one record (by coincidence, for this particular choice of
>> field "a"; it could be more) and is very fast (< 1s). On the other
>> hand
>>
>> SQL B: SELECT TOP 1 * FROM table WHERE a = 1234 AND b <> 0
>> ORDER BY primaryKeyOfTable DESC
>>
>> returns the same record, but is very slow (~ 17s).
>>
>> Apparently, SQL Server is choosing a very bad query plan here: SQL
>> Server is doing "smart stuff", like Index Seek, Hash Match, etc. for
>> SQL A, but for SQL B it's just making a slow Clustered Index Scan (the
>> table has ~600.000 records, BTW). [...]
>
> Why do you think that these query should perform identical ?
I'm not saying they should perform identical. I'm saying that SQL B
should perform much faster. Clearly,
1. executing SQL A according to its query plan (< 1s),
2. sorting this one row (trivial),
3. doing TOP 1 on this one row (trivial)
would be a much better query plan for SQL B than what SQL Server is
doing. My question is: How can I improve SQL Server's query plan
guessing?
> What if you have a CI with DESC order, does it run faster?
Yes (~ 3s). [SQL Server still uses a CI scan, so this does not solve
the problem but just eases the symptoms.]
Interstingly, changing the CI back to ASC order causes SQL B to take
~6s (instead of ~17s). Argh... indeterminism makes things so much
harder to debug.* :-(
Greetings,
Heinzi
* Of course, I'm running DBCC DROPCLEANBUFFERS before every test run
to avoid caching effects.
If you are on SQL 2008, it is possible that a filtered index or a
filtered statistics would help.
For eariler versions, I don't think I have much better suggestions than
to use an index hint. Had I seend the real table definitions and indexes,
maybe I would had had more ideas.
"Heinrich Moser" <use...@heinzi.at> wrote in message
news:87hc1ox...@msgid.heinzi.at...
"Heinrich Moser" <use...@heinzi.at> a �crit dans le message de
news:87ljr4u...@msgid.heinzi.at...