I have a basic select on a table:
*select col1, col2, col3 from tab1 where col3 > 100*
I setup an index on col3 and saw no improvement (in the query analyzer,
execution plan).
When I setup 2 more indeces on col1 and col2 I saw massive improvement.
The execution plan looked like it was using each of the indeces to
search the whole table and then merging the results with a "Hash
match/Inner join".
Why is it searching on col1 or col2? I would think it just needs to
search on col3 and return the associated entries in that row. The query
text (in the execution plan) reads something like:
select [col1]=[col1], [col2]=[col2], [col3]=[col3] from tab1 where [col3]>@1
thanks for any help.
-A Freeman
andrews...@acm.org
--
Jason W. Paul
"A S Freeman" <andrews...@acm.org> wrote in message
news:b8404t$dar$1...@houston.jhuapl.edu...
> I am relatively new to tuning SQL Server databaseses and I am
trying
> to get a handle on indeces.
We generally call them indexes (yeah, I know, it *should* be
indeces, but that isn't the term in general usage 'round here)
> I have a basic select on a table:
>
> *select col1, col2, col3 from tab1 where col3 > 100*
>
> I setup an index on col3 and saw no improvement (in the query
analyzer,
> execution plan).
There are a large number of variables that affect
what plan is generated. Simply adding an index
will not actually force the optimizer to choose that
index.
> When I setup 2 more indeces on col1 and col2 I saw massive
improvement.
That's because you now have a "covering index". Google
should be able to explain that to you. This is generally referred
to as either a "covering index" or a "covered index".
It should be ind_i_ces, not indeces. Blame it all on the Romans, shifting
i's and e's around, they're all long dead anyway ;). Indexes is also correct
btw.
> > I have a basic select on a table:
> >
> > *select col1, col2, col3 from tab1 where col3 > 100*
> >
> > I setup an index on col3 and saw no improvement (in the query
> analyzer,
> > execution plan).
>
> There are a large number of variables that affect
> what plan is generated. Simply adding an index
> will not actually force the optimizer to choose that
> index.
Correct. Most likely the where clause it not selective enough. The query
optimizer will in this case probably use the index if less than a few
percent of the rows are returned. YMMV. If you did *select col3 from tab1
where col3 > 100* the index would be used as it is a covering index now.
> That's because you now have a "covering index". Google
> should be able to explain that to you. This is generally referred
> to as either a "covering index" or a "covered index".
Not completely correct. The term covering index is generally used for _one_
index that covers all the requirements of the query (columns that are used
in the select, where, group by and order by clauses). In this case we have
index intersection, where multiple indexes are used to solve the query.
> > Why is it searching on col1 or col2? I would think it just needs to
> > search on col3 and return the associated entries in that row.
From the behaviour you describe I assume that you have a clustered index on
col3 and non-clustered indexes on col1 and col2. The behaviour you see is
then explained because a non-clustered index on a table that also has a
clustered index will use the clustered index to point to the rows in the
table, which means that the non-clustered index in made up of the column(s)
in that non-clustered index and the column(s) in the clustered index, your
case the non-clustered indexes are made up of col1, col3 and col2, col3
respectively. Apparently the query optimizer thought it was more efficient
to combine these two indexes than to access the table. Are there any other
columns in that table? And are they wide compared to col1, col2 and col3?
That would explain things more.
Anyway, if you want to know more about indexes and performance tuning in
general there are 2 books I can advise you to read:
- Inside SQL Server 2000 by Kalen Delaney
- Microsoft SQL Server 2000 Performance Optimization and Tuning Handbook
both are very very good books.
"Jason W. Paul" <nob...@nowhere.com> wrote in message
news:3ea59113$0$89170$a046...@nnrp.fuse.net...
A S Freeman (andrews...@acm.org) writes:
> I am relatively new to tuning SQL Server databaseses and I am trying
> to get a handle on indeces.
>
> I have a basic select on a table:
>
> *select col1, col2, col3 from tab1 where col3 > 100*
>
> I setup an index on col3 and saw no improvement (in the query analyzer,
> execution plan).
Apparently, you creaetd a non-clustered index.
Non-clustered indexes are not always good to use. With only col3 in
the index, SQL Server still needs to look up the data pages to get
the values of col1 and col2 for all matching rows. This can lead to
more pages reads than scanning the table from left to right. Therefore
the optimizer is somewhat conservative with regarding the use of
non-clustered index.
Had you created a clustered index, SQL Server would have used the
index successfully, as in a clustered index, the leaf level of the
index is stored with the data.
> When I setup 2 more indeces on col1 and col2 I saw massive improvement.
> The execution plan looked like it was using each of the indeces to
> search the whole table and then merging the results with a "Hash
> match/Inner join".
>
> Why is it searching on col1 or col2? I would think it just needs to
> search on col3 and return the associated entries in that row. The query
> text (in the execution plan) reads something like:
It is not searching on col1 and col2. But rather than looking up the values
from the data pages, SQL Server gets the values from the index pages.
This is possible because the complete index node consists of indexed
columns + a key that unique defines the row. For a table without a
clustered index that is a RID (Row ID), for a table with a clustered
index, it is simple the values of the clustered index keys for that
row.
Thus, SQL Server finds col3 = 255 and RID = 4711. It can then find
RID 4711 in the indexes for col2 and col3 too, and thus get the
values for col1 and col2. But since the indexes are not sorted on
RID, SQL Server composes a hash table, to speed up the lookup.
Composing a hash table, so this is still not a perfect plan. The index
you should create is (col3, col2, col1). You now have a covering
index, and SQL Server can find all data for the query in this index.
--
Erland Sommarskog, SQL Server MVP, som...@algonet.se
Books Online for SQL Server SP3 at
http://www.microsoft.com/sql/techinfo/productdoc/2000/books.asp
I'll trade ya! here is your answer (please look into my question
posted same place:
select *
from with INDEX(indexname1, indexname2...)
where ....
This will tell specifically the optimizer to use the indexes. Warning
this is the SQL2K form.
For SQL 7 it was INDEX =
"Jason W. Paul" <nob...@nowhere.com> wrote in message news:<3ea59113$0$89170$a046...@nnrp.fuse.net>...
select *
from tablename with index(indexname1, indexname2,......)
where....
A S Freeman <andrews...@acm.org> wrote in message news:<b8404t$dar$1...@houston.jhuapl.edu>...
--
Tibor Karaszi, SQL Server MVP
Archive at: http://groups.google.com/groups?oi=djq&as_ugroup=microsoft.public.sqlserver
"Francois" <gibe...@hotmail.com> wrote in message
news:86699517.03042...@posting.google.com...
"Tibor Karaszi" <tibor.please_reply_to...@cornerstone.se> wrote in message news:<#lFSlvXC...@TK2MSFTNGP12.phx.gbl>...
I stand corrected. Thanks.
one of the fields that I need to select on is a nchar(2000) field . The
enterprice manager does not allow me to add that to my covering index.
What kind of index or other optimization can I do that does not require
the me to include every column from the select clause?
-A Freeman
You could create a clustured index on the column you called col3 in your
original example.
You could also consider creating an indexed view, if you already have a
clustered index on the table.
There might be other solutions too, but without knowledge about the tables
and the data in it, it is a little hard to give precise advice.