Indexing 101

Scott

unread,

Aug 22, 2010, 1:38:48 PM8/22/10

to

INSERT INTO #tmpProducts(prodName, prodDescription, catID, sizeID, prodCost,
prodActive) VALUES('Product 1', 'This is product 1', '3', '1',3.25, '1')

I'm just getting into indexing after noticing a very slow query of my main
products table. The table is normalized and only contains 320 records. When
I run a paging stored procedure and return 20 or 50 rows, it runs fast, but
a request of all 320 records is extremely slow.

I provided a ddl below to illustrate. You'll notice that I have 2 fields
catID (Category ID) and sizeID (Size ID). These are related table id fields.

I was hoping to get some advice on indexing this table. As you can see, the
prodID column is a primary key and has a clustered index. Should I create
single column indexes on the catID and sizeID columns or should I create a
composite index of the 2 vfields. Should I create an index on the prodName
field? Also, what about a composite index of the prodID, catID and sizeID
fields?

Any guidance and code examples would be appreciated.

DDL:

IF object_id('tempdb..#tmpProducts') IS NOT NULL

BEGIN

DROP TABLE #tmpProducts

END

BEGIN

CREATE TABLE #tmpProducts (

prodID int identity(1,1) NOT NULL,

prodName varchar(75) NULL,

prodDescription varchar(800) NULL,

catID int NULL,

sizeID int NULL,

prodCost money NULL,

prodActive int NULL

CONSTRAINT [PK_t_products] PRIMARY KEY CLUSTERED

( prodID ASC

)

) ON [PRIMARY]

END

BEGIN

SET NOCOUNT ON

INSERT INTO #tmpProducts(prodName, prodDescription, catID, sizeID, prodCost,
prodActive) VALUES('Product 1', 'This is product 1', '3', '1',3.25, '1')

INSERT INTO #tmpProducts(prodName, prodDescription, catID, sizeID, prodCost,
prodActive) VALUES('Product 2', 'This is product 2', '1', '3',7.45, '1')

INSERT INTO #tmpProducts(prodName, prodDescription, catID, sizeID, prodCost,
prodActive) VALUES('Product 3', 'This is product 3', '4', '2',6.59, '1')

INSERT INTO #tmpProducts(prodName, prodDescription, catID, sizeID, prodCost,
prodActive) VALUES('Product 4', 'This is product 4', '3', '1',4.14, '1')

INSERT INTO #tmpProducts(prodName, prodDescription, catID, sizeID, prodCost,
prodActive) VALUES('Product 5', 'This is product 5', '1', '2',5.85, '1')

INSERT INTO #tmpProducts(prodName, prodDescription, catID, sizeID, prodCost,
prodActive) VALUES('Product 6', 'This is product 6', '5', '5',4.52, '1')

INSERT INTO #tmpProducts(prodName, prodDescription, catID, sizeID, prodCost,
prodActive) VALUES('Product 7', 'This is product 7', '2', '3',5.90, '0')

INSERT INTO #tmpProducts(prodName, prodDescription, catID, sizeID, prodCost,
prodActive) VALUES('Product 8', 'This is product 8', '2', '6',2.65, '1')

END

SELECT * FROM #tmpProducts

DROP TABLE #tmpProducts

Erland Sommarskog

unread,

Aug 22, 2010, 2:02:01 PM8/22/10

to

Scott (sba...@mileslumber.com) writes:
> I'm just getting into indexing after noticing a very slow query of my
> main products table. The table is normalized and only contains 320
> records. When I run a paging stored procedure and return 20 or 50 rows,
> it runs fast, but a request of all 320 records is extremely slow.
>
> I provided a ddl below to illustrate. You'll notice that I have 2 fields
> catID (Category ID) and sizeID (Size ID). These are related table id
> fields.
>
> I was hoping to get some advice on indexing this table. As you can see,
> the prodID column is a primary key and has a clustered index. Should I
> create single column indexes on the catID and sizeID columns or should I
> create a composite index of the 2 vfields. Should I create an index on
> the prodName field? Also, what about a composite index of the prodID,
> catID and sizeID fields?

You could say there are two purposes of an index. The main purpose is to
make it faster to locate rows. But if you retrieve all rows, that's is not
of interest. But a side-effect of an index is that it includes only a subset
of the data, and therefore is smaller.

Thus in your case, there is not much point with an index, unless, you are
only retrieving a subset of the columns, and the table has some very wide
columns which you don't include. In this case a index that covers your
query, or at least exclude the biggies, be useful.

Obviously something is wrong with your paging procedure, if it's dead slow
on 320 rows. But the issue is likely to be something else than a missing
index.

--
Erland Sommarskog, SQL Server MVP, esq...@sommarskog.se

Links for SQL Server Books Online:
SQL 2008: http://msdn.microsoft.com/en-us/sqlserver/cc514207.aspx
SQL 2005: http://msdn.microsoft.com/en-us/sqlserver/bb895970.aspx
SQL 2000: http://www.microsoft.com/sql/prodinfo/previousversions/books.mspx

Scott

unread,

Aug 22, 2010, 3:22:21 PM8/22/10

to

I think you're right about the paging sproc. However, I just want to make
sure I understand you.

You are saying that if you had a typical products or orders table (like
northwind or other ms sample db's), you wouldn't do anything more than
create a primary key?

"Erland Sommarskog" <esq...@sommarskog.se> wrote in message
news:Xns9DDCCBCB...@127.0.0.1...

Erland Sommarskog

unread,

Aug 22, 2010, 4:36:10 PM8/22/10

to

Scott (sba...@mileslumber.com) writes:
> You are saying that if you had a typical products or orders table (like
> northwind or other ms sample db's), you wouldn't do anything more than
> create a primary key?

No, I didn't say that. For a 320 row-table I would probably not add any
index because of physical reasons. (I may still add a UNIQUE constraint on
the name of logical reasons.)

Then again, most shops have more than 320 products, so yes, I would probably
add index on FK columns, at least for entities where deletions or key
updates can occur. And of course attributes where searches are likely,
like "show me all products in the widget category".

An Orders table is likely to become really big, so indexing is essential.
But since it also is likely to see a high insert frequency, there is
reason to be restrictive, since there is a cost for maintaining the indexes
as well. But again, FKs where deletions are likely indexes are necessary.

The last point may call for some clarification. Say that you want to be able
to delete customers that have not placed any orders for a while, and that
you age out old orders. If there was no index on CustomerID, the deletion
would be slow, since SQL Server would have to scan Orders to verify that the
customer is not there. On the other hand say that you have a column
OrderTypeID which can hold one of the thirty-five values in the table
OrderTypes, a table which is loaded when the system is installed and beyond
user control. In this case, there is no requirement to have an index on
OrderTypeID only because is an FK.

I looked at one of the major transactional tables in the system I work with.
There are some 25 FK columns in this table, but at most five are indexed.

Tom Cooper

unread,

Aug 22, 2010, 6:14:51 PM8/22/10

to

You almost certainly want other indexes. Probably on some but not all
foreign keys in that table, as Erland pointed out. Columns which appear in
join clauses of lots of your queries are often good choices. And also on
some columns which are commonly queried on where the queries return a small
subset of rows. An index on those columns will often make those queries run
significantly faster. But every time you add a new nonclustered index to a
table. you will slow down all inserts and all deletes to that table and also
slow down any updates that affect the column(s) in that index. So, for
example, possible indexes you might want on an order table could include

Index on CustomerId - for queries like show me all orders for customer x
Index on DueDate and Status, filtered to include only active orders - for
queries like show me all past due orders or all orders due in the next 3
days
PurchaseOrderNumber - if when one of your customers calls with an inquiry,
the only thing they are likely to know is their PO number

There could be many others depending on what type of queries are most often
done to fullful your business needs. Always remembering that what you want
is not the best performance possible for any given query, but the best
overall performance of your system. If you have a report that runs in the
middle of the night from 3AM to 6AM and needs to be available at 9AM, and
you find an index that lets you run the query in 5 minutes rather than 3
hours, but that index adds 10 milliseconds to the time to enter an order
during the day and you get 10,000 orders an hour, you probably don't want
that index.

Transactional tables (like an orders table) should typically have very few
indexes because there are many inserts, updates, and, sometimes, deletes.
Data warehouse type tables (like an orders history table) - which get very
large and don't change much except to add new rows and even the addition of
new rows can be done during a time when the overall load on the system is
light - often benefit from having more indexes.

Indexing is. like almost everything to do with performance, more of an art
than a science. Book and classroom learning helps. Having experience helps
a lot. But much of it is trial and error. Try to get a realistic volumn of
data (testing performance with small amounts of data is usually almost
worthless, since almost anything you do works) and a realistic load of
queries and see what works. You may be able to use the Database Engine
Tuning Advisor to get suggestions on which possible indexes might help. Use
its results as a suggestion only. It will often try to suggest you add ten
or fifteen new indexes. Don't do that. Try them one at a time and only
implement the ones which help.

Tom

"Scott" <sba...@mileslumber.com> wrote in message
news:%23TAtZ9i...@TK2MSFTNGP05.phx.gbl...