First, SQL Server checks if a trivial plan is available. If so, it will
go ahead with this plan. For example, with an INSERT statement using
the VALUES clause, there is only one possible execution.
If no trivial plan is available, SQL will try to simplify the query, so
that a trivial plan will be available. This will not result in a new
plan, but helps SQL to analyze the query. At this point SQL server will
load any statistics that will help it in the cost-based process that
follows. This cost-based process has three steps.
First is the transaction processing phase. In this phase, SQL Server
picks out plans for simple queries typical of transaction processing
databases. If a plan is formed with a total cost below a threshold, it
will use this plan.
If a cheap plan can not be found, the Quick Plan phase starts. In
this phase, SQL includes choices that are often useful in more complex
queries. This might include use of indexes and nested loop joins.
Again, if a cheap enough plan is found, SQL Server will use this plan.
The last phase is the Full Optimization Phase. In this phase, SQL
Server compares every possible execution plan and then goes with the
cheapest one. This phase has a time limit. When the time limit is
reached, SQL Server goes ahead with the cheapest plan found up until
that moment.
Statistics
===========================
This process -- except for the trivial plan -- is based solely on
statistics. Bad statistics lead to bad execution plans. If an index is
placed on a small table, the optimal plan might be to ignore the index
and use a table scan. If the statistics are not updated while the table
grows significantly, SQL Server will still assume that the table is
small and use the table scan, even if this is no longer the optimal
plan.
In most cases it is wise to let SQL Server automatically update
statistics. To turn on the automatic updates for all statistics in a
database, if not already on, execute the following statement:
USE master
EXEC sp_dboption 'MyDb', ' Auto Update Statistics', 'true'
To turn on the automatic updates for all statistics on a specific
table, such as Clients, execute the following statement:
USE MyDb
EXEC sp_autostats Clients, 'ON'
To manually update the statistics on specific table, such as Clients,
execute the following command:
USE MyDb
UPDATE STATISTICS Clients
How Statistics Work
===================================
Keeping statistics up to date is crucial for optimal performance of a
database. In some cases, however, an available execution plan is not
ideal for the given situation. To understand when and why this happens,
it is important to understand how statistics work.
The following command displays the statistics for a given index (index
INX1a on table Clients)
USE MyDb
DBCC SHOW_STATISTICS (Clients, INX1a)
This gives all statistical information on this index. Among the date
and time of the last update and it's density, it returns samples of the
index.
Work Around
============================
Statistics can not be expanded. They are designed to be kept small, so
that the process of determining an execution plan is kept small. Even
if the statistics could be doubled in size, this problem will still
arise.
The option of using an index hint will only be helpful if used in all
the queries with a search parameter in this field. Adding an index hint
is a manual job, requiring knowledge of the dataset. A human will
encounter the same difficulties in determining a best use of indexes as
SQL Server itself. Also, automating the use of index hints in a
production environment is time-consuming and far from efficient.
In these situations, a work around is the best option. By changing the
specific index into a clustered index, a bookmark lookup is no longer
necessary. This saves considerable time and effort. If the statistics
are off and the number of hits exceeds the expected number, all
relevant records are clustered together and require little extra time.
Index Planning
============================
The effect of changing currently existing clustered index into a
non-clustered index should be examined. There can only be one clustered
index on a table. In general, the difference in performance for a field
containing unique values is minimal between a clustered and a
non-clustered index. If, however, this field is used in a one to many
relationships, the general performance for the overall queries to the
database might decline.
It is wise to take all these aspects into consideration when designing
the index structure for your databases. It is often worth the effort to
try different configurations and go with the one best suitable. Of
course using the Index Tuning Wizard will help in this process