Some questions about relational indexes and statistics in SS2005

rja.ca...@gmail.com

unread,

May 7, 2013, 7:39:33 AM5/7/13

to

I have some questions about relational indexes and statistics in
Microsoft SQL Server 2005, mainly Standard or Enterprise Editions.

This is in the context of improving indexes and statistics in many
copies of a database that wasn't designed by me, and in which many
tables are dropped and re-created by stored procedures overnight,
that I can change, after which users run various reports at will.

1. A nonclustered index can "include" columns that are not part of
the index key: can this be done with a nonclustered primary key
constraint index?

I think the answer is "No", and I don't particularly want to do it
anyway, but the workaround might be to /not/ have a primary key,
but have a unique nonclustered index with included columns, which
does everything except display that it is the table's primary key,
/or/, to create and then /disable/ the primary key index (if you
can do that), /and/ create the unique nonclustered index.

If you can do it, and if someone already has in my databases,
then I want to be able to reproduce it.

2. Will SQL Server ever automatically create a separate statistics group
on columns that are already the key of a clustered or nonclustered
index? And, if so, should I consider improving the index so that
it provides a satisfactory statistics group, and, how can I do that?

I think the answer is "No" in the first place, but that I may find
cases where (1) an index was created later than when the table was
created, and a statistics group was already auto-created on the
columns before the index was; (2) one copy of a table has an index,
and another copy does not have the index, but has the statistics;
(3) an index and a statistics group are on more than one column
and have some of the same columns, or have columns in a different order;
(4) a user-created statistics group has the same columns as an index.
Maybe all at the same time. All I know now is, they're there,
and I think they're a bad sign.

Thanks for advice,

Robert Carnegie

Erland Sommarskog

unread,

May 7, 2013, 3:26:41 PM5/7/13

to

(rja.ca...@gmail.com) writes:
> 1. A nonclustered index can "include" columns that are not part of
> the index key: can this be done with a nonclustered primary key
> constraint index?

No.

> but the workaround might be to /not/ have a primary key,
> but have a unique nonclustered index with included columns, which
> does everything except display that it is the table's primary key,

You could, but I have never seen the need. But, yeah, I could see the
scenario.

> /or/, to create and then /disable/ the primary key index (if you
> can do that), /and/ create the unique nonclustered index.

You can't disable the PK.

> 2. Will SQL Server ever automatically create a separate statistics group
> on columns that are already the key of a clustered or nonclustered
> index? And, if so, should I consider improving the index so that
> it provides a satisfactory statistics group, and, how can I do that?

Not sure that I understand the question, but SQL Server always creates
statistics on the index columns. I have never heard "statistics group"
before, but maybe you mean multi-column statistics.

> (1) an index was created later than when the table was created, and a
> statistics group was already auto-created on the columns before the
> index was;

Auto-stats are always single column statistics. Multi-column statistics
are always user-initiated, either through CREATE INDEX or CREATE
STATISTICS.

> (3) an index and a statistics group are on more than one column
> and have some of the same columns, or have columns in a different order;

Then they are different indexes and statistics. Orders matter in
both.

> (4) a user-created statistics group has the same columns as an index.
> Maybe all at the same time. All I know now is, they're there,
> and I think they're a bad sign.

If there is a user-defined statistics which has the same definition as
an index, that statistics is redundant and should be dropped.

--
Erland Sommarskog, Stockholm, esq...@sommarskog.se

rja.ca...@gmail.com

unread,

May 9, 2013, 5:45:24 PM5/9/13

to

On Tuesday, 7 May 2013 20:26:41 UTC+1, Erland Sommarskog wrote:

> (rja.ca...@gmail.com) writes:
>
> > 2. Will SQL Server ever automatically create a separate statistics group
> > on columns that are already the key of a clustered or nonclustered
> > index? And, if so, should I consider improving the index so that
> > it provides a satisfactory statistics group, and, how can I do that?
>
> Not sure that I understand the question, but SQL Server always creates
> statistics on the index columns. I have never heard "statistics group"
> before, but maybe you mean multi-column statistics.

Thanks for your helpful advice. I'd better clear up that point first:
if I execute "CREATE STATISTICS Erland ON Accounts(Balance)", what is
Erland? That's to say, Erland is a what? ;-) I went back to the
SQL Server 2005 Books Online and found one or more pages that say
that Erland is "a statistics group (collection)" - usually with
all of those words. But going by the SQL Server 2012 manual,
I should say "Erland are statistics". In my own language, a name
usually belongs to a singular thing, something that there's one of.
But apparently SQL Server differs in this! :-)

I'm feeling more confident now that if I create an Erland on a
column, SQL Server won't create another auto-stat Erland on the
same column - which is liable to happen while a user is waiting
for their query to run. I don't know if it's important to
avoid that, but I'm more confident that my Erlands need to be
up-to-date, and for some reason that isn't always the case,
since performing "UPDATE STATISTICS" on every table improves
the queries. I /assume/ that auto-stats only happen when
a query wants to read from a table, so, ...hmm, I suppose
that a table update statement reads /and/ writes. And I was
looking today at a program that does that several times.
Okay. I probably should have paid more attention to that.
Our database designs, you see, are baroque. I think that's
the word? Or a word that sounds like it? ;-)

Well, my grand scheme includes (1) writing a stored procedure
that defines, say, ten or twenty indexes and statistics on a
table, but allows them to be created one by one, and,
(2) using the same interface to perform "index rebuild"
and "update statistics" on the target - that part isn't
done yet. And this is where the table may be called
Student_2011, Student_2012, Student_2013, and may be in
database UserStuff01, UserStuff02, UserStuff03, and so on:
the interface stored procedure addresses that complexity.

I also want to avoid having the server do some of this meta-work
more than once. Except that it deletes a lot of the data every
day and loads it in from where it came from again. So, make that
"more than necessary".

So, I guess I'm going to need some or all of those features in my
tool, which may mean, writing the parts that I haven't written yet.

Alternatively, I may finally convince my boss that a lot of things
will run acceptably fast if they install a bigger, better server,
and the way that they pay for it is to let me go - as they say.

And then it won't be my problem any more.

Erland Sommarskog

unread,

May 10, 2013, 5:05:44 AM5/10/13

to

(rja.ca...@gmail.com) writes:
> Well, my grand scheme includes (1) writing a stored procedure
> that defines, say, ten or twenty indexes and statistics on a
> table, but allows them to be created one by one, and,
> (2) using the same interface to perform "index rebuild"
> and "update statistics" on the target - that part isn't
> done yet. And this is where the table may be called
> Student_2011, Student_2012, Student_2013, and may be in
> database UserStuff01, UserStuff02, UserStuff03, and so on:
> the interface stored procedure addresses that complexity.
>
> I also want to avoid having the server do some of this meta-work
> more than once. Except that it deletes a lot of the data every
> day and loads it in from where it came from again. So, make that
> "more than necessary".

Not sure where you are heading here. Even if you add statistics manually,
SQL Server will still maintain them, just like it maintains autostats.
Or do you plan to turn off automatic updates of statistics for the
database?

rja.ca...@gmail.com

unread,

May 12, 2013, 1:01:16 PM5/12/13

to

On Friday, 10 May 2013 10:05:44 UTC+1, Erland Sommarskog wrote:
> Not sure where you are heading here. Even if you add
> statistics manually, SQL Server will still maintain them,
> just like it maintains autostats. Or do you plan to turn
> off automatic updates of statistics for the database?

I could do that for each statistic - but this is data
that is loaded once (once a night) into a database for
enterprise reporting. If it isn't updated after
that, then the statistics don't need to be refreshed
again, and won't be - right?

Erland Sommarskog

unread,

May 12, 2013, 1:22:54 PM5/12/13

to

> I could do that for each statistic - but this is data
> that is loaded once (once a night) into a database for
> enterprise reporting. If it isn't updated after
> that, then the statistics don't need to be refreshed
> again, and won't be - right?
>

Right.