To Joe Celko, question about GROUP BY, related to "SQL for Smarties", second edition

Lasse Vågsæther Karlsen

unread,

Oct 4, 2000, 3:00:00 AM10/4/00

to

"SQL for Smarties", second edition, page 352:

SELECT salary, COUNT(*) AS frequency
FROM Payroll
GROUP BY salary
HAVING COUNT(*) >= ALL (SELECT COUNT(*)
FROM Payroll
GROUP BY salary);

The nested select here allows a GROUP BY salary, even though salary isn't
listed in the SELECT list. In addition, the query can be executed by itself.

For some reason, I've always believed that I needed the same list of columns
in the SELECT list as in the GROUP BY clause, with the possible addition of
any aggregated functions, however the above example executes well on
Microsoft SQL Server 2000, both as part of the containing query on page 352
as well as a stand-alone query.

So I looked at the definition of GROUP BY you have in the book, and on page
300, the first paragraph contains the following sentence:

The order of the grouping columns in the GROUP BY clause does not matter,
but since they have to appear in the SELECT list, you should probably use
the same order in both lists to enhance readability.

The sentence states "have to appear", but your example, and MS SQL Server
2000 appear to show evidence to the contrary.

What's the correct definition? Although I'm currently only developing for MS
SQL Server 7.0/2000, and MS SQL Server 2000 supports what (to me) appears to
be broken syntax, I'd rather not get used to a syntax that is wrong.

My (,rather naive I would guess,) version of the query with what I thought
was the correct syntax would look like:

SELECT salary, COUNT(*) AS frequency
FROM Payroll
GROUP BY salary
HAVING COUNT(*) >= ALL (SELECT cnt FROM (SELECT salary, COUNT(*) AS cnt
FROM Payroll
GROUP BY salary));

--
Lasse Vågsæther Karlsen
Cintra Software Engineering AS
+47 35.93.22.03 Office / +47 35.93.22.11 Office fax
+47 90.86.82.73 Cellular
+47 35.55.83.63 Home / +47 35.55.83.64 Home fax
PGP Key: 0x8BF38D66 (ldap://certserver.pgp.com)

try...@aol.com

unread,

Oct 4, 2000, 3:00:00 AM10/4/00

to

The same 'broken' syntax works in Access.And since Access is the
yardstick from which all dbs are measured by who cares what Mr. Buster
thinks.

Steve Dassin

Sent via Deja.com http://www.deja.com/
Before you buy.

Tore

unread,

Oct 5, 2000, 2:09:04 AM10/5/00

to

Funny... :->

Tore.

<try...@aol.com> wrote in message news:8rg6as$kot$1...@nnrp1.deja.com...

Joe Celko

unread,

Oct 9, 2000, 3:00:00 AM10/9/00

to

>> For some reason, I've always believed that I needed the same list of
columns in the SELECT list as in the GROUP BY clause, with the possible
addition of any aggregated functions, however the above example
executes well on Microsoft SQL Server 2000, both as part of the

containing query on page 352as well as a stand-alone query. <<

Here is how a SELECT works in SQL ... at least in theory. Real
products will optimize things when they can.

a) Start in the FROM clause and build a working table from all of the
joins, unions, intersections, and whatever other table constructors are
there. The table expression> AS <correlation name> option allows you
give a name to this working table which you then have to use for the
rest of the containing query.

b) Go to the WHERE clause and remove rows that do not pass criteria;
that is, that do not test to TRUE (reject UNKNOWN and FALSE). The
WHERE clause is applied to the working in the FROM clause.

c) Go to the optional GROUP BY clause, make groups and reduce each
group to a single row, replacing the original working table with the
new grouped table. The rows of a grouped table must be group
characteristics: (1) a grouping column (2) a statistic about the group
(i.e. aggregate functions) (3) a function or (4) an expression made up
of the those three items.

d) Go to the optional HAVING clause and apply it against the grouped
working table; if there was no GROUP BY clause, treat the entire table
as one group.

e) Go to the SELECT clause and construct the expressions in the list.
This means that the scalar subqueries, function calls and expressions
in the SELECT are done after all the other clauses are done. The AS
operator can give a name to expressions in the SELECT list, too. These
new names come into existence all at once, but after the WHERE clause
has been executed; you cannot use them in the SELECT list or the WHERE
cluase for that reason.

f) Nested query expressions follow the usual scoping rules you would
expect from a block structured language like C, Pascal, Algol, etc.
Namely, the innermost queries can reference columns and tables in the
queries in which they are contained.

This means that a SELECT cannot have more columns than a GROUP BY; but
it certainly can have fewer columns.

>> So I looked at the definition of GROUP BY you have in the book, and
on page 300, the first paragraph contains the following sentence:

The order of the grouping columns in the GROUP BY clause does not
matter, but since they have to appear in the SELECT list, you should
probably use the same order in both lists to enhance readability. <<

Not all of them have to appear. That is a bad use of pronouns in
English -- "they" versus "they all"; I will fix that in the next
edition.

--CELKO--
Joe Celko, SQL and Database Consultant
When posting, inclusion of SQL (CREATE TABLE ..., INSERT ..., etc)
which can be cut and pasted into Query Analyzer is appreciated.

Joe Celko

unread,

Oct 9, 2000, 3:00:00 AM10/9/00

to