Google Groups no longer supports new Usenet posts or subscriptions. Historical content remains viewable.
Dismiss

Re: With Rollup wrong results

0 views
Skip to first unread message

Björn Persson

unread,
Nov 15, 2005, 2:28:08 AM11/15/05
to
Noel Stratton:
> As you can see the summary total in the 'Total' column is incorrect. The
> summary total should be 36 but is spitting out 40. I am not even sure
> where it is getting 40. The only way to get 40 is if it is multiplying
> 5X8.

Isn't that what you told it to do?

> ROUND(price*count(log.product), 2) AS 'Total'

=46or the summary line that means "Pick a price field at random (as there a=
re=20
several rows to choose from) and multiply it with the count of all the=20
log.product fields."

Bj=F6rn Persson

--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe: http://lists.mysql.com/mysql?unsub=3Dmy...@freebsd.csie.nctu.edu.tw

Gleb Paharenko

unread,
Nov 15, 2005, 2:50:48 AM11/15/05
to
Hello.

In my opinion, you have something wrong with you query in general.
product.price field is in SELECT part and not in GROUP BY, so the
results not-predictable. See:
http://dev.mysql.com/doc/refman/5.0/en/group-by-hidden-fields.html


Noel Stratton wrote:
> I am running the query below:
>
> SELECT products.product, products.price, count( log.product ) AS 'Count',
> ROUND(price*count(log.product), 2) AS 'Total'
> FROM products LEFT JOIN log ON products.product= log.product
> GROUP BY product with rollup
>
> Here are the results:
> +------------------------------+-------+-------+-------+
> | product | price | Count | Total |
> +------------------------------+-------+-------+-------+
> | ATM Card | 3.00 | 2 | 6.00 |
> | Audio Response | 3.00 | 0 | 0.00 |
> | Check Card | 5.00 | 1 | 5.00 |
> | Courtesy Pay | 5.00 | 2 | 10.00 |
> | Draft with Direct Deposit | 5.00 | 0 | 0.00 |
> | Draft without Direct Deposit | 3.00 | 0 | 0.00 |
> | E-statement | 5.00 | 2 | 10.00 |
> | Gap | 20.00 | 0 | 0.00 |
> | MBI | 10.00 | 0 | 0.00 |
> | Membersonline | 5.00 | 0 | 0.00 |
> | New Account | 5.00 | 1 | 5.00 |
> | New Loan | 5.00 | 0 | 0.00 |
> | New Mem"Bear" Account | 5.00 | 0 | 0.00 |
> | NULL | 5.00 | 8 | 40.00 |
> +------------------------------+-------+-------+-------+
> 14 rows in set (0.00 sec)


>
> As you can see the summary total in the 'Total' column is incorrect. The
> summary total should be 36 but is spitting out 40. I am not even sure where
> it is getting 40. The only way to get 40 is if it is multiplying 5X8.
>

> I am running MySQL 4.1.15. Also the 'price' column has the wrong total. I
> am not really concerned with that total. I just want the summary total in
> the 'Total' column to be correct.
>
> Any suggestions?
>
> Thank You,
>
> Noel Stratton
> Computer Specialist
> Members 1st Credit Union
>
>


--
For technical support contracts, goto https://order.mysql.com/?ref=ensita
This email is sponsored by Ensita.NET http://www.ensita.net/
__ ___ ___ ____ __
/ |/ /_ __/ __/ __ \/ / Gleb Paharenko
/ /|_/ / // /\ \/ /_/ / /__ Gleb.Pa...@ensita.net
/_/ /_/\_, /___/\___\_\___/ MySQL AB / Ensita.NET
<___/ www.mysql.com


--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql

To unsubscribe: http://lists.mysql.com/mysql?unsub=myo...@freebsd.csie.nctu.edu.tw

Michael Stassen

unread,
Nov 15, 2005, 3:17:02 AM11/15/05
to

I think there are two issues here. First, there's the issue of how "WITH
ROLLUP" behaves. It doesn't simply add values in columns. Instead, it gives a
"super-aggregate" value, one in keeping with the kind of values in the column.
Consider the following:

CREATE TABLE rt (cat INT, val INT);

INSERT INTO rt VALUES
(1, 1), (1, 3), (1, 5), (1, 7),
(2, 2), (2, 4), (2, 6), (2, 8),
(3, 3), (3, 9), (3, 27);

SELECT cat, MIN(val), MAX(val), COUNT(val), SUM(val), AVG(val)
FROM rt GROUP BY cat WITH ROLLUP;
+------+----------+----------+------------+----------+----------+
| cat | MIN(val) | MAX(val) | COUNT(val) | SUM(val) | AVG(val) |
+------+----------+----------+------------+----------+----------+
| 1 | 1 | 7 | 4 | 16 | 4.0000 |
| 2 | 2 | 8 | 4 | 20 | 5.0000 |
| 3 | 3 | 27 | 3 | 39 | 13.0000 |
| NULL | 1 | 27 | 11 | 75 | 6.8182 |
+------+----------+----------+------------+----------+----------+
4 rows in set (0.01 sec)

Notice the last line. You get the smallest of the MIN() values, the largest of
the MAX() values, the average of the AVG() values, and the total of the COUNT()
and SUM() values. (And yes, the manual completely fails to explain this.)

This brings us to the second issue. How should price be interpreted in the
ROLLUP row? It's not an aggregate function, so what should we do with the
values in the price column. The answer to that isn't really defined, so I think
mysql is free to pick any value. In practice, it appears to simply use the
value from the immediately preceding row.

In fact, technically, price shouldn't even be in the query, because it is
neither a grouped column nor an aggregate function. Mysql allows this as a
convenience when you know what you are doing
<http://dev.mysql.com/doc/refman/5.0/en/group-by-hidden-fields.html>, but you
have to be careful. I expect there is a single price per product in your
products table, so adding price to the query would have been safe with a simple
GROUP BY, but adding "WITH ROLLUP" complicates matters. In particular, I think
your "price*COUNT(log.product)" is handled like this in the ROLLUP row:
COUNT(log.product) is summed, producing 8, which is then multiplied by the last
value of price found, 5, producing 40. Not what you had in mind.

Instead of multiplying price by the number of rows found, how about adding price
every time we find a row, like this:

SELECT p.product, p.price,
count(l.product) AS 'Count',
ROUND(SUM(p.price), 2) AS 'Total'
FROM products p
JOIN log l ON p.product= l.product
GROUP BY p.product WITH ROLLUP;
+--------------+-------+-------+-------+
| product | price | Count | Total |
+--------------+-------+-------+-------+


| ATM Card | 3.00 | 2 | 6.00 |

| Check Card | 5.00 | 1 | 5.00 |
| Courtesy Pay | 5.00 | 2 | 10.00 |

| E-statement | 5.00 | 2 | 10.00 |


| New Account | 5.00 | 1 | 5.00 |

| NULL | 5.00 | 8 | 36.00 |
+--------------+-------+-------+-------+
6 rows in set (0.00 sec)

Notice that the ROLLUP row for Total is now correct. Of course, you wanted to
see the 0 rows. We have to change the join to a LEFT JOIN, as you did in your
query, and we have to make sure we add 0 when the right side is NULL:


SELECT p.product, p.price, count(l.product ) AS 'Count',
ROUND(SUM(IF(l.product IS NOT NULL, p.price, 0)), 2) AS 'Total'
FROM products p
LEFT JOIN log l ON p.product= l.product
GROUP BY p.product
WITH ROLLUP;


+------------------------------+-------+-------+-------+
| product | price | Count | Total |
+------------------------------+-------+-------+-------+
| ATM Card | 3.00 | 2 | 6.00 |
| Audio Response | 3.00 | 0 | 0.00 |
| Check Card | 5.00 | 1 | 5.00 |
| Courtesy Pay | 5.00 | 2 | 10.00 |
| Draft with Direct Deposit | 5.00 | 0 | 0.00 |
| Draft without Direct Deposit | 3.00 | 0 | 0.00 |
| E-statement | 5.00 | 2 | 10.00 |
| Gap | 20.00 | 0 | 0.00 |
| MBI | 10.00 | 0 | 0.00 |
| Membersonline | 5.00 | 0 | 0.00 |
| New Account | 5.00 | 1 | 5.00 |
| New Loan | 5.00 | 0 | 0.00 |
| New Mem"Bear" Account | 5.00 | 0 | 0.00 |

| NULL | 5.00 | 8 | 36.00 |


+------------------------------+-------+-------+-------+
14 rows in set (0.00 sec)

I think that's the result you wanted. Of course, it still has nonsense in the
ROLLUP row for price, but you said you can ignore that. One possible
work-around, though, might be something like:

SELECT CONCAT(p.product, ' @ ', p.price) AS product,
COUNT(l.product ) AS 'Count',
ROUND(SUM(IF(l.product IS NOT NULL, p.price, 0)), 2) AS 'Total'
FROM products p
LEFT JOIN log l ON p.product= l.product
GROUP BY product
WITH ROLLUP;
+-------------------------------------+-------+-------+
| product | Count | Total |
+-------------------------------------+-------+-------+
| ATM Card @ 3.00 | 2 | 6.00 |
| Audio Response @ 3.00 | 0 | 0.00 |
| Check Card @ 5.00 | 1 | 5.00 |
| Courtesy Pay @ 5.00 | 2 | 10.00 |
| Draft with Direct Deposit @ 5.00 | 0 | 0.00 |
| Draft without Direct Deposit @ 3.00 | 0 | 0.00 |
| E-statement @ 5.00 | 2 | 10.00 |
| Gap @ 20.00 | 0 | 0.00 |
| MBI @ 10.00 | 0 | 0.00 |
| Membersonline @ 5.00 | 0 | 0.00 |
| New Account @ 5.00 | 1 | 5.00 |
| New Loan @ 5.00 | 0 | 0.00 |
| New Mem"Bear" Account @ 5.00 | 0 | 0.00 |
| NULL | 8 | 36.00 |
+-------------------------------------+-------+-------+
14 rows in set (0.01 sec)

Note we are grouping on the first column of the output (aliased to product),
rather than by the product column of the products table (p.product).

Michael

Michael Stassen

unread,
Nov 15, 2005, 10:03:38 AM11/15/05
to
Björn Persson wrote:
> Noel Stratton:

>
>>As you can see the summary total in the 'Total' column is incorrect. The
>>summary total should be 36 but is spitting out 40. I am not even sure
>>where it is getting 40. The only way to get 40 is if it is multiplying
>>5X8.
>
> Isn't that what you told it to do?

Well, that obviously wasn't his intent.

>>ROUND(price*count(log.product), 2) AS 'Total'
>
> For the summary line that means "Pick a price field at random (as there are


> several rows to choose from) and multiply it with the count of all the

> log.product fields."

How do you know that? Or more appropriately, how was he supposed to know that?
The manual gives the clear impression that WITH ROLLUP simply totals columns.
Based on that, it really wasn't unreasonable to expect that the ROLLUP row for
the price*COUNT(log.product) column would be the total of that column. You are
certainly right that it doesn't work that way, but that is undocumented.
Instead, it added up the COUNTs and then multiplied by a single price. That's
probably correct behavior, but it is certainly counterintuitive.

Gleb Paharenko wrote:
> Hello.
>
> In my opinion, you have something wrong with your query in general.


> product.price field is in SELECT part and not in GROUP BY, so the

> result is not-predictable. See:
>
> http://dev.mysql.com/doc/refman/5.0/en/group-by-hidden-fields.html

I think you've missed the point. I cannot be certain, but it seems clear that
there is a unique price for each product. In that case, the manual page you
reference makes it quite clear that selecting price is perfectly allowable *in
MySQL* with a GROUP BY on product, as there will be no ambiguity. That is, the
value of price for each group is entirely predictable.

The problem comes in the interaction between price and WITH ROLLLUP. It seems
that while MySQL allows extra, unique-valued columns with a GROUP BY, they turn
into nonsense in the ROLLUP results. (Although I note the result is still
predictable, as MySQL seems to simply use the last value found.) Again, that's
probably correct behavior, but it is utterly undocumented.

0 new messages