max(), sum(), next()

bearoph...@lycos.com

Sep 3, 2008, 8:48:23 AM

Empty Python lists [] don't know the type of the items they will
contain, so this sounds strange:

>>> sum([])
0

Because that [] may be an empty sequence of someobject:

>>> sum(s for s in ["a", "b"] if len(s) > 2)
0

In a statically typed language in that situation you may return the
initializer value of the type of the items of the list, as I do in the
sum() in D.

This sounds like a more correct/clean thing to do:

>>> max([])
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: max() arg is an empty sequence

So it may be better to make the sum([]) too raise a ValueError, in
Python 3/3.1 (if this isn't already true). On the other hand often
enough I have code like this:

>>> max(fun(x) for x in iterable if predicate(x))

This may raise the ValueError both if iterable is empty or if the
predicate on its items is always false, so instead of catching
exceptions, that I try to avoid, I usually end with a normal loop,
that's readable and fast:

max_value = smallvalue
for x in iterable:
    if predicate(x):
        max_value = max(max_value, fun(x))

Where running speed matters, I may even replace that max(max_value,
fun(x)) with a more normal if/else.
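
A minimal sketch of the loop above with the inner max() call replaced by a plain comparison. The names fun, predicate and smallvalue are the placeholders from the snippet; filtered_max is a made-up helper name, not an existing function:

```python
def filtered_max(iterable, fun, predicate, smallvalue):
    """Return the largest fun(x) over items passing predicate, else smallvalue."""
    max_value = smallvalue
    for x in iterable:
        if predicate(x):
            value = fun(x)
            # Plain comparison instead of max(max_value, value)
            if value > max_value:
                max_value = value
    return max_value


print(filtered_max([1, 2, 3, 4], lambda x: x * x, lambda x: x % 2 == 0, -1))  # 16
print(filtered_max([], lambda x: x * x, lambda x: True, -1))                  # -1
```

Note that smallvalue has to be smaller than anything fun() can return, otherwise it silently wins the comparison.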

A possible alternative is to add a default to max(), like the next()
built-in of Python 2.6:

>>> max((fun(x) for x in iterable if predicate(x)), default=smallvalue)

This returns smallvalue if there are no items to compute the max of.
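
Later Python versions (3.4 and up) accept exactly such a keyword in max() and min(). A minimal sketch under that assumption; fun, predicate, iterable and smallvalue are placeholder definitions invented here so that the example runs:

```python
def fun(x):
    return x * 10

def predicate(x):
    return x > 100  # never true for the data below

iterable = [1, 2, 3]
smallvalue = -1

# The generator yields nothing, so the default is returned instead of
# raising ValueError (requires Python 3.4 or later).
result = max((fun(x) for x in iterable if predicate(x)), default=smallvalue)
print(result)  # -1
```

Without default=, the same call raises ValueError, so code that wants the exception keeps it.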

Bye,
bearophile

Sion Arrowsmith

Sep 3, 2008, 10:06:54 AM

<bearoph...@lycos.com> wrote:
>Empty Python lists [] don't know the type of the items they will
>contain, so this sounds strange:
>
>>>> sum([])
>0

>>> help(sum)
sum(...)
    sum(sequence, start=0) -> value

>>> sum(range(x) for x in range(5))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: unsupported operand type(s) for +: 'int' and 'list'
>>> sum((range(x) for x in range(5)), [])
[0, 0, 1, 0, 1, 2, 0, 1, 2, 3]

... so the list might not know what type it contains, but sum
does. And if you don't tell it, it makes a sensible guess. And
it *is* a case where refusing the temptation to guess is the
wrong thing: how many times would you use sum to do anything
other than sum numeric values? And how tedious would it be to
have to write sum(..., 0) for every other case? Particularly
bearing in mind:

>>> sum(["a", "b"], "")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: sum() can't sum strings [use ''.join(seq) instead]
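
A small sketch of the same point for current Python 3: the start argument, not the list, decides what an empty sum returns. The variable names are illustrative only:

```python
nested = [[0], [0, 1], [0, 1, 2]]

# With a list as the start value, sum() concatenates lists.
flat = sum(nested, [])
print(flat)         # [0, 0, 1, 0, 1, 2]

# An empty input simply returns the start value.
print(sum([], []))  # []
print(sum([]))      # 0

# Strings are rejected by sum(); str.join is the supported way.
try:
    sum(["a", "b"], "")
except TypeError as exc:
    print("TypeError:", exc)
print("".join(["a", "b"]))  # ab
```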

--
\S -- si...@chiark.greenend.org.uk -- http://www.chaos.org.uk/~sion/
"Frankly I have no feelings towards penguins one way or the other"
-- Arthur C. Clarke
her nu becomeþ se bera eadward ofdun hlæddre heafdes bæce bump bump bump

Laszlo Nagy

Sep 3, 2008, 3:18:15 PM

to bearoph...@lycos.com, pytho...@python.org

bearoph...@lycos.com wrote:
> Empty Python lists [] don't know the type of the items they will
> contain, so this sounds strange:
>
> >>> sum([])
> 0
>
> Because that [] may be an empty sequence of someobject:
>

You are right in that sum could be used to sum arbitrary objects.
However, in 99.99% of the cases, you will be summing numerical values.
When adding real numbers, the neutral element is zero (X + 0 = X). It
is very logical to return zero for empty sequences.

In the same way, if we had a prod() function, it should return one for
empty sequences because X*1 = X. The neutral element for this operation
is one.
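
A minimal sketch of such a prod(), built on functools.reduce with 1 as the initial value. The function is written out here only to show the neutral element at work; it is not a built-in of the Python versions discussed in this thread:

```python
from functools import reduce
import operator

def prod(iterable, start=1):
    """Multiply the items together, starting from the neutral element 1."""
    return reduce(operator.mul, iterable, start)

print(prod([2, 3, 4]))  # 24
print(prod([]))         # 1
print(sum([]))          # 0
```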

Of course this is not good for summing other types of objects. But how
clumsy would it be to use

sum(L + [0])

or

if L:
    value = sum(L)
else:
    value = 0

instead of sum(L).

Once again, this is what sum() is used for in most cases, so this
behavior is the "expected" one.

Another argument to convince you: the sum() function in SQL for empty
row sets returns zero in most relational databases.

But of course it could have been implemented in a different way... I
believe that there have been excessive discussions about this decision,
and the current implementation is very good, if not the best.

Best,

Laszlo

MRAB

Sep 3, 2008, 6:40:35 PM

On Sep 3, 8:18 pm, Laszlo Nagy <gand...@shopzeus.com> wrote:

An alternative would be for the start value to default to None, which
would mean no start value. At the moment it starts with the start
value and then 'adds' the items in the sequence to it, but it could
start with the first item and then 'add' the following items to it.
So:

sum([1, 2, 3]) => 6
sum(["a", "b", "c"]) => "abc"

For backward compatibility, if the sequence is empty and the start
value is None then return 0.
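
A rough sketch of that proposal, written as a separate function rather than a change to the built-in. The name sum_from_first is made up for illustration:

```python
def sum_from_first(iterable, start=None):
    """Add items together, seeding with the first item when start is None."""
    it = iter(iterable)
    if start is None:
        try:
            total = next(it)
        except StopIteration:
            return 0  # backward-compatible result for an empty input
    else:
        total = start
    for item in it:
        total = total + item
    return total

print(sum_from_first([1, 2, 3]))        # 6
print(sum_from_first(["a", "b", "c"]))  # abc
print(sum_from_first([]))               # 0
```

The empty case is still a guess: with no items and no start value there is nothing to tell the function which kind of "zero" is wanted.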

bearoph...@lycos.com

Sep 3, 2008, 7:00:26 PM

Laszlo Nagy:

> I believe that there have been excessive discussions about this
> decision, and the current implementation is very good, if not the best.

I see. But note that my post is mostly about the max()/min()
functions :-)

Bye,
bearophile

Mensanator

Sep 3, 2008, 7:20:39 PM

On Sep 3, 2:18 pm, Laszlo Nagy <gand...@shopzeus.com> wrote:

> bearophileH...@lycos.com wrote:
> > Empty Python lists [] don't know the type of the items they will
> > contain, so this sounds strange:
>
> >>>> sum([])
>
> > 0
>
> > Because that [] may be an empty sequence of someobject:
>
> You are right in that sum could be used to sum arbitrary objects.
> However, in 99.99% of the cases, you will be summing numerical values.
> When adding real numbers, the neutral element is zero. ( X + 0 = X) It
> is very logical to return zero for empty sequences.

No it isn't. Nothing is not 0, check with MS-Access, for instance:

Null + 1 returns Null. Any arithmetic expression involving a
Null evaluates to Null. Adding something to an unknown returns
an unknown, as it should.

It is a logical fallacy to equate unknown with 0.

For example, the water table elevation in ft above Mean Sea Level
is WTE = TopOfCasing - DepthToWater.

TopOfCasing is usually known and constant (until resurveyed).
But DepthToWater may or may not exist for a given event (well
may be covered with fire ants, for example).

Now, if you equate Null with 0, then the WTE calculation says
the water table elevation is flush with the top of the well,
falsely implying that the site is underwater.

And, since this particular site is on the Mississippi River,
it sometimes IS underwater, but this is NEVER determined by
water table elevations, which, due to the CORRECT treatment
of Nulls by Access, never returns FALSE calculations.

>>> sum([])
0

is a bug, just as it's a bug in Excel to evaluate blank cells
as 0. It should return None or throw an exception like sum([None,1])
does.

castironpi

Sep 3, 2008, 7:34:28 PM

Two thoughts:
1/ 'Reduce' has a 'default' argument-- they call it 'initial'.

>>> reduce( max, [ 0, 1, 2, 3 ] )
3
>>> reduce( max, [ 0, 1, 2, 'a' ] )
'a'
>>> reduce( max, [ 0, 1, 2, 'a', 'b' ] )
'b'

2/ Introduce a 'max' class object that takes a default type or default
argument. Query the default for an 'additive' identity, or query for
a 'comparative' identity, comparisons to which always return true; or
call the constructor with no arguments to construct one.
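
A sketch of the first idea, using the third argument of reduce as the fallback for an empty input. In Python 3 reduce lives in functools, and comparing ints with strs as in the session above raises TypeError there, so this example sticks to numbers. The name smallvalue is borrowed from the first post:

```python
from functools import reduce

smallvalue = float("-inf")

print(reduce(max, [0, 1, 2, 3], smallvalue))  # 3
print(reduce(max, [], smallvalue))            # -inf

# Without an initial value, an empty input raises TypeError.
try:
    reduce(max, [])
except TypeError as exc:
    print("TypeError:", exc)
```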

Steven D'Aprano

Sep 3, 2008, 9:30:14 PM

On Wed, 03 Sep 2008 16:20:39 -0700, Mensanator wrote:

>>>> sum([])
> 0
>
> is a bug, just as it's a bug in Excel to evaluate blank cells as 0. It
> should return None or throw an exception like sum([None,1]) does.

You're wrong, because 99.9% of the time when users leave a blank cell in
Excel, they want it to be treated as zero. Spreadsheet sum() is not the
same as mathematician's sum, which doesn't have a concept of "blank
cells". (But if it did, it would treat them as zero, since that's the
only useful thing and mathematicians are just as much pragmatists as
spreadsheet users.) The Excel code does the right thing, and your "pure"
solution would do the unwanted and unexpected thing and is therefore
buggy.

Bugs are defined by "does the code do what the user wants it to do?", not
"is it mathematically pure?". The current behaviour of sum([]) does the
right thing for the 99% of the time when users expect an integer. And the
rest of the time, they have to specify a starting value for the sum
anyway, and so sum([], initial_value) does the right thing *always*.

The only time it does the wrong thing[1] is when you forget to pass an
initial value but expect a non-numeric result. And that's the
programmer's error, not a function bug.

[1] I believe it also does the wrong thing by refusing to sum strings,
but that's another story.

--
Steven

Luis Zarrabeitia

Sep 3, 2008, 11:40:44 PM

to Laszlo Nagy, pytho...@python.org, bearoph...@lycos.com

Quoting Laszlo Nagy <gan...@shopzeus.com>:

> bearoph...@lycos.com wrote:
> > Empty Python lists [] don't know the type of the items they will
> > contain, so this sounds strange:
> >
> > >>> sum([])
> > 0
> >
> > Because that [] may be an empty sequence of someobject:
> >
>

> You are right in that sum could be used to sum arbitrary objects.
> However, in 99.99% of the cases, you will be summing numerical values.
> When adding real numbers, the neutral element is zero. ( X + 0 = X) It
> is very logical to return zero for empty sequences.

Even better:

help(sum) shows

===

sum(...)
sum(sequence, start=0) -> value

Returns the sum of a sequence of numbers (NOT strings) plus the value
of parameter 'start'. When the sequence is empty, returns start.
===

so the fact that sum([]) returns zero is just because the start value is zero...
sum([],object()) would return an object().
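
A quick check of that statement, as a sketch; marker is just an illustrative name:

```python
marker = object()

# With an empty input, sum() hands back the start value unchanged.
print(sum([], marker) is marker)  # True
print(sum([], 0.0))               # 0.0
print(sum([1, 2], 10))            # 13
```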

BTW, the original code:

>>> sum(s for s in ["a", "b"] if len(s) > 2)

wouldn't work anyway... it seems that sum doesn't like to sum strings:

>>> sum(['a','b'],'')

<type 'exceptions.TypeError'>: sum() can't sum strings [use ''.join(seq) instead]

Cheers,

--
Luis Zarrabeitia
Facultad de Matemática y Computación, UH
http://profesores.matcom.uh.cu/~kyrie

Mensanator

Sep 4, 2008, 1:20:43 AM

On Sep 3, 8:30 pm, Steven D'Aprano <st...@REMOVE-THIS-cybersource.com.au> wrote:
> On Wed, 03 Sep 2008 16:20:39 -0700, Mensanator wrote:
> >>>> sum([])
> > 0
>
> > is a bug, just as it's a bug in Excel to evaluate blank cells as 0. It
> > should return None or throw an exception like sum([None,1]) does.
>
> You're wrong, because 99.9% of the time when users leave a blank cell in
> Excel, they want it to be treated as zero.

Then 99.9% of users want the wrong thing. Microsoft knows that
this is a bug but refuses to fix it to prevent breaking legacy
documents (probably dating back to VisiCalc). When graphing data,
a missing value should be interpreted as a hole in the graph

+------+             +--+------+------+-----+

and not evaluated as 0

+------+             +--+------+------+-----+
        \           /
         \         /
          \       /
           \     /
            \   /
             \+/

(depending on the context of the graph, of course).

And Microsoft provides a workaround for graphs to make 0's
appear as holes. Of course, this will cause legitimate 0
values to disappear, so the workaround is inconsistent.

> Spreadsheet sum() is not the
> same as mathematician's sum, which doesn't have a concept of "blank
> cells". (But if it did, it would treat them as zero, since that's the
> only useful thing and mathematicians are just as much pragmatists as
> spreadsheet users.) The Excel code does the right thing, and your "pure"
> solution would do the unwanted and unexpected thing and is therefore
> buggy.

Apparently, you don't use databases or make surface contours.
Contour programs REQUIRE that blanks are null, not 0, so that
the Kriging algorithm interpolates around the holes rather than
return false calculations. Excel's treatment of blank cells is
inconsistent with Access' treatment of Nulls and therefore wrong,
anyway you slice it. Math isn't a democracy, what most people want
is irrelevant.

I don't pull these things out of my ass, it's real world stuff
I observe when I help CAD operators and such debug problems.

Maybe you want to say a bug is when it doesn't do what the
author intended, but I say if what the intention was is wrong,
then a perfect implementation is still a bug because it doesn't
do what it's supposed to do.

>
> Bugs are defined by "does the code do what the user wants it to do?", not
> "is it mathematically pure?".

Really? So you think math IS a democracy? There is no reason to
violate mathematical purity. If I don't get EXACTLY the same answer
from Excel, Access, Mathematica and Python, then SOMEBODY is wrong.
It would be a shame if that somebody was Python.

> The current behaviour of sum([]) does the
> right thing for the 99% of the time when users expect an integer.

Why shouldn't the users expect an exception? Isn't that why we have
try:except? Maybe 99% of users expect sum([])==0, but _I_ expect to
be able to distinguish an empty list from [4,-4].

> And the
> rest of the time, they have to specify a starting value for the sum
> anyway, and so sum([], initial_value) does the right thing *always*.

So if you really want [] to be 0, why not say sum([],0)?

Why shouldn't nothing added to nothing return nothing?
Having it evaluate to 0 is wrong 99.9% of the time.

Mensanator

Sep 4, 2008, 1:57:19 AM

I just checked and I mis-remembered how this works.
The option is for blanks to plot as holes or 0 or
be interpolated. 0 always plots as 0. The inconsistency
is that blanks are still evaluated as 0 in formulae
and macros.

Fredrik Lundh

Sep 4, 2008, 2:24:56 AM

to pytho...@python.org

Mensanator wrote:

> No it isn't. Nothing is not 0, check with MS-Access, for instance:
>
> Null + 1 returns Null. Any arithmetic expression involving a
> Null evaluates to Null. Adding something to an unknown returns
> an unknown, as it should.
>
> It is a logical fallacy to equate unknown with 0.

http://en.wikipedia.org/wiki/Empty_sum

"In mathematics, the empty sum, or nullary sum, is the result of adding
no numbers, in summation for example. Its numerical value is zero."

</F>

Steven D'Aprano

Sep 4, 2008, 2:26:06 AM

On Wed, 03 Sep 2008 22:20:43 -0700, Mensanator wrote:

> On Sep 3, 8:30 pm, Steven D'Aprano <st...@REMOVE-THIS-
> cybersource.com.au> wrote:
>> On Wed, 03 Sep 2008 16:20:39 -0700, Mensanator wrote:
>> >>>> sum([])
>> > 0
>>
>> > is a bug, just as it's a bug in Excel to evaluate blank cells as 0.
>> > It should return None or throw an exception like sum([None,1]) does.
>>
>> You're wrong, because 99.9% of the time when users leave a blank cell
>> in Excel, they want it to be treated as zero.
>
> Then 99.9% of users want the wrong thing.

It is to laugh.

> Microsoft knows that this is a bug

Says you.

> but refuses to fix it to prevent breaking legacy documents (probably
> dating back to VisiCalc). When graphing data, a missing value should be
> interpreted as a hole in the graph

"Graphing data" is not sum(). I don't expect graphing data to result in
the same result as sum(), why would I expect them to interpret input the
same way?

> +------+             +--+------+------+-----+

Why should the graphing application ignore blanks ("missing data"), but
sum() treat missing data as an error? That makes no sense at all.

> and not evaluated as 0
>

> And Microsoft provides a workaround for graphs to make 0's appear as
> holes. Of course, this will cause legitimate 0 values to disappear, so
> the workaround is inconsistent.

I'm not aware of any spreadsheet that treats empty cells as zero for the
purpose of graphing, and I find your claim that Excel can't draw graphs
with zero in them implausible, but I don't have a copy of Excel to test
it.

>> Spreadsheet sum() is not the
>> same as mathematician's sum, which doesn't have a concept of "blank
>> cells". (But if it did, it would treat them as zero, since that's the
>> only useful thing and mathematicians are just as much pragmatists as
>> spreadsheet users.) The Excel code does the right thing, and your
>> "pure" solution would do the unwanted and unexpected thing and is
>> therefore buggy.
>
> Apparently, you don't use databases or make surface contours.

Neither databases nor surface contours are sum(). What possible relevance
are they to the question of what sum() should do?

Do you perhaps imagine that there is only "ONE POSSIBLE CORRECT WAY" to
deal with missing data, and every function and program must deal with it
the same way?

> Contour programs REQUIRE that blanks are null, not 0

Lucky for them that null is not 0 then.

> so that the Kriging
> algorithm interpolates around the holes rather than return false
> calculations. Excel's treatment of blank cells is inconsistent with
> Access' treatment of Nulls and therefore wrong, anyway you slice it.

No no no, you messed that sentence up. What you *really* meant was:

"Access' treatment of Nulls is inconsistent with Excel's treatment of
blank cells and therefore wrong, anyway you slice it."

No of course not. That would be stupid, just as stupid as your sentence.
Excel is not Access. They do different things. Why should they
necessarily interpret data the same way?

> Maybe you want to say a bug is when it doesn't do what the author
> intended, but I say if what the intention was is wrong, then a perfect
> implementation is still a bug because it doesn't do what it's supposed to
> do.

Who decides what it is supposed to do if not the author? You, in your
ivory tower who doesn't care a fig for what people want the software to
do?

Bug report: "Software does what users want it to do."
Fix: "Make the software do something that users don't want."

Great.

>> Bugs are defined by "does the code do what the user wants it to do?",
>> not "is it mathematically pure?".
>
> Really? So you think math IS a democracy? There is no reason to violate
> mathematical purity.

You've given a good example yourself: the Kriging algorithm needs a Null
value which is not zero. There is no mathematical "null" which is
distinct from zero, so there's an excellent violation of mathematical
purity right there.

If I am given the job of adding up the number of widgets inside a box,
and the box is empty, I answer that there are 0 widgets inside it. If I
were to follow your advice and declare that "An error occurred, can't
determine the number of widgets inside an empty box!" people would treat
me as an idiot, and rightly so.

> If I don't get EXACTLY the same answer from Excel,
> Access, Mathematica and Python, then SOMEBODY is wrong. It would be a
> shame if that somebody was Python.

Well, Excel and Python agree that the sum of an empty list is 0. What do
Access and Mathematica do?

>> The current behaviour of sum([]) does the right thing for the 99% of
>> the time when users expect an integer.
>
> Why shouldn't the users expect an exception? Isn't that why we have
> try:except? Maybe 99% of users expect sum([])==0, but _I_ expect to be
> able to distinguish an empty list from [4,-4].

The way to distinguish lists is NOT to add them up and compare the sums:

>>> sum([4, -4]) == sum([0]) == sum([1, 2, 3, -6]) == sum([-1, 2, -1])
True

The correct way is by comparing the lists themselves:

>>> [] == [4, -4]
False

>> And the
>> rest of the time, they have to specify a starting value for the sum
>> anyway, and so sum([], initial_value) does the right thing *always*.
>
> So if you really want [] to be 0, why not say sum([],0)?

I don't want [] == 0. That's foolish. I want the sum of an empty list to
be 0, which is a very different thing.

And I don't need to say sum([],0) because the default value for the
second argument is 0.

> Why shouldn't nothing added to nothing return nothing? Having it
> evaluate to 0 is wrong 99.9% of the time.

It is to laugh.

What's the difference between having 0 widgets in a box and having an
empty box with, er, no widgets in it?

--
Steven

Mensanator

Sep 4, 2008, 4:28:46 AM

On Sep 4, 1:26 am, Steven D'Aprano <st...@REMOVE-THIS-cybersource.com.au> wrote:
> On Wed, 03 Sep 2008 22:20:43 -0700, Mensanator wrote:
> > On Sep 3, 8:30 pm, Steven D'Aprano <st...@REMOVE-THIS-
> > cybersource.com.au> wrote:
> >> On Wed, 03 Sep 2008 16:20:39 -0700, Mensanator wrote:
> >> >>>> sum([])
> >> > 0
>
> >> > is a bug, just as it's a bug in Excel to evaluate blank cells as 0.
> >> > It should return None or throw an exception like sum([None,1]) does.
>
> >> You're wrong, because 99.9% of the time when users leave a blank cell
> >> in Excel, they want it to be treated as zero.
>
> > Then 99.9% of users want the wrong thing.
>
> It is to laugh.
>
> > Microsoft knows that this is a bug
>
> Says you.
>
> > but refuses to fix it to prevent breaking legacy documents (probably
> > dating back to VisiCalc). When graphing data, a missing value should be
> > interpreted as a hole in the graph
>
> "Graphing data" is not sum(). I don't expect graphing data to result in
> the same result as sum(), why would I expect them to interpret input the
> same way?
>
> > +------+             +--+------+------+-----+
>
> Why should the graphing application ignore blanks ("missing data"), but
> sum() treat missing data as an error? That makes no sense at all.

Maybe it's important to know data is missing. You can see
the holes in a graph. You can't see the holes in a sum.

>
> > and not evaluated as 0
>
> > And Microsoft provides a workaround for graphs to make 0's appear as
> > holes. Of course, this will cause legitimate 0 values to disappear, so
> > the workaround is inconsistent.
>
> I'm not aware of any spreadsheet that treats empty cells as zero for the
> purpose of graphing, and I find your claim that Excel can't draw graphs
> with zero in them implausible, but I don't have a copy of Excel to test
> it.

That was a mistake. I made a followup correction, but
you probably didn't see it.

>
> >> Spreadsheet sum() is not the
> >> same as mathematician's sum, which doesn't have a concept of "blank
> >> cells". (But if it did, it would treat them as zero, since that's the
> >> only useful thing and mathematicians are just as much pragmatists as
> >> spreadsheet users.) The Excel code does the right thing, and your
> >> "pure" solution would do the unwanted and unexpected thing and is
> >> therefore buggy.
>
> > Apparently, you don't use databases or make surface contours.
>
> Neither databases nor surface contours are sum(). What possible relevance
> are they to the question of what sum() should do?

Because a sum that includes Nulls isn't valid. If you treated
Nulls as 0, then not only would your sum be wrong, but so
would your count and the average based on those. Now you
can EXPLICITLY tell the database to only consider non-Null
values, which doesn't change the total, but DOES change
the count.

>
> Do you perhaps imagine that there is only "ONE POSSIBLE CORRECT WAY" to
> deal with missing data, and every function and program must deal with it
> the same way?

But that's what sum() is doing now, treating sum([]) the same
as sum([],0). Why isn't sum() defined such that "...if list
is empty, return start, IF SPECIFIED, otherwise raise exception."
Then, instead of "ONE POSSIBLE CORRECT WAY", the user could
specify whether he wants Excel compatible behaviour or
Access compatible behaviour.
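
A sketch of what such a definition could look like, as a wrapper around the built-in. Both strict_sum and the _MISSING sentinel are hypothetical names invented for this example:

```python
_MISSING = object()  # sentinel meaning "no start value was given"

def strict_sum(iterable, start=_MISSING):
    """Like sum(), but an empty input is an error unless start is given."""
    items = list(iterable)
    if not items and start is _MISSING:
        raise ValueError("strict_sum() arg is an empty sequence")
    return sum(items, 0 if start is _MISSING else start)

print(strict_sum([4, -4]))  # 0
print(strict_sum([], 0))    # 0
try:
    strict_sum([])
except ValueError as exc:
    print("ValueError:", exc)
```

With this version [4, -4] and [] are told apart, at the price of a try/except or an explicit start value at every call site that may see an empty input.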

>
> > Contour programs REQUIRE that blanks are null, not 0
>
> Lucky for them that null is not 0 then.

No, but blank cells are 0 as far as Excel is concerned.
That behaviour causes nothing but trouble and I am
saddened to see Python emulate such nonsense.

>
> > so that the Kriging
> > algorithm interpolates around the holes rather than return false
> > calculations. Excel's treatment of blank cells is inconsistent with
> > Access' treatment of Nulls and therefore wrong, anyway you slice it.
>
> No no no, you messed that sentence up. What you *really* meant was:
>
> "Access' treatment of Nulls is inconsistent with Excel's treatment of
> blank cells and therefore wrong, anyway you slice it."
>
> No of course not. That would be stupid, just as stupid as your sentence.
> Excel is not Access. They do different things. Why should they
> necessarily interpret data the same way?

Because you want consistent results?

>
> > Maybe you want to say a bug is when it doesn't do what the author
> > intended, but I say if what the intention was is wrong, then a perfect
> > implementation is still a bug because it doesn't do what it's supposed to
> > do.
>
> Who decides what it is supposed to do if not the author?

The author can't change math on a whim.

> You, in your ivory tower who doesn't care a fig for
> what people want the software to do?

True, I could care less what people want to do...

...as long as they do it consistently.

>
> Bug report: "Software does what users want it to do."
> Fix: "Make the software do something that users don't want."

What the users want doesn't carry any weight with respect
to what the database wants. The user must conform to the
needs of the database because the other way ain't ever gonna
happen.

>
> Great.

If only. But then, I probably wouldn't have a job.

>
> >> Bugs are defined by "does the code do what the user wants it to do?",
> >> not "is it mathematically pure?".
>
> > Really? So you think math IS a democracy? There is no reason to violate
> > mathematical purity.
>
> You've given a good example yourself: the Kriging algorithm needs a Null
> value which is not zero. There is no mathematical "null" which is
> distinct from zero, so there's an excellent violation of mathematical
> purity right there.

Hey, I was talking databases, you brought up mathematical purity.

>
> If I am given the job of adding up the number of widgets inside a box,
> and the box is empty, I answer that there are 0 widgets inside it.

Right. It has a known quantity and that quantity is 0.
Just because the box is empty doesn't mean the quantity
is Null.

> If I
> were to follow your advice and declare that "An error occurred, can't
> determine the number of widgets inside an empty box!" people would treat
> me as an idiot, and rightly so.

Right. But a better analogy is when a new shipment is due
but hasn't arrived yet so the quantity is unknown. Now the
boss comes up and says he needs to ship 5 widgets tomorrow
and asks how many you have. You say 0. Now the boss runs
out to Joe's Widget Emporium and pays retail only to discover
when he gets back that the shipment has arrived containing
12 widgets. Because you didn't say "I don't know, today's
shipment isn't here yet", the boss not only thinks you're
an idiot, but he fires you as well.

>
> > If I don't get EXACTLY the same answer from Excel,
> > Access, Mathematica and Python, then SOMEBODY is wrong. It would be a
> > shame if that somebody was Python.
>
> Well, Excel and Python agree that the sum of an empty list is 0. What do
> Access and Mathematica do?

I don't know about Mathematica, but if you EXPLICITLY
tell Access to sum only the non-Null values, you'll get the
same answer Excel does. Otherwise, any expression that
includes a Null evaluates to Null, which certainly isn't
the same answer Excel gives.

>
> >> The current behaviour of sum([]) does the right thing for the 99% of
> >> the time when users expect an integer.
>
> > Why shouldn't the users expect an exception? Isn't that why we have
> > try:except? Maybe 99% of users expect sum([])==0, but _I_ expect to be
> > able to distinguish an empty list from [4,-4].
>
> The way to distinguish lists is NOT to add them up and compare the sums:
>
> >>> sum([4, -4]) == sum([0]) == sum([1, 2, 3, -6]) == sum([-1, 2, -1])
>
> True
>
> The correct way is by comparing the lists themselves:
>
> >>> [] == [4, -4]
>
> False
>
> >> And the
> >> rest of the time, they have to specify a starting value for the sum
> >> anyway, and so sum([], initial_value) does the right thing *always*.
>
> > So if you really want [] to be 0, why not say sum([],0)?
>
> I don't want [] == 0. That's foolish. I want the sum of an empty list to
> be 0, which is a very different thing.

In certain circumstances. In others, an empty list summing
to 0 is just as foolish. That's why sum([]) should be an
error, so you can have it either way.

Isn't one of Python's slogans "Explicit is better than implicit"?

>
> And I don't need to say sum([],0) because the default value for the
> second argument is 0.

That's the problem. There is no justification for assuming
that unknown quantities are 0.

>
> > Why shouldn't nothing added to nothing return nothing? Having it
> > evaluate to 0 is wrong 99.9% of the time.
>
> It is to laugh.
>
> What's the difference between having 0 widgets in a box and having an
> empty box with, er, no widgets in it?

There are no "empty" boxes. There are only boxes with
known quantities and those with unknown quantities.
I hope that's not too ivory tower.

>
> --
> Steven

Thomas Bellman

Sep 4, 2008, 3:05:19 AM

Mensanator <mensa...@aol.com> wrote:

> No, but blank cells are 0 as far as Excel is concerned.
> That behaviour causes nothing but trouble and I am
> saddened to see Python emulate such nonsense.

Then you should feel glad that the Python sum() function *does*
signal an error for the closest equivalent of "blank cells" in
a list:

>>> sum([1, 2, 3, None, 5, 6])
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: unsupported operand type(s) for +: 'int' and 'NoneType'

Summing the elements of an empty list is *not* the same thing as
summing elements of a list where one element is None.

> There are no "empty" boxes. There are only boxes with
> known quantities and those with unknown quantities.
> I hope that's not too ivory tower.

The sum() function in Python requires exactly one box. That box
can be empty, can contain "known quantities" (numbers, presumably),
or "unknown quantities" (non-numbers, e.g., None). But you can't
give it zero boxes, or three boxes.

I don't have a strong view of whether sum([]) should return 0 or
raise an error, but please do not mix that question up with what
a sum over empty cells or over NULL values should yield. They
are very different questions.

As it happens, the SQL sum() function (at least in MySQL; I don't
have any other database easily available, nor any SQL standard to
read) does return NULL for a sum over the empty sequence, so you
could argue that that would be the correct behaviour for the
Python sum() function as well, but you can't argue that because a
sum *involving* a NULL value returns NULL.
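
The distinction can be tried from Python itself. This sketch uses the sqlite3 module from the standard library, so it shows SQLite's behaviour rather than MySQL's; the table t and column x are invented for the example:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (x INTEGER)")

# SUM over zero rows yields NULL, which sqlite3 maps to None.
empty_sum = conn.execute("SELECT SUM(x) FROM t").fetchone()[0]
print(empty_sum)  # None

# The SUM aggregate skips NULLs among the rows.
conn.executemany("INSERT INTO t VALUES (?)", [(1,), (None,), (5,)])
mixed_sum = conn.execute("SELECT SUM(x) FROM t").fetchone()[0]
print(mixed_sum)  # 6

# A plain arithmetic expression involving NULL, by contrast, is NULL.
null_plus_one = conn.execute("SELECT NULL + 1").fetchone()[0]
print(null_plus_one)  # None
conn.close()
```

So "sum over no rows", "sum over rows that include a NULL" and "NULL + 1" are three separate cases, each with its own answer.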

--
Thomas Bellman, Lysator Computer Club, Linköping University, Sweden
"This isn't right. This isn't even wrong." ! bellman @ lysator.liu.se
-- Wolfgang Pauli ! Make Love -- Nicht Wahr!

David C. Ullrich

Sep 4, 2008, 12:13:12 PM

In article
<719910b1-3776-4bf2...@25g2000prz.googlegroups.com>,
Mensanator <mensa...@aol.com> wrote:

> On Sep 3, 2:18 pm, Laszlo Nagy <gand...@shopzeus.com> wrote:
> > bearophileH...@lycos.com wrote:
> > > Empty Python lists [] don't know the type of the items they will
> > > contain, so this sounds strange:
> >
> > >>>> sum([])
> >
> > > 0
> >
> > > Because that [] may be an empty sequence of someobject:
> >
> > You are right in that sum could be used to sum arbitrary objects.
> > However, in 99.99% of the cases, you will be summing numerical values.
> > When adding real numbers, the neutral element is zero. ( X + 0 = X) It
> > is very logical to return zero for empty sequences.
>
> No it isn't. Nothing is not 0, check with MS-Access, for instance:
>
> Null + 1 returns Null. Any arithmetic expression involving a
> Null evaluates to Null. Adding something to an unknown returns
> an unknown, as it should.
>
> It is a logical fallacy to equate unknown with 0.

Which has nothing to do with the "right" value for an
empty sum. If they hear about what you said here in
sci.math they're gonna kick you out - what do you
imagine the universally accepted value of \sum_{j=1}^0
is?

--
David C. Ullrich

David C. Ullrich

unread,

Sep 4, 2008, 12:17:37 PM9/4/08

to

In article
<24061e7d-935a-442f...@34g2000hsh.googlegroups.com>,
bearoph...@lycos.com wrote:

> Empty Python lists [] don't know the type of the items it will
> contain, so this sounds strange:
>
> >>> sum([])
> 0
>
> Because that [] may be an empty sequence of someobject:
>
> >>> sum(s for s in ["a", "b"] if len(s) > 2)
> 0
>
> In a statically typed language in that situation you may answer the
> initializer value of the type of the items of the list, as I do in the
> sum() in D.
>
> This sounds like a more correct/clean thing to do:
>
> >>> max([])
> Traceback (most recent call last):
> File "<stdin>", line 1, in <module>
> ValueError: max() arg is an empty sequence
>
> So it may be better to make the sum([]) too raise a ValueError,

I don't see why you feel the two should act the same.
At least in mathematics, the sum of the elements of
the empty set _is_ 0, while the maximum element of the
empty set is undefined.

And both for good reason:

(i) If A and B are disjoint sets we certainly want to
have sum(A union B) = sum(A) + sum(B). This requires
sum(empty set) = 0.

(ii) If A is a subset of B then we should have
max(A) <= max(B). This requires that max(empty set)
be something that's smaller than everything else.
So we give up on that.
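The two requirements above can be checked directly in Python. This is only a small sketch: the helper name split_sum_holds and the sample sets are made up for illustration, and sets of ints are used so that summation order does not matter.

```python
def split_sum_holds(a, b):
    """Check sum(A union B) == sum(A) + sum(B) for disjoint sets a and b."""
    assert not (a & b), "the identity is only claimed for disjoint sets"
    return sum(a | b) == sum(a) + sum(b)


evens = {2, 4, 6}
nothing = set()

# With B empty, the identity reduces to sum(A) == sum(A) + sum(set()),
# which can only hold if sum(set()) is 0.
print(split_sum_holds(evens, nothing))

# max() has no value that is smaller than everything else, so it refuses.
try:
    max(nothing)
except ValueError as exc:
    print("max of the empty set: %s" % exc)
```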

> in
> Python 3/3.1 (if this isn't already true). On the other hand often
> enough I have code like this:
>
> >>> max(fun(x) for x in iterable if predicate(x))
>
> This may raise the ValueError both if iterable is empty or if the
> predicate on its items is always false, so instead of catching
> exceptions, that I try to avoid, I usually end with a normal loop,
> that's readable and fast:
>
> max_value = smallvalue
> for x in iterable:
>     if predicate(x):
>         max_value = max(max_value, fun(x))
>
> Where running speed matters, I may even replace that max(max_value,
> fun(x)) with a more normal if/else.
>
> A possible alternative is to add a default to max(), like the next()
> built-in of Python 2.6:
>
> >>> max((fun(x) for x in iterable if predicate(x)), default=smallvalue)
>
> This returns smallvalue if there are no items to compute the max of.
>
> Bye,
> bearophile

--
David C. Ullrich

Mensanator

unread,

Sep 4, 2008, 1:26:01 PM9/4/08

to

On Sep 4, 11:13 am, "David C. Ullrich" <dullr...@sprynet.com> wrote:
> In article
> <719910b1-3776-4bf2-a0b6-236f3167e...@25g2000prz.googlegroups.com>,

> Mensanator <mensana...@aol.com> wrote:
> > On Sep 3, 2:18 pm, Laszlo Nagy <gand...@shopzeus.com> wrote:
> > > bearophileH...@lycos.com wrote:
> > > > Empty Python lists [] don't know the type of the items it will
> > > > contain, so this sounds strange:
>
> > > >>>> sum([])
>
> > > > 0
>
> > > > Because that [] may be an empty sequence of someobject:
>
> > > You are right in that sum could be used to sum arbitrary objects.
> > > However, in 99.99% of the cases, you will be summing numerical values.
> > > When adding real numbers, the neutral element is zero. ( X + 0 = X) It
> > > is very logical to return zero for empty sequences.
>
> > No it isn't. Nothing is not 0, check with MS-Access, for instance:
>
> > Null + 1 returns Null. Any arithmetic expression involving a
> > Null evaluates to Null. Adding something to an unknown returns
> > an unknown, as it should.
>
> > It is a logical fallacy to equate unknown with 0.
>
> Which has nothing to do with the "right" value for an
> empty sum.

I'm less concerned about the "right" value than a consistent
value. I'm fairly certain you can't get 0 from a query that
returns no records, so I don't like seeing empty being
treated as 0, even if it means that in set theory because
databases aren't sets.

> If they hear about what you said here in
> sci.math they're gonna kick you out

They usually don't kick me out, just kick me.

> - what do you
> imagine the universally accepted value of \sum_{j=1}^0
> is?

I can't follow your banter, so I'm not sure what it should be.

Mensanator

unread,

Sep 4, 2008, 1:57:35 PM9/4/08

to

On Sep 4, 2:05 am, Thomas Bellman <bell...@lysator.liu.se> wrote:

> Mensanator <mensana...@aol.com> wrote:
> > No, but blank cells are 0 as far as Excel is concerned.
> > That behaviour causes nothing but trouble and I am
> > saddened to see Python emulate such nonsense.
>
> Then you should feel glad that the Python sum() function *does*
> signal an error for the closest equivalent of "blank cells" in
> a list:
>
> >>> sum([1, 2, 3, None, 5, 6])
> Traceback (most recent call last):
> File "<stdin>", line 1, in <module>
> TypeError: unsupported operand type(s) for +: 'int' and 'NoneType'

Yes, I am in fact happy to see that behaviour.

>
> Summing the elements of an empty list is *not* the same thing as
> summing elements of a list where one element is None.

So,

>>> sum([1, 2, 3, None, 5, 6])
Traceback (most recent call last):
  File "<pyshell#0>", line 1, in <module>
    sum([1, 2, 3, None, 5, 6])
TypeError: unsupported operand type(s) for +: 'int' and 'NoneType'

gives me an error.

As does

>>> sum([None, None, None, None, None, None])
Traceback (most recent call last):
  File "<pyshell#1>", line 1, in <module>
    sum([None, None, None, None, None, None])
TypeError: unsupported operand type(s) for +: 'int' and 'NoneType'

Why then, doesn't

>>> sum([A for A in [None, None, None, None, None, None] if A != None])
0

give me an error?

Ok, it's not a bug.

"This behaviour is by design." - Microsoft Knowledge Base

I don't like it, but I guess I'll just have to live with it.

>
> > There are no "empty" boxes. There are only boxes with
> > known quantities and those with unknown quantities.
> > I hope that's not too ivory tower.
>
> The sum() function in Python requires exactly one box. That box
> can be empty, can contain "known quantities" (numbers, presumably),
> or "unknown quantities" (non-numbers, e.g., None). But you can't
> give it zero boxes, or three boxes.
>
> I don't have a strong view of whether sum([]) should return 0 or
> raise an error, but please do not mix that question up with what
> a sum over empty cells or over NULL values should yield. They
> are very different questions.

Ok, but I don't understand why an empty list is a valid sum
whereas a list containing None is not.

>
> As it happens, the SQL sum() function (at least in MySQL; I don't
> have any other database easily available, nor any SQL standard to
> read) does return NULL for a sum over the empty sequence, so you
> could argue that that would be the correct behaviour for the
> Python sum() function as well, but you can't argue that because a
> sum *involving* a NULL value returns NULL.

I'm not following that. Are you saying a query that returns no
records doesn't have a specific field containing a Null so there
are no Nulls to poison the sum? ...tap...tap...tap. Ok, I can see
that, but you don't get 0 either.

Wojtek Walczak

unread,

Sep 4, 2008, 2:09:14 PM9/4/08

to

On Thu, 4 Sep 2008 10:57:35 -0700 (PDT), Mensanator wrote:

> Why then, doesn't
>
>>>> sum([A for A in [None, None, None, None, None, None] if A != None])
> 0
>
> give me an error?

Because "[A for A in [None, None, None, None, None, None] if A != None]"
returns an empty list, and sum([]) doesn't return an error. What did you
expect?

--
Regards,
Wojtek Walczak,
http://tosh.pl/gminick/

bearoph...@lycos.com

unread,

Sep 4, 2008, 3:42:21 PM9/4/08

to

David C. Ullrich:

> At least in mathematics, the sum of the elements of
> the empty set _is_ 0, while the maximum element of the
> empty set is undefined.

What do you think about my idea of adding that 'default' argument to
the max()/min() functions?

Bye,
bearophile

castironpi

unread,

Sep 4, 2008, 4:25:05 PM9/4/08

to

For max and min, why can't you just add your argument to the set
itself?

The reason max([]) is undefined is that max( S ) is in S. The reason
sum([]) is 0 is that sum( [ x ] ) - x = 0.

bearoph...@lycos.com

unread,

Sep 4, 2008, 4:43:43 PM9/4/08

to

castironpi:

> For max and min, why can't you just add your argument to the set
> itself?

Sometimes that can be done, but in many other situations it's less
easy, like in the example I have shown in my first post:

max((fun(x) for x in iterable if predicate(x)))

There are some ways to add the max there, for example using an
itertools.chain to chain the default value to the end of the iterable,
but most of the time I just write a for loop.
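A minimal sketch of the itertools.chain approach mentioned above. The helper name max_with_fallback and the sample fun/predicate are hypothetical. Note that this only behaves like a default when smallvalue really is smaller than every possible item; otherwise it wins the comparison. Later Python versions (3.4 onward) accept a default keyword argument in max() and min(), which avoids that caveat.

```python
from itertools import chain


def max_with_fallback(values, smallvalue):
    """Largest item of values, with smallvalue chained on as a last candidate."""
    return max(chain(values, [smallvalue]))


def fun(x):
    # Hypothetical stand-in for the fun() in the original post
    return x * x


def predicate(x):
    # Hypothetical stand-in: keep even numbers only
    return x % 2 == 0


odd_only = [1, 3, 5]   # predicate is false for every item
mixed = [1, 2, 4]

print(max_with_fallback((fun(x) for x in odd_only if predicate(x)), -1))
print(max_with_fallback((fun(x) for x in mixed if predicate(x)), -1))
```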

Bye,
bearophile

Thomas Bellman

unread,

Sep 4, 2008, 1:31:20 PM9/4/08

to

Mensanator <mensa...@aol.com> wrote:

> Ok, but I don't understand why an empty list is a valid sum
> whereas a list containing None is not.

You can't conclude the behaviour of the one from the behaviour
of the other, because the two situations have nothing at all in
common.

>> As it happens, the SQL sum() function (at least in MySQL; I don't
>> have any other database easily available, nor any SQL standard to
>> read) does return NULL for a sum over the empty sequence, so you
>> could argue that that would be the correct behaviour for the
>> Python sum() function as well, but you can't argue that because a
>> sum *involving* a NULL value returns NULL.

> I'm not following that. Are you saying a query that returns no
> records doesn't have a specific field containing a Null so there
> are no Nulls to poison the sum? ...tap...tap...tap. Ok, I can see
> that,

Exactly.

> but you don't get 0 either.

That's because the SQL sum() has a special case for "no rows
returned". A *different* special case from the one that taints
the sum when encountering a NULL. It does the equivalent of

if len(rows_returned) == 0:
    # Special case for no rows returned
    return NULL
total = 0
for row in rows_returned:
    value = row[column]
    if value is NULL:
        # Special case for encountering a NULL value
        return NULL
    total += value
return total

Two different special cases for the two different situations. If
you were to remove the special case for no rows returned, you
would get zero when the SELECT statement finds no rows, but the
sum would still be tainted when a NULL value is encountered.

The definition of sum in mathematics *does* do away with that
special case. The sum of zero terms is zero. And the Python
sum() function follows the mathematics definition in this
respect, not the SQL definition.
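For anyone who wants to compare the two definitions without a database server, here is a rough sketch using the sqlite3 module from the standard library. The table name readings and its column are made up for the example, and the results are SQLite's, so other databases should be checked separately. One caveat worth testing on your own database: in SQLite the SUM() aggregate skips NULL rows rather than being tainted by them, and it is the + operator that propagates NULL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE readings (value INTEGER)")

# No rows yet: the aggregate yields NULL, which sqlite3 maps to None.
empty_sql = conn.execute("SELECT SUM(value) FROM readings").fetchone()[0]

# An arithmetic expression involving NULL is NULL as well.
null_plus_one = conn.execute("SELECT NULL + 1").fetchone()[0]

# Add two numbers and one NULL row, then aggregate again.
conn.executemany("INSERT INTO readings VALUES (?)", [(1,), (None,), (2,)])
mixed_sql = conn.execute("SELECT SUM(value) FROM readings").fetchone()[0]

print("SQL SUM over no rows:   %r" % (empty_sql,))
print("SQL NULL + 1:           %r" % (null_plus_one,))
print("SQL SUM over 1,NULL,2:  %r" % (mixed_sql,))
print("Python sum([]):         %r" % (sum([]),))
```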

You can argue that Python sum() should have special cased the
empty sequence. It's not an illogical stance to take. It's just
a totally different issue from encountering a non-numeric element
in the sequence. In some cases it might actually make sense to
treat the empty sequence as an error, but just ignore non-numeric
elements (i.e., treat them as if they were zero). And in some
cases both should be an error, and in some neither should be an
error.

--
Thomas Bellman, Lysator Computer Club, Linköping University, Sweden

"You are in a twisty little passage of ! bellman @ lysator.liu.se
standards, all conflicting." ! Make Love -- Nicht Wahr!

Mensanator

unread,

Sep 4, 2008, 9:09:49 PM9/4/08

to

On Sep 4, 12:31 pm, Thomas Bellman <bell...@lysator.liu.se> wrote:

> Mensanator <mensana...@aol.com> wrote:
> > Ok, but I don't understand why an empty list is a valid sum
> > whereas a list containing None is not.
>
> You can't conclude the behaviour of the one from the behaviour
> of the other, because the two situations have nothing at all in
> common.

I wouldn't say they have nothing in common. After all, neither []
nor [None,None] contain any integers, yet summing the first gives
us an integer result whereas summing the second does not. Yes, for
different unrelated reasons, but sometimes reasons aren't as important
as results.

Too bad. I brought this up because I use Python a lot with
database work and rarely for proving theorems in ZFC.

Guess I have to work around it. Just one more 'gotcha' to
keep track of.

>
> You can argue that Python sum() should have special cased the
> empty sequence.

I did.

> It's not an illogical stance to take.

I didn't think so.

> It's just
> a totally different issue from encountering a non-numeric element
> in the sequence.

Ok, I was paying more attention to the outcome.

> In some cases it might actually make sense to
> treat the empty sequence as an error, but just ignore non-numeric
> elements (i.e, treat them as if they were zero).

Ouch. Sounds like Excel and we don't want to go there.

Manu Hack

unread,

Sep 5, 2008, 4:28:44 AM9/5/08

to pytho...@python.org

On Thu, Sep 4, 2008 at 4:25 PM, castironpi <casti...@gmail.com> wrote:
> On Sep 4, 2:42 pm, bearophileH...@lycos.com wrote:
>> David C. Ullrich:
>>
>> > At least in mathematics, the sum of the elements of
>> > the empty set _is_ 0, while the maximum element of the
>> > empty set is undefined.
>>
>> What do you think about my idea of adding that 'default' argument to
>> the max()/min() functions?
>>
>> Bye,
>> bearophile
>
> For max and min, why can't you just add your argument to the set
> itself?
>
> The reason max([]) is undefined is that max( S ) is in S.

It makes sense.

>The reason sum([]) is 0 is that sum( [ x ] ) - x = 0.

It doesn't make sense to me. What do you set x to?

Ken Starks

unread,

Sep 5, 2008, 5:19:59 AM9/5/08

to

I suppose the following is accepted by statisticians. For
reference, here is what the 'R' statistics package says on
the subject (if you type 'help(sum)'):

<quote>
Sum of Vector Elements
Description
sum returns the sum of all the values present in its arguments.

Usage
sum(..., na.rm = FALSE)

Arguments
... numeric or complex or logical vectors.
na.rm logical. Should missing values be removed?

Details
This is a generic function: methods can be defined for it directly or
via the Summary group generic. For this to work properly, the arguments
... should be unnamed, and dispatch is on the first argument.

If na.rm is FALSE an NA value in any of the arguments will cause a value
of NA to be returned, otherwise NA values are ignored.

Logical true values are regarded as one, false values as zero. For
historical reasons, NULL is accepted and treated as if it were integer(0).

Value
The sum. If all of ... are of type integer or logical, then the sum is
integer, and in that case the result will be NA (with a warning) if
integer overflow occurs. Otherwise it is a length-one numeric or complex
vector.
NB: the sum of an empty set is zero, by definition.

References
Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S
Language. Wadsworth & Brooks/Cole.

Ken Starks

unread,

Sep 5, 2008, 8:09:28 AM9/5/08

to

David C. Ullrich wrote:

>
> I don't see why you feel the two should act the same.
> At least in mathematics, the sum of the elements of
> the empty set _is_ 0, while the maximum element of the
> empty set is undefined.
>
> And both for good reason:
>
> (i) If A and B are disjoint sets we certainly want to
> have sum(A union B) = sum(A) + sum(B). This requires
> sum(empty set) = 0.
>
> (ii) If A is a subset of B then we should have
> max(A) <= max(B). This requires that max(empty set)
> be something that's smaller than everything else.
> So we give up on that.

Do we give up? Really ?

From wikipedia: http://en.wikipedia.org/wiki/Empty_set
(Uses wikipedia's LaTeX notation -- I hope those interested
are OK with that )

<quote>
Mathematics

[edit] Extended real numbers

Since the empty set has no members, when it is considered as a subset of
any ordered set, then any member of that set will be an upper bound and
lower bound for the empty set. For example, when considered as a subset
of the real numbers, with its usual ordering, represented by the real
number line, every real number is both an upper and lower bound for the
empty set.[3] When considered as a subset of the extended reals formed
by adding two "numbers" or "points" to the real numbers, namely negative
infinity, denoted -\infty, which is defined to be less than every
other extended real number, and positive infinity, denoted +\infty,
which is defined to be greater than every other extended real number, then:

\sup\varnothing=\min(\{-\infty, +\infty \} \cup \mathbb{R})=-\infty,

and

\inf\varnothing=\max(\{-\infty, +\infty \} \cup \mathbb{R})=+\infty.

That is, the least upper bound (sup or supremum) of the empty set is
negative infinity, while the greatest lower bound (inf or infimum) is
positive infinity. By analogy with the above, in the domain of the
extended reals, negative infinity is the identity element for the
maximum and supremum operators, while positive infinity is the identity
element for minimum and infimum.
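In Python terms, the identity-element idea in the quote can be sketched with functools.reduce and the float infinities. The function names supremum and infimum are made up here, and this only covers finite collections of ordinary numbers, where sup and max coincide whenever the collection is non-empty.

```python
from functools import reduce

NEG_INF = float("-inf")
POS_INF = float("inf")


def supremum(values):
    """Fold max over values, starting from the identity element -infinity."""
    return reduce(max, values, NEG_INF)


def infimum(values):
    """Fold min over values, starting from the identity element +infinity."""
    return reduce(min, values, POS_INF)


a = [3, 1, 2]
b = []

print(supremum(b), infimum(b))
# Splitting a collection never changes the answer, even if one part is empty:
print(supremum(a + b) == max(supremum(a), supremum(b)))
```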

David C. Ullrich

unread,

Sep 5, 2008, 11:16:38 AM9/5/08

to

In article <g9r7hq$ri6$1$8302...@news.demon.co.uk>,
Ken Starks <str...@lampsacos.demon.co.uk> wrote:

> David C. Ullrich wrote:
>
> >
> > I don't see why you feel the two should act the same.
> > At least in mathematics, the sum of the elements of
> > the empty set _is_ 0, while the maximum element of the
> > empty set is undefined.
> >
> > And both for good reason:
> >
> > (i) If A and B are disjoint sets we certainly want to
> > have sum(A union B) = sum(A) + sum(B). This requires
> > sum(empty set) = 0.
> >
> > (ii) If A is a subset of B then we should have
> > max(A) <= max(B). This requires that max(empty set)
> > be something that's smaller than everything else.
> > So we give up on that.
>
> Do we give up? Really ?

Erm, thanks. I was aware of all that below. If we're
being technical what's below is talking about the sup
and inf, which are not the same as max and min. More
relevant to the present context, I didn't mention what's
below because it doesn't seem likely that saying max([])
= -infinity and min([]) = +infinity is going to make the
OP happy...

--
David C. Ullrich

David C. Ullrich

unread,

Sep 5, 2008, 11:22:22 AM9/5/08

to

In article
<18c765e0-1dcb-4e40...@n38g2000prl.googlegroups.com>,
bearoph...@lycos.com wrote:

How the Python max and min functions should work has to
do with how people want them to work and how people expect
them to work. I wouldn't know about most people, but I
would have been surprised if min([]) was not an error,
and I would have been disappointed if sum([]) was not 0.

From a mathematical point of view, not that that's directly
relevant, it doesn't make much sense to me to add that default
argument. The max of a set is supposed to be the largest
element of that set. If the set is empty there's no such
thing.

In Python you'd better make sure that S is nonempty before
asking for max(S). That's not just Python - in math you need
to make certain that S is nonempty and also other conditions
before you're allowed to talk about max(S). That's just the
way it is.

Think about all the previously elected female or black
presidents of the US. Which one was the tallest?

bearoph...@lycos.com

unread,

Sep 5, 2008, 11:26:25 AM9/5/08

to

David C. Ullrich:

> I didn't mention what's below because it doesn't seem
> likely that saying max([]) = -infinity and
> min([]) = +infinity is going to make the OP happy...

Well, it sounds cute to have Neginfinite and Infinite as built-in
objects that can be compared to any other type and are less than or
greater than everything else but themselves. Probably they can be useful as
sentinels, but in Python I nearly never use sentinels anymore, and
they can probably give some other problems...

Bye,
bearophile

castironpi

unread,

Sep 5, 2008, 1:04:08 PM9/5/08

to

On Sep 5, 3:28 am, "Manu Hack" <manuh...@gmail.com> wrote:

For all x.

Mensanator

unread,

Sep 5, 2008, 1:51:02 PM9/5/08

to

On Sep 5, 1:08 am, Dennis Lee Bieber <wlfr...@ix.netcom.com> wrote:
> On Thu, 4 Sep 2008 18:09:49 -0700 (PDT), Mensanator <mensana...@aol.com>
> declaimed the following in comp.lang.python:

>
> > Too bad. I brought this up because I use Python a lot with
> > database work and rarely for proving theorems in ZFC.
>

> As a by-stander... let the DBMS do its work, don't try to make
> Python do what DBMS SQL does...

Sure, and in most cases I use Visual Basic for Applications
when I need functionality I can't get directly from SQL.

But anybody who's used VBA with Access must know what a PITA
it is. And even when you get it working, you sometimes wish you
hadn't. I have a Mann-Kendall trend analysis that must be done
quarterly on over 150 combinations of well:analyte. It takes
over 6 hours to process this (and I don't know how much is due to
VBA, Access, server, network, etc.). It's something I'd love to
try in Python (if I can find the time to translate it).

But I'm wary of things that Python might do (such as return 0
when summing an empty list) that SQL/VBA does not.

> --
> Wulfraed Dennis Lee Bieber KD6MOG
> wlfr...@ix.netcom.com wulfr...@bestiaria.com
> HTTP://wlfraed.home.netcom.com/
> (Bestiaria Support Staff: web-a...@bestiaria.com)
> HTTP://www.bestiaria.com/

Ken Starks

unread,

Sep 5, 2008, 1:54:52 PM9/5/08

to

David C. Ullrich wrote:
> In article <g9r7hq$ri6$1$8302...@news.demon.co.uk>,
> Ken Starks <str...@lampsacos.demon.co.uk> wrote:
>
>> David C. Ullrich wrote:
>>
>>> I don't see why you feel the two should act the same.
>>> At least in mathematics, the sum of the elements of
>>> the empty set _is_ 0, while the maximum element of the
>>> empty set is undefined.
>>>
>>> And both for good reason:
>>>
>>> (i) If A and B are disjoint sets we certainly want to
>>> have sum(A union B) = sum(A) + sum(B). This requires
>>> sum(empty set) = 0.
>>>
>>> (ii) If A is a subset of B then we should have
>>> max(A) <= max(B). This requires that max(empty set)
>>> be something that's smaller than everything else.
>>> So we give up on that.
>> Do we give up? Really ?
>
> Erm, thanks. I was aware of all that below. If we're
> being technical what's below is talking about the sup
> and inf, which are not the same as max and min. More
> relevant to the present context, I didn't mention what's
> below because it doesn't seem likely that saying max([])
> = -infinity and min([]) = +infinity is going to make the
> OP happy...
>

Of course you were aware, I have seen enough of your posts
to know that. And I agree that, whatever Wikipedia seems to
imply, max and supremum should be distinguished.

It was your prelude, "At least in mathematics ..." that
made me prick up my ears. So I couldn't resist responding,
without _any_ malice I assure you.

Cheers,
Ken.

Steven D'Aprano

unread,

Sep 5, 2008, 9:23:56 PM9/5/08

to

On Fri, 05 Sep 2008 10:22:22 -0500, David C. Ullrich wrote about why max
and min shouldn't accept a default argument:

> Think about all the previously elected female or black presidents of the
> US. Which one was the tallest?

I know the answer to that one:

All of them!

--
Steven

Manu Hack

unread,

Sep 5, 2008, 10:20:06 PM9/5/08

to pytho...@python.org

But then how can you conclude sum([]) = 0 from there? It's way far
from obvious.

Steven D'Aprano

unread,

Sep 5, 2008, 11:45:21 PM9/5/08

to

On Fri, 05 Sep 2008 22:20:06 -0400, Manu Hack wrote:

> On Fri, Sep 5, 2008 at 1:04 PM, castironpi <casti...@gmail.com> wrote:

...

>>> >The reason sum([]) is 0 is that sum( [ x ] ) - x = 0.
>>>
>>> It doesn't make sense to me. What do you set x to?
>>
>> For all x.
>
> But then how can you conclude sum([]) = 0 from there? It's way far from
> obvious.

I think Castironpi's reasoning is to imagine taking sum([x])-x for *any*
possible x (where subtraction and addition are defined). Naturally you
always get 0.

Now replace x by *nothing at all* and you get:

sum([]) "subtract nothing at all" = 0

I think that this is a reasonable way to *informally* think about the
question, but it's not mathematically sound, because if you replace x
with "nothing at all" you either get:

sum([]) - = 0

which is invalid (only one operand to the subtraction operator), or you
get:

sum([0]) - 0 = 0

which doesn't involve an empty list. What castironpi seems to be doing is
replacing "nothing at all" with, er, nothing at all in one place, and
zero in the other. And that's what makes it unsound and only suitable as
an informal argument.

[The rest of this is (mostly) aimed at Mensanator, so others can stop
reading if they like.]

Fundamentally, the abstract function "sum" and the concrete Python
implementation of sum() are both human constructs. It's not like there is
some pure Platonic[1] "Ideal Sum" floating in space that we can refer to.
Somewhere, sometime, some mathematician had to *define* sum(), and other
mathematicians had to agree to use the same definition.

They could have decided that sum must take at least two arguments,
because addition requires two arguments and it's meaningless to talk
about adding a single number without talking about adding it to something
else. But they didn't. Similarly, they might have decided that sum must
take at least one argument, and therefore prohibit sum([]), but they
didn't: it's more useful for sum of the empty list to give zero than it
is for it to be an error. As I mentioned earlier, mathematicians are
nothing if not pragmatists.

[1] Or was it Aristotle who believed in Ideal Forms? No, I'm sure it was
Plato.

--
Steven

Manu Hack

unread,

Sep 6, 2008, 12:33:25 AM9/6/08

to pytho...@python.org

Actually it's even more natural to state sum([x]) = x, and this way
you can never conclude that sum([]) = 0 from there.

castironpi

unread,

Sep 6, 2008, 12:57:38 AM9/6/08

to

On Sep 5, 9:20 pm, "Manu Hack" <manuh...@gmail.com> wrote:

You can define sum([a1,a2,...,aN]) recursively as
sum([a1,a2,...a(N-1)])+aN. Call the sum sum([a1,a2,...,aN]) "X", then
subtract aN.

sum([a1,a2,...a(N-1)])+aN=X
sum([a1,a2,...a(N-1)])+aN-aN=X-aN

For N=2, we have:

sum([a1,a2])=X
sum([a1,a2])-a2=X-a2
sum([a1,a2])-a2-a1=X-a2-a1

Since X= a1+ a2, replace X.

sum([a1,a2])-a2-a1=(a1+a2)-a2-a1

Or,

sum([a1,a2])-a2-a1=0

Apply the recursive definition:

sum([a1])+a2-a2-a1=0

And again:

sum([])+a1+a2-a2-a1=0

And we have:

sum([])=0.

Steven D'Aprano

unread,

Sep 6, 2008, 3:49:27 AM9/6/08

to

On Sat, 06 Sep 2008 00:33:25 -0400, Manu Hack wrote:

> Actually it's even more natural to state sum([x]) = x, and this way you
> can never conclude that sum([]) = 0 from there.

But what you can say is that for any list L, sum(L) = sum(L + [0]).

Therefore sum([]) = sum([] +[0]) = 0

--
Steven

Manu Hack

unread,

Sep 6, 2008, 4:08:02 AM9/6/08

to castironpi, pytho...@python.org

It makes more sense now, I just wanted to point out that only with
sum([x]) = x, you can't get sum([]) = 0.

Ken Starks

unread,

Sep 6, 2008, 8:37:09 AM9/6/08

to

This is not necessarily so.

The flaw is that you provide a recursive definition with no start value,
which is to say it is not a recursive definition at all.

A recursive definition should be (for lists where elements
can be added, and ignoring pythonic negative indexing):

Define 'sum(L)' by
a. sum(L[0]) = L[0]
b. sum(L[0:i]) = sum(L[0:i-1]) + L[i] ... if i > 0

From this you can prove the reverse recursion
sum(L[0:k]) = sum(L[0:k+1]) - L[k+1]
__only__ if k >= 0

It says nothing about the empty list.

You could add, as part of the definition, that sum([]) = 0, or any other
value.
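To make the role of the start value concrete, here is a small sketch of a recursive sum whose value for the empty list is an explicit parameter. The name rsum and the parameter empty_value are invented for the example. Any start value gives a well-defined function, but only 0 keeps sum([x]) equal to x.

```python
def rsum(items, empty_value=0):
    """Recursive sum with an explicit, caller-chosen value for the empty list."""
    if not items:
        return empty_value          # the start value of the recursion
    return rsum(items[:-1], empty_value) + items[-1]


print(rsum([1, 2, 3]))                  # start value 0
print(rsum([1, 2, 3], empty_value=10))  # a different, still well-defined, function
print(rsum([5], empty_value=10) == 5)   # the singleton property fails unless the start is 0
```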

A rather different approach, not quite simple recursion, would be to
start with

A. a slicing axiom, something like:

for all non-negative integers, a,b,c with a <=b <= c:

sum(L[a:c]) = sum(L[a:b]) + sum(L[b:c])

B. a singleton axiom:

for all integers a where L[a] exists:
sum(L[a:a+1]) = L[a]

2a. sum{

Ken Starks

unread,

Sep 6, 2008, 8:42:29 AM9/6/08

to

This is not necessarily so.

The flaw is that you provide a recursive definition with no start value,
which is to say it is not a recursive definition at all.

A recursive definition should be (for lists where elements
can be added, and ignoring pythonic negative indexing):

Define 'sum(L)' by
a. sum(L[0:1]) = L[0]
b. sum(L[0:i]) = sum(L[0:i-1]) + L[i-1] ... if i > 1

Mel

unread,

Sep 6, 2008, 11:03:10 AM9/6/08

to

Steven D'Aprano wrote:

Yep. The way it is preserves the distributive property

sum(a+b) = sum(a) + sum(b)

This would matter in cases like (untested code..)

suvsales = sum(sum(s.price for s in d.sales if s.class_ == 'suv')
               for d in districts)
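A runnable version of that idea, as a sketch only. The Sale and District records and the sample figures are invented, and the attribute is spelled class_ because class is a reserved word in Python. The point is that a district with no sales, or with no SUV sales, contributes 0 instead of raising.

```python
from collections import namedtuple

# Hypothetical record types for the example
Sale = namedtuple("Sale", "price class_")
District = namedtuple("District", "name sales")

districts = [
    District("north", [Sale(30000, "suv"), Sale(18000, "sedan")]),
    District("south", []),                      # no sales at all
    District("east", [Sale(22000, "sedan")]),   # sales, but no SUVs
]

# Sum per district, then sum the per-district totals.
suv_sales = sum(
    sum(s.price for s in d.sales if s.class_ == "suv")
    for d in districts
)

# The same figure computed in one flat pass over every sale.
flat_total = sum(
    s.price for d in districts for s in d.sales if s.class_ == "suv"
)

print(suv_sales, flat_total)
```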

Mel.

Mensanator

unread,

Sep 6, 2008, 2:22:07 PM9/6/08

to

On Sep 5, 10:45 pm, Steven D'Aprano <st...@REMOVE-THIS-cybersource.com.au> wrote:
> On Fri, 05 Sep 2008 22:20:06 -0400, Manu Hack wrote:

> > On Fri, Sep 5, 2008 at 1:04 PM, castironpi <castiro...@gmail.com> wrote:

<snip>

> [The rest of this is (mostly) aimed at Mensanator,

Ok, I see where you're coming from.

> Fundamentally, the abstract function "sum" and the concrete Python
> implementation of sum() are both human constructs. It's not like there is
> some pure Platonic[1] "Ideal Sum" floating in space that we can refer to.
> Somewhere, sometime, some mathematician had to *define* sum(), and other
> mathematicians had to agree to use the same definition.
>
> They could have decided that sum must take at least two arguments,
> because addition requires two arguments and it's meaningless to talk
> about adding a single number without talking about adding it to something
> else. But they didn't.

Ok. But the problem is they DID in SQL: x + Null = Null.

Earlier, you said that an empty box contains 0 widgets.
Fine, empty means 0. But Null doesn't mean empty. Say
your widget supplier just delivers a box and you haven't
opened it yet. Is the box likely to be empty? Probably
not, or they wouldn't have shipped it. In this case,
Null means "unknown", not 0. The number of widgets you
have on hand is Null (unknown) because inventory + Null = Null.

SQL will correctly tell you that the amount on hand is unknown,
whereas Python will tell you the amount on hand is inventory,
which is incorrect.

> Similarly, they might have decided that sum must
> take at least one argument, and therefore prohibit sum([]), but they
> didn't: it's more useful for sum of the empty list to give zero than it
> is for it to be an error. As I mentioned earlier, mathematicians are
> nothing if not pragmatists.
>

Here's a real world example (no ivory tower stuff):

An oil refinery client has just excavated a big pile of
dirt to lay a new pipeline. Due to the volume of the
pipe, there's dirt left over. Ideally, the client
would like to use that dirt as landfill (free), but it
must be tested for HAPS (by summing the concentrations of
organic constituents) to see whether it is considered
hazardous waste, in which case it must be taken off site
and incinerated (costly).

In MOST cases, a HAPS sum of 0 would be illegal because
0's generally cannot be reported in analytical tests:
you can't report a result less than its legal reporting
limit. If ALL the constituents were undetected, the sum
should be that of the sum of the reporting limits, thus,
it cannot be 0.

Can't I just use a sum of 0 to tell me when data is missing?
No, because in some cases the reporting limit of undetected
compounds is set to 0.

In which case, a 0 HAPS score means we can confidently
recommend that the dirt is clean and can be freely reused.

But if the analysis information is missing (hasn't arrived
yet or still pending validation) we WANT the result to be
UNKNOWN so that we don't recommend to the client that he take
an illegal course of action.

In this case, SQL does the correct thing and Python would
return a false result.

> --
> Steven

Hendrik van Rooyen

unread,

Sep 6, 2008, 5:20:15 PM9/6/08

to pytho...@python.org

"David C. Ullrich" <dullr...rynet.com> wrote:

>Think about all the previously elected female or black
>presidents of the US. Which one was the tallest?

The current King of France?

- Hendrik

Steven D'Aprano

unread,

Sep 7, 2008, 12:05:40 AM9/7/08

to

On Sat, 06 Sep 2008 11:22:07 -0700, Mensanator wrote:

[...]

>> They could have decided that sum must take at least two arguments,
>> because addition requires two arguments and it's meaningless to talk
>> about adding a single number without talking about adding it to
>> something else. But they didn't.
>
> Ok. But the problem is they DID in SQL: x + Null = Null.

Sheesh. That's not a problem, because Python is not trying to be a
dialect of SQL.

If you want a NULL object, then there are recipes on the web that will
give you one. Then all you need to do is call sum(alist or [NULL]) and it
will give you the behaviour you want.
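
One possible shape for such a recipe is sketched below. The `Null` class and the `sum_or_null` helper are hypothetical names invented for this illustration, not an existing library API, and the `alist or [NULL]` idiom assumes `alist` is a real list rather than a generator.

```python
class Null(object):
    """Hypothetical SQL-like NULL: adding anything to it gives NULL back."""

    def __add__(self, other):
        return self

    # 0 + NULL falls back to NULL.__radd__(0), so NULL also wins on the right.
    __radd__ = __add__

    def __repr__(self):
        return "NULL"


NULL = Null()


def sum_or_null(alist):
    # An empty list is replaced by [NULL], so the result is NULL, not 0.
    return sum(alist or [NULL])
```

With this, `sum_or_null([])` is `NULL`, and a NULL anywhere in the data also propagates, mirroring x + Null = Null.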

[...]

> Here's a real world example (no ivory tower stuff):
>
> An oil refinery client has just excavated a big pile of dirt to lay a
> new pipeline.

[snip details]

> Can't I just use a sum of 0 to tell me when data is missing? No, because
> in some cases the reporting limit of undetected compounds is set to 0.

You can't use a sum of 0 to indicate when data is missing, full stop. The
data may require 15 tests when only 3 have actually been done:

sum([1.2e-7, 9.34e-6, 2.06e-8])

Missing data and a non-zero sum. How should sum() deal with that?

The answer is that sum() can't deal with that. You can't expect sum() to
read your mind, know that there should be 15 items instead of 3, and
raise an error. So why do you expect sum() to read your mind and
magically know that zero items is an error, especially when for many
applications it is NOT an error?

The behaviour you want for this specific application is unwanted,
unnecessary and even undesirable for many other applications. The
solution is for *you* to write application-specific code to do what your
application needs, instead of relying on a general purpose function
magically knowing what you want.
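
As a rough sketch of what such application-specific code might look like (the function name `haps_total` and the `expected_tests` parameter are assumptions made for this example, in Python 3):

```python
def haps_total(concentrations, expected_tests):
    """Sum the results only when every expected test has reported."""
    values = list(concentrations)
    if len(values) != expected_tests:
        # Covers both "nothing arrived" and "only 3 of 15 arrived".
        raise ValueError(
            "expected %d results, got %d" % (expected_tests, len(values))
        )
    return sum(values)
```

This only helps when the expected number of results is known in advance; when it is not, the caller needs some other application-level signal that the data is complete.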

--
Steven

Mensanator

unread,

Sep 7, 2008, 1:30:09 PM9/7/08

to

On Sep 6, 11:05 pm, Steven D'Aprano <st...@REMOVE-THIS-cybersource.com.au> wrote:
> On Sat, 06 Sep 2008 11:22:07 -0700, Mensanator wrote:
>
> [...]
>
> >> They could have decided that sum must take at least two arguments,
> >> because addition requires two arguments and it's meaningless to talk
> >> about adding a single number without talking about adding it to
> >> something else. But they didn't.
>
> > Ok. But the problem is they DID in SQL: x + Null = Null.
>
> Sheesh. That's not a problem, because Python is not trying to be a
> dialect of SQL.

And yet, they added a Sqlite3 module.

>
> If you want a NULL object, then there are recipes on the web that will
> give you one. Then all you need to do is call sum(alist or [NULL]) and it
> will give you the behaviour you want.

Actually, I already get the behaviour I want. sum([1,None])
throws an exception. I don't see why sum([]) doesn't throw
an exception also (I understand that behaviour is by design,
I'm merely pointing out that the design doesn't cover every
situation).

>
> [...]
>
> > Here's a real world example (no ivory tower stuff):
>
> > An oil refinery client has just excavated a big pile of dirt to lay a
> > new pipeline.
> [snip details]
> > Can't I just use a sum of 0 to tell me when data is missing? No, because
> > in some cases the reporting limit of undetected compounds is set to 0.
>
> You can't use a sum of 0 to indicate when data is missing, full stop.

Exactly. That's why I would prefer sum([]) to raise an
exception instead of giving a false positive.

> The
> data may require 15 tests when only 3 have actually been done:
>
> sum([1.2e-7, 9.34e-6, 2.06e-8])

Biggest problem here is that it is often unknown just
how many records you're supposed to get from the query,
so we can't tell that a count of 3 is supposed to be 15.

>
> Missing data and a non-zero sum. How should sum() deal with that?

That's a separate issue and I'm not saying it should as
long as the list contains actual numbers to sum.
sum([1.2e-7, 9.34e-6, 2.06e-8, None]) will raise an
exception, as it should. But what types are contained
in []?

>
> The answer is that sum() can't deal with that. You can't expect sum() to
> read your mind, know that there should be 15 items instead of 3, and
> raise an error. So why do you expect sum() to read your mind and
> magically know that zero items is an error, especially when for many
> applications it is NOT an error?

For the simple reason it doesn't have to read your mind,
a mechanism has already been built into the function: start
value. For those situations where an empty list is desired
to sum to 0, you could use sum(alist,0) and use sum(alist) for
those cases where summing an empty list is meaningless.
Shouldn't you have to explicitly tell sum() how to deal with
situations like empty lists rather than have it implicitly
assume a starting value of 0 when you didn't ask for it?
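
To make the proposal concrete, here is a minimal sketch of those semantics in Python 3. `proposed_sum` and the `_NO_START` sentinel are hypothetical names; this is not how the built-in sum() behaves.

```python
_NO_START = object()  # private sentinel meaning "no start value was given"


def proposed_sum(iterable, start=_NO_START):
    """Hypothetical sum(): empty input is an error unless start is explicit."""
    items = iter(iterable)
    if start is _NO_START:
        try:
            # With no start value, the first item seeds the total.
            start = next(items)
        except StopIteration:
            raise ValueError("proposed_sum() of empty sequence with no start")
    return sum(items, start)
```

Under this sketch `proposed_sum([], 0)` returns 0 because the caller asked for it, while `proposed_sum([])` raises ValueError.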

>
> The behaviour you want for this specific application is unwanted,
> unnecessary and even undesirable for many other applications. The
> solution is for *you* to write application-specific code to do what your
> application needs, instead of relying on a general purpose function
> magically knowing what you want.

Does division magically know what you want? No, it raises an
exception when you do something like divide by 0. Isn't it
Pythonic to not write a litany of tests to cover every
possible case, but instead use try:except?

But try:except only works if the errors are recognized.
And sum() says that summing an empty list is NEVER an error
under ANY circumstance. That may be true in MOST cases, but
it certainly isn't true in ALL cases.

>
> --
> Steven

Patrick Maupin

unread,

Sep 7, 2008, 2:17:04 PM9/7/08

to

On Sep 7, 12:30 pm, Mensanator <mensana...@aol.com> wrote:
> On Sep 6, 11:05 pm, Steven D'Aprano <st...@REMOVE-THIS-

> > Sheesh. That's not a problem, because Python is not trying to be a
> > dialect of SQL.
>
> And yet, they added a Sqlite3 module.

Does that mean that, because there is an 'os' module, Python is trying
to compete with Linux and Windows?

This is starting to feel like a troll, but JUST IN CASE you are really
serious about wanting to get work done with Python, rather than
complaining about how it is not perfect, I offer the following snippet
which will show you how you can test the results of a sum() to see if
there were any items in the list:

>>> class MyZero(int):
...     pass
...
>>> zero = MyZero()
>>> x=sum([], zero)
>>> isinstance(x,MyZero)
True
>>> x = sum([1,2,3], zero)
>>> isinstance(x,MyZero)
False
>>>

Gabriel Genellina

unread,

Sep 7, 2008, 4:13:11 PM9/7/08

to pytho...@python.org

On Sun, 07 Sep 2008 14:30:09 -0300, Mensanator <mensa...@aol.com> wrote:

> Actually, I already get the behaviour I want. sum([1,None])
> throws an exception. I don't see why sum([]) doesn't throw
> an exception also (I understand that behaviour is by design,
> I'm merely pointing out that the design doesn't cover every
> situation).
[...]

> Exactly. That's why I would prefer sum([]) to raise an
> exception instead of giving a false positive.

The built-in behavior can't be good for every usage. Nobody prevents you from defining your own function tailored to your own specs, like this:

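def strict_sum(items):
    # Like sum(), but an empty argument is an error instead of 0.
    items = iter(items)
    try:
        first = next(items)  # the first item becomes the start value
    except StopIteration:
        raise ValueError("strict_sum with empty argument")
    return sum(items, first)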

Tweak as needed. Based on other posts I believe your Python skills are enough to write it on your own, so I don't see why you're complaining so hard about the current behavior.

--
Gabriel Genellina

Mensanator

unread,

Sep 7, 2008, 8:22:21 PM9/7/08

to

On Sep 7, 3:13 pm, "Gabriel Genellina" <gagsl-...@yahoo.com.ar> wrote:
> On Sun, 07 Sep 2008 14:30:09 -0300, Mensanator <mensana...@aol.com> wrote:

I'm not complaining about the behaviour anymore, I just don't like
being told I'm wrong when I'm not.

But I think I've made my point, so there's no point in harping on
this anymore.

>
> --
> Gabriel Genellina

Mensanator

unread,

Sep 7, 2008, 8:36:58 PM9/7/08

to

On Sep 7, 1:17 pm, Patrick Maupin <pmau...@gmail.com> wrote:
> On Sep 7, 12:30 pm, Mensanator <mensana...@aol.com> wrote:
>
> > On Sep 6, 11:05 pm, Steven D'Aprano <st...@REMOVE-THIS-
> > > Sheesh. That's not a problem, because Python is not trying to be a
> > > dialect of SQL.
>
> > And yet, they added a Sqlite3 module.
>
> Does that mean that, because there is an 'os' module, Python is trying
> to compete with Linux and Windows?

I wasn't thinking "compete", rather "complement". Python obviously
wants to be a player in the SQL market, so you would think it
would be in Python's interest to know how SQL behaves, just as it's in
Python's interest for the os module to know how BOTH Linux and
Windows work.

>
> This is starting to feel like a troll,

It wasn't intended to be.

> but JUST IN CASE you are really
> serious about wanting to get work done with Python, rather than
> complaining about how it is not perfect,

Things never change if no one ever speaks up.

> I offer the following snippet
> which will show you how you can test the results of a sum() to see if
> there were any items in the list:

Thanks. I'll drop this from this point on.

>
> >>> class MyZero(int):
> ...     pass
> ...
> >>> zero = MyZero()
> >>> x=sum([], zero)
> >>> isinstance(x,MyZero)
> True
> >>> x = sum([1,2,3], zero)
> >>> isinstance(x,MyZero)
> False

Mensanator

unread,

Sep 7, 2008, 8:49:17 PM9/7/08

to

On Sep 7, 2:17 pm, Dennis Lee Bieber <wlfr...@ix.netcom.com> wrote:
> On Sun, 7 Sep 2008 10:30:09 -0700 (PDT), Mensanator <mensana...@aol.com>
> declaimed the following in comp.lang.python:
>
> > On Sep 6, 11:05 pm, Steven D'Aprano <st...@REMOVE-THIS-cybersource.com.au> wrote:
>
> > > Sheesh. That's not a problem, because Python is not trying to be a
> > > dialect of SQL.
>
> > And yet, they added a Sqlite3 module.
>
> Which is an interface TO an embedded/stand-alone SQL-based RDBM
> engine; it does not turn Python into a dialect of SQL -- Python does not
> process the SQL, it gets passed to the engine for SQL data processing.

But that's only half the story. The other half is data returned
as a result of SQL queries. And that's something Python DOES process.
And sometimes that processed data has to be inserted back into the
database. We certainly don't want Python to process the data in a way
that the database doesn't expect.

When I see a potential flaw (such as summing an empty list to 0),
should I just keep quiet about it, or let everyone know?

Well, now they know, so I'll shut up about this from now on, ok?

Boris Borcic

unread,

Sep 8, 2008, 9:54:19 AM9/8/08

to pytho...@python.org

David C. Ullrich wrote:
>
> (ii) If A is a subset of B then we should have
> max(A) <= max(B). This requires that max(empty set)
> be something that's smaller than everything else.
> So we give up on that.
>

Er, what about instances of variations/elaborations on

class Smaller(object) : __cmp__ = lambda *_ : -1

?
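
For readers on Python 3, where `__cmp__` is no longer consulted, a comparable sentinel can be sketched with the rich comparison methods. The names `Smallest`, `SMALLEST` and `max_or_smallest` are made up for this example.

```python
class Smallest(object):
    """Hypothetical sentinel that compares below every other object."""

    def __lt__(self, other):
        return other is not self

    def __le__(self, other):
        return True

    def __gt__(self, other):
        return False

    def __ge__(self, other):
        return other is self


SMALLEST = Smallest()


def max_or_smallest(iterable):
    # Seeding the candidates with SMALLEST makes the empty case well defined.
    return max([SMALLEST] + list(iterable))
```

Since Python 3.4 the built-in also accepts a keyword-only fallback, as in `max([], default=SMALLEST)`, which avoids building the extra list.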

Cheers, BB

castironpi

unread,

Sep 8, 2008, 3:08:08 PM9/8/08

to

You still don't have the property max(X) is in X.

And it's the equivalent of a special builtin constant for max on the
empty set.

Boris Borcic

unread,

Sep 9, 2008, 9:47:20 AM9/9/08

to pytho...@python.org

castironpi wrote:
> On Sep 8, 8:54 am, Boris Borcic <bbor...@gmail.com> wrote:
>> David C. Ullrich wrote:
>>
>>> (ii) If A is a subset of B then we should have
>>> max(A) <= max(B). This requires that max(empty set)
>>> be something that's smaller than everything else.
>>> So we give up on that.
>> Er, what about instances of variations/elaborations on
>>
>> class Smaller(object) : __cmp__ = lambda *_ : -1
>>
>> ?
>>
>> Cheers, BB
>
> You still don't have the property max(X) is in X.

Frankly, I would favor order-independence over that property.

compare max(X) for

1) X = [set([1]),set([2])]

and

2) X = [set([2]),set([1])]

Shouldn't then max and min in fact return lub and glb, despite their names ? In
the case X is a non-empty finite set/list of totally ordered values,
max(X)==lub(X) and min(X)==glb(X) in any case.
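
A small sketch of the comparison, assuming Python 3 set literals and taking "lub" and "glb" under the subset order to mean union and intersection; the helper names are invented for this example.

```python
def lub(sets):
    # Least upper bound under the subset order: the union of all the sets.
    return set().union(*sets)


def glb(sets):
    # Greatest lower bound under the subset order: the intersection.
    # Assumes at least one set is given.
    return set.intersection(*(set(s) for s in sets))


X1 = [{1}, {2}]
X2 = [{2}, {1}]

# Neither {1} > {2} nor {2} > {1} holds, so max() keeps the first element.
max_first_order = max(X1)
max_second_order = max(X2)

# The bounds do not depend on the order of the list.
upper = lub(X1)
lower = glb(X1)
```

For a chain such as `[{1}, {1, 2}]` the two notions agree: both `max` and `lub` give `{1, 2}`.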

>
> And it's the equivalent of a special builtin constant for max on the
> empty set.

Of course (except the object might have other uses, who knows). So what ?

Cheers, BB

David C. Ullrich

unread,

Sep 9, 2008, 11:58:06 AM9/9/08

to

In article
<b4f287a7-8e1f-4057...@m73g2000hsh.googlegroups.com>,
bearoph...@lycos.com wrote:

> David C. Ullrich:

> > I didn't mention what's below because it doesn't seem
> > likely that saying max([]) = -infinity and
> > min([]) = +infinity is going to make the OP happy...
>

> Well, it sounds cute having Neginfinite and Infinite as built-in
> objects that can be compared to any other type and are < of or > of
> everything else but themselves.

Like I said, I'm not going to say anything about how Python
should be. If I were going to comment on that I'd say it would
be cute but possibly silly to actually add to the core.

But in the math library I made some time ago there was an
AbsoluteZero with the property that when you added it to
x you got x for any x whatever (got used as the default
additive identity for classes that didn't have an
add_id defined...)
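
Such an object might look roughly like the following. This is a guess at the idea, not the code from that library; the class body and the name `ZERO` are assumptions.

```python
class AbsoluteZero(object):
    """Hypothetical universal additive identity: ZERO + x == x + ZERO == x."""

    def __add__(self, other):
        return other

    # x + ZERO falls back to ZERO.__radd__(x) when x cannot handle ZERO itself.
    __radd__ = __add__

    def __repr__(self):
        return "AbsoluteZero"


ZERO = AbsoluteZero()

numbers_total = sum([1.5, 2], ZERO)     # behaves like a numeric zero
lists_total = sum([[1], [2, 3]], ZERO)  # behaves like an empty list
empty_total = sum([], ZERO)             # nothing to add: ZERO itself
```

Passing it as the start value lets one call to sum() work for any type that supports +, at the price of getting the placeholder back for empty input.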

> Probably they can be useful as
> sentinels, but in Python I nearly never use sentinels anymore, and
> they can probably give some other problems...
>
> Bye,
> bearophile

--
David C. Ullrich

David C. Ullrich

unread,

Sep 9, 2008, 11:58:49 AM9/9/08

to

In article <00d1d60c$0$20302$c3e...@news.astraweb.com>,

Heh. Mysteries of the empty set.

--
David C. Ullrich

Luis Zarrabeitia

unread,

Sep 7, 2008, 4:38:06 PM9/7/08

to pytho...@python.org

Quoting Mensanator <mensa...@aol.com>:

> Actually, I already get the behaviour I want. sum([1,None])
> throws an exception. I don't see why sum([]) doesn't throw
> an exception also

If you take a "start value" and add to it every element of a list, should the
process fail if the list is empty? If you don't add anything to the start value,
you should get back the start value.

Python's sum is defined as sum(sequence, start=0). If sum were to throw an
exception with sum([]), it should also throw it with sum([], start=0), which
makes no sense.
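
Written out as a plain loop, the definition reads roughly as follows. `plain_sum` is a hypothetical stand-in used only to show the logic; the real built-in is implemented in C and also rejects string start values.

```python
def plain_sum(sequence, start=0):
    # Begin with the start value and add each item to it in turn.
    total = start
    for item in sequence:
        total = total + item
    # With no items, the loop body never runs and start comes back unchanged.
    return total


marker = object()
```

So `plain_sum([])` is 0 only because the default start is 0, and `plain_sum([], marker)` hands back `marker` itself, just as `sum([], marker)` does.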

--
Luis Zarrabeitia
Facultad de Matemática y Computación, UH
http://profesores.matcom.uh.cu/~kyrie

Mensanator

unread,

Sep 10, 2008, 1:48:17 PM9/10/08

to

On Sep 7, 3:38 pm, Luis Zarrabeitia <ky...@uh.cu> wrote:

> Quoting Mensanator <mensana...@aol.com>:
>
> > Actually, I already get the behaviour I want. sum([1,None])
> > throws an exception. I don't see why sum([]) doesn't throw
> > an exception also
>
> If you take a "start value" and add to it every element of a list, should the
> process fail if the list is empty?

No.

> If you don't add anything to the start value,
> you should get back the start value.

Agree.

>
> Python's sum is defined as sum(sequence, start=0).

That's the issue.

> If sum were to throw an
> exception with sum([]), it should also throw it with sum([], start=0), which
> makes no sense.

Given that definition, yes. But is the definition correct
in ALL cases? Are there situations where the sum of an empty
list should NOT be 0? Of course there are.

Can sum() handle those cases? No, it can't, I have to write
my own definition if I want that behaviour. There's no reason
why sum([]) and sum([],0) have to mean the same thing at the
exclusion of a perfectly valid alternative definition.

But that's the way it is, so I have to live with it.

But that's not conceding that I'm wrong.

Terry Reedy

unread,

Sep 10, 2008, 6:36:46 PM9/10/08

to pytho...@python.org

Mensanator wrote:
> Are there situations where the sum of an empty
> list should NOT be 0? Of course there are.

Python Philosophy (my version, for this discussion):
Make normal things easy; make unusual or difficult things possible.

Application:
Sum([]) == 0 is normal (90+% of cases). Make that easy (as it is).
For anything else:
    if seq: s = sum(seq, base)
    else: <whatever, such as raising your desired exception>
which is certainly pretty easy.

> Can sum() handle those cases?

The developers choose what they thought would be most useful across the
spectrum of programmers and programs after some non-zero amount of
debate and discussion.

> No, it can't, I have to write
> my own definition if I want that behaviour.

Or wrap your calls. In any case, before sum was added as a convenience
for summing numbers, *everyone* had to write their own or use reduce.

Sum(s) replaces reduce(lambda x,y: x+y, s, 0), which was thought to be
the most common use of reduce. Sum(s,start) replaces the much less
common reduce(lambda x,y: x+y, s, start).

Reduce(S, s), where S = sum function, raises an exception on empty s.
So use that and you are no worse off than before.

However, a problem with reduce(S,s) is that it is *almost* the same as
reduce(S,s,0). So people are sometimes tempted to omit 0, especially if
they are not sure if the call might be reduce(S,0,s) (as one argument
says it should be -- but that is another post). But if they do, the
program fails, even if it should not, if and when s happens to be empty.
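
The two spellings can be compared directly. This sketch assumes Python 3, where reduce lives in functools rather than being a builtin, and uses operator.add in place of the lambda; the wrapper names are invented for the example.

```python
from functools import reduce  # reduce was a builtin in Python 2
import operator


def reduce_sum(seq):
    # No initial value: an empty seq raises TypeError.
    return reduce(operator.add, seq)


def reduce_sum_from_zero(seq):
    # Explicit initial value: an empty seq returns 0, like sum(seq).
    return reduce(operator.add, seq, 0)
```

Both give 6 for `[1, 2, 3]`; they differ only on empty input, which is exactly the case that tends to go untested.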

> There's no reason
> why sum([]) and sum([],0) have to mean the same thing at the
> exclusion of a perfectly valid alternative definition.

'Have to', no reason. 'Should', yes there are at least three reasons.
1. Python functions generally return an answer rather than raise an
exception where there is a perfectly valid answer to return.
2. As a general principle, something that is almost always true should
not need to be stated over and over again. This is why, for instance,
we have default args.
3. As I remember, part of the reason for adding sum was to eliminate the
need (with reduce) to explicitly say 'start my sum at 0' in order to
avoid buggy code. In other words, I believe part of the reason for
sum's existence is to avoid the very bug-inviting behavior you want.

Terry Jan Reedy

Mensanator

unread,

Sep 10, 2008, 8:12:07 PM9/10/08

to

What am I doing wrong?

>>> S = sum

>>> S
<built-in function sum>

>>> s = [1,2,3]
>>> type(s)
<type 'list'>

>>> reduce(S,s)
Traceback (most recent call last):
File "<pyshell#13>", line 1, in <module>
reduce(S,s)
TypeError: 'int' object is not iterable

>>> reduce(S,s,0)
Traceback (most recent call last):
File "<pyshell#14>", line 1, in <module>
reduce(S,s,0)
TypeError: 'int' object is not iterable

>>> reduce(lambda x,y:x+y,s)
6

>>> s=[]
>>> reduce(lambda x,y:x+y,s)
Traceback (most recent call last):
File "<pyshell#17>", line 1, in <module>
reduce(lambda x,y:x+y,s)
TypeError: reduce() of empty sequence with no initial value

This is supposed to happen. But doesn't reduce(S,s) work
when s isn't empty?

Terry Reedy

unread,

Sep 11, 2008, 12:44:07 AM9/11/08

to pytho...@python.org

Mensanator wrote:
> On Sep 10, 5:36 pm, Terry Reedy <tjre...@udel.edu> wrote:

>> Sum(s) replaces reduce(lambda x,y: x+y, s, 0), which was thought to be
>> the most common use of reduce. Sum(s,start) replaces the much less
>> common reduce(lambda x,y: x+y, s, start).
>>
>> Reduce(S, s), where S = sum function, raises an exception on empty s.
>> So use that and you are no worse off than before.

> What am I doing wrong?
>>>> S = sum

[snip]

Taking me too literally out of context. I meant the sum_of_2 function
already given in the example above, as you eventually tried.

def S(x,y): return x+y

Sorry for the confusion.

...

>>>> reduce(lambda x,y:x+y,s)
> 6
>
>>>> s=[]
>>>> reduce(lambda x,y:x+y,s)
> Traceback (most recent call last):
> File "<pyshell#17>", line 1, in <module>
> reduce(lambda x,y:x+y,s)
> TypeError: reduce() of empty sequence with no initial value

These two are exactly what I meant.

> This is supposed to happen. But doesn't reduce(S,s) work
> when s isn't empty?

It did. You got 6 above. The built-in 'sum' takes an iterable, not a
pair of numbers.

tjr

Boris Borcic

unread,

Sep 13, 2008, 6:00:42 AM9/13/08

to pytho...@python.org

Tino Wildenhain wrote:
> Hi,
>
> Luis Zarrabeitia wrote:
>> Quoting Laszlo Nagy <gan...@shopzeus.com>:
>>
> ...
>> Even better:
>>
>> help(sum) shows
>>
>> ===
>> sum(...)
>> sum(sequence, start=0) -> value
>> Returns the sum of a sequence of numbers (NOT strings) plus
>> the value
>> of parameter 'start'. When the sequence is empty, returns start.
>> ===
>>
>> so the fact that sum([]) returns zero is just because the start value
>> is zero...
>> sum([],object()) would return an object().
>>
>> BTW, the original code:
>>
>>>>> sum(s for s in ["a", "b"] if len(s) > 2)
>>
>> wouldn't work anyway... it seems that sum doesn't like to sum strings:
>>
>>>>> sum(['a','b'],'')
>>
>> <type 'exceptions.TypeError'>: sum() can't sum strings [use
>> ''.join(seq) instead]
>
> Yes which is a bit bad anyway. I don't think hard wiring it is such a
> nice idea. You know, walks like a duck, smells like a duck...
> If it makes sense to handle things differently for performance, then
> please have it doing it silently, e.g. when it detects strings just
> use join() internally.
>
> Cheers
> Tino

+1

''.join is horrible. And it adds insult to injury that S.join(S.split(T)) != T
as a rule. The interpreter has no business to patronize us into this shamefully
contorted neighborhood while it understands what we want.
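
For reference, a short sketch of which round trip does and does not hold, assuming Python 3 and an explicit, non-empty separator; the sample strings and helper names are arbitrary.

```python
def roundtrip(text, sep):
    # Split the text on sep, then join the pieces with the same sep.
    return sep.join(text.split(sep))


def swapped_roundtrip(text, sep):
    # The order complained about: S.join(S.split(T)) splits the separator.
    return sep.join(sep.split(text))


text = "a,b,,c"
good = roundtrip(text, ",")
bad = swapped_roundtrip(text, ",")
```

Here `good` equals the original text, while `bad` is just `","` because the separator string does not contain the text to split on.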

Cheers, BB

Boris Borcic

unread,

Sep 13, 2008, 7:06:43 AM9/13/08

to pytho...@python.org

I wrote:
> Tino Wildenhain wrote:
[...]

>>>>>> sum(['a','b'],'')
>>>
>>> <type 'exceptions.TypeError'>: sum() can't sum strings [use
>>> ''.join(seq) instead]
>>
>> Yes which is a bit bad anyway. I don't think hard wiring it is such a
>> nice idea. You know, walks like a duck, smells like a duck...
>> If it makes sense to handle things differently for performance, then
>> please have it doing it silently, e.g. when it detects strings just
>> use join() internally.
>>
>> Cheers
>> Tino
>
> +1
>
> ''.join is horrible. And it adds insult to injury that
> S.join(S.split(T)) != T as a rule. The interpreter has no business to
> patronize us into this shamefully contorted neighborhood while it
> understands what we want.

What makes ''.join particularly horrible is that we find ourselves forced to use
it not only for concatenating arbitrary-length strings in a list, but also to
convert to a str what's already a sequence of single characters. IOW string
types fail to satisfy a natural expectation for any S of sequence type :

S == type(S)(item for item in S) == type(S)(list(S))

And this, even though strings are sequence types deep-down-ly enough that they
achieve to act as such in far-fetched corner cases like

(lambda *x : x)(*'abc')==('a','b','c')

...and even though strings offer not one but two distinct constructors that play
nicely in back-and-forth conversions with types to which they are much less
closely related, ie.

'1j' == repr(complex('1j')) == str(complex('1j'))
1j == complex(repr(1j)) == complex(str(1j))
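
The asymmetry can be checked with a few lines of Python 3. The helper name `rebuild` is made up for this sketch.

```python
def rebuild(seq):
    # Reconstruct a sequence from a list of its own items.
    return type(seq)(list(seq))


tuple_ok = rebuild((1, 2, 3)) == (1, 2, 3)
list_ok = rebuild([1, 2, 3]) == [1, 2, 3]

# str() of a list gives the list's printable form, not the concatenation.
str_result = rebuild("abc")

# The idiom that does round-trip for strings.
joined = "".join(list("abc"))
```

`tuple_ok` and `list_ok` are True, but `str_result` is the text `['a', 'b', 'c']`, so only the join spelling gets back to `abc`.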

Not-so-cheerfully-yours, BB

Scott David Daniels

unread,

Nov 24, 2008, 4:20:59 PM11/24/08

to

_and_, as it turns out, sets of cardinality 1.

--Scott David Daniels (pleased about the change in cardinality)
Scott....@Acm.Org