
default behavior


wheres pythonmonks

Jul 29, 2010, 2:12:09 PM
to Python List
Why is the default value of an int zero?

>>> x = int
>>> print x
<type 'int'>
>>> x()
0
>>>

How do I build an "int1" type that has a default value of 1?
[Hopefully no speed penalty.]
I am thinking about applications with collections.defaultdict.
What if I want to make a defaultdict of defaultdicts of lists? [I
guess my Perl background is showing -- I miss auto-vivification.]

W

Paul Rubin

Jul 29, 2010, 2:18:39 PM
wheres pythonmonks <wherespy...@gmail.com> writes:
> How do I build an "int1" type that has a default value of 1?
> [Hopefully no speed penalty.]
> I am thinking about applications with collections.defaultdict.

You can supply an arbitrary function to collections.defaultdict.
It doesn't have to be a class. E.g.

d = collections.defaultdict(lambda: 1)

will do what you are asking.
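
The nested case from the original question (a defaultdict of defaultdicts of lists) can be handled the same way, because the factory may itself build a defaultdict. A minimal sketch, assuming Python 3; the names `counts` and `tree` are illustrative only and do not come from the thread:

```python
from collections import defaultdict

# Every missing key starts at 1 instead of int()'s 0.
counts = defaultdict(lambda: 1)
counts["x"] += 2          # missing "x" starts at 1, then 2 is added

# A defaultdict of defaultdicts of lists: the outer factory builds
# an inner defaultdict(list) each time a new outer key is touched.
tree = defaultdict(lambda: defaultdict(list))
tree["fruit"]["red"].append("apple")
tree["fruit"]["red"].append("cherry")
```

Intermediate levels are created on first access, which is roughly what Perl's auto-vivification provides.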

wheres pythonmonks

Jul 29, 2010, 2:35:06 PM
to pytho...@python.org
Thanks. I presume this will work for my nested example as well. Thanks again.

> --
> http://mail.python.org/mailman/listinfo/python-list
>

Nick Raptis

Jul 29, 2010, 2:43:05 PM
to pytho...@python.org
On 07/29/2010 09:12 PM, wheres pythonmonks wrote:
> How do I build an "int1" type that has a default value of 1?
You mean something like:
>>> x = int()
>>> x
0
>>> def myint(value=1):
...     return int(value)
...
>>> myint()
1
>>>

That's ugly on so many levels..

Anyway, basic types (and almost everything else) in Python are classes.
You can always subclass them to do whatever you like

> [Hopefully no speed penalty.]
> I am thinking about applications with collections.defaultdict.

> What if I want to make a defaultdict of defaultdicts of lists? [I
> guess my Perl background is showing -- I miss auto-vivification.]
>
>
>

Ah, python is no perl. Then again, perl is no python either.
----- Random pseudo-Confucius quote

Have fun,
Nick

John Nagle

Jul 29, 2010, 3:28:18 PM
On 7/29/2010 11:12 AM, wheres pythonmonks wrote:
> Why is the default value of an int zero?
>
>>>> x = int
>>>> print x
> <type 'int'>
>>>> x()
> 0
>>>>
>
> How do I build an "int1" type that has a default value of 1?


>>> class int1(object):
...     def __init__(self):
...         self.val = 1
...     def __call__(self):
...         return self.val
...
>>> x = int1()
>>> x()
1

This isn't useful; you'd also have to define all the numeric operators
for this type. And then there are mixed-type conversion issues.

Inheriting from "int" is not too helpful, because you can't assign
to the value of the base class. "self=1" won't do what you want.

> [Hopefully no speed penalty.]
In your dreams. Although all numbers in CPython are "boxed",
so there's more of a speed penalty with "int" itself than you
might expect. There are some C libraries for handling large
arrays if you really need to crunch numbers.

John Nagle

Christian Heimes

Jul 29, 2010, 3:40:16 PM
to pytho...@python.org
> Inheriting from "int" is not too helpful, because you can't assign
> to the value of the base class. "self=1" won't do what you want.

It's useful if you remember that you can set the default value by
overwriting __new__.

>>> class int1(int):
...     def __new__(cls, value=1):
...         return super(int1, cls).__new__(cls, value)
...
>>> int1()
1
>>> int1(1)
1
>>> int1(2)
2
>>> int1(0)
0
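
Since the original question was about collections.defaultdict, here is a hedged sketch of the same subclass used as a factory. It assumes Python 3 (hence the argument-free `super()`), and the key name "hits" is only an example:

```python
from collections import defaultdict

class int1(int):
    """An int whose no-argument constructor returns 1 instead of 0."""
    def __new__(cls, value=1):
        return super().__new__(cls, value)

d = defaultdict(int1)
d["hits"] += 2   # int1() supplies 1; adding 2 gives a plain int 3
```

Arithmetic on an int subclass returns a plain int, so only the freshly created default is an int1 instance.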

Steven D'Aprano

Jul 29, 2010, 6:43:43 PM
On Thu, 29 Jul 2010 21:43:05 +0300, Nick Raptis wrote:

> On 07/29/2010 09:12 PM, wheres pythonmonks wrote:
>> How do I build an "int1" type that has a default value of 1?
> You mean something like:
> >>> x = int()
> >>> x
> 0
> >>> def myint(value=1):
> ...     return int(value)
> ...
> >>> myint()
> 1
> >>>
> >>>
> That's ugly on so many levels..

Why do you think it's ugly? It's a function that returns an int, and it
provides a default value which is different from the default value of the
int constructor. It's a really simple function, and it has an equally
simple implementation, and it's an obvious way to do it. Not the *one*
obvious way, because subclassing int is equally obvious, but still
obvious.

--
Steven

Peter Otten

Jul 30, 2010, 4:40:51 AM
wheres pythonmonks wrote:

> How do I build an "int1" type that has a default value of 1?
> [Hopefully no speed penalty.]
> I am thinking about applications with collections.defaultdict.

>>> from collections import defaultdict
>>> d = defaultdict(1 .conjugate)
>>> d["x"] += 2
>>> d["x"]
3

Isn't that beautiful? Almost like home;)

It is also fast:

$ python -m timeit -s"one = lambda: 1" "one()"
1000000 loops, best of 3: 0.213 usec per loop
$ python -m timeit -s"one = 1 .conjugate" "one()"
10000000 loops, best of 3: 0.0972 usec per loop

Micro-optimisation, the best excuse for ugly code...

Peter

Duncan Booth

Jul 30, 2010, 5:57:46 AM
Peter Otten <__pet...@web.de> wrote:

>>>> from collections import defaultdict
>>>> d = defaultdict(1 .conjugate)
>>>> d["x"] += 2
>>>> d["x"]
> 3
>
> Isn't that beautiful? Almost like home;)
>
> It is also fast:
>
> $ python -m timeit -s"one = lambda: 1" "one()"
> 1000000 loops, best of 3: 0.213 usec per loop
> $ python -m timeit -s"one = 1 .conjugate" "one()"
> 10000000 loops, best of 3: 0.0972 usec per loop
>
> Micro-optimisation, the best excuse for ugly code...
>

Nice one, but if you are going to micro-optimise why not save a few
keystrokes while you're at it and use '1 .real' instead?

--
Duncan Booth http://kupuguy.blogspot.com

Peter Otten

Jul 30, 2010, 6:21:39 AM
Duncan Booth wrote:

>>> 1 .real
1
>>> 1 .conjugate
<built-in method conjugate of int object at 0x1734298>
>>> 1 .conjugate()
1

real is a property, not a method. conjugate() was the first one that worked
that was not __special__. I think it has the added benefit that it's likely
to confuse the reader...
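
The property-versus-method distinction can be checked directly. A small sketch, assuming Python 3; the variable names are made up for illustration:

```python
from collections import defaultdict

one_real = (1).real        # a property: already evaluated to the int 1
one_conj = (1).conjugate   # a bound method: calling it returns 1

# defaultdict needs a callable (or None) as its factory, so the
# property value is rejected while the bound method is accepted.
try:
    defaultdict(one_real)
    rejected = False
except TypeError:
    rejected = True

d = defaultdict(one_conj)
d["x"] += 2
```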

Peter

Duncan Booth

Jul 30, 2010, 6:56:23 AM
Peter Otten <__pet...@web.de> wrote:
> real is a property, not a method. conjugate() was the first one that
> worked that was not __special__. I think it has the added benefit that
> it's likely to confuse the reader...
>
Ah, silly me, I should have realised that.

Yes, micro-optimisations that are also micro-obfuscations are always the
best. :^)

wheres pythonmonks

Jul 30, 2010, 7:59:52 AM
to Python List
Instead of defaultdict for hash of lists, I have seen something like:


m={}; m.setdefault('key', []).append(1)

Would this be preferred in some circumstances?
Also, is there a way to upcast a defaultdict into a dict? I have also
heard some people use exceptions on dictionaries to catch key
existence, so passing in a defaultdict (I guess) could be hazardous to
health. Is this true?

W


Steven D'Aprano

Jul 30, 2010, 8:19:52 AM
On Fri, 30 Jul 2010 07:59:52 -0400, wheres pythonmonks wrote:

> Instead of defaultdict for hash of lists, I have seen something like:
>
>
> m={}; m.setdefault('key', []).append(1)
>
> Would this be preferred in some circumstances?

Sure, why not? Whichever you prefer.

setdefault() is a venerable old technique, dating back to Python 2.0, and
not a newcomer like defaultdict.


> Also, is there a way to upcast a defaultdict into a dict?

"Upcast"? Surely it is downcasting. Or side-casting. Or type-casting.
Whatever. *wink*

Whatever it is, the answer is Yes:

>>> from collections import defaultdict as dd
>>> x = dd(int)
>>> x[1] = 'a'
>>> x
defaultdict(<type 'int'>, {1: 'a'})
>>> dict(x)
{1: 'a'}

> I have also heard some people use
> exceptions on dictionaries to catch key existence, so passing in a
> defaultdict (I guess) could be hazardous to health. Is this true?

Yes, it is true that some people use exceptions on dicts to catch key
existence. The most common reason to do so is to catch the non-existence
of a key so you can add it:

try:
    mydict[x] = mydict[x] + 1
except KeyError:
    mydict[x] = 1


If mydict is a defaultdict with the appropriate factory, then the change
is perfectly safe because mydict[x] will not raise an exception when x is
missing, but merely return 0, so it will continue to work as expected and
all is good.

Of course, if you pass it a defaultdict with an *inappropriate* factory,
you'll get an error. So don't do that :) Seriously, you can't expect to
just randomly replace a variable with some arbitrarily different variable
and expect it to work. You need to know what the code is expecting, and
not break those expectations too badly.

And now you have at least three ways of setting missing values in a dict.
And those wacky Perl people say that Python's motto is "only one way to
do it" :)
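
For comparison, the three approaches mentioned in this thread can be written side by side. This is only a sketch, assuming Python 3 and a made-up `words` list:

```python
from collections import defaultdict

words = ["a", "b", "a", "c", "a"]

# 1. try/except KeyError on a plain dict
by_except = {}
for w in words:
    try:
        by_except[w] = by_except[w] + 1
    except KeyError:
        by_except[w] = 1

# 2. setdefault on a plain dict
by_setdefault = {}
for w in words:
    by_setdefault[w] = by_setdefault.setdefault(w, 0) + 1

# 3. defaultdict with int as the factory
by_default = defaultdict(int)
for w in words:
    by_default[w] += 1
```

All three end up with the same counts; they differ only in where the "missing key" decision lives.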

--
Steven

Peter Otten

Jul 30, 2010, 8:33:49 AM
to pytho...@python.org
wheres pythonmonks wrote:

> Instead of defaultdict for hash of lists, I have seen something like:
>
>
> m={}; m.setdefault('key', []).append(1)
>
> Would this be preferred in some circumstances?

In some circumstances, sure. I just can't think of them at the moment.
Maybe if your code has to work in Python 2.4.

> Also, is there a way to upcast a defaultdict into a dict?

dict(some_defaultdict)

> I have also
> heard some people use exceptions on dictionaries to catch key
> existence, so passing in a defaultdict (I guess) could be hazardous to
> health. Is this true?

A problem could arise when you swap a "key in dict" test with a
"try...except KeyError". This would be an implementation detail for a dict
but affect the contents of a defaultdict:

>>> from collections import defaultdict
>>> def update(d):
...     for c in "abc":
...         try: d[c]
...         except KeyError: d[c] = c
...
>>> d = defaultdict(lambda:"-")
>>> update(d)
>>> d
defaultdict(<function <lambda> at 0x7fd4ce32a320>, {'a': '-', 'c': '-', 'b': '-'})
>>> def update2(d):
...     for c in "abc":
...         if c not in d:
...             d[c] = c
...
>>> d = defaultdict(lambda:"-")
>>> update2(d)
>>> d
defaultdict(<function <lambda> at 0x7fd4ce32a6e0>, {'a': 'a', 'c': 'c', 'b': 'b'})

Peter

wheres pythonmonks

Jul 30, 2010, 8:34:52 AM
to Python List, Steven D'Aprano
Sorry, doesn't the following make a copy?

>>>> from collections import defaultdict as dd
>>>> x = dd(int)
>>>> x[1] = 'a'
>>>> x
> defaultdict(<type 'int'>, {1: 'a'})
>>>> dict(x)
> {1: 'a'}
>
>


I was hoping not to do that -- e.g., actually reuse the same
underlying data. Maybe dict(x), where x is a defaultdict is smart? I
agree that a defaultdict is safe to pass to most routines, but I guess
I could imagine that a try/except block is used in a bit of code where
on the key exception (when the value is absent) populates the value
with a random number. In that application, a defaultdict would have
no random values.


Besides a slightly different flavor, does the following have
applications not covered by defaultdict?

m.setdefault('key', []).append(1)

I think I am unclear on the difference between that and:

m['key'] = m.get('key',[]).append(1)

Except that the latter works for immutable values as well as containers.


Steven D'Aprano

Jul 30, 2010, 11:47:49 PM
On Fri, 30 Jul 2010 08:34:52 -0400, wheres pythonmonks wrote:

> Sorry, doesn't the following make a copy?
>
>>>>> from collections import defaultdict as dd
>>>>> x = dd(int)
>>>>> x[1] = 'a'
>>>>> x
>> defaultdict(<type 'int'>, {1: 'a'})
>>>>> dict(x)
>> {1: 'a'}
>>
>>
>>
>
> I was hoping not to do that -- e.g., actually reuse the same underlying
> data.


It does re-use the same underlying data.

>>> from collections import defaultdict as dd

>>> x = dd(list)
>>> x[1].append(1)
>>> x
defaultdict(<type 'list'>, {1: [1]})
>>> y = dict(x)
>>> x[1].append(42)
>>> y
{1: [1, 42]}

Both the defaultdict and the dict are referring to the same underlying
key:value pairs. The data itself isn't duplicated. If they are mutable
items, a change to one will affect the other (because they are the same
item). An analogy for C programmers would be that creating dict y from
defaultdict x merely copies the pointers to the keys and values, it doesn't copy
the data being pointed to.

(That's pretty much what the CPython implementation does. Other
implementations may do differently, so long as the visible behaviour
remains the same.)
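
To make the "shared values, separate containers" point concrete, here is a short sketch, assuming Python 3. It extends the session above with one extra step (adding a key to x after the conversion):

```python
from collections import defaultdict

x = defaultdict(list)
x[1].append(1)
y = dict(x)            # new plain dict holding the same value objects

x[1].append(42)        # mutating a shared value is visible through y
x[2].append("new")     # adding a key to x does not add it to y
```

So dict(x) is a shallow copy: the two containers are independent, but the objects they refer to are shared.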

> Maybe dict(x), where x is a defaultdict is smart? I agree that a
> defaultdict is safe to pass to most routines, but I guess I could
> imagine that a try/except block is used in a bit of code where on the
> key exception (when the value is absent) populates the value with a
> random number. In that application, a defaultdict would have no random
> values.

If you want a defaultdict with a random default value, it is easy to
provide:

>>> import random
>>> z = dd(random.random)
>>> z[2] += 0
>>> z
defaultdict(<built-in method random of Random object at 0xa01e4ac>, {2: 0.30707092626033605})


The point which I tried to make, but obviously failed, is that any piece
of code has certain expectations about the data it accepts. If you take a
function that expects an int between -2 and 99, and instead decide to
pass a Decimal between 100 and 150, then you'll have problems: if you're
lucky, you'll get an exception, if you're unlucky, it will silently give
the wrong results. Changing a dict to a defaultdict is no different.

If you have code that *relies* on getting a KeyError for missing keys:

def who_is_missing(adict):
    for person in ("Fred", "Barney", "Wilma", "Betty"):
        try:
            adict[person]
        except KeyError:
            print person, "is missing"

then changing adict to a defaultdict will cause the function to
misbehave. That's not unique to dicts and defaultdicts.
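
One way to write such a check so that it behaves the same for a dict and a defaultdict is to test membership instead of indexing, since neither `in` nor `.get()` triggers the default factory. A sketch, assuming Python 3; returning a list (rather than printing) and the `people` parameter are choices made here for illustration:

```python
from collections import defaultdict

def who_is_missing(adict, people=("Fred", "Barney", "Wilma", "Betty")):
    # "in" never calls default_factory, so no keys are created as a side effect.
    return [person for person in people if person not in adict]

plain = {"Fred": 1}
dd = defaultdict(int, {"Fred": 1})

missing_plain = who_is_missing(plain)
missing_dd = who_is_missing(dd)
looked_up = dd.get("Barney")   # .get() also leaves the defaultdict untouched
```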

> Besides a slightly different flavor, does the following have applications
> not covered by defaultdict?
>
> m.setdefault('key', []).append(1)

defaultdict calls a function of no arguments to provide a default value.
That means, in practice, it almost always uses the same default value for
any specific dict.

setdefault takes an argument when you call the function. So you can
provide anything you like at runtime.
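
A small sketch of that difference, assuming Python 3; the keys "evens" and "total" are invented for the example:

```python
m = {}
m.setdefault("evens", []).append(2)            # default is a list here
m.setdefault("total", 0)                       # and an int here
m["total"] += 2
m.setdefault("evens", ["ignored"]).append(4)   # key exists: default unused
```

Note that the default argument is evaluated on every call, even when the key already exists and the default is discarded.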


> I think I am unclear on the difference between that and:
>
> m['key'] = m.get('key',[]).append(1)

Have you tried it? I guess you haven't, or you wouldn't have thought they
did the same thing.

Hint -- what does [].append(1) return?
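
The hint can be checked in a few lines. A sketch, assuming Python 3, with `m` and `n` as throwaway dict names:

```python
m = {}
result = [].append(1)                    # append mutates in place and returns None
m["key"] = m.get("key", []).append(1)    # so this stores None, not [1]

n = {}
n.setdefault("key", []).append(1)        # stores the list first, then appends to it
```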


--
Steven

wheres pythonmonks

Jul 31, 2010, 1:02:47 AM
to pytho...@python.org
>
> Hint -- what does [].append(1) return?
>

Again, apologies from a Python beginner. It sure seems like one has
to do gymnastics to get good behavior out of the core-python:

Here's my proposed fix:

m['key'] = (lambda x: x.append(1) or x)(m.get('key',[]))

Yuck! So I guess I'll use defaultdict with upcasts to dict as needed.

On a side note: does up-casting always work that way with shared
(common) data from derived to base? (I mean if the data is part of
base's interface, will b = base(child) yield a new base object that
shares data with the child?)

Thanks again from a Perl-to-Python convert!

W


Steven D'Aprano

Jul 31, 2010, 5:55:22 AM
On Sat, 31 Jul 2010 01:02:47 -0400, wheres pythonmonks wrote:


>> Hint -- what does [].append(1) return?
>>
>>
> Again, apologies from a Python beginner. It sure seems like one has to
> do gymnastics to get good behavior out of the core-python:
>
> Here's my proposed fix:
>
> m['key'] = (lambda x: x.append(1) or x)(m.get('key',[]))
>
> Yuck!

Yuk is right. What's wrong with the simple, straightforward solution?

L = m.get('key', [])
L.append(1)
m['key'] = L


Not everything needs to be a one-liner. But if you insist on making it a
one-liner, that's what setdefault and defaultdict are for.

> So I guess I'll use defaultdict with upcasts to dict as needed.

You keep using that term "upcast". I have no idea what you think it
means, so I have no idea whether or not Python does it. Perhaps you
should explain what you think "upcasting" is.


> On a side note: does up-casting always work that way with shared
> (common) data from derived to base? (I mean if the data is part of
> base's interface, will b = base(child) yield a new base object that
> shares data with the child?)

Of course not. It depends on the implementation of the class.


--
Steven

wheres pythonmonks

Jul 31, 2010, 11:00:53 AM
to pytho...@python.org
I think of an upcast as casting to the base-class (casting up the
inheritance tree).
http://en.wiktionary.org/wiki/upcast
But really, what I am thinking of doing is overriding the virtual
methods of a derived class with the base class behavior in an object
that I can then pass into methods that are base/derived agnostic.

defaultdict is the way to go.

W

<ps>
<rant>

Sadly, there are guidelines that I program by that are perhaps anti-pythonic:

1. Don't use "extra" variables in code. Don't use global variables.
Keep the scopes of local variables at a minimum to reduce state (the
exception being for inner loops) or variables explicitly identified as
part of the algorithm before implementation. [In python, just about
everything is a variable which is terrifying to me. I never want the
Alabama version of math.pi i.e.,
http://www.snopes.com/religion/pi.asp, or math.sin being "666".]

2. Use built-in functions/features as much as possible, as these are
the most tested. Don't roll your own -- you're not that good, instead
master the language. (How often do I invent a noun in English? Not
even "upcast"!) [Plus, guys with phds probably already did what you
need.] Use only very well known libraries -- numpy is okay (I hope!)
for example. An exception can be made while interfacing external
data, because others who create data may not have abided by rule #2.
In most cases (except gui programming, which again tackles the
external interfacing program) the more heavy-weight your API, the more
wrong you are.

3. In interpreted languages, avoid function calls, unless the
function does something significant. [e.g., Function call overhead
tends to be worse than a dictionary lookup -- and yes I used timeit,
the overhead can be 100%.] Small functions and methods (and
callbacks) hamper good interpreted code. When writing functions, make
them operate on lists/dicts.

It is because of the above that I stopped writing object-oriented Perl.

So I want "big" functions that do a lot of work with few variable
names. Ideally, I'd create only variables that are relevant based on
the description of the algorithm. [Oh yeah, real programming is done
before the implementation in python or C++.]

My problems are compounded by the lack of indentation-based scope, but I
see this as simply enforcing the full use of functional-programming
approaches.

</rant>
</ps>


Christian Heimes

Jul 31, 2010, 11:08:11 AM
to pytho...@python.org
Am 30.07.2010 14:34, schrieb wheres pythonmonks:
> I was hoping not to do that -- e.g., actually reuse the same
> underlying data. Maybe dict(x), where x is a defaultdict is smart? I

> agree that a defaultdict is safe to pass to most routines, but I guess
> I could imagine that a try/except block is used in a bit of code where
> on the key exception (when the value is absent) populates the value
> with a random number. In that application, a defaultdict would have
> no random values.

defaultdict not only behaves like an ordinary dict except for missing
keys, it's also a subclass of Python's builtin dict type. You are able
to use a defaultdict just like a normal dict all over Python. You can
also provide your own custom implementation of a defaultdict that fits
your needs. All you have to do is subclass dict and implement a
__missing__ method. See
http://docs.python.org/library/stdtypes.html?highlight=__missing__#mapping-types-dict
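
As an illustration of that approach, here is a hedged sketch of a dict subclass whose default depends on the key, which a defaultdict factory cannot do because it receives no arguments. It assumes Python 3, and the class name `KeyEcho` is hypothetical:

```python
class KeyEcho(dict):
    """Hypothetical dict subclass: a missing key maps to its upper-case form."""
    def __missing__(self, key):
        value = key.upper()
        self[key] = value   # storing the value mirrors defaultdict; it is optional
        return value

d = KeyEcho(a="A")
b_value = d["b"]            # triggers __missing__("b")
z_value = d.get("zzz")      # .get() does not consult __missing__
```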

Christian

John Posner

Jul 31, 2010, 1:31:35 PM
to Christian Heimes, pytho...@python.org
On 7/31/2010 11:08 AM, Christian Heimes wrote:

> ... All you have to do is subclass dict and implement a

Caveat -- there's another description of defaultdict here:

http://docs.python.org/library/collections.html#collections.defaultdict

... and it's bogus. This other description claims that __missing__ is a
method of defaultdict, not of dict.

This might cause considerable confusion, leading the reader to suspect
that __missing__ and default_factory fight it out for the right to
supply a default value. (__missing__ would win -- I tried it.)
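
The experiment described above can be reproduced roughly like this. It is a sketch, assuming Python 3, and the class name `Overridden` is made up here:

```python
from collections import defaultdict

class Overridden(defaultdict):
    def __missing__(self, key):
        # Replaces defaultdict's own __missing__, so default_factory
        # is never consulted for lookups on this subclass.
        return "from __missing__"

d = Overridden(lambda: "from default_factory")
value = d["absent"]
```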

The truth, as Christian says above and as Raymond Hettinger recently
pointed out [1], is that __missing__ is used to *define* defaultdict as
a subclass of dict -- it's not used *by* defaultdict.

-John

[1] http://mail.python.org/pipermail/python-list/2010-July/1248896.html

Christian Heimes

Jul 31, 2010, 2:00:08 PM
to pytho...@python.org
> The truth, as Christian says above and as Raymond Hettinger recently
> pointed out [1], is that __missing__ is used to *define* defaultdict as
> a subclass of dict -- it's not used *by* defaultdict.

Your answer is confusing even me. ;)

Let me try an easier to understand explanation. defaultdict *implements*
__missing__() to provide the default dict behavior. You don't have to
subclass from defaultdict to get the __missing__() feature. It's part of
the dict interface since Python 2.5.

Christian

John Posner

Aug 1, 2010, 2:37:34 PM
to Christian Heimes, pytho...@python.org
On 7/31/2010 2:00 PM, Christian Heimes wrote:
>
> Your answer is confusing even me. ;)

Yeah, I get that a lot. :-)


> Let me try an easier to understand explanation. defaultdict *implements*
> __missing__() to provide the default dict behavior.

In my experience, the word *implements* is commonly used in two ways,
nearly opposite to each other. Ex:

My company just implemented a version-control system.

Did your company (1) write the code for the version-control system, or
did it (2) put the system in use, by downloading an installer from the
Web and executing it?

-John

John Posner

Aug 2, 2010, 11:00:51 PM
to pytho...@python.org
On 7/31/2010 1:31 PM, John Posner wrote:
>
> Caveat -- there's another description of defaultdict here:
>
> http://docs.python.org/library/collections.html#collections.defaultdict
>
> ... and it's bogus. This other description claims that __missing__ is a
> method of defaultdict, not of dict.

Following is a possible replacement for the bogus description. Comments
welcome. I intend to submit a Python doc bug, and I'd like to have a
clean alternative to propose.

--------------

class collections.defaultdict([default_factory[, ...]])

defaultdict is a dict subclass that can guarantee success on key
lookups: if a key does not currently exist in a defaultdict object, a
"default value factory" is called to provide a value for that key. The
"default value factory" is a callable object (typically, a function)
that takes no arguments. You specify this callable as the first argument
to defaultdict(). Additional defaultdict() arguments are the same as for
dict().

The "default value factory" callable is stored as an attribute,
default_factory, of the newly created defaultdict object. If you call
defaultdict() with no arguments, or with None as the first argument, the
default_factory attribute is set to None. You can reassign the
default_factory attribute of an existing defaultdict object to another
callable, or to None.

When a lookup of a non-existent key is performed in a defaultdict
object, its default_factory attribute is evaluated, and the resulting
object is called:

* If the call produces a value, that value is returned as the result of
the lookup. In addition, the key-value pair is inserted into the
defaultdict.

* If the call raises an exception, it is propagated unchanged.

* If the default_factory attribute evaluates to None, a KeyError
exception is raised, with the non-existent key as its argument. (The
defaultdict behaves exactly like a standard dict in this case.)
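
The three cases in the proposed description can be exercised with a short sketch. It assumes Python 3, and the helper `broken` is hypothetical, introduced only to show exception propagation:

```python
from collections import defaultdict

d = defaultdict(list)
d["a"].append(1)                 # factory called, key-value pair inserted
factory_is_list = d.default_factory is list

d.default_factory = None         # now behaves like a plain dict
try:
    d["b"]
    raised = None
except KeyError as exc:
    raised = exc.args            # the missing key is the exception argument

def broken():
    raise ValueError("factory failed")

d.default_factory = broken       # exceptions from the factory propagate
try:
    d["c"]
    propagated = False
except ValueError:
    propagated = True
```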

Ethan Furman

Aug 3, 2010, 12:54:46 PM
to pytho...@python.org, John Posner
John Posner wrote:
> On 7/31/2010 1:31 PM, John Posner wrote:
>>
>> Caveat -- there's another description of defaultdict here:
>>
>> http://docs.python.org/library/collections.html#collections.defaultdict
>>
>> ... and it's bogus. This other description claims that __missing__ is a
>> method of defaultdict, not of dict.
>

I think mentioning how __missing__ plays into all this would be helpful.
Perhaps in the first paragraph, after the colon:

if a key does not currently exist in a defaultdict object, __missing__
will be called with that key, which in turn will call a "default value
factory" to provide a value for that key.

~Ethan~

John Posner

Aug 3, 2010, 5:24:53 PM
to Ethan Furman, pytho...@python.org, Christian Heimes
On 8/3/2010 12:54 PM, Ethan Furman wrote:

<snip>

> I think mentioning how __missing__ plays into all this would be helpful.
> Perhaps in the first paragraph, after the colon:
>
> if a key does not currently exist in a defaultdict object, __missing__
> will be called with that key, which in turn will call a "default value
> factory" to provide a value for that key.

Thanks, Ethan. As I said (or at least implied) to Christian earlier in
this thread, I don't want to repeat the mistake of the current
description: confusing the functionality provided *by* the defaultdict
class with underlying functionality (the dict type's __missing__
protocol) that is used in the definition of the class.

So I'd rather not mention __missing__ in the first paragraph, which
describes the functionality provided *by* the defaultdict class. How
about adding this para at the end:

defaultdict is defined using functionality that is available to *any*
subclass of dict: a missing-key lookup automatically causes the
subclass's __missing__ method to be called, with the non-existent key
as its argument. The method's return value becomes the result of the
lookup.

BTW, I couldn't *find* the coding of defaultdict in the Python 2.6
library. File collections.py contains this code:

from _abcoll import *
import _abcoll
__all__ += _abcoll.__all__

from _collections import deque, defaultdict

... but I ran into a dead end after that. :-( I believe that the
following *could be* the definition of defaultdict:

class defaultdict(dict):
    def __init__(self, factory, *args, **kwargs):
        dict.__init__(self, *args, **kwargs)
        self.default_factory = factory

    def __missing__(self, key):
        """provide value for missing key"""
        value = self.default_factory()  # call factory with no args
        self[key] = value
        return value

-John

Christian Heimes

Aug 3, 2010, 5:47:23 PM
to pytho...@python.org
> So I'd rather not mention __missing__ in the first paragraph, which
> describes the functionality provided *by* the defaultdict class. How
> about adding this para at the end:
>
> defaultdict is defined using functionality that is available to *any*
> subclass of dict: a missing-key lookup automatically causes the
> subclass's __missing__ method to be called, with the non-existent key
> as its argument. The method's return value becomes the result of the
> lookup.

Your proposal sounds like a good idea.

By the way do you have a CS degree? Your wording sounds like you are
used to writing theses at a CS degree level. No offense. ;)

> BTW, I couldn't *find* the coding of defaultdict in the Python 2.6
> library. File collections.py contains this code:
>
> from _abcoll import *
> import _abcoll
> __all__ += _abcoll.__all__
>
> from _collections import deque, defaultdict

defaultdict is implemented in C. You can read up the source code at
http://svn.python.org/view/python/trunk/Modules/_collectionsmodule.c?revision=81029&view=markup
. Search for "defaultdict type". The C code isn't complicated. You
should understand the concept even if you are not familiar with the C
API of Python.

> class defaultdict(dict):
>     def __init__(self, factory, *args, **kwargs):
>         dict.__init__(self, *args, **kwargs)
>         self.default_factory = factory
>
>     def __missing__(self, key):
>         """provide value for missing key"""
>         value = self.default_factory()  # call factory with no args
>         self[key] = value
>         return value

The type also implements __repr__(), copy() and __reduce__(). The latter
is used by the pickle protocol. Without a new __reduce__ method, the
default_factory would not survive a pickle/unpickle cycle. For a pure
Python implementation you'd have to add __slots__ = "default_factory",
too. Otherwise every defaultdict instance would gain an unnecessary
__dict__ attribute, too.
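
The pickle round trip can be observed with the built-in type. A sketch, assuming Python 3 and a factory that pickle can locate by name (here the built-in `int`; a lambda factory generally cannot be pickled by the standard pickle module):

```python
import pickle
from collections import defaultdict

original = defaultdict(int, {"a": 1})
restored = pickle.loads(pickle.dumps(original))
restored["b"] += 5               # the factory still works after unpickling
```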

Christian

John Posner

Aug 3, 2010, 6:04:27 PM
to Christian Heimes, pytho...@python.org
On 8/3/2010 5:47 PM, Christian Heimes wrote:
>> So I'd rather not mention __missing__ in the first paragraph, which
>> describes the functionality provided *by* the defaultdict class. How
>> about adding this para at the end:
>>
>> defaultdict is defined using functionality that is available to *any*
>> subclass of dict: a missing-key lookup automatically causes the
>> subclass's __missing__ method to be called, with the non-existent key
>> as its argument. The method's return value becomes the result of the
>> lookup.
>
> Your proposal sounds like a good idea.

Tx.

> By the way do you have a CS degree? Your wording sounds like you are
> used write theses on a CS degree level. No offense. ;)

No CS degree (coulda, woulda, shoulda). I think what you're hearing is
30+ years of tech writing for computer software companies.

-John

Ethan Furman

Aug 3, 2010, 6:07:01 PM
to pytho...@python.org
John Posner wrote:
> On 8/3/2010 12:54 PM, Ethan Furman wrote:
>
> <snip>
>
>> I think mentioning how __missing__ plays into all this would be helpful.
>> Perhaps in the first paragraph, after the colon:
>>
>> if a key does not currently exist in a defaultdict object, __missing__
>> will be called with that key, which in turn will call a "default value
>> factory" to provide a value for that key.
>
> Thanks, Ethan. As I said (or at least implied) to Christian earlier in
> this thread, I don't want to repeat the mistake of the current
> description: confusing the functionality provided *by* the defaultdict
> class with underlying functionality (the dict type's __missing__
> protocol) that is used in the definition of the class.

I just went and read the entry that had the bogus claim -- personally, I
didn't see any confusion. I would like to point out that __missing__ is
*not* part of dicts (tested on 2.5 and 2.6 -- don't have 2.7 installed yet).

Having said that, I think your final paragraph is better than my first
paragraph edit.

> So I'd rather not mention __missing__ in the first paragraph, which
> describes the functionality provided *by* the defaultdict class. How
> about adding this para at the end:
>
> defaultdict is defined using functionality that is available to *any*
> subclass of dict: a missing-key lookup automatically causes the
> subclass's __missing__ method to be called, with the non-existent key

> as its argument. The method's return value becomes the result of the
> lookup.
>

> BTW, I couldn't *find* the coding of defaultdict in the Python 2.6
> library. File collections.py contains this code:
>
> from _abcoll import *
> import _abcoll
> __all__ += _abcoll.__all__
>
> from _collections import deque, defaultdict
>

> ... but I ran into a dead end after that. :-( I believe that the
> following *could be* the definition of defaultdict:
>

> class defaultdict(dict):
>     def __init__(self, factory, *args, **kwargs):
>         dict.__init__(self, *args, **kwargs)
>         self.default_factory = factory
>
>     def __missing__(self, key):
>         """provide value for missing key"""
>         value = self.default_factory()  # call factory with no args
>         self[key] = value
>         return value

I think it's more along these lines:

class defaultdict(dict):
    def __init__(self, factory=None, *args, **kwargs):
        dict.__init__(self, *args, **kwargs)
        self.default_factory = factory

    def __missing__(self, key):
        "provide value for missing key"
        if self.default_factory is None:
            raise KeyError("blah blah blah")
        value = self.default_factory()
        self[key] = value
        return value
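
The behaviour described by that sketch can be checked against the real
collections.defaultdict. The following is only a minimal sketch, assuming
Python 3 (the sessions in this thread are Python 2); the names counts,
plain and probe are made up for the example.

```python
from collections import defaultdict

# With a factory: a missing key is created on first lookup.
counts = defaultdict(lambda: 1)
first = counts["a"]        # the factory is called with no arguments
stored = "a" in counts     # the value is now stored under the key

# Without a factory, default_factory is None and lookups raise KeyError.
plain = defaultdict()
try:
    plain["missing"]
    raised = False
except KeyError:
    raised = True

# .get() does not go through __missing__, so it never inserts anything.
probe = counts.get("b")

print(first, stored, raised, probe, "b" in counts)
```

Note that only subscript lookup (d[key]) triggers the factory; get() and
the "in" operator leave the dictionary unchanged.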

~Ethan~

Ethan Furman

Aug 3, 2010, 6:15:40 PM8/3/10
to pytho...@python.org
John Posner wrote:
> On 7/31/2010 1:31 PM, John Posner wrote:
>>
>> Caveat -- there's another description of defaultdict here:
>>
>> http://docs.python.org/library/collections.html#collections.defaultdict
>>
>> ... and it's bogus. This other description claims that __missing__ is a
>> method of defaultdict, not of dict.

__missing__ isn't a method of dict:

--> print dir(dict())
['__class__', '__cmp__', '__contains__', '__delattr__', '__delitem__',
'__doc__', '__eq__', '__ge__', '__getattribute__', '__getitem__',
'__gt__', '__hash__', '__init__', '__iter__', '__le__', '__len__',
'__lt__', '__ne__', '__new__', '__reduce__', '__reduce_ex__',
'__repr__', '__setattr__', '__setitem__', '__str__', 'clear', 'copy',
'fromkeys', 'get', 'has_key', 'items', 'iteritems', 'iterkeys',
'itervalues', 'keys', 'pop', 'popitem', 'setdefault', 'update',
'values']

I will agree that the current defaultdict description does not make it
clear that __missing__ can be defined for *any* subclass of dict,
although the dict description does go over this... is that the confusion
you are talking about? If not, could you explain?

~Ethan~

Christian Heimes

Aug 3, 2010, 6:35:30 PM8/3/10
to pytho...@python.org
> I just went and read the entry that had the bogus claim -- personally, I
> didn't see any confusion. I would like to point out that __missing__ is
> *not* part of dicts (tested on 2.5 and 2.6 -- don't have 2.7 installed yet).

I beg your pardon but you are wrong. __missing__ is available for all
*subclasses* of dict since Python 2.5. See
http://svn.python.org/view/python/branches/release25-maint/Objects/dictobject.c?revision=81031&view=markup

>>> class mydict(dict):
...     def __missing__(self, key):
...         print "__missing__", key
...         raise KeyError(key)
...
>>> m = mydict()
>>> m[1]
__missing__ 1
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 4, in __missing__
KeyError: 1

Christian

Ethan Furman

Aug 3, 2010, 6:48:02 PM8/3/10
to pytho...@python.org

Perhaps punctuation will help clarify my intent:

__missing__ is *not* part of (dict)s, as shown by dir(dict()):

['__class__', '__cmp__', '__contains__', '__delattr__', '__delitem__',
'__doc__', '__eq__', '__ge__', '__getattribute__', '__getitem__',
'__gt__', '__hash__', '__init__', '__iter__', '__le__', '__len__',
'__lt__', '__ne__', '__new__', '__reduce__', '__reduce_ex__',
'__repr__', '__setattr__', '__setitem__', '__str__', 'clear', 'copy',
'fromkeys', 'get', 'has_key', 'items', 'iteritems', 'iterkeys',
'itervalues', 'keys', 'pop', 'popitem', 'setdefault', 'update',
'values']

And, just to state what is hopefully obvious, if you don't create
__missing__ yourself, it still isn't in the subclass:

--> class somedict(dict):
...     "Is __missing__ defined if I don't define it? Nope."
...
--> sd = somedict()
--> sd[1]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
KeyError: 1
--> dir(sd)
['__class__', '__cmp__', '__contains__', '__delattr__', '__delitem__',
'__dict__', '__doc__', '__eq__', '__ge__', '__getattribute__',
'__getitem__', '__gt__', '__hash__', '__init__', '__iter__', '__le__',
'__len__', '__lt__', '__module__', '__ne__', '__new__', '__reduce__',
'__reduce_ex__', '__repr__', '__setattr__', '__setitem__', '__str__',
'__weakref__', 'clear', 'copy', 'fromkeys', 'get', 'has_key', 'items',
'iteritems', 'iterkeys', 'itervalues', 'keys', 'pop', 'popitem',
'setdefault', 'update', 'values']

~Ethan~

Christian Heimes

Aug 3, 2010, 8:16:32 PM8/3/10
to pytho...@python.org
> Perhaps punctuation will help clarify my intent:
>
> __missing__ is *not* part of (dict)s, as shown by dir(dict()):

Indeed, that's correct. Can we agree, that __missing__ is an optional
feature of the dict interface, that can be implemented in subclasses of
dict?

Christian

Ethan Furman

Aug 3, 2010, 9:09:32 PM8/3/10
to pytho...@python.org

Absolutely.

~Ethan~

John Posner

Aug 3, 2010, 10:49:06 PM8/3/10
to Ethan Furman, pytho...@python.org

Right, a __missing__ method does not magically appear in the subclass.
Rather, the subclass is allowed (but not required) to define a method
named __missing__, which will "magically" be called in certain situations.

Here's a dict subclass that uses the "magic" for a purpose that has
nothing to do with default values:

class BadKeyTrackerDict(dict):
    def __init__(self, *args, **kwargs):
        dict.__init__(self, *args, **kwargs)
        self.bad_keys = set([])

    def __missing__(self, key):
        """
        add missing key to "bad keys" set
        """
        self.bad_keys.add(key)
        raise KeyError(key)

Note that "defaultdict" is nowhere in sight here. It's the dict class
(or type) itself that provides the magic -- but only for its subclasses.
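
To make the tracking idea concrete, here is a small usage sketch. It
assumes Python 3 (hence super() and set() rather than the 2.x spellings),
and the sample data (ages, the names looked up) is invented purely for
illustration.

```python
class BadKeyTrackerDict(dict):
    """dict subclass that remembers every key that was looked up but absent."""

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.bad_keys = set()

    def __missing__(self, key):
        # Record the failed lookup, then fail exactly as a plain dict would.
        self.bad_keys.add(key)
        raise KeyError(key)


ages = BadKeyTrackerDict(alice=30)   # hypothetical sample data
for name in ("alice", "bob", "carol", "bob"):
    try:
        ages[name]
    except KeyError:
        pass   # the lookup still fails; we only want the bookkeeping

print(sorted(ages.bad_keys))
```

The dictionary itself is never modified by the failed lookups; only the
bad_keys set grows.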

>
> --> class somedict(dict):
> ...     "Is __missing__ defined if I don't define it? Nope."
> ...
> --> sd = somedict()
> --> sd[1]
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
> KeyError: 1
> --> dir(sd)
> ['__class__', '__cmp__', '__contains__', '__delattr__', '__delitem__',
> '__dict__', '__doc__', '__eq__', '__ge__', '__getattribute__',
> '__getitem__', '__gt__', '__hash__', '__init__', '__iter__', '__le__',
> '__len__', '__lt__', '__module__', '__ne__', '__new__', '__reduce__',
> '__reduce_ex__', '__repr__', '__setattr__', '__setitem__', '__str__',
> '__weakref__', 'clear', 'copy', 'fromkeys', 'get', 'has_key', 'items',
> 'iteritems', 'iterkeys', 'itervalues', 'keys', 'pop', 'popitem',
> 'setdefault', 'update', 'values']
>
> ~Ethan~

-John

John Posner

Aug 6, 2010, 4:07:42 PM8/6/10
to pytho...@python.org
On 8/2/2010 11:00 PM, John Posner wrote:
> On 7/31/2010 1:31 PM, John Posner wrote:
>>
>> Caveat -- there's another description of defaultdict here:
>>
>> http://docs.python.org/library/collections.html#collections.defaultdict
>>
>> ... and it's bogus. This other description claims that __missing__ is a
>> method of defaultdict, not of dict.
>
> Following is a possible replacement for the bogus description. Comments
> welcome. I intend to submit a Python doc bug, and I'd like to have a
> clean alternative to propose.


After some off-list discussion with Ethan Furman (many thanks!), the
Python Doc bug is submitted: #9536 at bugs.python.org.

-John

Wolfram Hinderer

Aug 6, 2010, 6:24:21 PM8/6/10
to

This is probably nitpicking, but the patch calls __missing__ a special
method. However, unlike special methods, it is not invoked by "special
syntax" but by the dict's __getitem__ method. (len() invokes __len__
on any object - you can't do something similar with __missing__.)

__missing__ is also not listed as a special method on
http://docs.python.org/py3k/reference/datamodel.html#special-method-names

However, "normal" special method lookup seems to be used.
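
A quick way to see the lookup rule in action is the sketch below. It
assumes CPython 3; the class name Plain and the lambdas are hypothetical
test scaffolding. A __missing__ attached to the instance is ignored,
while one defined on the class is found.

```python
class Plain(dict):
    """dict subclass that does not define __missing__ itself."""


d = Plain()

# Attach __missing__ to the instance only.
d.__missing__ = lambda key: "from instance"
try:
    d["x"]
    instance_hook_used = True
except KeyError:
    instance_hook_used = False   # the instance attribute was not consulted

# Define it on the class instead: now the lookup finds it.
Plain.__missing__ = lambda self, key: "from class"
class_result = d["x"]

print(instance_hook_used, class_result)
```

This matches how other special methods such as __len__ are looked up: on
the type, bypassing the instance dictionary.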

John Posner

Aug 6, 2010, 7:29:22 PM8/6/10
to wolfram....@googlemail.com
On 8/6/2010 6:24 PM, Wolfram Hinderer wrote:
>
> This is probably nitpicking, but the patch calls __missing__ a special
> method. However, unlike special methods, it is not invoked by "special
> syntax" but by the dict's __getitem__ method. (len() invokes __len__
> on any object - you can't do something similar with __missing__.)
>
> __missing__ is also not listed as a special method on
> http://docs.python.org/py3k/reference/datamodel.html#special-method-names
>
> However, "normal" special method lookup seems to be used.

Fair enough. Please add your comment to #9536 at bugs.python.org.

Tx,
John

Stefan Schwarzer

Aug 7, 2010, 6:41:12 AM8/7/10
to
On 2010-07-31 05:47, Steven D'Aprano wrote:
> On Fri, 30 Jul 2010 08:34:52 -0400, wheres pythonmonks wrote:
> It does re-use the same underlying data.
>
> >>> from collections import defaultdict as dd
> >>> x = dd(list)
> >>> x[1].append(1)
> >>> x
> defaultdict(<type 'list'>, {1: [1]})
> >>> y = dict(x)
> >>> x[1].append(42)
> >>> y
> {1: [1, 42]}

One thing to keep in mind: dict(some_defaultdict) doesn't
store a reference to the defaultdict; instead it makes a
shallow copy, so key/value pairs added _after_ the "cast"
aren't included in the new dict:

>>> y[2] = 17
>>> y
{1: [1, 42], 2: 17}
>>> x
defaultdict(<type 'list'>, {1: [1, 42]})
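
Both effects (shared values, independent keys) can be seen side by side in
the following sketch, which assumes Python 3; the extra keys 2 and 3 are
added only for illustration.

```python
from collections import defaultdict

x = defaultdict(list)
x[1].append(1)

y = dict(x)        # shallow copy: a new plain dict holding the same list objects

x[1].append(42)    # mutates the list shared by x and y
x[2].append(7)     # creates a new key in x only
y[3] = 17          # creates a new key in y only

print(y)
print(dict(x))
```

If fully independent values are needed, copy.deepcopy(x) is one option,
at the cost of copying every nested list.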

Stefan

David Niergarth

Aug 12, 2010, 4:28:26 PM8/12/10
to
Peter Otten <__pete...@web.de> wrote:
>
> >>> 1 .conjugate()
>

This is a syntax I never noticed before. My built-in compiler (eyes)
took one look and said: "that doesn't work." Has this always worked in
Python but I never noticed? I see other instance examples also work.

>>> '1' .zfill(2)
'01'
>>> 1.0 .is_integer()
True

and properties

>>> 1.0 .real
1.0

Curiously, this works

David Niergarth

Aug 12, 2010, 4:37:04 PM8/12/10
to
[Oops, now complete...]

and properties

  >>> 1.0 .real
  1.0

Curiously, a float literal works without space

>>> 1.0.conjugate()
1.0

but not an int.

>>> 1.conjugate()
  File "<stdin>", line 1
    1.conjugate()
              ^
SyntaxError: invalid syntax

Anyway, I didn't realize int has a method you can call.

--David

Peter Otten

Aug 12, 2010, 4:52:03 PM8/12/10
to
David Niergarth wrote:

> [Oops, now complete...]
> Peter Otten <__pete...@web.de> wrote:
>>
>> > >>> 1 .conjugate()
>>
> This is a syntax I never noticed before. My built-in compiler (eyes)
> took one look and said: "that doesn't work."

(1).conjugate may hurt a little less. Anyway, the space is only needed for
the tokenizer that without it would produce a float immediately followed by
a name.

> Has this always worked in
> Python but I never noticed?

Probably.

Steven D'Aprano

Aug 13, 2010, 1:51:40 AM8/13/10
to
On Thu, 12 Aug 2010 13:28:26 -0700, David Niergarth wrote:

> Peter Otten <__pete...@web.de> wrote:
>>
>> >>> 1 .conjugate()
>>
>>
> This is a syntax I never noticed before. My built-in compiler (eyes)
> took one look and said: "that doesn't work." Has this always worked in
> Python but I never noticed?


Yes. Here it is working in Python 2.2:

[steve@sylar ~]$ python2.2
Python 2.2.3 (#1, Aug 12 2010, 01:08:27)
[GCC 4.1.2 20070925 (Red Hat 4.1.2-27)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> 2 .__add__(3)
5


Syntactically, it also worked as far back as Python 1.5, although it is
rather pointless since int objects didn't gain any methods until 2.2:

[steve@sylar ~]$ python1.5
Python 1.5.2 (#1, Apr 1 2009, 22:55:54) [GCC 4.1.2 20070925 (Red Hat
4.1.2-27)] on linux2
Copyright 1991-1995 Stichting Mathematisch Centrum, Amsterdam
>>> 2 .__add__(3)
Traceback (innermost last):
  File "<stdin>", line 1, in ?
AttributeError: 'int' object has no attribute '__add__'


> I see other instance examples also work.
>
> >>> '1' .zfill(2)
> '01'

You don't need the space between strings and the attribute access:
"1".zfill(2) is fine. You only need it for numbers, due to the ambiguity
between the decimal point and dotted attribute access.
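
The different spellings can be compared without typing them at the prompt.
This is just a sketch, assuming Python 3; the helper parses() is made up
for the example and only asks whether the text compiles as an expression.

```python
def parses(source):
    """Return True if source compiles as a Python expression."""
    try:
        compile(source, "<example>", "eval")
        return True
    except SyntaxError:
        return False


results = {src: parses(src) for src in (
    "1.conjugate()",     # "1." is consumed as a float, leaving a stray name
    "1 .conjugate()",    # the space ends the number before the dot
    "(1).conjugate()",   # parentheses do the same job
    "1.0.conjugate()",   # the float is complete, so the next dot is attribute access
    '"1".zfill(2)',      # no ambiguity for string literals
)}

for src, ok in results.items():
    print(src, "->", "ok" if ok else "SyntaxError")
```

Only the first spelling is rejected; the other four all compile.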


--
Steven

Piet van Oostrum

Dec 7, 2010, 11:04:36 PM12/7/10
to
Steven D'Aprano <st...@REMOVE-THIS-cybersource.com.au> writes:

> You don't need the space between strings and the attribute access:
> "1".zfill(2) is fine. You only need it for numbers, due to the ambiguity
> between the decimal point and dotted attribute access.

Personally I prefer parentheses: (1).conjugate
--
Piet van Oostrum <pi...@vanoostrum.org>
WWW: http://pietvanoostrum.com/
PGP key: [8DAE142BE17999C4]
Now Fair Trade home furnishings at http://www.zylja.com
