
defaultdict.fromkeys returns a surprising defaultdict


Matthew Wilson

Jun 3, 2008, 4:11:06 PM
I used defaultdict.fromkeys to make a new defaultdict instance, but I
was surprised by behavior:

>>> b = defaultdict.fromkeys(['x', 'y'], list)

>>> b
defaultdict(None, {'y': <type 'list'>, 'x': <type 'list'>})

>>> b['x']
<type 'list'>

>>> b['z']
------------------------------------------------------------
Traceback (most recent call last):
  File "<ipython console>", line 1, in <module>
KeyError: 'z'

I think that what is really going on is that fromkeys makes a regular
dictionary, and then hands it off to the defaultdict class.

I find this confusing, because now I have a defaultdict that raises a
KeyError.

Do other people find this intuitive?

Would it be better if defaultdict.fromkeys raised a
NotImplementedException?

Or would it be better to redefine how defaultdict.fromkeys works, so
that it first creates the defaultdict, and then goes through the keys?

All comments welcome. If I get some positive feedback, I'm going to try
to submit a patch.

Matt

Chris

Jun 3, 2008, 4:18:59 PM

To me it's intuitive for it to raise a KeyError; after all, the key
isn't in the dictionary.

MRAB

Jun 3, 2008, 8:02:54 PM
The statement:

b = defaultdict.fromkeys(['x', 'y'], list)

is equivalent to:

b = defaultdict()
for i in ['x', 'y']:
    b[i] = list

so there's no default_factory and therefore the defaultdict will
behave like a dict. Perhaps there could be an optional third argument
to provide a default_factory.
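
A minimal sketch of that equivalence, assuming Python 3 and the standard collections module (the variable names are only illustrative):

```python
from collections import defaultdict

# fromkeys() assigns the same value to every key and never sets a factory.
b = defaultdict.fromkeys(['x', 'y'], list)

# The explicit loop it is equivalent to.
c = defaultdict()
for key in ['x', 'y']:
    c[key] = list

print(b == c)             # True: both hold the same items
print(b.default_factory)  # None: no factory was supplied

try:
    b['z']
except KeyError as exc:
    print('KeyError:', exc)   # a missing key raises, as with a plain dict
```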

Gabriel Genellina

Jun 3, 2008, 10:59:34 PM
to pytho...@python.org
On Tue, 03 Jun 2008 17:18:59 -0300, Chris <cwi...@gmail.com> wrote:
> On Jun 3, 10:11 pm, Matthew Wilson <m...@tplus1.com> wrote:

>> I used defaultdict.fromkeys to make a new defaultdict instance, but I
>> was surprised by behavior:
>>
>>     >>> b = defaultdict.fromkeys(['x', 'y'], list)
>>
>>     >>> b
>>     defaultdict(None, {'y': <type 'list'>, 'x': <type 'list'>})
>>
>>     >>> b['x']
>>     <type 'list'>
>>
>>     >>> b['z']
>>     ------------------------------------------------------------
>>     Traceback (most recent call last):
>>       File "<ipython console>", line 1, in <module>
>>     KeyError: 'z'
>>

>> I find this confusing, because now I have a defaultdict that raises a
>> KeyError.
>>

> To me it's intuitive for it to raise a KeyError; after all, the key
> isn't in the dictionary.

But the idea behind a defaultdict is to *not* raise a KeyError but use the
default_factory to create missing values. (Unfortunately there is no way
to provide a default_factory when using fromkeys).

--
Gabriel Genellina

Gabriel Genellina

Jun 3, 2008, 11:09:41 PM
to pytho...@python.org
On Tue, 03 Jun 2008 17:11:06 -0300, Matthew Wilson <ma...@tplus1.com>
wrote:

That looks reasonable. It appears there is currently no way to do what you
want (apart from using a for loop to set each key)
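
For illustration, the for-loop approach could be wrapped up roughly like this. This is only a sketch for Python 3: defaultdict_fromkeys is a hypothetical helper name, not something the standard library provides.

```python
from collections import defaultdict

def defaultdict_fromkeys(keys, factory):
    """Hypothetical helper: not part of the standard library."""
    d = defaultdict(factory)
    for key in keys:
        d[key] = factory()   # each key gets its own freshly built value
    return d

d = defaultdict_fromkeys(['x', 'y'], list)
d['x'].append(1)
d['z'].append(2)   # missing key: the factory supplies a new list
print(dict(d))     # {'x': [1], 'y': [], 'z': [2]}
```

Calling factory() once per key avoids sharing a single object among all the keys.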

--
Gabriel Genellina

Thomas Bellman

Jun 4, 2008, 1:11:50 AM
"Gabriel Genellina" <gags...@yahoo.com.ar> wrote:

> That looks reasonable. It appears there is currently no way to do what you
> want (apart from using a for loop to set each key)

You can do this:

>>> d = defaultdict.fromkeys(['x', 'y'], 0)
>>> d.default_factory = list
>>> d
defaultdict(<type 'list'>, {'y': 0, 'x': 0})
>>> d['z']
[]
>>> d
defaultdict(<type 'list'>, {'y': 0, 'x': 0, 'z': []})

The keys you give to the fromkeys() method will all be set to the
same object (the integer zero, in the case above), though, which
might not be what you want.
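
To make the caveat concrete, here is a small sketch (assuming Python 3) of what happens when the shared object is mutable, such as a list:

```python
from collections import defaultdict

d = defaultdict.fromkeys(['x', 'y'], [])  # one list object, stored twice
d.default_factory = list

d['x'].append(1)
print(d['y'])            # [1]: 'y' refers to the same list as 'x'
print(d['x'] is d['y'])  # True

d['z'].append(2)         # the factory builds a separate list for 'z'
print(d['z'] is d['x'])  # False
```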


--
Thomas Bellman, Lysator Computer Club, Linköping University, Sweden
"Don't tell me I'm burning the candle at both ! bellman @ lysator.liu.se
ends -- tell me where to get more wax!!" ! Make Love -- Nicht Wahr!

Raymond Hettinger

Jun 4, 2008, 7:24:43 AM
On Jun 3, 1:11 pm, Matthew Wilson <m...@tplus1.com> wrote:
> I used defaultdict.fromkeys to make a new defaultdict instance, but I
> was surprised by behavior:
>
>     >>> b = defaultdict.fromkeys(['x', 'y'], list)
>
>     >>> b
>     defaultdict(None, {'y': <type 'list'>, 'x': <type 'list'>})
>
>     >>> b['x']
>     <type 'list'>
>
>     >>> b['z']
>     ------------------------------------------------------------
>     Traceback (most recent call last):
>       File "<ipython console>", line 1, in <module>
>     KeyError: 'z'
>
> I think that what is really going on is that fromkeys makes a regular
> dictionary, and then hands it off to the defaultdict class.

Nope. It works like this:

def fromkeys(iterable, value):
    d = defaultdict(None)
    for k in iterable:
        d[k] = value
    return d

Use fromkeys() to set all entries to a single specific value.
Remember that setting an entry never triggers the default factory;
instead, it is triggered on the *lookup* of a missing key. To do what
you likely intended, don't use fromkeys(). Instead write something
like:

d = defaultdict(list)
for elem in iterable:
    d[elem] # factory triggered by lookup of missing key
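
One way to see the difference between assignment and lookup is to count factory calls. A rough sketch for Python 3, where counting_factory is just an illustrative name:

```python
from collections import defaultdict

calls = []

def counting_factory():
    # Illustrative factory that records each time it is invoked.
    calls.append('called')
    return []

d = defaultdict(counting_factory)

d['a'] = 0                 # assignment: factory is not invoked
calls_after_assignment = len(calls)

d['b']                     # lookup of a missing key: factory runs once
calls_after_lookup = len(calls)

d['b']                     # key now exists: factory does not run again
d.get('c')                 # get() does not invoke the factory either
calls_at_end = len(calls)

print(calls_after_assignment, calls_after_lookup, calls_at_end)  # 0 1 1
```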

> I find this confusing, because now I have a defaultdict that raises a
> KeyError.

It is only confusing because you had already convinced yourself that
the second argument in fromkeys() was the default factory function.
Of course, its real meaning is the single value assigned by fromkeys
for every insertion (the same way it works for regular dicts). Once
you realize that the second argument was the single assigned value, it
is easy to understand that the default factory function is None and
that its documented behavior is to raise a KeyError when a missing key
is looked up.

The confusion is mostly based on a misunderstanding of how
defaultdicts work (calling a factory whenever a missing key is looked
up, not when it is inserted). It is also based on a misunderstanding
of what fromkeys does (assign the same value over and over for each
key in the input iterable). Given those two misconceptions,
confusion over the result was inevitable.

> Would it be better if defaultdict.fromkeys raised a
> NotImplementedException?

No. It is a dict subclass and should provide all of those methods.
Also, it is useful in its current form.


> Or would it be better to redefine how defaultdict.fromkeys works, so
> that it first creates the defaultdict, and then goes through the keys?
>
> All comments welcome.  If I get some positive feedback, I'm going to try
> to submit a patch.

No need. The patch would be rejected. It would break existing code
that uses defaultdict.fromkeys() as designed and documented.


Raymond

Raymond Hettinger

Jun 4, 2008, 7:36:51 AM
On Jun 3, 1:11 pm, Matthew Wilson <m...@tplus1.com> wrote:
> I used defaultdict.fromkeys to make a new defaultdict instance, but I
> was surprised by behavior:
>     >>> b = defaultdict.fromkeys(['x', 'y'], list)
>     >>> b
>     defaultdict(None, {'y': <type 'list'>, 'x': <type 'list'>})

One other thought: Even after correcting the code as shown in other
posts, I don't think you're on the right track. If you know the full
population of keys at the outset and just want them to all have an
initial value, the defaultdict isn't the right tool. Instead, just
populate a regular dict in the usual way:

>>> dict((k, []) for k in 'xyz')
{'y': [], 'x': [], 'z': []}
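
On Python versions with dict comprehensions (2.7 and 3.x), the same idea can be written as below. This is only a sketch; the point is that the expression [] is evaluated once per key, so every key gets its own list:

```python
keys = 'xyz'

# The value expression runs once for each key, giving separate lists.
d = {k: [] for k in keys}
d['x'].append(1)

print(d)                 # {'x': [1], 'y': [], 'z': []} on Python 3.7+
print(d['y'] is d['z'])  # False: the lists are distinct objects
```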

The time to use defaultdict is when you *don't* want to pre-populate a
dict. The factory gets run during the lookup phase and creates your
default just-in-time for use:

>>> d = defaultdict(list)
>>> for k in 'zyzygy':
...     d[k].append(1) # empty list created on lookup if needed
...
>>> d
defaultdict(<type 'list'>, {'y': [1, 1, 1], 'z': [1, 1], 'g': [1]})
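
A typical case where the keys are not known up front is grouping. A short sketch for Python 3, using made-up data:

```python
from collections import defaultdict

# Made-up sample data, purely for illustration.
words = ['apple', 'avocado', 'banana', 'blueberry', 'cherry']

by_letter = defaultdict(list)
for word in words:
    by_letter[word[0]].append(word)   # list created on first lookup

print(dict(by_letter))
# {'a': ['apple', 'avocado'], 'b': ['banana', 'blueberry'], 'c': ['cherry']}
```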


Raymond

cokof...@gmail.com

Jun 4, 2008, 7:50:47 AM
>
> No need. The patch would be rejected. It would break existing code
> that uses defaultdict.fromkeys() as designed and documented.
>

Perhaps that could be useful, so that future questions or posts on the
matter could instantly be directed to the rejected patch?
