Python 3: dict & dict.keys()

Ethan Furman

Jul 23, 2013, 9:16:08 PM

to Python

Back in Python 2.x days I had a good grip on dict and dict.keys(), and when to use one or the other.

Then Python 3 came on the scene with these things called 'views', and while range couldn't be bothered, dict jumped up
and down shouting, "I want some!"

So now, in Python 3, .keys(), .values(), even .items() all return these 'view' thingies.

And everything I thought I knew about when to use one or the other went out the window.

For example, if you need to modify a dict while iterating over it, use .keys(), right? Wrong:

--> d = {1: 'one', 2:'two', 3:'three'}
--> for k in d.keys():
...     if k == 1:
...         del d[k]
...
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
RuntimeError: dictionary changed size during iteration

If you need to manipulate the keys (maybe adding some, maybe deleting some) before doing something else with final key
collection, use .keys(), right? Wrong:

--> dk = d.keys()
--> dk.remove(2)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
AttributeError: 'dict_keys' object has no attribute 'remove'

I understand that the appropriate incantation in Python 3 is:

--> for k in list(d):
...     ...

or

--> dk = list(d)
--> dk.remove(2)

which also has the added benefit of working the same way in Python 2.
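
A minimal sketch of both incantations, assuming Python 3 and the same sample dict as in the sessions above. list(d) copies the keys up front, so the dict can be changed inside the loop and the copy can be edited freely:

```python
d = {1: 'one', 2: 'two', 3: 'three'}

# Loop over a snapshot of the keys, not over the dict itself.
for k in list(d):
    if k == 1:
        del d[k]       # safe: the loop runs over the copied list

# A mutable working copy of the keys; the dict is not affected by remove().
dk = list(d)
dk.remove(2)

print(d)    # {2: 'two', 3: 'three'}
print(dk)   # [3]
```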

So, my question boils down to: in Python 3 how is dict.keys() different from dict? What are the use cases?

--
~Ethan~

Steven D'Aprano

Jul 23, 2013, 10:11:47 PM

On Tue, 23 Jul 2013 18:16:08 -0700, Ethan Furman wrote:

> Back in Python 2.x days I had a good grip on dict and dict.keys(), and
> when to use one or the other.
>
> Then Python 3 came on the scene with these things called 'views', and
> while range couldn't be bothered, dict jumped up and down shouting, "I
> want some!"
>
> So now, in Python 3, .keys(), .values(), even .items() all return these
> 'view' thingies.
>
> And everything I thought I knew about when to use one or the other went
> out the window.

Surely not. The fundamental behaviour of Python's data model hasn't
changed. Lists are lists, views are views, and iterators are iterators.
Only the way you get each has changed.

- If in Python 2, you used the viewkeys() method, that's been renamed
keys() in Python 3. So d.viewkeys() => d.keys().

- If in Python 2, you used the keys() method, it returns a list, and
like any function that has been made lazy instead of eager in Python 3
(e.g. map, zip, filter) if you want the same behaviour, simply call
list manually. So d.keys() => list(d.keys()).

- If in Python 2, you used the iterkeys() method, it returns a simple
iterator, not a view. So d.iterkeys() => iter(d.keys()).

None of these distinctions really matter if all you are doing is
iterating over the keys, without modifying the dict. Not in Python 2, nor
in Python 3.

And naturally the same applies to the various flavours of *items and
*values.
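
The three translations listed above can be sketched like this, assuming Python 3 (the variable names are only for illustration; the Python 2 spellings appear in the comments):

```python
d = {'a': 1, 'b': 2}

view = d.keys()            # Python 2.7: d.viewkeys()
as_list = list(d.keys())   # Python 2:   d.keys()
as_iter = iter(d.keys())   # Python 2:   d.iterkeys()

print(type(view).__name__)      # dict_keys
print(type(as_list).__name__)   # list
print(sorted(as_iter))          # ['a', 'b']
print(sorted(as_iter))          # []  (a plain iterator is used up after one pass)
```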

> For example, if you need to modify a dict while iterating over it, use
> .keys(), right? Wrong:
>
> --> d = {1: 'one', 2:'two', 3:'three'}
> --> for k in d.keys():
> ...     if k == 1:
> ...         del d[k]
> ...
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
> RuntimeError: dictionary changed size during iteration

Fundamentally, this behaviour has not changed from Python 2: you should
not iterate over a data structure while changing it, instead you should
make a copy of the data you iterate over. In Python 2, d.keys() makes a
copy and returns a list, so in Python 3 you would call list(d.keys()).

> If you need to manipulate the keys (maybe adding some, maybe deleting
> some) before doing something else with final key collection, use
> .keys(), right? Wrong:
>
> --> dk = d.keys()
> --> dk.remove(2)
> Traceback (most recent call last):
> File "<stdin>", line 1, in <module>
> AttributeError: 'dict_keys' object has no attribute 'remove'

Repeat after me: "In Python 2, d.keys() returns a list of keys, so if I
want a list of keys in Python 3, call list explicitly: list(d.keys())."

> I understand that the appropriate incantation in Python 3 is:
>
> --> for k in list(d)
> ... ...
>
> or
>
> --> dk = list(d)
> --> dk.remove(2)
>
> which also has the added benefit of working the same way in Python 2.
>
> So, my question boils down to: in Python 3 how is dict.keys() different
> from dict? What are the use cases?

*shrug* For most purposes, there is no difference, especially when merely
iterating over the dict. Such differences as exist are trivial:

- if you need an actual callable function or method, say to pass to some
other function, you can do this:

    for method in (d.items, d.keys, d.values):
        process(method)

instead of this:

    # untested
    for method in (d.items, d.keys, lambda d=d: iter(d)):
        process(method)

- d.keys() is a view, not the dict itself. That's a pretty fundamental
difference: compare dir(d.keys()) with dir(d).

Basically, views are set-like, not list-like.

--
Steven

Peter Otten

Jul 24, 2013, 2:23:08 AM

to pytho...@python.org

Ethan Furman wrote:

> So, my question boils down to: in Python 3 how is dict.keys() different
> from dict? What are the use cases?

I just grepped through /usr/lib/python3, and could not identify a single
line where some_object.keys() wasn't either wrapped in a list (or set,
sorted, max) call, or iterated over.

To me it looks like views are a solution waiting for a problem.

Oscar Benjamin

Jul 24, 2013, 8:51:34 AM

to Peter Otten, pytho...@python.org

What do you mean? Why would you want to create a temporary list just to iterate over it explicitly or implicitly (set, sorted, max,...)?

Oscar

Neil Cerutti

Jul 24, 2013, 8:54:36 AM

Here's a case of using keys as a set-like view from my own
"production" code (i.e., I used it once for one important job):

seen = set()
students = {}
dates = []

for fname in sorted(glob.glob("currentterm201320?.csv")):
    print(fname, end="\n\t")
    date = get_date(fname)
    dates.append(date)
    term = fname[-11:-4]
    r = reg.registration(term, path=".")
    regs = r.keys()
    for alt_id in regs & seen:
        students[alt_id].append(r[alt_id])
    for alt_id in seen - regs:
        students[alt_id].append(None)
    for alt_id in regs - seen:
        students[alt_id] = [None]*(len(dates)-1) + [r[alt_id]]
        seen.add(alt_id)

It was a very nice way to do three different things depending
on the students in the set I was working with, compared to a
registration list.

Granted the line was originally "regs = set(regs.keys())" before
it occurred to me that it sucked to take what must be equivalent
to a set, convert to a list, and then back to set again.

Thanks to the set-like view of dict.keys it worked just like one
might hope.

Looking at it again "seen" might be a redundant parallel version
of students.keys().
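
A self-contained sketch of that idea, assuming Python 3 and made-up registration data in place of the CSV files; it uses students.keys() directly instead of a parallel "seen" set. The names terms, returning, missing and new are hypothetical:

```python
# Hypothetical data: one dict of registrations per term, keyed by student id.
terms = [
    {'s1': 'math', 's2': 'art'},
    {'s2': 'music', 's3': 'chem'},
]

students = {}
for n, r in enumerate(terms):
    regs = r.keys()             # set-like view of this term's ids, no copy
    known = students.keys()     # set-like view of every id seen so far
    # Each operator builds a new set, so students can be modified afterwards.
    returning = regs & known
    missing = known - regs
    new = regs - known
    for sid in returning:
        students[sid].append(r[sid])
    for sid in missing:
        students[sid].append(None)
    for sid in new:
        students[sid] = [None] * n + [r[sid]]

print(students == {'s1': ['math', None],
                   's2': ['art', 'music'],
                   's3': [None, 'chem']})   # True
```

Computing the three sets before the loops matters: the views are live, so the dict must not grow while one of them is being iterated.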

--
Neil Cerutti

Peter Otten

Jul 24, 2013, 9:25:14 AM

to pytho...@python.org

Oscar Benjamin wrote:

> On Jul 24, 2013 7:25 AM, "Peter Otten" <__pet...@web.de> wrote:
>>

>> Ethan Furman wrote:
>>
>> > So, my question boils down to: in Python 3 how is dict.keys()
>> > different
>> > from dict? What are the use cases?
>>
>> I just grepped through /usr/lib/python3, and could not identify a single
>> line where some_object.keys() wasn't either wrapped in a list (or set,
>> sorted, max) call, or iterated over.
>>
>> To me it looks like views are a solution waiting for a problem.
>

> What do you mean? Why would you want to create a temporary list just to
> iterate over it explicitly or implicitly (set, sorted, max,...)?

I mean I don't understand the necessity of views when all actual usecases
need iterators. The 2.x iterkeys()/iteritems()/itervalues() methods didn't
create lists either.

Do you have 2.x code lying around where you get a significant advantage by
picking some_dict.viewkeys() over some_dict.iterkeys()? I could construct
one

>>> d = dict(a=1, b=2, c=3)
>>> e = dict(b=4, c=5, d=6)
>>> d.viewkeys() & e.viewkeys()
set(['c', 'b'])

but have not seen it in the wild.

My guess is that most non-hardcore users don't even know about viewkeys().
By the way, my favourite idiom to iterate over the keys in both Python 2 and
3 is -- for example -- max(some_dict) rather than
max(some_dict.whateverkeys()).

Skip Montanaro

Jul 24, 2013, 10:58:21 AM

to Oscar Benjamin, pytho...@python.org

> What do you mean? Why would you want to create a temporary list just to
> iterate over it explicitly or implicitly (set, sorted, max,...)?

Because while iterating over the keys, he might also want to add or
delete keys to/from the dict. You can't do that while iterating over
them in-place.

This example demonstrates the issue and also shows that the
modification actually takes place:

>>> d = dict(zip(range(10), range(10, 0, -1)))
>>> d
{0: 10, 1: 9, 2: 8, 3: 7, 4: 6, 5: 5, 6: 4, 7: 3, 8: 2, 9: 1}
>>> for k in d:
...     if k == 3:
...         del d[k+1]
...
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
RuntimeError: dictionary changed size during iteration
>>> for k in list(d):
...     if k == 3:
...         del d[k+1]
...
Traceback (most recent call last):
  File "<stdin>", line 3, in <module>
KeyError: 4
>>> d.keys()
dict_keys([0, 1, 2, 3, 5, 6, 7, 8, 9])

Skip

Ian Kelly

Jul 24, 2013, 11:02:29 AM

to Python

On Tue, Jul 23, 2013 at 8:11 PM, Steven D'Aprano
<steve+comp....@pearwood.info> wrote:
> Basically, views are set-like, not list-like.

The keys and items views are set-like. The values view is not.
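
A quick check of that distinction, assuming Python 3.3 or later (collections.abc is the standard-library home of the Set ABC; the flag values_and_failed is just an illustrative name):

```python
from collections.abc import Set

d = {'a': 1, 'b': 1}

print(isinstance(d.keys(), Set))     # True
print(isinstance(d.items(), Set))    # True
print(isinstance(d.values(), Set))   # False: values can repeat, as they do here

print(d.keys() & {'a', 'z'})         # {'a'}

# The values view has no set operators at all.
values_and_failed = False
try:
    d.values() & {1}
except TypeError:
    values_and_failed = True
print(values_and_failed)             # True
```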

Ian Kelly

Jul 24, 2013, 11:15:06 AM

to Python

On Wed, Jul 24, 2013 at 8:58 AM, Skip Montanaro <sk...@pobox.com> wrote:
>> What do you mean? Why would you want to create a temporary list just to
>> iterate over it explicitly or implicitly (set, sorted, max,...)?
>
> Because while iterating over the keys, he might also want to add or
> delete keys to/from the dict. You can't do that while iterating over
> them in-place.

None of the (set, sorted, max, ...) cases will add or delete keys
while iterating. For the occasional for loop where the programmer
does want to do that, you can still explicitly create a temporary list
with list().

Ethan Furman

Jul 24, 2013, 11:57:11 AM

to pytho...@python.org

On 07/24/2013 05:51 AM, Oscar Benjamin wrote:

>
> On Jul 24, 2013 7:25 AM, "Peter Otten" <__pet...@web.de> wrote:
>>
>> Ethan Furman wrote:
>>
>> > So, my question boils down to: in Python 3 how is dict.keys() different
>> > from dict? What are the use cases?
>>
>> I just grepped through /usr/lib/python3, and could not identify a single
>> line where some_object.keys() wasn't either wrapped in a list (or set,
>> sorted, max) call, or iterated over.
>>
>> To me it looks like views are a solution waiting for a problem.
>

> What do you mean? Why would you want to create a temporary list just to iterate over it explicitly or implicitly (set,
> sorted, max,...)?

You wouldn't. But you don't need .keys() for that either as you can just use the dict itself.

My point is that in 2.x .keys() did something different from the dict, while in 3.x it appears to me that they are the same.

Peter's point is that in the stdlib the new functionality of .keys() is never used, not even once.

--
~Ethan~

Oscar Benjamin

Jul 24, 2013, 12:32:01 PM

to Peter Otten, pytho...@python.org

On Jul 24, 2013 2:27 PM, "Peter Otten" <__pet...@web.de> wrote:
>
> Oscar Benjamin wrote:
>

> > On Jul 24, 2013 7:25 AM, "Peter Otten" <__pet...@web.de> wrote:
> >>
> >> Ethan Furman wrote:
> >>
> >> > So, my question boils down to: in Python 3 how is dict.keys()
> >> > different
> >> > from dict? What are the use cases?
> >>
> >> I just grepped through /usr/lib/python3, and could not identify a single
> >> line where some_object.keys() wasn't either wrapped in a list (or set,
> >> sorted, max) call, or iterated over.
> >>
> >> To me it looks like views are a solution waiting for a problem.
> >
> > What do you mean? Why would you want to create a temporary list just to
> > iterate over it explicitly or implicitly (set, sorted, max,...)?
>

> I mean I don't understand the necessity of views when all actual usecases
> need iterators. The 2.x iterkeys()/iteritems()/itervalues() methods didn't
> create lists either.

Oh, okay. I see what you mean.

>
> Do you have 2.x code lying around where you get a significant advantage by
> picking some_dict.viewkeys() over some_dict.iterkeys()?

No. I don't think I've ever used viewkeys. I noticed it once, didn't see an immediate use and forgot about it but...

> I could construct
> one
>
> >>> d = dict(a=1, b=2, c=3)
> >>> e = dict(b=4, c=5, d=6)
> >>> d.viewkeys() & e.viewkeys()
> set(['c', 'b'])

that might be useful.

>
> but have not seen it in the wild.

> My guess is that most non-hardcore users don't even know about viewkeys().
> By the way, my favourite idiom to iterate over the keys in both Python 2 and
> 3 is -- for example -- max(some_dict) rather than
> max(some_dict.whateverkeys()).

Agreed.

Earlier I saw that I had list(some_dict) in some code. Not sure why but maybe because it's the same in Python 2 and 3.

Oscar

Chris Angelico

Jul 24, 2013, 12:34:50 PM

to pytho...@python.org

On Thu, Jul 25, 2013 at 1:57 AM, Ethan Furman <et...@stoneleaf.us> wrote:
> On 07/24/2013 05:51 AM, Oscar Benjamin wrote:

>> What do you mean? Why would you want to create a temporary list just to
>> iterate over it explicitly or implicitly (set,
>> sorted, max,...)?
>

> You wouldn't. But you don't need .keys() for that either as you can just
> use the dict itself.

Side point: Why is iterating over a dict equivalent to .keys() rather
than .items()? It feels odd that, with both options viable, the
implicit version iterates over half the dict instead of all of it.
Obviously it can't be changed now, even if .items() were the better
choice, but I'm curious as to the reason for the decision.

ChrisA

Stefan Behnel

Jul 24, 2013, 1:16:40 PM

to pytho...@python.org

Chris Angelico, 24.07.2013 18:34:

> On Thu, Jul 25, 2013 at 1:57 AM, Ethan Furman wrote:
>> On 07/24/2013 05:51 AM, Oscar Benjamin wrote:
>>> What do you mean? Why would you want to create a temporary list just to
>>> iterate over it explicitly or implicitly (set,
>>> sorted, max,...)?
>>
>> You wouldn't. But you don't need .keys() for that either as you can just
>> use the dict itself.
>
> Side point: Why is iterating over a dict equivalent to .keys() rather
> than .items()? It feels odd that, with both options viable, the
> implicit version iterates over half the dict instead of all of it.
> Obviously it can't be changed now, even if .items() were the better
> choice, but I'm curious as to the reason for the decision.

The reason is that you can easily get at the values when iterating over the
keys, or simply decide not to care about them and be happy with the keys
only. Note that there are also many use cases that need all keys but not
all values. If iteration always returned an item tuple by default, many use
cases would have to resort to using .keys() in order to be efficient. And
for the simple case, you'd have to type more, either the additional .keys()
or the useless tuple unpacking. So, the reasoning is that iteration should
do the basic thing that still allows you to do everything, instead of doing
everything and pushing unnecessary work on the users by default.

Stefan

Terry Reedy

Jul 24, 2013, 1:17:12 PM

to pytho...@python.org

On 7/24/2013 12:34 PM, Chris Angelico wrote:

> Side point: Why is iterating over a dict equivalent to .keys() rather
> than .items()? It feels odd that, with both options viable, the
> implicit version iterates over half the dict instead of all of it.
> Obviously it can't be changed now, even if .items() were the better
> choice, but I'm curious as to the reason for the decision.

Both were considered and I think there were and are two somewhat-linked
practical reasons. First, iterating over keys is more common than
iterating over items. The more common one should be the default.

Second, people ask much more often if 'key' is in dict than if 'key,
value' is in dict. This is true as well for keyed reference books such
as phone books, dictionaries, encyclopedias, and for the same reason.
This is coupled with the fact that the default meaning of 'item in
collection' is that iterating over 'collection' eventually produces
'item' or a value equal to 'item'.

--
Terry Jan Reedy

Stefan Behnel

Jul 24, 2013, 1:23:47 PM

to pytho...@python.org

Peter Otten, 24.07.2013 08:23:

> Ethan Furman wrote:
>> So, my question boils down to: in Python 3 how is dict.keys() different
>> from dict? What are the use cases?
>
> I just grepped through /usr/lib/python3, and could not identify a single
> line where some_object.keys() wasn't either wrapped in a list (or set,
> sorted, max) call, or iterated over.

In case that directory mainly consists of the standard library, then you
shouldn't forget that most of the code in there predates Python 3 by ages
and was only adapted to work with the new syntax/semantics, not rewritten
in a "better" way.

Even if it's not just the stdlib, that argument still holds. There is still
fairly little code out there that was specifically written for Py2.6+, as
opposed to just being adapted.

> To me it looks like views are a solution waiting for a problem.

They reduce the API overhead. Previously, you needed values() and
itervalues(), with values() being not more than a special case of what
itervalues() provides anyway. Now it's just one method that gives you
everything. It simply has corrected the tradeoff from two special purpose
APIs to one general purpose API, that's all.

Stefan

Chris Angelico

Jul 24, 2013, 1:58:55 PM

to pytho...@python.org

On Thu, Jul 25, 2013 at 3:17 AM, Terry Reedy <tjr...@udel.edu> wrote:
> On 7/24/2013 12:34 PM, Chris Angelico wrote:
>
>> Side point: Why is iterating over a dict equivalent to .keys() rather
>> than .items()? It feels odd that, with both options viable, the
>> implicit version iterates over half the dict instead of all of it.
>> Obviously it can't be changed now, even if .items() were the better
>> choice, but I'm curious as to the reason for the decision.
>

> This is
> coupled with the fact that the default meaning of 'item in collection' is
> that iterating over 'collection' eventually produces 'item' or a value equal
> to 'item'.

Ahh, that makes sense. I never thought of iteration and 'in' being
connected like that, but yes, that's a solid reason for doing it that
way.

ChrisA

Ethan Furman

Jul 24, 2013, 2:31:58 PM

to pytho...@python.org

On 07/24/2013 10:23 AM, Stefan Behnel wrote:
> Peter Otten, 24.07.2013 08:23:
>> Ethan Furman wrote:
>>>
>>> So, my question boils down to: in Python 3 how is dict.keys() different
>>> from dict? What are the use cases?
>>

>> To me it looks like views are a solution waiting for a problem.
>
> They reduce the API overhead. Previously, you needed values() and
> itervalues(), with values() being not more than a special case of what
> itervalues() provides anyway. Now it's just one method that gives you
> everything. It simply has corrected the tradeoff from two special purpose
> APIs to one general purpose API, that's all.

I started this thread for two reasons:

1) Increase awareness that using `list(dict)` is a cross-version replacement for `dict.keys()`

2) Hopefully learn something about when a view is useful.

So far #2 is pretty much a failure. Only one use-case so far (and it feels pretty rare). But hey, I have learned that
while some set operations are allowed (&, ^, |, .isdisjoint()), others are not (.remove(), .discard(), .union(), etc.).
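
A rough sketch of that split, assuming Python 3 (results are passed through sorted() only to make the printed order predictable; ks is an illustrative name):

```python
d = {1: 'one', 2: 'two', 3: 'three'}
keys = d.keys()

# Operators and isdisjoint() work directly on the view.
print(sorted(keys & {2, 3, 4}))   # [2, 3]
print(sorted(keys | {4}))         # [1, 2, 3, 4]
print(sorted(keys ^ {3, 4}))      # [1, 2, 4]
print(keys.isdisjoint({7, 8}))    # True

# The mutating and named set methods are simply not there.
for name in ('remove', 'discard', 'union'):
    print(name, hasattr(keys, name))   # each prints False

# Copy into a real set when those methods are needed.
ks = set(keys)
ks.discard(2)
print(sorted(ks))                 # [1, 3]
```

The copy costs memory proportional to the number of keys, and changes to it never reach the dict.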

The old .keys(), .values(), and .items() (and their .iter...() variations) did something commonly useful. Of what
common use are these views?

--
~Ethan~

Stefan Behnel

Jul 24, 2013, 3:59:31 PM

to pytho...@python.org

Ethan Furman, 24.07.2013 20:31:

> On 07/24/2013 10:23 AM, Stefan Behnel wrote:
>> Peter Otten, 24.07.2013 08:23:
>>> Ethan Furman wrote:
>>>>
>>>> So, my question boils down to: in Python 3 how is dict.keys() different
>>>> from dict? What are the use cases?
>>>
>>> To me it looks like views are a solution waiting for a problem.
>>
>> They reduce the API overhead. Previously, you needed values() and
>> itervalues(), with values() being not more than a special case of what
>> itervalues() provides anyway. Now it's just one method that gives you
>> everything. It simply has corrected the tradeoff from two special purpose
>> APIs to one general purpose API, that's all.
>
> I started this thread for two reasons:
>
> 1) Increase awareness that using `list(dict)` is a cross-version
> replacement for `dict.keys()`
>
> 2) Hopefully learn something about when a view is useful.
>
> So far #2 is pretty much a failure.

I think the question is: how else would you implement an interface that
doesn't restrict itself to returning a list? I mean, previously, the
following was totally inefficient in terms of memory:

value in d.values()

It now avoids creating an intermediate list copy of the values, thus
running with no additional memory overhead (well, a constant, ok, but
definitely not linear) and keeps users from resorting to the much more
unfriendly

    for v in d.itervalues():
        if v == value:
            return True
    else:
        return False

in order to achieve the same thing. You can now even efficiently do this
for items, i.e.

(key, value) in d.items()

That's equivalent to "d[key] == value", but uses a different protocol,
meaning that you don't have to make a copy of the dict items in order to
pass it into something that works on a set or iterable of 2-tuples (which
is a way more generic interface than requiring a dict as input). These
things chain much more cleanly now, without first having to explain the
difference between items() and iteritems() and when to use which.
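
As a small illustration, assuming Python 3 (count_pairs is a hypothetical consumer, not anything from the standard library): membership tests run against the views without building a list, and the items view can be handed to code that only expects an iterable of 2-tuples.

```python
d = {'a': 1, 'b': 2}

print(2 in d.values())          # True; scans the values, no list is built
print(('a', 1) in d.items())    # True; same answer as d['a'] == 1
print(('a', 2) in d.items())    # False


def count_pairs(pairs):
    """Hypothetical consumer: accepts any iterable of (key, value) tuples."""
    return sum(1 for _key, _value in pairs)


print(count_pairs(d.items()))       # 2, the view is passed as-is
print(count_pairs([('x', 0)]))      # 1, a plain list works just as well
```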

It's all about replacing the old copy-to-list interface by something that
is efficiently processable step by step. All of this started back when
iterators became a part of the language, then generators, and now dict
views. They may not be the hugest feature ever, but they definitely fit
into the language much better and much more cleanly than the old
copy-to-list way.

Ask yourself, if they had been there in Python 1.x, would you even have
thought about making the iter*() methods a part of the language? Would you
really have wanted a shorter way to create a list of dict values than
list(d.values())?

Stefan

Ethan Furman

Jul 24, 2013, 4:16:54 PM

to pytho...@python.org

Thank you. :)

--
~Ethan~

Prasad, Ramit

Jul 24, 2013, 4:34:28 PM

to pytho...@python.org

Stefan Behnel wrote:
> Ethan Furman, 24.07.2013 20:31:
> > On 07/24/2013 10:23 AM, Stefan Behnel wrote:
> >> Peter Otten, 24.07.2013 08:23:
> >>> Ethan Furman wrote:
> >>>>
> >>>> So, my question boils down to: in Python 3 how is dict.keys() different
> >>>> from dict? What are the use cases?
> >>>
> >>> To me it looks like views are a solution waiting for a problem.
> >>
> >> They reduce the API overhead. Previously, you needed values() and
> >> itervalues(), with values() being not more than a special case of what
> >> itervalues() provides anyway. Now it's just one method that gives you
> >> everything. It simply has corrected the tradeoff from two special purpose
> >> APIs to one general purpose API, that's all.
> >
> > I started this thread for two reasons:
> >
> > 1) Increase awareness that using `list(dict)` is a cross-version
> > replacement for `dict.keys()`
> >
> > 2) Hopefully learn something about when a view is useful.
> >
> > So far #2 is pretty much a failure.
>

> Ask yourself, if they had been there in Python 1.x, would you even have
> thought about making the iter*() methods a part of the language? Would you
> really have wanted a shorter way to create a list of dict values than
> list(d.values())?
>
> Stefan
>

I am still not clear on the advantage of views vs. iterators. What
makes d.viewkeys() better than d.iterkeys()? Why did they decide
not to rename d.iterkeys() to d.keys() and instead use d.viewkeys()?
Is the iteration over a set operation on keys really that common a
use case?

Ramit

Christian Heimes

Jul 24, 2013, 5:06:56 PM

to pytho...@python.org

Am 24.07.2013 18:34, schrieb Chris Angelico:
> Side point: Why is iterating over a dict equivalent to .keys() rather
> than .items()? It feels odd that, with both options viable, the
> implicit version iterates over half the dict instead of all of it.
> Obviously it can't be changed now, even if .items() were the better
> choice, but I'm curious as to the reason for the decision.

Consider this:

    if key in dict:
        ...

    for key in dict:
        ...

It would be rather surprising if "in" as containment checks operates on
keys and "in" as iterator returns (key, value) tuples.

Christian

Terry Reedy

Jul 24, 2013, 7:45:37 PM

to pytho...@python.org

On 7/24/2013 4:34 PM, Prasad, Ramit wrote:

> I am still not clear on the advantage of views vs. iterators.

A1: Views are iterables that can be iterated more than once. Therefore,
they can be passed to a function that re-iterates its inputs, or to
multiple functions. They support 'x in view' as efficiently as possible.
Think about how you would write the non-view equivalent of '(0, None) in
somedict.items()'. When set-like, views support some set operations.
For .keys, which are always set-like, these operations are easy to
implement as dicts are based on a hashed array of keys.

Q2: What is the advantage of views vs. lists?

A2: They do not take up space that is not needed. They can be converted
to lists, to get all the features of lists, but not vice versa.

> What makes d.viewkeys() better than d.iterkeys()? Why did they decide
> not to rename d.iterkeys() to d.keys() and instead use d.viewkeys()?

This is historically the wrong way to phrase the question. The 2.7
.viewxyz methods were *not* used to make the 3.x .xyz methods. It was
the other way around. 3.0 came out with view methods replacing both list
and iter methods just after 2.6, after a couple of years of design, and
a year and a half before 2.7. The view methods were backported from 3.1
to 2.7, with 'view' added to the name to avoid name conflicts, to make
it easier to write code that would either run on both 2.7 and 3.x or be
converted with 2to3.

A better question is: 'When 3.0 was designed, why were views invented
for the .xyz methods rather than just renaming the .iterxyz methods?' The
advantages given above are the answer. View methods replace both list
and iterator methods and are more flexible than either and directly or
indirectly have all the advantages of both.

My question is why some people are fussing so much because Python
developers gave them one thing that is better than either of the two
things it replaces?

The mis-phrased question above illustrates why people new to Python
should use the latest 3.x and ignore 2.x unless they must use 2.x
libraries. 2.7 has all the old stuff, for back compatibility, and as
much of the new stuff in 3.1 as seemed sensible, for forward
compatibility. Thus it has lots of confusing duplication, and in this
case, triplication.

--
Terry Jan Reedy

Ethan Furman

Jul 24, 2013, 7:22:11 PM

to Python

On 07/24/2013 01:34 PM, Prasad, Ramit wrote:
>
> I am still not clear on the advantage of views vs. iterators. What

> makes d.viewkeys() better than d.iterkeys()? Why did they decide
> not to rename d.iterkeys() to d.keys() and instead use d.viewkeys()?

> Is the iteration over a set operation on keys really that common a
> use case?

From a practical standpoint, iterkeys() is a one-shot deal, while viewkeys() can be iterated over multiple times:

--> d = {1: 'one', 2: 'two', 3: 'three'}

--> di = d.iterkeys()

--> list(di)
[1, 2, 3]

--> list(di)
[]

--> dv = d.viewkeys()

--> list(dv)
[1, 2, 3]

--> list(dv)
[1, 2, 3]

And views are not sets -- they just support a couple set-like operations.
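
The session above is Python 2.7. A rough Python 3 equivalent, assuming iter(d.keys()) as the stand-in for d.iterkeys() and d.keys() for d.viewkeys():

```python
d = {1: 'one', 2: 'two', 3: 'three'}

di = iter(d.keys())   # plays the role of Python 2's d.iterkeys()
first_pass = list(di)
second_pass = list(di)
print(first_pass)     # [1, 2, 3]
print(second_pass)    # []  (the iterator is exhausted)

dv = d.keys()         # Python 2.7: d.viewkeys()
print(list(dv))       # [1, 2, 3]
print(list(dv))       # [1, 2, 3]  (a view can be iterated again)
```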

--
~Ethan~

Ethan Furman

Jul 24, 2013, 8:59:43 PM

to pytho...@python.org

On 07/23/2013 07:11 PM, Steven D'Aprano wrote:
> On Tue, 23 Jul 2013 18:16:08 -0700, Ethan Furman wrote:
>>
>> So now, in Python 3, .keys(), .values(), even .items() all return these
>> 'view' thingies.
>>
>> And everything I thought I knew about when to use one or the other went
>> out the window.
>
> Surely not. The fundamental behaviour of Python's data model hasn't
> changed.

Poetic effect. Dramatic license. Blah blah. ;)

> Repeat after me: "In Python 2, d.keys() returns a list of keys, so if I
> want a list of keys in Python 3, call list explicitly list(d.keys())."

Actually, I would recommend `list(d)`, which also works the same in both 2 and 3.

--
~Ethan~

Ben Finney

Jul 24, 2013, 10:20:18 PM

to pytho...@python.org

Ethan Furman <et...@stoneleaf.us> writes:

> On 07/23/2013 07:11 PM, Steven D'Aprano wrote:

> > On Tue, 23 Jul 2013 18:16:08 -0700, Ethan Furman wrote:
> >> And everything I thought I knew about when to use one or the other went
> >> out the window.
> >
> > Surely not. The fundamental behaviour of Python's data model hasn't
> > changed.
>

> Poetic effect. Dramatic license. Blah blah. ;)

Text-only medium. Clarity of communication. Et cetera. :-)

--
\ “Two hands working can do more than a thousand clasped in |
`\ prayer.” —Anonymous |
_o__) |
Ben Finney

Steven D'Aprano

Jul 25, 2013, 1:48:36 AM

On Wed, 24 Jul 2013 08:57:11 -0700, Ethan Furman wrote:

> My point is that in 2.x .keys() did something different from the dict,
> while in 3.x it appears to me that they are the same.

Then you aren't looking very closely. d.keys() returns a set-like view
into the dict, which is great for comparing elements:

py> d1 = dict.fromkeys([1, 2, 3, 4])
py> d2 = dict.fromkeys([3, 4, 5, 6])
py> d1.keys() & d2.keys() # keys that are in both
{3, 4}
py> d1.keys() ^ d2.keys() # keys not in both
{1, 2, 5, 6}
py> d1.keys() - d2.keys() # keys only in d1
{1, 2}
py> d2.keys() - d1.keys() # keys only in d2
{5, 6}

Dicts aren't sets, and don't support set methods:

py> d1 - d2
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: unsupported operand type(s) for -: 'dict' and 'dict'

> Peter's point is that in the stdlib the new functionality of .keys() is
> never used, not even once.

The standard library is not the universe of Python code, and most of the
standard library predates Python 3. Some of it goes back to Python 1
idioms. In general, working code doesn't get updated until it stops
working.

I have code that manually walks over each dict and extracts keys that are
in both, or one but not the other. Once I drop support for Python 2.6, I
throw that code away and just use views. But until then, I'm stuck doing
it the horrible way. Judging by a naive grep of my code, you might think
I never used views. I do, I just have to reinvent the wheel.

--
Steven

Steven D'Aprano

Jul 25, 2013, 1:52:50 AM

That second point was the deciding factor when direct iteration over
dicts was added. has_key() was deprecated in favour of "key in dict", and
that pretty much forced iteration to go over keys by default. The
reasoning is, "x in y" ought to be equivalent to:

    for tmp in y:
        if x == tmp: return True
    return False

There's probably even a PEP about this, if anyone is less lazy/busy and
can be bothered looking for it.
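
That equivalence can be checked with a runnable version of the loop; a sketch assuming Python 3, where contains_by_iteration is a made-up helper name:

```python
def contains_by_iteration(x, y):
    """Spell out 'x in y' as a plain loop over y."""
    for tmp in y:
        if x == tmp:
            return True
    return False


d = {'a': 1, 'b': 2}

# Iteration yields keys, so the loop agrees with the built-in test on keys ...
print('a' in d, contains_by_iteration('a', d))               # True True
# ... and both reject a (key, value) pair.
print(('a', 1) in d, contains_by_iteration(('a', 1), d))     # False False
# Pairs are found through the items view instead.
print(('a', 1) in d.items())                                 # True
```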

--
Steven

Steven D'Aprano

Jul 25, 2013, 1:57:37 AM

On Wed, 24 Jul 2013 17:59:43 -0700, Ethan Furman wrote:

>> Repeat after me: "In Python 2, d.keys() returns a list of keys, so if I
>> want a list of keys in Python 3, call list explicitly list(d.keys())."
>
> Actually, I would recommend `list(d)`, which also works the same in both
> 2 and 3.

Fair point.

--
Steven

alex23

Jul 25, 2013, 2:01:25 AM7/25/13

to

On 25/07/2013 4:31 AM, Ethan Furman wrote:
> 2) Hopefully learn something about when a view is useful.

I haven't seen this mentioned - forgive me if it's a repeat - but views
are constant references to whichever set they represent.

Python 2.7:

>>> dd = dict(a=1,b=2,c=3)
>>> keys = dd.keys()
>>> 'a' in keys
True
>>> dd['d'] = 4
>>> 'd' in keys
False

Python 3.3:
>>> dd = dict(a=1,b=2,c=3)
>>> keys = dd.keys()
>>> 'a' in keys
True
>>> dd['d'] = 4
>>> 'd' in keys
True

If part of my code is only interested in what keys or values are
present, it doesn't need to be given a reference to the full dictionary,
just to whichever view it cares about.
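
A small sketch of that idea, assuming Python 3. The function `missing_fields` and the `config` dict are hypothetical names invented for this example: the function is handed only a keys view, yet it still sees later changes to the dict and has no way to modify it.

```python
def missing_fields(available, required):
    """Return the required names that are not in `available`.

    `available` only needs to support `in`, so a keys view is enough.
    """
    return [name for name in required if name not in available]

config = {"host": "localhost"}
config_keys = config.keys()    # live view; offers no way to change config

before = missing_fields(config_keys, ["host", "port"])
config["port"] = 8080          # the owner of the dict updates it
after = missing_fields(config_keys, ["host", "port"])
```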

Chris Angelico

Jul 25, 2013, 2:02:42 AM7/25/13

to pytho...@python.org

On Thu, Jul 25, 2013 at 3:48 PM, Steven D'Aprano
<steve+comp....@pearwood.info> wrote:
> Dicts aren't sets, and don't support set methods:
>
> py> d1 - d2
> Traceback (most recent call last):
> File "<stdin>", line 1, in <module>
> TypeError: unsupported operand type(s) for -: 'dict' and 'dict'

I wouldn't take this as particularly significant, though. A future
version of Python could add that support (and it might well be very
useful), without breaking any of the effects of views.

ChrisA

Steven D'Aprano

Jul 25, 2013, 3:04:38 AM7/25/13

to

On Wed, 24 Jul 2013 11:31:58 -0700, Ethan Furman wrote:

> On 07/24/2013 10:23 AM, Stefan Behnel wrote:
>> Peter Otten, 24.07.2013 08:23:
>>> Ethan Furman wrote:
>>>>
>>>> So, my question boils down to: in Python 3 how is dict.keys()
>>>> different from dict? What are the use cases?
>>>
>>> To me it looks like views are a solution waiting for a problem.
>>
>> They reduce the API overhead. Previously, you needed values() and
>> itervalues(), with values() being not more than a special case of what
>> itervalues() provides anyway. Now it's just one method that gives you
>> everything. It simply has corrected the tradeoff from two special
>> purpose APIs to one general purpose API, that's all.
>
> I started this thread for two reasons:
>
> 1) Increase awareness that using `list(dict)` is a cross-version
> replacement for `dict.keys()`
>
> 2) Hopefully learn something about when a view is useful.
>
> So far #2 is pretty much a failure.

I don't think so.

- viewkeys() can be used as a drop-in replacement for iterkeys(),
provided you remember not to iterate over it a second time. Doing so
actually iterates over the view, instead of failing as with the iterator.
If you actually want a one-shot iterator, call iter() on the view.

- viewkeys() can be used as a drop-in replacement for Python2 keys(),
provided you only iterate over it once. If you want an actual list, call
list() on the view.

- Views support set methods which don't modify the set. If there is a non-
modifying set method which is not supported, it probably should be, and
somebody should raise an enhancement request in the bug tracker. If you
want an actual independent set you can modify without changing the dict,
call set() (or frozenset) on the view.

- Views support efficient (O(1) in the case of keys) membership testing,
which neither iterkeys() nor Python2 keys() does.

- Views support live, read-only access to dict keys and values.
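
The points in the list above can be sketched in a few lines. This assumes Python 3, where keys() returns the view directly (in 2.7 the same thing is spelled viewkeys()); the variable names are only for illustration.

```python
d = {"a": 1, "b": 2, "c": 3}
keys = d.keys()

# A view can be iterated as often as needed.
first_pass = sorted(keys)
second_pass = sorted(keys)

# iter() gives a one-shot iterator, exhausted after one pass.
one_shot = iter(keys)
consumed = sorted(one_shot)
leftover = list(one_shot)          # empty: nothing left to yield

# set() makes an independent copy that can be modified freely.
independent = set(keys)
independent.discard("a")           # does not touch the dict

# The view is live: it reflects later changes to the dict.
d["d"] = 4
view_sees_new_key = "d" in keys
copy_sees_new_key = "d" in independent
```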

> Only one use-case so far (and it
> feels pretty rare).

Iterating over a dict's values or items is not rare. Using a view is
better than making a list-copy of the dict and iterating over the list,
because it avoids copying.

For one-shot iteration, there's no benefit of a view over an iterator, or
vice versa, but views are useful for more than just one-shot iteration.

> But hey, I have learned that while some set
> operations are allowed (&, ^, |, .isdisjoint()), others are not
> (.remove(), .discard(), .union(), etc.).
>
> The old .keys(), .values(), and .items() (and their .iter...()
> variations) did something commonly useful. Of what common use are these
> views?

Everything that dict iteration does, dict views do, and more. So if
iterkeys() is useful, then so is viewkeys(). Had viewkeys come first,
iterkeys would never have been invented.

Making an actual list copy of the keys (values, items) is useful, but
it's not useful enough to dedicate a method (three methods) for it. Just
call list() on the view (or, in the case of keys, directly on the dict).
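
As a short illustration, assuming Python 3 and the dict from the start of the thread: calling list() takes a snapshot, so the dict can safely change while the snapshot is being looped over.

```python
d = {1: 'one', 2: 'two', 3: 'three'}

# list(d) is a snapshot of the keys, so the dict may change during the loop.
for k in list(d):
    if k == 1:
        del d[k]

remaining_keys = list(d)             # keys: call list() on the dict itself
remaining_values = list(d.values())  # values and items: call list() on the view
remaining_items = list(d.items())
```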

--
Steven

Steven D'Aprano

Jul 25, 2013, 3:27:40 AM7/25/13

to

I don't think dicts can ever support set methods, since *they aren't
sets*. Every element consists of both a key and a value, so you have to
consider both. Set methods are defined in terms of singleton elements,
not binary elements, so before you even begin, you have to decide what
it means when two elements differ in only one of the two parts.

Given dicts {1: 'a'}, {1: 'b'}, what is the union of them? I can see five
possibilities:

{1: 'a'}
{1: 'b'}
{1: ['a', 'b']}
{1: set(['a', 'b'])}
Error

Each of the five results may be what you want in some circumstances. It
would be a stupid thing for dict.union to pick one behaviour and make it
the One True Way to perform union on two dicts.
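
To make the five possibilities concrete, here is a rough sketch. The function `dict_union` and its `on_conflict` parameter are invented for this example and are not an existing or proposed dict API; the point is only that each outcome is a separate, explicit policy.

```python
def dict_union(d1, d2, on_conflict="error"):
    """Merge two dicts; `on_conflict` decides what happens when both
    dicts hold the same key with different values."""
    policies = ("first", "last", "list", "set", "error")
    if on_conflict not in policies:
        raise ValueError("unknown policy: %r" % (on_conflict,))
    result = dict(d1)
    for key, value in d2.items():
        if key not in result:
            result[key] = value
        elif result[key] == value:
            continue                      # same value, nothing to resolve
        elif on_conflict == "first":
            continue                      # keep the value from d1
        elif on_conflict == "last":
            result[key] = value           # take the value from d2
        elif on_conflict == "list":
            result[key] = [result[key], value]
        elif on_conflict == "set":
            result[key] = {result[key], value}   # values must be hashable
        else:
            raise ValueError("conflicting values for key %r" % (key,))
    return result
```

For what it is worth, Python 3.9 later added a `|` operator for dicts, and it keeps the value from the right-hand operand, which corresponds to the "last" policy in this sketch.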

--
Steven

Chris Angelico

Jul 25, 2013, 4:02:11 AM7/25/13

to pytho...@python.org

On Thu, Jul 25, 2013 at 5:04 PM, Steven D'Aprano
<steve+comp....@pearwood.info> wrote:
> - Views support efficient (O(1) in the case of keys) membership testing,
> which neither iterkeys() nor Python2 keys() does.

To save me the trouble and potential error of digging through the
source code: What's the complexity of membership testing on
values/items? Since you're calling it "efficient" it must be better
than O(n) which the list form would be, yet it isn't O(1) or you
wouldn't have qualified "in the case of keys". Does this mean
membership testing of the values and items views is O(log n) in some
way, eg a binary search?

ChrisA

Peter Otten

Jul 25, 2013, 4:13:47 AM7/25/13

to pytho...@python.org

Chris Angelico wrote:

> On Thu, Jul 25, 2013 at 5:04 PM, Steven D'Aprano
> <steve+comp....@pearwood.info> wrote:

>> - Views support efficient (O(1) in the case of keys) membership testing,
>> which neither iterkeys() nor Python2 keys() does.
>

> To save me the trouble and potential error of digging through the
> source code: What's the complexity of membership testing on
> values/items? Since you're calling it "efficient" it must be better
> than O(n) which the list form would be, yet it isn't O(1) or you
> wouldn't have qualified "in the case of keys". Does this mean
> membership testing of the values and items views is O(log n) in some
> way, eg a binary search?

keys() and items() are O(1); both look up the key in the dictionary and
items() then proceeds to compare the value. values() is O(n).
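
In pure Python the two membership tests behave roughly like the sketch here. The helper names `in_items` and `in_values` are hypothetical, and this is a model of the behaviour rather than the actual C implementation.

```python
_MISSING = object()   # sentinel, so that a stored None is not mistaken for "absent"

def in_items(d, pair):
    """Roughly what `pair in d.items()` does: one hash lookup, one compare."""
    key, value = pair
    found = d.get(key, _MISSING)
    if found is _MISSING:
        return False
    return found is value or found == value

def in_values(d, value):
    """Roughly what `value in d.values()` does: a linear scan."""
    return any(v is value or v == value for v in d.values())

somedict = {0: None, 1: "one"}
```

The items test costs the same as a key lookup (O(1) on average) no matter how large the dict is, while the values test may have to look at every entry.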

Chris Angelico

Jul 25, 2013, 4:15:22 AM7/25/13

to pytho...@python.org

On Thu, Jul 25, 2013 at 5:27 PM, Steven D'Aprano

That's true, but we already have that issue with sets. What's the
union of {0} and {0.0}? Python's answer: It depends on the order of
the operands.

>>> i={0}
>>> f={0.0}
>>> i | f
{0}
>>> f | i
{0.0}

I would say that Python can freely pick from the first two options you
offered (either keep-first or keep-last), most likely the first one,
and it'd make good sense. Your third option would be good for a few
specific circumstances, but then you probably would also want the
combination of {1:'a'} and {1:'a'} to be {1:['a','a']} for
consistency. This would make a good utility function, but isn't what
I'd normally see set union doing. Similarly with the fourth option,
though there it's a little more arguable. Raising an error would work,
but is IMO unnecessary.

(Pike has dictionary operations, but has made different choices. For
instance, 0 and 0.0 are considered distinct, so a set can contain
both. Mappings (dicts) merge by keeping the last, not the first. But
the specifics don't much matter.)

A Python set already has to distinguish between object value and
object identity; a dict simply adds a bit more distinction between
otherwise-considered-identical keys, namely their values.

>>> a="This is a test."
>>> b="This is a test."
>>> a is b
False
>>> id(a)
16241512
>>> id(b)
16241392
>>> id(({a}|{b}).pop())
16241512

Assuming a and b have different ids, which is true in the above
example of Py3.3 on Windows, the set union *must* be different from
one of them. Suppose you do a dictionary of id(key) -> value, and a
set of the keys themselves. You could then do set operations on the
keys, and then go and retrieve the values.

Sure, maybe the way of doing things won't be exactly what everyone
expects... but it works, and it makes plausible sense. And as a
theoretical "might be implemented in Python 3.5", it still has no
impact on views, beyond that there are some operations which must be
done with views in <=3.3 that could be done on the dicts themselves in
future.

ChrisA

Steven D'Aprano

Jul 25, 2013, 5:44:43 AM7/25/13

to

That's a side-effect of how numeric equality works in Python. Since 0 ==
0.0, you can't have both as keys in the same dict, or set. Indeed, the
same numeric equality issue occurs here:

py> from fractions import Fraction
py> [0, 2.5] == [0.0, Fraction(5, 2)]
True

So nothing really to do with sets or dicts specifically.

Aside: I think the contrary behaviour is, well, contrary. It would be
strange and disturbing to do this:

for key in some_dict:
    if key == 0:
        print("found")
        print(some_dict[key])

and have the loop print "found" and then have the key lookup fail, but
apparently that's how things work in Pike :-(

> I would say that Python can freely pick from the first two options you
> offered (either keep-first or keep-last), most likely the first one, and
> it'd make good sense. Your third option would be good for a few specific
> circumstances, but then you probably would also want the combination of
> {1:'a'} and {1:'a'} to be {1:['a','a']} for consistency.

Okay, that's six variations. And no, I don't think the "consistency"
argument is right -- the idea is that you can have multiple values per
key. Since 'a' == 'a', that's only one value, not two.

The variation using a list, versus the set, depends on whether you care
about order or hashability.

[...]

> Raising an error would work, but is IMO unnecessary.

I believe that's the only reasonable way for a dict union method to work.
As the Zen says:

In the face of ambiguity, refuse the temptation to guess.

Since there is ambiguity which value should be associated with the key,
don't guess.

[...]

> A Python set already has to distinguish between object value and object
> identity; a dict simply adds a bit more distinction between
> otherwise-considered-identical keys, namely their values.

Object identity is a red herring. It would be perfectly valid for a
Python implementation to create new instances of each element in the set
union, assuming such creation was free of side-effects (apart from memory
usage and time, naturally). set.union() makes no promise about the
identity of elements, and it is defined the same way for languages where
object identity does not exist (say, old-school Pascal).

--
Steven

Chris Angelico

Jul 25, 2013, 6:34:23 AM7/25/13

to pytho...@python.org

On Thu, Jul 25, 2013 at 7:44 PM, Steven D'Aprano
<steve+comp....@pearwood.info> wrote:
> On Thu, 25 Jul 2013 18:15:22 +1000, Chris Angelico wrote:
>> That's true, but we already have that issue with sets. What's the union
>> of {0} and {0.0}? Python's answer: It depends on the order of the
>> operands.
>
> That's a side-effect of how numeric equality works in Python. Since 0 ==
> 0.0, you can't have both as keys in the same dict, or set. Indeed, the
> same numeric equality issue occurs here:
>
> py> from fractions import Fraction
> py> [0, 2.5] == [0.0, Fraction(5, 2)]
> True
>
> So nothing really to do with sets or dicts specifically.

Here's how I imagine set/dict union:
1) Take a copy of the first object
2) Iterate through the second. If the key doesn't exist in the result, add it.

This works just fine even when "add it" means "store this value
against this key". The dict's value and the object's identity are both
ignored, and you simply take the first one you find.
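
The two steps can be sketched as one function that serves both sets and dicts. The name `union_keep_first` is invented here; it is an illustration of the idea, not a claim about how the built-in set union is implemented.

```python
def union_keep_first(first, second):
    """Copy `first`, then add only the keys of `second` that are missing."""
    result = first.copy()
    for key in second:
        if key not in result:
            if isinstance(result, dict):
                result[key] = second[key]   # store this value against this key
            else:
                result.add(key)             # plain set: just add the element
    return result

int_first = union_keep_first({0}, {0.0})
float_first = union_keep_first({0.0}, {0})
merged = union_keep_first({1: 'a', 2: 'x'}, {1: 'b', 3: 'y'})
```

Because 0 == 0.0, whichever operand comes first decides which object survives, and for dicts the value stored in the first operand wins.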

> Aside: I think the contrary behaviour is, well, contrary. It would be
> strange and disturbing to do this:
>
> for key in some_dict:
>     if key == 0:
>         print("found")
>         print(some_dict[key])
>
> and have the loop print "found" and then have the key lookup fail, but
> apparently that's how things work in Pike :-(

I agree, that would be very strange and disturbing. I mentioned that
aspect merely in passing, but the reason for the difference is not an
oddity of key lookup, but a different decision about float and int: in
Pike, 0 and 0.0 are not equal. (Nor are 1 and 1.0, in case you thought
this was a weirdness of zero.) It's a debatable point; are we trying
to say that all numeric types represent real numbers, and are equal if
they represent the same real number? Or are different representations
distinct, just as much as the string "0" is different from the integer
0? Pike took the latter approach. PHP took the former approach to its
illogical extreme, that the string "0001E1" is equal to "000010" (both
strings). No, the dictionary definitely needs to use object equality
to do its lookup, although I could well imagine an implementation that
runs orders of magnitude faster when object identity can be used.

>> I would say that Python can freely pick from the first two options you
>> offered (either keep-first or keep-last), most likely the first one, and
>> it'd make good sense. Your third option would be good for a few specific
>> circumstances, but then you probably would also want the combination of
>> {1:'a'} and {1:'a'} to be {1:['a','a']} for consistency.
>
> Okay, that's six variations. And no, I don't think the "consistency"
> argument is right -- the idea is that you can have multiple values per
> key. Since 'a' == 'a', that's only one value, not two.

Well, it depends what you're doing with the merging of the dicts. But
all of these extra ways to do things would be explicitly-named
functions with much rarer usage (and quite possibly not part of the
standard library, they'd be snippets shared around and put directly in
application code).

>> Raising an error would work, but is IMO unnecessary.
>
> I believe that's the only reasonable way for a dict union method to work.
> As the Zen says:
>
> In the face of ambiguity, refuse the temptation to guess.
>
> Since there is ambiguity which value should be associated with the key,
> don't guess.

There's already ambiguity as to which of two equal values should be
retained by the set. Python takes the first. Is that guessing? Is that
violating the zen? I don't see a problem with the current set
implementation, and I also don't see a problem with using that for
dict merging.

> Object identity is a red herring. It would be perfectly valid for a
> Python implementation to create new instances of each element in the set
> union, assuming such creation was free of side-effects (apart from memory
> usage and time, naturally). set.union() makes no promise about the
> identity of elements, and it is defined the same way for languages where
> object identity does not exist (say, old-school Pascal).

That still doesn't deal with the "which type should the new object
be". We're back to this question: What is the union of 0 and 0.0?

>>> {0} | {0.0}
{0}
>>> {0.0} | {0}
{0.0}

Maybe Python could create a brand new object, but would it be an int
or a float? The only way I could imagine this working is with a
modified-set class that takes an object constructor, and passes every
object through it. That way, you could have set(float) that coerces
everything to float on entry, which would enforce what you're saying
(even down to potentially creating a new object with a new id, though
float() seems to return a float argument unchanged in CPython 3.3).
Would that really help anything, though? Do we gain anything by not
simply accepting, in the manner of Colonel Fairfax, the first that
comes?
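
For the curious, a minimal sketch of such a modified-set class. `CoercingSet` is a made-up name, and the class wraps an ordinary set rather than subclassing it, purely to keep the example short.

```python
class CoercingSet:
    """A set-like container that passes every element through `constructor`."""

    def __init__(self, constructor, iterable=()):
        self._constructor = constructor
        self._items = set()
        for item in iterable:
            self.add(item)

    def add(self, item):
        # Coerce on entry, so 0 and 0.0 are stored as the same kind of object.
        self._items.add(self._constructor(item))

    def __or__(self, other):
        return CoercingSet(self._constructor, list(self) + list(other))

    def __iter__(self):
        return iter(self._items)

    def __len__(self):
        return len(self._items)

    def __contains__(self, item):
        return item in self._items

ints = CoercingSet(float, [0])
floats = CoercingSet(float, [0.0])
left = list(ints | floats)
right = list(floats | ints)
```

With coercion on entry the union comes out the same whichever operand is on the left, at the price of committing to a single element type up front.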

ChrisA

Johannes Bauer

Jul 25, 2013, 9:22:59 AM7/25/13

to

On 25.07.2013 07:48, Steven D'Aprano wrote:

> Then you aren't looking very closely. d.keys() returns a set-like view
> into the dict, which is great for comparing elements:
>
> py> d1 = dict.fromkeys([1, 2, 3, 4])
> py> d2 = dict.fromkeys([3, 4, 5, 6])
> py> d1.keys() & d2.keys() # keys that are in both
> {3, 4}
> py> d1.keys() ^ d2.keys() # keys not in both
> {1, 2, 5, 6}
> py> d1.keys() - d2.keys() # keys only in d1
> {1, 2}
> py> d2.keys() - d1.keys() # keys only in d2
> {5, 6}

I know this is completely off-topic, but I really must thank you for
showing that neat trick. I didn't know set()'s operators &, ^, - were
overloaded (and always used difference/intersection, etc). That's
really, really neat.

Thanks again,
Joe

--
>> Where EXACTLY did you predict the quake again?
> At least not publicly!
Ah, the latest and to this day most ingenious stunt of our great
cosmologists: the secret prediction.
- Karl Kaos on Rüdiger Thomas in dsa <hidbv3$om2$1...@speranza.aioe.org>

Ethan Furman

Jul 25, 2013, 9:47:08 AM7/25/13

to pytho...@python.org

In these cases why is a view better than just using the dict? Is it a safety so that you don't accidentally modify the dict?

--
~Ethan~

Ethan Furman

Jul 25, 2013, 9:57:15 AM7/25/13

to pytho...@python.org

On 07/24/2013 10:48 PM, Steven D'Aprano wrote:
> On Wed, 24 Jul 2013 08:57:11 -0700, Ethan Furman wrote:
>
>> My point is that in 2.x .keys() did something different from the dict,
>> while in 3.x it appears to me that they are the same.
>
> Then you aren't looking very closely.

Actually, I am. That's why I started this thread.

Thank you for the insights.

--
~Ethan~

Steven D'Aprano

Jul 25, 2013, 10:57:10 AM7/25/13

to

On Thu, 25 Jul 2013 20:34:23 +1000, Chris Angelico wrote:

> On Thu, Jul 25, 2013 at 7:44 PM, Steven D'Aprano
> <steve+comp....@pearwood.info> wrote:
>> On Thu, 25 Jul 2013 18:15:22 +1000, Chris Angelico wrote:
>>> That's true, but we already have that issue with sets. What's the
>>> union of {0} and {0.0}? Python's answer: It depends on the order of
>>> the operands.
>>
>> That's a side-effect of how numeric equality works in Python. Since 0
>> == 0.0, you can't have both as keys in the same dict, or set. Indeed,
>> the same numeric equality issue occurs here:
>>
>> py> from fractions import Fraction
>> py> [0, 2.5] == [0.0, Fraction(5, 2)]
>> True
>>
>> So nothing really to do with sets or dicts specifically.
>
> Here's how I imagine set/dict union:
> 1) Take a copy of the first object
> 2) Iterate through the second. If the key doesn't exist in the result,
> add it.

That's because you're too much of a programmer to step away from the
implementation. Fundamentally, set union has nothing to do with objects,
or bit strings, or any concrete implementation. Sets might be infinite,
and "take a copy" impossible or meaningless.

Logically, the union of set A and set B is the set containing every
element which is in A, every element in B, and no element which is not.
How you assemble those elements in a concrete implementation is, in a
sense, irrelevant. In old-school Pascal, the universe of possible
elements is taken from the 16-bit, or 32-bit if you're lucky, integers;
in Python, it's taken from hashable objects. Even using your suggested
algorithm above, since union is symmetric, it should make no difference
whether you start with the first, or with the second.

> This works just fine even when "add it" means "store this value against
> this key". The dict's value and the object's identity are both ignored,
> and you simply take the first one you find.

I don't believe that "works", since the whole point of dicts is to store
the values. In practice, the values are more important than the keys. The
key only exists so you can get to the value -- the key is equivalent to
the index in a list, the value to the value at that index. We normally
end up doing something like "print adict[key]", not "print key". So
throwing away the values just because they happen to have the same key is
a fairly dubious thing to do, at least for union or intersection.

(In contrast, that's exactly what you want an update method to do.
Different behaviour for different methods.)

[...]

>>> Raising an error would work, but is IMO unnecessary.
>>
>> I believe that's the only reasonable way for a dict union method to
>> work. As the Zen says:
>>
>> In the face of ambiguity, refuse the temptation to guess.
>>
>> Since there is ambiguity which value should be associated with the key,
>> don't guess.
>
> There's already ambiguity as to which of two equal values should be
> retained by the set.

In an ideal world of Platonic Ideals, it wouldn't matter, since
everything is equal to itself, and to nothing else. There's only one
number "two", whether you write it as 2 or 2.0 or 800/400 or Ⅱ or 0b10,
and it is *impossible even in principle* to distinguish them since there
is no "them" to distinguish between. Things that are equal shouldn't be
distinguishable, not by value, not by type, not by identity.

But that would be *too abstract* to be useful, and so we allow some of
the abstractness to leak away, to the benefit of all. But the consequence of
this is that we sometimes have to make hard decisions, like, which one of
these various "twos" do we want to keep? Or more often, we stumble into a
decision by allowing the implementation to specify the behaviour, rather
than choosing the behaviour and then finding an implementation to match
it. Given the two behaviours:

{2} | {2.0} => {2} or {2.0}, which should it be? Why not Fraction(2) or
Decimal(2) or 2+0j?

there's no reason to prefer Python's answer, "the value on the left",
except that it simplifies the implementation. The union operator ought to
be symmetrical, a ∪ b should be identical to b ∪ a, but isn't. Another
leaky abstraction.

--
Steven

Chris Angelico

Jul 25, 2013, 11:07:55 AM7/25/13

to pytho...@python.org

On Fri, Jul 26, 2013 at 12:57 AM, Steven D'Aprano
<steve+comp....@pearwood.info> wrote:
> [ snip lengthy explanation of sets ]

> The union operator ought to
> be symmetrical, a ∪ b should be identical to b ∪ a, but isn't. Another
> leaky abstraction.

Right. I agree with all your theory, which is fine and good. If we had
a "set of real numbers", then each one would be both equal to and
indistinguishable from any other representation of itself. But Python
doesn't work with real numbers. It works with ints and floats and
Fractions and Decimals and Guido-knows-what. (Sorry, I don't normally
talk like that, but the expression begged to be said. :) )

So since Python already lets its abstraction leak a bit for
usefulness, why not retain the exact same leak when working with a
dict? A set is a dict with no values... a dict is a set with extra
payload. They're given very similar literal notation; if they were
meant to be more distinct, why was no alternative symbol used?

(I love how a random side comment can become a topic of its own.)

ChrisA

Ian Kelly

Jul 25, 2013, 11:53:33 AM7/25/13

to Python

On Thu, Jul 25, 2013 at 2:13 AM, Peter Otten <__pet...@web.de> wrote:
> Chris Angelico wrote:
>

>> On Thu, Jul 25, 2013 at 5:04 PM, Steven D'Aprano
>> <steve+comp....@pearwood.info> wrote:
>>> - Views support efficient (O(1) in the case of keys) membership testing,
>>> which neither iterkeys() nor Python2 keys() does.
>>

>> To save me the trouble and potential error of digging through the
>> source code: What's the complexity of membership testing on
>> values/items? Since you're calling it "efficient" it must be better
>> than O(n) which the list form would be, yet it isn't O(1) or you
>> wouldn't have qualified "in the case of keys". Does this mean
>> membership testing of the values and items views is O(log n) in some
>> way, eg a binary search?
>
> keys() and items() is O(1); both look up the key in the dictionary and
> items() then proceeds to compare the value. values() is O(n).

3.x values() is O(n) but avoids the unnecessary step of copying all the
values in the dict that you get when performing the same operation
using 2.x values(). Hence, although the asymptotic complexity is the
same, it's still more efficient.

Prasad, Ramit

Jul 25, 2013, 12:11:47 PM7/25/13

to pytho...@python.org

Terry Reedy wrote:

>
> On 7/24/2013 4:34 PM, Prasad, Ramit wrote:
>
> > I am still not clear on the advantage of views vs. iterators.
>

> A1: Views are iterables that can be iterated more than once. Therefore,
> they can be passed to a function that re-iterates its inputs, or to
> multiple functions. They support 'x in view' as efficiently as possible.
> Think about how you would write the non-view equivalent of '(0,None) in
> somedict.items()'. When set-like, views support some set operations.
> For .keys, which are always set-like, these operations are easy to
> implement as dicts are based on a hashed array of keys.

Hmm, that is a change that makes some sense to me. Does the view
get updated when dictionary changes or is a new view needed? I
assume the latter.

>
> Q2: What is the advantage of views vs. lists?
>
> A2: They do not take up space that is not needed. They can be converted
> to lists, to get all the features of lists, but not vice versa.
>

> > What makes d.viewkeys() better than d.iterkeys()? Why did they decide
> > not to rename d.iterkeys() to d.keys() and instead use d.viewkeys()?
>

> This is historically the wrong way to phrase the question. The 2.7
> .viewxyz methods were *not* used to make the 3.x .xyz methods. It was
> the other way around. 3.0 came out with view methods replacing both list
> and iter methods just after 2.6, after a couple of years of design, and
> a year and a half before 2.7. The view methods were backported from 3.1
> to 2.7, with 'view' added to the name to avoid name conflicts, to make
> it easier to write code that would either run on both 2.7 and 3.x or be
> converted with 2to3.
>
> A better question is: 'When 3.0 was designed, why were views invented
> for the .xyz methods rather than just renaming the .iterxyz methods?' The
> advantages given above are the answer. View methods replace both list
> and iterator methods and are more flexible than either and directly or
> indirectly have all the advantages of both.
>
> My question is why some people are fussing so much because Python
> developers gave them one thing that is better than either of the two
> things it replaces?

I personally am not "fussing" as existing functionality was preserved
(and improved). I just was not clear on the difference. Thanks for
all the detail and context.

>
> The mis-phrased question above illustrates why people new to Python
> should use the latest 3.x and ignore 2.x unless they must use 2.x
> libraries. 2.7 has all the old stuff, for back compatibility, and as
> much of the new stuff in 3.1 as seemed sensible, for forward
> compatibility. Thus it has lots of confusing duplication, and in this
> case, triplication.

I work with 2.6 so no choice there... :)

>
> --
> Terry Jan Reedy
>
> --
> http://mail.python.org/mailman/listinfo/python-list

Ethan Furman

Jul 25, 2013, 12:21:02 PM7/25/13

to pytho...@python.org

On 07/25/2013 09:11 AM, Prasad, Ramit wrote:
> Terry Reedy wrote:
>>
>> On 7/24/2013 4:34 PM, Prasad, Ramit wrote:
>>
>>> I am still not clear on the advantage of views vs. iterators.
>>
>> A1: Views are iterables that can be iterated more than once. Therefore,
>> they can be passed to a function that re-iterates its inputs, or to
>> multiple functions. They support 'x in view' as efficiently as possible.
>> Think about how you would write the non-view equivalent of '(0,None) in
>> somedict.items()'. When set-like, views support some set operations.
>> For .keys, which are always set-like, these operations are easy to
>> implement as dicts are based on a hashed array of keys.
>
> Hmm, that is a change that makes some sense to me. Does the view
> get updated when dictionary changes or is a new view needed? I
> assume the latter.

Nope, the former. That is a big advantage that the views have over concrete lists: they show the /current/ state, and
so are always up-to-date.

--
~Ethan~

Peter Otten

Jul 25, 2013, 12:25:19 PM7/25/13

to pytho...@python.org

Ian Kelly wrote:

> On Thu, Jul 25, 2013 at 2:13 AM, Peter Otten <__pet...@web.de> wrote:
>> Chris Angelico wrote:
>>
>>> On Thu, Jul 25, 2013 at 5:04 PM, Steven D'Aprano
>>> <steve+comp....@pearwood.info> wrote:

>>>> - Views support efficient (O(1) in the case of keys) membership
>>>> testing, which neither iterkeys() nor Python2 keys() does.
>>>

>>> To save me the trouble and potential error of digging through the
>>> source code: What's the complexity of membership testing on
>>> values/items? Since you're calling it "efficient" it must be better
>>> than O(n) which the list form would be, yet it isn't O(1) or you
>>> wouldn't have qualified "in the case of keys". Does this mean
>>> membership testing of the values and items views is O(log n) in some
>>> way, eg a binary search?
>>
>> keys() and items() is O(1); both look up the key in the dictionary and
>> items() then proceeds to compare the value. values() is O(n).
>
> 3.x values() is O(n) but avoids the unnecessary step of copying all the
> values in the dict that you get when performing the same operation
> using 2.x values(). Hence, although the asymptotic complexity is the
> same, it's still more efficient.

In Python 2 the prudent pythonista used itervalues() to avoid the unnecessary
intermediate list...

Terry Reedy

Jul 25, 2013, 2:37:14 PM7/25/13

to pytho...@python.org

I think 'view' is generally used in CS to mean a live view, as opposed
to a snapshot. Memoryviews in 3.x are also live views. Dictionary views
are read-only. I believe memoryviews can be read-write if allowed by the
object being viewed.

Python slices are snapshots. It has been proposed that they should be
views to avoid copying memory, but that has been rejected since views
necessarily keep the underlying object alive. Instead, applications can
define the views they need. (They might, for instance, allow multiple
slices in a view, as tk Text widgets do.)
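
A quick sketch of the difference between a snapshot and a live view, using a bytearray as the underlying object (the variable names are only for illustration):

```python
data = bytearray(b"hello")

snapshot = data[0:2]              # slicing copies the bytes
view = memoryview(data)[0:2]      # slicing a memoryview gives another view

data[0] = ord("j")                # change the underlying object
snapshot_bytes = bytes(snapshot)  # the snapshot keeps the old contents
view_bytes = bytes(view)          # the view shows the change

view[1] = ord("o")                # writable, because a bytearray is mutable
after_write = bytes(data)

read_only = memoryview(b"abc").readonly   # bytes objects are immutable
```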

--
Terry Jan Reedy