docs patch: dicts and sets

Alan Isaac

May 11, 2007, 8:59:24 PM

This is an attempt to synthesize Bill and Carsten's proposals.
(I'm changing the subject line to better match the topic.)

http://docs.python.org/lib/typesmapping.html: for footnote (3)

Keys and values are listed in an arbitrary order. This order is
indeterminate and generally depends on factors outside the scope of
the containing program. However, if items(), keys(), values(),
iteritems(), iterkeys(), and itervalues() are called with no
intervening modifications to the dictionary, the lists will directly
correspond.
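
To make the correspondence in the proposed footnote concrete, here is a minimal sketch. It is written for Python 3 (an assumption; the thread concerns Python 2, which also has iteritems(), iterkeys(), and itervalues()), and the dictionary contents are made up for illustration. Since Python 3.7 dicts also preserve insertion order, but the pairing shown here is the property the footnote describes.

```python
# Hypothetical example data; any dict would do.
d = {"spam": 1, "eggs": 2, "ham": 3}

keys = list(d.keys())
values = list(d.values())
items = list(d.items())

# With no modification in between, position i of keys() and values()
# describes the same entry as position i of items().
assert list(zip(keys, values)) == items

# After a modification, fresh calls still pair up with each other,
# but the lists taken earlier no longer describe the whole dict.
d["bacon"] = 4
assert list(zip(d.keys(), d.values())) == list(d.items())
assert len(items) == 3 and len(d) == 4
```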

http://docs.python.org/lib/types-set.html: append a new sentence to 2nd par.

Iteration over a set returns elements in an indeterminate order,
which generally depends on factors outside the scope of the
containing program.

Alan Isaac

Raymond Hettinger

May 11, 2007, 9:41:32 PM

This doesn't improve the docs. It suggests some mystic forces at work
while offering nothing that is actionable or that improves
understanding. Adding this kind of muck will only make the docs less
clear.

Recommend dropping this one and moving on to solve some real problems.

Raymond

ru...@yahoo.com

May 13, 2007, 7:52:18 PM

On May 11, 7:41 pm, Raymond Hettinger <pyt...@rcn.com> wrote:
> On May 11, 5:59 pm, "Alan Isaac" <ais...@american.edu> wrote:
>
> > This is an attempt to synthesize Bill and Carsten's proposals.
> > (I'm changing the subject line to better match the topic.)
>
> > http://docs.python.org/lib/typesmapping.html: for footnote (3)
>
> > Keys and values are listed in an arbitrary order. This order is
> > indeterminate and generally depends on factors outside the scope of
> > the containing program. However, if items(), keys(), values(),
> > iteritems(), iterkeys(), and itervalues() are called with no
> > intervening modifications to the dictionary, the lists will directly
> > correspond.
>
> > http://docs.python.org/lib/types-set.html: append a new sentence to 2nd par.
>
> > Iteration over a set returns elements in an indeterminate order,
> > which generally depends on factors outside the scope of the
> > containing program.
>
> This doesn't improve the docs. It suggests some mystic forces at work
> while offering nothing that is actionable or that improves
> understanding. Adding this kind of muck will only make the docs less
> clear.

I too find the suggested text not very clear and would not
immediately predict from it the effects that started this
thread (or actually, the thread in
http://groups.google.com/group/comp.lang.python/msg/4dc632b476fdc6d3?hl=en&).

> Recommend dropping this one and moving on to solve some real problems.

Perhaps this attitude helps explain some of the problems
in the current documentation.

Dismissing this as not a "real problem" is both wrong
and offensive to people taking the time to actually
propose improvements.

The current docs are clearly wrong. To repeat what has
already been pointed out, they say, "Keys and values are
listed in an arbitrary order which is non-random,
varies across Python implementations, and depends on the
dictionary's history of insertions and deletions."

It has been shown that even when the history of insertions
and deletions is the same, the order may be different.
Taking "history" to extend across program invocation
boundaries is unconventional, to put it charitably, and
there is no reason to assume that interpretation would
occur to a reasonable reader. The whole issue can be
cleared up simply by clarifying the documentation; I
really fail to see why this should be at all controversial.

I will offer my own suggestion based on the belief that
documentation should be as explicit as possible:

"Keys and values are listed in an arbitrary but non-random
order which may vary across Python versions, implementations,
and the dictionary's history of insertions and deletions.
When the contents are objects using the default implementation
of __hash__() and __eq__(), the order will depend on the
objects' id() values which may be different even between
different invocations of a program (whether executed from
a .py or a .pyc file for example.)"
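
As a rough illustration of the behavior this wording tries to describe (not part of the proposed text), the sketch below uses a hypothetical class, Node, that defines neither __hash__() nor __eq__(). It assumes CPython, where the default hash is derived from the object's address, which is also what id() reports.

```python
class Node:
    """Hypothetical class that defines neither __hash__ nor __eq__."""

    def __init__(self, label):
        self.label = label


a = Node("x")
b = Node("x")

# Default equality is identity, so equal-looking instances are distinct.
print(a == b)        # False
print(len({a, b}))   # 2

# The default hash comes from object.__hash__, which CPython derives
# from the object's address rather than from its contents.
print(hash(a) == object.__hash__(a))   # True

# The iteration order of this set therefore depends on where the
# objects happen to live in memory during this particular run, so no
# expected output is shown for this line.
print([n.label for n in {Node("a"), Node("b"), Node("c")}])
```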

Apropos sig...
--
Any software can be broken or fixed simply by changing
the documentation.

Alan Isaac

May 15, 2007, 12:31:37 PM

This discussion ended abruptly, and I'd like to see it reach a
conclusion. I will attempt to synthesize Bill and Carsten's
proposals.

There are two proposed patches. The first is to
http://docs.python.org/lib/typesmapping.html
where it is proposed for footnote (3) to state:

Keys and values are listed in an arbitrary order. This order is
indeterminate and generally depends on factors outside the scope
of the containing program. However, if items(), keys(), values(),
iteritems(), iterkeys(), and itervalues() are called with no
intervening modifications to the dictionary, the lists will
directly correspond.

The second is for http://docs.python.org/lib/types-set.html
where the proposal is to append a new sentence to the 2nd paragraph:

Iteration over a set returns elements in an indeterminate
order, which generally depends on factors outside the scope of the
containing program.

Alan Isaac

May 19, 2007, 9:19:44 AM

I submitted the language based on Bill and Carsten's proposals:

https://sourceforge.net/tracker/?func=detail&atid=105470&aid=1721372&group_id=5470

That language has been rejected.
You may want to read the discussion and see if
acceptable language still seems discoverable.

Alan Isaac

Steven Bethard

May 19, 2007, 11:06:15 AM

Seems to me that you're focusing on the wrong part of the docs. The
source of this "bug" is not sets or dicts, but the default __hash__
method implementation. Why don't you propose adding something like:

The default __hash__ method is based on an object's id(), and can
therefore change between different iterations of the same program.

to the docs for __hash__:

http://docs.python.org/ref/customization.html

Then if you really feel you need to add something for sets and dicts,
you can add a cross-reference to the __hash__ docs.
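
For contrast with the default id()-based behavior, here is a minimal sketch of a class whose hash depends only on its contents. The class name Point and its fields are hypothetical, chosen only for illustration; the point is that a hash built from a tuple of ints does not involve memory addresses.

```python
class Point:
    """Hypothetical value type whose hash depends only on its contents."""

    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __eq__(self, other):
        if not isinstance(other, Point):
            return NotImplemented
        return (self.x, self.y) == (other.x, other.y)

    def __hash__(self):
        # Hashing a tuple of ints does not involve id().
        return hash((self.x, self.y))


p, q = Point(1, 2), Point(1, 2)

# Equal contents give equal objects and equal hashes, so a set
# treats them as the same element.
assert p == q and hash(p) == hash(q)
assert len({p, q}) == 1
```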

STeVe

7stud

May 19, 2007, 12:58:47 PM

On May 19, 9:06 am, Steven Bethard <steven.beth...@gmail.com> wrote:
> Alan Isaac wrote:
> > I submitted the language based on Bill and Carsten's proposals:
>
> > https://sourceforge.net/tracker/?func=detail&atid=105470&aid=1721372&...
>
> > That language has been rejected.
> > You may want to read the discussion and see if
> > acceptable language still seems discoverable.
>
> Seems to me that you're focusing on the wrong part of the docs. The
> source of this "bug" is not sets or dicts, but the default __hash__
> method implementation. Why don't you propose adding something like:
>
> The default __hash__ method is based on an object's id(), and can
> therefore change between different iterations of the same program.
>
> to the docs for __hash__:
>
> http://docs.python.org/ref/customization.html
>
> Then if you really feel you need to add something for sets and dicts,
> you can add a cross-reference to the __hash__ docs.
>
> STeVe

Here's an idea--add all the proposed changes to the docs. Why not
allow users to add any explanations to the docs that they want? Then
readers can choose the explanations that make the most sense to them.
It would eliminate endless, petty discussions about what minutiae are
more important, and it would allow people to spend their time on more
productive efforts. And it would improve the docs exponentially.

Steve Holden

May 19, 2007, 1:38:04 PM

to pytho...@python.org

Except in those instances where users added information that was
explicitly wrong. Which any reader of this newsgroup knows is all too
easy to do. So there would need to be some editorial control. Which
would take effort that may not currently be available.

regards
Steve
--
Steve Holden +1 571 484 6266 +1 800 494 3119
Holden Web LLC/Ltd http://www.holdenweb.com
Skype: holdenweb http://del.icio.us/steve.holden
------------------ Asciimercial ---------------------
Get on the web: Blog, lens and tag your way to fame!!
holdenweb.blogspot.com squidoo.com/pythonology
tagged items: del.icio.us/steve.holden/python
All these services currently offer free registration!
-------------- Thank You for Reading ----------------

7stud

May 19, 2007, 1:47:02 PM

On May 19, 11:38 am, Steve Holden <s...@holdenweb.com> wrote:
> Except in those instances where users added information that was
> explicitly wrong.

It's a self-correcting mechanism. Other readers will spot the error
and post corrections.

Steve Holden

May 19, 2007, 2:36:26 PM

to pytho...@python.org

The last thing I want to read in a language's documentation is an
ill-informed and sometimes interminable argument about a particular feature.

For documentation I'm all in favor of user contributions, but I believe
an editorial process is required to ensure readability. I am aware that
the documentation isn't perfect but it's pretty good, and I don't think
throwing it open to anyone (including, by the way, web spammers) to add
to it is necessarily the best way to improve it.

7stud

May 19, 2007, 2:58:15 PM

On May 19, 12:36 pm, Steve Holden <s...@holdenweb.com> wrote:
> The last thing I want to read in a language's documentation is an
> ill-informed and sometimes interminable argument about a particular feature.
>

Yet some readers will be able to get to the bottom of an issue they
are having by reading those comments.

Steven Bethard

May 19, 2007, 3:17:22 PM

7stud wrote:
> On May 19, 9:06 am, Steven Bethard <steven.beth...@gmail.com> wrote:
>> Alan Isaac wrote:
>>> I submitted the language based on Bill and Carsten's proposals:
>>> https://sourceforge.net/tracker/?func=detail&atid=105470&aid=1721372&...
>>> That language has been rejected.
>>> You may want to read the discussion and see if
>>> acceptable language still seems discoverable.
>> Seems to me that you're focusing on the wrong part of the docs. The
>> source of this "bug" is not sets or dicts, but the default __hash__
>> method implementation. Why don't you propose adding something like:
>>
>> The default __hash__ method is based on an object's id(), and can
>> therefore change between different iterations of the same program.
>>
>> to the docs for __hash__:
>>
>> http://docs.python.org/ref/customization.html
>>
>> Then if you really feel you need to add something for sets and dicts,
>> you can add a cross-reference to the __hash__ docs.
>
> Here's an idea--add all the proposed changes to the docs. Why not
> allow users to add any explanations to the docs that they want? Then
> readers can choose the explanations that make the most sense to them.
> It would eliminate endless, petty discussions about what minutiae are
> more important, and it would allow people to spend their time on more
> productive efforts.

Actually, it would just move the "endless, petty discussions about what
minutiae are more important" into the docs. I don't see how that's an
improvement.

STeVe

Robert Kern

May 19, 2007, 3:28:57 PM

to pytho...@python.org

And most will simply be confused.

--
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma
that is made terrible by our own mad attempt to interpret it as though it had
an underlying truth."
-- Umberto Eco

7stud

May 19, 2007, 6:00:45 PM

> Actually, it would just move the "endless, petty discussions about what
> minutiae are more important" into the docs. I don't see how that's an
> improvement.

Because it highlights the issues you will be faced with when using the
described functions. People will post about an issue they had with a
function, and then they will post their solution. Instead of having
to search all over google for an answer, the most relevant discussions
will be right in the docs. As you read the user comments, you would
be able to quickly tell whether a comment pertains to the issue you
are having trouble with, and if the comment isn't relevant, you can
skip the comment and look at the next comment. If you wanted, you
could limit yourself to reading just the official python description
of the function and be no worse off.

> And most will simply be confused.

Then it's likely someone will post something to clear up the confusion.

Steve Holden

May 19, 2007, 9:50:58 PM

to pytho...@python.org

But the real point is that it won't actually do much good to turn the
documentation into a miniature version of c.l.py itself.

You and other readers might be interested in a recent experiment by
Georg Brandl, one of the Python core developers. As Georg says, "For the
impatient: the result can be seen at <http://pydoc.gbrandl.de>".

This is based on a translation of the existing LaTeX source into
reStructuredText format. I understand Georg is considering enabling
user comments, among other enhancements. You should also understand that
this is a work in progress which may never come to full fruition.

think...@hotmail.com

May 20, 2007, 4:35:33 AM

On May 19, 8:06 am, Steven Bethard <steven.beth...@gmail.com> wrote:
> Seems to me that you're focusing on the wrong part of the docs. The
> source of this "bug" is not sets or dicts,

Seems to me, this thread has lost touch with reality. There is no
bug, just a quest to make some random change to docs just to make
the OP feel better about not being able to grasp the concept of an
unordered collection.

Seems to me, he missed something so basic that docs won't help him.
When you care about order, then don't use an unordered collection.
Case closed. No need to add useless, distracting garbage to the docs.
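
For readers who do need a repeatable order from a set, one common approach is to sort at the point of use. A minimal sketch, with made-up data:

```python
names = {"carol", "alice", "bob"}   # hypothetical data

# Iterating over the set directly gives no order guarantee,
# so sort when the order matters.
for name in sorted(names):
    print(name)

# sorted() returns a new list and leaves the set untouched.
ordered = sorted(names)
```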

Seems to me, some people would rather think themselves into knots
than to accept the obvious.

Richard T

Raymond Hettinger

May 20, 2007, 5:12:48 AM

On May 13, 4:52 pm, r...@yahoo.com wrote:
> Dismissing this as not a "real problem" is both wrong
> and offensive to people taking the time to actually
> propose improvements.

I should have elaborated on what I meant by saying that there is not a
real problem. Another way to put it is that the docs are sufficient
when they say that set ordering is arbitrary. That should be a cue to
not have *any* expectations about the internal ordering of sets and
dicts.

Any further documentation of behavior would be a mistake because it
would of necessity expose implementation specific details. For
instance, there is another intentionally undocumented observable
behavior that sets and dicts change their internal order as new
members are added. It is also intentional that Python makes almost no
promises about the location of objects in memory. IIRC, the only
guarantees made about object identity are that "a is a" is always true
and None can be tested with "is".
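
A small sketch of the identity guarantees mentioned above, plus the related point that equality of sets and dicts does not depend on internal order. The variable names are illustrative only.

```python
a = object()
assert a is a            # an object is always identical to itself

value = None
assert value is None     # None is a singleton, so "is" is the usual test

# Equality of sets and dicts ignores internal order, so comparing the
# containers themselves is safe where comparing an iteration order
# would not be.
s1 = {1, 2, 3}
s2 = {3, 2, 1}
assert s1 == s2
d_eq = {"a": 1, "b": 2} == {"b": 2, "a": 1}
assert d_eq
```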

Raymond

Alan Isaac

May 20, 2007, 3:05:46 PM

"Raymond Hettinger" <pyt...@rcn.com> wrote in message
news:1179652368.4...@k79g2000hse.googlegroups.com...

> Another way to put it is that the docs are sufficient
> when they say that set ordering is arbitrary. That should be a cue to
> not have *any* expectations about the internal ordering of sets and
> dicts.

You are usually more careful.

1. Please do not conflate two issues here.
It confuses people like Richard T.

Did *anyone* who participated in the initial conversation
express an expectation that set ordering is not arbitrary?
No. Not one.

What surprised people was that this ordering
could vary between two *sequential* executions of
an *unchanged* source.

Martin dismisses this by simply asserting (on what basis?)
that anyone who was surprised lacks Python experience,
and that to address this in any way would make the
reference library assume the role of a tutorial.
Not very plausible, IMO, given the rest of the library
documentation.

2. You say the existing docs "should be a cue",
and yet they clearly did not provide enough guidance
to an ordinary user (me) and some more sophisticated users.
So the docs "should be a cue" to people who do not need a cue.
Do I understand you correctly?

3. Finally, please do not claim that the docs say that set ordering is
arbitrary. At least not the docs we have been talking about:
http://docs.python.org/lib/types-set.html
It is fascinating that you would confuse this, since it is the core
of the proposed documentation patch (although the proposed
language was "indeterminate" rather than arbitrary).

So it also seems you are now claiming that the patch should not be in
because of the presence of language that is in fact not there.

Look, I was just trying to help other users who might be
as surprised as I was. As I said, I am not attached to any
language, and in fact I just used the proposals of others.
I just wanted there to be some clue for users who read the docs.
If you prefer to leave such users baffled, so be it.
My effort is exhausted.

Cheers,
Alan Isaac

ru...@yahoo.com

May 20, 2007, 7:51:04 PM

On May 20, 3:12 am, Raymond Hettinger <pyt...@rcn.com> wrote:
> On May 13, 4:52 pm, r...@yahoo.com wrote:
> > Dismissing this as not a "real problem" is both wrong
> > and offensive to people taking the time to actually
> > propose improvements.
>
> I should have elaborated on what I meant by saying that there is not a
> real problem. Another way to put it is that the docs are sufficient
> when they say that set ordering is arbitrary. That should be a cue to
> not have *any* expectations about the internal ordering of sets and
> dicts.

I disagree. When reading the docs, the reader will
always have and need assumptions because the docs can't
describe all behavior from first principles. Every
programmer will bring and apply his or her understanding
of how computers and computer programs operate under
the hood.

For example, nowhere in the "file" object documentation
does it say that files are read starting from byte 0.
It relies on the fact that the reader will have that
expectation based on previous experience with computers.

There are two basic principles that I think most
programmers apply sans explicit contradictory
information:
* That documentation about behavior applies within
the bounds of a single execution.
* That computers are fundamentally deterministic
(with a possible exception for code running in
Microsoft OSes. :-)

When I read that sets return items in arbitrary
order (and the docs aren't even that specific),
I make a natural assumption that, no information
provided to the contrary, within a single program
execution the order will be arbitrary. Since it
says nothing about between executions, the very
strong general rule applies: that if no obvious
source of volatility or dependence on environment
exists, the same program should produce the same
results.
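
To see run-to-run variation from unchanged source, a sketch along these lines may help. It relies on string hash randomization (on by default in Python 3.3 and later, controlled by the PYTHONHASHSEED environment variable), which is a different mechanism from the id()-based effect discussed in this thread but produces the same kind of surprise. The helper name run_with_seed and the snippet are assumptions made for illustration.

```python
import os
import subprocess
import sys

# The unchanged "program" that is run several times.
SNIPPET = "print(list({'spam', 'eggs', 'ham'}))"


def run_with_seed(seed):
    """Run SNIPPET in a fresh interpreter with the given PYTHONHASHSEED."""
    env = dict(os.environ, PYTHONHASHSEED=str(seed))
    result = subprocess.run(
        [sys.executable, "-c", SNIPPET],
        env=env, capture_output=True, text=True, check=True,
    )
    return result.stdout.strip()


# Same source, same seed: the order is reproducible.
first = run_with_seed(1)
second = run_with_seed(1)

# Same source, different seed: the order may (but need not) differ.
other = run_with_seed(2)
print(first, second, other, sep="\n")
```

Without a fixed seed, each invocation picks its own, which is why two sequential runs of the same script can list the same set in different orders.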

> Any further documentation of behavior would be a mistake because it
> would of necessity expose implementation specific details.

You don't need to make promises to explain surprising
behavior. (The word "may" is amazingly useful in these
cases :-) A "for example" that exposes implementation
details makes no promises yet can make clear non-intuitive
behavior. A concise but clear noting of the surprising
behavior seen by the OP would improve the clarity of the
documentation, not harm it.

> For
> instance, there is another intentionally undocumented observable
> behavior that sets and dicts change their internal order as new
> members are added. It is also intentional that Python makes almost no
> promises about the location of objects in memory. IIRC, the only
> guarantees made about object identity are that "a is a" is always true
> and None can be tested with "is".

One last comment. While I treat opinions from Python
experts on Python technical details with great respect
and appreciation, opinions on documentation should be
viewed with much greater skepticism. It can be difficult
for an expert to view Python with the same eyes as a
non-guru level programmer, yet the latter is (or
should be) the target audience of the documentation.
[And please, let's not start the reference vs tutorial
thing!]