Python's doc problems: sort
From: "xah...@gmail.com" <xah...@gmail.com>
Newsgroups: comp.lang.perl.misc,comp.lang.python,comp.lang.lisp
Subject: Python's doc problems: sort
Date: Tue, 29 Apr 2008 19:48:39 -0700 (PDT)
Of my Python and Perl tutorial at http://xahlee.org/perl-python/index.html
the most popular page is
“Sorting in Python and Perl”
http://xahlee.org/perl-python/sort_list.html
For example, last week that page was fetched 550 times.
The second most popular page trails by quite a distance. Here are the top
3 pages and the number of times each was fetched:
550 http://xahlee.org/perl-python/sort_list.html
341 http://xahlee.org/perl-python/system_calls.html
222 http://xahlee.org/perl-python/index.html
Note that the first 2 pages are far more popular than the entry page,
the table of contents.
Apparently, as my web log also confirms, people have difficulty using
sort, and they find my pages through web search engines.
------------------
In 2005, i wrote over ten essays detailing Python's documentation
problems. One of them is titled:
“Python Doc Problem Example: sort()”
http://xahlee.org/perl-python/python_doc_sort.html
It's been 3 years, and Python has gone from 2.4.x to 2.5.2.
Looking at the current version of the doc, the Python doc for
that page apparently hasn't improved a bit.
I want to emphasize a point here, as i have done quite emphatically in
the past. The Python documentation is the world's worst technical
writing. As far as technical writing goes, it is even worse than
Perl's in my opinion.
Although i disliked Perl very much, in part because it is led by a cult
figure that manipulates and deceives the populace, there is at
least one aspect of the Perl community that is very positive, namely,
that it embraces all walks of life. This aspect was adopted by a Perl
derivative, the Pretty Home Page, whose success surpassed Perl's, yet
without Perl's cult string. Now, in my experience, the Python community is
filled with politics more so than Perl, and far more fucking assholes
with high hats.
Python priests: go fuck yourselfs.
(disclaimer: all statements about actual persons in this post are
statements of opinion.)
----------------------
Now, i find it pertinent to post my essay about the sort documentation
problem again.
The HTML version with colors and formatting is here:
http://xahlee.org/perl-python/python_doc_sort.html
Below is an abridged textual version.
-------------------------------------
Python Doc Problem Example: sort()
The Python doc “3.6.4 Mutable Sequence Types” at
http://python.org/doc/2.4/lib/typesseq-mutable.html
contains the documentation of the “sort” method of a list.
Quote:
«...»
As a piece of documentation, this is a lousy one.
The questions Python doc writers need to ask when evaluating this piece
of doc are these:
* Can an experienced programmer who is expert at several languages
but new to Python, and who has also read the official Python tutorial,
read this doc and know exactly how to use sort with all the
options?
* Can this piece of documentation be rewritten fairly easily, so
that the answer to the previous question is a resounding yes?
To me, the answers to the above questions are No and Yes. Here are
some issues with the doc:
• In the paragraph about the “key” parameter, the illustration given
is: “cmp=str.lower”. It should be “key=str.lower”.
• This doc lacks examples. One or two examples would help a lot,
especially for less experienced programmers (who comprise the
majority of readers). In particular, it should give a full example of
using the comparison function and one with the “key” parameter.
Examples are particularly needed here because these parameters are
functions, often written with the “lambda” construct. These are unusual
and advanced constructs among imperative languages.
• This doc fails to mention what happens when the predicate and the
shortcut version conflict, e.g. “myList.sort(cmp=lambda x,y:
cmp(x[0], y[0]), key=lambda x: str(x[1]) )”
• The notation the Python doc has adopted for indicating the syntax of
optional parameters does not give a clear view of exactly what
combination of optional parameters can be omitted. The notation
“s.sort([cmp[, key[, reverse]]])” gives the impression that only
trailing arguments can be omitted, which is not true.
• The doc gives no indication of how to omit an optional arg. Should it
be “nul”, “Null”, 0, or left empty? Since it doesn't give any
examples, a doc reader who isn't a Python expert is left to guess at how
true/false values are represented in Python.
• On the whole, the way this doc is written does not give a clear
picture of the roles of the supplied options, nor how to use them.
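The questions above about omitting optional parameters can be answered with a short sketch. It assumes Python 3, where the cmp parameter discussed in this post no longer exists and key and reverse must be passed by keyword; the list name "records" and its contents are made up for illustration.

```python
# Hypothetical sample data: (name, age) pairs.
records = [("bob", 25), ("Alice", 30), ("carol", 20)]

# Omit every optional parameter: plain ascending sort of the tuples.
plain = list(records)
plain.sort()

# Supply only "key". The other option is simply left out; no placeholder is needed.
by_age = list(records)
by_age.sort(key=lambda r: r[1])

# Supply only "reverse", skipping "key". True and False are Python's boolean values.
descending = list(records)
descending.sort(reverse=True)

# Passing None for key behaves the same as omitting it.
same_as_plain = list(records)
same_as_plain.sort(key=None)

print(plain)
print(by_age)
print(descending)
```

Because the options are named at the call site, any combination can be left out, not only the trailing ones.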
Suggested Quick Remedy: add an example of using the cmp function, and an
example using the “key” function. Add an example of using one of them
together with reverse. (The examples need not come with much
explanation. A one-sentence annotation is better than none.)
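As a rough illustration of the kind of examples meant here, the sketch below shows a comparison function, a key function, and a key combined with reverse. It is written for Python 3, which is an assumption on my part: there the cmp parameter has been removed, so the comparison function is wrapped with functools.cmp_to_key. The word list and the function name compare_ignoring_case are invented for the example.

```python
from functools import cmp_to_key

words = ["banana", "Cherry", "apple"]

# 1. A comparison function: returns a negative number, zero, or a positive
#    number depending on whether x sorts before, equal to, or after y.
def compare_ignoring_case(x, y):
    a, b = x.lower(), y.lower()
    return (a > b) - (a < b)

by_cmp = list(words)
# In Python 2.4 this would be: by_cmp.sort(cmp=compare_ignoring_case)
by_cmp.sort(key=cmp_to_key(compare_ignoring_case))

# 2. A key function: called on each element to produce the value to compare.
by_key = list(words)
by_key.sort(key=str.lower)

# 3. A key function together with reverse: case-insensitive, descending.
by_key_desc = list(words)
by_key_desc.sort(key=str.lower, reverse=True)

print(by_cmp)
print(by_key)
print(by_key_desc)
```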
Other than that, the way the doc is laid out with a terse table and
run-on footnotes (employed in several places in Python doc) is not
inductive. For a better improvement, there needs to be an overhaul of
the organization and the attitude of the entire doc. The organization
needs to be programming based, as opposed to implementation or computer
science based (in this regard, one can learn from the Perl folks). As
to attitude, the writing needs to be Python-as-is, as opposed to
computer science framework, as indicated in the early parts of this
critique series.
Addendum, 2005-10: Python 2.4, released in 2004-11, added a new built-in
function sorted(). There's no mention of it at the doc page
of the sort() method.
Addendum, 2005-10
Here's a further example of the extremely low quality of Python's
documentation. In particular, what follows focuses on the bad writing
skill aspect, and comments on some language design and quality issues
of Python.
From the Official Python documentation of the sort() method, at:
http://python.org/doc/2.4.2/lib/typesseq-mutable.html, Quote:
«The sort() method takes optional arguments for controlling the
comparisons.»
It should be “optional parameter” not “optional argument”. Their
difference is that “parameter” indicates the variable, while
“argument” indicates the actual value.
«... for controlling the comparisons.»
This is bad writing caused by lack of understanding. No, it doesn't
“control the comparison”. The proper way to say it is that “the
comparison function specifies an order”.
«The sort() and reverse() methods modify the list in place for
economy of space when sorting or reversing a large list. To remind you
that they operate by side effect, they don't return the sorted or
reversed list.»
This is an example of tech-geeking drivel. The sort() and reverse()
methods are just the way they are. Their design and behavior are
really not for some economy or to remind programmers of something. The
Python doc is bulked up with such irrelevant drivel. These littered
inanities drag down the whole quality and effectiveness of the doc.
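Whatever one thinks of the wording, the behavior the quoted passage describes is easy to show. A minimal sketch, assuming Python 2.4 or later (including Python 3) so that the built-in sorted() is available; the variable names are made up.

```python
numbers = [3, 1, 2]

# sort() rearranges the list itself and returns None.
result = numbers.sort()
print(result)    # None
print(numbers)   # [1, 2, 3]

# sorted() leaves its argument alone and returns a new sorted list.
original = [3, 1, 2]
new_list = sorted(original)
print(original)  # [3, 1, 2]
print(new_list)  # [1, 2, 3]
```

A common slip is to write "numbers = numbers.sort()", which replaces the list with None.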
«Changed in version 2.4: Support for key and reverse was added.»
«In general, the key and reverse conversion processes are much
faster than specifying an equivalent cmp function. This is because cmp
is called multiple times for each list element while key and reverse
touch each element only once.»
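The quoted claim about call counts can be checked with a small counting sketch. It assumes Python 3, where a comparison function has to be adapted with functools.cmp_to_key; the data and the names counting_key and counting_cmp are invented for illustration.

```python
from functools import cmp_to_key

data = [5, 3, 8, 1, 9, 2, 7, 4, 6, 0]
calls = {"key": 0, "cmp": 0}

def counting_key(x):
    # Count every invocation of the key function.
    calls["key"] += 1
    return x

def counting_cmp(x, y):
    # Count every invocation of the comparison function.
    calls["cmp"] += 1
    return (x > y) - (x < y)

sorted_by_key = sorted(data, key=counting_key)
sorted_by_cmp = sorted(data, key=cmp_to_key(counting_cmp))

print("key calls:", calls["key"])  # one call per element
print("cmp calls:", calls["cmp"])  # one call per comparison made by the sort
```

The key function runs exactly once per element, while the comparison function runs once per comparison, which is typically a larger number and grows faster with the size of the list.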
When sorting something, one needs to specify an order. The easiest way
is to simply list all the elements as a sequence. That way, their
order is clearly laid out. However, this is in general neither feasible
nor practical. Therefore, we devised a mathematically condensed way
to specify the order, by defining a function f(x,y) that can take any
two elements and tell us which one comes first. This is the gist of
sorting a list in any programming language.
The ordering function, being a mathematically condensed way of
specifying the order, has some constraints. For example, the function
should not tell us x < y and y < x. (For a complete list of these
constraints, see http://xahlee.org/perl-python/sort_list.html )
This ordering function is all that sort needs to sort a list.
Anything more is interface complexity.
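To make the point concrete, here is a sketch of a sort that needs nothing but the ordering function f(x,y). It is a plain insertion sort written for illustration, not how Python's sort() is implemented; the names sort_with, comes_first and shorter_first are hypothetical.

```python
def sort_with(items, comes_first):
    """Return a new list ordered by comes_first(x, y),
    which must be true exactly when x has to precede y."""
    result = []
    for item in items:
        # Walk back from the end past every element that item must precede.
        position = len(result)
        while position > 0 and comes_first(item, result[position - 1]):
            position -= 1
        result.insert(position, item)
    return result

# Ordering function: shorter strings come first.
def shorter_first(x, y):
    return len(x) < len(y)

words = ["pear", "fig", "banana", "kiwi"]
ordered = sort_with(words, shorter_first)
print(ordered)
```

Elements that the ordering function does not distinguish ("pear" and "kiwi" here) keep their original relative order in this sketch.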
The optional parameters “key” and “reverse” in Python's sort method are
interface complexity. What happened here is that a compiler
optimization problem is evaded by moving it into the language syntax
for programmers to worry about. If the programmer does not use the “key”
syntax when sorting a large matrix (provided that he knew in advance
of the list to be sorted or the ordering function), then he is
penalized by a severe inefficiency of an order of magnitude in
execution time.
This situation, of moving compiler problems to the syntax surface is
common in imperative languages.
«Changed in version 2.3: Support for None as an equivalent to
omitting cmp was added.»
This is an epitome of catering towards morons. “myList.sort()” is
perfect but Pythoners had to add “myList.sort(None)” interface
complexity just because idiots need it.
The motivation here is simple: an explicit “None” gives coding monkeys
a direct sensory input of the fact that “there is no comparison
function”. This is like the double negative in black English “I ain't
no gonna do it!”. Logically, “None” is not even correct and leads to
bad thinking. What really should be stated in the doc is that “the
default ordering function to sort() is the ‘cmp’ function.”
«Starting with Python 2.3, the sort() method is guaranteed to be
stable. A sort is stable if it guarantees not to change the relative
order of elements that compare equal -- this is helpful for sorting in
multiple passes (for example, sort by department, then by salary
grade).»
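The "multiple passes" remark in the quote can be sketched as follows. This assumes Python 2.4 or later for the key parameter; the employee records are invented. The less significant criterion is sorted first, and stability keeps that order intact inside each group of the later pass.

```python
# Hypothetical records: (name, department, salary grade).
employees = [
    ("Dana", "Sales", 3),
    ("Eli", "Engineering", 2),
    ("Fay", "Sales", 1),
    ("Gus", "Engineering", 1),
]

# Pass 1: order by the secondary criterion, salary grade.
employees.sort(key=lambda e: e[2])

# Pass 2: order by the primary criterion, department. Because the sort is
# stable, employees in the same department stay in salary-grade order.
employees.sort(key=lambda e: e[1])

for employee in employees:
    print(employee)
```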
One is quite surprised to read this. For about a decade of a
language's existence, its sort functionality is not smart enough to
preserve order?? A sort that preserves original order isn't something
difficult to implement. What we have here is sloppiness and poor
quality common in OpenSource projects.
Also note the extremely low quality of the writing. It employs the
jargon “stable sort” then proceeds to explain what it is, then in
trying to illustrate the situation, it throws in “multiple passes” and
the mysterious “by department, by salary”.
Here's a suggested rewrite: “Since Python 2.3, the result of sort() no
longer rearranges elements where the comparison function returns 0.”
Xah
x...@xahlee.org
∑ http://xahlee.org/
☄