
Summer reading list


Raymond Hettinger

Aug 12, 2003, 2:57:05 AM
Found in a pamphlet at a pre-school:
---------------------------------------
Reading improves vocabulary
Reading raises cultural literacy through shared knowledge
Reading develops writing skills
Reading opens the mind to new ways of understanding
Reading is fun


Accordingly, I suggest the following works of literature:

* heapq.py (255 lines)
* sets.py (536 lines)
* textwrap.py (355 lines)
* csv.py (427 lines)

These make enjoyable reading, cover interesting topics/algorithms,
demonstrate superb modern python technique, and showcase
effective use of Python's newer features.

Learn from the masters:
Pinard, O'Connor, Peters, Wilson, Martelli, van Rossum,
Ward, Montanaro, Altis, Drake, and others

have-you-read-any-good-code-lately-ly yours,


Raymond Hettinger


P.S. The unittests for sets.py are *not* as enjoyable reading; however,
they are a highly instructive example of Greg's sophisticated use of the
testing framework and his unusually thorough approach to deciding
what and how to test. Lib/test/test_sets.py (692 lines)
Learning from Greg's example enabled me to use similar ideas in
developing Lib/test/test_random.py (298 lines).


Bengt Richter

Aug 12, 2003, 11:28:54 AM
On Tue, 12 Aug 2003 06:57:05 GMT, "Raymond Hettinger" <vze4...@verizon.net> wrote:

>Found in a pamphlet at a pre-school:
>---------------------------------------
>Reading improves vocabulary
>Reading raises cultural literacy through shared knowledge
>Reading develops writing skills
>Reading opens the mind to new ways of understanding
>Reading is fun
>
>
>Accordingly, I suggest the following works of literature:
>
> * heapq.py (255 lines)
> * sets.py (536 lines)
> * textwrap.py (355 lines)
> * csv.py (427 lines)
>
>These make enjoyable reading, cover interesting topics/algorithms,
>demonstrate superb modern python technique, and showcase
>effective use of Python's newer features.
>
>Learn from the masters:
> Pinard, O'Connor, Peters, Wilson, Martelli, van Rossum,
> Ward, Montanaro, Altis, Drake, and others
>

Thanks for the nudge. It reminds me: I've wondered if a Python Reading Club
could work. I.e., with agreed-upon reading schedule and discussion, like some
folks do with non-technical stuff.

Anyway, it might be a good nudge to get me to read the module docs I haven't
yet looked at, or cpython sources, or ... sigh, not enough time ;-/

As a byproduct, IWT it could yield a lot of doc usability info and ideas for
improvement. Maybe people would be motivated to rewrite selected paragraphs.
Maybe there could be a wiki for this kind of thing, so they could just do it
when the motivation was hot.

There could be different level groups for different interests, e.g., for
tutorial reading vs metaclass arcana vs C extensions vs threading, vs
review of all the __xxx__ definitions, or other systematic topic coverage, etc.

The only trouble is finding time for everything.

Regards,
Bengt Richter

Joe Cheng

Aug 12, 2003, 11:56:11 AM
> Accordingly, I suggest the following works of literature:
>
> * heapq.py (255 lines)
> * sets.py (536 lines)
> * textwrap.py (355 lines)
> * csv.py (427 lines)
>
> These make enjoyable reading, cover interesting topics/algorithms,
> demonstrate superb modern python technique, and showcase
> effective use of Python's newer features.

I read heapq.py from here:
http://cvs.sourceforge.net/cgi-bin/viewcvs.cgi/python/python/dist/src/Lib/heapq.py

Quoting from the comments:

"""Usage:

heap = [] # creates an empty heap
heappush(heap, item) # pushes a new item on the heap
item = heappop(heap) # pops the smallest item from the heap
item = heap[0] # smallest item on the heap without popping it
heapify(x) # transforms list into a heap, in-place, in linear time
item = heapreplace(heap, item) # pops and returns smallest item, and adds
# new item; the heap size is unchanged"""
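
To try the quoted usage notes on a plain list, a minimal sketch follows. The sample values and variable names are made up for illustration; only the heapq calls come from the module.

```python
import heapq

heap = []                                # an ordinary list serves as the heap
for value in [5, 1, 4]:
    heapq.heappush(heap, value)          # push while keeping the heap invariant

smallest = heapq.heappop(heap)           # removes and returns the smallest item
peek = heap[0]                           # smallest remaining item, not removed

data = [9, 7, 8, 2]
heapq.heapify(data)                      # rearranges the list in place
replaced = heapq.heapreplace(data, 3)    # pops the smallest, then adds 3
```

After these calls the size of data is unchanged, and data[0] is again the smallest item it holds.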

It might just be my Java background creeping in (I'm a Python newbie), but,
wouldn't it be better if this was OO?

heap = Heap()
heap.push(item)
item = heap.pop()
item = heap[0]
heapified = Heap(x)
item = heap.replace(item)

Otherwise the user could easily break the heap by doing something dumb to
the list...

Joe Cheng

Aug 12, 2003, 2:32:59 PM
> > It might just be my Java background creeping in (I'm a Python newbie),
> > but, wouldn't it be better if this was OO?
> >
> > heap = Heap()
> > heap.push(item)
> > item = heap.pop()
> > item = heap[0]
> > heapified = Heap(x)
> > item = heap.replace(item)
> >
> > Otherwise the user could easily break the heap by doing something dumb
> > to the list...
>
> True. But the flexibility of using the builtin is also nice. For
> example, you can add a bunch of objects to the list, then heapify once,
> rather than having to call heap.push() a bunch of times (which may be
> slower, because you need to maintain the heap property after you push
> each new item.)

Hmmm, I'm not sure if I buy this particular argument. Line 5 in my
examples there was supposed to illustrate constructing a heap from a
list--so you could do pretty much the same thing.

list = [1, 19, 33, 40]
heap = Heap(list)

I hope you are not suggesting there is also something to be gained by
directly manipulating the list as a list after it has been heapified? :)

Oh, and of course a heap class should probably support a to_list()
export function (that returns a _copy_ of the internal list).
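
A rough sketch of what such an export could look like, assuming a hypothetical Heap wrapper (the class, its attribute name, and to_list() are illustrative, not part of heapq):

```python
import heapq

class Heap:
    """Hypothetical wrapper; the class and method names are illustrative."""

    def __init__(self, items=None):
        # Copy the input so the caller's list is never shared with the heap.
        self._items = list(items) if items is not None else []
        heapq.heapify(self._items)

    def push(self, item):
        heapq.heappush(self._items, item)

    def pop(self):
        return heapq.heappop(self._items)

    def to_list(self):
        # Return a copy, so callers cannot disturb the internal ordering.
        return list(self._items)

heap = Heap([19, 1, 40, 33])
exported = heap.to_list()
exported.reverse()        # scrambles only the exported copy
first = heap.pop()        # the heap itself is unaffected
```

Because to_list() hands out a copy, reversing or popping from the exported list leaves the heap's own ordering intact.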

> I think the idea is that, if you want a real Heap class, you can build
> one very easily (see below). And if you don't need a heap class, you
> can gain some benefits from this approach because it is exposing and
> operating on lists directly.

I've never heard this idea before... I'll need to chew on it a little.
Is this a common notion among Pythoners?

To me it sounds disturbingly like "Procedures are more flexible than
classes because you can compose classes out of procedures." I'd worry
that the proliferation of procedures would undermine good OO, which
wants coupling at the class interface level. Meanwhile you gain nothing
by using the raw procedures instead of dealing with the class. (Haven't
thought too much about this, so I could be way off...?)

> This probably comes under "practicality beats purity". (See 'The Zen of
> Python', or type "import this" into your Python interpreter)

I don't consider myself an OO purist. However, in practical terms, I'd
much rather work with a "black box" heap than one where I carry around
its state in a mutable data structure...


Chad Netzer

Aug 12, 2003, 2:11:11 PM
On Tue, 2003-08-12 at 08:56, Joe Cheng wrote:

> Quoting from the comments:
>
> """Usage:
>
> heap = [] # creates an empty heap
> heappush(heap, item) # pushes a new item on the heap
> item = heappop(heap) # pops the smallest item from the heap
> item = heap[0] # smallest item on the heap without popping it
> heapify(x) # transforms list into a heap, in-place, in linear time
> item = heapreplace(heap, item) # pops and returns smallest item, and adds
> # new item; the heap size is unchanged"""
>
> It might just be my Java background creeping in (I'm a Python newbie), but,
> wouldn't it be better if this was OO?
>
> heap = Heap()
> heap.push(item)
> item = heap.pop()
> item = heap[0]
> heapified = Heap(x)
> item = heap.replace(item)
>
> Otherwise the user could easily break the heap by doing something dumb to
> the list...

True. But the flexibility of using the builtin is also nice. For
example, you can add a bunch of objects to the list, then heapify once,
rather than having to call heap.push() a bunch of times (which may be
slower, because you need to maintain the heap property after you push
each new item.)
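
The two routes can be compared with a small sketch. The sample data and the drain() helper are made up for illustration; both routes should end up with an equally valid heap.

```python
import heapq

items = [33, 1, 40, 19, 7]

# Option 1: push items one at a time; each push restores the heap property.
pushed = []
for item in items:
    heapq.heappush(pushed, item)

# Option 2: collect everything first, then heapify once.
bulk = list(items)
heapq.heapify(bulk)

def drain(heap):
    """Pop every item; a valid heap yields them in ascending order."""
    return [heapq.heappop(heap) for _ in range(len(heap))]

drained_pushed = drain(pushed)
drained_bulk = drain(bulk)
```

The internal list layouts may differ between the two options, but popping everything gives the same ascending sequence either way.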

I think the idea is that, if you want a real Heap class, you can build
one very easily (see below). And if you don't need a heap class, you
can gain some benefits from this approach because it is exposing and
operating on lists directly.

This probably comes under "practicality beats purity". (See 'The Zen of
Python', or type "import this" into your Python interpreter)

Quick heap class (any error corrections are appreciated):
---------------------------------------------------------

import heapq

class Heap:
    def __init__( self, heap=[] ):
        heapq.heapify( heap )
        self._heap = heap

    def __getitem__( self, i ):
        return self._heap[ i ]

    def push( self, item ):
        return heapq.heappush( self._heap, item )

    def pop( self ):
        return heapq.heappop( self._heap )

    def replace( self, item ):
        return heapq.heapreplace( self._heap, item )

if __name__ == '__main__':
    # Tests
    heap = Heap()
    heap.push(3)
    heap.push(2)
    heap.push(1)
    item = heap.pop()

    assert item == 1

    item = heap[0]
    assert item == 2

    item = heap.replace(4)
    assert item == 2
    assert heap[0] == 3
    assert heap[1] == 4

--
Chad Netzer


Chad Netzer

Aug 12, 2003, 4:00:08 PM
On Tue, 2003-08-12 at 11:32, Joe Cheng wrote:

> > True. But the flexibility of using the builtin is also nice. For
> > example, you can add a bunch of objects to the list, then
> > heapify once,

> Hmmm, I'm not sure if I buy this particular argument. Line 5 in my
> examples there was supposed to illustrate constructing a heap from a
> list--so you could do pretty much the same thing.

True. I am very sympathetic with your point of view, because I am
developing a signal transform package which has many of the same
issues. I take a signal (say a list of numbers), do a transform, then
return the coefficients. Rather than having a function which does the
transform, returning just a list of coefficients, I make a class which
provides transformer objects. These objects do the work, and return the
coefficients bundled in yet another object which contains the
transformed coefficients, plus other data that is needed to do an
inverse transform.

So, in essence, I think in terms of services. An object provides the
"service" of transforming some raw data. The resulting object also
provides the "service" of (properly) inverting the data, as well as
doing some simple manipulations of it.

The reason for doing things this way, is because, as you say, exposing
just a list would allow for the opportunity for the results to be
improperly mutated or mishandled (ie. pop()-ing from the resulting list
would be disastrous). However, when I show this approach to colleagues,
many grit and gnash their teeth. They would like to simply have the
transformed coefficients to operate on directly (this is what they are
used to doing in functional paradigms). They know the result of a
transform is a list of coefficients (that is what the textbooks say),
and they do not want to perform extra steps to "pull out" just the
coefficients.

In general, this is a widespread concern in fields where raw data is
manipulated with algorithms. Should one encapsulate resultant data (ie.
bundle the related pieces all up in a special class), or just return data
as simple, common objects (ie. lists and arrays) that can be operated on
universally?

I'd say that this case (heapq) is similar, and the author simply made
the choice that was comfortable for him. Still, since the heap is
defined more by operations than state (ie. other than a list, there
isn't much extra state that needs to be encapsulated in a heap object),
I think I also prefer the approach of having a class that defines just
what operations can be performed. Construction can easily be used to
make new heaps, and (in principle), a list comprehension can easily
extract the list from the heap object. In this way, when you pass me a
heap, I know it is a heap by its type.
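
As a sketch of that last point: a class shaped like the quick Heap class posted earlier in this thread (here with a None default for the argument, which is an assumption of this sketch) can be read out with a list comprehension, because iteration falls back to __getitem__ when no __iter__ is defined.

```python
import heapq

class Heap:
    def __init__(self, heap=None):
        if heap is None:
            heap = []
        heapq.heapify(heap)
        self._heap = heap

    def __getitem__(self, i):
        return self._heap[i]

    def pop(self):
        return heapq.heappop(self._heap)

heap = Heap([40, 1, 33, 19])

# Iteration calls __getitem__ with 0, 1, 2, ... until IndexError is raised.
extracted = [item for item in heap]
```

The extracted list is a new object in heap order (smallest item first, the rest not necessarily sorted), so changing it does not touch the heap.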


> I hope you are not suggesting there is also something to be gained by
> directly manipulating the list as a list after it has been heapified? :)

I won't, but perhaps others will. :) I honestly don't know what is to
be gained from operating on the list directly, once it has been
heapified. Probably it conforms to an existing, popular, interface for
dealing with heaps (ie. C arrays, with functions that manipulate them).
Or perhaps it is so often operated on as a list, that it was easiest to
always keep it a list.


> I've never heard this idea before... I'll need to chew on it a little.
> Is this a common notion among Pythoners?
>
> To me it sounds disturbingly like "Procedures are more flexible than
> classes because you can compose classes out of procedures."

I don't think most Pythoners would accept that quote. I just think, in
this case, some might say "Try to use the builtin types to do your
work. If you create functions that manipulate them, you can also reuse
those functions in your classes." :)

Or even more likely, "In Python, it is easy to stick a Class interface
onto a functional API. If that makes you comfortable, do it." :)

I offer those partly in jest, and partly as serious advice.

Perhaps Kevin O'Connor or Tim Peters will comment more on the heapq
interface. One could always propose an (additional) class based
interface to be included as a standard part of the module, with likely
some >0 percent chance of it being accepted.

In any case, I think it raises some good issues worth thinking about and
discussing.

--
Chad Netzer


Andrew Dalke

Aug 12, 2003, 7:54:26 PM
Chad Netzer

> Quick heap class (any error corrections are appreciated):
> ---------------------------------------------------------

> def __init__( self, heap=[] ):

make that
    def __init__(self, heap = None):
        if heap is None:
            heap = []

Otherwise, all Heaps created with default args will share the
same data.
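
A small sketch of the difference, using two hypothetical class names (SharedHeap and SafeHeap) so both variants can sit side by side:

```python
import heapq

class SharedHeap:
    def __init__(self, heap=[]):       # the default list is created only once
        heapq.heapify(heap)
        self._heap = heap

    def push(self, item):
        heapq.heappush(self._heap, item)

class SafeHeap:
    def __init__(self, heap=None):
        if heap is None:
            heap = []                  # a fresh list for every instance
        heapq.heapify(heap)
        self._heap = heap

    def push(self, item):
        heapq.heappush(self._heap, item)

a, b = SharedHeap(), SharedHeap()
a.push(1)
shared_leak = b._heap                  # b sees the item pushed through a

c, d = SafeHeap(), SafeHeap()
c.push(1)
safe_result = d._heap                  # d has its own, still empty, list
```

The default value is evaluated once, when the def statement runs, so every SharedHeap() built without an argument ends up wrapping the same list object.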

Andrew
da...@dalkescientific.com


Andrew Dalke

Aug 12, 2003, 8:00:25 PM
Joe Cheng:

> It might just be my Java background creeping in (I'm a Python newbie),
> but, wouldn't it be better if this was OO?

Here's perhaps the definitive statement on the topic, from Tim Peters:
http://aspn.activestate.com/ASPN/Mail/Message/python-dev/1620162

Summary: heapq is a concrete interface, not an abstract one. It doesn't
try to encompass the different ways to do heaps. It's like bisect in that
it works on an existing data type.
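
For comparison, a minimal sketch of the bisect side of that analogy, with made-up sample values: the module's functions take an ordinary, already-sorted list rather than a dedicated container class.

```python
import bisect

scores = [10, 20, 40]                      # an ordinary, already-sorted list

bisect.insort(scores, 30)                  # inserts in place, keeping it sorted
position = bisect.bisect_left(scores, 40)  # index where 40 sits (or would go)
```

As with heapq, the caller owns the list and is responsible for keeping it in the state the functions expect.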

Andrew
da...@dalkescientific.com


John J. Lee

Aug 12, 2003, 8:41:33 PM
Chad Netzer <cne...@sonic.net> writes:

> On Tue, 2003-08-12 at 08:56, Joe Cheng wrote:
>
> > Quoting from the comments:
> >
> > """Usage:
> >
> > heap = [] # creates an empty heap
> > heappush(heap, item) # pushes a new item on the heap

[...]


> > It might just be my Java background creeping in (I'm a Python newbie), but,
> > wouldn't it be better if this was OO?
> >
> > heap = Heap()
> > heap.push(item)

[...]


> > Otherwise the user could easily break the heap by doing something dumb to
> > the list...
>
> True. But the flexibility of using the builtin is also nice. For
> example, you can add a bunch of objects to the list, then heapify once,
> rather than having to call heap.push() a bunch of times (which may be
> slower, because you need to maintain the heap property after you push
> each new item.)

I don't know what the design goals were, but perhaps there is benefit
in having heapq generic, rather than rigidly OO. Certainly Numeric
was deliberately designed this way -- so ufuncs could be applied to
any old sequence, not just Numeric arrays (do I mean ufuncs?... it's
been a while since I used Numeric).


> I think the idea is that, if you want a real Heap class, you can build
> one very easily (see below). And if you don't need a heap class, you

Certainly true, as Chad goes on to prove.


John

John J. Lee

Aug 12, 2003, 8:56:03 PM
"Andrew Dalke" <ada...@mindspring.com> writes:

That URL comes up blank for me. Found this, though:

http://www.python.org/dev/summary/2003-04-16_2003-04-30.html

| The idea of turning the heapq module into a class came up, and later
| led to the idea of having a more proper FIFO (First In, First Out)
| data structure. Both ideas were shot down. The reason for this was
| that the stdlib does not need to try to grow every single possible
| data structure in programming. Guido's design philosophy is to have a
| few very powerful data structures that other ones can be built off
| of. This is why the bisect and heapq modules just work on standard
| lists instead of defining a new class. Queue is an exception, but it
| is designed to mediate messages between threads instead of being a
| general implementation of a queue.


John

Fredrik Lundh

Aug 13, 2003, 4:03:30 AM
Andrew Dalke wrote:

> Summary: heapq is a concrete interface, not an abstract one. It doesn't
> try to encompass the different ways to do heaps. It's like bisect in that
> it works on an existing data type.

more importantly, it works on *any* existing data type, as long as the
type behaves like a mutable sequence.

</F>


Michele Simionato

Aug 13, 2003, 12:04:39 PM
"Joe Cheng" <jmc...@alum.mit.edu> wrote in message news:<mailman.106071326...@python.org>...

> To me it sounds disturbingly like "Procedures are more flexible than
> classes because you can compose classes out of procedures."

I would subscribe to that. Not that I dislike inheritance, but I don't kill
a mosquito with a bazooka.

Let me give a real-life example that happened to me recently.

I had a class manipulating text, with a "dedent" method. It turns out
that the "textwrap" module included in the Python 2.3 distribution
contains a "dedent" function doing exactly the same.
Then I had the satisfaction of killing my own implementation
and adding the textwrap.dedent function to my class with just one
line of code, "dedent=staticmethod(textwrap.dedent)". I am
very happy with that because:

1) I shortened my class;
2) I reused pre-existing code;
3) I trust the developers of the standard library more than myself;
4) If the standard library contains bugs, they are much more easily
discovered than my own bugs;
5) the burden to fix them is on the Python developers, not me ;)

The fact that "dedent" was a function and not a method in a class
made my life easier. I did not have to worry about inheriting from another
class with potential name clashes with my own, and "dedent" was the
only function I needed.
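
A minimal sketch of that one-line reuse, assuming a hypothetical TextTool class (the class and its other members are made up; only textwrap.dedent and staticmethod are real):

```python
import textwrap

class TextTool:
    """Hypothetical text-manipulation class; only the dedent line matters."""

    # Reuse the library function as a method; no inheritance involved.
    dedent = staticmethod(textwrap.dedent)

    def __init__(self, text):
        self.text = text

    def cleaned(self):
        return self.dedent(self.text)

tool = TextTool("    first line\n    second line\n")
result = tool.cleaned()
```

The staticmethod wrapper keeps the function from receiving self, so it behaves the same whether it is called on the class or on an instance.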

Fortunately, quite a lot of modules in the standard library are
written without a class interface and I would dare say I have never
seen an example of usage of a class when the class is not needed.

In other words: most of the time a lightweight approach is more than
appropriate, so why should I be forced to take a heavyweight approach?
Having "free" functions (i.e. not bound to classes) is
to me a big strength of Python and it helps reuse of code quite a lot.

To reuse classes is good, but typically only works when you know
about the class you want to inherit *before* you start coding your
own class; on the other hand, it is quite easy to add functions
or methods to your class even *after* you wrote it.

Moreover, there is nothing wrong with using many short modules collecting
utility functions; you will never clutter your namespace if you use
a minimum of care (even globals are globals only in their module,
I love that! ;)

I tend to code in terms of small functions, then I compose them in
small classes, then I compose the classes in modules. When I am
done, I play with metaclasses if I need to modify what I wrote
with a minimum of effort.

There are very few languages that can give you such flexibility,
and none simpler to use than Python.

Just my own view,


Michele

Harry George

Aug 13, 2003, 1:03:55 PM
mi...@pitt.edu (Michele Simionato) writes:

On the other hand, I have a "TabbedWriter" module which uses a class
which has a dedent (actually "undent") method. This class has
internal memory to track current indentation level (e.g., for indented
XML files). Dedenting needs to know this information.

If I had used a module with dedent as a function and saved level as a
global in the module, I would be limited to only 1 tabbed writer per
application. I often need several going at once.
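
A rough sketch of the idea, not the actual module: the class body, method names other than undent, and the tab width below are all assumptions made for illustration.

```python
import io

class TabbedWriter:
    """Illustrative sketch only; not the module described above."""

    def __init__(self, stream, tab="  "):
        self._stream = stream
        self._tab = tab
        self._level = 0                 # per-instance indentation state

    def indent(self):
        self._level += 1

    def undent(self):
        self._level = max(0, self._level - 1)

    def writeln(self, text):
        self._stream.write(self._tab * self._level + text + "\n")

# Two writers at different depths do not interfere with each other.
xml_out, log_out = io.StringIO(), io.StringIO()
xml, log = TabbedWriter(xml_out), TabbedWriter(log_out)

xml.writeln("<root>")
xml.indent()
xml.writeln("<item/>")
log.writeln("wrote item")               # still at level 0
xml.undent()
xml.writeln("</root>")
```

Each instance carries its own level, which is exactly what a single module-level global could not provide.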

>
>
> Michele

--
harry.g...@boeing.com
6-6M31 Knowledge Management
Phone: (425) 342-5601

John J. Lee

Aug 15, 2003, 5:52:31 PM
mi...@pitt.edu (Michele Simionato) writes:

> "Joe Cheng" <jmc...@alum.mit.edu> wrote in message
> news:<mailman.106071326...@python.org>...
> > To me it sounds disturbingly like "Procedures are more flexible than
> > classes because you can compose classes out of procedures."
>
> I would subscribe to that. Not that I dislike inheritance, but I don't kill
> a mosquito with a bazooka.
>
> Let me give a real-life example that happened to me recently.
>
> I had a class manipulating text, with a "dedent" method. It turns out
> that the "textwrap" module included in the Python 2.3 distribution
> contains a "dedent" function doing exactly the same.
> Then I had the satisfaction of killing my own implementation
> and adding the textwrap.dedent function to my class with just one
> line of code, "dedent=staticmethod(textwrap.dedent)". I am
> very happy with that because:

[...snip usual reasons for reusing code...]

> The fact that "dedent" was a function and not a method in a class
> made my life easier. I did not have to worry about inheriting from another
> class with potential name clashes with my own, and "dedent" was the
> only function I needed.
>
> Fortunately, quite a lot of modules in the standard library are
> written without a class interface and I would dare say I have never
> seen an example of usage of a class when the class is not needed.
>
> In other words: most of the time a lightweight approach is more than
> appropriate, so why should I be forced to take a heavyweight approach?
> Having "free" functions (i.e. not bound to classes) is
> to me a big strength of Python and it helps reuse of code quite a lot.

All fine.


> To reuse classes is good, but typically only works when you know
> about the class you want to inherit *before* you start coding your
> own class; on the other hand, it is quite easy to add functions
> or methods to your class even *after* you wrote it.

[...]

But I don't think that makes any sense. As I'm certain you know,
there would be nothing to stop you using a class in exactly the same
way you used the function, because reuse != inheritance. Functions
can be implemented in terms of classes, and classes can be implemented
in terms of functions.

The only reason it's good that Python library uses functions sometimes
instead of classes is as you say: KISS. If it only needs a function,
use a function.


John

Michele Simionato

Aug 16, 2003, 8:30:25 AM
j...@pobox.com (John J. Lee) wrote in message news:<87ekzms...@pobox.com>...

> mi...@pitt.edu (Michele Simionato) writes:
> > To reuse classes is good, but typically only works when you know
> > about the class you want to inherit *before* you start coding your
> > own class; on the other hand, it is quite easy to add functions
> > or methods to your class even *after* you wrote it.
> [...]
>
> But I don't think that makes any sense. As I'm certain you know,
> there would be nothing to stop you using a class in exactly the same
> way you used the function, because reuse != inheritance. Functions
> can be implemented in terms of classes, and classes can be implemented
> in terms of functions.

Let me see if I understand what you mean. You are saying "I could invoke
an instance method just as I invoke a function, without using inheritance,
and reuse the method in another class". True, but this is clumsy: the point
of using a class is making use of inheritance for code reuse, otherwise I
could just use a function, isn't it? Besides, typically an instance
method is doing something to the 'self' argument and may have unwanted
side effects on the object; a function is safer in this respect.
In my experience, it takes a certain effort to code a reusable class;
it takes a much lesser effort to code a reusable function.

> The only reason it's good that Python library uses functions sometimes
> instead of classes is as you say: KISS. If it only needs a function,
> use a function.
>
> John

Yep.

Michele

John J. Lee

Aug 16, 2003, 6:47:11 PM
mi...@pitt.edu (Michele Simionato) writes:

> j...@pobox.com (John J. Lee) wrote in message news:<87ekzms...@pobox.com>...
> > mi...@pitt.edu (Michele Simionato) writes:
> > > To reuse classes is good, but typically only works when you know
> > > about the class you want to inherit *before* you start coding your
> > > own class; on the other hand, it is quite easy to add functions
> > > or methods to your class even *after* you wrote it.
> > [...]
> >
> > But I don't think that makes any sense. As I'm certain you know,
> > there would be nothing to stop you using a class in exactly the same
> > way you used the function, because reuse != inheritance. Functions
> > can be implemented in terms of classes, and classes can be implemented
> > in terms of functions.
>
> Let me see if I understand what you mean. You are saying "I could invoke
> an instance method just as I invoke a function, without using inheritance,
> and reuse the method in another class". True, but this is clumsy: the point

I don't think composition is clumsy.


> of using a class is making use of inheritance for code reuse, otherwise I
> could just use a function, isn't it?

Composition doesn't preclude inheritance in the class hierarchy you're
reusing by composition. In fact, that's usually what happens. And
extension of interfaces (which is what you were talking about,
presumably with implementation inheritance of the old methods) isn't
the only useful thing about classes -- associating functions with data
is useful on its own, as is plain old interface inheritance without
extension.


> Besides, typically an instance
> method is doing something to the 'self' argument and may have unwanted
> side effects on the object; a function is safer in this respect.

I don't understand.

class foo:
    def ni(self): print("ni")

class bar:
    def __init__(self): self.foo = foo()
    def NININI(self):
        for i in range(3): self.foo.ni()


> In my experience, it takes a certain effort to code a reusable class;
> it takes a much lesser effort to code a reusable function.

s/reusable class/inheritable class/, and add the proviso that it's
just as easy to reuse classes by composition as it is functions, and
I'd agree. But you knew that!


> > The only reason it's good that Python library uses functions sometimes
> > instead of classes is as you say: KISS. If it only needs a function,
> > use a function.
> >
> > John
>
> Yep.
>
> Michele


John
