Why do Pythoneers reinvent the wheel?

Stefano Masini

Sep 9, 2005, 11:17:38 AM

to pytho...@python.org

On 8 Sep 2005 08:24:50 -0700, Fuzzyman <fuzz...@gmail.com> wrote:
> What is pythonutils ?
> =====================
> ConfigObj - simple config file handling
> validate - validation and type conversion system
> listquote - string to list conversion
> StandOut - simple logging and output control object
> pathutils - for working with paths and files
> cgiutils - cgi helpers
> urlpath - functions for handling URLs
> odict - Ordered Dictionary Class

Fuzzyman, your post reminded me of something I can't stop thinking
about. Please don't take this as a criticism of your work. I place
myself on the same side as you.
I just wanted to share this thought with everybody who has an opinion about it.

I wonder how many people (including myself) have implemented their own
versions of such modules, at least once in their pythonic life. I
indeed have my own odict (even same name! :). My own pathutils
(different name, but same stuff). My own validate... and so forth.

This is just too bad.
There are a few areas where everybody seems to be implementing their
own stuff over and over: logging, file handling, ordered dictionaries,
data serialization, and maybe a few more.
I don't know what's the ultimate problem, but I think there are 3 main reasons:
1) poor communication inside the community (mhm... arguable)
2) lack of a rich standard library (I heard this more than once)
3) python is such an easy language that the "I'll do it myself" evil
side lying hidden inside each one of us comes up a little too often,
and keeps us from spending more time researching what's available.

It seems to me that this tendency is hurting python, and I wonder if
there is something that could be done about it. I once followed a
discussion about placing one of the available third party modules for
file handling inside the standard library. I can't remember its name
right now, but the discussion quickly became hot with considerations
about the module not being "right" enough to fit the standard library.
The points were right, but in some sense it's a pity because by being
in the stdlib it could have had a lot more visibility and maybe people
would have stopped writing their own, and would have begun using it.
Then maybe, if it was not perfect, people would have begun improving
it, and by now we would have a solid feature available to everybody.

mhm... could it be a good idea to have two versions of the stdlib? One
stable, and one testing, where stuff could be thrown in without being
too picky, in order to let the community decide and improve?

Again, Fuzzyman, your post was just the excuse to get me started. I
understand and respect your work, also because you made the remarkable
effort of making it publicly available.

That's my two cents,
stefano

Michael Amrhein

Sep 9, 2005, 12:06:28 PM

Stefano Masini wrote:

Did you take a look at PyPI (http://www.python.org/pypi) ?
At least you'd find another odict ...
;-) Michael

djw

Sep 9, 2005, 12:23:22 PM

Stefano Masini wrote:

> I don't know what's the ultimate problem, but I think there are 3 main reasons:
> 1) poor communication inside the community (mhm... arguable)
> 2) lack of a rich standard library (I heard this more than once)
> 3) python is such an easy language that the "I'll do it myself" evil
> side lying hidden inside each one of us comes up a little too often,
> and keeps us from spending more time researching what's available.
>

I think, for me, the most important reason is that the stdlib version
of a module doesn't always completely fill the requirements of the
project being worked on. That's certainly why I wrote my own, much
simpler, logging module. In this case, it's obvious that the original
author of the stdlib logging module had different ideas about how
straightforward and simple a logging module should be. To me, this just
demonstrates how difficult it is to write good library code - it has to
try and be everything to everybody without becoming overly general,
abstract, or bloated.

-Don

Dave Brueck

Sep 9, 2005, 12:48:40 PM

to pytho...@python.org

Stefano Masini wrote:
> I wonder how many people (including myself) have implemented their own
> versions of such modules, at least once in their pythonic life. I
> indeed have my own odict (even same name! :). My own pathutils
> (different name, but same stuff). My own validate... and so forth.
>
> This is just too bad.
> There are a few areas where everybody seems to be implementing their
> own stuff over and over: logging, file handling, ordered dictionaries,
> data serialization, and maybe a few more.

> I don't know what's the ultimate problem, but I think there are 3 main reasons:
> 1) poor communication inside the community (mhm... arguable)
> 2) lack of a rich standard library (I heard this more than once)
> 3) python is such an easy language that the "I'll do it myself" evil
> side lying hidden inside each one of us comes up a little too often,
> and keeps us from spending more time researching what's available.

IMO the reason is something similar to #3 (above and beyond #1 and #2 by a long
shot). The cost of developing _exactly_ what you need often is (or at least
*appears* to be) the same as or lower than bending to use what somebody else has
already built.

(my wheel reinvention has typically covered config files, logging, and simple
HTTP request/response/header processing)

> It seems to me that this tendency is hurting python

I think it helps on the side of innovation - the cost of exploring new ideas is
cheaper than in many other languages, so in theory the community should be able
to stumble upon truly great ways of doing things faster than would otherwise be
possible. The problem lies in knowing when we've found that really good way of
doing something, and then nudging more and more people to use it and refine it
without turning it into a bloated one-size-fits-all solution.

I think we have half of what we need - people like Fuzzyman coming up with handy
modules and then making them available for others to use. But right now it's
hard for a developer to wade through all the available choices out there and
know which one to pick.

Maybe instead of being included in the standard library, some modules could at
least attain some "recommended" status by the community. You can't exactly tell
people to stop working on their pet project because it's not special or
different enough from some other solution, so maybe the solution is to go the
other direction and single out some of the really best ones, and hope that the
really good projects can begin to gain more momentum.

For example, there are several choices available to you if you need to create a
standalone Windows executable; if it were up to me I'd label py2exe "blessed by
the BDFL!", ask the other tool builders to justify the existence of their
alternatives, and then ask them to consider joining forces and working on py2exe
instead. But of course I'm _not_ in charge, I don't even know if the BDFL likes
py2exe, and it can be really tough knowing which 1 or 2 solutions should receive
recommended status.

FWIW, RubyOnRails vs all the Python web frameworks is exactly what you're
talking about. What makes ROR great has little to do with technology as far as I
can tell, it's all about lots of people pooling their efforts - some of them
probably not seeing things develop precisely as they'd prefer, but remaining
willing to contribute anyway.

Many projects (Python-related or not) often seem to lack precisely what has
helped Python itself evolve so well - a single person with decision power who is
also trusted enough to make good decisions, such that when disagreements arise
they don't typically end in the project being forked (the number of times people
disagreed but continued to contribute to Python is far higher than the number of
times they left to form Prothon, Ruby, and so on).

In the end, domain-specific BDFLs and their projects just might have to bubble to
the top on their own, so maybe the best thing to do is find the project you
think is the best and then begin contributing and promoting it.

> and I wonder if
> there is something that could be done about it. I once followed a
> discussion about placing one of the available third party modules for
> file handling inside the standard library. I can't remember its name
> right now, but the discussion quickly became hot with considerations
> about the module not being "right" enough to fit the standard library.

I think an extremely rich standard library is both a blessing and a curse. It's
so handy to have what you need already there, but as you point out it becomes
quite a debate to know what should be added. For one, a module to be added needs
to be sufficiently broad in scope and power to be widely useful, but this often
breeds complexity (e.g. the logging package added in Py2.3 sure looks powerful,
but other than playing around with it for a few minutes I've never used it in a
real app because it's a little overwhelming and it seems easier to just use a
quickie logging function that does all I need).
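
As an illustration (a minimal sketch, not code posted in this thread), a
"quickie logging function" of the kind described above might look like the
following. The name `quick_log`, its parameters, and the line format are
assumptions made for the example.

```python
import sys
import time


def quick_log(message, stream=None, level="INFO"):
    """Write one timestamped line to a stream (stderr by default)."""
    if stream is None:
        stream = sys.stderr
    stamp = time.strftime("%Y-%m-%d %H:%M:%S")
    line = "%s [%s] %s\n" % (stamp, level, message)
    stream.write(line)
    return line  # returned so callers (and tests) can inspect it


if __name__ == "__main__":
    quick_log("application started")
    quick_log("disk almost full", level="WARNING")
```

In current Python versions, the stdlib route to roughly the same result is
`logging.basicConfig(level=logging.INFO)` followed by `logging.info(...)`.
The hand-rolled version trades handlers, filters, and per-module loggers
for something that fits on one screen.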

Having two versions of the standard lib probably wouldn't solve anything - you'd
still have debates about what goes in the "lite" version, but you'd also have
debates about what to include in the big version - maybe even more so.

-Dave

Stefano Masini

Sep 9, 2005, 1:02:39 PM

to pytho...@python.org

On 9/9/05, Michael Amrhein <mic...@adrhinum.de> wrote:
> Did you take a look at PyPI (http://www.python.org/pypi) ?
> At least you'd find another odict ...

Oh, yeah. And another filesystem abstraction layer... and another xml
serialization methodology... :)
PyPI is actually pretty cool. If I had to vote for something going
into a "testing" stdlib, I'd vote for PyPI.

You see, that's my point, we have too many! :)

stefano

Stefano Masini

Sep 9, 2005, 1:10:26 PM

to pytho...@python.org

On 9/9/05, djw <donald...@hp.com> wrote:
> I think, for me, the most important reason is that the stdlib version
> of a module doesn't always completely fill the requirements of the
> project being worked on. That's certainly why I wrote my own, much
> simpler, logging module. In this case, it's obvious that the original
> author of the stdlib logging module had different ideas about how
> straightforward and simple a logging module should be. To me, this just
> demonstrates how difficult it is to write good library code - it has to
> try and be everything to everybody without becoming overly general,
> abstract, or bloated.

That's very true. But...
...there are languages (ahem... did I hear somebody say Java? :) that
make it so hard to write code, that one usually prefers using whatever
is already available even if this means adopting a "style" that
doesn't quite match his expectations.
To me, it is not clear which is best: a very comfortable programmer
with a longer todo list, or an uncomfortable programmer with a short
todo list.
So far, I've always struggled to be in the first category, but I'm
amazed when I look back and see how many wheels I reinvented. But
maybe it's just lack of wisdom. :)

stefano

Stefano Masini

Sep 10, 2005, 1:55:24 AM

to pytho...@python.org

On 9/9/05, Dave Brueck <da...@pythonapocrypha.com> wrote:
> shot). The cost of developing _exactly_ what you need often is (or at least
> *appears* to be) the same as or lower than bending to use what somebody else has
> already built.

That's right. But as you say, this is _often_ the case, not always.
One doesn't necessarily need to "bend" too much in order to use
something that's available out there.
If we're talking about simple stuff, like ordered dictionaries, file
system management, ini files roundtripping, xml serialization (this
one maybe is not that trivial...), I don't think you would have to
come to big compromises.

I myself reinvented these wheels a few times in different projects,
because I wasn't happy with the way I reinvented the first time, then
I eventually found some code written by someone else that was
_exactly_ the same as my last attempt, my most evolved and "perfect",
my precioussss :), if it wasn't even better. Separate paths of
evolution that converged to the same solution, because the problem was
simple to begin with. Under this light, it seems to me that I wasted a
lot of time. If odict was in the stdlib I wouldn't have bothered
writing it.
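
To make "simple to begin with" concrete, here is a minimal sketch of an
ordered dictionary of the kind discussed in this thread. The class name
`OrderedDictSketch` and its choice of methods are assumptions made for
illustration; it is neither Fuzzyman's odict nor any poster's code. (Later
Python releases added `collections.OrderedDict` to the standard library,
and built-in dicts preserve insertion order as of Python 3.7.)

```python
class OrderedDictSketch(dict):
    """A dict that remembers the order in which keys were first inserted."""

    def __init__(self):
        super().__init__()
        self._order = []  # keys, in first-insertion order

    def __setitem__(self, key, value):
        if key not in self:
            self._order.append(key)
        super().__setitem__(key, value)

    def __delitem__(self, key):
        super().__delitem__(key)  # raises KeyError first if key is absent
        self._order.remove(key)

    def __iter__(self):
        return iter(self._order)

    def keys(self):
        return list(self._order)

    def values(self):
        return [self[key] for key in self._order]

    def items(self):
        return [(key, self[key]) for key in self._order]


d = OrderedDictSketch()
d["zebra"] = 1
d["apple"] = 2
d["mango"] = 3
d["zebra"] = 10   # updating a key keeps its original position
del d["apple"]
print(d.items())  # [('zebra', 10), ('mango', 3)]
```

The sketch deliberately leaves out `update`, `pop`, `setdefault`, and
friends, which would bypass the order list as written. Filling in those
corners is where independent implementations tend to diverge.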

And yet, this code is not available in the stdlib. Sometimes it's not
even trivial to google for. Plus, if you think of a python
beginner, what's the chance that he's gonna say: naa, this code in the
library sucks. I'm gonna search google for another ini file round
tripper. Whatever is available there, he's gonna use, at least in the
beginning. Then he will soon figure out that it indeed sucks, and at
that point there's a chance that he'll say: man... _python_ sucks! I
cannot even round trip an ini file with the same module!
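
For readers unfamiliar with the term: "round-tripping" means reading a
file and writing it back without losing anything. A minimal sketch using
the standard library's `configparser` (the Python 3 module name) shows what
gets lost along the way. The sample file content and the helper name
`round_trip` are made up for the example.

```python
import configparser
import io

SAMPLE = """\
[server]
# keep me: the port was changed for the firewall
port = 8080
host = example.org
"""


def round_trip(text):
    """Parse INI text and serialize it again, returning the new text."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    out = io.StringIO()
    parser.write(out)
    return out.getvalue()


result = round_trip(SAMPLE)
print(result)
print("comment survived:", "# keep me" in result)  # comment survived: False
```

The option values survive the trip, but the comment does not, because the
parser discards comments while reading. Preserving them is the kind of
feature that third-party modules such as ConfigObj set out to provide.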

That's why I say this lack of a centralized, officially recommended
code repository maybe is hurting python.

I agree that building library code is hard because it has to be both
correct and simple. But, again, there's a lot of useful stuff not in the
library that's simple to begin with, so it's just a matter of writing
it correctly. If the semantics can be different, just provide a couple
of alternatives, and history will judge.

It would be great if there was a section in the Python manual like this:

"Quick and Dirty: Commonly needed tricks for real applications"

1. odict
2. file system management
3. xml (de)serialization
4. ...

Each section would describe the problem and list one or a few
recommended implementations. All available under __testing_stdlib__.
Appoint somebody as the BDFL and make the process of updating the
testing stdlib democratic enough to allow for more evolution freedom
than the stable stdlib.

If such a "quick and dirty" section existed, I think it would also
become a natural rendezvous point for innovators. If one invented a
cool new module, simple and useful, rather than just publishing it in his
own public svn repository, he would feel comfortable starting a
discussion on the python-testing-stdlib mailing list to suggest
including it in the "quick and dirty" section of the manual. The manual
is the primary resource that every python programmer makes use of,
especially beginners. But it is so official that no one would ever
dare suggest including something in it. If the Vaults of Parnassus
were listed in there (maybe a bit trimmed and evaluated first ;) a
beginner would have immediate access to the most common tricks that
one soon faces when it comes to writing real applications.

I'm talking wildly here... I'm quite aware of how simplistic I made it.
Just throwing out an idea.

What do you think?

stefano

Tim Daneliuk

Sep 10, 2005, 2:10:59 AM

Stefano Masini wrote:

<SNIP>

> I wonder how many people (including myself) have implemented their own
> versions of such modules, at least once in their pythonic life. I
> indeed have my own odict (even same name! :). My own pathutils
> (different name, but same stuff). My own validate... and so forth.

As someone who implemented their own configuration mini-language
with validation, blah, blah, blah (http://www.tundraware.com/Software/tconfpy/)
I can give you a number of reasons - all valid for different people at
different times:

1) The existing tool is inadequate for the task at hand and OO subclassing
is overrated/overhyped to fix this problem. Even when you override
base classes with your own stuff, you're still stuck with the larger
*architecture* of the original design. You really can't subclass
your way out of that, hence new tools to do old things spring into
being.

2) It's a learning exercise.

3) You don't trust the quality of the code for existing modules.
(Not that *I* have this problem :-p but some people might.)

--
----------------------------------------------------------------------------
Tim Daneliuk tun...@tundraware.com
PGP Key: http://www.tundraware.com/PGP/

Stefano Masini

Sep 10, 2005, 2:53:24 AM

to pytho...@python.org

On 10 Sep 2005 02:10:59 EDT, Tim Daneliuk <tun...@tundraware.com> wrote:
> As someone who implemented their own configuration mini-language
> with validation, blah, blah, blah (http://www.tundraware.com/Software/tconfpy/)

Well, a configuration mini language with validation and blahs is not
exactly what I would call _simple_... :) so maybe it doesn't even fit
into my idea of testing-stdlib, or "quick and dirty" section of the
manual (see my other post).
But certainly it would be worth mentioning in the list of available
solutions under the subsection "Configuration files handling".

> 1) The existing tool is inadequate for the task at hand and OO subclassing
> is overrated/overhyped to fix this problem. Even when you override
> base classes with your own stuff, you're still stuck with the larger
> *architecture* of the original design. You really can't subclass
> your way out of that, hence new tools to do old things spring into
> being.

That's true, but usually only when the original design is too simple
compared to the complexity of the problem. A very general solution,
instead, can usually be subclassed to easily handle a simpler problem.
You still have to actually understand the general and complex design
in order to be able to write subclasses, so one may be tempted
to punt on it and write one's own simple solution. But in this case it
would be enough to propose a few solutions in the testing-stdlib:
a) one simple implementation for simple problems, easy to understand,
but limited.
b) one complex implementation for complex problems,
c) one simplified implementation for simple problems, easy to
understand, but subclassed from a complex model, that leaves room for
more understanding and extension just in case one needs more power.
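
Option (c) can be sketched as follows, using configuration parsing as the
example domain. This is a minimal illustration under stated assumptions:
the class names `ConfigParserBase` and `SimpleConfig`, the `convert` hook,
and the "key = value" format are all hypothetical and do not come from any
module mentioned in this thread.

```python
class ConfigParserBase:
    """General model: subclasses may change comment markers, the
    separator, and how raw values are validated or converted."""

    comment_prefixes = ("#", ";")
    separator = "="

    def parse(self, text):
        result = {}
        for number, raw in enumerate(text.splitlines(), start=1):
            line = raw.strip()
            if not line or line.startswith(self.comment_prefixes):
                continue  # skip blank lines and comments
            if self.separator not in line:
                raise ValueError("line %d: missing %r" % (number, self.separator))
            key, value = line.split(self.separator, 1)
            key = key.strip()
            result[key] = self.convert(key, value.strip())
        return result

    def convert(self, key, value):
        """Hook for validation/conversion. Default: keep the string."""
        return value


class SimpleConfig(ConfigParserBase):
    """Option (c): a thin, easy-to-read subclass of the general model."""

    def convert(self, key, value):
        # Turn plain integers into int; leave everything else as a string.
        try:
            return int(value)
        except ValueError:
            return value


config = SimpleConfig().parse("# demo\nport = 8080\nhost = example.org\n")
print(config)  # {'port': 8080, 'host': 'example.org'}
```

A user who outgrows `SimpleConfig` can override the same hook or the class
attributes without switching libraries, which is the room for extension
that option (c) is meant to leave open.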

I fully understand the difficulty of reusing code, as it always forces
you up a learning curve and into compromises. But I've also
wasted a lot of time reinventing the wheel and later found stuff I
could have happily lived with if I only had known.

> 2) It's a learning exercise.

Well, so we might as well learn a little more and rewrite os.path, the
time module and pickle. Right? :)

> 3) You don't trust the quality of the code for existing modules.
> (Not that *I* have this problem :-p but some people might.)

That's a good point, but it really boils down to being a wise
programmer on one side, being able to discern the Good from the Bad,
and an active community on the other side, able to provide good
solutions and improve them.
If either one is missing, then a lot of bad stuff can happen, and we
can't really make community decisions based on the assumption that
programmers won't be able to understand, or that the community won't
be able to provide. So we might as well assume that we have good
programmers and an active community.
Which I think is true, by the way!
So, let's talk about a way to more effectively present available
solutions to our good programmers! :)

cheers,
stefano

Tim Daneliuk

Sep 10, 2005, 3:16:02 AM

Stefano Masini wrote:

> On 10 Sep 2005 02:10:59 EDT, Tim Daneliuk <tun...@tundraware.com> wrote:
>
>>As someone who implemented their own configuration mini-language
>>with validation, blah, blah, blah (http://www.tundraware.com/Software/tconfpy/)
>
>
> Well, a configuration mini language with validation and blahs is not
> exactly what I would call _simple_... :) so maybe it doesn't even fit

It's actually not *that* complicated. Then again, the code is not
as elegant as it might be.

>
>>1) The existing tool is inadequate for the task at hand and OO subclassing
>> is overrated/overhyped to fix this problem. Even when you override
>> base classes with your own stuff, you're still stuck with the larger
>> *architecture* of the original design. You really can't subclass
>> your way out of that, hence new tools to do old things spring into
>> being.
>
>
> That's true, but usually only when the original design is too simple
> compared to the complexity of the problem. A very general solution,
> instead, can usually be subclassed to easily handle a simpler problem.
> You still have to actually understand the general and complex design
> in order to be able to write subclasses, so one may be tempted
> to punt on it and write one's own simple solution. But in this case it

The problem is that for a lot of interesting problems, you don't know
the "generic" big-picture stuff until you've hacked around at small
specific examples. This is one of the deepest flaws in the gestalt of
OO, IMHO. Good OO requires just what you suggest - an understanding of
generics, specific applications, and just what to factor. But in the
early going of new problems, you simply don't know enough. For the
record, I think Python is magnificent both in allowing you to work
quickly in the "poking around" stage of things, and then later to create
the more elegant fully-featured architectures.

One other point here: In the commercial world, especially, software
tends to be a direct reflection of the organization's *processes*.
Commercial institutions distinguish themselves from one another (in an
attempt to create competitive advantage) by customizing and tuning these
business processes - well, the successful companies do, anyway. For
example, Wal-Mart is really a supply chain management company, not a
consumer goods retailer. It is their supply chain expertise and IT
systems that have knocked their competitors well into 2nd place. And
here's the important point: These distinguishing business processes are
unique and proprietary *by intent*. This means that generic software
frameworks are unlikely to serve them well as written. I realize this is
all at a level of complexity above what you had in mind, but it's easy
to forget that a significant portion of the world likes/needs/benefits
from things that are *not* particularly generic. This is thus reflected
in the software they write.

>>2) It's a learning exercise.
>
>
> Well, so we might as well learn a little more and rewrite os.path, the
> time module and pickle. Right? :)

I'm not deeply committed to that level of education at the moment :P

>
>
>>3) You don't trust the quality of the code for existing modules.
>> (Not that *I* have this problem :-p but some people might.)

> So, let's talk about a way to more effectively present available
> solutions to our good programmers! :)

Grappa?

Kay Schluehr

Sep 10, 2005, 4:14:15 AM

Tim Daneliuk wrote:

> 1) The existing tool is inadequate for the task at hand and OO subclassing
> is overrated/overhyped to fix this problem. Even when you override
> base classes with your own stuff, you're still stuck with the larger
> *architecture* of the original design. You really can't subclass
> your way out of that, hence new tools to do old things spring into
> being.

Although I do think that you are completely wrong in principle, there
is some truth in your statement: refactoring a foreign, ill-designed
tool that nevertheless provides some nice functionality but is
not meant to be extended by third-party developers is often
harder than writing a nice, even if inextensible, tool on your
own. That's independent of the language, although I tend to think that
C and Python programmers are more alike in their crude pragmatism than
Java or Haskell programmers (some might object that it is a bit unfair
to equate Java and Haskell programmers, because no one ever claimed
that the latter need code generators and no intelligence to do their
work).

Kay

Stefano Masini

Sep 10, 2005, 4:38:52 AM

to pytho...@python.org

On 10 Sep 2005 03:16:02 EDT, Tim Daneliuk <tun...@tundraware.com> wrote:
> frameworks are unlikely to serve them well as written. I realize this is
> all at a level of complexity above what you had in mind, but it's easy
> to forget that a significant portion of the world likes/needs/benefits
> from things that are *not* particularly generic. This is thus reflected
> in the software they write.

In my opinion this has more to do with the open source vs.
proprietary debate, which I'd rather not get into, since it's
somewhat marginal.

What I was pointing out is well summarized in the subject: Why do
Pythoneers reinvent the wheel?
Reinventing the wheel (too much) is Bad for both the open source
community and industry. It's bad for development in general. I got the
feeling that in the specific case of Python the ultimate reason for
this tendency is also the same reason why this language is so much
better than others for, say, fast prototyping and exploration of new
ideas: it's simple.

So, without taking anything out of python, I'm wondering if a richer
and less formal alternative standard library would help form a
common ground from which programmers could start in order to build
better and reinvent less.

If such an aid to _general_ problem solving is indeed missing (I might
be wrong) from the current state of python, I don't really think the
reason is related to industry. I would look for reasons elsewhere,
like it being difficult to come up with effective decision-making
in an open source community, or something like this. I can certainly
see the challenge of who should decide what goes in the library and
what doesn't, and how.

stefano

Tim Daneliuk

Sep 10, 2005, 5:36:08 AM

Kay Schluehr wrote:

> Tim Daneliuk wrote:
>
>
>>1) The existing tool is inadequate for the task at hand and OO subclassing
>> is overrated/overhyped to fix this problem. Even when you override
>> base classes with your own stuff, you're still stuck with the larger
>> *architecture* of the original design. You really can't subclass
>> your way out of that, hence new tools to do old things spring into
>> being.
>
>
> Although I do think that you are completely wrong in principle, there
> is some truth in your statement: refactoring a foreign, ill-designed
> tool that nevertheless provides some nice functionality but is
> not meant to be extended by third-party developers is often
> harder than writing a nice, even if inextensible, tool on your own

It has nothing to do with being "ill designed", though that too would
pose a (different) problem. It has to do with the fact that all
real-world tools are a tradeoff between pragmatism and generic elegance.
This tradeoff yields a tool/module/library/program with some POV about
what problem it was solving. If the problem you wish to solve is not in
that same space, you can inherit, subclass and do all the usual OO
voodoo you like, you're not going to get clean results.

On a more general note, for all the promises made over 3 decades about
how OO was the answer to our problems, we have yet to see quantum
improvements in code quality and productivity even though OO is now "the
thing" everyone is supposed to subscribe to. In part, that's because it
is profoundly difficult to see the most generic/factorable pieces of a
problem until you've worked with it for a long time. Once you get past
the "a mammal is an animal" level of problems, OO starts to
self-destruct pretty quickly as the inheritance hierarchies get so
complex no mere mortal can grasp them all. This is exactly Java's
disease at the moment. It has become a large steaming pile of object
inheritance which cannot be completely grokked by a single person. In
effect, the traditional problem of finding algorithms of appropriate
complexity gets transformed into a "what should my inheritance hierarchy
be" problem.

IMHO, one of Python's greatest virtues is its ability to shift paradigms
in mid-program so that you can use the model that best fits your problem
space. IOW, Python is an OO language that doesn't jam it down your
throat, you can mix OO with imperative, functional, and list processing
coding models simultaneously.

In my view, the doctrinaire, indeed religious, adherence to OO purity
has harmed our discipline considerably. Python was a nice breath of
fresh air when I discovered it exactly because it does not have this
slavish commitment to an exclusively OO model.

Tim Daneliuk

Sep 10, 2005, 5:55:42 AM

Stefano Masini wrote:

> On 10 Sep 2005 03:16:02 EDT, Tim Daneliuk <tun...@tundraware.com> wrote:
>
>>frameworks are unlikely to serve them well as written. I realize this is
>>all at a level of complexity above what you had in mind, but it's easy
>>to forget that a significant portion of the world likes/needs/benefits
>>from things that are *not* particularly generic. This is thus reflected
>>in the software they write.
>
>
> In my opinion this has more to do with the open source vs.
> proprietary debate, which I'd rather not get into, since it's
> somewhat marginal.

I think the point I was trying to make was there are times when
a generic factoring of reusable code is unimportant since the code
is so purpose-built that doing a refactoring makes no sense.

>
> What I was pointing out is well summarized in the subject: Why do
> Pythoneers reinvent the wheel?
> Reinventing the wheel (too much) is Bad for both the open source
> community and industry. It's bad for development in general. I got the

I don't share your conviction on this point. Reinventing the wheel
makes the wheel smoother, lighter, stronger, and rounder. Well,
it *can* do this. Of far greater import (I think) is whether
any particular implementation is fit to run across a breadth of
platforms. To me, a significant benefit of Python is that I am
mostly able to program the same way across Unix, Windows, Mac
and so on.

<SNIP>

> If such an aid to _general_ problem solving is indeed missing (I might
> be wrong) from the current state of python, I don't really think the
> reason is related to industry. I would look for reasons elsewhere,
> like it being difficult to come up with effective decision-making
> in an open source community, or something like this. I can certainly
> see the challenge of who should decide what goes in the library and
> what doesn't, and how.

This is too abstract for me to grasp - but I admit to being old and feeble ;)

I think what you see today in the standard library are two core ideas:
1) Modules that are more-or-less pass-through wrappers for the common
APIs found in Unix and 2) Modules needed commonly to "do the things that
applications do" like manipulate data structures or preserve active
objects on backing store. If what you want here is for everyone to agree
on a common set of these and stick exclusively to them, I think you will
be sorely disappointed. OTOH, if someone has a better/faster/smarter
reimplementation of what exists, I think you'd find the community open
to embracing incremental improvement. But there is always going to be
the case of what happened when I wrote 'tconfpy'. The existing
configuration module was nice, but nowhere near the power of what I
wanted, so I wrote something that suited me exactly (well ... sort of,
'tconfpy2' is in my head at the moment). If the community embraced
it as a core part of their work, I'd be delighted (and surprised), but
I don't need for that to happen in order for that module to have value
to *me*, even though it does not displace the existing stuff.

A.M. Kuchling

Sep 10, 2005, 7:27:41 AM

On Sat, 10 Sep 2005 08:53:24 +0200,

Stefano Masini <ste...@pragma2000.com> wrote:
> Well, so we might as well learn a little more and rewrite os.path, the
> time module and pickle. Right? :)

And in fact people have done all of these:
os.path: path.py (http://www.jorendorff.com/articles/python/path/)
time: mxDateTime, the stdlib's datetime.
pickle: XML serialization, YAML.

> So, let's talk about a way to more effectively present available
> solutions to our good programmers! :)

PEP 206 (http://www.python.org/peps/pep-0206.html) suggests assembling an
advanced library for particular problem domains (e.g. web programming,
scientific programming), and then providing a script that pulls the relevant
packages off PyPI. I'd like to hear suggestions of application domains and
of the packages that should be included.

--amk

Fuzzyman

Sep 10, 2005, 10:15:30 AM

> > [snip..]

> Did you take a look at PyPI (http://www.python.org/pypi) ?
> At least you'd find another odict ...

Oh right. Where ?

I remember when I started coding in Python (about two years ago) in one
of my first projects I ended up re-implementing some stuff that is in
the standard library. The standard library is *fairly* big - but the
'Python blessed' modules idea sounds good.

I've often had the problem of having to assess multiple third party
libraries/frameworks and decide which of several alternatives is going
to be best for me - without really having the information on which to
base a decision (nor the time to try them all out). Web templating
and web application frameworks are particularly difficult in this area.

If a module is in the standard library then *most* developers will
*first* use that - and only if it's not suitable look for something
else.

All the best,

Fuzzyman
http://www.voidspace.org.uk/python

> ;-) Michael

Martin P. Hellwig

Sep 10, 2005, 10:55:51 AM

Stefano Masini wrote:
<cut reinventing wheel example>

Although I'm not experienced enough to comment on python stuff itself I
do know that in general there are 2 reasons that people reinvent the wheel:
- They didn't know of the existence of the first wheel
- They have different roads
Those reasons can even be combined.

The more difficult it is to create a new wheel the bigger the chance is
that you:
- Search longer for fitting technologies
- Adapt your road

--
mph

Paul Boddie

Sep 10, 2005, 11:01:41 AM

A.M. Kuchling wrote:
> PEP 206 (http://www.python.org/peps/pep-0206.html) suggests assembling an
> advanced library for particular problem domains (e.g. web programming,
> scientific programming), and then providing a script that pulls the relevant
> packages off PyPI. I'd like to hear suggestions of application domains and
> of the packages that should be included.

I'm not against pointing people in what I consider to be the right
direction, but PEP 206 seems to be quite the lobbying instrument for
people to fast-track pet projects into the standard library (or some
super-distribution), perhaps repeating some of the mistakes cited in
that document with regard to suitability. Meanwhile, in several areas,
some of the pressing needs of the standard library would remain
unaddressed if left to some kind of Pythonic popularity contest; for
example, everyone likes to dish out their favourite soundbites and
insults about DOM-based XML APIs, but just leaving minidom in the
library in slow motion maintenance mode whilst advocating more
"Pythonic" APIs doesn't help Python's interoperability with (or
relevance to) the wider development community.

The standard library is all about providing acceptable solutions so
that other people aren't inclined or forced to write their own. Every
developer should look at their repertoire of packages and consider
which ones wouldn't need to exist if the standard library had been
better. For me, if there had been a decent collection of Web
application objects in the standard library, I wouldn't have created
WebStack; if I didn't have to insist on PyXML and then provide patches
for it in order to let others run software I created, I wouldn't have
created libxml2dom.

PEP 206 is an interesting idea but "dangerous" because as a PEP it
promotes a seemingly purely informational guide to some kind of edict,
and (speaking from experience) since a comprehensive topic guide to any
reasonable number of packages and solutions is probably too much work
that no-one really wants to do anyway, the likelihood of subjective
popularity criteria influencing the selection of presented software
means that the result may be considerably flawed. Although I see that a
common trend these days is to form some kind of narrow consensus, hype
it repeatedly and, in the name of one cause, to push another agenda
entirely, all whilst ignoring the original problems that got people
thinking in the first place, I am quite sure that as a respected Python
contributor this was not a goal of yours in writing the PEP. However,
we should all be aware of the risks of picking favourites, even if the
level of dispute around those favourites is likely to be much lower for
some packages than for others.

This overly harsh criticism really brings me to ask: what happened to
the maintenance and promotion of the python.org topic guides? Or do
people only read PEPs these days?

Paul

Mike Meyer

Sep 10, 2005, 12:35:37 PM

Stefano Masini <ste...@pragma2000.com> writes:
> It would be great if there was a section in the Python manual like this:
>
> "Quick and Dirty: Commonly needed tricks for real applications"
>
> 1. odict
> 2. file system management
> 3. xml (de)serialization
> 4. ...
>
> Each section would describe the problem and list one or a few
> recommended implementations. All available under __testing_stdlib__.
> Appoint somebody as the BDFL and make the process of updating the
> testing stdlib democratic enough to allow for more evolution freedom
> than the stable stdlib.

Why are you reinventing the wheel?

Things that are very similar to this already exist. There are a number
of code repositories/listings around - PyPI being the obvious example,
because it's linked to from python.org. The Python cookbook site is
closer to what you're talking about, though.

And the latter reveals the problem with your suggestion - this
"section" would pretty much outweigh the rest of the
documentation. Clearly, it should be a separate book - but we've
already got that.

However, a list of these isn't in the manual. The only thing I could
find was a page on the Wiki about "How to publish a module". That
doesn't seem likely to be found by someone trying to find code.

I think the manual does need a section on how to find code other than
the library. But where do you put it?

<mike
--
Mike Meyer <m...@mired.org> http://www.mired.org/home/mwm/
Independent WWW/Perforce/FreeBSD/Unix consultant, email for more information.

Aahz

Sep 10, 2005, 12:39:35 PM

In article <97mav2-...@eskimo.tundraware.com>,

Tim Daneliuk <tun...@tundraware.com> wrote:
>
>IMHO, one of Python's greatest virtues is its ability to shift paradigms
>in mid-program so that you can use the model that best fits your problem
>space. IOW, Python is an OO language that doesn't jam it down your
>throat, you can mix OO with imperative, functional, and list processing
>coding models simultaneously.
>
>In my view, the doctrinaire, indeed religious, adherence to OO purity
>has harmed our discipline considerably. Python was a nice breath of
>fresh air when I discovered it exactly because it does not have this
>slavish commitment to an exclusively OO model.

+1 QOTW
--
Aahz (aa...@pythoncraft.com) <*> http://www.pythoncraft.com/

The way to build large Python applications is to componentize and
loosely-couple the hell out of everything.

Aahz

Sep 10, 2005, 12:42:17 PM

In article <mailman.173.1126284...@python.org>,

Dave Brueck <da...@pythonapocrypha.com> wrote:
>
>Many projects (Python-related or not) often seem to lack precisely
>what has helped Python itself evolve so well - a single person with
>decision power who is also trusted enough to make good decisions, such
>that when disagreements arise they don't typically end in the project
>being forked (the number of times people disagreed but continued to
>contribute to Python is far higher than the number of times they left
>to form Prothon, Ruby, and so on).
>
>In the end, domain-specific BDFLs and their projects just might have
>to bubble to the top on their own, so maybe the best thing to do is
>find the project you think is the best and then begin contributing and
>promoting it.

You've got a point there -- reStructuredText seems to be succeeding
precisely in part because David Goodger is the BDFNow.

Bengt Richter

Sep 10, 2005, 3:20:42 PM

On Sat, 10 Sep 2005 16:55:51 +0200, "Martin P. Hellwig" <mhel...@xs4all.nl> wrote:

>Stefano Masini wrote:
><cut reinventing wheel example>
>
>Although I'm not experienced enough to comment on python stuff itself I
>do know that in general there are 2 reasons that people reinvent the wheel:
>- They didn't know of the existence of the first wheel
>- They have different roads

- They want the feeling that they are in the same league as the original inventor ;-)

>Those reasons can even be combined.
>
>The more difficult it is to create a new wheel the bigger the chance is
>that you:
>- Search longer for fitting technologies
>- Adapt your road

- Think more carefully about ego satisfaction cost/benefit vs getting the job done ;-)

Regards,
Bengt Richter

Tim Daneliuk

Sep 10, 2005, 6:57:07 PM

Dennis Lee Bieber wrote:

> On 10 Sep 2005 05:36:08 EDT, Tim Daneliuk <tun...@tundraware.com>
> declaimed the following in comp.lang.python:
>
>> On a more general note, for all the promises made over 3 decades about
>> how OO was the answer to our problems, we have yet to see quantum
>
> OO goes back /that/ far? (2 decades, yes, I might even go 2.5
> decades for academia <G>). My college hadn't even started "structured
> programming" (beyond COBOL's PERFORM statement) by the time I graduated
> in 1980. Well, okay... SmallTalk... But for most of the "real world", OO
> became a known concept with C++ mid to late 80s.

OO ideas predate C++ considerably. The idea of encapsulation and
abstract data types goes back to the 1960s IIRC. I should point
out that OO isn't particularly worse than other paradigms for
claiming to be "The One True Thing". It's been going on for
almost a half century. I've commented on this previously:

http://www.tundraware.com/Technology/Bullet/

François Pinard

Sep 10, 2005, 8:24:32 PM

to Tim Daneliuk, pytho...@python.org

[Tim Daneliuk]

> OO ideas predate C++ considerably. The idea of encapsulation and
> abstract data types goes back to the 1960s IIRC.

Did not Simula-67 have it all already?

When C++ came along, much later, I asked someone knowledgeable in the
field of language design what his opinion of C++ was. He answered
very laconically: Simula--. And this was not far from the truth:
Simula had many virtues which are still missing from C++.

Moreover, a language like Simula cannot be made out of thin air; it only
crystallizes a long maturation of many trends. The term "OO" may have
been coined later, but the concepts were already there. In computer
science, I have often seen old concepts resurrected under new names, and
then mistaken for recent inventions. New ideas are not so frequent...

--
François Pinard http://pinard.progiciels-bpi.ca

Steve Holden

Sep 11, 2005, 1:21:17 AM

to pytho...@python.org

Indeed, the simple answer to the original question is "because they
can". Python as a language attracts many people who aren't already
familiar with programming methods, which is why this list sees many
questions with relatively simple answers. I love the way the responses
determinedly refuse to put the questioners down for the simplicity of
the questions: we all have to learn, after all.

Generally as we get more experienced in programming we will spend a much
larger amount of time looking for (and carefully evaluating) existing
solutions to a problem, and rather less time trying to write our own
code to solve a problem.

Python's elegance and simplicity encourage people with less programming
experience to attempt solutions to larger problems, albeit with varying
degrees of success. So, despite the language's "There should be one (and
preferably only one) obvious way to do it" philosophy, we often end up
with many "competing" solutions to a given problem.

While this can sometimes be tedious, it's probably an overall indication
of Python's health.

regards
Steve
--
Steve Holden +44 150 684 7255 +1 800 494 3119
Holden Web LLC http://www.holdenweb.com/

Bruno Desthuilliers

Sep 11, 2005, 1:07:17 PM

Stefano Masini wrote:
(snip)

> If such a "quick and dirty" section existed, I think it would also
> become a natural randevouz point for innovators.

s/randevouz/rendez-vous/ !-)

pardon-my-french-ly y'rs

Gregory Bond

Sep 12, 2005, 1:57:52 AM

François Pinard wrote:
> In computer
> science, I often saw old concepts resurrecting with new names, and then
> mistaken for recent inventions. New ideas are not so frequent...
>

"There are very few problems in Computer Science that cannot be solved
with an additional level of indirection."

-- Dunno who said it first, but I wish it was me.

A.M. Kuchling

Sep 12, 2005, 8:40:58 AM

On Sat, 10 Sep 2005 12:35:37 -0400,
Mike Meyer <m...@mired.org> wrote:
> I think the manual does need a section on how to find code other than
> the library. But where do you put it?

The tutorial's final section (http://docs.python.org/tut/node14.html)
mentions PyPI. A link to the ASPN cookbook should also be added; anything
else that should go in this section of the tutorial?

Similar text could be added to the introductory section of the library
reference, but I doubt that many users would see it because people probably
dive into the LibRef for a particular module instead of reading it straight
through.

--amk

Claudio Grondi

Sep 12, 2005, 2:40:59 PM

Here are some of my thoughts on this subject:

I think that this question addresses only a tiny
aspect of a much more general problem that the
entire human race has in every area.
Reinventing the wheel begins when the grandpa
starts to teach his grandchild, remembering well
that he has already done it many times before
with his own children.

Each of us takes the chance to be somehow
different, and writing program code requires a good
understanding of how and why something works,
so it is very probable that what already exists is hard
to understand (Python as a programming language
is not really an exception here). There is not much
around of such universal value that it can be
picked up at any time by anyone.

When I come back to what I have created
in the past, I often see what trash I have produced.
The rare discoveries of the kind "wow! today I would
do it the same way, or even less smartly if not given enough
time" don't change the general picture.

So the question here is: where are the tools that make
it possible to find a piece of code solving a
problem when the problem is formulated only
in natural language? I find myself reinventing
the wheel all the time only because I am not able to find
appropriate pieces of code in the collection I have put
together (am I alone here? bad memory? lack of
a proper filing system?).

It seems that posting to a newsgroup is usually
the best choice, but even this needs much work
in advance before it is possible to communicate what
the problem is.
In the case of the OpenCV interface to Python even that
seems not to help ... (I am pretty sure there is someone
out there who would be able to point me in the right
direction).

Are there any association-based tools in Python
for looking up specific types of code?
Something similar to http://www.qknow.com , but addressed
towards the specific needs of a programmer looking for code
snippets? (not a kind of search engine or system of
folders with cross links, but a system able to find a chain
of snippets required to solve a problem).

To name the simplest example:
What should I do to find a piece of code taking an
integer and giving a string with the binary form of the
number? How do I put some available pieces of code
together if the binary form is needed and the integer
is provided as a string holding its hexadecimal form?
What if the string is the binary representation of the
integer value as internally stored in memory?
What if I would like the binary form to be split
into nibbles separated with one space and bytes with
two spaces?

How can I avoid reinventing the wheel when I
don't have the tools to find what I am looking
for?

Saying it in words of the Beatles song:
"Help me if you can, I'm feeling down.
And I do appreciate you being round.
Help me, get my feet back on the ground,
Won't you please, please help me,
help me, help me, oh. "

Claudio

"Stefano Masini" <ste...@pragma2000.com> wrote in message
news:mailman.167.1126279...@python.org...

It seems to me that this tendency is hurting python, and I wonder if
there is something that could be done about it. I once followed a
discussion about placing one of the available third party modules for
file handling inside the standard library. I can't remember its name
right now, but the discussion quickly became hot with considerations
about the module not being "right" enough to fit the standard library.
The points were right, but in some sense it's a pity because by being
in the stdlib it could have had a lot more visibility and maybe people
would have stopped writing their own, and would have begun using it.
Then maybe, if it was not perfect, people would have begun improving
it, and by now we would have a solid feature available to everybody.

mhm... could it be a good idea to have two versions of the stdlib? One
stable, and one testing, where stuff could be thrown in without being
too picky, in order to let the community decide and improve?

Again, Fuzzyman, your post was just the excuse to get me started. I
understand and respect your work, also because you put the remarkable
effort to make it publicly available.

That's my two cents,
stefano

Magnus Lycka

Sep 14, 2005, 7:50:07 AM

Claudio Grondi wrote:
> To name a simplest example:
> What should I do to find a piece of code taking an
> integer and giving a string with binary form of a
> number? How to put some available pieces of code
> together if the binary form is needed and the integer
> is provided as a string holding its hexadecimal form?
> What if the string is the binary representation of the
> integer value as internally stored in memory?
> What if I would like the binary form to be splitted
> in nibbles separated with one space and bytes with
> two spaces?

It's possible that you have a point in principle,
but these examples don't strengthen your point.

A function that turns e.g. 5 into '101' is trivial
and just a few lines of code. Finding that in some
kind of code catalog would certainly be more work
than to just code it. Besides, there are a number
of variants here, so a variant that makes everybody
happy when it concerns dealing with negative numbers,
range checks, possibly filling with zeros to a certain
length etc, would probably be both bigger and slower
than what the average Joe needs.
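
To give an idea of how small such a function is, here is a minimal
sketch. It assumes non-negative integers only; the name to_binary and
the optional width parameter (zero padding) are made up for this
illustration and do not come from any library. Current Python versions
also offer bin() and format(n, 'b') for the same job.

```python
def to_binary(n, width=0):
    """Return the binary digits of a non-negative integer as a string.

    Hypothetical helper for illustration; `width` left-pads with zeros.
    """
    if n < 0:
        raise ValueError("negative numbers are not handled in this sketch")
    digits = []
    while True:
        n, bit = divmod(n, 2)   # peel off the lowest bit
        digits.append(str(bit))
        if n == 0:
            break
    return "".join(reversed(digits)).rjust(width, "0")


print(to_binary(5))       # 101
print(to_binary(5, 8))    # 00000101
```

Negative numbers, range checks and other padding rules are exactly the
variants mentioned above, and each caller would likely want them handled
differently.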

This is simply the wrong level of reuse. It's too
simple and too varied. To be able to express things
like that in code is very basic programming.

You create integers from numeric representations in
strings with the int() function. You should read
chapter 2 in the library reference again, Claudio.
This is one of the most common builtin functions.
int() accepts all bases you are likely to use and
then some.

Filtering out spaces is again trivial.
"0101 0110".replace(' ','') Also chapter 2 in the
library manual.
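
Putting those pieces together for the hex-string case might look roughly
like this. It is only a sketch under stated assumptions: the helper name
grouped_binary is invented here, the input is taken to be a non-negative
hexadecimal string, and it relies on format() and int.bit_length() as
found in current Python versions.

```python
def grouped_binary(hex_text):
    """Binary form of a hex string: nibbles split by one space, bytes by two.

    Hypothetical helper for illustration only; assumes a non-negative value.
    """
    value = int(hex_text, 16)                 # int() parses base 16 (and others)
    n_nibbles = max(1, (value.bit_length() + 3) // 4)
    n_nibbles += n_nibbles % 2                # pad to whole bytes
    bits = format(value, "0%db" % (n_nibbles * 4))
    nibbles = [bits[i:i + 4] for i in range(0, len(bits), 4)]
    bytes_ = [" ".join(nibbles[i:i + 2]) for i in range(0, len(nibbles), 2)]
    return "  ".join(bytes_)


text = grouped_binary("1a5")
print(text)                                   # 0000 0001  1010 0101
print(int(text.replace(" ", ""), 2))          # 421
```

The last line goes the other way: strip the spaces with replace() and let
int() parse the digits in base 2.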

You should read this until you know it Claudio! It's
really one of the most important pieces of Python
documentation.

I might be wrong, but I suspect you just need to get
more practice in programming. Your reasoning sounds a
bit like: "I don't want to invent new sentences all the
time, there should be a catalog of useful sentences
that I can look up and use."

Sure, there are phrase books for tourists, but they are
only really interesting for people who use a language
on a very naive level. We certainly reuse words, and
it's also very useful to reuse complete texts, from
short poems to big books. Sure, many sentences are often
repeated, but the ability to create new sentences in
a natural language is considered a basic skill of the
user. No experienced user of a language uses phrase
books, and if you really want to learn a language
properly, phrase books aren't nearly as useful as proper
texts.

There are simply so many possibly useful sentences, so
it would be much, much more work to try to catalog and
identify useful sentences than to reinvent them as we
need them.

It's just the same with the kinds of problems you described
above. With fundamental language skills, you'll solve these
problems much faster than you can look them up. Sure, the
first attempts might be less than ideal, especially if you
haven't read chapter 2 in the library manual, but you learn
much, much more from coding than from looking at code
snippets.

konrad...@laposte.net

Sep 14, 2005, 10:03:28 AM

Stefano Masini wrote:

> There are a few areas where everybody seems to be implementing their
> own stuff over and over: logging, file handling, ordered dictionaries,
> data serialization, and maybe a few more.
> I don't know what's the ultimate problem, but I think there are 3 main reasons:
> 1) poor communication inside the community (mhm... arguable)
> 2) lack of a rich standard library (I heard this more than once)
> 3) python is such an easy language that the "I'll do it myself" evil
> side lying hidden inside each one of us comes up a little too often,
> and prevents from spending more time on research of what's available.

I'd like to add one more that I haven't seen mentioned yet: ease of
maintenance and distribution.

Whenever I decide to use someone else's package for an important
project, I need to make sure it is either maintained or looks clean
enough that I can maintain it myself. For small packages, that alone is
often more effort than writing my own.

If I plan to distribute my code to the outside world, I also want to
minimize the number of dependencies to make installation simple enough.
This would only stop being a concern if a truly automatic package
installation system for Python existed for all common platforms - I
think we aren't there yet, in spite of many good ideas. And even then,
the maintenance issue would be even more critical with code distributed
to the outside world.

None of these issues is specific to Python, but with Python making new
developments that much simpler, they gain in weight relative to the
effort of development.

> It seems to me that this tendency is hurting python, and I wonder if
> there is something that could be done about it. I once followed a

I don't think it hurts Python. However, it is far from an ideal
situation, so thinking about alternatives makes sense. I think the best
solution would be self-regulation by the community. Whenever someone
discovers three date-format modules on the market, he/she could contact
the authors and suggest that they sit together and develop a common
version that satisfies everyone's needs, perhaps with adaptor code to
make the unified module compatible with everyone's individual modules.

Konrad.

Jorgen Grahn

Sep 18, 2005, 7:05:16 AM

On 14 Sep 2005 07:03:28 -0700, konrad...@laposte.net <konrad...@laposte.net> wrote:
> Stefano Masini wrote:
>
>> There are a few areas where everybody seems to be implementing their
>> own stuff over and over: logging, file handling, ordered dictionaries,
>> data serialization, and maybe a few more.
>> I don't know what's the ultimate problem, but I think there are 3 main reasons:
>> 1) poor communication inside the community (mhm... arguable)
>> 2) lack of a rich standard library (I heard this more than once)
>> 3) python is such an easy language that the "I'll do it myself" evil
>> side lying hidden inside each one of us comes up a little too often,
>> and prevents from spending more time on research of what's available.
>
> I'd like to add one more that I haven't seen mentioned yet: ease of
> maintenance and distribution.
>
> Whenever I decide to use someone else's package for an important
> project, I need to make sure it is either maintained or looks clean
> enough that I can maintain it myself. For small packages, that alone is
> often more effort than writing my own.

If the licenses are compatible, you also have the option to simply steal the
code and merge it into yours -- possibly cutting away the stuff you don't
need. Or if not, to read and learn from it.

That's another kind of reuse, which is sometimes overlooked.

/Jorgen

--
// Jorgen Grahn <jgrahn@ Ph'nglui mglw'nafh Cthulhu
\X/ algonet.se> R'lyeh wgah'nagl fhtagn!

Mike Meyer

Sep 18, 2005, 12:36:54 PM

Jorgen Grahn <jgrah...@algonet.se> writes:
> On Sat, 10 Sep 2005 20:24:32 -0400, François Pinard <pin...@iro.umontreal.ca> wrote:
> Yeah. I've often wished for some overview or guide that translates the
> current buzzwords to old concepts I'm familiar with. For example, I'm sure
> you can capture the core ideas of something like .NET in a couple of
> sentences.

Just taking a stab in the dark, since I'm only vaguely familiar with
.NET: P-code for multiple languages?