SimplePrograms challenge


Steve Howell

Jun 11, 2007, 5:56:05 PM
to pytho...@python.org
Hi, I'm offering a challenge to extend the following
page by one good example:

http://wiki.python.org/moin/SimplePrograms

Right now the page starts off with 15 examples that
cover lots of ground in Python, but they're still
scratching the surface. (There are also two Eight
Queens implementations, but I'm looking to fill the
gap in lines-of-code, and they're a little long now.)

I'm looking for a good 16-line code example with the
following qualities:

1) It introduces some important Python concept that
the first 15 programs don't cover.

2) It's not too esoteric. Python newbies are the
audience (but you can assume they're not new to
programming in general).

3) It runs on Python 2.4.

4) It doesn't just demonstrate a concept; it solves
a problem at face value. (It can solve a whimsical
problem, like counting rabbits, but the program itself
should be "complete" and "suitably simple" for the
problem at hand.)

5) You're willing to have your code reviewed by the
masses.

6) No major departures from PEP 8.

Any takers?

-- Steve




infidel

Jun 11, 2007, 6:51:24 PM
# reading CSV files, tuple-unpacking
import csv

#pacific.csv contains:
#1,CA,California
#2,AK,Alaska
#3,OR,Oregon
#4,WA,Washington
#5,HI,Hawaii

reader = csv.reader(open('pacific.csv'))
for id, abbr, name in reader:
    print '%s is abbreviated: "%s"' % (name, abbr)

André

Jun 11, 2007, 7:10:21 PM

Ok, doctest-based version of the Unit test example added; so much more
Pythonic ;-)

André

P.S. Congrats on starting this!

John Machin

Jun 11, 2007, 7:20:27 PM
On Jun 12, 8:51 am, infidel <saint.infi...@gmail.com> wrote:
> # reading CSV files, tuple-unpacking
> import csv
>
> #pacific.csv contains:
> #1,CA,California
> #2,AK,Alaska
> #3,OR,Oregon
> #4,WA,Washington
> #5,HI,Hawaii
>
> reader = csv.reader(open('pacific.csv'))

For generality and portability, this should be:
reader = csv.reader(open('pacific.csv', 'rb'))

> for id, abbr, name in reader:
> print '%s is abbreviated: "%s"' % (name, abbr)

and this example doesn't demonstrate why one should use the csv module
instead of:
for line in open('pacific.csv'):
    id, abbr, name = line.rstrip().split(',')
    # etc
which is quite adequate for the simplistic example file.


Steve Howell

Jun 11, 2007, 7:16:36 PM
to infidel, pytho...@python.org

--- infidel <saint....@gmail.com> wrote:

Definitely on track.

I like that it uses a batteries-included module, and
this program definitely reminds me of the type of
smallish, but very useful, programs that I write in
Python frequently on the job.

I would add one more criterion to my challenge,
though. You should be able to run the program simply
by copying/pasting the code into a file and running
Python. So I would change your program (as currently
presented) to write its own input file for
demonstration purposes.

One more suggestion--maybe it could exercise a little
more of the csv module, i.e. have something in the
data that would trip up the ','.split() approach?



John Machin

Jun 11, 2007, 7:40:26 PM
On Jun 12, 9:16 am, Steve Howell <showel...@yahoo.com> wrote:

>
> One more suggestion--maybe it could exercise a little
> more of the csv module, i.e. have something in the
> data that would trip up the ','.split() approach?

The what approach?? Do you mean blah.split(',') ??

Perhaps like an example I posted a few days ago:

"Jack ""The Ripper"" Jones","""Eltsac Ruo"", 123 Smith St",,Paris TX
12345
(name and 3 address fields)
[for avoidance of doubt caused by line wrapping, repr(last_field) is
'Paris TX 12345', and the 2nd-last is '']
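
The difference between the two approaches can be checked directly on that record. The following is only a sketch in Python 3 (the thread itself targets Python 2.4), and the variable names are illustrative rather than taken from any posted program:

```python
import csv
import io

# The sample record from the post above, joined back into one physical line.
line = '"Jack ""The Ripper"" Jones","""Eltsac Ruo"", 123 Smith St",,Paris TX 12345\n'

# Naive approach: split on every comma, including the one inside the quotes.
naive_fields = line.rstrip().split(',')

# csv.reader applies the quoting rules and un-doubles the embedded quotes.
csv_fields = next(csv.reader(io.StringIO(line)))

print(len(naive_fields), naive_fields)
print(len(csv_fields), csv_fields)
```

The naive split produces five pieces with stray quote characters, while the csv module returns the four intended fields.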

Steve Howell

Jun 11, 2007, 8:19:14 PM
to John Machin, pytho...@python.org

--- John Machin <sjma...@lexicon.net> wrote:

> On Jun 12, 9:16 am, Steve Howell
> <showel...@yahoo.com> wrote:
>
> >
> > One more suggestion--maybe it could exercise a little
> > more of the csv module, i.e. have something in the
> > data that would trip up the ','.split() approach?
>
> The what approach?? Do you mean blah.split(',') ??
>

Yes, of course.

I obviously had ','.join() on my brain. :)




mensa...@aol.com

Jun 11, 2007, 11:49:56 PM

I just posted a 30-line generator function
on your site. Should I have posted it here
first? Also, why do you count comments and
blank lines instead of lines of executable
code? Are you trying to encourage obfuscation?

Rob Wolfe

Jun 12, 2007, 8:06:08 AM

Steve Howell wrote:
> Hi, I'm offering a challenge to extend the following
> page by one good example:
>
> http://wiki.python.org/moin/SimplePrograms

What about simple HTML parsing? As a matter of fact this is not a
language concept, but it shows the power of the Python standard library.
Besides, that's a very popular problem among newbies. This program,
for example, shows all the linked URLs in the HTML document:

<code>
from HTMLParser import HTMLParser

page = '''
<html><head><title>URLs</title></head>
<body>
<ul>
<li><a href="http://domain1/page1">some page1</a></li>
<li><a href="http://domain2/page2">some page2</a></li>
</ul>
</body></html>
'''

class URLLister(HTMLParser):
    def reset(self):
        HTMLParser.reset(self)
        self.urls = []

    def handle_starttag(self, tag, attrs):
        try:
            # get handler for tag and call it e.g. self.start_a
            getattr(self, "start_%s" % tag)(attrs)
        except AttributeError:
            pass

    def start_a(self, attrs):
        href = [v for k, v in attrs if k == "href"]
        if href:
            self.urls.extend(href)

parser = URLLister()
parser.feed(page)
parser.close()
for url in parser.urls:
    print url
</code>

--
Regards,
Rob
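
For readers on Python 3, where the module is named html.parser, the same idea might look roughly like this. It is a sketch, not Rob's code: the per-tag dispatch through getattr is replaced by a plain tag check, and the list is set up in __init__ instead of reset().

```python
from html.parser import HTMLParser

page = '''
<html><head><title>URLs</title></head>
<body>
<ul>
<li><a href="http://domain1/page1">some page1</a></li>
<li><a href="http://domain2/page2">some page2</a></li>
</ul>
</body></html>
'''

class URLLister(HTMLParser):
    def __init__(self):
        super().__init__()
        self.urls = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs for the start tag
        if tag == "a":
            self.urls.extend(value for name, value in attrs if name == "href")

parser = URLLister()
parser.feed(page)
parser.close()
for url in parser.urls:
    print(url)
```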

Steven Bethard

Jun 12, 2007, 4:34:41 PM
Rob Wolfe wrote:
> Steve Howell wrote:
>> Hi, I'm offering a challenge to extend the following
>> page by one good example:
>>
>> http://wiki.python.org/moin/SimplePrograms
>
> What about simple HTML parsing? As a matter of fact this is not
> language concept, but shows the power of Python standard library.
> Besides, that's very popular problem among newbies. This program
> for example shows all the linked URLs in the HTML document:
>
> <code>
> from HTMLParser import HTMLParser

[Sorry if this comes twice, it didn't seem to be showing up]

I'd hate to steer a potential new Python developer to a clumsier library
when Python 2.5 includes ElementTree::

import xml.etree.ElementTree as etree

page = '''
<html><head><title>URLs</title></head>
<body>
<ul>
<li><a href="http://domain1/page1">some page1</a></li>
<li><a href="http://domain2/page2">some page2</a></li>
</ul>
</body></html>
'''

tree = etree.fromstring(page)
for a_node in tree.getiterator('a'):
    url = a_node.get('href')
    if url is not None:
        print url

I know that the wiki page is supposed to be Python 2.4 only, but I'd
rather have no example than an outdated one.
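
A Python 3 rendering of the ElementTree approach, offered as a sketch: getiterator() is no longer available in current Python 3 releases, so this uses Element.iter() instead, and it collects the URLs in a list (the name urls is illustrative) before printing. Like the original, it only works when the page is well-formed XML.

```python
import xml.etree.ElementTree as etree

page = '''
<html><head><title>URLs</title></head>
<body>
<ul>
<li><a href="http://domain1/page1">some page1</a></li>
<li><a href="http://domain2/page2">some page2</a></li>
</ul>
</body></html>
'''

tree = etree.fromstring(page)

# Walk every <a> element in document order and keep its href, if it has one.
urls = [a_node.get('href') for a_node in tree.iter('a')
        if a_node.get('href') is not None]

for url in urls:
    print(url)
```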

STeVe

Rob Wolfe

Jun 12, 2007, 5:37:59 PM
Steven Bethard <steven....@gmail.com> writes:

> I'd hate to steer a potential new Python developer to a clumsier

"clumsier"???
Try to parse this with your program:

page2 = '''


<html><head><title>URLs</title></head>
<body>
<ul>
<li><a href="http://domain1/page1">some page1</a></li>
<li><a href="http://domain2/page2">some page2</a></li>

</body></html>
'''

> library when Python 2.5 includes ElementTree::
>
> import xml.etree.ElementTree as etree
>
> page = '''
> <html><head><title>URLs</title></head>
> <body>
> <ul>
> <li><a href="http://domain1/page1">some page1</a></li>
> <li><a href="http://domain2/page2">some page2</a></li>
> </ul>
> </body></html>
> '''
>
> tree = etree.fromstring(page)
> for a_node in tree.getiterator('a'):
>     url = a_node.get('href')
>     if url is not None:
>         print url

It might even be a one-liner:
print "\n".join((url.get('href', '') for url in tree.findall(".//a")))

But as far as HTML (not XML) is concerned, this is not a very realistic solution.

>
> I know that the wiki page is supposed to be Python 2.4 only, but I'd
> rather have no example than an outdated one.

This example is by no means "outdated".

--
Regards,
Rob

Steven Bethard

Jun 12, 2007, 6:06:38 PM
Rob Wolfe wrote:
> Steven Bethard <steven....@gmail.com> writes:
>> I'd hate to steer a potential new Python developer to a clumsier
>
> "clumsier"???
> Try to parse this with your program:
>
> page2 = '''
> <html><head><title>URLs</title></head>
> <body>
> <ul>
> <li><a href="http://domain1/page1">some page1</a></li>
> <li><a href="http://domain2/page2">some page2</a></li>
> </body></html>
> '''

If you want to parse invalid HTML, I strongly encourage you to look into
BeautifulSoup. Here's the updated code:

import ElementSoup # http://effbot.org/zone/element-soup.htm
import cStringIO

tree = ElementSoup.parse(cStringIO.StringIO(page2))
for a_node in tree.getiterator('a'):
    url = a_node.get('href')
    if url is not None:
        print url

>> I know that the wiki page is supposed to be Python 2.4 only, but I'd
>> rather have no example than an outdated one.
>
> This example is by no means "outdated".

Given the simplicity of the ElementSoup code above, I'd still contend
that using HTMLParser here shows too complex an answer to too simple a
problem.

STeVe

Steven Bethard

Jun 12, 2007, 6:09:44 PM
Steven Bethard wrote:
> Rob Wolfe wrote:
>> Steven Bethard <steven....@gmail.com> writes:
>>> I'd hate to steer a potential new Python developer to a clumsier
>>
>> "clumsier"???
>> Try to parse this with your program:
>>
>> page2 = '''
>> <html><head><title>URLs</title></head>
>> <body>
>> <ul>
>> <li><a href="http://domain1/page1">some page1</a></li>
>> <li><a href="http://domain2/page2">some page2</a></li>
>> </body></html>
>> '''
>
> If you want to parse invalid HTML, I strongly encourage you to look into
> BeautifulSoup. Here's the updated code:
>
> import ElementSoup # http://effbot.org/zone/element-soup.htm
> import cStringIO
>
> tree = ElementSoup.parse(cStringIO.StringIO(page2))
> for a_node in tree.getiterator('a'):
>     url = a_node.get('href')
>     if url is not None:
>         print url

I should also have pointed out that using the above ElementSoup code can
parse the following text::

<html><head><title>URLs</title></head>
<body>
<ul>
<li<a href="http://domain1/page1">some page1</a></li>
<li><a href="http://domain2/page2">some page2</a></li>
</body></html>

where the HTMLParser code raises an HTMLParseError.

STeVe

George Sakkis

Jun 12, 2007, 6:25:26 PM

I love the 7-line version of the prime number generator by Tim
Hochberg at the last comment of http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/117119:

from itertools import count, ifilter
def sieve():
    seq = count(2)
    while True:
        p = seq.next()
        seq = ifilter(p.__rmod__, seq)
        yield p


I suspect that it violates your second rule though :)

George

Steve Howell

Jun 12, 2007, 6:27:10 PM
to Rob Wolfe, pytho...@python.org
--- Rob Wolfe <r...@smsnet.pl> wrote:

> Steven Bethard <steven....@gmail.com> writes:
>
> > I'd hate to steer a potential new Python developer
> > to a clumsier [...]
>
> [...]
>
> But as far as HTML (not XML) is concerned this is
> not very realistic solution.
>

I think both posted examples would be valuable on the
page, with proper disclaimers. Feel free to post
them:

http://wiki.python.org/moin/SimplePrograms

Without having used either library, it seems the older
module is a bit more difficult to use for simple
examples, but it is more robust in the face of
real-world non-XML-compliant HTML, and it is obviously
more readily usable for folks with older Python
versions (which could even include newbies who are
squatting on somebody else's machine). The newer
module is easier to use in some cases, but it has the
limitations of assuming some level of XML compliance
and the availability of 2.5.

Is there a third, better alternative out there
somewhere? Also, is there a way to rewrite Rob's
program so that it seems a little more
straightforward, or was that necessary complexity even
for the problem at hand?



Steve Howell

Jun 12, 2007, 6:43:32 PM
to George Sakkis, pytho...@python.org

--- George Sakkis <george...@gmail.com> wrote:
>
> from itertools import count, ifilter
> def sieve():
>     seq = count(2)
>     while True:
>         p = seq.next()
>         seq = ifilter(p.__rmod__, seq)
>         yield p
>
>
> I suspect that it violates your second rule though
> :)
>

I'm genuinely torn. The elegance of the solution far
outweighs its esotericness. And I certainly can't
complain about the choice of the problem (finding
primes), since I included a much more pedestrian
solution to the same problem on the very page that
we're talking about.

I do feel, however, like I want to order solutions by
how long they are in line numbers, and if I stick to
that rule, I do think that the solution above, while
elegant, might be a little advanced as the seventh
example of Python's simplicity.

Is there a way to broaden the problem somehow, so that
it can be a longer solution and further down on the
page, and so that I can continue to enforce my
somewhat arbitrary rule of ordering examples by how
long they are?

(I fully confess that my ordering rule unfairly
penalizes short-and-sweet solutions, but I hope that
*all* solutions are short-and-sweet, and this one
certainly fits the bill.)



Steve Howell

Jun 12, 2007, 6:55:53 PM
to Rob Wolfe, pytho...@python.org

--- Rob Wolfe <r...@smsnet.pl> wrote:

>
> What about simple HTML parsing? As a matter of fact
> this is not
> language concept, but shows the power of Python
> standard library.
> Besides, that's very popular problem among newbies.

I totally agree with the relevance of this example.
FWIW it wasn't HTML parsing, but a very similar
problem (needing an HTML template library), that was
the final straw in converting me from Python to Perl.
It wasn't even a batteries-included solution, but the
simplicity of Python in the context of working with
HTML finally won me over.




Steve Howell

Jun 12, 2007, 7:07:47 PM
to Steve Howell, Rob Wolfe, pytho...@python.org

--- I wrote:
> [...] was the final straw in converting
> me from Python to Perl.

Er, Perl to Python.

In looking backwards, I started writing backwards.

And no, I'm not ever going back to Perl...

Steven Bethard

Jun 12, 2007, 7:09:23 PM
to Steve Howell, George Sakkis, pytho...@python.org
Steve Howell wrote:
> --- George Sakkis <george...@gmail.com> wrote:
>> from itertools import count, ifilter
>> def sieve():
>>     seq = count(2)
>>     while True:
>>         p = seq.next()
>>         seq = ifilter(p.__rmod__, seq)
>>         yield p
[snip]

> Is there a way to broaden the problem somehow, so that
> it can be a longer solution and further down on the
> page, and so that I can continue to enforce my
> somewhat arbitrary rule of ordering examples by how
> long they are?

How about we just comment it better?

import itertools

def iter_primes():
    # an iterator of all numbers between 2 and +infinity
    numbers = itertools.count(2)

    # generate primes forever
    while True:

        # generate the first number from the iterator,
        # which should always be a prime
        prime = numbers.next()
        yield prime

        # lazily remove all numbers from the iterator that
        # are divisible by the prime we just selected
        numbers = itertools.ifilter(prime.__rmod__, numbers)

I think that's 17-ish, though you could shrink it down by removing some
of the spaces.

STeVe

Steve Howell

Jun 12, 2007, 7:11:47 PM
to Steven Bethard, pytho...@python.org, George Sakkis

No, that's perfect. Do you want me to post it, or do
you want the honors?




Steve Howell

Jun 12, 2007, 7:13:10 PM
to Steven Bethard, pytho...@python.org, George Sakkis

--- Steven Bethard <steven....@gmail.com> wrote:

Actually, just one small caveat--I'd still want the
program to print out the results. I think I've
followed that convention for all the other programs.




Steven Bethard

Jun 12, 2007, 7:20:36 PM
Steve Howell wrote:
> --- Steven Bethard <steven....@gmail.com> wrote:
>> How about we just comment it better?
>>
>> import itertools
>>
>> def iter_primes():
>>     # an iterator of all numbers between 2 and +infinity
>>     numbers = itertools.count(2)
[snip]

>
> Actually, just one small caveat--I'd still want the
> program to print out the results. I think I've
> followed that convention for all the other programs.

Good point. Adding something as simple as::

# print the first 100 primes
for prime in itertools.islice(iter_primes(), 100):
    print prime

to the end should make it a full-fledged program.
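
Put together as a complete program, and ported to Python 3 for readers who no longer have 2.x at hand, the generator and its driver might look like the sketch below. The port is an assumption on my part, not something posted in the thread: next(numbers) replaces numbers.next(), and the built-in filter (lazy in Python 3) replaces itertools.ifilter. It takes only the first 10 primes to keep the output short.

```python
import itertools

def iter_primes():
    # an iterator of all numbers between 2 and +infinity
    numbers = itertools.count(2)

    # generate primes forever
    while True:
        # the first number left in the iterator is always a prime
        prime = next(numbers)
        yield prime

        # prime.__rmod__(n) computes n % prime, which is 0 (false) for
        # multiples of prime, so filter() lazily drops those multiples
        numbers = filter(prime.__rmod__, numbers)

# collect the first 10 primes and print them
first_primes = list(itertools.islice(iter_primes(), 10))
for prime in first_primes:
    print(prime)
```

Each prime found wraps the iterator in one more filter layer, so this is best suited to short demonstrations like the one here.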

STeVe

Steve Howell

Jun 12, 2007, 7:35:31 PM
to Steven Bethard, pytho...@python.org

--- Steven Bethard <steven....@gmail.com> wrote:

> On 6/12/07, Steve Howell <show...@yahoo.com> wrote:
> > --- Steven Bethard <steven....@gmail.com> wrote:
> > > import itertools
> > >
> > > def iter_primes():
> > >     # an iterator of all numbers between 2 and +infinity
> > >     numbers = itertools.count(2)
> > >
> > >     # generate primes forever
> > >     while True:
> > >
> > >         # generate the first number from the iterator,
> > >         # which should always be a prime
> > >         prime = numbers.next()
> > >         yield prime
> > >
> > >         # lazily remove all numbers from the iterator that
> > >         # are divisible by the prime we just selected
> > >         numbers = itertools.ifilter(prime.__rmod__, numbers)
> > >
> > > I think that's 17-ish, though you could shrink it
> > > down by removing some of the spaces.
> >
> > No, that's perfect. Do you want me to post it, or do
> > you want the honors?
>
> Go for it.
>

Done:

http://wiki.python.org/moin/SimplePrograms (example
20)

Thanks, George and Steve!



Steve Howell

Jun 12, 2007, 7:42:23 PM
to Steven Bethard, pytho...@python.org

--- Steven Bethard <steven....@gmail.com> wrote:
> # print the first 100 primes
> for prime in itertools.islice(iter_primes(), 100):
>     print prime
>

Sure. FWIW, in the example I posted, I kept the
output loop a little more pedestrian (using an
if/break idiom), and I printed primes less than 100,
not the first 100 primes.

I still find itertools a little magical at times, so I
want to keep it limited to places where it really
simplifies an algorithm.



Stefan Behnel

Jun 13, 2007, 2:26:00 AM
to Steven Bethard

Here's an lxml version:

from lxml import etree as et # http://codespeak.net/lxml
html = et.HTML(page2)
for href in html.xpath("//a/@href[string()]"):
    print href

Doesn't count as a 15-liner, though, even if you add the above HTML code to it.

Stefan

Steven Bethard

Jun 13, 2007, 2:48:09 AM
to Stefan Behnel
Stefan Behnel wrote:

> Steven Bethard wrote:
>> If you want to parse invalid HTML, I strongly encourage you to look into
>> BeautifulSoup. Here's the updated code:
>>
>> import ElementSoup # http://effbot.org/zone/element-soup.htm
>> import cStringIO
>>
>> tree = ElementSoup.parse(cStringIO.StringIO(page2))
>> for a_node in tree.getiterator('a'):
>>     url = a_node.get('href')
>>     if url is not None:
>>         print url
>>
[snip]

>
> Here's an lxml version:
>
> from lxml import etree as et # http://codespeak.net/lxml
> html = et.HTML(page2)
> for href in html.xpath("//a/@href[string()]"):
>     print href
>
> Doesn't count as a 15-liner, though, even if you add the above HTML code to it.

Definitely better than the HTMLParser code. =) Personally, I still
prefer the xpath-less version, but that's only because I can never
remember what all the line noise characters in xpath mean. ;-)

STeVe

Steve Howell

Jun 13, 2007, 6:21:17 AM
to Steven Bethard, pytho...@python.org

--- Steven Bethard <steven....@gmail.com> wrote:

I think there might be other people who will balk at
the xpath syntax, simply due to their unfamiliarity
with it. And, on the other hand, if you really like
the xpath syntax, then the program really becomes more
of an endorsement for xpath's clean syntax than for
Python's. To the extent that Python enables you to
implement an xpath solution cleanly, that's great, but
then you have the problem that lxml is not batteries
included.

I do hope we can find something to put on the page,
but I'm the wrong person to decide on it, since I
don't really do any rigorous HTML screen scraping in
my current coding. (I still think it's a common use
case, even though I don't do it myself.)

I suggested earlier that maybe we post multiple
solutions. That makes me a little nervous, to the
extent that it shows that the Python community has a
hard time coming to consensus on tools sometimes.
This is not a completely unfair knock on Python,
although I think the reason multiple solutions tend to
emerge for this type of thing is precisely due to the
simplicity and power of the language itself.

So I don't know. What about trying to agree on an XML
parsing example instead?

Thoughts?




Rob Wolfe

Jun 13, 2007, 7:56:03 AM

Steve Howell wrote:

> I suggested earlier that maybe we post multiple
> solutions. That makes me a little nervous, to the
> extent that it shows that the Python community has a
> hard time coming to consensus on tools sometimes.

We agree that BeautifulSoup is the best for parsing HTML. :)

> This is not a completely unfair knock on Python,
> although I think the reason multiple solutions tend to
> emerge for this type of thing is precisely due to the
> simplicity and power of the language itself.
>
> So I don't know. What about trying to agree on an XML
> parsing example instead?
>
> Thoughts?

I vote for an example with ElementTree (without xpath)
with a mention of using ElementSoup for invalid HTML.

--
Regards,
Rob

Steven Bethard

Jun 13, 2007, 11:36:31 AM

Sounds good to me. Maybe something like::

import xml.etree.ElementTree as etree
dinner_recipe = '''
<ingredients>
<ing><amt><qty>24</qty><unit>slices</unit></amt><item>baguette</item></ing>
<ing><amt><qty>2+</qty><unit>tbsp</unit></amt><item>olive_oil</item></ing>
<ing><amt><qty>1</qty><unit>cup</unit></amt><item>tomatoes</item></ing>
<ing><amt><qty>1-2</qty><unit>tbsp</unit></amt><item>garlic</item></ing>
<ing><amt><qty>1/2</qty><unit>cup</unit></amt><item>Parmesan</item></ing>
<ing><amt><qty>1</qty><unit>jar</unit></amt><item>pesto</item></ing>
</ingredients>'''
pantry = set(['olive oil', 'pesto'])
tree = etree.fromstring(dinner_recipe)
for item_elem in tree.getiterator('item'):
    if item_elem.text not in pantry:
        print item_elem.text

Though I wouldn't know where to put the ElementSoup link in this one...

STeVe

Rob Wolfe

Jun 13, 2007, 3:31:42 PM
Steven Bethard <steven....@gmail.com> writes:

>> I vote for example with ElementTree (without xpath)
>> with a mention of using ElementSoup for invalid HTML.
>
> Sounds good to me. Maybe something like::
>
> import xml.etree.ElementTree as etree
> dinner_recipe = '''
> <ingredients>
> <ing><amt><qty>24</qty><unit>slices</unit></amt><item>baguette</item></ing>
> <ing><amt><qty>2+</qty><unit>tbsp</unit></amt><item>olive_oil</item></ing>

^^^^^^^^^

Is that a typo here?

> <ing><amt><qty>1</qty><unit>cup</unit></amt><item>tomatoes</item></ing>
> <ing><amt><qty>1-2</qty><unit>tbsp</unit></amt><item>garlic</item></ing>
> <ing><amt><qty>1/2</qty><unit>cup</unit></amt><item>Parmesan</item></ing>
> <ing><amt><qty>1</qty><unit>jar</unit></amt><item>pesto</item></ing>
> </ingredients>'''
> pantry = set(['olive oil', 'pesto'])
> tree = etree.fromstring(dinner_recipe)
> for item_elem in tree.getiterator('item'):
>     if item_elem.text not in pantry:
>         print item_elem.text

That's a nice example. :)

> Though I wouldn't know where to put the ElementSoup link in this one...

I had regular HTML in mind, something like:

<code>
# HTML page
dinner_recipe = '''
<html><head><title>Recipe</title></head><body>
<table>
<tr><th>amt</th><th>unit</th><th>item</th></tr>
<tr><td>24</td><td>slices</td><td>baguette</td></tr>
<tr><td>2+</td><td>tbsp</td><td>olive_oil</td></tr>
<tr><td>1</td><td>cup</td><td>tomatoes</td></tr>
<tr><td>1-2</td><td>tbsp</td><td>garlic</td></tr>
<tr><td>1/2</td><td>cup</td><td>Parmesan</td></tr>
<tr><td>1</td><td>jar</td><td>pesto</td></tr>
</table>
</body></html>'''

# program
import xml.etree.ElementTree as etree
tree = etree.fromstring(dinner_recipe)

#import ElementSoup as etree # for invalid HTML
#from cStringIO import StringIO # use this
#tree = etree.parse(StringIO(dinner_recipe)) # wrapper for BeautifulSoup

pantry = set(['olive oil', 'pesto'])

for ingredient in tree.getiterator('tr'):
    amt, unit, item = ingredient.getchildren()
    if item.tag == "td" and item.text not in pantry:
        print "%s: %s %s" % (item.text, amt.text, unit.text)
</code>

But if that's too complicated I will not insist on this. :)
Your example is good enough.
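
For reference, a Python 3 sketch of the table version follows. It departs from the posted code in a few labeled ways: Element.iter() and list(row) stand in for getiterator() and getchildren(), which are gone from current Python 3 releases; the data uses "olive oil" with a space so that the pantry lookup matches; and the results go into a list named shopping_list (an illustrative name) before being printed.

```python
import xml.etree.ElementTree as etree

# HTML page (well-formed, so ElementTree can parse it as XML)
dinner_recipe = '''
<html><head><title>Recipe</title></head><body>
<table>
<tr><th>amt</th><th>unit</th><th>item</th></tr>
<tr><td>24</td><td>slices</td><td>baguette</td></tr>
<tr><td>2+</td><td>tbsp</td><td>olive oil</td></tr>
<tr><td>1</td><td>cup</td><td>tomatoes</td></tr>
<tr><td>1-2</td><td>tbsp</td><td>garlic</td></tr>
<tr><td>1/2</td><td>cup</td><td>Parmesan</td></tr>
<tr><td>1</td><td>jar</td><td>pesto</td></tr>
</table>
</body></html>'''

pantry = {'olive oil', 'pesto'}
tree = etree.fromstring(dinner_recipe)

shopping_list = []
for row in tree.iter('tr'):
    # each row has exactly three cells; unpack them into named elements
    amt, unit, item = list(row)
    # the header row uses <th>, so only <td> rows are real ingredients
    if item.tag == "td" and item.text not in pantry:
        shopping_list.append("%s: %s %s" % (item.text, amt.text, unit.text))

for entry in shopping_list:
    print(entry)
```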

--
Regards,
Rob

Steven Bethard

Jun 13, 2007, 4:07:35 PM
to Rob Wolfe
Rob Wolfe wrote:
> Steven Bethard <steven....@gmail.com> writes:
>
>>> I vote for example with ElementTree (without xpath)
>>> with a mention of using ElementSoup for invalid HTML.
>> Sounds good to me. Maybe something like::
>>
>> import xml.etree.ElementTree as etree
>> dinner_recipe = '''
>> <ingredients>
>> <ing><amt><qty>24</qty><unit>slices</unit></amt><item>baguette</item></ing>
>> <ing><amt><qty>2+</qty><unit>tbsp</unit></amt><item>olive_oil</item></ing>
> ^^^^^^^^^
> Is that a typo here?

Just trying to make Thunderbird line-wrap correctly. ;-) It's better
with a space instead of an underscore.

Sure, that looks fine to me. =)

Steve

infidel

Jun 13, 2007, 5:34:57 PM
# writing/reading CSV files, tuple-unpacking, cmp() built-in
import csv

out = open('stocks.csv', 'wb')
writer = csv.writer(out)
writer.writerows([
    ('GOOG', 'Google, Inc.', 505.24, 0.47, 0.09),
    ('YHOO', 'Yahoo! Inc.', 27.38, 0.33, 1.22),
    ('CNET', 'CNET Networks, Inc.', 8.62, -0.13, -1.49)
])
out.close()  # flush the rows to disk before reading them back

stocks = csv.reader(open('stocks.csv', 'rb'))
for ticker, name, price, change, pct in stocks:
    print '%s is %s (%s%%)' % (
        name,
        {-1: 'down', 0: 'unchanged', 1: 'up'}[cmp(float(change), 0.0)],
        pct
    )
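
Python 3 dropped the cmp() built-in, so a port needs a stand-in. The sketch below makes some assumptions of its own: a small sign() helper (a hypothetical name) replaces cmp(x, 0.0), and an in-memory io.StringIO buffer replaces stocks.csv so that the example leaves no file behind. The company names contain commas, which is exactly the case the csv module quotes on writing and unquotes on reading.

```python
import csv
import io

rows = [
    ('GOOG', 'Google, Inc.', 505.24, 0.47, 0.09),
    ('YHOO', 'Yahoo! Inc.', 27.38, 0.33, 1.22),
    ('CNET', 'CNET Networks, Inc.', 8.62, -0.13, -1.49),
]

def sign(x):
    """Return -1, 0 or 1; stands in for Python 2's cmp(x, 0)."""
    return (x > 0) - (x < 0)

# write the rows, then rewind the buffer to read them back
buffer = io.StringIO(newline='')
csv.writer(buffer).writerows(rows)
buffer.seek(0)

direction = {-1: 'down', 0: 'unchanged', 1: 'up'}
report = []
for ticker, name, price, change, pct in csv.reader(buffer):
    # every field comes back as a string, so convert before comparing
    report.append('%s is %s (%s%%)' % (name, direction[sign(float(change))], pct))

for line in report:
    print(line)
```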

Steve Howell

Jun 13, 2007, 10:40:43 PM
to Steven Bethard, pytho...@python.org

--- Steven Bethard <steven....@gmail.com> wrote:

Whatever makes the most sense, please post it. Sorry
for not responding earlier.

Steven Bethard

Jun 14, 2007, 12:00:43 AM

I posted a slight variant of this, trimmed down a bit to 21 lines.

STeVe

Steve Howell

Jun 14, 2007, 6:14:53 AM
to Steven Bethard, pytho...@python.org

--- Steven Bethard <steven....@gmail.com> wrote:
>
> I posted a slight variant of this, trimmed down a
> bit to 21 lines.
>

Thanks, I think this will be a very useful example.

rzed

Jun 14, 2007, 6:46:55 AM
Steven Bethard <steven....@gmail.com> wrote in
news:466F27A3...@gmail.com:

How about including a driver? Generators are frustrating for
newbies (including oldies new to generators) because they don't
actually do anything unless you know how to use them. Given the
above, what's a newbie going to try first? Something like:

>>> iter_primes()

Hmmm. Doesn't do anything.

How about
>>> for ix in range(10):
...     print iter_primes()

Not what you might expect.

Later:

>>> for ix in range(10):
...     print iter_primes().next()

Hmmmmm....

... and so on.

In much of Python's documentation, and in this case, an occasional
working example of use would go FAR in aiding understanding of the
underlying concept.

--
rzed

Steven Bethard

Jun 14, 2007, 2:22:09 PM
rzed wrote:
> Steven Bethard <steven....@gmail.com> wrote in
>> def iter_primes():
>>     # an iterator of all numbers between 2 and +infinity
>>     numbers = itertools.count(2)
>>
>>     # generate primes forever
>>     while True:
>>
>>         # generate the first number from the iterator,
>>         # which should always be a prime
>>         prime = numbers.next()
>>         yield prime
>>
>>         # lazily remove all numbers from the iterator that
>>         # are divisible by the prime we just selected
>>         numbers = itertools.ifilter(prime.__rmod__, numbers)
>>
>> I think that's 17-ish, though you could shrink it down by
>> removing some of the spaces.
>
> How about including a driver?

Yes, absolutely a good idea. Fortunately, the other Steve borrowed the
time machine already and added this to the end::

for p in iter_primes():
    if p > 1000: break
    print p

http://wiki.python.org/moin/SimplePrograms

STeVe

Steve Howell

Jun 14, 2007, 9:17:10 PM
to Steven Bethard, pytho...@python.org

--- Steven Bethard <steven....@gmail.com> wrote:

> > How about including a driver?
>
> Yes, absolutely a good idea. Fortunately, the other
> Steve borrowed the
> time machine already and added this to the end::
>
>     for p in iter_primes():
>         if p > 1000: break
>         print p
>

I think rzed (sorry for bad quoting) was
(partly) making the point that we could have more
explicitly shown how .next() worked. I'm fine with
the current example, because I think the examples
should solve problems at face value, and here it's
very natural to use a loop with an iterator/generator
(never mind the underlying magic), but I do think some
future example should show how one can have more
control over getting values from an iterator.
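
One possible shape for such an example, offered only as a sketch. It assumes Python 3 (`next(gen)` rather than `gen.next()`), and `countdown` is a hypothetical stand-in generator, not something from the wiki page.

```python
import itertools

def countdown(start):
    """Yield start, start - 1, ..., 1 (hypothetical example generator)."""
    while start > 0:
        yield start
        start -= 1

gen = countdown(5)

# Pull exactly one value by hand.
first = next(gen)

# Pull a fixed number of further values without writing a loop.
next_two = list(itertools.islice(gen, 2))

# Drain whatever is left.
rest = list(gen)

# An exhausted iterator raises StopIteration; passing a default avoids that.
after_end = next(gen, None)

print(first, next_two, rest, after_end)
```

All four steps consume the same generator, so each one picks up where the previous one stopped.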



Steve Howell

unread,
Jun 14, 2007, 9:26:43 PM6/14/07
to rzed, pytho...@python.org

--- rzed <rza...@gmail.com> wrote:
> Hmmmmm....
>
> ... and so on.
>
> In much of Python's documentation, and in this case,
> an occasional
> working example of use would go FAR in aiding
> understanding of the
> underlying concept.
>

I agree with your sentiment 100%. Feel free to change
the SimplePrograms page to add any clarity, but as the
other Steve pointed out, I did in fact add a driver.

To your larger point...I have argued in another thread
that the Python documentation needs to have a more
flexible structure that allows for better posting of
example code. The current modules documentation is
optimized for the case that a module provides a sort
of integrated solution to a particular problem, so
there is just one examples page per module, but some
modules (e.g. itertools) are really just a grab-bag of
functions, so they could be documented better by
one or more examples per function.

Sorry for not putting this post in better context, but
I think we are on to the same idea...maybe you can
elaborate on your own point better than I did just
now...

-- Steve




Joe Riopel

unread,
Jun 14, 2007, 10:00:05 PM6/14/07
to Steve Howell, pytho...@python.org, rzed
How about this one for recursion and control flow:

>>> def hcd(m,n):
...     r = m % n
...     if( r > 0 ):
...         hcd(n, r)
...     else:
...         print "hcd = %d" % (n,)
...
>>> hcd(119, 544)
hcd = 17
>>>

It calculates the highest common denominator for m and n. Plus it's E1
in TAoCP by Knuth.

Steve Howell

unread,
Jun 14, 2007, 10:07:18 PM6/14/07
to Joe Riopel, pytho...@python.org, rzed

I'm a little wary of having the page contain too many
mathematical examples, although I admit that I've
already included Fibonacci and Eratosthenes. (And
your example shares in common with those two that it's
a classic example.)

Feel free to post your example on the page, but please
use slightly more descriptive identifiers (e.g. change
hcd to highest_common_denominator or
largest_common_denominator), and also have the program
print out the results.
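
A rough sketch of what that could look like, assuming Python 3 syntax. The name `greatest_common_divisor` is an assumption on my part (it is the more usual term for this quantity); the function returns the value so the caller decides how to print it.

```python
def greatest_common_divisor(m, n):
    """Euclid's algorithm, written recursively and returning the result."""
    remainder = m % n
    if remainder > 0:
        return greatest_common_divisor(n, remainder)
    return n

result = greatest_common_divisor(119, 544)
print("greatest common divisor of 119 and 544 = %d" % result)
```

Returning the value instead of printing inside the function also makes the function easy to test.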

Thanks.

-- Steve




Pete Forman

unread,
Jun 20, 2007, 5:44:57 AM6/20/07
to
André <andre....@gmail.com> writes:

> Ok, doctest-based version of the Unit test example added; so much
> more Pythonic ;-)

Sorry for being a bit picky but there are a number of things that I'm
unhappy with in that example.

1) It's the second example with 13 lines. Though I suppose that the
pragmatism of pairing the examples overriding an implicit goal of
the page is itself Pythonic.

2) assert is not the simplest example of doctest. The style should be

>>> add_money([0.13, 0.02])
0.15
>>> add_money([100.01, 99.99])
200.0
>>> add_money([0, -13.00, 13.00])
0.0

3) which fails :-( So both the unittest and doctest examples ought to
be redone to emphasize what they are doing without getting bogged
down by issues of floating point representations.
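
One way to sidestep the floating point issue, as a sketch only: keep the amounts as integer cents so the expected output is exact. `add_cents` is a hypothetical replacement for the wiki's `add_money`, and the code assumes Python 3. The docstring is run explicitly through `doctest.DocTestParser` and `doctest.DocTestRunner` so that the example does not depend on how the file is launched.

```python
import doctest

def add_cents(amounts):
    """Return the sum of a list of integer amounts in cents.

    >>> add_cents([13, 2])
    15
    >>> add_cents([10001, 9999])
    20000
    >>> add_cents([0, -1300, 1300])
    0
    """
    return sum(amounts)

# Parse the docstring into a test and run it, collecting the counts.
parser = doctest.DocTestParser()
test = parser.get_doctest(
    add_cents.__doc__, {"add_cents": add_cents}, "add_cents", None, 0)
runner = doctest.DocTestRunner(verbose=False)
results = runner.run(test)

print("failed: %d, attempted: %d" % (results.failed, results.attempted))
```

In a real module the last block would usually just be `doctest.testmod()` under `if __name__ == "__main__":`.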

http://wiki.python.org/moin/SimplePrograms

--
Pete Forman -./\.- Disclaimer: This post is originated
WesternGeco -./\.- by myself and does not represent
pete....@westerngeco.com -./\.- the opinion of Schlumberger or
http://petef.port5.com -./\.- WesternGeco.

Steve Howell

unread,
Jun 20, 2007, 8:02:51 AM6/20/07
to Pete Forman, pytho...@python.org

--- Pete Forman <pete....@westerngeco.com> wrote:

> André <andre....@gmail.com> writes:
>
> > Ok, doctest-based version of the Unit test
> example added; so much
> > more Pythonic ;-)
>
> Sorry for being a bit picky but there are a number
> of things that I'm
> unhappy with in that example.
>

Your pickiness is appreciated. :)

> 1) It's the second example with 13 lines. Though I
> suppose that the
> pragmatism of pairing the examples overriding an
> implicit goal of
> the page is itself Pythonic.
>

Since you looked at the page, I have corrected that by
making the example above it 12 lines, so that's no
longer an issue.



> 2) assert is not the simplest example of doctest.
> The style should be
>
> >>> add_money([0.13, 0.02])
> 0.15
> >>> add_money([100.01, 99.99])
> 200.0
> >>> add_money([0, -13.00, 13.00])
> 0.0
>

That's not clear cut to me. I think vertical
conciseness has an advantage for readability, as it
means you get to keep more "real" code on the screen.

> 3) which fails :-( So both the unittest and doctest
> examples ought to
> be redone to emphasize what they are doing
> without getting bogged
> down by issues of floating point representations.
>

I was the one who originally posted the floating point
example (with yet another style of unit testing, BTW),
and I agree that the subtleties of floating point do
kind of cloud the issue. I welcome a better example.
What I didn't realize is that there's an actual error.
Are you saying the program fails? On which test?



Pete Forman

unread,
Jun 20, 2007, 8:54:13 AM6/20/07
to
Steve Howell <show...@yahoo.com> writes:

>> 2) assert is not the simplest example of doctest.
>> The style should be
>>
>> >>> add_money([0.13, 0.02])
>> 0.15
>> >>> add_money([100.01, 99.99])
>> 200.0
>> >>> add_money([0, -13.00, 13.00])
>> 0.0
>>
>
> That's not clear cut to me. I think vertical
> conciseness has an advantage for readability, as it
> means you get to keep more "real" code on the screen.

What I meant was that doctest should be "type this into the
interpreter and you should see that". A doctest is not a unit test,
though it may form a subset of the tests. There should only be enough
doctests to enclue a human reader. Comprehensive testing should use a
larger framework. Doctests in separate files can do this but I would
use py.test, or alternatives like nose or Testoob.

>>> 2 + 2
4

"assert 2 + 2 == 4" is a concise way of writing a unit test but it is
not the best way to use doctest IMHO.

>> 3) which fails :-( So both the unittest and doctest
>> examples ought to
>> be redone to emphasize what they are doing
>> without getting bogged
>> down by issues of floating point representations.
>>
>
> I was the one who originally posted the floating point
> example (with yet another style of unit testing, BTW),
> and I agree that the subtleties of floating point do
> kind of cloud the issue. I welcome a better example.
> What I didn't realize is that there's an actual error.
> Are you saying the program fails? On which test?

Python 2.5.1 on XP:

Failed example:
    add_money([0.13, 0.02])
Expected:
    0.15
Got:
    0.14999999999999999
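
The mismatch above is in how the float is displayed, not in the arithmetic. A minimal sketch of one workaround is to compare a formatted string rather than the raw float. Both `add_money` (a stand-in with an assumed body, since the wiki code is not quoted here) and `format_money` are hypothetical names, and the code assumes Python 3.

```python
def add_money(amounts):
    """Hypothetical stand-in for the wiki function: sum a list of amounts."""
    return sum(amounts)

def format_money(value):
    """Render a float with exactly two decimal places."""
    return "%.2f" % value

cases = ([0.13, 0.02], [100.01, 99.99], [0, -13.00, 13.00])
formatted = [format_money(add_money(case)) for case in cases]
print(formatted)
```

Formatting to two places hides the last-digit noise, at the cost of testing the display rather than the stored value.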

Steve Howell

unread,
Jun 20, 2007, 9:26:14 AM6/20/07