
ANN: pullparser 0.0.2b released


John J. Lee


Dec 23, 2003, 9:55:03 AM

http://wwwsearch.sourceforge.net/pullparser

This is the first beta release (and probably the last).

Changes since 0.0.1a:

* Renamed .tag_iter() to .tags(), and allowed it to take multiple name
arguments.
* .get_text() and .get_compressed_text() no longer raise
NoMoreTagsError; they return "" instead, which is both more
convenient and makes the end case saner.
* Made a tarball package with setup.py etc.
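
To illustrate the two API changes above, here is a minimal sketch of a
pull-style parser. It is not pullparser's implementation: MiniPullParser
and headings() are hypothetical names, and the code assumes a modern
Python 3 with the standard html.parser module rather than the Python 2.2
this release targets. It only mimics the behaviour described in the
change list: tags() accepts several names, and get_text() returns ""
when nothing is left.

```python
from html.parser import HTMLParser


class MiniPullParser(HTMLParser):
    """Hypothetical stand-in for a pull parser (not pullparser itself)."""

    def __init__(self, html):
        super().__init__()
        self._queue = []   # (kind, value) tokens in document order
        self._index = 0    # position of the next unread token
        self.feed(html)
        self.close()       # flush any buffered text

    def handle_starttag(self, tag, attrs):
        self._queue.append(("starttag", tag))

    def handle_endtag(self, tag):
        self._queue.append(("endtag", tag))

    def handle_data(self, data):
        self._queue.append(("data", data))

    def tags(self, *names):
        """Yield remaining tag tokens; with names given, only those tags."""
        while self._index < len(self._queue):
            kind, value = self._queue[self._index]
            self._index += 1
            if kind != "data" and (not names or value in names):
                yield kind, value

    def get_text(self):
        """Return the text before the next tag, or "" if there is none."""
        parts = []
        while (self._index < len(self._queue)
               and self._queue[self._index][0] == "data"):
            parts.append(self._queue[self._index][1])
            self._index += 1
        return "".join(parts)


def headings(html):
    """Collect (tag name, text) for every <h1> and <h2> start tag."""
    p = MiniPullParser(html)
    found = []
    for kind, name in p.tags("h1", "h2"):
        if kind == "starttag":
            found.append((name, p.get_text()))
    return found
```

Because get_text() returns "" at the end of input instead of raising,
a loop like the one in headings() needs no try/except around the last
read.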

Requires Python 2.2.

pullparser is a simple "pull API" for HTML parsing, modelled on Perl's
HTML::TokeParser. Many simple HTML parsing tasks are easier this way
than with the callback-based HTMLParser module. pullparser.PullParser
is a subclass of HTMLParser.HTMLParser.

Example:

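import pullparser, sys

# Open the HTML file named on the command line (Python 2 syntax).
f = file(sys.argv[1])
p = pullparser.PullParser(f)
# Advance to the <title> tag, then read the text that follows it.
if p.get_tag("title"):
    title = p.get_compressed_text()
    print "Title: %s" % title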
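
For comparison, here is roughly what the same title extraction looks
like with plain callbacks. This is a sketch only: TitleCollector and
extract_title() are hypothetical names, and it assumes Python 3's
html.parser (the successor to the Python 2 HTMLParser module). The
whitespace collapsing is a guess at what get_compressed_text() does,
based on its name.

```python
from html.parser import HTMLParser


class TitleCollector(HTMLParser):
    """Callback-style parser that remembers text seen inside <title>."""

    def __init__(self):
        super().__init__()
        self.in_title = False  # state carried between callbacks
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self.in_title = True

    def handle_endtag(self, tag):
        if tag == "title":
            self.in_title = False

    def handle_data(self, data):
        if self.in_title:
            self.parts.append(data)


def extract_title(html):
    """Return the title text with runs of whitespace collapsed."""
    collector = TitleCollector()
    collector.feed(html)
    collector.close()
    return " ".join("".join(collector.parts).split())
```

The callback version has to track "am I inside the title?" by hand,
which is the bookkeeping a pull API avoids.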

John
