
Nice quote


Just

Mar 28, 2003, 12:30:28 PM
Tim Bray, in
http://www.tbray.org/ongoing/When/200x/2003/03/24/XMLisOK
writes:

"""The Python people also piped to say "everything's just
fine here" but then they always do, I really must learn
that language."""

Just

Jeremy Bowers

Mar 28, 2003, 8:18:37 PM
On Fri, 28 Mar 2003 18:30:28 +0100, Just wrote:
> """The Python people also piped to say "everything's just
> fine here" but then they always do, I really must learn that
> language."""

Would somebody amplify on this please? What XML library do we have that's
easier to use than DOM or SAX for real world tasks that doesn't require
reading the whole file into memory to work?

"The Python people" may be referring to some pull-based DOM solution, but
IMHO that still has the problem that the DOM, as a middle-of-the-road,
common-ground solution, isn't easy to use; it's just equally mediocre for
all uses.
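
For reference, a minimal sketch of what a pull-based DOM looks like with the standard library's xml.dom.pulldom. The sample document, the "entry" tag and the entry_texts helper are assumptions made up for illustration; only one matching subtree is expanded into DOM nodes at a time, so the whole file does not need to be held in memory.

```python
import io
from xml.dom import pulldom

# Hypothetical sample data; a real input would be a large file on disk.
SAMPLE = b"<log><entry id='1'>start</entry><entry id='2'>stop</entry></log>"


def entry_texts(stream):
    """Yield (id, text) for each <entry>, expanding one subtree at a time."""
    events = pulldom.parse(stream)
    for event, node in events:
        if event == pulldom.START_ELEMENT and node.tagName == "entry":
            # Build a small DOM for this element only, not the whole document.
            events.expandNode(node)
            text = "".join(
                child.data
                for child in node.childNodes
                if child.nodeType == child.TEXT_NODE
            )
            yield node.getAttribute("id"), text


results = list(entry_texts(io.BytesIO(SAMPLE)))
```

Once expanded, each node is still handled through the ordinary DOM API, which is the complaint above: the memory problem goes away, the ergonomics do not.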

Personally, I've been noodling around with the idea of some sort of
declarative syntax where you tell the parser what you're expecting and it
presents you with an object model constructed using classes you give it.
Kind of an "XSL" transformation that goes straight to custom classes.
In other words, stronger support for what amounts to deserialization,
which is what I find myself needing most when I'm using XML. (YMMV of
course.) I haven't looked around enough to know if such a beast already
exists; I'm only familiar at the moment with the traditional solutions and
variations on the DOM and SAX theme (like the aforementioned pull DOM
solutions).
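
A rough sketch of the idea, assuming the standard library's xml.etree.ElementTree as the underlying parser. The Person and Email classes, the BUILDERS table and the deserialize function are all hypothetical names invented here, not an existing library. Note that this version still parses the whole document first; it only illustrates the "tag to custom class" mapping.

```python
import xml.etree.ElementTree as ET


class Email:
    def __init__(self, address):
        self.address = address


class Person:
    def __init__(self, name, emails):
        self.name = name
        self.emails = emails


def build_email(elem, children):
    # Leaf element: the object is built from the element text.
    return Email((elem.text or "").strip())


def build_person(elem, children):
    # Container element: already-built child objects are passed in.
    return Person(elem.get("name"), [c for c in children if isinstance(c, Email)])


# The "declaration": which tag is turned into which kind of object.
BUILDERS = {"person": build_person, "email": build_email}


def deserialize(elem, builders):
    """Build objects bottom-up, ignoring child tags with no registered builder."""
    children = [deserialize(child, builders) for child in elem if child.tag in builders]
    return builders[elem.tag](elem, children)


doc = ET.fromstring(
    "<person name='Ann'><email>ann@example.org</email>"
    "<email>a@example.com</email></person>"
)
person = deserialize(doc, BUILDERS)
```

The calling code then works with person.name and person.emails and never touches a DOM node.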

I think one of the problems has been a little too much focus on things
like DOM and SAX that are basically equally mediocre under all
circumstances...

David Mertz

Mar 28, 2003, 10:51:26 PM
"Jeremy Bowers" <je...@jerf.org> wrote previously:

|Personally, I've been noodling around with the idea of some sort of
|declarative syntax where you tell the parser what you're expecting and it
|presents you with an object model constructed using classes you give
|it.

You might want to look at Gnosis Utilities, both gnosis.xml.validity and
gnosis.xml.objectify. For discussions, see:

http://gnosis.cx/publish/programming/xml_matters_20.html
http://gnosis.cx/publish/programming/charming_python_b11.html
http://gnosis.cx/publish/programming/xml_matters_11.html
http://gnosis.cx/publish/programming/xml_matters_2.html

Maybe I discussed the issues elsewhere too. The actual download is:

http://gnosis.cx/download/Gnosis_Utils-current.tar.gz

Not that I think Tim Bray was talking about my stuff, even indirectly.
But I'm still proud of these modules.

Yours, David...

--
Keeping medicines from the bloodstreams of the sick; food from the bellies
of the hungry; books from the hands of the uneducated; technology from the
underdeveloped; and putting advocates of freedom in prisons. Intellectual
property is to the 21st century what the slave trade was to the 16th.

Cameron Laird

Mar 29, 2003, 7:54:07 AM
In article <pan.2003.03.28....@jerf.org>,

Jeremy Bowers <je...@jerf.org> wrote:
>On Fri, 28 Mar 2003 18:30:28 +0100, Just wrote:
>> """The Python people also piped to say "everything's just
>> fine here" but then they always do, I really must learn that
>> language."""
>
>Would somebody amplify on this please? What XML library do we have that's
>easier to use than DOM or SAX for real world tasks that doesn't require
>reading the whole file into memory to work?
.
.
.
I have hopes that Mr. Bray himself will speak up. In the
meantime, I'll rather gratuitously speculate:

Mr. Bray works most often, from what I understand, in Java
and Perl. While we old hands are conditioned to recognize
that most languages are mostly the same, and that practical
differences in applicability often center on the contingencies
of what libraries make available, I think this is an exceptional
case. I don't find Python's XML modules paragons of
lucidity or functionality or elegance; I think they're probably
still a cut below those Java and Perl have. HOWEVER,
coding up XML applications in Java and Perl appears
consistently to involve frustration and occasional ugliness.
Python XML work seems to move forward with less drama.
Python syntax and semantics seem to have just the right
balance between flexibility and structure to escape the
hair-pulling that use of other languages inspires.

I'd like to provide an example that contrasts usages in the
different languages. I think I can do so, but I'm not willing
now to invest the time it would take. Maybe later ...

I summarize: it's not that Python boasts a non-memory-based
API that dominates DOM; coding conventional DOM and SAX just
feels better with Python.
--

Cameron Laird <Cam...@Lairds.com>
Business: http://www.Phaseit.net
Personal: http://phaseit.net/claird/home.html

Peter

Mar 29, 2003, 7:54:58 AM
Jeremy Bowers wrote:
>
> On Fri, 28 Mar 2003 18:30:28 +0100, Just wrote:
> > """The Python people also piped to say "everything's just
> > fine here" but then they always do, I really must learn that
> > language."""
>
> Would somebody amplify on this please? What XML library do we have that's
> easier to use than DOM or SAX for real world tasks that doesn't require
> reading the whole file into memory to work?

I'm fairly sure the context you omitted would show that the comment
above didn't imply Python folks have some magic new library that
solves the XML world's problems. It seemed to be a statement made partly
in contrast to the previous statements about Perl's "a zillion ways
to do it" and the difficulties that was causing XML developers using
Perl, and partly in contrast to implied other difficulties with XML
using other languages.

Basically, he was conveying that Python people don't find processing
XML to be a huge problem in any particular way, and I think that's a
fair assessment. IMHO.

-Peter

Jeremy Bowers

Mar 30, 2003, 10:38:39 AM
On Sat, 29 Mar 2003 07:54:58 -0500, Peter wrote:
> Jeremy Bowers wrote:
>> Would somebody amplify on this please? What XML library do we have
>> that's easier to use than DOM or SAX for real world tasks that doesn't
>> require reading the whole file into memory to work?
>
> I'm fairly sure the context you omitted would show that the comment above
> didn't imply Python folks have some magic new library that solves the XML
> world's problems. It seemed to be a statement made partly in contrast to
> the previous statements about Perl's "a zillion ways to do it" and the
> difficulties that was causing XML developers using Perl, and partly in
> contrast to implied other difficulties with XML using other languages.
>
> Basically, he was conveying that Python people don't find processing XML
> to be a huge problem in any particular way, and I think that's a fair
> assessment. IMHO.

I didn't take his statement that way because it doesn't make sense to
me. I've done SAX and DOM in both Perl and Python and they are equal pains
in the ass.

You may consider XML processing not a particular pain but I think that's
more a personal judgment than anything else, because I don't see Python
granting a huge advantage over Perl. The criticisms Tim Bray leveled
against current XML programming paradigms still seem to hold in Python as
well as anything else. At best, Python XML processing is a small linear
improvement over Perl.

This is also my basic reply to Cameron Laird's comment; unless he's
specifically referring to corner cases where you might need to do
something hairy that is easy in Python and hard in Perl, the two languages
are reasonably similar in my opinion. This *could* be a side effect of
the fact that I use Perl in a large project, and I tend to use it in a
very Pythonic way because that's my solution to writing maintainable Perl
code; it may be the case that using more Perl-ish idioms makes it harder.

Both are easier than C++ or Java just because of the more flexible, less
B&D object models.

The gnosis.xml packages *are* easier than standard DOM processing, and at
least reduce XML to relatively straight-forward tree processing, though as
the author of those utilities mentioned, I doubt Tim Bray was thinking of
those since Tim doesn't know Python. However it's possible that the
"Python people" mailing Tim may have been thinking of that.

I emphasize that I'm posting this to explore an issue and enhance my
understanding of the issues, and hopefully others' understanding as well.
I'm not trying to be combative. Since we're trying to interpret somebody
else's words who isn't right here to clarify, there isn't really a right
answer.

I'm still probably going to noodle around more later with a new library
for Python that will actually construct an object model out of objects you
provide for it, but it would be a long time before it would be publicly
available because I'd want to use it in a lot of situations before I could
be confident it was powerful enough.

Peter Hansen

Mar 30, 2003, 11:26:30 AM
Jeremy Bowers wrote:
>
> On Sat, 29 Mar 2003 07:54:58 -0500, Peter wrote:
> > Basically, he was conveying that Python people don't find processing XML
> > to be a huge problem in any particular way, and I think that's a fair
> > assessment. IMHO.
>
> I didn't take his statement that way because it doesn't make sense to
> me. I've done SAX and DOM in both Perl and Python and they are equal pains
> in the ass.

I would not be one to characterize either SAX or DOM as anything other
than pains in the ass. That doesn't reflect on Python, though, AFAIAC.

> You may consider XML processing not a particular pain but I think that's
> more a personal judgment than anything else, because I don't see Python
> granting a huge advantage over Perl.

a) Of _course_ it's a personal judgment.

b) I'm not clear why you think anyone was saying there's a "huge
advantage" over Perl. They weren't! (I suspect) They simply were
saying they didn't see there being a big problem in Python-land.
If those who spend their time in Perl-land complain a lot more than
those in Python-land, perhaps one can start to draw some conclusions,
but one of those might simply be that Perl-folk are so stressed they
need to vent some anger, while Python-folk are pretty happy in their
daily lives. :-)

> The criticisms Tim Bray leveled against current XML programming
> paradigms still seem to hold in Python as well as anything else.

Perhaps... is it possible that although SAX and DOM are pains in the
ass, using Python to do it leads many (me included) to think "no real
problem here"? That's certainly how I feel. Although some improvement
would be nice, it's not like I spend every hour of my waking life doing
XML stuff, so maybe it's just good enough to avoid me feeling like I
need to complain.

> At best, Python XML processing is a small linear improvement
> over Perl.

That sounds very much like a personal judgment to me. <wink>

(I've never understood why people think personal judgments,
a.k.a. "opinions", are such a bad thing... everyone has them, but
nobody likes them? :-)

-Peter

Jeremy Bowers

Mar 30, 2003, 1:14:05 PM
On Sun, 30 Mar 2003 11:26:30 -0500, Peter Hansen wrote:
> Jeremy Bowers wrote:
>> You may consider XML processing not a particular pain but I think that's
>> more a personal judgment than anything else, because I don't see Python
>> granting a huge advantage over Perl.
>
> a) Of _course_ it's a personal judgment.
(and from later)

> (I've never understood why people think personal judgments, a.k.a.
> "opinions", are such a bad thing... everyone has them, but nobody likes
> them? :-)

I don't mean to imply that it is therefore invalid or bad, sorry. What I
mean is that it's not so clear an advantage that it transcends the very
fuzzy line between "personal judgement" and "obvious, if not really
'objective', fact". (An example of the latter is "For most programming
tasks, Python will result in shorter and easier-to-maintain programs than
C++ will." You can't really *prove* it, but for most practical purposes
it's true.)

> b) I'm not clear why you think anyone was saying there's a "huge
> advantage" over Perl. They weren't! (I suspect)

Logic like this:

1. (From Tim Bray) Current programming paradigms/libraries for XML suck.
2. (From Tim Bray) Python people report it is not a problem.
3. For the purposes of discussion, assume 2 as fact.
4. 1 & 3 -> Python must have an XML programming paradigm/library that
does not suck.
5. Such a library would be a "huge advantage" (my words) over Perl.

The hope that I would find out what this library/paradigm was is what
prompted my original post. Wasn't fruitless, as I was not as aware of
gnosis as I am now. (Had previously heard of the xml.pickle module but was
not aware it was part of a family of other XML tools.)

> They simply were saying
> they didn't see there being a big problem in Python-land. If those who
> spend their time in Perl-land complain a lot more than those in
> Python-land, perhaps one can start to draw some conclusions, but one of
> those might simply be that Perl-folk are so stressed they need to vent
> some anger, while Python-folk are pretty happy in their daily lives. :-)

Once I get the data into the Python program I am happier; if that's all
the Python people were getting at then I am somewhat disappointed. I
suspect Tim Bray will be as well, since IMHO he was clearly complaining
about libraries, and if he does learn Python hoping to find something
better, IMHO he's going to end up disappointed to find out that's all the
"Python people" were referring to.

Ironically, I would have *expected* that the Python community would be
that much more likely to try to come up with and embrace a better XML
parsing solution, as the current SAX/DOM approaches are that much uglier
by contrast to the rest of Python code. In Perl it fits in better with the
general cruftiness... seems the opposite has occurred and the language
compensates enough for the cruft in SAX/DOM that it doesn't cross the
irritation threshold.

(For what it's worth, my original posting is prompted by the fact that I
intend to write a program in the near future that will want to be reading
effectively arbitrary document XML formats, up to and including the, shall
we say, "well-marked up" contents of a Microsoft Office XML document. I'm
going to need all the power I can get, because time will be money.)

>> At best, Python XML processing is a small linear improvement over
>> Perl.
>
> That sounds very much like a personal judgment to me. <wink>

But tending toward the "obvious fact" side, as described above, though not
quite making it. A SAX-based parser in Perl and a SAX-based parser in
Python are going to be the same basic size, to within some linear factor,
if they are both doing the same thing. Post-parsing data structures may be
nicer in Python, but that's a reflection of the language, not the parsing
solution.
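
To make the size comparison concrete, here is a minimal sketch of a SAX handler in Python using the standard xml.sax module. The TitleCollector class, the "title" tag and the sample document are assumptions for illustration only. A Perl SAX handler doing the same job needs the same three callbacks and the same hand-managed state.

```python
import xml.sax


class TitleCollector(xml.sax.ContentHandler):
    """Collect the text of every <title> element."""

    def __init__(self):
        super().__init__()
        self.titles = []
        self._buffer = None  # None means "not currently inside a <title>"

    def startElement(self, name, attrs):
        if name == "title":
            self._buffer = []

    def characters(self, content):
        # Text may arrive in several chunks, so accumulate it.
        if self._buffer is not None:
            self._buffer.append(content)

    def endElement(self, name):
        if name == "title" and self._buffer is not None:
            self.titles.append("".join(self._buffer))
            self._buffer = None


handler = TitleCollector()
xml.sax.parseString(
    b"<books><book><title>One</title></book>"
    b"<book><title>Two</title></book></books>",
    handler,
)
```

The state tracking in _buffer is the part that grows with the complexity of the document format, in either language.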

A lot of people misinterpreted both of the Tim Bray pieces quite badly
(see the Slashdot discussions on both articles for proof). My conclusion
is that the "Python people"*, whoever they were, misinterpreted the first
article as complaining about the final output of SAX/DOM approaches rather
than the SAX/DOM approaches themselves, and that Python does not, with the
possible exception of gnosis.xml (which I can't judge thoroughly as I
haven't used it yet, though I feel confident the API is better than DOM),
have any better solutions to the act of parsing XML than any other
language.

As always, YMMV and I'm not trying to push that conclusion down anybody's
throat. ;-) And I'd like to wave my creds around as a bona-fide Python
advocate; I'm not trying to dis Python, it's my favorite language, I just
think that the claims someone made to Tim Bray were probably not correct,
just as "Pure Python is the best solution for every programming problem"
is not correct.

*- A reminder that by "Python people" I mean the ones Tim Bray was
referring to in his post, who reported that Python has no problems with
XML.

Martin Maney

Mar 30, 2003, 5:38:10 PM
Peter Hansen <pe...@engcorp.com> wrote:
> (I've never understood why people think personal judgments,
> a.k.a. "opinions", are such a bad thing... everyone has them, but
> nobody likes them? :-)

It's one of those "whose ox?" things. Everyone likes their own
opinions, because they rarely disagree with them!

Okay, and in news/mailing lists/etc. there is perhaps more of a
tendency for folks to use "that's just an opinion" when they have
nothing more substantial than their own differing opinions but prefer
not to draw attention to the fact. Used properly, this will sometimes
even convince observers other than themselves that their position is
much more soundly based than their opponent's.

Of course, all this is just my opinion. :-)

Tom Bryan

Mar 31, 2003, 12:19:56 AM
Jeremy Bowers wrote:

> On Fri, 28 Mar 2003 18:30:28 +0100, Just wrote:
>> """The Python people also piped to say "everything's just
>> fine here" but then they always do, I really must learn that
>> language."""
>
> Would somebody amplify on this please? What XML library do we have that's
> easier to use than DOM or SAX for real world tasks that doesn't require
> reading the whole file into memory to work?

Well, at PyCon 2003, there was a talk about Satine.
http://satine.sourceforge.net/
"Satine converts XML documents to Python lists with attributes (xlist)"

I think that it attempts to load portions of the XML stream into memory only
as needed. I'm not sure whether the entire document ends up in memory or
not...I missed most of the talk. :-(

---Tom

Walter Dörwald

Mar 31, 2003, 2:02:24 PM
Jeremy Bowers wrote:

> [XML, SAX & DOM]


> Personally, I've been noodling around with the idea of some sort of
> declarative syntax where you tell the parser what you're expecting and it
> presents you with an object model constructed using classes you give it.
> Kind of an "XSL" transformation that goes straight to custom classes.
> In other words, stronger support for what amounts to deserialization,
> which is what I find myself needing most when I'm using XML. (YMMV of
> course.) I haven't looked around enough to know if such a beast already
> exists;

You might want to take a look at XIST (available from
http://www.livinglogic.de/Python/xist). XIST has a tree API for
XML that is very Pythonic and uses custom classes, i.e. every element
type is a class.

Bye,
Walter Dörwald

Jarek Zgoda

Apr 9, 2003, 5:58:07 PM
Jeremy Bowers <je...@jerf.org> wrote:

>> """The Python people also piped to say "everything's just
>> fine here" but then they always do, I really must learn that
>> language."""
>
> Would somebody amplify on this please? What XML library do we have that's
> easier to use than DOM or SAX for real world tasks that doesn't require
> reading the whole file into memory to work?

Is it a problem to read the whole file to parse the XML? Really? I'm happy
using ReportLab's pyRXP and effbot's ElementTree (loosely based on
J.Clark's expat parser...) - they both use a similar approach to build
the document structure. And both are fast, with pyRXP being faster but with
a less intuitive interface.

--
Jarek Zgoda
http://www.zgoda.biz/|JID:zg...@jabber.atman.pl|http://zgoda.jogger.pl/

Paul Boddie

Apr 10, 2003, 7:52:36 AM
"Jeremy Bowers" <je...@jerf.org> wrote in message news:<pan.2003.03.30....@jerf.org>...

> On Sat, 29 Mar 2003 07:54:58 -0500, Peter wrote:
> >

[Tim Bray's comments]

> > Basically, he was conveying that Python people don't find processing XML
> > to be a huge problem in any particular way, and I think that's a fair
> > assessment. IMHO.
>
> I didn't take his statement that way because it doesn't make sense to
> me. I've done SAX and DOM in both Perl and Python and they are equal pains
> in the ass.

My interpretation of the "Python people's" presumed statement of
contentment, via Mr Bray's interpretation of that statement, is that
there exists a collection of Python libraries/modules which implement
contemporary XML processing standards to a satisfactory level such
that real work can be done with them. While I remain unsettled by
various versioning issues with PyXML and 4Suite (the main packages for
XML in Python), it is true that they provide decent functionality and,
due to their adherence to standards, are accessible to people like Mr
Bray who come from the wider XML community.

As to whether SAX or DOM are good APIs or not, I personally only
"touch down" on the DOM API when I've made use of other techniques
(notably XPath). Having not read the DOM specification cover-to-cover,
I can't claim to be an authority on DOM, but it always surprises me
that people bring out the "DOM means loading everything at once" when
it is conceivable that implementations could actively avoid such
strategies. That's a side issue, however.

[...]

> The gnosis.xml packages *are* easier than standard DOM processing, and at
> least reduce XML to relatively straight-forward tree processing, though as
> the author of those utilities mentioned, I doubt Tim Bray was thinking of
> those since Tim doesn't know Python. However it's possible that the
> "Python people" mailing Tim may have been thinking of that.

I'm not familiar with gnosis.xml, but it seems to me that non-standard
approaches (particularly the "Pythonic" APIs) would not be of
significant interest to someone who wants to remain close to the
standards. Moreover, from what I've seen of the "Pythonic" element
tree APIs, I don't see anything radically better than the DOM - yes,
the notation may be nicer or cleaner, but there isn't anything obvious
with any of them that provides the benefit that something like XPath
can give, as far as I have seen.
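
To illustrate the kind of benefit meant here, a small sketch comparing a path expression with manual tree traversal. It assumes the ElementTree that later shipped in the Python standard library, which supports only a limited subset of XPath, not the full standard; the order data is made up for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document.
root = ET.fromstring(
    "<orders>"
    "<order status='open'><item>pen</item><item>ink</item></order>"
    "<order status='closed'><item>paper</item></order>"
    "</orders>"
)

# One path expression selects the items of all open orders.
open_items = [item.text for item in root.findall("./order[@status='open']/item")]

# The same selection written as explicit tree traversal.
manual = []
for order in root:
    if order.tag == "order" and order.get("status") == "open":
        for item in order:
            if item.tag == "item":
                manual.append(item.text)
```

Both lists contain the same items; the difference is that the path expression states the query in one place instead of spreading it over nested loops.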

[...]

> I'm still probably going to noodle around more later with a new library
> for Python that will actually construct an object model out of objects you
> provide for it, but it would be a long time before it would be publicly
> available because I'd want to use it in a lot of situations before I could
> be confident it was powerful enough.

My personal belief is that the XML and RDF technologies provide
sufficiently interesting standardised means of querying and
manipulating object hierarchies that one would be best served by
staying as close to the standardised object models as
possible.

Paul

Steve Holden

Apr 12, 2003, 10:29:50 AM
"Jarek Zgoda" <jzg...@gazeta.usun.pl> wrote in message
news:b7251f$1ma$2...@atlantis.news.tpi.pl...

> Jeremy Bowers <je...@jerf.org> wrote:
>
> >> """The Python people also piped to say "everything's just
> >> fine here" but then they always do, I really must learn that
> >> language."""
> >
> > Would somebody amplify on this please? What XML library do we have that's
> > easier to use than DOM or SAX for real world tasks that doesn't require
> > reading the whole file into memory to work?
>
> Is it a problem to read the whole file to parse the XML? Really? I'm happy
> using ReportLab's pyRXP and effbot's ElementTree (loosely based on
> J.Clark's expat parser...) - they both use a similar approach to build
> the document structure. And both are fast, with pyRXP being faster but with
> a less intuitive interface.
>

It is a problem when the file is several gigabytes long, I suppose, and
people are for some reason using XML to transfer data structures that long
according to reports on this group.
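
For files of that size, one possible approach is incremental parsing. The sketch below assumes the ElementTree that later shipped in the standard library and its iterparse function; the "record" tag, the count_records helper and the generated sample data are hypothetical. Each record is discarded after it has been processed, although the root element still keeps a reference to every emptied record.

```python
import io
import xml.etree.ElementTree as ET


def count_records(stream, tag="record"):
    """Count elements named `tag` without keeping their contents in memory."""
    count = 0
    for event, elem in ET.iterparse(stream, events=("end",)):
        if elem.tag == tag:
            count += 1
            # Drop the children, text and attributes of the finished element.
            elem.clear()
    return count


# Stand-in for a very large file on disk.
sample = io.BytesIO(b"<data>" + b"<record><v>1</v></record>" * 1000 + b"</data>")
total = count_records(sample)
```

In real use the stream would be an open file, and the body of the loop would do the actual work on each record before clearing it.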

regards
--
Steve Holden http://www.holdenweb.com/
Python Web Programming http://pydish.holdenweb.com/pwp/
Did you miss PyCon DC 2003? Would you come to PyCon DC 2004?
