def startElement(self, name, attr):
if name == "foo":
do_foo_stuff()
elif name == "bar":
do_bar_stuff()
elif name == "baz":
do_baz_stuff()
There's similar code in endElement() and characters() and of course, the more
tags that need processing the more unweildly each method becomes.
I could create a dictionary and dispatch to the correct method based on a
dictionary key. But it seems to me there must be a better way.
Is there?
--
Stand Fast,
tjg.
Timothy Grant
www.craigelachie.org
> I've worked with DOM before, but never SAX, I have a script that
> seems to work quite well, but every time I look at it I think it's
> amazingly unweildly and that I must be doing something wrong.
>
> def startElement(self, name, attr):
> if name == "foo":
> do_foo_stuff()
> elif name == "bar":
> do_bar_stuff()
> elif name == "baz":
> do_baz_stuff()
>
> There's similar code in endElement() and characters() and of course,
> the more tags that need processing the more unweildly each method
> becomes.
>
> I could create a dictionary and dispatch to the correct method based on a
> dictionary key. But it seems to me there must be a better way.
Tim,
No, you're not doing anything wrong in your above code. But, yes, code
like that can get pretty unwieldy when you've got deeply nested
structures. A couple of suggested alternative approaches:
1. As you already suggested, build up a dictionary mapping tag names
to functions. But that still means that you have to maintain state
information between your function calls.
2. Represent each new element that comes in as a python object
(object()), i.e. construct a "Python Object Model" or POM. All xml
attributes on the element become python attributes of the python
object. That object then goes onto a simple stack. Each object() also
has a list of children. When a new xml child element is passed,
through a nested startElement call, it too gets turned into a python
object, and made a child of the python object on the top of the stack.
Pop objects off the stack as you receive endElement events.
This paradigm is described here, with working code
http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/149368
3. Have a specialised python class for each element type (i.e. tag
name), that implements the "business logic" you want associated with
the element type. Write descriptors or properties for each specialised
class that, for example, only allow child python objects of a certain
type. Create a dictionary mapping tag names to your classes, and
instantiate a new python object of that class everytime you receive
the relevant tag name through a startElement call. Put it onto a
stack, as above, store nested xml elements as children of the
top-of-stack, and pop the stack when you receive endElement calls.
That's just a couple of approaches. There might be other, perhaps more
appropriate, methods, if you could describe what you're trying to
achieve. Are you extracting data from the documents? Or are you
checking that the documents comply with some form of hierarchical
structure? Or something else entirely?
HTH,
--
alan kennedy
-----------------------------------------------------
check http headers here: http://xhaus.com/headers
email alan: http://xhaus.com/mailto/alan
>I've worked with DOM before, but never SAX, I have a script that seems to
>work
>quite well, but every time I look at it I think it's amazingly unweildly and
>that I must be doing something wrong.
>
>def startElement(self, name, attr):
> if name == "foo":
> do_foo_stuff()
> elif name == "bar":
> do_bar_stuff()
> elif name == "baz":
> do_baz_stuff()
>
>There's similar code in endElement() and characters() and of course, the more
>tags that need processing the more unweildly each method becomes.
>
>I could create a dictionary and dispatch to the correct method based on a
>dictionary key. But it seems to me there must be a better way.
Please note that this is not a SAX-specific question. I suggest a subject
of How To Manage Function Dispatching.
Your question comes down to what's the "best" way to dispatch a piece of
logic based on some condition. You have already given 2 good alternatives:
your example, and using a dictionary. The other approach I sometimes use is
defining a class for each condition and then instantiating the class as needed.
Class foo:
def __init__(self):
# do foo stuff
etc for each case, then
try: x = eval(name+'()')
except: raise "Unknown name " + name
Which, of course, is just a dictionary lookup in disguise.
Bob Gailer
bga...@alum.rpi.edu
303 442 2625
def startElement(self, name, attr):
f = getattr(self, "do_" + name + "_stuff", None)
if f is not None:
f()
However, the dictionary interface is probably faster, and you can
build it up using introspection, like
class MyHandler:
def __init__(self):
self.start_d = {}
self.end_d = {}
for name in MyHandler.__dict__:
if name.startswith("do_") and name.endswith("_stuff"):
s = name[3:-6]
self.start_d[s] = getattr(self, name)
elif name.startswith("done_") and ...
s = name[5:-6]
self.end_d[s] = getattr(self, name)
def startElement(self, name, attrs):
f = self.start_d.get(name)
if f:
f()
...
Andrew
da...@dalkescientific.com
I appreciate your input!
On Tuesday 22 July 2003 11:02 am, Bob Gailer wrote:
> At 10:22 AM 7/22/2003 -0700, Timothy Grant wrote:
> >I've worked with DOM before, but never SAX, I have a script that seems to
> >work
> >quite well, but every time I look at it I think it's amazingly unweildly
> > and that I must be doing something wrong.
> >
> >def startElement(self, name, attr):
> > if name == "foo":
> > do_foo_stuff()
> > elif name == "bar":
> > do_bar_stuff()
> > elif name == "baz":
> > do_baz_stuff()
> >
> >There's similar code in endElement() and characters() and of course, the
> > more tags that need processing the more unweildly each method becomes.
--