Writing add-on TreeBuilders

Robert Steed

Nov 15, 2016, 6:18:08 PM
to beautifulsoup
I've written a TreeBuilder for OFX (Open Financial eXchange, an SGML-based financial markup used by Quicken and others).  Why?  Because none of the parsers included with BeautifulSoup4 properly handles this markup's optional closing tags.
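
To illustrate what "optional closing tags" means here: in OFX 1.x, leaf elements that carry a value are usually left unclosed, while the aggregates that contain them are closed explicitly. The sketch below uses only the standard library and is not the OFXTreeBuilder itself; the sample data, the `TOKEN` pattern, and the `parse_ofx_fragment` name are made up for illustration, and it assumes that any start tag followed by text is a leaf.

```python
import re

# Hypothetical sample: the leaves <TRNTYPE>, <TRNAMT> and <NAME> have no
# closing tag, while the aggregate <STMTTRN> is closed explicitly.
SAMPLE = "<STMTTRN><TRNTYPE>DEBIT<TRNAMT>-50.00<NAME>Coffee</STMTTRN>"

# One token = an opening or closing tag plus any text that follows it.
TOKEN = re.compile(r"<(/?)([A-Za-z0-9_.]+)>([^<]*)")


def parse_ofx_fragment(text):
    """Return a nested list of (tag, value) and (tag, children) pairs."""
    root = ("ROOT", [])
    stack = [root]  # currently open aggregates, innermost last
    for closing, name, data in TOKEN.findall(text):
        data = data.strip()
        if closing:
            open_names = [n for n, _ in stack[1:]]
            if name in open_names:
                # Pop back to, and including, the matching aggregate.
                while stack.pop()[0] != name:
                    pass
            # Otherwise this closes a leaf that was already closed
            # implicitly, so there is nothing to do.
        elif data:
            # Assumption: a start tag followed by text is a leaf, and it
            # ends as soon as the next tag begins.
            stack[-1][1].append((name, data))
        else:
            # A start tag followed directly by another tag opens an aggregate.
            node = (name, [])
            stack[-1][1].append(node)
            stack.append(node)
    return root[1]


if __name__ == "__main__":
    print(parse_ofx_fragment(SAMPLE))
```

A general-purpose HTML or XML parser has no such rule, so it tends to nest each unclosed leaf inside the previous one instead of treating them as siblings.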

The class is defined as inheriting from bs4's TreeBuilder class:

    class OFXTreeBuilder(TreeBuilder):

I can then "make soup" using:

    soup = BeautifulSoup(fh, builder=OFXTreeBuilder())

So far, it works great.  I think others would find it useful, and I'm considering sharing it on GitHub and PyPI.  However, I'm concerned that the bs4 TreeBuilder interface may change.  While the interface architecture is pretty well documented in the code comments on Bitbucket, I have no idea how stable the specification is.

What can I expect?

leonardr

Nov 15, 2016, 8:46:44 PM
to beautifulsoup
Robert,

A good question. I would go ahead and share the code. The TreeBuilder interface has been stable since 2009 and I have no plans to change it.

Leonard