XML Stream Parser examples


Mandolyte

Apr 5, 2017, 7:44:21 PM
to golang-nuts
First, never thought I'd have to parse XML again (it's been over 10 years), but life happens...

After a lot of searching I found only a few examples using the streaming API. But I'm not sure the examples will work for me (I'll find out more tomorrow when I get back to the office). The XML I must parse is deeply nested, with the same element nodes at different levels; in general it is a bit unpredictable. It is, in essence, a set of rules for a rules engine/wizard. The files are moderate in size; I haven't seen any over 10K lines pretty-printed yet.

I need to parse until I find a known element, then, based on what I find, decode its child elements. This implies I need to treat the child elements as a document (I may have to add a fake root node, unless the decoder accepts sequences). But I haven't seen any examples of such an approach.

Thanks for any advice!
Cecil

Matt Harden

Apr 5, 2017, 8:17:29 PM
to Mandolyte, golang-nuts
Here's my favorite way to handle such situations. It can probably be adapted to your situation. https://play.golang.org/p/FQ0g4rytz3


Sam Whited

Apr 6, 2017, 1:05:41 AM
to Mandolyte, golang-nuts
On Wed, Apr 5, 2017 at 6:44 PM, Mandolyte <ceci...@gmail.com> wrote:
> I need to parse to find a known element, then based on what I find, decode
> its children elements. This implies I need to treat the child elements as a
> document (may have to add a fake root node, unless the decoder accepts
> sequences).

If you are parsing and you find the parent node that you were looking
for, you can continue and unmarshal the entire element and all its
children with DecodeElement() [1] by passing it the start element you
just found, so no need to split it out and wrap it in a custom root
node.
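
A minimal sketch of that approach, for illustration only: it reads tokens
until it sees a start element with a known name, then hands that start
element to DecodeElement(). The Rule type, the element names, and the
sample document are assumptions made up for this example; the real
schema is not shown in this thread.

```go
package main

import (
	"encoding/xml"
	"fmt"
	"io"
	"log"
	"strings"
)

// Rule is a hypothetical shape for the element being searched for.
type Rule struct {
	ID         string   `xml:"id,attr"`
	Conditions []string `xml:"condition"`
}

// sample places <rule> at two different depths (made-up data).
const sample = `<rules>
  <group>
    <rule id="a"><condition>x</condition><condition>y</condition></rule>
    <group>
      <rule id="b"><condition>z</condition></rule>
    </group>
  </group>
</rules>`

// decodeRules streams tokens from r and decodes every element called
// name, at any depth, into a Rule.
func decodeRules(r io.Reader, name string) ([]Rule, error) {
	d := xml.NewDecoder(r)
	var rules []Rule
	for {
		tok, err := d.Token()
		if err == io.EOF {
			return rules, nil
		}
		if err != nil {
			return nil, err
		}
		start, ok := tok.(xml.StartElement)
		if !ok || start.Name.Local != name {
			continue // not the element we are looking for
		}
		var rule Rule
		// DecodeElement consumes input up to the matching end tag,
		// so the loop resumes right after </rule>.
		if err := d.DecodeElement(&rule, &start); err != nil {
			return nil, err
		}
		rules = append(rules, rule)
	}
}

// parseSample runs decodeRules over the embedded sample document.
func parseSample() ([]Rule, error) {
	return decodeRules(strings.NewReader(sample), "rule")
}

func main() {
	rules, err := parseSample()
	if err != nil {
		log.Fatal(err)
	}
	for _, r := range rules {
		fmt.Printf("%s: %v\n", r.ID, r.Conditions)
	}
}
```

Note that DecodeElement() swallows the whole subtree, so a matching
element nested inside another matching element would not be visited by
the outer loop; it would have to be handled by the struct being decoded
into.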

However, you got me curious. If you *did* need to wrap an invalid XML
document without a root node, how much work would it be with the Go
standard library as it exists today? Since the XML libraries don't
provide a great way to do this, you'd have to wrap the underlying
reader, which got somewhat tedious and is probably a bit error prone:

https://play.golang.org/p/bPGVuYOVvO
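
For readers who cannot open the link, here is one simple way to wrap a
reader, sketched with io.MultiReader. This is not necessarily what the
playground code above does; the Items type, the "root" name, and the
fragment are made up for the example, and it assumes the input has no
XML declaration or other prolog that would end up inside the fake root.

```go
package main

import (
	"encoding/xml"
	"fmt"
	"io"
	"log"
	"strings"
)

// Items is a hypothetical target type for the wrapped fragment.
type Items struct {
	Items []string `xml:"item"`
}

// fragment has no single root element (made-up data).
const fragment = `<item>one</item><item>two</item>`

// wrapRoot returns a reader that yields <root>, then r, then </root>.
func wrapRoot(r io.Reader, root string) io.Reader {
	return io.MultiReader(
		strings.NewReader("<"+root+">"),
		r,
		strings.NewReader("</"+root+">"),
	)
}

// decodeFragment decodes the embedded fragment through the fake root.
func decodeFragment() (Items, error) {
	var v Items
	d := xml.NewDecoder(wrapRoot(strings.NewReader(fragment), "root"))
	err := d.Decode(&v)
	return v, err
}

func main() {
	v, err := decodeFragment()
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println(v.Items)
}
```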

However, if you have an API that lets you handle this sort of thing at
the token level [2, 3], it becomes a little bit nicer (this won't
actually run on the playground, of course):

https://play.golang.org/p/tNhD_0-QEL
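
A standard-library-only sketch of the same idea, assuming Go 1.10 or
later, where encoding/xml provides the xml.TokenReader interface and
xml.NewTokenDecoder. It does not use mellium.im/xmlstream and is not the
playground code above; the rootWrapper type, the Items type, and the
fragment are hypothetical names and data invented for the example.

```go
package main

import (
	"encoding/xml"
	"fmt"
	"io"
	"log"
	"strings"
)

// Items is a hypothetical target type for the wrapped fragment.
type Items struct {
	Items []string `xml:"item"`
}

// fragment has no single root element (made-up data).
const fragment = `<item>one</item><item>two</item>`

// rootWrapper implements xml.TokenReader. It emits a synthetic start
// element, then every token from inner, then the matching end element.
type rootWrapper struct {
	inner xml.TokenReader
	name  xml.Name
	begun bool // synthetic start element already emitted
	done  bool // synthetic end element already emitted
}

func (w *rootWrapper) Token() (xml.Token, error) {
	if !w.begun {
		w.begun = true
		return xml.StartElement{Name: w.name}, nil
	}
	if w.done {
		return nil, io.EOF
	}
	tok, err := w.inner.Token()
	if tok == nil && err == io.EOF {
		// Inner stream is exhausted: close the fake root.
		w.done = true
		return xml.EndElement{Name: w.name}, nil
	}
	// Copy the token because the inner decoder may reuse its buffers.
	return xml.CopyToken(tok), err
}

// decodeWrapped decodes the embedded fragment through the token wrapper.
func decodeWrapped() (Items, error) {
	inner := xml.NewDecoder(strings.NewReader(fragment))
	outer := xml.NewTokenDecoder(&rootWrapper{
		inner: inner,
		name:  xml.Name{Local: "root"},
	})
	var v Items
	err := outer.Decode(&v)
	return v, err
}

func main() {
	v, err := decodeWrapped()
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println(v.Items)
}
```

Working on tokens avoids splicing raw bytes, so the wrapper never has to
care about how the underlying input is buffered or encoded.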


—Sam


[1]: https://godoc.org/encoding/xml#Decoder.DecodeElement
[2]: https://golang.org/issue/19480
[3]: https://godoc.org/mellium.im/xmlstream