processing xml when performance is top priority

S Ahmed

Jan 1, 2014, 11:27:37 AM
to golan...@googlegroups.com
Hello,

Realizing that this may be a more verbose way of writing code, what are currently known to be some of the fastest methods (in both time and memory usage) of parsing XML in Go?

Currently I am doing this in Java using Xerces (SAX), where I have to handle events. That makes for less readable code in general, but it is supposed to be much faster than loading the entire DOM into memory.

Looking to learn and hopefully see some sample code on how to do this in Go.

Thanks!

egon

Jan 1, 2014, 11:59:41 AM
to golan...@googlegroups.com


On Wednesday, January 1, 2014 6:27:37 PM UTC+2, gitted wrote:
Hello,

Realizing that this may be a more verbose way of writing code, what are currently known to be some of the fastest methods (in both time and memory usage) of parsing XML in Go?

Be more specific: how much of the XML spec do you really need to worry about? If speed is of the essence, ignoring some parts of the spec can give you speed-ups.

Do you just need parsing, or do you need to do something with the result? Describe the full workflow: are you extracting, inspecting, or transforming values? Do you need streaming behavior? Can you substitute something else for XML?
 

Currently I am doing this in Java using Xerces (SAX), where I have to handle events. That makes for less readable code in general, but it is supposed to be much faster than loading the entire DOM into memory.

SAX-style parsing will be faster and more memory efficient, at the cost of not having fast random access.
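
For reference, the closest standard-library analogue to SAX in Go is the token loop of encoding/xml's Decoder. The sketch below is only an illustration: the function names and the sample document are made up for this example, not taken from the original poster's data. It counts element names while holding one token at a time.

```go
package main

import (
	"encoding/xml"
	"fmt"
	"io"
	"strings"
)

// countElements reads XML from r one token at a time and counts how often
// each element name starts. No tree of the document is built.
func countElements(r io.Reader) (map[string]int, error) {
	dec := xml.NewDecoder(r)
	counts := make(map[string]int)
	for {
		tok, err := dec.Token()
		if err == io.EOF {
			return counts, nil // end of input reached
		}
		if err != nil {
			return nil, err
		}
		// A StartElement token plays the role of SAX's startElement event.
		if se, ok := tok.(xml.StartElement); ok {
			counts[se.Name.Local]++
		}
	}
}

// countElementsInString is a convenience wrapper for in-memory documents.
func countElementsInString(doc string) (map[string]int, error) {
	return countElements(strings.NewReader(doc))
}

func main() {
	// Hypothetical sample input.
	const doc = `<items><item id="1">a</item><item id="2">b</item></items>`
	counts, err := countElementsInString(doc)
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	fmt.Println("items:", counts["items"], "item:", counts["item"])
}
```

In real use the reader would typically be an *os.File or a network body, so memory use stays roughly constant regardless of document size. The trade-off is the one described above: you only ever see the current token, so anything you need later has to be saved by your own code.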
 

Looking to learn and hopefully see some sample code on how to do this in Go.


S Ahmed

Jan 1, 2014, 1:59:51 PM
to golan...@googlegroups.com
Using Java, I am parsing the XML and mapping the values to an object. I then persist the XML to MySQL, and when I read from the DB I map it back to a Java object.


egon

Jan 1, 2014, 3:21:22 PM
to golan...@googlegroups.com
Could you use XML -> object -> binary serialize -> DB -> binary deserialize -> object? Or do you need XML in the DB for backward compatibility? Parsing XML will generally be slower than a language-specific binary serialization.

So do you want to rewrite the Java code in Go, or do you need to process what is already in the DB?
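
A minimal sketch of that pipeline in Go, assuming encoding/gob as the binary format (the thread does not name one) and a hypothetical Item type standing in for the real schema. The database step is left out; the byte slice is what would go into a BLOB column.

```go
package main

import (
	"bytes"
	"encoding/gob"
	"encoding/xml"
	"fmt"
)

// Item is a hypothetical record type; the real schema is not shown in the thread.
type Item struct {
	ID   int    `xml:"id,attr"`
	Name string `xml:"name"`
}

// fromXML parses the incoming XML once.
func fromXML(data []byte) (Item, error) {
	var it Item
	err := xml.Unmarshal(data, &it)
	return it, err
}

// toBlob gob-encodes the object into bytes that could be stored in the DB.
func toBlob(it Item) ([]byte, error) {
	var buf bytes.Buffer
	if err := gob.NewEncoder(&buf).Encode(it); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

// fromBlob restores the object from stored bytes without touching XML again.
func fromBlob(blob []byte) (Item, error) {
	var it Item
	err := gob.NewDecoder(bytes.NewReader(blob)).Decode(&it)
	return it, err
}

func main() {
	it, err := fromXML([]byte(`<item id="7"><name>widget</name></item>`))
	if err != nil {
		fmt.Println("xml error:", err)
		return
	}
	blob, err := toBlob(it)
	if err != nil {
		fmt.Println("encode error:", err)
		return
	}
	back, err := fromBlob(blob)
	if err != nil {
		fmt.Println("decode error:", err)
		return
	}
	fmt.Printf("%+v (%d bytes stored)\n", back, len(blob))
}
```

Note that gob is Go-specific, so this only fits if every reader of the DB is a Go program; if other services read the same rows, a language-neutral format would be the safer assumption.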

+ egon

S Ahmed

Jan 1, 2014, 3:54:20 PM
to golan...@googlegroups.com
I have a service that is written in java that I am thinking of re-writing in Go, or at least investigating it b/c I like the quick compiles of Go and I think this would be a great project for me to get my hands dirty with Go.

I could do the xml -> object -> binary serialized -> db method as you describe, but first I need to parse the xml to get it into an object.

egon

Jan 1, 2014, 4:12:41 PM
to golan...@googlegroups.com

On Wednesday, January 1, 2014 10:54:20 PM UTC+2, gitted wrote:
I have a service that is written in java that I am thinking of re-writing in Go, or at least investigating it b/c I like the quick compiles of Go and I think this would be a great project for me to get my hands dirty with Go.

I could do the xml -> object -> binary serialized -> db method as you describe, but first I need to parse the xml to get it into an object.

Yes, but then the performance of XML parsing isn't that important anymore (I assume it's done once per object and the number of objects is reasonable). So you can take the trivial approach and use Unmarshal from encoding/xml (http://golang.org/pkg/encoding/xml/#Unmarshal). If you hit a memory limit, then use the technique for parsing huge XML files.
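
To make the two options concrete, here is a rough sketch. The Record and Feed types and the element names are invented for illustration. parseAll is the trivial Unmarshal approach; parseStream is one common reading of the "huge files" technique, where the Decoder walks tokens and DecodeElement unmarshals just one record at a time.

```go
package main

import (
	"encoding/xml"
	"fmt"
	"io"
	"strings"
)

// Record and Feed are hypothetical types; adjust the tags to the real schema.
type Record struct {
	ID   int    `xml:"id,attr"`
	Name string `xml:"name"`
}

type Feed struct {
	Records []Record `xml:"record"`
}

// parseAll is the trivial approach: the whole document is in memory
// and a single Unmarshal call fills the struct.
func parseAll(data []byte) ([]Record, error) {
	var f Feed
	if err := xml.Unmarshal(data, &f); err != nil {
		return nil, err
	}
	return f.Records, nil
}

// parseStream scans tokens and decodes one <record> element at a time,
// passing each to handle, so only a single record is held at once.
func parseStream(r io.Reader, handle func(Record) error) error {
	dec := xml.NewDecoder(r)
	for {
		tok, err := dec.Token()
		if err == io.EOF {
			return nil
		}
		if err != nil {
			return err
		}
		se, ok := tok.(xml.StartElement)
		if !ok || se.Name.Local != "record" {
			continue
		}
		var rec Record
		if err := dec.DecodeElement(&rec, &se); err != nil {
			return err
		}
		if err := handle(rec); err != nil {
			return err
		}
	}
}

// collect runs parseStream over an in-memory document and gathers the records.
func collect(doc string) ([]Record, error) {
	var out []Record
	err := parseStream(strings.NewReader(doc), func(r Record) error {
		out = append(out, r)
		return nil
	})
	return out, err
}

func main() {
	const doc = `<feed><record id="1"><name>a</name></record><record id="2"><name>b</name></record></feed>`
	all, err := parseAll([]byte(doc))
	fmt.Println(all, err)
	streamed, err := collect(doc)
	fmt.Println(streamed, err)
}
```

Both functions produce the same records here. The streaming version only pays off when the document is too large to hold in memory or when records can be handled (for example, written to the DB) as they arrive.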

+ egon