XML Pretty Print


Mandolyte

Dec 11, 2017, 10:12:38 AM
to golang-nuts
At my previous company I had a Go program that would take any XML document and output it as XML in a nicer, human-readable format, i.e., pretty printed. I've been trying to find it again, but my searches have not turned anything up. Perhaps I had written it myself, but I don't think so.

Does anyone have a link to one?

Thanks in advance,
Cecil

C Banning

Dec 11, 2017, 11:03:12 AM
to golang-nuts
https://godoc.org/github.com/clbanning/mxj#BeautifyXml does that - but you've got to lug around a pretty large package just for that functionality.

C Banning

Dec 11, 2017, 11:08:00 AM
to golang-nuts
Well, pretty much all the functionality you need is in clbanning/mxj/xmlseq.go ... so you could cut it out and clean it up a bit.

Sam Whited

Dec 11, 2017, 11:38:44 AM
to golan...@googlegroups.com
My xmlstream package contains a formatting transformer that might do
what you want:

https://godoc.org/mellium.im/xmlstream

Note that this was written to test some changes for Go 1.10 and it may
only build there.

—Sam

Mandolyte

Dec 15, 2017, 12:27:21 PM
to golang-nuts
I was able to do a minimalist pretty print, with limitations, using just marshal/unmarshal. See the code here under the "identityXform" folder. There is a "readme" in that folder with some notes on this. I also tried my hand at blogging about my findings at http://www.mandolyte.info.

Thanks all for responses.
Cecil

Hugh S. Myers

Dec 15, 2017, 1:47:32 PM
to Mandolyte, golang-nuts
Has anyone tried HTML Tidy? I know it pretty-prints HTML, and I recall that the same is claimed for XML.

Tamás Gulácsi

Dec 16, 2017, 12:18:19 AM
to golang-nuts
A natural choice for XML pretty printing is xmllint (run as xmllint --format), which ships with libxml2; on Debian and Ubuntu it is in the libxml2-utils package.

howar...@gmail.com

Dec 20, 2017, 10:14:51 AM
to golang-nuts
One thing to consider is what level of change to the file is acceptable. If it is just to be read by a human, then the suggestions already given may do what you want. But if you want to take that result and preserve it for future use, then mxj and the encoding/xml-based methods may produce unwanted alterations. In that case a package like https://github.com/go-xmlfmt/xmlfmt might be more to your taste: it doesn't actually parse the XML properly at all, it just uses regular expressions to push things back and forth until they line up and look pretty. As far as I can see, it won't change anything other than whitespace, whereas a parse/pretty-print cycle may have effects such as replicating namespace entries or prefixes throughout, changing self-closing tags to tag pairs or vice versa, etc.

Regexps are pretty cringe-worthy when you see them used in code that is trying to parse HTML or XML and operate on it, but for this use case they might handle pretty-printing potentially invalid XML data more readily than a parser. That is, the regexp is not going to care if the XML file you got has unbalanced tags, etc., and will probably nicely format the first half of an XML file whose latter half got replaced with garbage from cross-linking or the like.