Parsing XML with namespaces in golang

emarti...@gmail.com

Jul 23, 2017, 4:51:41 PM
to golang-nuts
Hello,

So I'm trying to unmarshal an XML document with namespaces in Go, but I just can't find the right parser to do so.

My code:

package main

import "fmt"
import "encoding/xml"

type Root struct {
    MNResultsResults []ls `xml:"xmlns ls, MNResultsResults>ssResultSet>ssResult"`
}

type ls struct {
    Data float64 `xml:"xmlns ls, cpc"`
}


func main() {
    x := `<?xml version="1.0" encoding="utf-8"?>
 <ls:MNResultsResults xmlns:ls="urn:MNResultsResults">  
 <ls:ssResultSet ls:firstResult="1" ls:numResults="2" ls:totalHits="2">    
    <ls:ssResult ls:id="1">           
    <ls:abstract><![CDATA[some data.]]></ls:abstract>
    <ls:title><![CDATA[some data]]></ls:title>          
     <ls:url><![CDATA[data]]></ls:url>          
      <ls:displayUrl><![CDATA[some data]]></ls:displayUrl>          
       <ls:cpc>float value</ls:cpc>
    </ls:ssResult>       
    <ls:ssResult ls:id="2">       
        <ls:abstract><![CDATA[some data.]]></ls:abstract>       
        <ls:title><![CDATA[some data.]]></ls:title>
        <ls:url><![CDATA[some data]]></ls:url>      
        <ls:displayUrl><![CDATA[some data]]></ls:displayUrl>       
        <ls:cpc>float value</ls:cpc>    
        </ls:ssResult>
        </ls:ssResultSet>
    </ls:MNResultsResults>`
   
    r := Root{}
    xml.Unmarshal([]byte(x), &r)
    fmt.Printf("%+v", r)
}

I've tried several combinations, but I always get a nil result.

Any ideas how to parse this?

Konstantin Khomoutov

Jul 24, 2017, 4:04:58 AM
to emarti...@gmail.com, golang-nuts
On Sun, Jul 23, 2017 at 01:51:41PM -0700, emarti...@gmail.com wrote:

Hi!

> So I'm trying to unmarshal an XML with namespaces in go but i just can't
> find the right parser to do so.
[...]

This (elided for brevity)

----------------8<----------------
package main

import (
	"encoding/xml"
	"fmt"
)

type Root struct {
	XMLName struct{}  `xml:"urn:MNResultsResults MNResultsResults"`
	Cpcs    []float64 `xml:"urn:MNResultsResults ssResultSet>ssResult>cpc"`
}

const s = `<?xml version="1.0" encoding="utf-8"?>
<ls:MNResultsResults xmlns:ls="urn:MNResultsResults">
<ls:ssResultSet ls:firstResult="1" ls:numResults="2" ls:totalHits="2">
<ls:ssResult ls:id="1">
<ls:cpc>42.0</ls:cpc>
</ls:ssResult>
<ls:ssResult ls:id="2">
<ls:cpc>12.333</ls:cpc>
</ls:ssResult>
</ls:ssResultSet>
</ls:MNResultsResults>`

func main() {
	r := Root{}
	err := xml.Unmarshal([]byte(s), &r)
	if err != nil {
		panic(err)
	}
	fmt.Printf("%+v", r)
}
----------------8<----------------

works by printing

{XMLName:{} Cpcs:[42 12.333]}

(Playground link: https://play.golang.org/p/RehqytlFQ9).

The chief idea is that the namespace must be separated from the element
or attribute name it qualifies with a space character.
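
As a minimal sketch of the same rule applied to an attribute as well as to elements: each ssResult can be decoded into a struct that captures both the ls:id attribute and the ls:cpc element. The Result type, its field names, and the parseResults helper are hypothetical names chosen for illustration; they are not from the original post.

```go
package main

import (
	"encoding/xml"
	"fmt"
)

// Result is a hypothetical type representing one <ls:ssResult> element.
type Result struct {
	// Namespace URL, a space, then the local name; ",attr" marks an attribute.
	ID  int     `xml:"urn:MNResultsResults id,attr"`
	Cpc float64 `xml:"urn:MNResultsResults cpc"`
}

type Root struct {
	XMLName xml.Name `xml:"urn:MNResultsResults MNResultsResults"`
	Results []Result `xml:"urn:MNResultsResults ssResultSet>ssResult"`
}

const doc = `<?xml version="1.0" encoding="utf-8"?>
<ls:MNResultsResults xmlns:ls="urn:MNResultsResults">
<ls:ssResultSet ls:firstResult="1" ls:numResults="2" ls:totalHits="2">
<ls:ssResult ls:id="1">
<ls:cpc>42.0</ls:cpc>
</ls:ssResult>
<ls:ssResult ls:id="2">
<ls:cpc>12.333</ls:cpc>
</ls:ssResult>
</ls:ssResultSet>
</ls:MNResultsResults>`

// parseResults (hypothetical helper) decodes s into a Root.
func parseResults(s string) (Root, error) {
	var r Root
	err := xml.Unmarshal([]byte(s), &r)
	return r, err
}

func main() {
	r, err := parseResults(doc)
	if err != nil {
		panic(err)
	}
	fmt.Printf("%+v\n", r.Results) // prints [{ID:1 Cpc:42} {ID:2 Cpc:12.333}]
}
```

Note that the struct tags use the namespace URL (urn:MNResultsResults), never the ls prefix; the prefix is only a document-local alias for that URL.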

emarti...@gmail.com

Jul 24, 2017, 10:44:34 PM
to golang-nuts, emarti...@gmail.com
Thanks!


    XMLName struct{} `xml:"urn:MNResultsResults MNResultsResults"`
    Cpcs []float64 `xml:"urn:MNResultsResults ssResultSet>ssResult>cpc"`

Is there any reason to declare the namespace twice? First we declare XMLName (which doesn't seem to be used), and then we use urn:MNResultsResults again on the Cpcs field.

Tamás Gulácsi

Jul 25, 2017, 12:43:48 AM
to golang-nuts
No, you don't declare the namespace; you specify which namespace the tag you're searching for is in.

Konstantin Khomoutov

Jul 25, 2017, 2:57:01 AM
to emarti...@gmail.com, golang-nuts
On Mon, Jul 24, 2017 at 07:44:33PM -0700, emarti...@gmail.com wrote:

[...]
>>> So I'm trying to unmarshal an XML with namespaces in go but i just can't
>>> find the right parser to do so.
[...]
>> type Root struct {
>> XMLName struct{} `xml:"urn:MNResultsResults MNResultsResults"`
>> Cpcs []float64 `xml:"urn:MNResultsResults ssResultSet>ssResult>cpc"`
[...]
>> const s = `<?xml version="1.0" encoding="utf-8"?>
>> <ls:MNResultsResults xmlns:ls="urn:MNResultsResults">
>> <ls:ssResultSet ls:firstResult="1" ls:numResults="2" ls:totalHits="2">
>> <ls:ssResult ls:id="1">
>> <ls:cpc>42.0</ls:cpc>
[...]
> Is there any reason to declare the namespace 2 times?
> At first we declare XMLName (which doesn't seem to be used)

No, it is used: please read the documentation for
encoding/xml.Unmarshal carefully (run `go doc xml.Unmarshal` or see [1]).
To cite the unmarshaling rules from there:

| * If the XMLName field has an associated tag of the form
| "name" or "namespace-URL name", the XML element must have the
| given name (and, optionally, name space) or else Unmarshal
| returns an error.

So a field named "XMLName" is special: it serves (among other
purposes, if needed) to tell the decoder the name, possibly
namespaced, of the XML element which is to be represented by the
data type containing that field.
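
A small sketch of that rule in action, assuming a hypothetical strictRoot type and checkRoot helper (neither is from the thread): Unmarshal succeeds only when the root element has both the local name and the namespace given in the XMLName tag.

```go
package main

import (
	"encoding/xml"
	"fmt"
)

// strictRoot (hypothetical) only accepts a root element named
// MNResultsResults in the urn:MNResultsResults namespace.
type strictRoot struct {
	XMLName xml.Name `xml:"urn:MNResultsResults MNResultsResults"`
}

// checkRoot (hypothetical helper) returns the error, if any, from
// decoding s into a strictRoot.
func checkRoot(s string) error {
	var r strictRoot
	return xml.Unmarshal([]byte(s), &r)
}

func main() {
	good := `<ls:MNResultsResults xmlns:ls="urn:MNResultsResults"/>`
	wrongNS := `<ls:MNResultsResults xmlns:ls="urn:other"/>`
	wrongName := `<ls:Other xmlns:ls="urn:MNResultsResults"/>`

	fmt.Println(checkRoot(good) == nil)      // true: name and namespace match
	fmt.Println(checkRoot(wrongNS) != nil)   // true: namespace mismatch is an error
	fmt.Println(checkRoot(wrongName) != nil) // true: local name mismatch is an error
}
```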

> and then we use urn:MNResultsResults again at CPC []

You're correct on this point: the second namespace is not needed.
Again, to cite the doc:

| * If the XML element contains a sub-element whose name matches
| the prefix of a tag formatted as "a" or "a>b>c", unmarshal
| will descend into the XML structure looking for elements with the
| given names, and will map the innermost elements to that struct
| field. A tag starting with ">" is equivalent to one starting
| with the field name followed by ">".

From this definition it's not clear whether the full (namespaced)
names of the elements are meant, or the local ones, or whether both
are recognized -- that is, whether
"XML-namespace name>Another-XML-namespace another-name" is
understood, too. I had assumed the latter, but now I'm not sure.

It seems to work both ways on the playground, so I'd drop the NS from
the tag of the Cpcs field.

1. https://golang.org/pkg/encoding/xml/#Unmarshal

Matt Harden

Jul 25, 2017, 10:22:52 PM
to Konstantin Khomoutov, emarti...@gmail.com, golang-nuts
When you leave the namespace out, it will match that tag in any namespace (or none).
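
To illustrate the difference, here is a minimal sketch with a made-up document in which cpc elements appear in two namespaces and in none. The anyNS and oneNS types, the urn:one and urn:two namespaces, and the decodeBoth helper are all hypothetical and chosen only for this example.

```go
package main

import (
	"encoding/xml"
	"fmt"
)

// anyNS has no namespace in its tag, so it matches <cpc> in any
// namespace, or in none.
type anyNS struct {
	Cpcs []float64 `xml:"cpc"`
}

// oneNS names a namespace, so it matches only <cpc> in urn:one.
type oneNS struct {
	Cpcs []float64 `xml:"urn:one cpc"`
}

// sample is a made-up document, not the one from the thread.
const sample = `<root xmlns:a="urn:one" xmlns:b="urn:two">
<a:cpc>1.5</a:cpc>
<b:cpc>2.5</b:cpc>
<cpc>3.5</cpc>
</root>`

// decodeBoth (hypothetical helper) decodes s into both struct types.
func decodeBoth(s string) (anyNS, oneNS, error) {
	var a anyNS
	var o oneNS
	if err := xml.Unmarshal([]byte(s), &a); err != nil {
		return a, o, err
	}
	err := xml.Unmarshal([]byte(s), &o)
	return a, o, err
}

func main() {
	a, o, err := decodeBoth(sample)
	if err != nil {
		panic(err)
	}
	fmt.Println(a.Cpcs) // prints [1.5 2.5 3.5]
	fmt.Println(o.Cpcs) // prints [1.5]
}
```

So dropping the namespace from a field tag is fine when the local name is unambiguous in the document, while keeping it guards against picking up same-named elements from other vocabularies.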
