XML Unmarshal, Choice of Elements

Graham MacDonald

Jul 17, 2014, 4:48:10 PM
to golan...@googlegroups.com
Hi

I'm trying to parse an example SVG file using the encoding/xml Unmarshal function. This works very well, except that I can't figure out how best to handle the case of multiple different child elements, where order matters. E.g.:
<g>
  <path .../>
  <rect .../>
</g>

My current group struct is something like:
type Group struct {
  Id       string        `xml:"id,attr"`
  Elements []interface{} `xml:",any"`
}

I have Rect and Path structs (named exactly that; I also tried all-lowercase names). When I Unmarshal, this creates a two-element slice of nil interfaces. This is kind of understandable, as the Unmarshal function doesn't know about my Rect and Path structs.

What's the best way of handling this case?

I've considered having one god-struct with all the possible properties of path, rect (and circle, line, polygon, text, etc). I could then have different interfaces for the individual types.  It's still pretty messy having such a large struct though.

Another thing I've thought about is having the Unmarshal call extract the raw XML and parse each child element of <g> manually.  Again, not great.

Any better ways?

Thanks,
Graham

Shawn Milochik

Jul 17, 2014, 5:07:56 PM
to golan...@googlegroups.com
Can you share some XML? It just happens that I've been doing a lot of XML parsing with deeply-nested and variable (within a common overall schema) files. Maybe I can help a bit if you put up some XML and the desired output.

Graham MacDonald

Jul 17, 2014, 5:13:50 PM
to golan...@googlegroups.com, Sh...@milochik.com
An example SVG file I'm trying to parse is:

<svg width="79px" height="114px" viewBox="0 0 79 114" version="1.1" xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" xmlns:sketch="http://www.bohemiancoding.com/sketch/ns">
    <!-- Generator: Sketch 3.0.4 (8053) - http://www.bohemiancoding.com/sketch -->
    <title>ship</title>
    <desc>Created with Sketch.</desc>
    <defs></defs>
    <g id="Page-1" stroke="none" stroke-width="1" fill="none" fill-rule="evenodd" sketch:type="MSPage">
        <path d="M70.7470703,54.9351921 C59.4438539,23.2101932 39.4404297,-0.0302734375 39.4404297,-0.0302734375 C39.4404297,-0.0302734375 19.8288957,22.9468825 8.09220641,54.9351916 C3.08063764,68.5942062 -0.495117188,83.8962169 -0.495117188,99.7539062 L19.5214844,108.566406 L59.1821013,108.566406 L79.4462891,100.046875 C79.4462891,100.046875 75.1865234,67.3955078 70.7470703,54.9351921 Z" id="Path-1" fill="#D8D8D8" sketch:type="MSShapeGroup"></path>
        <rect id="Rectangle-1" fill="#F6A623" sketch:type="MSShapeGroup" x="22" y="107" width="34" height="7"></rect>
    </g>
</svg>

egon

Jul 17, 2014, 5:38:39 PM
to golan...@googlegroups.com
There is xml.Decoder that should allow you to handle this better.


And you can mix it up with DecodeElement.

+ egon

Shawn Milochik

Jul 17, 2014, 5:39:30 PM
to golan...@googlegroups.com
This works for parsing your file. I don't know if it answers your question, partially because you didn't specify what your desired output is.


Let us know.

Graham MacDonald

Jul 17, 2014, 6:31:08 PM
to golan...@googlegroups.com, sh...@milochik.com
This would work for this particular file, but a group can contain any combination of path, rect, circle, line, polygon, text, etc., so I'd need something more general.  Thanks for the suggestion though!

As for output, at this stage I'm looking for an equivalent representation of the XML in struct form (if I use Unmarshal).

Graham MacDonald

Jul 17, 2014, 6:32:36 PM
to golan...@googlegroups.com
Yeah, it may be that another approach is more suitable in this case, though it would be nice to avoid implementing a state-machine-like decoder.  I may not have a choice though - thanks for the suggestion!

Shawn Milochik

Jul 17, 2014, 6:39:35 PM
to golan...@googlegroups.com
On Thu, Jul 17, 2014 at 6:31 PM, Graham MacDonald <grahamam...@gmail.com> wrote:
This would work for this particular file, but a group can contain any combination of path, rect, circle, line, polygon, text, etc., so I'd need something more general.  Thanks for the suggestion though!

As for output, at this stage I'm looking for an equivalent representation of the XML in struct form (if I use Unmarshal).


If you create structs for all the possible types and add them to Group then it will provide the correct output in all cases. It also creates more explicit code so that anyone reading it understands what data they're dealing with. Of course this doesn't address handling different "types" of groups differently, but neither does the equivalent of Python's "json.loads," which seems to be what you're asking for. 

I believe you're looking for some magic that doesn't exist in Go.

 

Graham MacDonald

Jul 17, 2014, 6:48:58 PM
to golan...@googlegroups.com, sh...@milochik.com
Yes, if I created an array of each possible type as a child of Group, then it could work, but it would lose the ordering, which is important in this case.

Dan Kortschak

Jul 17, 2014, 7:10:35 PM
to Graham MacDonald, golan...@googlegroups.com, sh...@milochik.com
You can do some pretty interesting magic with the xml.Unmarshaler* interfaces. If you define a type whose children satisfy these interfaces but which act on, say, a slice of the data types that you actually want to interact with, rather than on themselves, the order can be preserved.

Graham MacDonald

Jul 18, 2014, 2:53:30 AM
to golan...@googlegroups.com, grahamam...@gmail.com, sh...@milochik.com
I tried embedding structs, but that didn't work so well with duplicate members (although they could be pulled up into the wrapping struct).

My current solution is an implementation of xml.Unmarshaler, as suggested by kortschak, using the xml.Decoder, as suggested by egon.  Here it is, though I'm sure there's room for improvement.  Thanks!

// Requires the "encoding/xml" and "strconv" imports.
func (g *Group) UnmarshalXML(decoder *xml.Decoder, start xml.StartElement) error {
	// Copy the attributes of <g> that we care about; others are ignored.
	for _, attr := range start.Attr {
		switch attr.Name.Local {
		case "id":
			g.Id = attr.Value
		case "stroke":
			g.Stroke = attr.Value
		case "stroke-width":
			intValue, err := strconv.ParseInt(attr.Value, 10, 32)
			if err != nil {
				return err
			}
			g.StrokeWidth = int32(intValue)
		case "fill":
			g.Fill = attr.Value
		case "fill-rule":
			g.FillRule = attr.Value
		}
	}

	// Walk the children of <g> in document order.
	for {
		token, err := decoder.Token()
		if err != nil {
			return err
		}

		switch tok := token.(type) {
		case xml.StartElement:
			var elementStruct interface{}

			switch tok.Name.Local {
			case "rect":
				elementStruct = &Rect{}
			case "path":
				elementStruct = &Path{}
			default:
				// Unknown child: consume it so the decoder stays in sync.
				// Passing a nil interface to DecodeElement would fail.
				if err := decoder.Skip(); err != nil {
					return err
				}
				continue
			}

			if err := decoder.DecodeElement(elementStruct, &tok); err != nil {
				return err
			}
			g.Elements = append(g.Elements, elementStruct)

		case xml.EndElement:
			// Children are fully consumed above, so this is </g>.
			return nil
		}
	}
}

egon

Jul 18, 2014, 3:22:08 AM
to golan...@googlegroups.com, grahamam...@gmail.com, sh...@milochik.com
Please use play.golang.org to post runnable, compilable examples; it makes life easier for people trying to help you. Thanks.

+ egon

Graham MacDonald

Jul 18, 2014, 4:13:43 AM
to golan...@googlegroups.com, grahamam...@gmail.com, sh...@milochik.com

egon

Jul 18, 2014, 4:45:19 AM
to golan...@googlegroups.com, grahamam...@gmail.com, sh...@milochik.com
Add the XMLName and the group tags then it can automatically marshal as well. http://play.golang.org/p/gZYcNfZBRl
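A rough sketch of the XMLName idea, written independently of the linked playground code, so the exact struct layout here is an assumption: when each element struct carries an `XMLName` field with its tag name, `xml.Marshal` can write a mixed `[]interface{}` back out in slice order using each element's own name. The `marshalGroup` helper is hypothetical.

```go
package main

import (
	"encoding/xml"
	"fmt"
)

type Path struct {
	XMLName xml.Name `xml:"path"`
	ID      string   `xml:"id,attr"`
}

type Rect struct {
	XMLName xml.Name `xml:"rect"`
	ID      string   `xml:"id,attr"`
}

// Group holds its children as interfaces; on marshalling, each child's
// XMLName takes precedence over the field name for the element name.
type Group struct {
	XMLName  xml.Name      `xml:"g"`
	ID       string        `xml:"id,attr"`
	Elements []interface{} `xml:",any"`
}

// marshalGroup (hypothetical helper) returns the group as an XML string.
func marshalGroup(g Group) (string, error) {
	out, err := xml.Marshal(g)
	return string(out), err
}

func main() {
	g := Group{
		ID: "Page-1",
		Elements: []interface{}{
			&Path{ID: "Path-1"},
			&Rect{ID: "Rectangle-1"},
		},
	}
	s, err := marshalGroup(g)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(s)
}
```

This covers only the marshalling direction; unmarshalling into the `[]interface{}` field still needs a custom `UnmarshalXML`, since the decoder cannot pick concrete types on its own.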

It would be possible to automate the attribute handling, e.g. http://golang.org/src/pkg/encoding/xml/read.go?s=#L228 but probably it's not worth the effort.

+ egon