I'm new to both YAML and SnakeYAML, so please let me know if I'm
making any hideous blunders. I've got a bit of a weird one, at least I
think it is.
###############
Context
###############
I'm trying to load Unity3D game engine files, which can be saved in
YAML, for a research project I'm working on. I don't have access to
the Unity3D source, so I don't know what they're using to write these
files. I can't change their output, but of course I could 'fix' the
files as a workaround.
I'll get to the file format specifics in a second, but first I'll
describe my overall goal. Our project only cares about some of the
file contents, so I was intending to load everything with a simple
Constructor. In it, I was thinking I'd make a few Construct classes
that deal with the specific stuff we care about and a GenericConstruct
that just handles Nodes as one of the simple scalar/map/list types.
###################
Expected File Format
###################
The file format is outlined on the Unity3D website (if you care,
http://unity3d.com/support/documentation/Manual/FormatDescription.html),
but the short version is thus:
1. Each game object component in a Unity Scene is saved as a separate
YAML Document. One YAML file per Scene, so multiple documents per
stream. There is minimal nesting (I can't find anything more than a
single nesting level).
2. Those files start with the directives:
%YAML 1.1
%TAG !u! tag:unity3d.com,2011:
3. Game Object Documents start with:
--- !u!<object_type_num> <anchor>
For example:
--- !u!29 &1
or:
--- !u!157 &4
Where 29 is the constant for a "Scene" component type, 157 for a
"LightmapSettings" component. My understanding is that these should
always expand to "tag:unity3d.com,2011:29" and
"tag:unity3d.com,2011:157" for the examples above.
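To make that expansion concrete, here is a minimal sketch of how I understand a tag shorthand gets resolved against the %TAG directive. The class and method names (TagShorthand, expand) are made up for illustration and are not part of SnakeYAML; it only covers named handles like !u!.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class TagShorthand {
    /**
     * Expands a shorthand such as "!u!29" into a full tag by looking up the
     * handle ("!u!") in the handle-to-prefix map built from %TAG directives.
     */
    static String expand(String shorthand, Map<String, String> handles) {
        // A named handle looks like "!name!": find the "!" that closes it.
        int end = shorthand.indexOf('!', 1);
        if (!shorthand.startsWith("!") || end < 0) {
            throw new IllegalArgumentException("not a named-handle shorthand: " + shorthand);
        }
        String handle = shorthand.substring(0, end + 1);
        String prefix = handles.get(handle);
        if (prefix == null) {
            // This is the situation I hit on the second document.
            throw new IllegalStateException("found undefined tag handle " + handle);
        }
        // The full tag is the declared prefix followed by the shorthand's suffix.
        return prefix + shorthand.substring(end + 1);
    }

    public static void main(String[] args) {
        Map<String, String> handles = new LinkedHashMap<>();
        handles.put("!u!", "tag:unity3d.com,2011:"); // from "%TAG !u! tag:unity3d.com,2011:"
        System.out.println(expand("!u!29", handles));
        System.out.println(expand("!u!157", handles));
    }
}
```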
#################
The Problem
#################
So now that's out of the way, here's my problem.
When I call yaml.loadAll(...), it's fine for the first Document in the
stream. On the next one, I get a ParserException that complains that
it "found undefined tag handle !u!". Inspecting the variables confirms
that !u! disappears from the tagHandles instance variable in
org.yaml.snakeyaml.parser.ParserImpl. This looks to be because
processDirectives() is called for every document and it discards the
old tagHandles list.
From my reading of the YAML spec (http://yaml.org/spec/1.1/#id898785),
it's supposed to preserve directives between documents:
"To ease the task of concatenating character streams, following
documents may begin with a byte order mark and comments, though the
same character encoding must be used through the stream. Each
following document must be explicit (begin with a document start
marker). If the document specifies no directives, it is parsed using
the same settings as the previous document. If the document does
specify any directives, all directives of previous documents, if any,
are ignored."
Since the file I'm parsing contains only explicit Documents and the
only directives appear at the top of the file before the first
Document, should it not carry forward to the next Document?
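Here is a small sketch of the behaviour I would expect from that paragraph of the 1.1 spec. It models only the tag handles (not the %YAML version), and DirectiveScope / startDocument are hypothetical names of my own, not SnakeYAML classes: a document that declares no directives inherits the previous document's handles, while one that declares any replaces them all.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

final class DirectiveScope {
    // Tag handles in effect for the most recently started document.
    private Map<String, String> handles = new LinkedHashMap<>();

    /**
     * Called at each document start with the handles that document declared
     * itself (an empty map if it declared none). Returns the handles that
     * apply to the document.
     */
    Map<String, String> startDocument(Map<String, String> declared) {
        if (!declared.isEmpty()) {
            // The document has its own directives: earlier ones are ignored.
            handles = new LinkedHashMap<>(declared);
        }
        // Otherwise keep using the previous document's settings.
        return Collections.unmodifiableMap(handles);
    }
}
```

With this model, the "--- !u!157 &4" document declares nothing and so would still see the !u! handle from the top of the file, which is the opposite of what I am observing.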
Since I'm still new to the format and the library, I hope that I'm
just Doing It Wrong, but I figured I'd ask people who know what's up.
So... any thoughts? Is this an esoteric bug or is there some setting
that I'm missing? Is my implementation strategy of having a generic
construct a bad idea? Should I start an issue ticket? Is there a
recommended workaround?
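For what it's worth, the 'fix the files' workaround I have in mind would look roughly like the sketch below: copy the directives from the top of the file in front of every later document, preceded by a "..." document end marker. DirectiveRepeater and repeatDirectives are my own hypothetical names, and the sketch assumes directives appear only at the very top of the stream and that no document already ends with "...". I have not confirmed that the parser accepts the result in every case.

```java
import java.util.ArrayList;
import java.util.List;

final class DirectiveRepeater {
    /** Returns the YAML text with the leading directives repeated before each later document. */
    static String repeatDirectives(String yaml) {
        // Keep trailing empty strings so a final newline survives the round trip.
        String[] lines = yaml.split("\r?\n", -1);

        // Collect the directive lines ("%YAML ...", "%TAG ...") at the top of the stream.
        List<String> directives = new ArrayList<>();
        for (String line : lines) {
            if (!line.startsWith("%")) {
                break;
            }
            directives.add(line);
        }

        List<String> out = new ArrayList<>();
        int documents = 0;
        for (String line : lines) {
            // Every explicit document after the first gets its own copy of the directives.
            if (isDocumentStart(line) && ++documents > 1) {
                out.add("...");          // end the previous document explicitly
                out.addAll(directives);  // then re-declare %YAML and %TAG
            }
            out.add(line);
        }
        return String.join("\n", out);
    }

    /** A document start marker is "---" at column 0, alone or followed by whitespace. */
    private static boolean isDocumentStart(String line) {
        return line.equals("---") || line.startsWith("--- ") || line.startsWith("---\t");
    }
}
```

The returned string could then be handed to yaml.loadAll(...) in place of the original file contents.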
PS. Let me know if you want any more information about anything. I was
trying to keep this short, but still provide as much info as needed.