I posted some time ago about eventually attempting to implement the
Ruby 1.9 YAML support ("Psych") with SnakeYAML, and tonight I've
finally done it.
The process was almost painless; the SnakeYAML API matches libyaml
closely, and Psych was implemented mostly in Ruby with only a small
core wrapping libyaml's exported functions. There were only a few
places where things did not match libyaml, which I'll summarize below.
I would like to congratulate Andrey on an excellent job combining the
libyaml API and his own implementation.
I hope to post more once I start running tests on Psych, but so far
it's definitely functional.
So, here are the issues I ran into:
* YAML is supposed to always be Unicode, so supporting arbitrary
encodings is not a big deal. However, SnakeYAML uses all Java strings,
which means it always pays the cost of decoding byte[] to char[], and
when going back out to Ruby we pay the cost to encode char[] to
byte[]. I don't think there's a way around this, but I have a concern
that we might have to make a fork of SnakeYAML in the future that can
deal with byte[] directly.
* I could see no way to update an Emitter's settings after it has been
created, as in the libyaml functions yaml_emitter_set_canonical,
yaml_emitter_set_indent, etc. For now, I'm updating the DumperOptions
I created the Emitter with and hoping they propagate.
* There was no way to specify encodings, presumably because all
incoming YAML data comes from a Reader and all outgoing YAML goes to a
Writer. This is a gap relative to the libyaml API, but it's more a
challenge of being on the JVM... we always work with UTF-16.
* Event should have a getID, so that the ID enum can be used in a
switch. This would be faster than chaining if (event.is(...)) calls,
which is what I had to do for now.
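
To make the encoding items above concrete, here is a minimal sketch of the round trip I mean, using only JDK classes. The class and method names (EncodingRoundTripSketch, roundTrip) are made up for illustration, and nothing here is SnakeYAML's own code; it only shows the byte[] to char[] decode on the way in and the char[] to byte[] encode on the way out that any Reader/Writer based API implies.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.StandardCharsets;

// Hypothetical illustration class; not part of SnakeYAML or JRuby.
public class EncodingRoundTripSketch {

    // Decodes UTF-8 bytes to chars through a Reader, then encodes them
    // back to UTF-8 bytes through a Writer.
    static byte[] roundTrip(byte[] utf8Input) {
        try {
            // Inbound cost: byte[] -> char[] (UTF-16 inside the JVM).
            StringBuilder chars = new StringBuilder();
            try (Reader reader = new InputStreamReader(
                    new ByteArrayInputStream(utf8Input), StandardCharsets.UTF_8)) {
                char[] buf = new char[1024];
                int n;
                while ((n = reader.read(buf)) != -1) {
                    chars.append(buf, 0, n);
                }
            }

            // Outbound cost: char[] -> byte[] in the target encoding.
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            try (Writer writer = new OutputStreamWriter(out, StandardCharsets.UTF_8)) {
                writer.write(chars.toString());
            }
            return out.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] input = "name: caf\u00e9\n".getBytes(StandardCharsets.UTF_8);
        byte[] output = roundTrip(input);
        System.out.println(input.length + " bytes in, " + output.length + " bytes out");
    }
}
```

The encoding is chosen where the Reader and Writer are constructed, which is presumably why the parser and emitter themselves have nowhere to specify one.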
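
For the last item, the difference I have in mind looks roughly like the sketch below. The Event and ID types here are simplified stand-ins written for this example, not SnakeYAML's classes, and getID is the accessor I'm asking for, so treat it as hypothetical. Only three ID values are shown to keep it short.

```java
// Hypothetical illustration class; Event and ID are stand-ins, not SnakeYAML types.
public class EventDispatchSketch {

    // Stand-in for an event ID enum, reduced to three values.
    enum ID { StreamStart, Scalar, StreamEnd }

    // Stand-in for an event type. is(...) mirrors the check I use today;
    // getID() is the proposed (hypothetical) accessor.
    static final class Event {
        private final ID id;

        Event(ID id) { this.id = id; }

        boolean is(ID other) { return id == other; }

        ID getID() { return id; }
    }

    // Current approach: test each ID in turn until one matches.
    static String describeChained(Event event) {
        if (event.is(ID.StreamStart)) {
            return "stream start";
        } else if (event.is(ID.Scalar)) {
            return "scalar";
        } else if (event.is(ID.StreamEnd)) {
            return "stream end";
        }
        throw new IllegalArgumentException("unhandled event");
    }

    // Proposed approach: a single switch on the enum value.
    static String describeSwitch(Event event) {
        switch (event.getID()) {
            case StreamStart: return "stream start";
            case Scalar:      return "scalar";
            case StreamEnd:   return "stream end";
            default: throw new IllegalArgumentException("unhandled event");
        }
    }

    public static void main(String[] args) {
        Event event = new Event(ID.Scalar);
        System.out.println(describeChained(event));
        System.out.println(describeSwitch(event));
    }
}
```

Both methods give the same answer for every event; the switch just avoids walking the chain of is(...) calls for events that match late.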
That's about it. Thanks again! Please let me know if you have any
questions.