"gopkg.in/yaml.v2" package silently misparses file that is misformatted

David Karr

Feb 27, 2025, 6:34:29 PM
to golang-nuts
I wrote some code to load a YAML file and do some work with the resulting data, using the "gopkg.in/yaml.v2" package. This has been working fine for properly formatted YAML. However, today I discovered that this code happily loads a slightly misformatted YAML file without returning any error, and makes somewhat odd decisions about what data to actually load, although having seen what it did, I suppose that's debatable.

In my suspect yaml file, I have something like this:

    stuff:
      keys:
        - key1
          - key2
          - key3

Note the incorrect indentation for "key2" and "key3".

When I load this with code like this:

    err = yaml.Unmarshal(configFile, &config)
    if err != nil {
        log.Fatalf("Failed to parse configuration file: %v", err)
    }

This unexpectedly does NOT fail. However, it produces a "keys" list with only one entry, with the following value:

    key1 - key2 - key3

I can sort of see why it would make that decision. Is the lesson here that YAML is intended to be easily readable, but not easily writable? I see that there are some command-line tools for consistent formatting of YAML, but I need this done in code, and I think I'd rather just fail if the formatting is inconsistent. Is there any kind of "strict YAML parser" that will notice things like this?
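
One way to "just fail" in code is to lint the raw text before handing it to the parser. The Go sketch below is a line-based heuristic, not a YAML parser, and it is not part of "gopkg.in/yaml.v2": `findSuspectItems` and `isPlainScalar` are hypothetical names introduced here for illustration. It assumes space-only indentation and ignores block scalars, flow collections, and quoting, so it may produce false positives on more elaborate files.

```go
package main

import (
	"fmt"
	"strings"
)

// isPlainScalar reports whether the text after "- " looks like a simple
// unquoted value rather than a mapping, nested sequence, or special form.
func isPlainScalar(v string) bool {
	if v == "" || v == "-" || strings.HasPrefix(v, "- ") ||
		strings.HasSuffix(v, ":") || strings.Contains(v, ": ") {
		return false
	}
	return !strings.ContainsAny(v[:1], "[{|>\"'&*!")
}

// findSuspectItems returns the 1-based numbers of "- item" lines that are
// indented deeper than a directly preceding "- scalar" line. Such lines get
// folded into the preceding scalar instead of becoming list entries.
func findSuspectItems(doc string) []int {
	var suspects []int
	prevIndent := -1 // indentation of the last plain "- scalar" line, or -1
	for i, line := range strings.Split(doc, "\n") {
		trimmed := strings.TrimLeft(line, " ")
		if strings.TrimSpace(trimmed) == "" || strings.HasPrefix(trimmed, "#") {
			continue // blank lines and comments do not change the context
		}
		indent := len(line) - len(trimmed)
		isItem := strings.HasPrefix(trimmed, "- ")
		if isItem && prevIndent >= 0 && indent > prevIndent {
			suspects = append(suspects, i+1)
			continue // keep prevIndent so later siblings are reported too
		}
		prevIndent = -1
		if isItem && isPlainScalar(strings.TrimSpace(trimmed[2:])) {
			prevIndent = indent
		}
	}
	return suspects
}

func main() {
	doc := strings.Join([]string{
		"stuff:",
		"  keys:",
		"    - key1",
		"      - key2",
		"      - key3",
	}, "\n")
	fmt.Println(findSuspectItems(doc)) // prints [4 5]
}
```

A caller could run this check on the file contents first and refuse to call yaml.Unmarshal when the returned slice is non-empty.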

Brian Candler

Feb 28, 2025, 4:57:47 AM
to golang-nuts
I tried three different online YAML linters/formatters with that input and they gave the same result as you got, as does Python:

    # python3
    Python 3.10.12 (main, Feb  4 2025, 14:57:36) [GCC 11.4.0] on linux
    Type "help", "copyright", "credits" or "license" for more information.
    >>> import yaml
    >>> data="""
    ... stuff:
    ...    keys:
    ...      - key1
    ...        - key2
    ...        - key3
    ... """
    >>> yaml.load(data)
    <stdin>:1: YAMLLoadWarning: calling yaml.load() without Loader=... is deprecated, as the default Loader is unsafe. Please read https://msg.pyyaml.org/load for full details.
    {'stuff': {'keys': ['key1 - key2 - key3']}}
    >>>
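
Since the parsers agree on this result, the only trace left after loading is the folded string itself. A post-load check in Go could look for it, as in the minimal sketch below. The name `checkFolded` and the `\s-\s` pattern are assumptions made for illustration, and the hand-built value stands in for what yaml.Unmarshal into an `interface{}` would return (yaml.v2 decodes mappings as `map[interface{}]interface{}`). A legitimate value such as "a - b" would also be rejected, so this only suits configurations where such strings never occur.

```go
package main

import (
	"fmt"
	"regexp"
)

// foldedItem matches whitespace, a dash, and whitespace inside a string,
// which is what a folded "- item" continuation line leaves behind.
var foldedItem = regexp.MustCompile(`\s-\s`)

// checkFolded walks a decoded document and returns an error for the first
// string that looks like several list items folded together.
func checkFolded(path string, v interface{}) error {
	switch t := v.(type) {
	case string:
		if foldedItem.MatchString(t) {
			return fmt.Errorf("%s: value %q looks like a folded list", path, t)
		}
	case []interface{}:
		for i, item := range t {
			if err := checkFolded(fmt.Sprintf("%s[%d]", path, i), item); err != nil {
				return err
			}
		}
	case map[interface{}]interface{}:
		for k, item := range t {
			if err := checkFolded(fmt.Sprintf("%s.%v", path, k), item); err != nil {
				return err
			}
		}
	case map[string]interface{}:
		for k, item := range t {
			if err := checkFolded(path+"."+k, item); err != nil {
				return err
			}
		}
	}
	return nil
}

func main() {
	// Stand-in for the value decoded from the misformatted file.
	decoded := map[interface{}]interface{}{
		"stuff": map[interface{}]interface{}{
			"keys": []interface{}{"key1 - key2 - key3"},
		},
	}
	if err := checkFolded("config", decoded); err != nil {
		fmt.Println("rejecting configuration:", err)
	}
}
```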

> Is the lesson here that YAML is intended to be easily readable, but not easily writable?

I think that's fair, yes.  The YAML specifications (there are several versions) are an utter mess, and the way the ambiguities are resolved can lead to surprising results. Seeing {{ text interpolation }} in YAML just makes me cringe.

If you're looking to generate configurations then there are clearer languages like Jsonnet and Starlark, although they add a lot more expressive power and composability than you might need.

twp...@gmail.com

Feb 28, 2025, 10:44:49 AM
to golang-nuts
I've recently switched to https://github.com/goccy/go-yaml and am very happy with it so far. I've not tried it on your use case.

Regards,
Tom
