Disable OptionParser's dynamic parsing by default?

101 views
Skip to first unread message

Adrian Irving-Beer

unread,
Jan 29, 2018, 2:40:04 AM1/29/18
to elixir-lang-core
Hi all,

So I had a script that needed an option parser, and naturally, I turned to OptionParser.  Looking at the `OptionParser.parse/2` docs, I went and implemented my parser, using `switches: [host: string, observer: boolean]`.

Having set up a basic `parse_args` function that checked for invalid args, I went and ran my problem with `elixir program.exs --foo`, picking an arbitrary argument and expecting it to fail.  To my surprise, it succeeded, and I ended up with a `foo: true` option.

Now, I imagine anyone who's quite familiar with OptionParser at this point is probably already saying, "sure, it dynamically parsed that; you want to use `strict` and not `switches`".  But the response from a few regulars on the Elixir Slack was "huh what?", followed by a long back-and-forth testing session in which we were all getting different results and having trouble reproducing each other's findings, and (after figuring out what was going on) ultimately resulting in a deep dive into Erlang crash logs and finding that the `foo` atom was defined in the `sots` Erlang module, which was only loaded if you run `elixir <file>` or `mix run <file>` but not if you run `elixir -e "code"`, and may or may not be defined in `iex` depending on your command history, etc etc.

Adding to the confusion is this paragraph from the `parse/2` documentation:

OptionParser also includes a dynamic mode where it will attempt to parse switches dynamically. Such can be done by not specifying the :switches or :strict option.

Yet that's clearly false, since earlier in the docs it said
  • :switches - defines some switches and their types. This function still attempts to parse switches that are not in this list.
and that was the crucial part we missed for a while.

After all this confusion was settled, the general sentiment seemed to be that OptionParser's dynamic mode was something of a misfeature, or at least, probably shouldn't be enabled by default, since it creates extremely unpredictable behaviour based on the atoms that are currently loaded.  (For reference, a brand new `elixir` command, with zero extra code loaded, has about 10,000 atoms in play.  This includes common options like `:help`, etc.)

I was asked to bring this up on the mailing list, so here I am.  Personally, while I can see the utility of this mode for someone who really doesn't care about invalid arguments, the fact is that "not `:strict`" basically means "incredibly arbitrary".  I mean, surely this doesn't seem like sensible default behaviour:

iex(1)> OptionParser.parse(["--foo", "--bar"], switches: [a: :string, b: :boolean])
{[], [], [{"--foo", nil}, {"--bar", nil}]}
iex
(2)> :foo
:foo
iex
(3)> OptionParser.parse(["--foo", "--bar"], switches: [a: :string, b: :boolean])
{[foo: true], [], [{"--bar", nil}]}
iex
(4)> :bar
:bar
iex
(5)> OptionParser.parse(["--foo", "--bar"], switches: [a: :string, b: :boolean])
{[foo: true, bar: true], [], []}

And yes, this can be avoided by reading the docs more closely (aside from the documentation error quoted above), but I wonder if we should be defaulting to the less surprising, less hair-pulling, less "why/how the heck is it doing that?!" option instead?

José Valim

unread,
Jan 29, 2018, 4:56:17 AM1/29/18
to elixir-l...@googlegroups.com
Thanks Adrian for reporting your concerns here!

I want to clarify only one thing though. The choice between switches/strict is opt-in, in this case, there is no default.

The mode you are saying we are defaulting to, is the one where none of switches/strict is given, and we attempt to parse only the atoms we know. Correct?

If so, I agree with your conclusion. We should deprecate it or hide it behind an explicit flag, otherwise it is very prone to errors since it depends on the atoms loaded in the system.

If we are agreeing, please open up an issue so we can tackle this in future releases.




José Valim
Founder and 
Director of R&D

--
You received this message because you are subscribed to the Google Groups "elixir-lang-core" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elixir-lang-core+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elixir-lang-core/b0f159d1-7ee9-4c7f-875d-791217d5b08d%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Adrian Irving-Beer

unread,
Jan 29, 2018, 12:17:18 PM1/29/18
to elixir-lang-core
I think what set me down my confusing path is that the early examples show `:switches`, which explicitly configures switches to accept.  If you're already familiar with option parsers from other languages, it's tempting to go something along the lines of
  1. oh, this lets me configure the switches I accept, that's what I wanted
  2. and it returns an `invalid` tuple, so I can easily report invalid args
  3. I understand what's going on here!  Let's do this.  [stops reading docs]
And that's what I did, oops. :)  But then you fall into the trap of having the dynamic behaviour still enabled in addition to the options you just explicitly defined, because you didn't use `:strict`.

As such, I would say that it comes down to a combination of naming — `:switches` seems like explicit configuration, and it technically is, but it doesn't disable the implicit parsing — plus expectations from option parsers in other languages, plus examples using `:switches` where they could be using `:strict` instead.

It's not that the functionality is in any way broken, just that it's confusing.  The results of dynamic mode are so weird to anyone who was just expecting a bog standard option parser, so they're just as likely to assume "oh, the option parser defines some options by default in addition to the switches I specified?" (since `--help` is one of those atoms accepted) — or "oh, wtf, option definitions are leaking into my option parser?" (especially since not everyone gets the same results, depending on atoms loaded) — as they are to assume "hmm, I should go back and read the docs more thoroughly".

If I were defining the options from scratch, I would suggest that
  • `:switches` and `:strict` be changed into just `:switches`;
  • that there be a `dynamic: true` option to enable the dynamic behaviour; and,
  • that it be an error to not specify either `switches: [...]` or `dynamic: true` (but you could still do both if you wanted).
However, even if this is where we wanted to go, changing to something along these lines would be difficult now, since there's no reliable way to issue a deprecation warning — specifying `:switches` alone would not make it clear whether a user intended to use the new behaviour (equivalent to the old `strict: [...]`) or the old behaviour (which should now be expressed as `parse(switches: [...], dynamic: true)`).

If we want to actually change the OptionParser API to make the dynamic mode explicitly enabled, we might do so by adding said explicit `dynamic: true` option and issuing a deprecation warning if  neither `strict` nor `dynamic` are set.

(All this of course assumes that we actually want to keep the dynamic behaviour at all.  If we choose to get rid of it, then things get a lot simpler.)
To unsubscribe from this group and stop receiving emails from it, send an email to elixir-lang-co...@googlegroups.com.

José Valim

unread,
Jan 29, 2018, 12:48:00 PM1/29/18
to elixir-l...@googlegroups.com
I believe there may be some confusion still. :)

The dynamic mode is not the same as the :switches option. The :switches option is not prone to the "unknown atom" issue because the atoms you expect are explicitly listed. The atom issue only happens when none of switches and strict options are given.

For the switches / strict issue I believe advertising the strict option first and foremost should be enough to get rid of most of the confusion since you always need to pick one.

So the only mode that is default is the "dynamic" one, when no argument is given, because that's really the default one.



José Valim
Founder and 
Director of R&D

To unsubscribe from this group and stop receiving emails from it, send an email to elixir-lang-core+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elixir-lang-core/73988e91-d507-43a7-a54e-f6572fa571c4%40googlegroups.com.

Adrian Irving-Beer

unread,
Jan 29, 2018, 1:30:05 PM1/29/18
to elixir-lang-core
`:switches` does explicitly configure switches you want to accept, but it doesn't disable the dynamic parsing (by design), so you still get options you don't expect.  That was the main source of confusion.

Obviously, had I read further, I would have discovered `:strict` and that's what I wanted!  But I suppose I tend to expect things to be strict by default if I'm supplying an explicit configuration.

I do agree that advertising `:strict` first would probably solve this.

Also, this part could possibly use rephrasing:


OptionParser also includes a dynamic mode where it will attempt to parse switches dynamically. Such can be done by not specifying the :switches or :strict option.

Reading this literally, it does make sense — not specifying `:switches` or `:strict` is one way to parse switches dynamically.  But to me (and again, I should have RTFM more thoroughly, mea culpa!) it just seemed to reinforce the notion that I was doing things correctly and shouldn't be getting unknown options.  I wonder if this would be more clear:

OptionParser also includes dynamic parsing where it will attempt to parse unknown switches automatically.  This occurs whenever the :strict option is not specified, and after any :switches options have been parsed.
Reply all
Reply to author
Forward
0 new messages