On Jul 15, 5:48 am, Matt Brubeck <mbrub...@limpet.net> wrote:
> On Jul 15, 12:16 am, Joshua Haberman <jhaber...@gmail.com> wrote:
>
> > I wish I had an issue tracker where I could put this.
>
> Maybe ticgit or git-issues as a temporary solution (or just keep using
> TODO)?
Ok, I managed to finally get registered at code.google.com, so now I
have an issue tracker:
http://code.google.com/p/gazelle/issues/list
Now that I have it though, I'm a little bit less sure that removing
this limitation is for the best. Allowing this for rules is a
requirement, since rules can be mutually recursive. If you have:
a -> "X" b?;
b -> "Y" a?;
This is a perfectly valid grammar, but cannot be expressed as such
without allowing rules to be referenced before their definition. Named
terminals, on the other hand, don't reference anything else and can
always come before everything else.
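To make that asymmetry concrete, here is a rough Python sketch of the check being discussed. It is only an illustration: the (kind, name, references) tuples and the function name check_definition_order are made up for this example and are not how Gazelle actually represents or validates grammars.

```python
from typing import List, Tuple

# Hypothetical, simplified model of a grammar file: each statement is
# (kind, name, referenced_names), listed in source order. This is an
# assumption for illustration, not Gazelle's internal representation.
Statement = Tuple[str, str, List[str]]


def check_definition_order(statements: List[Statement]) -> List[str]:
    """Return error messages for symbols that violate the ordering policy.

    Policy: rules may be referenced anywhere (they can be mutually
    recursive), but a named terminal must be defined before its first use.
    """
    rules = {name for kind, name, _ in statements if kind == "rule"}
    terminals = {name for kind, name, _ in statements if kind == "terminal"}
    seen_terminals = set()
    errors = []
    for kind, name, refs in statements:
        if kind == "terminal":
            seen_terminals.add(name)
        for ref in refs:
            if ref in rules:
                continue  # forward references to rules are always allowed
            if ref in terminals:
                if ref not in seen_terminals:
                    errors.append(
                        f"{name}: terminal '{ref}' used before its definition"
                    )
            else:
                errors.append(f"{name}: undefined symbol '{ref}'")
    return errors


# The mutually recursive pair from above passes without complaint.
mutual = [("rule", "a", ["b"]), ("rule", "b", ["a"])]
# A named terminal used before it is defined is flagged.
late = [("rule", "a", ["NUM"]), ("terminal", "NUM", [])]
print(check_definition_order(mutual))
print(check_definition_order(late))
```

With this policy the a/b grammar is accepted in either order, while a named terminal only passes if its definition appears earlier in the file than every rule that uses it.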
I'm trying to walk a fine line between having the language's
limitations encourage good style and having them be a
straitjacket. Is it oppressive to have to list named terminals before their
use? Difficulty of implementation is not a significant issue; I'm
just thinking there may be a benefit to knowing, when you read a grammar,
that any symbol used before it has been defined must be a rule and not
a named terminal.
Another possibility is to syntactically enforce a convention like:
"named terminals are in all caps, anything else is a nonterminal." On
one hand, the consistency that would provide appeals to me; on the
other hand, I think it could be nice to allow grammar files to follow
the conventions of the standard they are implementing, to make it
easier to compare the two.
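As a rough sketch of what the all-caps convention would mean in practice, a symbol's kind could be decided from its spelling alone. The function classify_symbol and the sample names below are hypothetical, written in Python just for illustration; nothing like this exists in Gazelle today.

```python
def classify_symbol(name: str) -> str:
    """Classify a symbol purely by its spelling (hypothetical convention).

    A name whose letters are all uppercase is treated as a named terminal;
    anything else is treated as a nonterminal. Digits, underscores, and
    hyphens are ignored when deciding.
    """
    letters = [c for c in name if c.isalpha()]
    if letters and all(c.isupper() for c in letters):
        return "terminal"
    return "nonterminal"


for symbol in ["NUMBER", "HTTP_VERSION", "expr", "HTTP-Version"]:
    print(symbol, "->", classify_symbol(symbol))
```

The last sample shows the tension: a name spelled in mixed case because that is how a standard happens to spell it would be forced into the nonterminal category, whatever it actually is.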
I read an essay a while back that I wish I could find now, where the
guy argued that languages like C would be better if they were so
stringent about style that code would fail to compile if it weren't,
e.g., indented properly. On one hand that sounds extreme, but on the other
hand most significant projects end up establishing a convention anyway
and trying to make sure everyone follows it. Consistency encourages
readability. Why make each project do this work of creating and
establishing a convention? If the convention is a part of the
language, then there will be consistency automatically across everyone
who uses the language.
So to bring this back to a more concrete discussion, would requiring
named terminals to be defined before their use be a gentle nudge in
the right direction that encourages everyone to structure their
grammars with named terminals first, or would it be a draconian
limitation that makes the language more temperamental than it's
worth?
> > My first inclination is to say that the manual is wrong
> > and that naming strings should not be supported. [...]
> > On the other hand, I wouldn't be surprised if someone could come up
> > with a case where naming strings has a real benefit. Do you have one?
>
> The only reason I've come up with would be a long string that is
> repeated often, or likely to change. But I don't have any real-world
> examples. And anyways, you can always use a trivial regex to match a
> single string if you really want to. So I don't see any harm in
> keeping (and documenting) the current functionality.
Cool, I've documented it:
http://github.com/haberman/gazelle/commit/4f9352caaed6fcc6b8c70f2ec8cb053d2f64e3cb
Josh