[erlang-questions] BNF/EBNF Grammar for Erlang

159 views
Skip to first unread message

Ryan Molden

unread,
Nov 11, 2011, 11:08:34 AM11/11/11
to erlang-q...@erlang.org
Howdy fellow Erlangeers.  I was wondering if anyone knew of a BNF/EBNF grammar for Erlang?  Joe Armstrong pointed me at the YRL grammar at github; unfortunately, I am YRL illiterate so it looked mostly like gibberish to me.

Ryan

Max Lapshin

unread,
Nov 11, 2011, 12:03:06 PM11/11/11
to Ryan Molden, erlang-q...@erlang.org
You should take a look at neotoma
_______________________________________________
erlang-questions mailing list
erlang-q...@erlang.org
http://erlang.org/mailman/listinfo/erlang-questions

Ryan Molden

unread,
Nov 11, 2011, 12:38:03 PM11/11/11
to Max Lapshin, Ryan Molden, erlang-q...@erlang.org
Not clear how that helps, is there a grammar in there somewhere I am
missing? I looked around a little on the github page for it with no
success.

Ryan
From: Max Lapshin
Sent: 11/11/2011 9:03 AM
To: Ryan Molden
Cc: erlang-q...@erlang.org
Subject: Re: [erlang-questions] BNF/EBNF Grammar for Erlang

OvermindDL1

unread,
Nov 11, 2011, 1:12:16 PM11/11/11
to Ryan Molden, erlang-q...@erlang.org


On Nov 11, 2011 10:38 AM, "Ryan Molden" <ryanm...@gmail.com> wrote:
>
> Not clear how that helps, is there a grammar in there somewhere I am
> missing? I looked around a little on the github page for it with no
> success.

It uses standard PEG, like EBNF/BNF, but not ambiguous like those are, check both the range in the root directory, and check the extra directory for samples.  As for PEG, even Wikipedia has a good detailed description, but if you know EBNF, you basically already know it except for one or two operators.

Ryan Molden

unread,
Nov 11, 2011, 1:53:00 PM11/11/11
to OvermindDL1, Ryan Molden, erlang-q...@erlang.org
Yeah I saw a peg called erlang.peg but it didn't seem like Erlang as it was missing most all the terminals (keywords, operators, etc...). Maybe I overlooked them or was looking in the wrong file, I will dig around some more, thanks (both).

Ryan

From: OvermindDL1
Sent: 11/11/2011 10:12 AM
To: Ryan Molden
Cc: erlang-q...@erlang.org; Max Lapshin

Subject: Re: [erlang-questions] BNF/EBNF Grammar for Erlang

Vlad Dumitrescu

unread,
Nov 11, 2011, 2:06:31 PM11/11/11
to Ryan Molden, erlang-q...@erlang.org

Hi,

You could look at
http://cedet.bzr.sourceforge.net/bzr/cedet/code/trunk/annotate/head%3A/semantic/bovine/erlang.by,
but it seems to not have been updated from 2005 so it's probably only
useful as a starting point.

regards,
Vlad

Ryan Molden

unread,
Nov 11, 2011, 3:28:12 PM11/11/11
to OvermindDL1, Ryan Molden, erlang-q...@erlang.org
Yeah, no luck.  Neotoma (near as I can tell) is based at seancribbs / neotoma (with numerous forks).  It is a parser generator for Erlang that *understands* PEGs, it does not, itself, contain a PEG for the actual Erlang language, just some trivial PEG examples for things like JSON and some simple DSLs. 
 
I am looking for an up-to-date BNF/EBNF/PEG grammar for the Erlang language itself. 
 
Vlad's recommendation looks like the closest yet, so maybe I will move forward with it and then just 'sanity check' against the compiler if there is confusion or things that differ from the (somewhat old) grammar in the 4.7 doc on the erlang.org site.
 
Ryan

Richard Carlsson

unread,
Nov 11, 2011, 3:36:58 PM11/11/11
to erlang-q...@erlang.org
On 11/11/2011 09:28 PM, Ryan Molden wrote:
> Yeah, no luck. Neotoma (near as I can tell) is based at seancribbs /
> neotoma (with numerous forks). It is a parser generator for Erlang
> that *understands* PEGs, it does not, itself, contain a PEG for the
> actual Erlang language, just some trivial PEG examples for things like
> JSON and some simple DSLs.
> I am looking for an up-to-date BNF/EBNF/PEG grammar for the Erlang
> language itself.

I don't quite grasp what you're looking for. If you can't understand
https://github.com/erlang/otp/blob/master/lib/stdlib/src/erl_parse.yrl,
(which is about as clear as it can get) how can you hope to understand
*any* complete BNF grammar for Erlang?

/Richard

Joe Armstrong

unread,
Nov 11, 2011, 3:38:12 PM11/11/11
to Ryan Molden, erlang-q...@erlang.org
I don't think there are any BNF/EBNF grammars for erlang
there might be grammars for subsets of the language but not the
entire language.  The problem with NBF/EBNF/PEG grammars
is that decent error reporting is very difficult. In practice
virtual all languages use hand written recursive descent parsers
or LALR(1) parsers for which acceptable error recovery strategies exist.

The official grammar is: (official means this is the actual grammar used by the erlang compiler and all tools) 

It you look at the productions they are very similar to yacc
productions:

For example a tuple is defined like this (line 330,331)

tuple -> '{' '}' : {tuple,?line('$1'),[]}.
tuple -> '{' exprs '}' : {tuple,?line('$1'),'$2'}.

If you forget about the stuff after the ':' this
reads

tuple -> '{' '}'
tuple -> {' exprs '}'

ie a tuple is either {} or { exprs }

The part after the ':' define the parse tree that
is returned if the expression is recognised.

A production like:

a -> b c d : {something, '$2'}

means is we match an 'a' then the parse
tree we want returned is {something, '$2'}

'$1' is the parse tree of b, '$2' is the parse tree
of c etc.

and exprs (line 445/446) is defined

exprs -> expr : ['$1'].
exprs -> expr ',' exprs : ['$1' | '$3'].

ie exprs is an expr or a comma separated
sequence of expr's

In the original yacc this would be written
something like

exprs: expr {$$ = $1}
| expr ',' exprs {$$ = [$1|$3]}


the command

> erlc erlang.yecc

compiles the grammar into a beam file

Cheers

/Joe




On Fri, Nov 11, 2011 at 5:08 PM, Ryan Molden <ryanm...@gmail.com> wrote:
Howdy fellow Erlangeers.  I was wondering if anyone knew of a BNF/EBNF grammar for Erlang?  Joe Armstrong pointed me at the YRL grammar at github; unfortunately, I am YRL illiterate so it looked mostly like gibberish to me.

Ryan

Ryan Molden

unread,
Nov 11, 2011, 3:55:57 PM11/11/11
to Richard Carlsson, erlang-q...@erlang.org
I'll assume that wasn't meant to be as condescending as it came off to me.
 
>which is about as clear as it can get
 
Yeah, something about
binary_type -> '<<' '>>' : {type, ?line('$1'),binary,
[abstract(0, ?line('$1')),
abstract(0, ?line('$1'))]}.
 
binary_type -> '<<' bin_base_type '>>' : {type, ?line('$1'),binary,
['$2', abstract(0, ?line('$1'))]}.
 
Doesn't scream out to me 'as clear as it gets'!  But reasonable people can disagree.  As for how I would understand a full BNF/EBNF...well I am familar with that format so I think I would somehow manage, even with my lack of YRL familiarity.
 
Joe's response about how to parse the YRL (specifically pointing out I could ignore the stuff to the right of the ':', which was most of the 'noise' for me) was much more helpful.

Ryan

Ryan Molden

unread,
Nov 11, 2011, 3:56:58 PM11/11/11
to Joe Armstrong, erlang-q...@erlang.org
Great, thanks I will give the YRL a shot (ignoring 'everything to the right of the ':' may really be all that is needed, it wasn't clear to me if that was somehow essentially in expressing what the YRL was trying to represent).
 
Ryan

Joe Armstrong

unread,
Nov 12, 2011, 12:44:50 PM11/12/11
to Ryan Molden, erlang-q...@erlang.org
If you dig deeper you have two alternatives:

1) learn yecc (it's like bison/yacc etc.) Yacc grammers are
    easy (ish) to read (one you understand the syntax) but difficult  
   to debug if you get  them wrong.

2) Search for an EBNF/BNF/Peg grammar from Erlang

Of the two 1) is the quickest alternative. I have never ever seen
a complete type 2) grammar for Erlang - subsets yes - complete grammars no. It would take a considerable amount of work to
make a type 2) grammar for Erlang and even if you found one you would never know if the grammar described the same language
as a type 1) grammar - equivalence of grammars is undecidable
in general.

If you want to play with the parser the following function is
useful:

string2exprs(Str) ->
    case erl_scan:tokens([], Str ++ ". ", 1) of
{done, {ok, Toks, _}, []} ->
   case erl_parse:parse_exprs(Toks) of
{ok, Exprs} ->
   {ok, Exprs};
{error,{Line,Mod,Arg}} ->
   EStr = io_lib:format("~s",[apply(Mod,format_error,[Arg])]),
   Msg = lists:flatten(EStr),
   io:format("~n***PARSE ERROR in line:~w ~s~n", [Line,Msg]),
   io:format("Str=~s~n",[Str]),
   error
   end;
Other ->
   io:format("~n***SCAN ERROR:~p~n", [Other]),
   error
    end.

This is a very simple interface to the generated parser. So for example, if you add this to the module mymod you can run it like this:

> mymod:string2exprs("case foo(X) of 1 -> sqrt(Y) end"). 
{ok,[{'case',1,
             {call,1,{atom,1,foo},[{var,1,'X'}]},
             [{clause,1,
                      [{integer,1,1}],
                      [],
                      [{call,1,{atom,1,sqrt},[{var,1,'Y'}]}]}]}]}

If you read the above code you'll see how to get from the
world of strings to tokens using erl_scan:tokens, and from tokens
to parse trees using erl_parse:parse_exprs.

the module erl_parse is automatically generated from the grammar.

The above term is generated by the mysterious right hand sides following the colons in the productions:

The appropriate line in the grammar that did this was lines
378-380 ie.

case_expr -> 'case' expr 'of' cr_clauses 'end' :
{'case',?line('$1'),'$2','$4'}.

Have fun

/Joe

Ryan Molden

unread,
Nov 13, 2011, 12:26:48 PM11/13/11
to Joe Armstrong, erlang-q...@erlang.org
Great, thanks for all the help! 

As backstory I have been using an Erlang related project as a driver for also learning boost::spirit, basically I have an Erlang lexer at the moment that handles everything I have thrown at it thus far. I was looking for the grammar as a way to sanity check what I have. The YRL grammar has allowed for that, and shown me a couple of terminals I didn't know about before. 

As an aside can you give me info/pointers on where to find info on the following terminals?

':-'   This seems to be used around 'rules' (which I have never seen in any Erlang program I have looked at, but I am a newb).  Examples or even just a name for this syntactic construct would be nice.  My lexer token currently calls it RULETHINGY :)

'..'    Never seen this, but it is listed in the YRL file as a terminal.  Name and meaning would be nice.

'...'   Never seen this, but it is listed in the YRL file as a terminal.  Name and meaning would be nice.

Ryan

Kostis Sagonas

unread,
Nov 13, 2011, 12:40:08 PM11/13/11
to erlang-q...@erlang.org
On 11/13/11 19:26, Ryan Molden wrote:
>
> As an aside can you give me info/pointers on where to find info on the
> following terminals?
>
> ':-' This seems to be used around 'rules' (which I have never seen in
> any Erlang program I have looked at, but I am a newb). Examples or even
> just a name for this syntactic construct would be nice. My lexer token
> currently calls it RULETHINGY :)
>
> '..' Never seen this, but it is listed in the YRL file as a
> terminal. Name and meaning would be nice.
>
> '...' Never seen this, but it is listed in the YRL file as a
> terminal. Name and meaning would be nice.

The last two are used in the language of type declarations that Erlang
has. The first is used to declare integer ranges, as in:

-type small_int() :: 1..42.

and the second one to declare non-empty lists of some type, as in:

-type sil() :: [small_int(),...].

Kostis

Richard Carlsson

unread,
Nov 13, 2011, 12:44:20 PM11/13/11
to erlang-q...@erlang.org
On 2011-11-13 18:26, Ryan Molden wrote:
> As an aside can you give me info/pointers on where to find info on the
> following terminals?
>
> ':-' This seems to be used around 'rules' (which I have never seen in
> any Erlang program I have looked at, but I am a newb). Examples or even
> just a name for this syntactic construct would be nice. My lexer token
> currently calls it RULETHINGY :)
>
> '..' Never seen this, but it is listed in the YRL file as a
> terminal. Name and meaning would be nice.
>
> '...' Never seen this, but it is listed in the YRL file as a
> terminal. Name and meaning would be nice.


The ':-' rules are something I've never seen in the wild. I think they
were part of the ancient and obsolete Mnemosyne syntax for making Mnesia
lookups. Nowadays, QLCs are used instead. You can safely assume that you
can ignore everything to do with the rules syntax.

/Richard

Richard O'Keefe

unread,
Nov 13, 2011, 7:45:44 PM11/13/11
to Ryan Molden, erlang-q...@erlang.org

On 12/11/2011, at 9:55 AM, Ryan Molden wrote:

> I'll assume that wasn't meant to be as condescending as it came off to me.
>
> >which is about as clear as it can get
>
> Yeah, something about
> binary_type -> '<<' '>>' : {type, ?line('$1'),binary,
> [abstract(0, ?line('$1')),
> abstract(0, ?line('$1'))]}.


>
> binary_type -> '<<' bin_base_type '>>' : {type, ?line('$1'),binary,
> ['$2', abstract(0, ?line('$1'))]}.

Read what's before the colon:
binary_type -> '<<' '>>'.
binary_type -> '<<' bin_base_type '>>'.

This *is* BNF, up to punctuation.
I've attached a copy of the R12B-5 grammar with the "what to build"
part stripped out. This really is about as clear as it can get.

erlang.grammar

Ryan Molden

unread,
Nov 14, 2011, 10:55:07 AM11/14/11
to Richard O'Keefe, Ryan Molden, erlang-q...@erlang.org
Yep, it is much more clear ignoring the RHS of the :, I just didn't
realize that bit was meaningless to me and am not in the general habit
of ignoring vast swaths of what appears in the grammar file.

I also processed the YRL file to just delete all the stuff to the right
of the :, but thanks for the attachment anyhow.

Ryan
From: Richard O'Keefe
Sent: 11/13/2011 4:45 PM
To: Ryan Molden
Cc: Richard Carlsson; erlang-q...@erlang.org


Subject: Re: [erlang-questions] BNF/EBNF Grammar for Erlang

On 12/11/2011, at 9:55 AM, Ryan Molden wrote:

Richard O'Keefe

unread,
Nov 14, 2011, 6:02:59 PM11/14/11
to Ryan Molden, erlang-q...@erlang.org

On 15/11/2011, at 4:55 AM, Ryan Molden wrote:

> Yep, it is much more clear ignoring the RHS of the :, I just didn't
> realize that bit was meaningless to me and am not in the general habit
> of ignoring vast swaths of what appears in the grammar file.

Well, any grammar that's actually *used* as the front end of something
interesting, whether it's a Prolog DCG, or Yacc, or ML-Yacc, or Yecc,
or whatever is going to have *both* pure syntax *and* some semantics
in there. Here's one of the examples from the GENTLE kit:


'root' expression(-> X) print(X)

'nonterm' expression(-> INT)

'rule' expression(-> X ): expr2(-> X)
'rule' expression(-> X+Y): expression(-> X) "+" expr2(-> Y)
'rule' expression(-> X-Y): expression(-> X) "-" expr2(-> Y)

'nonterm' expr2(-> INT)

'rule' expr2(-> X ): expr3(-> X)
'rule' expr2(-> X*Y): expr2(-> X) "*" expr3(-> Y)
'rule' expr2(-> X/Y): expr2(-> X) "/" expr3(-> Y)

'nonterm' expr3(-> INT)

'rule' expr3(-> X ): Number(-> X)
'rule' expr3(-> - X): "-" expr3(-> X)
'rule' expr3(-> + X): "+" expr3(-> X)
'rule' expr3(-> X): "(" expression(-> X) ")"

'token' Number(-> INT)

If you ignore the (-> ...) parts, you have yet another minor variant
of BNF. The (-> ...) parts give you the semantics. Utterly commonplace.

Robert Raschke

unread,
Nov 15, 2011, 6:23:07 AM11/15/11
to erlang-q...@erlang.org
On Mon, Nov 14, 2011 at 11:02 PM, Richard O'Keefe <o...@cs.otago.ac.nz> wrote:

Well, any grammar that's actually *used* as the front end of something
interesting, whether it's a Prolog DCG, or Yacc, or ML-Yacc, or Yecc,
or whatever is going to have *both* pure syntax *and* some semantics
in there.  Here's one of the examples from the GENTLE kit:


'root' expression(-> X) print(X)

'nonterm' expression(-> INT)

  'rule' expression(-> X  ): expr2(-> X)
  'rule' expression(-> X+Y): expression(-> X) "+" expr2(-> Y)
  'rule' expression(-> X-Y): expression(-> X) "-" expr2(-> Y)

'nonterm' expr2(-> INT)

  'rule' expr2(-> X  ): expr3(-> X)
  'rule' expr2(-> X*Y): expr2(-> X) "*" expr3(-> Y)
  'rule' expr2(-> X/Y): expr2(-> X) "/" expr3(-> Y)

'nonterm' expr3(-> INT)

  'rule' expr3(-> X  ): Number(-> X)
  'rule' expr3(-> - X): "-" expr3(-> X)
  'rule' expr3(-> + X): "+" expr3(-> X)
  'rule' expr3(-> X): "(" expression(-> X) ")"

'token' Number(-> INT)

If you ignore the (-> ...) parts, you have yet another minor variant
of BNF.  The (-> ...) parts give you the semantics.  Utterly commonplace.


Wow, it's been a while since I've heard GENTLE mentioned. It is a great simple toolkit (built on top of standard yacc and lex) for implementing translators/interpreters/compilers of all sorts. Highly recommended. Available here: http://gentle.compilertools.net/

Apologies for being off-topic,
Robby

Reply all
Reply to author
Forward
0 new messages