Nemerle.PEG

176 views
Skip to first unread message

AdamSpeight2008

unread,
Jul 1, 2011, 4:25:52 PM7/1/11
to Nemerle Forum
Can anyone point me in the direction of documentation (preferably in
English, or google translate-able into English) on how to use the
Nemerle.PEG?

There is http://rsdn.ru/article/nemerle/PegGrammar-en.xml

I'm trying to write a simple parser for BrainFuck, (Note I can write a
interpretor for BF) , but I don't know / understand what need to done
to wire it up,

Clues \ Hints \ Guidance would be most helpful.
So far I have.

[code]

using Nemerle.Collections;
using Nemerle.Peg;
using Nemerle.Text;
using Nemerle.Utility;
using Nemerle;

using System;
using System.Collections.Generic;
using SCG = System.Collections.Generic;

namespace BF
{
[PegGrammar(Options = EmitDebugSources, bf_expression,
grammar
{
cmds = '+'/'-'/'<'/'>'/'?'/'.' ;
cmd_while = '[' bf_expression* ']';
bf_expression : string = (cmd_while / cmds+)+;
})]
class P
{ cmds(_ : NToken) : string { "" }
cmd_while(_ : NToken) : string { "" }
bf_expression(_ : NToken) : string { "" }
}
}
[\code]

hardcase

unread,
Jul 3, 2011, 6:36:31 AM7/3/11
to Nemerle Forum
Hi!

Before you start you must understand purpose of PegGrammar generated
parser: only creating Abstract Syntax Tree (AST) of parsed text, not
interpreting it.
After you get BF AST you can make any transformations you want:
perform semantic checks, optimization, produce object code.

Look at JSON parser:
http://code.google.com/p/nemerle/source/browse/nemerle/trunk/snippets/peg-parser/Json/Nemerle.Json/JsonParser.n
JSON parser output is JSON AST:
http://code.google.com/p/nemerle/source/browse/nemerle/trunk/snippets/peg-parser/Json/Nemerle.Json/Ast.n

Look at C# preprocessor parser (quite complex example):
http://code.google.com/p/nemerle/source/browse/nemerle/trunk/snippets/csharp-parser/CSharpParser/PreParser.n
It's output is C# preprocessor AST:
http://code.google.com/p/nemerle/source/browse/nemerle/trunk/snippets/csharp-parser/CSharpParser/PreParser_AST.n
I think you should also look at C# preprocessor - it performs
evaluation of this AST:
http://code.google.com/p/nemerle/source/browse/nemerle/trunk/snippets/csharp-parser/CSharpParser/Preprocessor.n


So, your BF implementation design should have these components:
1) BF runtime and object code (threaded code or IL or something
else :)
2) AST of BF code
3) Parser that creates AST from plain text (PegGrammar macro is useful
here)
3) Translator from AST into BF object code ('compiler')


On 2 июл, 00:25, AdamSpeight2008 <adamspeightspost...@googlemail.com>
wrote:
> Can anyone point me in the direction of documentation (preferably in
> English, or google translate-able into English) on how to use the
> Nemerle.PEG?
>
> There ishttp://rsdn.ru/article/nemerle/PegGrammar-en.xml

AdamSpeight2008

unread,
Jul 3, 2011, 7:31:10 PM7/3/11
to Nemerle Forum
Ok, Now I've got a Parser that create an 'AST' (see code below) from
valid BF code.
Any suggestions on how return the positions of invalid chars.

My thought is to encode a rule, which matches anything that isn't not
a CMD / While Loop. ( How to encode a not?)

[code]

using Nemerle.Collections;
using Nemerle.Peg;
using Nemerle.Text;
using Nemerle.Utility;
using Nemerle;

using System;
using System.Collections.Generic;
using SCG = System.Collections.Generic;

using Nemerle.Peg.AstUtils;

namespace BF
{
variant bfc
{ | Add { pos: int }
| Sub { pos: int }
| Dec { pos: int }
| Inc { pos: int }
| C_in { pos: int }
| Cout { pos: int }
| W_beg { pos: int }
| W_end { pos: int }
| While { begin: W_beg;
end: W_end;
expr: List[bfc];
}
}


[PegGrammar(Options = EmitDebugSources, cmds,
grammar
{newLine = "\n"
/ "\r\n"
/ "\r"
/ "\u2028" /* line separator */
/ "\u2029"; /* paragraph separator */

cmd_cmd : bfc = '+'/'-'/'<'/'>'/'?'/'.';
begin_while : bfc = '[';
end_while : bfc = ']';
while_loop : bfc = begin_while cmds* end_while;
cmds : List[bfc] =(while_loop / cmd_cmd)+ newLine? ;
})]


class Parser
{

cmd_cmd(t:NToken):bfc
{ match (this._parsingSource.OriginalText[t.StartPos])
{ | '+' => bfc.Add(t.StartPos);
| '-' => bfc.Sub(t.StartPos);
| '>' => bfc.Inc(t.StartPos);
| '<' => bfc.Dec(t.StartPos);
| '?' => bfc.C_in(t.StartPos);
| '.' => bfc.Cout(t.StartPos);
| _ => throw InvalidOperationException()
}
}

begin_while (t:NToken):bfc { bfc.W_beg(t.StartPos) }

end_while (t:NToken):bfc { bfc.W_end(t.StartPos) }

cmds (cs:List[bfc], n : NToken) : List[bfc] { cs}

while_loop (p0:bfc ,p1 :List[List[bfc]], p2 : bfc) : bfc.While
{ bfc.While(p0:>bfc.W_beg,p2:>bfc.W_end, p1[0] ) }

}

}


[/code]

VladD2

unread,
Jul 4, 2011, 3:17:29 PM7/4/11
to nemer...@googlegroups.com
First of all, BF is have very simple syntax. The PegGrammar can't show it's power on BF grammar.

Note about your AST...

1. Instade adding "pos: int" field to each node reasonably to inherit "bfc" from Nemerle.Peg.Located:
variant BfAst : Nemerle.Peg.Located
{
  | Add
  | Sub
  | Dec
  ...
}

2. AST for loop (while) should look like:
| While { Commands : list[BfAst]; }

3. What is "Add" and "Sub"? BF have only increment/decrement commands move to right/left cell. I think, AST node for "move" commands should have names MoveRight and MoveLeft.

> Any suggestions on how return the positions of invalid chars. 
def (pos, ast) = _parser.TryParse(...);
when (pos < 0)
{
def (errorPos, ids) = _parser.GetMaxRollbackPosAndIds();
def errorMsg = "Error in inpute.\r\nExpected:\n"
    + $<#    ..$(ids; "\r\n    "; id => _parser.GetRuleName(id))#>;
    ...
}

If you need do error recovery, you can use spetial rules or FailureRecovery.

Adam Speight

unread,
Jul 4, 2011, 8:15:52 PM7/4/11
to nemer...@googlegroups.com

VladD2, 

It's because of that simplicity of grammar, which make it an attractive.  (I've extended BF to have comments {} and be multi-line"

A few queries on your suggestions.

1. I'm correct in thinking that this will allow me to do syntax-highlighting later, as it preserves the source-text positional information? 
3. The AST names are not that important, as I can change them at later date. 

On error Recovery, is there an example of FailureRecovery in used?
 

hardcase

unread,
Jul 5, 2011, 4:12:11 AM7/5/11
to Nemerle Forum
Yep. Position information can be used in highlighting. C# AST
converter use positions in translation to Nemerle AST (so degugging
over C# sources compiled via ncc.exe is possible).


Syntax error recovery can be found in C# parser 'statement' rule:
http://code.google.com/p/nemerle/source/browse/nemerle/trunk/snippets/csharp-parser/CSharpParser/Parser.n#505


On Jul 5, 4:15 am, Adam Speight <adamspeightspost...@googlemail.com>
wrote:
Reply all
Reply to author
Forward
0 new messages