I’m trying to make a C# parser for the Hello.g4 grammar, using only the antlr generator (no VS integration or other tooling, other than antlr4-csharp-4.0.1-SNAPSHOT-complete.jar and Antlr4.Runtime.v4.5.dll). I can get it to create the parser and a listener in C#, but it does not produce a C# lexer class. I tried to roll my own based on what was in the Java lexer, but no dice.
Can anyone point me in the right direction?
Here’s what I tried… I started with the following Hello.g4 grammar:
grammar Hello;
options { language=CSharp_v4_5; }
HELLOWORD : 'hello' ;
r : HELLOWORD ID ; // match keyword hello followed by an identifier
ID : [a-z]+ ; // match lower-case identifiers
WS : [ \t\r\n]+ -> skip ;
Then I ran the org.antlr.v4.Tool on it and got the following files:
HelloLexer.java
HelloLexer.tokens
HelloParser.cs
HelloListener.cs
HelloBaseListener.cs
Hello.tokens
Notice there is no HelloLexer.cs file. Then I created a C# console program that calls the following method:
private void RunParser() {
    AntlrInputStream inputStream = new AntlrInputStream("hello world\n");
    MyLexer helloLexer = new MyLexer(inputStream);
    CommonTokenStream commonTokenStream = new CommonTokenStream(helloLexer);
    HelloParser helloParser = new HelloParser(commonTokenStream);
    MyListener myListener = new MyListener();
    helloParser.AddParseListener(myListener);
    HelloParser.RContext rContext = helloParser.r();
}
That’s based on some V3 examples I found. Is it about right?
In the absence of a generated Lexer class I hacked up the following:
public class MyLexer : Lexer {
    public MyLexer(ICharStream input) : base(input) {
        Interpreter = new LexerATNSimulator(HelloParser._ATN);
    }
    public override string[] RuleNames {
        get { return HelloParser.ruleNames; }
    }
    public override string GrammarFileName {
        get { return "Hello.g4"; }
    }
}
When I run the program, it crashes with the following error:
ERROR: System.IndexOutOfRangeException: Index was outside the bounds of the array.
at Antlr4.Runtime.Atn.LexerATNSimulator.Match(ICharStream input, Int32 mode)
at Antlr4.Runtime.Lexer.NextToken()
at Antlr4.Runtime.BufferedTokenStream.Fetch(Int32 n)
at Antlr4.Runtime.BufferedTokenStream.Sync(Int32 i)
at Antlr4.Runtime.BufferedTokenStream.Setup()
at Antlr4.Runtime.BufferedTokenStream.LazyInit()
at Antlr4.Runtime.CommonTokenStream.Lt(Int32 k)
at Antlr4.Runtime.Parser.EnterRule(ParserRuleContext localctx, Int32 state, Int32 ruleIndex)
at HelloParser.r() in f:\Project\Grammars\Hello\Hello\HelloParser.cs:line 53
at Hello.Program.RunParser() in f:\Project\Grammars\Hello\Hello\Program.cs:line 33
at Hello.Program.Run() in f:\Project\Grammars\Hello\Hello\Program.cs:line 17
Despite scouring the web I can’t find a simple ANTLR4 C# target example that is complete and workable. Any assistance would be much appreciated!
Hi all,
I’m using last released C# target under VS 2008. It works like a charm.
After parsing, I’m using the following code to verify if I had errors:
if (parser.NumberOfSyntaxErrors > 0)
    Console.WriteLine("Errors found.");
else
    Console.WriteLine("No errors found!");
This code checks only for parser errors, but I also need to know whether there were any lexical errors.
How can I do this?
TIA.
Nilo - Brazil
Phurst,
Not exactly HelloLexer.cs, ‘cause the Grammar I’m testing is called Combined1. So, ANTLR tool generates a Combined1Lexer.cs file. Code follows.
// Generated from Combined1.g4 by ANTLR 4.0.1-SNAPSHOT
namespace Test1 {
using Antlr4.Runtime;
using Antlr4.Runtime.Atn;
using Antlr4.Runtime.Misc;
using DFA = Antlr4.Runtime.Dfa.DFA;
public partial class Combined1Lexer : Lexer {
public const int
T__0=1, WS=2, ID=3;
public static string[] modeNames = {
"DEFAULT_MODE"
};
public static readonly string[] tokenNames = {
"<INVALID>",
"';'", "' '", "ID"
};
public static readonly string[] ruleNames = {
"T__0", "WS", "ID"
};
protected const int EOF = Eof;
protected const int HIDDEN = Hidden;
public Combined1Lexer(ICharStream input)
: base(input)
{
_interp = new LexerATNSimulator(this,_ATN);
}
public override string GrammarFileName { get { return "Combined1.g4"; } }
public override string[] TokenNames { get { return tokenNames; } }
public override string[] RuleNames { get { return ruleNames; } }
public override string[] ModeNames { get { return modeNames; } }
public override void Action(RuleContext _localctx, int ruleIndex, int actionIndex) {
switch (ruleIndex) {
case 1 : WS_action(_localctx, actionIndex); break;
}
}
private void WS_action(RuleContext _localctx, int actionIndex) {
switch (actionIndex) {
case 0: _channel = HIDDEN; break;
}
}
public static readonly string _serializedATN =
"\x5\x4\x5\x14\b\x1\x4\x2\t\x2\x4\x3\t\x3\x4\x4\t\x4\x3\x2\x3\x2\x3\x3"+
"\x3\x3\x3\x3\x3\x3\x3\x4\x6\x4\x11\n\x4\r\x4\xE\x4\x12\x2\x2\x2\x5\x3"+
"\x2\x3\x1\x5\x2\x4\x2\a\x2\x5\x1\x3\x2\x3\x3\x63|\x14\x2\x3\x3\x2\x2\x2"+
"\x2\x5\x3\x2\x2\x2\x2\a\x3\x2\x2\x2\x3\t\x3\x2\x2\x2\x5\v\x3\x2\x2\x2"+
"\a\x10\x3\x2\x2\x2\t\n\a=\x2\x2\n\x4\x3\x2\x2\x2\v\f\a\"\x2\x2\f\r\x3"+
"\x2\x2\x2\r\xE\b\x3\x2\x2\xE\x6\x3\x2\x2\x2\xF\x11\t\x2\x2\x2\x10\xF\x3"+
"\x2\x2\x2\x11\x12\x3\x2\x2\x2\x12\x10\x3\x2\x2\x2\x12\x13\x3\x2\x2\x2"+
"\x13\b\x3\x2\x2\x2\x4\x2\x12";
public static readonly ATN _ATN =
ATNSimulator.Deserialize(_serializedATN.ToCharArray());
}
} // namespace Test1
For completeness, here is the .g4 file that generates it:
grammar Combined1;
@parser::members
{
protected const int EOF = Eof;
}
@lexer::members
{
protected const int EOF = Eof;
protected const int HIDDEN = Hidden;
}
// ==================================================
// Parser Rules
// ==================================================
start:
command+ EOF ;
command:
(ID)+ ';' ;
// ==================================================
// Lexer Rules
// ==================================================
WS : ' ' -> channel(HIDDEN) ;
ID : [a-z]+ ;
Note that’s a very simplistic grammar. I’m just starting my tests with the C# target using VS 2008.
Hope that helps.
Regards,
Nilo
--
You received this message because you are subscribed to the Google Groups "antlr-discussion" group.
To unsubscribe from this group and stop receiving emails from it, send an email to antlr-discussi...@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.
Hi,
You’ll need to integrate grammar generation into the build process before it will work. The C# target is designed for rock-solid reliable use in the build tools, not for manual use on a command line.
Since you are using .NET 4+, you can use NuGet to automatically download the tools and runtime, and even configure your project, in one step. You need to configure NuGet to search for prerelease packages, and install ANTLR 4 version 4.1.0-alpha002 (or whatever latest version it reports).
Also, you do not need to specify the language option in your grammar. The build tools will override the value you specify there anyway.
Thank you,
Sam Harwell
--
I appreciate the helpful reply Sam. “Rock Solid” --- I like the sound of that.
So I switched to attempting this in Visual Studio 2012. Piecing instructions together from various places (see below) I managed to create a grammar in VS and attempt a build. The build fails with the following error:
AC1000: Unknown build error: Could not locate a Java installation.
Looking at the AntlrClassGenerationTaskInternal.cs code it appears to be looking in the registry (not the JAVA_HOME environment variable) in the HKEY_LOCAL_MACHINE\SOFTWARE key, for a subkey JavaVendor\JavaInstallation. When I examine my registry I find a key:
HKEY_LOCAL_MACHINE\SOFTWARE\JavaSoft\Java Development Kit\1.7
I assume AntlrClassGenerationTaskInternal is looking for the wrong vendor or installation name.
This is what I did:
1. Install Java (JDK 1.7) in C:\Program Files\Java\jdk1.7.0_17.
2. Set the JAVA_HOME environment variable to the above dir.
3. Install the ANTLR Language Support extension (a vsix file):
   http://visualstudiogallery.msdn.microsoft.com/25b991db-befd-441b-b23b-bb5f8d07ee9f
4. Run VS2012.
5. Update NuGet to v2.5+ (actually 2.6).
6. Create a VS Solution “HelloVS”.
7. Install ANTLR 4 support:
   - In NuGet Official Package Sources, Include Prerelease, search for “ANTLR”.
   - Install the “ANTLR 4” package.
8. Add a grammar and edit it to be Hello.g4:
   grammar Hello;
   HELLOWORD : 'hello' ;
   r : HELLOWORD ID ; // match keyword hello followed by an identifier
   ID : [a-z]+ ; // match lower-case identifiers
   WS : [ \t\r\n]+ -> skip ;
9. Build:
1>------ Build started: Project: HelloVS, Configuration: Debug Any CPU ------
1>Build started 8/15/2013 9:32:46 AM.
1>E:\ANTLR\HelloVS\packages\Antlr4.4.1.0-alpha003\build\Antlr4.targets(132,5): error AC1000: Unknown build error: Could not locate a Java installation.
1>
1>Build FAILED.
I spelunked through the source code and figured out how the targets set the JavaVendor and JavaInstallation values. I discovered that it is actually looking for “JavaSoft” and “Java Runtime Environment” respectively. I did not have the Java JRE installed. Having fixed that, the build now runs.
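For anyone hitting the same AC1000 error, a registry lookup of this kind can be approximated as below. This is only a sketch based on the usual JavaSoft registry layout (a CurrentVersion value under the installation key, and a JavaHome value under the version subkey); it is not the actual code from AntlrClassGenerationTaskInternal.cs. The class and method names are made up for illustration, and the registry calls only work on Windows.

```csharp
using System;
using System.IO;
using Microsoft.Win32;

// Hypothetical helper, not part of ANTLR: mimics a registry-based Java lookup.
public static class JavaRegistryLocator
{
    // Builds the key path under HKEY_LOCAL_MACHINE, for example
    // SOFTWARE\JavaSoft\Java Runtime Environment
    public static string BuildKeyPath(string vendor, string installation)
    {
        return @"SOFTWARE\" + vendor + @"\" + installation;
    }

    // Returns the path to java.exe, or null if no matching installation is registered.
    public static string FindJavaExe(string vendor, string installation)
    {
        using (RegistryKey key = Registry.LocalMachine.OpenSubKey(BuildKeyPath(vendor, installation)))
        {
            if (key == null)
                return null; // the vendor/installation key does not exist

            // Assumption: CurrentVersion names the subkey of the active install.
            string version = key.GetValue("CurrentVersion") as string;
            if (version == null)
                return null;

            using (RegistryKey versionKey = key.OpenSubKey(version))
            {
                string javaHome = versionKey == null
                    ? null
                    : versionKey.GetValue("JavaHome") as string;
                return javaHome == null
                    ? null
                    : Path.Combine(Path.Combine(javaHome, "bin"), "java.exe");
            }
        }
    }
}
```

Calling JavaRegistryLocator.FindJavaExe("JavaSoft", "Java Runtime Environment") returns null when only a "Java Development Kit" key is present, which would match the failure described above.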
Two files are generated (below the Hello.g4 file in Solution Explorer):
Hello.g4.lexer.cs
Hello.g4.parser.cs
Both of these files are empty and I get an error in the build (see below). So I assume there is something wrong with my grammar.
Unknown build error: Executing command: "C:\Program Files (x86)\Java\jre7\bin\java.exe" -cp E:\ANTLR\HelloVS\packages\Antlr4.4.1.0-alpha003\build\..\tools\antlr4-csharp-4.1-SNAPSHOT-complete.jar org.antlr.v4.CSharpTool -o obj\Debug\ -listener -visitor -Dlanguage=CSharp_v4_5 -package HelloVS E:\ANTLR\HelloVS\HelloVS\Hello.g4
I decided to use a proven grammar instead. I created a new VS project “JavaVS” and took the Java grammar from the Antlr4.Runtime.Test.v4.5 project in the Antlr4 source code. That project builds without errors and creates a lexer and parser file. But again both of these files are empty. Neither project generates a visitor class.
Any idea what I need to do to get a populated parser and lexer?
EnterEveryRule
EnterR
VisitTerminal hello
VisitTerminal world
ExitR
ExitEveryRule
private void RunParser() {
    AntlrInputStream inputStream = new AntlrInputStream("hello world\n");
    HelloLexer helloLexer = new HelloLexer(inputStream);
    CommonTokenStream commonTokenStream = new CommonTokenStream(helloLexer);
    HelloParser helloParser = new HelloParser(commonTokenStream);
    MyListener myListener = new MyListener();
    helloParser.AddParseListener(myListener);
    HelloParser.RContext rContext = helloParser.r();
}
And here's the MyListener class:
public class MyListener : HelloBaseListener {
    public override void EnterEveryRule(Antlr4.Runtime.ParserRuleContext ctx) {
        Console.WriteLine("EnterEveryRule ");
    }
    public override void ExitEveryRule(Antlr4.Runtime.ParserRuleContext ctx) {
        Console.WriteLine("ExitEveryRule");
    }
    public override void VisitErrorNode(Antlr4.Runtime.Tree.IErrorNode node) {
        Console.WriteLine("VisitErrorNode");
    }
    public override void VisitTerminal(Antlr4.Runtime.Tree.ITerminalNode node) {
        Console.WriteLine("VisitTerminal {0}", node.Symbol.Text);
    }
    public override void EnterR(HelloParser.RContext context) {
        Console.WriteLine("EnterR");
    }
    public override void ExitR(HelloParser.RContext context) {
        Console.WriteLine("ExitR");
    }
}
So I think I'm in business now.
Thanks for your help!
Hi Oscar,
It should be possible to update the build task to support systems running Mono. It may only require changing the piece of code that locates the Java executable on the current system (it uses the Registry now). Unfortunately I don’t have such a system right now so I haven’t been able to test it. However, I may be able to get ahold of one at work for experimentation, or run a virtual machine locally.
Thanks,
Sam
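One registry-free way to locate Java, which might suit Mono, is to fall back on the JAVA_HOME environment variable. The sketch below is only an illustration of that idea and is not code from the ANTLR build task; the class and method names are hypothetical, and the platform check via the directory separator is a simplifying assumption.

```csharp
using System;
using System.IO;

// Hypothetical helper, not part of the ANTLR build task.
public static class JavaHomeLocator
{
    // Returns the expected path of the java executable under javaHome,
    // or null when javaHome is null or empty.
    public static string GetJavaExecutable(string javaHome, bool isWindows)
    {
        if (string.IsNullOrEmpty(javaHome))
            return null;

        string exeName = isWindows ? "java.exe" : "java";
        return Path.Combine(Path.Combine(javaHome, "bin"), exeName);
    }

    // Reads JAVA_HOME from the environment and returns the executable path
    // only if that file actually exists; otherwise returns null.
    public static string FindJava()
    {
        string javaHome = Environment.GetEnvironmentVariable("JAVA_HOME");

        // Assumption: a backslash separator means we are running on Windows.
        bool isWindows = Path.DirectorySeparatorChar == '\\';

        string candidate = GetJavaExecutable(javaHome, isWindows);
        return candidate != null && File.Exists(candidate) ? candidate : null;
    }
}
```

A build task could try the registry first on Windows and call something like FindJava() when that lookup fails or is unavailable.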
Well, there’s the command line! :)
Those two files are created by the code template so it’s easy for you to add members to the classes without having to use an @members{} block in the grammar file. The actual generated code files are placed in the intermediate output directory (obj/Debug and obj/Release by default).
Thanks,
Sam
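To illustrate why the two template files look empty: the generated lexer is declared as a partial class (as in the Combined1Lexer listing earlier in the thread), so a second file can add members to it. The sketch below is a stand-alone approximation. The first half stands in for the generated code, with the Antlr4.Runtime.Lexer base class left out so it compiles on its own; the token values and the IsKeyword helper are illustrative assumptions.

```csharp
using System;

// Stand-in for the generated half, which the build normally writes to obj/Debug.
// The real generated class derives from Antlr4.Runtime.Lexer; that is omitted here.
public partial class HelloLexer
{
    // Illustrative token type values, assumed to follow grammar order.
    public const int HELLOWORD = 1, ID = 2, WS = 3;
}

// Hand-written half, i.e. the kind of code that would go into Hello.g4.lexer.cs.
public partial class HelloLexer
{
    // Hypothetical helper member, added without an @members{} block in the grammar.
    public static bool IsKeyword(int tokenType)
    {
        return tokenType == HELLOWORD;
    }
}
```

The compiler merges both declarations into a single HelloLexer class, so HelloLexer.IsKeyword(HelloLexer.HELLOWORD) sees the constants from the generated half.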
From: antlr-di...@googlegroups.com [mailto:antlr-di...@googlegroups.com] On Behalf Of phurst
Sent: Thursday, August 15, 2013 11:03 AM
To: antlr-di...@googlegroups.com
--