Invoke multiple parsers based on the same grammar


nomawie

Dec 15, 2014, 9:44:13 AM
to antlr-di...@googlegroups.com
Hi,

Currently I am developing a fairly complex DSL, "MyDSL". A simple example:

InitialSection
 myvariable [3]
 

MainSection
 Description 1
   "Set the variable to 1"
 Action 1
  myvariable [1]
 Action 2:
   Import from file "c:\file.txt"

Finish
 myvariable

The grammar is not very difficult, but I want to invoke a parser from within the MainParser that reads only the elements specified under "Action X". The reason is that I have imports. In the example, the file "file.txt" contains only an incomplete fragment on which the MainParser would throw errors, since there is no "MainSection", no "Finish" and so on. But I need to be able to parse the incomplete action part and initialize some model elements independently of, and before, the MainParser.

Currently I solve this with a grammar for the content of "Action X", such as "myvariable [1]" or "Import from ...". This grammar is stored in the file "action.g4". In addition to the MainParser I created "actionParser.g4", which imports "action.g4", and from it I generate a parser that does exactly this job. "action.g4" is also imported into MainParser.g4. From within the MainParser, as soon as I reach the Action context, I pass the text to the ActionParser.

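@Override
public void exitAction(ActionContext ctx) {
    // getText() joins the text of the tokens in this subtree;
    // tokens on hidden channels (e.g. whitespace) are not included.
    String text = ctx.getText();
    ActionParser actParser = new ActionParser(text);
    List<Declaration> decl = actParser.getDeclarations();
    ...
}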

In principle it works. What I do not like about this solution is that I have to use the getText() method, and I do not know what the best or preferred way is. The ActionParser and the MainParser share the same grammar, so both have the method

public void exitAction(ActionContext ctx)

But since everything is generated twice, the two ActionContext classes are different and I cannot pass the context to the ActionParser. I think this solution is not very nice because of the redundant elements. Is there a better way to solve this problem?

From handwritten parsers I know that invoking parsers from within parsers is a common technique that also eases maintenance, so I would be surprised if there were no more elegant way of doing this.


Right_Then

Dec 26, 2014, 3:00:03 PM

Hello nomawie,



 
I am no expert and apologize in advance if this sounds silly, but here is my suggestion, based on how I do things for myself in ANTLR 4.
After writing the grammar for my syntax, I wrote a grammar for the ANTLR grammar format itself (a parser for my syntax grammar), capturing whatever information would be handy to me. From that I generate much of the boilerplate for constructs, classes and idioms that might be useful to me, just as ANTLR generates a custom listener and visitor. So I gathered all lexer rule names, parser rule names, # labels (the names given to rule alternatives so that a method is generated for each alternative), and which rules make up each rule. That way much of the decision process is automated. You would not have the above problem if you generated your own listener based on your choices, and it is easy to generate automatically as long as you are not using embedded actions.

Thanks
Regards

Right_Then


Terence Parr

Jan 2, 2015, 12:52:44 PM
to antlr-di...@googlegroups.com
Hi. The easiest way to handle include files such as yours is to have the token stream/lexer simply present a complete series of tokens to the parser. The parser should have no idea that there is an import statement.
Ter


nomawie

Jan 8, 2015, 12:20:02 PM
Hi,

Thanks for your answers, and sorry for my late response; I was on holiday over New Year. I changed my implementation so that I pass the token stream from one parser to the other, but the problem remains the same. I have my main parser, generated from the grammar MyDSL, which can read the whole file.
grammar MyDSLParser;

import MainSection, Finish;

mydsl
: initialsection
 mainsection
 finish
;


And I have additional parsers for the subsections of my DSL, e.g. one for the mainsection, based on the same grammar specified in "MainSection.g4".

grammar MainSectionParser;

import MainSection;

content :  mainsection
;


From "MainSectionParser.g4" I generate a parser for the "mainsection" section, to which I pass the token stream as soon as the method exitMainsection(...) is called by MyDSLParser. But since I generate two parsers, every element of the "mainsection" section is generated twice: for example, both MainSectionParserBaseListener and MyDSLParserBaseListener have a method exitAction(ActionContext ctx), yet the ActionContext class is different for each of them, even though it comes from the same grammar. This is the problem that I would like to solve more elegantly.

Although I know that passing the token stream works, I don't think this solution is elegant.

Jim Idle

Jan 8, 2015, 8:19:18 PM
to antlr-di...@googlegroups.com
The way to do this is:

  • Have a single grammar for the whole thing that assumes it will get the whole input
  • In your lexer grammar, look for the import statement
  • When seen, push the current input stream on a stack and create a new one based on the import file
  • Set the new input stream as the input for the lexer
  • Carry on
  • At the end of the input stream, pop the prior input stream off the stack and tell the lexer to use it
  • You can do this recursively of course
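
The steps above can be sketched in plain Java without any ANTLR classes. This is only an illustration of the stack bookkeeping, not lexer code: the class name IncludeStackSketch, the method expand, and the in-memory map that stands in for the file system are all hypothetical, and the input is treated as a list of lines rather than a character stream.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.Iterator;
import java.util.List;
import java.util.Map;

/** Hypothetical stand-in for a lexer that switches input on "Import from file". */
final class IncludeStackSketch {

    private static final String IMPORT_PREFIX = "Import from file ";

    /**
     * Returns the lines a parser would see: each import line is replaced by
     * the content of the named file, recursively.
     * 'files' is an assumed in-memory file system (file name -> lines).
     */
    static List<String> expand(List<String> mainInput, Map<String, List<String>> files) {
        List<String> out = new ArrayList<>();
        Deque<Iterator<String>> suspended = new ArrayDeque<>(); // streams waiting to resume
        Iterator<String> current = mainInput.iterator();
        while (true) {
            if (!current.hasNext()) {
                if (suspended.isEmpty()) {
                    break;                     // end of the outermost input
                }
                current = suspended.pop();     // resume the stream that did the import
                continue;
            }
            String line = current.next().trim();
            if (line.startsWith(IMPORT_PREFIX)) {
                String name = line.substring(IMPORT_PREFIX.length()).replace("\"", "");
                List<String> included = files.get(name);
                if (included == null) {
                    throw new IllegalArgumentException("unknown import: " + name);
                }
                suspended.push(current);       // remember where to continue
                current = included.iterator(); // switch to the imported input
            } else {
                out.add(line);                 // the consumer never sees the import line
            }
        }
        return out;
    }

    public static void main(String[] args) {
        Map<String, List<String>> files = Map.of("c:\\file.txt", List.of("myvariable [2]"));
        List<String> main = List.of(
                "MainSection", "Action 2:", "Import from file \"c:\\file.txt\"", "Finish");
        expand(main, files).forEach(System.out::println);
    }
}
```

The sketch has no protection against a file that imports itself; a real implementation would probably want to track the names currently on the stack.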

I believe that ANTLR4 tokens now contain a reference to the input stream that generated them, so your error messages should be good. In ANTLR3 I had to keep a list of the streams and inject a marker of the form

# inputstreamindex line col

whenever a new stream started or finished.

You don't need separate grammars unless you really would have a need to parse these sections on their own in some other context and for some reason cannot just start the parse at the rule in the general grammar, which seems an unlikely situation.

Jim





nomawie

Jan 9, 2015, 4:12:49 AM
to antlr-di...@googlegroups.com





Thanks for your hints, I'll have a look at this. But the case you mentioned is exactly my use case, and the reason why I want two separate parsers that can work together as one. Otherwise I could solve this with abstractions in the code. I need to be able to parse incomplete parts of the DSL, e.g. only the mainsection, verify their syntax, and generate code from an incomplete specification that will be used later somewhere else.

Jim Idle

Jan 9, 2015, 4:35:38 AM
to antlr-di...@googlegroups.com
Yes, but why can't you do that by just invoking the relevant rule in your complete grammar?
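
The idea of starting at a rule other than the top-level one can be shown with a small hand-written parser. This is a simplified stand-in, not ANTLR output: the class MiniDslParser and its reduced grammar are assumptions made for illustration. In a parser generated by ANTLR 4 each parser rule likewise becomes a method, so a fragment would be parsed by calling the mainsection rule method instead of the mydsl one.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical hand-written parser: one method per grammar rule. */
final class MiniDslParser {
    private final List<String> tokens;
    private int pos = 0;

    MiniDslParser(String input) {
        // Whitespace-separated words serve as tokens in this sketch.
        this.tokens = Arrays.asList(input.trim().split("\\s+"));
    }

    // mydsl : 'InitialSection' mainsection 'Finish' EOF ;
    void mydsl() {
        match("InitialSection");
        mainsection();
        match("Finish");
        if (pos < tokens.size()) {
            throw new IllegalStateException("unexpected input after Finish");
        }
    }

    // mainsection : 'MainSection' action* ;   returns the number of actions seen
    int mainsection() {
        match("MainSection");
        int actions = 0;
        while (peek().equals("Action")) {
            action();
            actions++;
        }
        return actions;
    }

    // action : 'Action' NUMBER ;
    void action() {
        match("Action");
        String number = next();
        if (!number.matches("\\d+")) {
            throw new IllegalStateException("expected a number but found '" + number + "'");
        }
    }

    private String peek() {
        return pos < tokens.size() ? tokens.get(pos) : "<EOF>";
    }

    private String next() {
        String token = peek();
        pos++;
        return token;
    }

    private void match(String expected) {
        String actual = next();
        if (!actual.equals(expected)) {
            throw new IllegalStateException("expected '" + expected + "' but found '" + actual + "'");
        }
    }

    public static void main(String[] args) {
        // Complete input: start at the top-level rule.
        new MiniDslParser("InitialSection MainSection Action 1 Action 2 Finish").mydsl();
        // Incomplete fragment: same class, but start at the mainsection rule.
        int actions = new MiniDslParser("MainSection Action 1 Action 2").mainsection();
        System.out.println("actions in fragment: " + actions);
    }
}
```

Because both entry points belong to one class, the fragment and the complete file share the same rule methods and the same result types.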



nomawie

Jan 9, 2015, 4:55:25 AM
to antlr-di...@googlegroups.com
I am not sure I understand you correctly. I do invoke the same rule, since I import the g4 file. Both parsers have, for example, the method
exitMainsection(MainsectionContext ctx)
but they implement it differently. The parser that reads the whole file works with a different model than the parser that reads the incomplete specification. In my complete parser I do the following
subparser = new MainsectionParser(ctx.getOriginalText(start,end));
and pass the result to the subparser. In addition, I change some pointers and the like. What I would like to do is pass the context itself, like
subparser = new MainsectionParser(ctx);
But this is not possible, because the two MainsectionContext classes are different: they are based on the same rule but generated twice.
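
The type mismatch described here can be reproduced without ANTLR. ANTLR's Java target emits each rule's context class as a static nested class of the generated parser, so two parsers generated from the same imported rule each get their own, unrelated context type. The stub classes below are hypothetical stand-ins for the two generated parsers, reduced to the one nested class that matters.

```java
/** Hypothetical stand-in for the parser generated from the complete grammar. */
final class MyDslParserStub {
    static class ActionContext {
        final String text;
        ActionContext(String text) { this.text = text; }
    }
}

/** Hypothetical stand-in for the parser generated from the section grammar. */
final class MainSectionParserStub {
    static class ActionContext {
        final String text;
        ActionContext(String text) { this.text = text; }
    }

    // Accepts only this parser's own context type.
    static String describe(ActionContext ctx) {
        return "action: " + ctx.text;
    }
}

final class ContextTypeDemo {
    /** True only if one parser's context could be used where the other's is expected. */
    static boolean compatible() {
        return MainSectionParserStub.ActionContext.class
                .isAssignableFrom(MyDslParserStub.ActionContext.class);
    }

    public static void main(String[] args) {
        MyDslParserStub.ActionContext fromMain =
                new MyDslParserStub.ActionContext("myvariable [1]");
        // MainSectionParserStub.describe(fromMain);  // would not compile: incompatible types
        System.out.println("compatible: " + compatible());
        // Copying the data into the other type works, but it is the redundancy
        // the post complains about.
        System.out.println(MainSectionParserStub.describe(
                new MainSectionParserStub.ActionContext(fromMain.text)));
    }
}
```

With a single generated parser there is only one ActionContext class, so the question of converting between the two does not arise.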