Could someone send me an example of a main function calling the Lexer and Parser of the ANTLR4 C++ target?


Cleverson Ledur

Jun 24, 2016, 1:19:03 PM
to antlr-discussion
I am a beginner with ANTLR and I am having difficulty compiling examples with the ANTLR4 C++ target. I couldn't find in the documentation the names of the functions to call in order to pass a string to the Lexer and the tokens to the Parser when using the ANTLR4 C++ target.

I found an ANTLR3 project that has the following main code:

#include <iostream>
#include <TParser.hpp>

using namespace User;

int test_main(int argc, char *argv[])
{
    // Choose the input file first: fName must be set before it is
    // passed to the input stream constructor below.
    ANTLR_UINT8* fName;
    if (argc < 2 || argv[1] == NULL)
    {
        fName = (ANTLR_UINT8*)"./input";
    }
    else
    {
        fName = (ANTLR_UINT8*)argv[1];
    }

    TLexer::InputStreamType input(fName, ANTLR_ENC_8BIT);
    TLexer lxr(&input);     // CLexerNew is generated by ANTLR
    TParser::TokenStreamType tstream(ANTLR_SIZE_HINT, lxr.get_tokSource());
    TParser psr(&tstream);  // CParserNew is generated by ANTLR3

    // Run the parser starting at the "program" rule.
    psr.program();

    return 0;
}


int main()
{
    // argc is 1, so test_main falls back to the default "./input" file.
    test_main(1, NULL);

    return 0;
}


Could someone update this code to the new syntax of the ANTLR4 C++ target? In ANTLR4, the ANTLR_ENC_8BIT type name and the User namespace are not recognized.

Thank you very much.

Kevin Cummings

Jun 24, 2016, 5:29:25 PM
to antlr-di...@googlegroups.com
There is an example in runtime/src/Cpp/demo/main.cpp

--
Kevin J. Cummings
Registered Linux User #1232


Cleverson Ledur

Jun 26, 2016, 8:18:00 PM
to antlr-discussion
Hi Kevin,

Thank you for suggesting this demo. I tried to compile it, but without success.

I generated the files with generate.sh by uncommenting the following command and setting $LOCATION to the path of the ANTLR jar file:

java -jar $LOCATION -Dlanguage=Cpp -listener -visitor -o generated/ -package antlrcpptest TLexer.g4 TParser.g4


Afterwards, I copied main.cpp into the "generated" folder so that all the ".cpp" files are in the same folder.

Then, inside the generated folder, I tried the following compilation command:

g++ -fPIC -shared -g -O3 -Wall -std=c++11 *.cpp -I. -I ../../runtime/src/ -I ../../runtime/src/atn/ -I ../../runtime/src/dfa/ -I ../../runtime/src/misc/ -I ../../runtime/src/tree/ -I ../../runtime/src/support/ -I ../../runtime/src/tree/pattern/ -I ../../runtime/src/tree/xpath/  -o parser



When I execute ./parser, I get a segmentation fault.

I ran gdb and the output is the following:

Program received signal SIGSEGV, Segmentation fault.
0x0000000000017dd6 in ?? ()
(gdb) backtrace
#0  0x0000000000017dd6 in ?? ()
#1  0x000055555556bfa0 in __static_initialization_and_destruction_0 (__initialize_p=1, __priority=65535) at /usr/include/c++/5/iostream:74
#2  _GLOBAL__sub_I_main.cpp(void) () at main.cpp:33
(gdb)


Did you have success compiling this demo? Could you describe how you did it or suggest what I am doing wrong?

Thank you very much.

Jim Idle

Jun 26, 2016, 10:25:41 PM
to antlr-di...@googlegroups.com
Those gcc invocation options do not look correct. The -shared option is used to create a shared object, which you then link or load dynamically with a main() program. It does not produce a standalone executable, which would explain the crash when you run ./parser directly.

Also, if you are using pre-built runtime code, you need to make sure that the ABI in use is compatible and that you are compiling with the same flags as the runtime library. Building the runtime library locally might help.

Jim

Kevin Cummings

Jun 26, 2016, 10:46:31 PM
to antlr-di...@googlegroups.com
On 06/26/16 20:18, Cleverson Ledur wrote:
> Hi Kevin,
>
> Thank you for suggesting this demo. I tried to compile this, but without
> success.

I was able to use the code in the demo, to help port a program I already
had running in ANTLR2 and ANTLR3. So, basically, it was just a question
of "what is the magic incantation(s) to use with ANTLR4".

Mike helped me choose the proper C++ support libraries (wide streams).

> I had some problems when trying to compile this. Basically I generated
> the files using the generate.sh file through uncommenting the following
> commands and inserting $LOCATION path to ANTLR jar file:

> java -jar $LOCATION -Dlanguage=Cpp -listener -visitor -o
> generated/ -package antlrcpptest TLexer.g4 TParser.g4

> After I copied the main.cpp file to "generated" folder, to have all
> ".cpp" files in the same folder.

> Then, inside generated folder, I tried the following compilation command:
>
> g++ -fPIC -shared -g -O3 -Wall -std=c++11 *.cpp -I. -I ../../runtime/src/ -I
> ../../runtime/src/atn/ -I ../../runtime/src/dfa/ -I
> ../../runtime/src/misc/ -I ../../runtime/src/tree/ -I
> ../../runtime/src/support/ -I ../../runtime/src/tree/pattern/ -I
> ../../runtime/src/tree/xpath/ -o parser

I have a Makefile that needed some tweaking for ANTLR4, the basics are
the following:

CC = g++
CFLAGS = -fPIC -shared -g -O3 -Wall -std=c++11 $(NUMERRORS)
LDFLAGS =
IFLAGS =-I../antlrcpp -I. -I$(RUNTIME) -I$(RUNTIME)/atn\
-I$(RUNTIME)/dfa -I$(RUNTIME)/misc -I$(RUNTIME)/tree\
-I$(RUNTIME)/tree/pattern -I$(RUNTIME)/tree/xpath\
-I$(RUNTIME)/support

Then I used:

$(CC) $(IFLAGS) $(CFLAGS) -o $@ -c $<

to compile each file, and:

$(CC) -o program $(obj_FILES) ./libantlr4.a

to link them into an executable.

My program (now) runs fine.

> When I tried to execute ./parser, this gives me a segmentation fault.

What is the file type of your file parser? On Linux for me, it is an
ELF executable. Did all of your C++ files compile without errors? Did
the linking succeed without unresolved references? What did you use for
the antlr4 runtime library? Are you using the shared library or the
object archive?

> I ran gdb and the output is the following:
>
> Program received signal SIGSEGV, Segmentation fault.
> 0x0000000000017dd6 in ?? ()
> (gdb) backtrace
> #0 0x0000000000017dd6 in ?? ()
> #1 0x000055555556bfa0 in __static_initialization_and_destruction_0
> (__initialize_p=1, __priority=65535) at /usr/include/c++/5/iostream:74
> #2 _GLOBAL__sub_I_main.cpp(void) () at main.cpp:33
> (gdb)

Didn't Mike say we should use wiostream?

> Did you have success compiling this demo? Could you describe how you did
> or suggest what am I doing wrong?

Can't say without seeing more of your specifics. Note that the demo
program uses wstring to store the string to be parsed....

> Thank you very much.

--

Mike Lischke

Jun 27, 2016, 2:58:35 AM
to antlr-di...@googlegroups.com

> I had some problems when trying to compile this. Basically I generated the files using the generate.sh file through uncommenting the following commands and inserting $LOCATION path to ANTLR jar file:
>
> java -jar $LOCATION -Dlanguage=Cpp -listener -visitor -o generated/ -package antlrcpptest TLexer.g4 TParser.g4
>
> After I copied the main.cpp file to "generated" folder, to have all ".cpp" files in the same folder.
>
> Then, inside generated folder, I tried the following compilation command:
>
> g++ -fPIC -shared -g -O3 -Wall -std=c++11 *.cpp -I. -I ../../runtime/src/ -I ../../runtime/src/atn/ -I ../../runtime/src/dfa/ -I ../../runtime/src/misc/ -I ../../runtime/src/tree/ -I ../../runtime/src/support/ -I ../../runtime/src/tree/pattern/ -I ../../runtime/src/tree/xpath/  -o parser

You could try using the provided cmake file to build the demo (https://github.com/DanMcLaughlin/antlr4/tree/master/runtime/Cpp). Read the README.md file, which describes the few steps needed to build a Linux binary.

Mike Lischke

Jun 27, 2016, 3:06:04 AM
to antlr-di...@googlegroups.com
Hi Kevin,


> I was able to use the code in the demo, to help port a program I already
> had running in ANTLR2 and ANTLR3. So, basically, it was just a question
> of "what is the magic incantation(s) to use with ANTLR4".
>
> Mike helped me choose the proper C++ support libraries (wide streams).

There are no wide streams used anymore. I couldn't get them working for UTF-32 (UTF-16 worked well). So I went a simpler way: use a normal istream, load the content into a std::string and convert that to UTF-32, which is just 2 lines of code now (https://github.com/DanMcLaughlin/antlr4/commit/a2f5cf12fd637505714bde6ceb260500ed18c73e#diff-fe97f5b0769e71c265ec5cca48e44e74L55).
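
The load-then-convert approach described above can be sketched without the runtime. This is only an illustration under stated assumptions: the function name utf8_to_utf32 is hypothetical, the decoder is hand-written rather than taken from the linked commit, and it does not reject overlong or surrogate encodings.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

// Decode a UTF-8 byte string into UTF-32 code points.
// Hypothetical helper for illustration; not part of the ANTLR4 runtime.
// Simplification: overlong forms and surrogate values are not rejected.
std::u32string utf8_to_utf32(const std::string &utf8) {
    std::u32string out;
    std::size_t i = 0;
    while (i < utf8.size()) {
        unsigned char lead = static_cast<unsigned char>(utf8[i]);
        char32_t cp;
        std::size_t extra;  // number of continuation bytes that follow

        if (lead < 0x80)                { cp = lead;        extra = 0; }
        else if ((lead & 0xE0) == 0xC0) { cp = lead & 0x1F; extra = 1; }
        else if ((lead & 0xF0) == 0xE0) { cp = lead & 0x0F; extra = 2; }
        else if ((lead & 0xF8) == 0xF0) { cp = lead & 0x07; extra = 3; }
        else throw std::invalid_argument("invalid UTF-8 lead byte");

        // The sequence must not run past the end of the input.
        if (i + extra >= utf8.size())
            throw std::invalid_argument("truncated UTF-8 sequence");

        // Each continuation byte contributes its low six bits.
        for (std::size_t k = 1; k <= extra; ++k) {
            unsigned char c = static_cast<unsigned char>(utf8[i + k]);
            if ((c & 0xC0) != 0x80)
                throw std::invalid_argument("invalid UTF-8 continuation byte");
            cp = (cp << 6) | (c & 0x3F);
        }

        out.push_back(cp);
        i += extra + 1;
    }
    return out;
}
```

Each element of the returned std::u32string is one complete code point, so a consumer that advances one element at a time never sees half a character.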

> I have a Makefile that needed some tweaking for ANTLR4, the basics are
> the following:
>
> CC = g++
> CFLAGS = -fPIC -shared -g -O3 -Wall -std=c++11 $(NUMERRORS)
> LDFLAGS =
> IFLAGS =-I../antlrcpp -I. -I$(RUNTIME) -I$(RUNTIME)/atn\
> -I$(RUNTIME)/dfa -I$(RUNTIME)/misc -I$(RUNTIME)/tree\
> -I$(RUNTIME)/tree/pattern -I$(RUNTIME)/tree/xpath\
> -I$(RUNTIME)/support

It's no longer necessary to add all the runtime subdirectories to the include path. All #includes have been adjusted to use a relative path where needed.

>
>> Did you have success compiling this demo? Could you describe how you did
>> or suggest what am I doing wrong?
>
> Can't say without seeing more of your specifics. Note that the demo
> program uses wstring to store the string to be parsed....

Actually, the demo uses a UTF-8 encoded std::string (note the u8 prefix on the demo input, which VS2013 doesn't know, but it nonetheless creates a UTF-8 string implicitly).


Mike
--
www.soft-gems.net

Devlin Poster

Jun 30, 2016, 10:44:43 AM
to antlr-discussion

> There are no wide streams used anymore. I couldn't get them working for UTF-32 (UTF-16 worked well). So I went a simpler way: use a normal istream, load the content into a std::string and convert that to UTF-32, which is just 2 lines of code now (https://github.com/DanMcLaughlin/antlr4/commit/a2f5cf12fd637505714bde6ceb260500ed18c73e#diff-fe97f5b0769e71c265ec5cca48e44e74L55).

I looked at the code and found stuff like "codecvt_utf8_utf16"

Could you elaborate on why UTF-8 isn't used throughout the code base? It's UTF-8 input, so just use regular string/istream and you should be set.

The internal conversion between utf8 and utf16 is costly and I don't see the point. What am I missing?

Mike Lischke

Jun 30, 2016, 11:10:52 AM
to antlr-di...@googlegroups.com
Devlin, it's pretty simple: the Lexer state machine executes on a single input symbol for each state + step. I cannot pass in multibyte input there.
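
To illustrate the point about one input symbol per step, here is a toy state machine. It is not ANTLR's lexer: the names State, step, accepts_code_points and accepts_bytes are made up for this sketch, and the letter set is an arbitrary assumption. It accepts one or more letters and makes exactly one transition per symbol it is given.

```cpp
#include <string>

// States of a toy DFA that accepts one or more "letters".
enum class State { Start, InWord, Reject };

// Assumed letter set: ASCII letters plus Greek small letters U+03B1..U+03C9.
bool is_letter(char32_t c) {
    return (c >= U'a' && c <= U'z') || (c >= U'A' && c <= U'Z') ||
           (c >= 0x03B1 && c <= 0x03C9);
}

// Exactly one transition per input symbol.
State step(State s, char32_t symbol) {
    if (s == State::Reject) return s;
    return is_letter(symbol) ? State::InWord : State::Reject;
}

// Feed the machine whole code points (UTF-32 input).
bool accepts_code_points(const std::u32string &input) {
    State s = State::Start;
    for (char32_t symbol : input) s = step(s, symbol);
    return s == State::InWord;
}

// Feed the machine raw UTF-8 bytes, one byte per step.
bool accepts_bytes(const std::string &utf8) {
    State s = State::Start;
    for (unsigned char byte : utf8) s = step(s, byte);
    return s == State::InWord;
}
```

Given UTF-32 input, `x` followed by `π` is two symbols, both letters, so the word is accepted. Given the UTF-8 bytes of the same text, the machine sees 0xCF and 0x80 as separate symbols, neither of which is a letter, and rejects it.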

