C++ ANTLR4 runtime done


Mike Lischke

Jun 17, 2016, 5:11:49 AM
to antlr-di...@googlegroups.com
Hi,

after some further fine tuning and improvements I consider the C++ runtime to be stable enough for real world projects. Of course there might be small changes in the API here and there in the future, mostly for further fine tuning (like lowering the usage of smart pointers), but otherwise it's ready for prime time. I haven't seen any bug reports so far and I'm actively integrating it into a big application, so it looks as if it's (mostly) done.

I've written a blog post (http://www.soft-gems.net/index.php/tools/49-the-antlr4-c-target-is-here) with all the relevant details about the target and will extend it if necessary. I also added a markdown document to the ANTLR4 docs (https://github.com/DanMcLaughlin/antlr4/blob/master/doc/cpp-target.md), similar to what exists for the other targets.

What's left is just organizational stuff (like merging up to the main ANTLR repo and adding the C++ target to the ANTLR release process). Until this is done I offer a snapshot jar for download that includes the C++ target support: http://www.soft-gems.net/files/antlr4-4.5.4-SNAPSHOT.jar. As mentioned previously the C++ runtime folder also contains a demo project with individual project files for cmake, Visual Studio and XCode. That should make it easy to get started with it.

Hope you like it. Enjoy,

Mike
--
www.soft-gems.net

Jim Idle

Jun 17, 2016, 5:39:56 AM
to antlr-di...@googlegroups.com
Good work!

I'll try to get some time to look at it and maybe help with the performance (if it needs it, that is). I spent quite a lot of time on the v3 C performance and maybe some ideas can carry over.

Cheers,

Jim







Mike Lischke

Jun 17, 2016, 9:16:54 AM
to antlr-di...@googlegroups.com
> I'll try to get some time to look at it and maybe help with the performance (if it needs it, that is). I spent quite a lot of time on the v3 C performance and maybe some ideas can carry over.

Jim, that would be great. I profiled the test parser (in Xcode Instruments) to see where it spends its time, and the top 4-5 entries were always shared_ptr allocation + deallocation and related code. So I'd like to get those smart pointers out of the library as much as possible (without sacrificing memory stability, of course). I was even thinking about implementing a simple GC (a class that would store memory pointers and maintain a retain count), but decided against that when I finally had the breakthrough in speed.
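
To give a feel for what shows up in such a profile, here is a generic sketch in plain standard C++ (illustration only, not ANTLR runtime code): every by-value shared_ptr copy does an atomic reference count increment and later a decrement, which passing by const reference or handing out raw, non-owning pointers avoids entirely.

// Generic illustration of shared_ptr copy cost; not ANTLR runtime code.
#include <memory>

struct Node { int value; };

// Copies the shared_ptr: one atomic ref-count increment on entry, one decrement on return.
int byValue(std::shared_ptr<Node> n) { return n->value; }

// No ref-count traffic: the caller keeps ownership.
int byConstRef(const std::shared_ptr<Node> &n) { return n->value; }
int byRawPointer(const Node *n) { return n->value; }

int main() {
  auto node = std::make_shared<Node>();
  node->value = 1;
  return byValue(node) + byConstRef(node) + byRawPointer(node.get());
}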

Mike
--
www.soft-gems.net

Jim Idle

Jun 17, 2016, 9:57:13 AM
to antlr-di...@googlegroups.com
Yeah, I would avoid implementing your own GC - it's fraught with difficulty. Now that you have your base code you can work on the allocations. Generally that is the issue with all C++ code, in my humble opinion. There are some other STL implementations, mostly from gaming, that do a good job in this regard, so if people want speed they could use them.

It's not clear to me that v4 needs to be that fast though. Few of us are writing compilers that need the front end to be exceptionally fast. But time will always make things faster (no pun intended ;)

You may be able to make use of unique_ptr?







Mike Lischke

Jun 17, 2016, 10:50:26 AM
to antlr-di...@googlegroups.com
> Yeah, I would avoid implementing your own GC - it's fraught with difficulty. Now that you have your base code you can work on the allocations. Generally that is the issue with all C++ code, in my humble opinion. There are some other STL implementations, mostly from gaming, that do a good job in this regard, so if people want speed they could use them.

Heh, that would be a funny thing, though not for my stuff :-)

>
> It's not clear to me that v4 needs to be that fast though. Few of us are writing compilers that need the front end to be exceptionally fast. But time will always make things faster (no pun intended ;)

Well, even simple error checks must be quick, e.g. in an IDE. You don't want to wait two or more seconds until the parser has determined the correctness of, say, a complex class. In my tests I used an expression of about 4500 tokens, which started out at 17s just for lexing. By avoiding many shared_ptr copies in the runtime I improved that by 30%, down to ~12s. By removing just 2 left edge predicates in the grammar this went down to 70ms (after the warmup phase, which is ~0.5s for this grammar). What an improvement!

In (My)SQL I often have huge files: 250MB in size, 3.5 million lines long, with hundreds of thousands of statements (i.e. a dump), all of which I need to send through the parser for error checking. That shouldn't take longer than a couple of seconds. FYI: it's about 20s with the ANTLR3 C target. As I said before: that really rocks :-). I'm not able to measure this with the new target yet.
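
If someone wants to reproduce such timings, a simple wall-clock harness is enough. The following is only a sketch: the actual lex/parse call is left as a placeholder because it depends on your generated classes, about which nothing is assumed here. Running the same work twice separates the warmup (DFA construction) from the warm run.

// Minimal timing sketch in plain standard C++; put your own lexing/parsing
// call into the lambda body.
#include <chrono>
#include <functional>
#include <iostream>

static double measureSeconds(const std::function<void()> &work) {
  auto start = std::chrono::steady_clock::now();
  work();
  auto end = std::chrono::steady_clock::now();
  return std::chrono::duration<double>(end - start).count();
}

int main() {
  auto runOnce = [] { /* reset the input and lex/parse it here */ };
  double cold = measureSeconds(runOnce);  // includes warmup (DFA construction)
  double warm = measureSeconds(runOnce);  // cached decisions are reused
  std::cout << "cold: " << cold << " s, warm: " << warm << " s\n";
  return 0;
}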

>
> You may be able to make use of unique_ptr?

Yes, I already converted a number of the shared pointers to unique_ptr, especially in the token stream classes (buffered and unbuffered token stream + token list source), and pass around raw pointers from them.
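
As a sketch of that ownership style (hypothetical classes, not the actual runtime API): the owner keeps its objects in unique_ptr and hands out raw, non-owning pointers, so nothing on the hot path touches a reference count.

// Hypothetical sketch of the "owner holds unique_ptr, everyone else gets a
// raw pointer" pattern; these are not the real ANTLR runtime classes.
#include <cstddef>
#include <memory>
#include <string>
#include <vector>

struct Token {
  std::size_t type;
  std::string text;
};

class TokenBuffer {
public:
  // The buffer is the sole owner of every token it creates.
  Token *add(std::size_t type, std::string text) {
    _tokens.push_back(std::unique_ptr<Token>(new Token{type, std::move(text)}));
    return _tokens.back().get();  // non-owning, valid as long as the buffer lives
  }

  Token *get(std::size_t index) const { return _tokens[index].get(); }
  std::size_t size() const { return _tokens.size(); }

private:
  std::vector<std::unique_ptr<Token>> _tokens;
};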

Mike
--
www.soft-gems.net

Jonathan Coveney

Jun 17, 2016, 10:53:21 AM
to antlr-di...@googlegroups.com
A heroic effort. Well done!!

Sam Harwell

Jun 17, 2016, 10:57:24 AM
to antlr-di...@googlegroups.com
I've found that compiling with full optimizations makes a big difference for smart pointer types. For reference, were optimizations enabled when you ran the tests described below?

I know that some of the graph-like data structures use (traverse) a large number of references so it makes sense if that's proving costly.

Sam



Mike Lischke

Jun 17, 2016, 11:19:39 AM
to antlr-di...@googlegroups.com
> I've found that compiling with full optimizations makes a big difference for smart pointer types. For reference, were optimizations enabled when you ran the tests described below?
>
> I know that some of the graph-like data structures use (traverse) a large number of references so it makes sense if that's proving costly.

Good point. It's not a big difference in clang, but with Visual Studio it's quite different. However, I haven't timed the tests there yet.

However, this really only matters if the allocations are repeated often, e.g. when a DFA state is not cached because there is a left edge predicate. I have > 700 lexer tokens in my grammar and had only 2 with such predicates. Yet everything was thrown away and recomputed for every ATN execution step, which is of course extremely slow. For reference I also built a Java parser from my grammar and the same query (the one with 4500 tokens) ran in 2.5s (before I optimized it). Still a lot faster than the 12s in C++. Quite depressing if Java code executes faster than C++ code :-) After optimization the world was fine again (70ms in C++ and 600ms in Java) :-D

Mike

Eric Vergnaud

Jun 17, 2016, 12:59:10 PM
to antlr-discussion
Great job!

Let me know if you need help with the "organizational stuff".
We have high expectations re automated testing using Travis CI, really looking forward to a common set of automated tests for all runtime targets.

Eric

kang joni

Jun 18, 2016, 1:12:12 AM
to antlr-discussion


On Friday, June 17, 2016 at 10:19:39 PM UTC+7, Mike Lischke wrote:

> However, this really only matters if the allocations are repeated often, e.g. when a DFA state is not cached because there is a left edge predicate. I have > 700 lexer tokens in my grammar and had only 2 with such predicates. Yet everything was thrown away and recomputed for every ATN execution step, which is of course extremely slow. For reference I also built a Java parser from my grammar and the same query (the one with 4500 tokens) ran in 2.5s (before I optimized it). Still a lot faster than the 12s in C++. Quite depressing if Java code executes faster than C++ code :-) After optimization the world was fine again (70ms in C++ and 600ms in Java) :-D
>
> Mike
congrats :)
 

Mike Lischke

Jun 18, 2016, 7:03:01 AM
to antlr-di...@googlegroups.com
Hey Eric,

>
> Let me know if you need help with the "organizational stuff".
> We have high expectations re automated testing using Travis CI, really looking forward to a common set of automated tests for all runtime targets.

You could indeed help there. The first thing is the runtime tests. Can you check the Python + JS runtime tests and fix them where necessary? I tried hard not to affect the other targets, but I had to add a few things in the test templates that might have broken them.

And the other issue: currently Travis CI fails with the pull request I sent: https://github.com/antlr/antlr4/pull/1210. The compiler cannot cope with C++11 atm. We'd need someone with Travis (and at least a bit of C++) knowledge who can fix that. I'll withdraw the pull request however, as I already changed a few more things, and will send a new one once we have the Travis build working. Is it possible to run Travis CI before sending out a new pull request?

There is a third issue, which I hope David Sisson will fix: for the runtime tests we currently have to copy the built static runtime lib manually. This should of course be automated.

I'll take care to update the doc/releasing-antlr.md file with instructions for deploying the C++ runtime (prebuilt binaries and headers for VS 2013 + VS 2015 and XCode, plus the full source code for cmake).

Mike
--
www.soft-gems.net

Eric Vergnaud

Jun 18, 2016, 9:30:35 AM
to antlr-discussion
Hi Mike,

Sure.

I'm moving country this week-end, so won't be able to look into python / js tests before Wednesday.

Re Travis CI, what should be possible is to run all the tests locally for all targets (that's what I do).
Travis does nothing more except create fresh environments.
What I had to do for some of the targets was to actually install the correct versions of whatever was required as part of the build (this is done in the yaml file if I recall correctly).
Automating the copy of the static lib should be easy to do. Will look into that once the other stuff is ok.

Eric

Jan Krause

Jun 19, 2016, 5:17:30 PM
to antlr-discussion
great job guys!

Mike Lischke

Jun 20, 2016, 3:30:04 AM
to antlr-di...@googlegroups.com

> I'm moving country this week-end, so won't be able to look into python / js tests before Wednesday.

Ok, sure. Private stuff first :-)

>
> Re Travis CI, what should be possible is to run all the tests locally for all targets (that's what I do).
> Travis does nothing more except create fresh environments.

Hmm, I'd like to avoid doing that locally for a number of reasons. But of course, if you can do that I'm fine with that.

> What I had to do for some of the targets was to actually install the correct versions of whatever was required as part of the build (this is done in the yaml file if I recall correctly).
> Automating the copy of the static lib should be easy to do. Will look into that once the other stuff is ok.

Don't worry about the lib copy. It shouldn't actually be copied at all, but built from scratch when starting the tests. David Sisson will take care of that soon. I'm working on deployment scripts atm, so it's really just the Travis CI build and the JS/PY runtime tests that are left for you.

Mike
--
www.soft-gems.net

Devlin Poster

Jun 20, 2016, 5:50:08 PM
to antlr-discussion
Great work guys!

Is this considered production quality? Working on a commercial C++ application that needs a quality parser.

I was about to use JavaCC, because it supports C++ pretty well, but now that Antlr4 got a C++ target, I may consider switching.

Thanks.

Mike Lischke

Jun 21, 2016, 3:24:06 AM
to antlr-di...@googlegroups.com
>
> Is this considered production quality? Working on a commercial C++ application that needs a quality parser.
>
> I was about to use JavaCC, because it supports C++ pretty well, but now that Antlr4 got a C++ target, I may consider switching.

The new target has not been widely tested yet, but I know of several people who started with it and except for a file encoding problem I have only positive feedback so far. I hope I can fix that encoding problem today.

Mike
--
www.soft-gems.net

Devlin Poster

Jun 21, 2016, 4:29:08 AM
to antlr-discussion

> The new target has not been widely tested yet, but I know of several people who started with it and except for a file encoding problem I have only positive feedback so far. I hope I can fix that encoding problem today.


Mike,

I started playing around with it last night and it works flawlessly so far. 

There's something almost unreal about working with the IntelliJ ANTLR plugin to develop and verify the grammar, and then being able to generate a working C++ parser. This is seriously good stuff.

The fact that, for the most part, it mirrors the Java API makes it even easier, since all the documentation out there is valid.

Is this going into the upstream ANTLR4 distro?


Mike Lischke

Jun 21, 2016, 5:21:43 AM
to antlr-di...@googlegroups.com
> I started playing around with it last night and it works flawlessly so far.
>
> There's something almost unreal about working with the IntelliJ ANTLR plugin to develop and verify the grammar, and then being able to generate a working C++ parser. This is seriously good stuff.
>
> The fact that, for the most part, it mirrors the Java API makes it even easier, since all the documentation out there is valid.

Devlin, I'm glad to hear it's working so well for you too.

>
> Is this going into the upstream ANTLR4 distro?

Absolutely! I just finished the deployment script for Windows. That should make it easy for Ter to deploy the C++ runtime along with any other ANTLR4 stuff (provided he can find Mac + Win machines to compile the binaries :-); for Linux it's source code + cmake anyway). As I wrote before, we need to do a last change in the runtime tests (building the static lib as part of that) and have to make Travis CI happy. Then we are ready to go.

Mike
--
www.soft-gems.net

Adam Retter

Jun 21, 2016, 6:28:40 AM
to antlr-di...@googlegroups.com
>> Is this going into the upstream ANTLR4 distro?
>
> Absolutely! I just finished the deployment script for Windows. That should make it easy for Ter to deploy the C++ runtime along with any other ANTLR4 stuff (provided he can find Mac + Win machines to compile the binaries :-); for Linux it's source code + cmake anyway). As I wrote before, we need to do a last change in the runtime tests (building the static lib as part of that) and have to make Travis CI happy. Then we are ready to go.
>

I have been following this thread with interest. I have a suggestion that might help with the Windows builds, and it is an approach that I am considering in another Java/C++ multiplatform project. I see that Antlr already uses Travis-CI with its GitHub repository. You could also add AppVeyor into the mix, which is similar but is a CI for Windows code, and is available free for open source projects. You can have Antlr upload binary build artifacts after a successful build. In this manner, creating a release would just involve collecting the Windows build artifacts uploaded to, say, Bintray or GitHub by AppVeyor and packaging them into the distribution (or whatever you do). This could help remove the burden of needing a licensed Windows machine to do a release.

Cheers Adam.

--
Adam Retter

skype: adam.retter
tweet: adamretter
http://www.adamretter.org.uk

Devlin Poster

Jun 21, 2016, 7:22:42 AM
to antlr-discussion

> Absolutely! I just finished the deployment script for Windows.

We use CLion for C++ on all platforms, which is CMake based (CMake can also generate Visual Studio project files IIRC)

Will the CMake file also work on Windows/CLion? I'm currently on my Mac, but I can try it later this week.

Mike Lischke

Jun 21, 2016, 9:21:07 AM
to antlr-di...@googlegroups.com
Hi Adam,

>>
>> Absolutely! I just finished the deployment script for Windows. That should make it easy for Ter to deploy the C++ runtime along with any other ANTLR4 stuff (provided he can find Mac + Win machines to compile the binaries :-); for Linux it's source code + cmake anyway). As I wrote before, we need to do a last change in the runtime tests (building the static lib as part of that) and have to make Travis CI happy. Then we are ready to go.
>>
>
> I have been following this thread with interest. I have a suggestion
> that might help with the Windows builds, and it is an approach that I
> am considering in another Java/C++ multiplatform project.

Well, I have no trouble with the VS builds. Everything is fine and working. It's just that Travis CI must be configured correctly to build our C++11 code.

> I see that
> Antlr already uses Travis-CI with it's GitHub. You could also add
> AppVeyor into the mix (which is similar but is a CI for Windows code)
> and is available free for Open Source projects.

That would be a decision for Ter, but right, that would help to check builds also on Windows machines (I believe Travis CI is only using Linux, but I might be wrong).

> You can have Antlr
> upload binary build artifacts after a successful build. In this manner
> creating a release, would just involve collecting the Windows build
> artifacts uploaded to say BinTray or GitHub by AppVeyor and packaging
> them into the distribution (or whatever you do). This could help
> remove the burden of needing a licensed Windows machine to do a
> release.

Interesting thought. Tbh I don't know how Ter is going to manage Win + Mac builds for ANTLR in the future. But AppVeyor could be a help at least for Win builds.

On the other hand: the code won't change that much. I could even imagine providing the binaries myself. It's really not much work. It takes 60s to build in XCode and VS, plus the time to upload. Really trivial. So it remains to be determined whether it is worth adding yet another tool to the mix, with all the registration, setup and whatnot, or just doing the manual builds once every 6 months.

Mike
--
www.soft-gems.net

Mike Lischke

Jun 21, 2016, 9:26:23 AM
to antlr-di...@googlegroups.com


>> Absolutely! I just finished the deployment script for Windows.
>
> We use CLion for C++ on all platforms, which is CMake based (CMake can also generate Visual Studio project files IIRC)
>
> Will the CMake file also work on Windows/CLion? I'm currently on my Mac, but I can try it later this week.

No, cmake currently only works on Linux + OSX (without creating project files).


Devlin Poster

Jun 21, 2016, 9:52:05 AM
to antlr-discussion

> Devlin, I'm glad to hear it's working so well for you too.


Further testing of the C++ target today has made me confident enough to pick ANTLR4. I might just become the first production user of the new target :D


Cristian Adam

Jun 21, 2016, 10:23:49 AM
to antlr-discussion
It's way easier to maintain only one build system. CMake has generators for all newer Visual Studio versions, and it's also being used by Microsoft.

One could also build the Java parts with CMake; that way the C++ developers won't have to deal with... java tools :)

Cheers,
Cristian.
 

Adam Retter

Jun 21, 2016, 10:54:20 AM
to antlr-di...@googlegroups.com
> One could also build the Java parts with CMake, this way the C++ developers
> won't have to deal with... java tools :)

Likewise you could build the C++ code with Java tools (Maven NAR
plugin), then Java developers don't have to deal with C++ tools :-p
Sorry couldn't resist ;-)

Devlin Poster

Jun 21, 2016, 8:20:34 PM
to antlr-discussion
By the way, a lot of people would be happy to see ANTLR C++ over at conan.io as well (it's rapidly becoming *the* package manager for C++, since it's cross-platform and actually works).

Eric Vergnaud

Jun 28, 2016, 5:51:52 AM
to antlr-discussion
Mmmm...
Started looking at the tests, and a lot of them are indeed broken...
I strongly suggest you make it possible to run them locally, waiting for a Travis build is going to be terribly slow...

Mike Lischke

Jun 28, 2016, 5:56:35 AM
to antlr-di...@googlegroups.com
>
> Mmmm...
> Started looking at the tests, and a lot of them are indeed broken...
> I strongly suggest you make it possible to run them locally, waiting for a Travis build is going to be terribly slow...


What can I do, Eric? I cannot fix those tests, so it wouldn't make much sense to do a local Travis build. I of course ran the C++ runtime tests (and those for the Java target). Can't you just start with the Python + JS runtime tests? That wouldn't need any extra preparation on either side. The Travis build issue is a separate thing and not related to the runtime testing.

Mike
--
www.soft-gems.net

Eric Vergnaud

Jun 29, 2016, 4:38:49 AM
to antlr-discussion
Mike,

as we progress with this, it is likely that some tests will be broken.
It does make sense to be able to run a local full Maven build if only to check that.

Eric

Mike Lischke

Jun 29, 2016, 5:11:00 AM
to antlr-di...@googlegroups.com
> as we progress with this, it is likely that some tests will be broken.
> It does make sense to be able to run a local full Maven build if only to check that.


Absolutely, that's what I did. Just not a Travis CI build.

Mike
--
www.soft-gems.net

Eric Vergnaud

Jun 29, 2016, 1:18:39 PM
to antlr-discussion
And the non-Cpp tests pass?

Mike Lischke

Jun 29, 2016, 2:55:02 PM
to antlr-di...@googlegroups.com
Eric, I don't understand your question. I told you already what I tested and what passed. Runtime tests have nothing to do with Travis CI. The build fails there, which is a different problem. Of the runtime tests, those for C++ and Java pass. I'm not sure about C#, as there were so many failures when I ran the full test bed, but I can check this tomorrow again.

Mike Lischke

Jul 1, 2016, 4:09:36 AM
to antlr-di...@googlegroups.com
Hi Eric,

> I'm not sure about C#, as there were so many failures when I ran the full test bed, but I can check this tomorrow again.


I just checked and saw that the C# tests are also fine. So it's really only about JS + PY (probably some syntax errors in the template files).

Mike
--
www.soft-gems.net

Gerald Gainant

Jul 18, 2016, 6:37:05 PM
to antlr-discussion
Could you post again the file antlr4-4.5.4-SNAPSHOT.jar? Thx

Mike Lischke

Jul 20, 2016, 11:21:39 AM
to antlr-di...@googlegroups.com
>
> Could you post again the file antlr4-4.5.4-SNAPSHOT.jar? Thx

I updated the jar today.


Mike
--
www.soft-gems.net

Gerald Gainant

Sep 6, 2016, 8:48:26 PM
to antlr-discussion

I get the following error while trying to access the link http://www.soft-gems.net/files/antlr4-4.5.4-SNAPSHOT.jar:

404 - File or directory not found.

The resource you are looking for might have been removed, had its name changed, or is temporarily unavailable.

Gerald Gainant

Sep 6, 2016, 8:55:02 PM
to antlr-discussion

Mike Lischke

Sep 7, 2016, 3:16:10 AM
to antlr-di...@googlegroups.com
Yes, I moved the file to my regular downloads, so I can track the download count. You can also get it from here: http://www.soft-gems.net/index.php/all-downloads.


Gerald Gainant

Sep 7, 2016, 2:55:57 PM
to antlr-discussion
Hi Mike,

Yep, it works for me now, and thank you for the time you put into getting this C++ target working.

About the impact of C++11 memory management on performance: do you think an alternative to shared_ptr could be considered?

Personally, I expect a parser to be simple and fast. One way to be fast could be to allocate continuously from a large block of memory without recycling allocations at all, releasing the whole block once the parsing is done.
This implies some constraints, but it's fast and simple and provides nice persistence of the data, allowing you to get rid of reference counters.
If the parser needs a temporary 10MB of memory or more to run on a file's content, I am ok with that, as long as I can get that memory back once the parsing is done, with a simple free().
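
To make that concrete, here is a minimal bump-arena sketch in plain C++ (illustration only; nothing here comes from the ANTLR runtime). Allocations are never freed individually; destroying the arena releases every block at once.

// Minimal bump-pointer arena sketch (standard C++, illustration only).
// It does not run destructors, so it only suits trivially destructible data,
// and it assumes single requests are smaller than the block size.
#include <cstddef>
#include <memory>
#include <vector>

class Arena {
public:
  explicit Arena(std::size_t blockSize = 1 << 20) : _blockSize(blockSize) {}

  // alignment must be a power of two.
  void *allocate(std::size_t bytes, std::size_t alignment = alignof(std::max_align_t)) {
    _offset = (_offset + alignment - 1) & ~(alignment - 1);  // align up
    if (_blocks.empty() || _offset + bytes > _blockSize) {
      _blocks.emplace_back(new char[_blockSize]);
      _offset = 0;
    }
    void *result = _blocks.back().get() + _offset;
    _offset += bytes;
    return result;
  }

private:
  std::size_t _blockSize;
  std::size_t _offset = 0;
  std::vector<std::unique_ptr<char[]>> _blocks;
};

Parse-tree nodes could be placement-new'ed into such an arena and the whole tree released with it, which is essentially what APR pools offer in C.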

The memory pool system from the Apache Portable Runtime is a very efficient way to deal with memory in C/C++, far better in my view than an OO model like shared_ptr and co.
It allows a higher level of management and can be customized a lot, e.g. by assigning a specific allocator to the context of a pool's usage.
And I think it could be very easy to integrate into your C++ target, and it would be appreciated by C++ programmers who want control over memory management.

I hope this is useful feedback.

Mike Lischke

Sep 8, 2016, 3:55:30 AM
to antlr-di...@googlegroups.com
Hi Gerald,

> About the impact of C++11 memory management on performance: do you think an alternative to shared_ptr could be considered?

I'm still not 100% settled yet. The current solution seems to work quite nicely and meanwhile the performance is close to the ANTLR3 C target (which is probably the fastest target you can get for ANTLR).

>
> Personally, I expect a parser to be simple and fast. One way to be fast could be to allocate continuously from a large block of memory without recycling allocations at all, releasing the whole block once the parsing is done.

Well, what I don't want is a memory manager of our own. It's simply way too much for a single library. However, I've been thinking about:

1) Implement our own ref counting. I have done this in another application, so I know how to do it.
2) Use arenas (or zones) which get deallocated in one step when you free your parser (or the tree).
3) Use a parse context which serves as the manager of all the memory allocated during parsing and which frees everything on destruction (in the classic way, with a delete call).

Keep in mind, however, that there are actually 2 areas where memory is used in the ANTLR runtime: the static data (DFA, ATN), which is shared among all instances of a specific parser/lexer class and can easily grow to half a GB (I have seen this with deeply nested expressions), and the transient data created during a parse run (mostly the parse tree, which you can switch off, btw.). I have lowered the memory load of the static data by the move from an array to a map for certain sparse containers (I mentioned that in an earlier mail) and there may be more to optimize. This static data can probably also be optimized to use no shared pointers, which are currently required to keep objects alive that are held only by other substructures (e.g. ATN configs). Since this is static data, maybe we don't even need to care about destruction at all, as it stays alive anyway as long as the application is running.
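
To illustrate that array-to-map change, here is a toy example (not the runtime's actual containers): sparse per-state data then only costs memory for the states that actually get an entry.

// Toy illustration of the dense-array vs. sparse-map trade-off; not the
// ANTLR runtime's actual data structures.
#include <cstddef>
#include <map>

struct EdgeInfo { int targetState; };

// Dense alternative: std::vector<EdgeInfo> edges(stateCount);
// pays for every possible state up front, even if most slots stay empty.

class SparseEdges {
public:
  void set(std::size_t state, EdgeInfo info) { _edges[state] = info; }

  // Returns nullptr when no entry exists for the state.
  const EdgeInfo *find(std::size_t state) const {
    auto it = _edges.find(state);
    return it == _edges.end() ? nullptr : &it->second;
  }

private:
  std::map<std::size_t, EdgeInfo> _edges;  // memory only for populated states
};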

Optimizing the parse tree is probably easier, however, because it's mostly just the tree itself that holds the context instances, and the only thing we have to make sure of is that they are freed when the user no longer holds a reference to the tree. Here the idea of a parse context would fit quite well, and this is what I'm currently rolling around in my brain.

Mike
--
www.soft-gems.net

Gerald Gainant

Sep 9, 2016, 10:10:16 PM
to antlr-discussion
Hi Mike,

All is fine for me now and I have my first listener working. That's great, though also a first disappointment: ~2s and 436MiB of memory to build the parse tree of a 3.57MiB input file ... You mentioned the two areas where memory is used; I don't know how much is allocated for the parse tree and will need to figure out this aspect.
A factor of 120 is ... surprising. A factor < 10 could be very good and a huge gain on the parsing time too.
Maybe some other people can start reporting numbers too, to check mine and to track progress?
Definitely I need to dig deeper into the runtime now. In my situation, the number of small memory allocations probably impacts both the performance and the footprint.

Mike Lischke

Sep 10, 2016, 6:24:46 AM
to antlr-di...@googlegroups.com

> All is fine for me now and I have my first listener working. That's great, though also a first disappointment: ~2s and 436MiB of memory to build the parse tree of a 3.57MiB input file ...

Well, 2s for parsing a 3.5MB input file is not that bad, I'd say. Is that including warmup or after that? The amount of memory the DFA/ATN consumes depends on the input. This structure grows while it runs the prediction (if needed). Once a path through the rules is initialized it takes the cached values, however (unless you have left hand side predicates, as I mentioned before). And as I wrote in yet another mail, especially rules which allow deep recursion (and input which uses that) can make the static data grow quite large.

> You mentioned the two areas where memory is used; I don't know how much is allocated for the parse tree and will need to figure out this aspect.

The parse tree is usually very small. It's simply the list of parse contexts that have been seen during a parse run. That is probably only relevant for very large input. You can switch off the parse tree generation, but then you won't be able to do much with the parse result (a tree walk without a tree is obviously not possible); for syntax checks or with a parse listener there are still some useful areas, though.
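
As a rough sketch of what a syntax-check-only run looks like from user code: the generated class names below (MyLexer, MyParser) are hypothetical, and the ANTLR-specific calls are left as comments because they assume the C++ API mirrors the Java runtime here (ANTLRInputStream, CommonTokenStream, setBuildParseTree, getNumberOfSyntaxErrors); the exact names and namespaces may differ in the snapshot.

// Hedged sketch of a syntax-check-only run; MyLexer/MyParser are hypothetical
// generated classes and the commented lines are assumptions about the API.
#include <iostream>
#include <string>

bool syntaxCheckOnly(const std::string &input) {
  // ANTLRInputStream stream(input);
  // MyLexer lexer(&stream);
  // CommonTokenStream tokens(&lexer);
  // MyParser parser(&tokens);
  // parser.setBuildParseTree(false);  // skip tree construction entirely
  // parser.topRule();                 // hypothetical start rule
  // return parser.getNumberOfSyntaxErrors() == 0;
  (void)input;                         // placeholder so the sketch compiles as-is
  return true;
}

int main() {
  std::cout << (syntaxCheckOnly("SELECT 1;") ? "ok" : "has syntax errors") << "\n";
  return 0;
}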

> A factor of 120 is ... surprising. A factor < 10 could be very good and a huge gain on the parsing time too.

What is this 120 or 10 about? I'm missing the context atm.

Mike
--
www.soft-gems.net
