ANTLR memory allocations look excessive?


rtm...@googlemail.com

Jun 27, 2022, 7:31:00 AM
to antlr-discussion
For a tiny parsing input in C#, I noticed it took a full second to start up. Using the profiler, I looked at the memory allocated and found that it seems to be generating a gigantic number of objects, which then have to be cleaned up by the GC.


Here is my input, 413 bytes of it:


{
    LDB_run_internal_self_tests := false;
--    LDB_run_internal_self_tests := true;
};

create table not null orders
(
    onum int pkey,
    oname varchar(30),
    cnum int
);

select onum, count(oname) as nn
from orders
where onum = 23
group by onum
having count(oname) = 999
;

select onum, count(oname) as nn
from orders
where onum = 123
group by onum
having count(oname) = 888
;



Attached is an image of the profiler's output. I guess the top graph is the allocation nursery, and it shows what appear to be five garbage collections in that alone. I don't know what the bottom graph is, but it goes up to 48 MB.

Below that is the number of allocations from ANTLR. I have no idea how it could possibly be allocating this much.

A while back I did some profiling with the Python target and tentatively concluded that the slowdown had something to do with the number of objects allocated. That makes sense, because Python uses an expensive reference-counting GC with a fallback collector for reference cycles, whereas Java and C# seem to have much more efficient generational collectors.
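
To illustrate the Python side of that: below is a minimal sketch, assuming CPython (other Python implementations manage memory differently). `Node` is a hypothetical stand-in for a small parser-allocated object, not an ANTLR class. It shows that an acyclic object is reclaimed by reference counting as soon as its last reference goes away, while two objects that reference each other survive until the cycle collector runs.

```python
import gc
import weakref


class Node:
    """Hypothetical stand-in for a small object allocated during parsing."""

    def __init__(self):
        self.other = None


def freed_without_cycle_collector(make_cycle):
    """Return True if the object is reclaimed by reference counting alone."""
    gc.disable()  # keep the cycle collector from running on its own
    try:
        a = Node()
        if make_cycle:
            b = Node()
            a.other, b.other = b, a  # a <-> b reference cycle
            del b
        probe = weakref.ref(a)  # lets us observe whether `a` is still alive
        del a  # drop the last external reference
        return probe() is None
    finally:
        gc.enable()


acyclic_freed = freed_without_cycle_collector(make_cycle=False)
cyclic_freed = freed_without_cycle_collector(make_cycle=True)

# The leftover cycle is only reclaimed once the cycle collector runs.
unreachable_found = gc.collect()

print(acyclic_freed, cyclic_freed, unreachable_found)
```

This says nothing about how many objects the ANTLR runtime itself creates; it only shows the two reclamation paths that every allocated object in CPython goes through.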

I don't have time to dig into this, but I would like other people's opinion. Can anyone reproduce this?

Please note that my parser also builds an AST and a few other bits and pieces, but if you look at the breakdown in the attached image, you'll see that it really is utterly overwhelmed by ANTLR-related objects.

Keep in mind I may be doing something wrong, but given that, any thoughts?

jan
Capture_annot.JPG

Mike Lischke

Jun 27, 2022, 8:25:53 AM
to 'ANTLR announcements
Please use the ANTLR4 GitHub discussions (https://github.com/antlr/antlr4/discussions) for such postings. This mailing list is now mainly an announcement list.


On 27.06.2022 at 13:31, 'rtm...@googlemail.com' via antlr-discussion <antlr-di...@googlegroups.com> wrote:

For a tiny parsing input in C#, I noticed it took a full second to start up. Using the profiler, I looked at the memory allocated and found that it seems to be generating a gigantic number of objects, which then have to be cleaned up by the GC.


Thanks,