std::unordered_map<std::string, std::shared_ptr<macro> > macroLibrary;
ANTLRInputStream *_input;
CommonTokenStream *_cts;
mLexer *_mLexer; // We seem to be able to remove this without affecting visit.
mParser *_mParser;
mParserVisitor *_mParserVisitor;
mParser::CodeContext *_mParseTree;
We are implementing a macro language, which processes upwards of 15,000 nodes (think files) at a time, each one of which will be composed of a core set of macros and source-code fields (which themselves are visitable and are, in effect, unparameterised macros). There are about 2,000 core macros and another 50,000 instances of visitable source-code in any given instance.

Some macro definitions are visited over 100k times - and some visitable source is likewise visited 15k or more times.

My first foray on this group was because we weren't aware of the impact of the parse versus the visit. So we now ensure that all source and macros are parsed just once, and then the parsed code is stored in a map as follows (each macro has a unique name):

std::unordered_map<std::string, std::shared_ptr<macro>> macroLibrary;
Using this method, we have brought process execution time to 200% of the original hand-rolled parse, which we consider to be acceptable.

The problem is that the map is massive, and the memory cost is prohibitive.
Our servers are killing these processes on our larger instances due to prohibitive memory allocation. FYI our own code uses shared_ptr everywhere we can and we aren't seeing significant memory leaks.
Each macro instance consists of the following ANTLR4 objects (namespaces removed):
ANTLRInputStream *_input;
CommonTokenStream *_cts;
mLexer *_mLexer;
mParser *_mParser;
mParserVisitor *_mParserVisitor;
mParser::CodeContext *_mParseTree;
QUESTION 1: Can we delete or remove any of these objects once we have parsed the source-code?

Obviously we need to keep enough to visit the code, but that is all.
QUESTION 2: Are there any other ways that we can reduce the memory impact of the ANTLR code? Stack Exchange suggests tweaks like clearing the DFA with ParserATNSimulator.clearDFA(), but my concern is that, since the DFA cache is static, clearing it may have a prohibitively negative performance impact. Someone also mentioned resetting the PredictionContextCache - but again, I don't know whether that is used during visits, nor what sort of impact clearing it would have.
On 19 Sep 2018, at 08:24, 'Mike Lischke' via antlr-discussion <antlr-di...@googlegroups.com> wrote:
Hi Ben,

We are implementing a macro language, which processes upwards of 15,000 nodes (think files) at a time, each one of which will be composed of a core set of macros and source-code fields (which themselves are visitable and are, in effect, unparameterised macros). There are about 2,000 core macros and another 50,000 instances of visitable source-code in any given instance.
Some macro definitions are visited over 100k times - and some visitable source is likewise visited 15k or more times.
My first foray on this group was because we weren't aware of the impact of the parse versus the visit. So we now ensure that all source and macros are parsed just once, and then the parsed code is stored in a map as follows (each macro has a unique name):
std::unordered_map<std::string, std::shared_ptr<macro> > macroLibrary;
Using this method, we have brought process execution time to 200% of the original hand-rolled parse, which we consider to be acceptable.
The problem is that the map is massive, and the memory cost is prohibitive.
How much is "massive" here? 2000 macros, each consisting of the 6 pointers makes it 96K RAM. Nothing I'd call massive.
I don't know what you store in the macros, but don't duplicate the input. It's usually enough to store a macro name and start and end positions of the replacement text.
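A minimal sketch of that idea (the names here are hypothetical, not from the thread): each macro record keeps only its name and the [start, end) offsets of its replacement text inside one shared copy of the source, and materialises the text on demand instead of owning a duplicate.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <string_view>

// One shared copy of the full source text lives elsewhere; each macro
// records only where its replacement text sits inside that copy.
struct MacroRef {
    std::string name;
    std::size_t start;  // offset of the first character of the replacement text
    std::size_t end;    // one past the last character

    // Materialise the replacement text on demand, without owning a copy.
    std::string_view text(std::string_view source) const {
        return source.substr(start, end - start);
    }
};
```

With 50,000+ instances, two size_t offsets per macro are a few hundred kilobytes in total, versus a private copy of the text per instance.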
Our servers are killing these processes on our larger instances due to prohibitive memory allocation. FYI our own code uses shared_ptr everywhere we can and we aren't seeing significant memory leaks.
What controls the life time of the macros? Who is the owner? If the macroLibrary manages them you don't need shared pointers. Just make the map: `std::unordered_map<std::string, macro> macroLibrary;` instead and pass around references.
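As a concrete sketch of that suggestion (the `macro` fields here are placeholders, since the thread doesn't show the real class): the map holds macros by value and hands out references, which remain valid across rehashing because `std::unordered_map` never relocates its elements on rehash.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>

// Placeholder for the real macro class (6 pointers, a few bools, etc.).
struct macro {
    std::string expansion;
};

// The library owns every macro outright - no shared_ptr control blocks.
std::unordered_map<std::string, macro> macroLibrary;

// Hand out references instead of copies; the reference stays valid for
// as long as the entry stays in the map.
macro& lookup(const std::string& name) {
    return macroLibrary.at(name);
}
```

This drops one control block (typically 24-32 bytes plus an extra allocation) per macro and removes the atomic refcount traffic.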
Each macro instance consists of the following ANTLR4 objects (namespaces removed):
ANTLRInputStream *_input;
CommonTokenStream *_cts;
mLexer *_mLexer;
mParser *_mParser;
mParserVisitor *_mParserVisitor;
mParser::CodeContext *_mParseTree;
QUESTION 1: Can we delete or remove any of these objects once we have parsed the source-code?
Obviously we need to keep enough to visit the code, but that is all.
First we'd need to know whether all these instances are created per macro or are just references to instances stored somewhere else. There's certainly no need to create a separate input stream, lexer + parser for each macro. After all, you have a file containing these macros, which is parsed by a single parser instance. You could remove the references and provide access via the outer class that holds the macroLibrary, but as I have shown above, that wouldn't make a really big difference.
QUESTION 2: Are there any other ways that we can reduce the memory impact of the ANTLR code? Stack Exchange suggests tweaks like clearing the DFA with ParserATNSimulator.clearDFA(), but my concern is that, since the DFA cache is static, clearing it may have a prohibitively negative performance impact. Someone also mentioned resetting the PredictionContextCache - but again, I don't know whether that is used during visits, nor what sort of impact clearing it would have.
These suggestions aren't really helpful. The caches are created to speed up parsing. If you clear them, you will get much higher parse times and the memory footprint doesn't really go down, because on the next parse run these caches are built again.
Have you instrumented your code to see what the memory is spent on? What is the biggest structure (and how big is it actually)?
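One cheap way to get such numbers, as a sketch (not thread-safe, and no substitute for a real heap profiler such as massif or heaptrack): replace the global allocation functions with versions that keep a running byte count, then sample that count around the code of interest.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>

// Running total of bytes requested from the global allocator.
// Not thread-safe - a sketch for single-threaded measurement only.
static std::size_t g_allocated = 0;

void* operator new(std::size_t size) {
    g_allocated += size;
    if (void* p = std::malloc(size)) return p;
    throw std::bad_alloc{};
}

void operator delete(void* p) noexcept { std::free(p); }
void operator delete(void* p, std::size_t) noexcept { std::free(p); }

// Report how many bytes a piece of code allocates (gross, not net:
// freed memory is not subtracted).
template <typename F>
std::size_t bytesAllocatedBy(F&& f) {
    std::size_t before = g_allocated;
    f();
    return g_allocated - before;
}
```

Wrapping the parse of one representative macro in `bytesAllocatedBy` would show directly whether the tokens, the tree, or something else dominates.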
The problem is that the map is massive, and the memory cost is prohibitive.
How much is "massive" here? 2000 macros, each consisting of the 6 pointers, makes it 96K RAM. Nothing I'd call massive.

Yeah, it's not the pointers that are taking up the space - but what the pointers are pointing to... We are seeing 10-20 GB of memory being used.
I don't know what you store in the macros, but don't duplicate the input. It's usually enough to store a macro name and start and end positions of the replacement text.

I'm not sure I understand. We keep the name as a key, and the class includes the 6 pointers and a few other things (a few bools and size_t members - nothing significant).
We do keep the expansion text, which we could probably lose - but it's maybe 200 MB of text altogether, which we can easily absorb.
First we'd need to know whether all these instances are created per macro or are just references to instances stored somewhere else. There's certainly no need to create a separate input stream, lexer + parser for each macro. After all, you have a file containing these macros, which is parsed by a single parser instance. You could remove the references and provide access via the outer class that holds the macroLibrary, but as I have shown above, that wouldn't make a really big difference.

Okay, so since yesterday (writing out questions is always a good way of thinking about things) we've taken both the visitor and the lexer out of the class - we instantiate the visitor when we revisit the macro, and the impact seems to be minimal. Same goes for the lexer, which we use just for the parse.
There's certainly no need to create a separate input stream, lexer + parser for each macro. After all, you have a file containing these macros, which is parsed by a single parser instance.

I don't know how you can say that. There is no file. There are hundreds of thousands - even millions - of strings, many of which are context dependent, all of which have to be treated as re-visitable macro definitions.

I guess what you are suggesting is that we could serialise everything out into a massive file, and store pointers for each macro boundary, using some form of distinguished file separator between each macro - but...
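For what it's worth, the "massive file" wouldn't need a separator at all if it were kept in memory: appending each incoming string to one shared buffer and recording its offsets gives the same effect. A rough sketch (the class name is hypothetical):

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <string_view>
#include <utility>
#include <vector>

// All macro source strings concatenated into one in-memory buffer; each
// entry keeps only its [start, end) offsets, so no separator is needed.
class SourcePool {
public:
    // Append one macro's text; returns its slot index.
    std::size_t add(std::string_view text) {
        spans_.emplace_back(buffer_.size(), buffer_.size() + text.size());
        buffer_.append(text);
        return spans_.size() - 1;
    }

    // The returned view is invalidated by the next add(), since the
    // buffer may reallocate as it grows.
    std::string_view get(std::size_t index) const {
        const auto& [start, end] = spans_[index];
        return std::string_view(buffer_).substr(start, end - start);
    }

private:
    std::string buffer_;
    std::vector<std::pair<std::size_t, std::size_t>> spans_;
};
```

At ~200 MB of text this is one large allocation plus 16 bytes per macro, instead of one heap string (and its bookkeeping) per instance.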
Also, when we visit the code, we need to get at fragments of the macro (the text parts that define it) - in other words, the text corresponding to the parse tree.

Ideally, we would like to store solely each macro's parse tree and use the visitor to access it - but it seems that the parse tree depends on its assigned parser, and likewise on the input stream.

It's the internal interconnectivity of ANTLR4 which is not so clear to me.
Even more useful would be the ability to serialise the parse tree, so that we can store (and reload) parse trees without having to re-parse them. Is that something that ANTLR can do?
One of the things about building the parse tree that I don't understand is why the InputStream and the CommonTokenStream must be kept, but the Lexer - not so much.