Comment #4 on issue 12339 by paolo...@gmail.com: "Regular expression too long" on simple RegExp
https://bugs.chromium.org/p/v8/issues/detail?id=12339#c4

I have encountered the same problem with a large internal web application and was able to reproduce it with a debug build of version 95.
What I see is that the failure happens in this function:
template <class CharT>
void RegExpParserImpl<CharT>::Advance() {
  if (has_next()) {
    if (GetCurrentStackPosition() < stack_limit_) {
      if (FLAG_correctness_fuzzer_suppressions) {
        FATAL("Aborting on stack overflow");
      }
      ReportError(RegExpError::kStackOverflow);
    } else if (zone()->excess_allocation()) {  // <====== fails
      if (FLAG_correctness_fuzzer_suppressions) {
        FATAL("Aborting on excess zone allocation");
      }
      ReportError(RegExpError::kTooLarge);
    } else {
      current_ = ReadNext<true>();
    }
  } else {
    current_ = kEndMarker;
    // Advance so that position() points to 1-after-the-last-character. This is
    // important so that Reset() to this position works correctly.
    next_pos_ = input_length() + 1;
    has_more_ = false;
  }
}
zone()->excess_allocation() returns true because segment_bytes_allocated_ is greater than 256 MB (== kExcessLimit), so the function reports that the regular expression is too large. The actual problem is that the Zone is full: at that point the error would be triggered by any regular expression, no matter how small.
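To illustrate why the error does not depend on the regular expression itself, here is a minimal sketch of the situation. MockZone, MockRegExpError and VerifySyntax are hypothetical stand-ins, not the real V8 classes, and the 256 MB limit and the 16-byte per-character cost are assumptions made for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical stand-in for v8::internal::Zone: it only tracks a byte counter.
class MockZone {
 public:
  // Assumed limit for this sketch (256 MB).
  static constexpr std::size_t kExcessLimit = 256u * 1024 * 1024;

  void Allocate(std::size_t bytes) { segment_bytes_allocated_ += bytes; }

  bool excess_allocation() const {
    return segment_bytes_allocated_ > kExcessLimit;
  }

 private:
  std::size_t segment_bytes_allocated_ = 0;
};

enum class MockRegExpError { kNone, kTooLarge };

// Mimics the check in Advance(): before consuming each character, look at
// the total allocated in the zone, not at what this pattern has allocated.
MockRegExpError VerifySyntax(MockZone* zone, const std::string& pattern) {
  assert(zone != nullptr);
  for (char c : pattern) {
    (void)c;  // The character itself is irrelevant to the check.
    if (zone->excess_allocation()) return MockRegExpError::kTooLarge;
    zone->Allocate(16);  // Arbitrary per-character cost for the sketch.
  }
  return MockRegExpError::kNone;
}
```

With a fresh MockZone, VerifySyntax(&zone, "a") returns kNone. If the same zone has already absorbed more than the limit from earlier, unrelated allocations (for example zone.Allocate(0x102c3430)), the same one-character pattern is rejected with kTooLarge.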
This is the status of the Zone:
this 0x000001ec15b91ba0 {allocation_size_=0x000000000fa866b8 segment_bytes_allocated_=0x00000000102c3430 ...} const v8::internal::Zone *
allocation_size_ 0x000000000fa866b8 unsigned __int64
segment_bytes_allocated_ 0x00000000102c3430 unsigned __int64
position_ 0x000001ec35c79f28 unsigned __int64
limit_ 0x000001ec35c7d110 unsigned __int64
...
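As a sanity check on the values in the dump, the arithmetic can be written out as compile-time assertions. The constant names are made up for this sketch, and the 256 MB limit is an assumption about kExcessLimit rather than a value copied from the V8 sources.

```cpp
#include <cstdint>

constexpr std::uint64_t kMiB = 1024ull * 1024ull;

// Assumed value of the excess limit (256 MB).
constexpr std::uint64_t kAssumedExcessLimit = 256 * kMiB;

// Values copied from the debugger dump above.
constexpr std::uint64_t kAllocationSize = 0x000000000fa866b8;
constexpr std::uint64_t kSegmentBytesAllocated = 0x00000000102c3430;
constexpr std::uint64_t kPosition = 0x000001ec35c79f28;
constexpr std::uint64_t kLimit = 0x000001ec35c7d110;

// segment_bytes_allocated_ (about 258.8 MiB) is just over the assumed limit,
// which is what makes the excess check fail.
static_assert(kSegmentBytesAllocated > kAssumedExcessLimit,
              "segment bytes exceed the assumed limit");

// allocation_size_ (about 250.5 MiB) is still below it.
static_assert(kAllocationSize < kAssumedExcessLimit,
              "allocation size is below the assumed limit");

// Bytes still free in the current segment (limit_ - position_).
constexpr std::uint64_t kBytesLeftInSegment = kLimit - kPosition;
```

Under that assumption, the zone crossed the threshold by less than 3 MiB, which fits the observation that a slightly smaller script, or slightly lower parser memory usage, would not hit the error.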
And this is the call stack when the check fails:
v8.dll!v8::internal::Zone::excess_allocation() Line 144 C++
> v8.dll!v8::internal::`anonymous namespace'::RegExpParserImpl<unsigned char>::Advance() Line 467 C++
v8.dll!v8::internal::`anonymous namespace'::RegExpParserImpl<unsigned char>::RegExpParserImpl(const unsigned char * input, int input_length, v8::base::Flags<v8::internal::RegExpFlag,int> flags, unsigned __int64 stack_limit, v8::internal::Zone * zone, const v8::internal::CombinationAssertScope<v8::internal::PerThreadAssertScopeDebugOnly<v8::internal::SAFEPOINTS_ASSERT,0>,v8::internal::PerThreadAssertScopeDebugOnly<v8::internal::HEAP_ALLOCATION_ASSERT,0>> & no_gc) Line 416 C++
v8.dll!v8::internal::RegExpParser::VerifyRegExpSyntax<unsigned char>(v8::internal::Zone * zone, unsigned __int64 stack_limit, const unsigned char * input, int input_length, v8::base::Flags<v8::internal::RegExpFlag,int> flags, v8::internal::RegExpCompileData * result, const v8::internal::CombinationAssertScope<v8::internal::PerThreadAssertScopeDebugOnly<v8::internal::SAFEPOINTS_ASSERT,0>,v8::internal::PerThreadAssertScopeDebugOnly<v8::internal::HEAP_ALLOCATION_ASSERT,0>> & no_gc) Line 2429 C++
v8.dll!v8::internal::RegExp::VerifySyntax<unsigned char>(v8::internal::Zone * zone, unsigned __int64 stack_limit, const unsigned char * input, int input_length, v8::base::Flags<v8::internal::RegExpFlag,int> flags, v8::internal::RegExpError * regexp_error_out, const v8::internal::CombinationAssertScope<v8::internal::PerThreadAssertScopeDebugOnly<v8::internal::SAFEPOINTS_ASSERT,0>,v8::internal::PerThreadAssertScopeDebugOnly<v8::internal::HEAP_ALLOCATION_ASSERT,0>> & no_gc) Line 117 C++
v8.dll!v8::internal::ParserBase<v8::internal::Parser>::ValidateRegExpLiteral(const v8::internal::AstRawString * pattern, v8::base::Flags<v8::internal::RegExpFlag,int> flags, v8::internal::RegExpError * regexp_error) Line 1804 C++
v8.dll!v8::internal::ParserBase<v8::internal::Parser>::ParseRegExpLiteral() Line 1832 C++
v8.dll!v8::internal::ParserBase<v8::internal::Parser>::ParsePrimaryExpression() Line 1951 C++
[...]
v8.dll!v8::internal::ParserBase<v8::internal::Parser>::ParseExpressionOrLabelledStatement(v8::internal::ZoneList<const v8::internal::AstRawString *> * labels, v8::internal::ZoneList<const v8::internal::AstRawString *> * own_labels, v8::internal::AllowLabelledFunctionStatement allow_function) Line 5441 C++
v8.dll!v8::internal::ParserBase<v8::internal::Parser>::ParseStatement(v8::internal::ZoneList<const v8::internal::AstRawString *> * labels, v8::internal::ZoneList<const v8::internal::AstRawString *> * own_labels, v8::internal::AllowLabelledFunctionStatement allow_function) Line 5284 C++
v8.dll!v8::internal::ParserBase<v8::internal::Parser>::ParseStatementListItem() Line 5179 C++
v8.dll!v8::internal::ParserBase<v8::internal::Parser>::ParseStatementList(v8::internal::ScopedList<v8::internal::Statement *,void *> * body, v8::internal::Token::Value end_token) Line 5128 C++
v8.dll!v8::internal::Parser::DoParseProgram(v8::internal::Isolate * isolate, v8::internal::ParseInfo * info) Line 633 C++
v8.dll!v8::internal::Parser::ParseOnBackground(v8::internal::ParseInfo * info, int start_position, int end_position, int function_literal_id) Line 3290 C++
v8.dll!v8::internal::BackgroundCompileTask::Run() Line 1612 C++
v8.dll!v8::ScriptCompiler::ScriptStreamingTask::Run() Line 2667 C++
[...]
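The Zone used by the RegExpParser is the same Zone used by the ParserBase (see ParserBase<Impl>::ValidateRegExpLiteral), so everything the Parser has already allocated while parsing the script counts toward the excess check.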
Some change in version 95 may have caused V8 to allocate more in the Parser zone than the previous version did, so that the app's script file is now large enough to trigger the error.
I don't know how to provide a repro for this issue, though, since I have only managed to reproduce it with an internal website.