Stack Overflow when merging by cherry pick

12 views
Skip to first unread message

Brad Larson

unread,
May 5, 2009, 5:12:48 PM5/5/09
to Repo and Gerrit Discussion
We've had a couple merges in Gerrit fail due to a
java.lang.StackOverflowError. I'm fairly sure the problem is in the
regex inside private void writeCherryPickCommit(final Merger m, final
CodeReviewCommit n). The stack trace isn't very helpful, just lots
of:

at java.util.regex.Pattern$CharProperty.match(Unknown Source)
at java.util.regex.Pattern$Branch.match(Unknown Source)
at java.util.regex.Pattern$GroupHead.match(Unknown Source)
at java.util.regex.Pattern$Loop.match(Unknown Source)
at java.util.regex.Pattern$GroupTail.match(Unknown Source)
at java.util.regex.Pattern$BranchConn.match(Unknown Source)

repeating a few dozen times.

I suspect this line:
if (!msgbuf.toString().matches("^(\n|.)*\n[A-Za-z0-9-]{1,}: [^\n]{1,}\n
$")) {

isn't happy if the commit message has a ':' outside of the signed-off-
by: line. I started looking into the regex to find a fix, but I'm
busy with other things and thought I'd see if anyone else could get to
it first.

Could that line be causing this stack overflow?

Thanks,
Brad

Shawn Pearce

unread,
May 6, 2009, 5:15:54 PM5/6/09
to repo-d...@googlegroups.com

Eeek.

Looks like the JRE's regex code got into trouble with this pattern.

Really that pattern is a cop-out.  I should add specialized footer processing to RevCommit in JGit because scanning things like Signed-off-by lines is really useful for any client that wants to parse a commit message.  I just was too lazy to code it at the time in JGit, so Gerrit wound up with this regex.  Doing it "right" in JGit would mean scanning a byte[] by hand, which means writing the pattern matcher such that it doesn't have this sort of infinite recursion.  :-)

Maybe there is another way to write the pattern that isn't so horribly recursive.  Perhaps changing [A-Za-z0-9-]{1,} to [A-Za-z0-9-][A-Za-z0-9-]* for example?

David Brown

unread,
May 6, 2009, 6:15:54 PM5/6/09
to repo-d...@googlegroups.com
I wouldn't think '*' in the pattern would be much different than the {1,}, although I'm curious why you didn't just use '+'.

But, a simple improvement would be to allocate a real Matcher, and use find, instead of the pattern at the beginning to try and eat lines.  The recursion there comes from having the same pattern (\n) in what you are searching for as what you are skipping.

David

Shawn Pearce

unread,
May 28, 2009, 11:04:31 PM5/28/09
to repo-d...@googlegroups.com
FWIW, this is now at least logged in JIRA http://jira.source.android.com/jira/browse/GERRIT-207
Reply all
Reply to author
Forward
0 new messages