Replicas on Gerrit v3.6.1 SshCommandDestroy Index out of Bounds issue

Rikard Almgren

Sep 19, 2022, 5:12:54 AM
to Repo and Gerrit Discussion
We've run into a problem with replicas under load on Gerrit v3.6.1: they throw IndexOutOfBoundsException and kill SSH connections. Heap usage eventually climbs until the process dies.

Has anyone else seen something similar, or does anyone have an idea of what's gone wrong?
Have we missed something in the upgrade, or is it a bad configuration?

[SshCommandDestroy-238] ERROR com.google.gerrit.pgm.Daemon : Thread SshCommandDestroy-238 threw exception
java.lang.IndexOutOfBoundsException: Index 2 out of bounds for length 2
    at java.base/jdk.internal.util.Preconditions.outOfBounds(Preconditions.java:64)
    at java.base/jdk.internal.util.Preconditions.outOfBoundsCheckIndex(Preconditions.java:70)
    at java.base/jdk.internal.util.Preconditions.checkIndex(Preconditions.java:248)
    at java.base/java.util.Objects.checkIndex(Objects.java:372)
    at java.base/java.util.ArrayList.get(ArrayList.java:459)
    at java.base/java.util.Collections$UnmodifiableList.get(Collections.java:1310)
    at com.google.gerrit.sshd.SshLogJsonLayout$SshJsonLogEntry.<init>(SshLogJsonLayout.java:90)
    at com.google.gerrit.sshd.SshLogJsonLayout.toJsonLogEntry(SshLogJsonLayout.java:40)
    at com.google.gerrit.util.logging.JsonLayout.format(JsonLayout.java:36)
    at org.apache.log4j.WriterAppender.subAppend(WriterAppender.java:303)
    at org.apache.log4j.DailyRollingFileAppender.subAppend(DailyRollingFileAppender.java:353)
    at org.apache.log4j.WriterAppender.append(WriterAppender.java:156)
    at org.apache.log4j.AppenderSkeleton.doAppend(AppenderSkeleton.java:232)
    at org.apache.log4j.helpers.AppenderAttachableImpl.appendLoopOnAppenders(AppenderAttachableImpl.java:65)
    at org.apache.log4j.AsyncAppender.append(AsyncAppender.java:143)
    at org.apache.log4j.AppenderSkeleton.doAppend(AppenderSkeleton.java:232)
    at org.apache.log4j.helpers.AppenderAttachableImpl.appendLoopOnAppenders(AppenderAttachableImpl.java:65)
    at org.apache.log4j.AsyncAppender.append(AsyncAppender.java:143)
    at com.google.gerrit.sshd.SshLog.onExecute(SshLog.java:217)
    at com.google.gerrit.sshd.CommandFactoryProvider$Trampoline.log(CommandFactoryProvider.java:236)
    at com.google.gerrit.sshd.CommandFactoryProvider$Trampoline$2.onExit(CommandFactoryProvider.java:196)
    at com.google.gerrit.sshd.commands.Upload.onExit(Upload.java:117)
    at com.google.gerrit.sshd.BaseCommand$TaskThunk.cancel(BaseCommand.java:467)
    at com.google.gerrit.server.git.WorkQueue$Task.cancel(WorkQueue.java:549)
    at com.google.gerrit.sshd.BaseCommand.destroy(BaseCommand.java:190)
    at com.google.gerrit.sshd.DispatchCommand.destroy(DispatchCommand.java:154)
    at com.google.gerrit.sshd.CommandFactoryProvider$Trampoline.onDestroy(CommandFactoryProvider.java:254)
    at com.google.gerrit.sshd.CommandFactoryProvider$Trampoline.lambda$destroy$0(CommandFactoryProvider.java:245)
    at com.google.gerrit.server.logging.LoggingContextAwareRunnable.run(LoggingContextAwareRunnable.java:113)
    at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
    at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
    at java.base/java.lang.Thread.run(Thread.java:829)

Happy to hear any ideas. There really isn't anything else in the logs, and v3.5.2 was running fine.

Rikard Almgren

Sep 19, 2022, 5:17:25 AM
to Repo and Gerrit Discussion
We have also noted a metrics increase from replicas.
One notable metric is cache evictions for change notes. It was absent on v3.5.2, either because it wasn't collected or because no evictions happened, but it's climbing fast on v3.6.1.

Sven Selberg

Sep 19, 2022, 11:47:26 AM
to Repo and Gerrit Discussion
On Monday, September 19, 2022 at 11:17:25 AM UTC+2 Rikard Almgren wrote:
We have also noted a metrics increase from replicas.
One notable metric is cache evictions for change notes, which was absent during v3.5.2, either because it wasn't collected, or because it didn't happen, but it's climbing fast v3.6.1

On Monday, 19 September 2022 at 11:12:54 UTC+2 Rikard Almgren wrote:
We've run into a problem with replicas under load on Gerrit v3.6.1 throwing index out of bounds exceptions

The IndexOutOfBoundsException was an unrelated bug, triggered because the issue caused BaseCommand#destroy to be called.
It was introduced here, https://gerrit-review.googlesource.com/c/gerrit/+/326881, and a fix is coming.
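For readers unfamiliar with this failure mode, here is a minimal Java sketch of what the stack trace shows: a positional get() on an unmodifiable list of length 2 at index 2. This is not Gerrit's code and not the actual fix. The class name, method names, and list contents are hypothetical, and the guarded variant is only one possible way to avoid the exception.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical illustration only; none of these names exist in Gerrit.
final class LogFieldAccessSketch {

    // Unguarded positional read: throws IndexOutOfBoundsException
    // when the list is shorter than the caller expects.
    static String fieldAt(List<String> fields, int index) {
        return fields.get(index);
    }

    // Guarded read: returns a fallback instead of throwing when the
    // index is outside the list.
    static String fieldAtOrDefault(List<String> fields, int index, String fallback) {
        return (index >= 0 && index < fields.size()) ? fields.get(index) : fallback;
    }

    public static void main(String[] args) {
        // Same shape as in the trace: an unmodifiable view over a 2-element ArrayList.
        List<String> fields =
            Collections.unmodifiableList(new ArrayList<>(List.of("first", "second")));

        try {
            fieldAt(fields, 2); // index 2 on a list of length 2
        } catch (IndexOutOfBoundsException e) {
            System.out.println("unguarded read failed: " + e.getMessage());
        }

        System.out.println("guarded read: " + fieldAtOrDefault(fields, 2, "-"));
    }
}
```

Because the exception in the trace is thrown while formatting a log entry on the destroy path, it only surfaces once something causes BaseCommand#destroy to be called, which matches the explanation above.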
 
and killing ssh connections. The heap eventually climbs to death.

This issue remains.
The same event that 3.5 handles gracefully makes 3.6 blow up:

On 3.5
3.5-behavior.png

On 3.6
 3.6-behavior.png

David Åkerman

Nov 17, 2022, 7:50:06 AM
to Repo and Gerrit Discussion
Hi again,

We upgraded to Gerrit v3.6.2 about 4 weeks ago and since then we have not encountered this problem again. So it seems like this problem was solved in v3.6.2.

Best regards,
David
 