Hi,
I manage 2 heavily loaded Gerrit servers and I have an issue that is killing me.
After I restart Gerrit, some SSH requests get locked one by one. About 3 hours after a restart I see 2 locked requests:
ssh -p 29418 gitserver gerrit show-queue -w
237c7393 git-upload-pack '/projectA.git' (user-A)
e32dfbec git-upload-pack '/project-B' (user-B)
Once these requests get locked they stay in the queue forever (until Gerrit is restarted). I set Gerrit SSH to 24 threads (for my 4-core server), and all of these threads are exhausted within 20-30 hours. Once all SSH threads are locked nobody can sync a repository, although the web UI still works fine. So I have to restart Gerrit daily.
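For anyone who wants to watch the threads run out, here is a minimal sketch that counts queue entries per command and compares the total with the thread limit. It assumes the queue lines look exactly like the two pasted above (id, command, quoted project, user in parentheses); lines in any other shape, such as headers, are simply skipped. `count_queue_entries`, `threads_left` and `SSH_THREADS` are names made up for this example, not anything from Gerrit.

```python
import re
from collections import Counter

# Assumed limit: the 24 SSH threads mentioned above.
SSH_THREADS = 24

# Matches lines shaped like: 237c7393 git-upload-pack '/projectA.git' (user-A)
QUEUE_LINE = re.compile(r"^([0-9a-f]{8})\s+(\S+)\s+'([^']*)'\s+\((.*)\)\s*$")


def count_queue_entries(text):
    """Return a Counter of queued commands, e.g. {'git-upload-pack': 2}."""
    counts = Counter()
    for line in text.splitlines():
        match = QUEUE_LINE.match(line.strip())
        if match:
            counts[match.group(2)] += 1
    return counts


def threads_left(text, limit=SSH_THREADS):
    """Return how many threads remain if every queued entry holds one thread."""
    return limit - sum(count_queue_entries(text).values())
```

The text to feed in would be whatever the `show-queue -w` command above prints, saved to a string or file.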
Does anybody have the same issue?
I checked the jstack output for the Gerrit process and I see 2 types of blocked threads. One of them is the culprit. Both types of deadlock are in Apache SSHD, so I suspect that the bug is somewhere there.
Here are example stack traces:
Thread 23275: (state = BLOCKED)
- java.lang.Object.wait(long) @bci=0 (Interpreted frame)
- java.lang.Object.wait() @bci=2, line=485 (Compiled frame)
- org.apache.sshd.common.channel.ChannelPipedInputStream.read(byte[], int, int) @bci=54, line=81 (Compiled frame)
- org.eclipse.jgit.util.IO.readFully(java.io.InputStream, byte[], int, int) @bci=8, line=175 (Compiled frame)
- org.eclipse.jgit.transport.PacketLineIn.readLength() @bci=10, line=140 (Interpreted frame)
- org.eclipse.jgit.transport.PacketLineIn.readString() @bci=1, line=107 (Compiled frame)
- org.eclipse.jgit.transport.UploadPack.recvWants() @bci=14, line=385 (Compiled frame)
- org.eclipse.jgit.transport.UploadPack.service() @bci=107, line=340 (Interpreted frame)
- org.eclipse.jgit.transport.UploadPack.upload(java.io.InputStream, java.io.OutputStream, java.io.OutputStream) @bci=159, line=313 (Interpreted frame)
- com.google.gerrit.sshd.commands.Upload.runImpl() @bci=109, line=50 (Interpreted frame)
- com.google.gerrit.sshd.AbstractGitCommand.service() @bci=75, line=104 (Interpreted frame)
- com.google.gerrit.sshd.AbstractGitCommand.access$000(com.google.gerrit.sshd.AbstractGitCommand) @bci=1, line=34 (Interpreted frame)
- com.google.gerrit.sshd.AbstractGitCommand$1.run() @bci=4, line=69 (Interpreted frame)
- com.google.gerrit.sshd.BaseCommand$TaskThunk.run() @bci=98, line=395 (Interpreted frame)
- java.util.concurrent.Executors$RunnableAdapter.call() @bci=4, line=441 (Interpreted frame)
Thread 23278: (state = BLOCKED)
- java.lang.Object.wait(long) @bci=0 (Interpreted frame)
- java.lang.Object.wait() @bci=2, line=485 (Compiled frame)
- org.apache.sshd.common.channel.ChannelPipedInputStream.read(byte[], int, int) @bci=54, line=81 (Compiled frame)
- org.eclipse.jgit.transport.IndexPack.fill(org.eclipse.jgit.transport.IndexPack$Source, int) @bci=171, line=933 (Compiled frame)
- org.eclipse.jgit.transport.IndexPack.readPackHeader() @bci=14, line=754 (Interpreted frame)
- org.eclipse.jgit.transport.IndexPack.index(org.eclipse.jgit.lib.ProgressMonitor) @bci=8, line=401 (Interpreted frame)
- org.eclipse.jgit.transport.ReceivePack.receivePack() @bci=88, line=787 (Interpreted frame)
- org.eclipse.jgit.transport.ReceivePack.service() @bci=81, line=630 (Interpreted frame)
- org.eclipse.jgit.transport.ReceivePack.receive(java.io.InputStream, java.io.OutputStream, java.io.OutputStream) @bci=206, line=577 (Interpreted frame)
- com.google.gerrit.sshd.commands.Receive.runImpl() @bci=158, line=89 (Interpreted frame)
- com.google.gerrit.sshd.AbstractGitCommand.service() @bci=75, line=104 (Interpreted frame)
- com.google.gerrit.sshd.AbstractGitCommand.access$000(com.google.gerrit.sshd.AbstractGitCommand) @bci=1, line=34 (Interpreted frame)
- com.google.gerrit.sshd.AbstractGitCommand$1.run() @bci=4, line=69 (Interpreted frame)
- com.google.gerrit.sshd.BaseCommand$TaskThunk.run() @bci=98, line=395 (Interpreted frame)
- java.util.concurrent.Executors$RunnableAdapter.call() @bci=4, line=441 (Interpreted frame)
Most of the blocked threads are of the first type (Upload command).
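In case it helps someone compare with their own server, here is a rough sketch for counting how many threads fall into each of the two types. It assumes the jstack output is saved as text in the same format as the traces above (`Thread N: (state = ...)` followed by ` - frame` lines). A thread is counted only if it is waiting in `ChannelPipedInputStream.read`, and it is labelled by whether `UploadPack` or `ReceivePack` appears in its frames. The function names are made up for this example.

```python
import re
from collections import Counter

THREAD_HEADER = re.compile(r"^Thread (\d+): \(state = (\w+)\)")


def split_threads(dump):
    """Map each thread id to the list of its frame lines."""
    threads = {}
    current = None
    for raw in dump.splitlines():
        line = raw.strip()
        header = THREAD_HEADER.match(line)
        if header:
            current = header.group(1)
            threads[current] = []
        elif current is not None and line.startswith("- "):
            threads[current].append(line[2:])
    return threads


def classify_stuck_threads(dump):
    """Count threads waiting on the SSHD piped input stream, by Git command."""
    counts = Counter()
    for frames in split_threads(dump).values():
        joined = "\n".join(frames)
        if "ChannelPipedInputStream.read" not in joined:
            continue  # not one of the two types shown above
        if "transport.UploadPack." in joined:
            counts["upload"] += 1
        elif "transport.ReceivePack." in joined:
            counts["receive"] += 1
        else:
            counts["other"] += 1
    return counts
```

Calling `classify_stuck_threads(open("jstack.txt").read())` (the file name is just a placeholder) returns a `Counter` with `upload`, `receive` and `other` keys.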