We have a Gerrit problem plaguing us that we cannot get to the bottom of. We're using Google's repo tool on the AOSP.
We reboot the Gerrit (3.1.3) server so it is all fresh and ready for service.
When we use repo to clone and sync AOSP, several of the threads may deadlock. Over time, as repositories are worked on, Gerrit runs out of worker threads and freezes completely.
Here is an example thread dump of the deadlocks:
"SSH git-upload-pack /aosp_caf/platform/external/autotest (jenkinsci)" prio=1 BLOCKED
org.apache.sshd.common.session.helpers.AbstractSession.writePacket(AbstractSession.java:805)
org.apache.sshd.common.channel.AbstractChannel.writePacket(AbstractChannel.java:781)
org.apache.sshd.common.channel.ChannelOutputStream.flush(ChannelOutputStream.java:227)
org.apache.sshd.common.channel.ChannelOutputStream.write(ChannelOutputStream.java:127)
org.eclipse.jgit.transport.UploadPack$ResponseBufferedOutputStream.write(UploadPack.java:2420)
org.eclipse.jgit.transport.SideBandOutputStream.writeBuffer(SideBandOutputStream.java:174)
org.eclipse.jgit.transport.SideBandOutputStream.write(SideBandOutputStream.java:153)
org.eclipse.jgit.internal.storage.pack.PackOutputStream.write(PackOutputStream.java:132)
org.eclipse.jgit.internal.storage.file.ByteArrayWindow.write(ByteArrayWindow.java:91)
org.eclipse.jgit.internal.storage.file.PackFile.copyAsIs2(PackFile.java:581)
org.eclipse.jgit.internal.storage.file.PackFile.copyAsIs(PackFile.java:433)
org.eclipse.jgit.internal.storage.file.WindowCursor.copyObjectAsIs(WindowCursor.java:221)
org.eclipse.jgit.internal.storage.pack.PackWriter.writeObjectImpl(PackWriter.java:1736)
org.eclipse.jgit.internal.storage.pack.PackWriter.writeObject(PackWriter.java:1713)
org.eclipse.jgit.internal.storage.pack.PackOutputStream.writeObject(PackOutputStream.java:171)
org.eclipse.jgit.internal.storage.file.WindowCursor.writeObjects(WindowCursor.java:229)
org.eclipse.jgit.internal.storage.pack.PackWriter.writeObjects(PackWriter.java:1701)
org.eclipse.jgit.internal.storage.pack.PackWriter.writeObjects(PackWriter.java:1689)
org.eclipse.jgit.internal.storage.pack.PackWriter.writePack(PackWriter.java:1248)
org.eclipse.jgit.transport.UploadPack.sendPack(UploadPack.java:2361)
org.eclipse.jgit.transport.UploadPack.sendPack(UploadPack.java:2197)
org.eclipse.jgit.transport.UploadPack.service(UploadPack.java:1111)
org.eclipse.jgit.transport.UploadPack.uploadWithExceptionPropagation(UploadPack.java:868)
org.eclipse.jgit.transport.UploadPack.upload(UploadPack.java:782)
com.google.gerrit.sshd.commands.Upload.runImpl(Upload.java:95)
com.google.gerrit.sshd.AbstractGitCommand.service(AbstractGitCommand.java:107)
com.google.gerrit.sshd.AbstractGitCommand.access$000(AbstractGitCommand.java:32)
com.google.gerrit.sshd.AbstractGitCommand$1.run(AbstractGitCommand.java:72)
com.google.gerrit.sshd.BaseCommand$TaskThunk.run(BaseCommand.java:469)
com.google.gerrit.server.logging.LoggingContextAwareRunnable.run(LoggingContextAwareRunnable.java:110)
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
java.util.concurrent.FutureTask.run(FutureTask.java:266)
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
com.google.gerrit.server.git.WorkQueue$Task.run(WorkQueue.java:610)
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
java.lang.Thread.run(Thread.java:748)
"sshd-SshDaemon[245b0bd7](port=22)-nio2-thread-1" daemon prio=5 BLOCKED
org.apache.sshd.common.channel.ChannelOutputStream.close(ChannelOutputStream.java:249)
org.apache.sshd.common.util.io.IoUtils.closeQuietly(IoUtils.java:192)
org.apache.sshd.common.util.io.IoUtils.closeQuietly(IoUtils.java:148)
org.apache.sshd.server.channel.ChannelSession.closeImmediately0(ChannelSession.java:221)
org.apache.sshd.server.channel.ChannelSession$$Lambda$614/58240131.run(Unknown Source)
org.apache.sshd.common.util.closeable.Builder$1.doClose(Builder.java:47)
org.apache.sshd.common.util.closeable.SimpleCloseable.close(SimpleCloseable.java:63)
org.apache.sshd.common.util.closeable.SequentialCloseable$1.operationComplete(SequentialCloseable.java:56)
org.apache.sshd.common.util.closeable.SequentialCloseable$1.operationComplete(SequentialCloseable.java:45)
org.apache.sshd.common.future.AbstractSshFuture.notifyListener(AbstractSshFuture.java:159)
org.apache.sshd.common.future.DefaultSshFuture.addListener(DefaultSshFuture.java:167)
org.apache.sshd.common.util.closeable.SequentialCloseable$1.operationComplete(SequentialCloseable.java:57)
org.apache.sshd.common.util.closeable.SequentialCloseable$1.operationComplete(SequentialCloseable.java:45)
org.apache.sshd.common.future.AbstractSshFuture.notifyListener(AbstractSshFuture.java:159)
org.apache.sshd.common.future.DefaultSshFuture.addListener(DefaultSshFuture.java:167)
org.apache.sshd.common.util.closeable.SequentialCloseable$1.operationComplete(SequentialCloseable.java:57)
org.apache.sshd.common.util.closeable.SequentialCloseable$1.operationComplete(SequentialCloseable.java:45)
org.apache.sshd.common.future.AbstractSshFuture.notifyListener(AbstractSshFuture.java:159)
org.apache.sshd.common.future.DefaultSshFuture.addListener(DefaultSshFuture.java:167)
org.apache.sshd.common.util.closeable.SequentialCloseable$1.operationComplete(SequentialCloseable.java:57)
org.apache.sshd.common.util.closeable.SequentialCloseable$1.operationComplete(SequentialCloseable.java:45)
org.apache.sshd.common.util.closeable.SequentialCloseable.doClose(SequentialCloseable.java:69)
org.apache.sshd.common.util.closeable.SimpleCloseable.close(SimpleCloseable.java:63)
org.apache.sshd.common.util.closeable.AbstractInnerCloseable.doCloseImmediately(AbstractInnerCloseable.java:48)
org.apache.sshd.common.util.closeable.AbstractCloseable.close(AbstractCloseable.java:87)
org.apache.sshd.common.util.closeable.ParallelCloseable.doClose(ParallelCloseable.java:65)
org.apache.sshd.common.util.closeable.SimpleCloseable.close(SimpleCloseable.java:63)
org.apache.sshd.common.util.closeable.AbstractInnerCloseable.doCloseImmediately(AbstractInnerCloseable.java:48)
org.apache.sshd.common.util.closeable.AbstractCloseable.close(AbstractCloseable.java:87)
org.apache.sshd.common.util.closeable.ParallelCloseable.doClose(ParallelCloseable.java:65)
org.apache.sshd.common.util.closeable.SimpleCloseable.close(SimpleCloseable.java:63)
org.apache.sshd.common.util.closeable.SequentialCloseable$1.operationComplete(SequentialCloseable.java:56)
org.apache.sshd.common.util.closeable.SequentialCloseable$1.operationComplete(SequentialCloseable.java:45)
org.apache.sshd.common.util.closeable.SequentialCloseable.doClose(SequentialCloseable.java:69)
org.apache.sshd.common.util.closeable.SimpleCloseable.close(SimpleCloseable.java:63)
org.apache.sshd.common.util.closeable.AbstractInnerCloseable.doCloseImmediately(AbstractInnerCloseable.java:48)
org.apache.sshd.common.util.closeable.AbstractCloseable.close(AbstractCloseable.java:87)
org.apache.sshd.common.session.helpers.SessionHelper.exceptionCaught(SessionHelper.java:1076)
org.apache.sshd.common.session.helpers.AbstractSessionIoHandler.exceptionCaught(AbstractSessionIoHandler.java:53)
org.apache.sshd.common.io.nio2.Nio2Session.exceptionCaught(Nio2Session.java:194)
org.apache.sshd.common.io.nio2.Nio2Session.handleWriteCycleFailure(Nio2Session.java:493)
org.apache.sshd.common.io.nio2.Nio2Session$2.onFailed(Nio2Session.java:448)
org.apache.sshd.common.io.nio2.Nio2CompletionHandler.lambda$failed$1(Nio2CompletionHandler.java:46)
org.apache.sshd.common.io.nio2.Nio2CompletionHandler$$Lambda$636/562823097.run(Unknown Source)
java.security.AccessController.doPrivileged(Native Method)
org.apache.sshd.common.io.nio2.Nio2CompletionHandler.failed(Nio2CompletionHandler.java:45)
sun.nio.ch.Invoker.invokeUnchecked(Invoker.java:128)
sun.nio.ch.Invoker.invokeDirect(Invoker.java:157)
sun.nio.ch.UnixAsynchronousSocketChannelImpl.implWrite(UnixAsynchronousSocketChannelImpl.java:739)
sun.nio.ch.AsynchronousSocketChannelImpl.write(AsynchronousSocketChannelImpl.java:382)
sun.nio.ch.AsynchronousSocketChannelImpl.write(AsynchronousSocketChannelImpl.java:399)
org.apache.sshd.common.io.nio2.Nio2Session.doWriteCycle(Nio2Session.java:434)
org.apache.sshd.common.io.nio2.Nio2Session.startWriting(Nio2Session.java:418)
org.apache.sshd.common.io.nio2.Nio2Session.writePacket(Nio2Session.java:177)
org.apache.sshd.common.session.helpers.AbstractSession.doWritePacket(AbstractSession.java:876)
org.apache.sshd.common.session.helpers.AbstractSession.sendPendingPackets(AbstractSession.java:719)
org.apache.sshd.common.session.helpers.AbstractSession.handleNewKeys(AbstractSession.java:681)
org.apache.sshd.common.session.helpers.AbstractSession.doHandleMessage(AbstractSession.java:413)
org.apache.sshd.common.session.helpers.AbstractSession.handleMessage(AbstractSession.java:362)
org.apache.sshd.common.session.helpers.AbstractSession.decode(AbstractSession.java:1207)
org.apache.sshd.common.session.helpers.AbstractSession.messageReceived(AbstractSession.java:323)
org.apache.sshd.common.session.helpers.AbstractSessionIoHandler.messageReceived(AbstractSessionIoHandler.java:63)
org.apache.sshd.common.io.nio2.Nio2Session.handleReadCycleCompletion(Nio2Session.java:368)
org.apache.sshd.common.io.nio2.Nio2Session$1.onCompleted(Nio2Session.java:346)
org.apache.sshd.common.io.nio2.Nio2Session$1.onCompleted(Nio2Session.java:343)
org.apache.sshd.common.io.nio2.Nio2CompletionHandler.lambda$completed$0(Nio2CompletionHandler.java:38)
org.apache.sshd.common.io.nio2.Nio2CompletionHandler$$Lambda$510/350494316.run(Unknown Source)
java.security.AccessController.doPrivileged(Native Method)
org.apache.sshd.common.io.nio2.Nio2CompletionHandler.completed(Nio2CompletionHandler.java:37)
sun.nio.ch.Invoker.invokeUnchecked(Invoker.java:126)
sun.nio.ch.Invoker$2.run(Invoker.java:218)
sun.nio.ch.AsynchronousChannelGroupImpl$1.run(AsynchronousChannelGroupImpl.java:112)
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
java.lang.Thread.run(Thread.java:748)
Gerrit error log example:
[2020-02-21 00:47:35,553] [SSH git-upload-pack /aosp_caf/platform/prebuilts/sdk (jenkinsci)] ERROR com.google.gerrit.sshd.BaseCommand : Internal server error (user jenkinsci account 1000016) during git-upload-pack '/aosp_caf/platform/prebuilts/sdk'
java.net.SocketTimeoutException: waitForCondition(Window[server/remote](ChannelSession[id=74, recipient=2]-ServerSessionImpl[jenkinsci@/10.123.209.8:37966])) timeout exceeded: 3600000
    at org.apache.sshd.common.channel.Window.waitForCondition(Window.java:305)
    at org.apache.sshd.common.channel.Window.waitForSpace(Window.java:252)
    at org.apache.sshd.common.channel.ChannelOutputStream.flush(ChannelOutputStream.java:188)
    at org.apache.sshd.common.channel.ChannelOutputStream.write(ChannelOutputStream.java:127)
    at org.eclipse.jgit.transport.UploadPack$ResponseBufferedOutputStream.write(UploadPack.java:2420)
    at org.eclipse.jgit.transport.SideBandOutputStream.writeBuffer(SideBandOutputStream.java:174)
    at org.eclipse.jgit.transport.SideBandOutputStream.write(SideBandOutputStream.java:153)
    at org.eclipse.jgit.internal.storage.pack.PackOutputStream.write(PackOutputStream.java:132)
    at org.eclipse.jgit.internal.storage.file.PackFile.copyAsIs2(PackFile.java:614)
    at org.eclipse.jgit.internal.storage.file.PackFile.copyAsIs(PackFile.java:433)
    at org.eclipse.jgit.internal.storage.file.WindowCursor.copyObjectAsIs(WindowCursor.java:221)
    at org.eclipse.jgit.internal.storage.pack.PackWriter.writeObjectImpl(PackWriter.java:1736)
    at org.eclipse.jgit.internal.storage.pack.PackWriter.writeObject(PackWriter.java:1713)
    at org.eclipse.jgit.internal.storage.pack.PackOutputStream.writeObject(PackOutputStream.java:171)
    at org.eclipse.jgit.internal.storage.file.WindowCursor.writeObjects(WindowCursor.java:229)
    at org.eclipse.jgit.internal.storage.pack.PackWriter.writeObjects(PackWriter.java:1701)
    at org.eclipse.jgit.internal.storage.pack.PackWriter.writeObjects(PackWriter.java:1689)
    at org.eclipse.jgit.internal.storage.pack.PackWriter.writePack(PackWriter.java:1248)
    at org.eclipse.jgit.transport.UploadPack.sendPack(UploadPack.java:2361)
    at org.eclipse.jgit.transport.UploadPack.sendPack(UploadPack.java:2197)
    at org.eclipse.jgit.transport.UploadPack.service(UploadPack.java:1111)
    at org.eclipse.jgit.transport.UploadPack.uploadWithExceptionPropagation(UploadPack.java:868)
    at org.eclipse.jgit.transport.UploadPack.upload(UploadPack.java:782)
    at com.google.gerrit.sshd.commands.Upload.runImpl(Upload.java:95)
    at com.google.gerrit.sshd.AbstractGitCommand.service(AbstractGitCommand.java:107)
    at com.google.gerrit.sshd.AbstractGitCommand.access$000(AbstractGitCommand.java:32)
    at com.google.gerrit.sshd.AbstractGitCommand$1.run(AbstractGitCommand.java:72)
    at com.google.gerrit.sshd.BaseCommand$TaskThunk.run(BaseCommand.java:469)
    at com.google.gerrit.server.logging.LoggingContextAwareRunnable.run(LoggingContextAwareRunnable.java:110)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
    at com.google.gerrit.server.git.WorkQueue$Task.run(WorkQueue.java:610)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
    Suppressed: java.net.SocketTimeoutException: waitForCondition(Window[server/remote](ChannelSession[id=74, recipient=2]-ServerSessionImpl[jenkinsci@/10.123.209.8:37966])) timeout exceeded: 3600000
        at org.apache.sshd.common.channel.Window.waitForCondition(Window.java:305)
        at org.apache.sshd.common.channel.Window.waitForSpace(Window.java:252)
        at org.apache.sshd.common.channel.ChannelOutputStream.flush(ChannelOutputStream.java:188)
        at org.apache.sshd.common.channel.ChannelOutputStream.write(ChannelOutputStream.java:127)
        at org.eclipse.jgit.transport.UploadPack$ResponseBufferedOutputStream.write(UploadPack.java:2420)
        at org.eclipse.jgit.transport.SideBandOutputStream.writeBuffer(SideBandOutputStream.java:174)
        at org.eclipse.jgit.transport.SideBandOutputStream.flushBuffer(SideBandOutputStream.java:127)
        at org.eclipse.jgit.transport.SideBandOutputStream.flush(SideBandOutputStream.java:133)
        at org.eclipse.jgit.transport.UploadPack$SideBandErrorWriter.writeError(UploadPack.java:2453)
        at org.eclipse.jgit.transport.UploadPack.upload(UploadPack.java:800)
        ... 14 more
The Gerrit Config
[gerrit]
basePath = git
canonicalWebUrl = https://gerrit.example.co.uk/
serverId = 17a63224-b838-4914-8dbb-8a3f3db2c594
[container]
user = gerrit
javaHome = /usr/lib/jvm/java-8-openjdk-amd64/jre
javaOptions = "-Dflogger.backend_factory=com.google.common.flogger.backend.log4j.Log4jBackendFactory#getInstance"
javaOptions = "-Dflogger.logging_context=com.google.gerrit.server.logging.LoggingContext#getInstance"
javaOptions = "-Dhttp.proxyHost=proxy-squid.example.co.uk -Dhttp.proxyPort=881 -Dhttp.nonProxyHosts=localhost|127.0.0.1|*.example.co.uk -Dhttps.proxyHost=proxy-squid.example.co.uk -Dhttps.proxyPort=881 -Dhttps.nonProxyHosts=localhost^|127.0.0.1|*.example.co.uk -Djdk.http.auth.tunneling.disabledSchemes=''"
[index]
type = LUCENE
[auth]
type = LDAP
gitBasicAuthPolicy = LDAP
userNameToLowerCase = true
[ldap]
server = ldap://ad-ldap.example.co.uk
username = xxxxxx
accountBase = OU=Example Users,DC=EXAMPLE,DC=co,DC=uk
groupBase = OU=Example Groups,DC=EXAMPLE,DC=co,DC=uk
groupPattern = (&(objectClass=group)(cn=${groupname}))
[receive]
enableSignedPush = false
[sshd]
listenAddress = *:29418
advertisedAddress = gerrit:29418
batchThreads = 1
maxConnectionsPerUser = 128
idleTimeout = 4h
waitTimeout = 60m
maxAuthTries = 12
threads = 24 # raised to give us a bit more resilience against dead threads
[core]
packedGitLimit = 4g
packedgitwindowsize = 32m
packedGitOpenFiles = 800
deltaBaseCacheLimit = 30m
[httpd]
listenUrl = proxy-https://localhost:8080/
[cache]
directory = cache
[sendemail]
smtpServer = mailhost.example.co.uk
[container]
heapLimit = 25g
[plugins]
allowRemoteAdmin = true
checkFrequency = 0
[gc]
startTime = Sat 10:00
interval = 7 days
[user]
email = ger...@example.com
[gitweb]
type = gitweb
Machine spec: 32 GB RAM, Intel(R) Xeon(R) CPU E3-1220 v5 @ 3.00GHz (4 cores).
Any ideas? Maybe a way to get more logging?
We moved to some new hardware and had the same problems. To cut a long story short, after much trial and error we fixed the issue by adding
sshd.rekeyBytesLimit = 0
sshd.rekeyTimeLimit = 0
to the Gerrit config.
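For anyone wanting to try the same workaround, here is a rough sketch of where the two settings would go, assuming they are added to the existing [sshd] section of the config posted above (the dotted names map to section.key). The blocked nio2 thread in the dump sits underneath AbstractSession.handleNewKeys, which is at least consistent with SSH rekeying being involved, but we have not confirmed the root cause.

```
[sshd]
# values we used to stop the hangs in our setup (Gerrit 3.1.3)
rekeyBytesLimit = 0
rekeyTimeLimit = 0
```

The other [sshd] keys (listenAddress, threads, and so on) stay as they were.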