SSH Deadlocked threads, AOSP repo cloning


pete neal

Feb 21, 2020, 6:21:25 AM
to Repo and Gerrit Discussion

We have a Gerrit problem plaguing us that we cannot get to the bottom of. We're using Google's repo tool on the AOSP.


We reboot the Gerrit (3.1.3) server so everything is fresh and ready for service.

When we use repo to clone and sync AOSP, several of the threads may deadlock. Over time, as the repositories are worked on, Gerrit runs out of worker threads and freezes completely.

Here is an example thread dump of the deadlocks:

"SSH git-upload-pack /aosp_caf/platform/external/autotest (jenkinsci)" prio=1 BLOCKED

            org.apache.sshd.common.session.helpers.AbstractSession.writePacket(AbstractSession.java:805)

            org.apache.sshd.common.channel.AbstractChannel.writePacket(AbstractChannel.java:781)

            org.apache.sshd.common.channel.ChannelOutputStream.flush(ChannelOutputStream.java:227)

            org.apache.sshd.common.channel.ChannelOutputStream.write(ChannelOutputStream.java:127)

            org.eclipse.jgit.transport.UploadPack$ResponseBufferedOutputStream.write(UploadPack.java:2420)

            org.eclipse.jgit.transport.SideBandOutputStream.writeBuffer(SideBandOutputStream.java:174)

            org.eclipse.jgit.transport.SideBandOutputStream.write(SideBandOutputStream.java:153)

            org.eclipse.jgit.internal.storage.pack.PackOutputStream.write(PackOutputStream.java:132)

            org.eclipse.jgit.internal.storage.file.ByteArrayWindow.write(ByteArrayWindow.java:91)

            org.eclipse.jgit.internal.storage.file.PackFile.copyAsIs2(PackFile.java:581)

            org.eclipse.jgit.internal.storage.file.PackFile.copyAsIs(PackFile.java:433)

            org.eclipse.jgit.internal.storage.file.WindowCursor.copyObjectAsIs(WindowCursor.java:221)

            org.eclipse.jgit.internal.storage.pack.PackWriter.writeObjectImpl(PackWriter.java:1736)

            org.eclipse.jgit.internal.storage.pack.PackWriter.writeObject(PackWriter.java:1713)

            org.eclipse.jgit.internal.storage.pack.PackOutputStream.writeObject(PackOutputStream.java:171)

            org.eclipse.jgit.internal.storage.file.WindowCursor.writeObjects(WindowCursor.java:229)

            org.eclipse.jgit.internal.storage.pack.PackWriter.writeObjects(PackWriter.java:1701)

            org.eclipse.jgit.internal.storage.pack.PackWriter.writeObjects(PackWriter.java:1689)

            org.eclipse.jgit.internal.storage.pack.PackWriter.writePack(PackWriter.java:1248)

            org.eclipse.jgit.transport.UploadPack.sendPack(UploadPack.java:2361)

            org.eclipse.jgit.transport.UploadPack.sendPack(UploadPack.java:2197)

            org.eclipse.jgit.transport.UploadPack.service(UploadPack.java:1111)

            org.eclipse.jgit.transport.UploadPack.uploadWithExceptionPropagation(UploadPack.java:868)

            org.eclipse.jgit.transport.UploadPack.upload(UploadPack.java:782)

            com.google.gerrit.sshd.commands.Upload.runImpl(Upload.java:95)

            com.google.gerrit.sshd.AbstractGitCommand.service(AbstractGitCommand.java:107)

            com.google.gerrit.sshd.AbstractGitCommand.access$000(AbstractGitCommand.java:32)

            com.google.gerrit.sshd.AbstractGitCommand$1.run(AbstractGitCommand.java:72)

            com.google.gerrit.sshd.BaseCommand$TaskThunk.run(BaseCommand.java:469)

            com.google.gerrit.server.logging.LoggingContextAwareRunnable.run(LoggingContextAwareRunnable.java:110)

            java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)

            java.util.concurrent.FutureTask.run(FutureTask.java:266)

           java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)

            java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)

            com.google.gerrit.server.git.WorkQueue$Task.run(WorkQueue.java:610)

            java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)

            java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)

            java.lang.Thread.run(Thread.java:748)

 

"sshd-SshDaemon[245b0bd7](port=22)-nio2-thread-1" daemon prio=5 BLOCKED
            org.apache.sshd.common.channel.ChannelOutputStream.close(ChannelOutputStream.java:249)
            org.apache.sshd.common.util.io.IoUtils.closeQuietly(IoUtils.java:192)
            org.apache.sshd.common.util.io.IoUtils.closeQuietly(IoUtils.java:148)
            org.apache.sshd.server.channel.ChannelSession.closeImmediately0(ChannelSession.java:221)
            org.apache.sshd.server.channel.ChannelSession$$Lambda$614/58240131.run(Unknown Source)
            org.apache.sshd.common.util.closeable.Builder$1.doClose(Builder.java:47)
            org.apache.sshd.common.util.closeable.SimpleCloseable.close(SimpleCloseable.java:63)
            org.apache.sshd.common.util.closeable.SequentialCloseable$1.operationComplete(SequentialCloseable.java:56)
            org.apache.sshd.common.util.closeable.SequentialCloseable$1.operationComplete(SequentialCloseable.java:45)
            org.apache.sshd.common.future.AbstractSshFuture.notifyListener(AbstractSshFuture.java:159)
            org.apache.sshd.common.future.DefaultSshFuture.addListener(DefaultSshFuture.java:167)
            org.apache.sshd.common.util.closeable.SequentialCloseable$1.operationComplete(SequentialCloseable.java:57)
            org.apache.sshd.common.util.closeable.SequentialCloseable$1.operationComplete(SequentialCloseable.java:45)
            org.apache.sshd.common.future.AbstractSshFuture.notifyListener(AbstractSshFuture.java:159)
            org.apache.sshd.common.future.DefaultSshFuture.addListener(DefaultSshFuture.java:167)
            org.apache.sshd.common.util.closeable.SequentialCloseable$1.operationComplete(SequentialCloseable.java:57)
            org.apache.sshd.common.util.closeable.SequentialCloseable$1.operationComplete(SequentialCloseable.java:45)
            org.apache.sshd.common.future.AbstractSshFuture.notifyListener(AbstractSshFuture.java:159)
            org.apache.sshd.common.future.DefaultSshFuture.addListener(DefaultSshFuture.java:167)
            org.apache.sshd.common.util.closeable.SequentialCloseable$1.operationComplete(SequentialCloseable.java:57)
            org.apache.sshd.common.util.closeable.SequentialCloseable$1.operationComplete(SequentialCloseable.java:45)
            org.apache.sshd.common.util.closeable.SequentialCloseable.doClose(SequentialCloseable.java:69)
            org.apache.sshd.common.util.closeable.SimpleCloseable.close(SimpleCloseable.java:63)
            org.apache.sshd.common.util.closeable.AbstractInnerCloseable.doCloseImmediately(AbstractInnerCloseable.java:48)
            org.apache.sshd.common.util.closeable.AbstractCloseable.close(AbstractCloseable.java:87)
            org.apache.sshd.common.util.closeable.ParallelCloseable.doClose(ParallelCloseable.java:65)
            org.apache.sshd.common.util.closeable.SimpleCloseable.close(SimpleCloseable.java:63)
            org.apache.sshd.common.util.closeable.AbstractInnerCloseable.doCloseImmediately(AbstractInnerCloseable.java:48)
            org.apache.sshd.common.util.closeable.AbstractCloseable.close(AbstractCloseable.java:87)
            org.apache.sshd.common.util.closeable.ParallelCloseable.doClose(ParallelCloseable.java:65)
            org.apache.sshd.common.util.closeable.SimpleCloseable.close(SimpleCloseable.java:63)
            org.apache.sshd.common.util.closeable.SequentialCloseable$1.operationComplete(SequentialCloseable.java:56)
            org.apache.sshd.common.util.closeable.SequentialCloseable$1.operationComplete(SequentialCloseable.java:45)
            org.apache.sshd.common.util.closeable.SequentialCloseable.doClose(SequentialCloseable.java:69)
            org.apache.sshd.common.util.closeable.SimpleCloseable.close(SimpleCloseable.java:63)
            org.apache.sshd.common.util.closeable.AbstractInnerCloseable.doCloseImmediately(AbstractInnerCloseable.java:48)
            org.apache.sshd.common.util.closeable.AbstractCloseable.close(AbstractCloseable.java:87)
            org.apache.sshd.common.session.helpers.SessionHelper.exceptionCaught(SessionHelper.java:1076)
            org.apache.sshd.common.session.helpers.AbstractSessionIoHandler.exceptionCaught(AbstractSessionIoHandler.java:53)
            org.apache.sshd.common.io.nio2.Nio2Session.exceptionCaught(Nio2Session.java:194)
            org.apache.sshd.common.io.nio2.Nio2Session.handleWriteCycleFailure(Nio2Session.java:493)
            org.apache.sshd.common.io.nio2.Nio2Session$2.onFailed(Nio2Session.java:448)
            org.apache.sshd.common.io.nio2.Nio2CompletionHandler.lambda$failed$1(Nio2CompletionHandler.java:46)
            org.apache.sshd.common.io.nio2.Nio2CompletionHandler$$Lambda$636/562823097.run(Unknown Source)
            java.security.AccessController.doPrivileged(Native Method)
            org.apache.sshd.common.io.nio2.Nio2CompletionHandler.failed(Nio2CompletionHandler.java:45)
            sun.nio.ch.Invoker.invokeUnchecked(Invoker.java:128)
            sun.nio.ch.Invoker.invokeDirect(Invoker.java:157)
            sun.nio.ch.UnixAsynchronousSocketChannelImpl.implWrite(UnixAsynchronousSocketChannelImpl.java:739)
            sun.nio.ch.AsynchronousSocketChannelImpl.write(AsynchronousSocketChannelImpl.java:382)
            sun.nio.ch.AsynchronousSocketChannelImpl.write(AsynchronousSocketChannelImpl.java:399)
            org.apache.sshd.common.io.nio2.Nio2Session.doWriteCycle(Nio2Session.java:434)
            org.apache.sshd.common.io.nio2.Nio2Session.startWriting(Nio2Session.java:418)
            org.apache.sshd.common.io.nio2.Nio2Session.writePacket(Nio2Session.java:177)
            org.apache.sshd.common.session.helpers.AbstractSession.doWritePacket(AbstractSession.java:876)
            org.apache.sshd.common.session.helpers.AbstractSession.sendPendingPackets(AbstractSession.java:719)
            org.apache.sshd.common.session.helpers.AbstractSession.handleNewKeys(AbstractSession.java:681)
            org.apache.sshd.common.session.helpers.AbstractSession.doHandleMessage(AbstractSession.java:413)
            org.apache.sshd.common.session.helpers.AbstractSession.handleMessage(AbstractSession.java:362)
            org.apache.sshd.common.session.helpers.AbstractSession.decode(AbstractSession.java:1207)
            org.apache.sshd.common.session.helpers.AbstractSession.messageReceived(AbstractSession.java:323)
            org.apache.sshd.common.session.helpers.AbstractSessionIoHandler.messageReceived(AbstractSessionIoHandler.java:63)
            org.apache.sshd.common.io.nio2.Nio2Session.handleReadCycleCompletion(Nio2Session.java:368)
            org.apache.sshd.common.io.nio2.Nio2Session$1.onCompleted(Nio2Session.java:346)
            org.apache.sshd.common.io.nio2.Nio2Session$1.onCompleted(Nio2Session.java:343)
            org.apache.sshd.common.io.nio2.Nio2CompletionHandler.lambda$completed$0(Nio2CompletionHandler.java:38)
            org.apache.sshd.common.io.nio2.Nio2CompletionHandler$$Lambda$510/350494316.run(Unknown Source)
            java.security.AccessController.doPrivileged(Native Method)
            org.apache.sshd.common.io.nio2.Nio2CompletionHandler.completed(Nio2CompletionHandler.java:37)
            sun.nio.ch.Invoker.invokeUnchecked(Invoker.java:126)
            sun.nio.ch.Invoker$2.run(Invoker.java:218)
            sun.nio.ch.AsynchronousChannelGroupImpl$1.run(AsynchronousChannelGroupImpl.java:112)
            java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
            java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)

            java.lang.Thread.run(Thread.java:748)


Gerrit error log example:


[2020-02-21 00:47:35,553] [SSH git-upload-pack /aosp_caf/platform/prebuilts/sdk (jenkinsci)] ERROR com.google.gerrit.sshd.BaseCommand : Internal server error (user jenkinsci account 1000016) during git-upload-pack '/aosp_caf/platform/prebuilts/sdk'
java.net.SocketTimeoutException: waitForCondition(Window[server/remote](ChannelSession[id=74, recipient=2]-ServerSessionImpl[jenkinsci@/10.123.209.8:37966])) timeout exceeded: 3600000
    at org.apache.sshd.common.channel.Window.waitForCondition(Window.java:305)
    at org.apache.sshd.common.channel.Window.waitForSpace(Window.java:252)
    at org.apache.sshd.common.channel.ChannelOutputStream.flush(ChannelOutputStream.java:188)
    at org.apache.sshd.common.channel.ChannelOutputStream.write(ChannelOutputStream.java:127)
    at org.eclipse.jgit.transport.UploadPack$ResponseBufferedOutputStream.write(UploadPack.java:2420)
    at org.eclipse.jgit.transport.SideBandOutputStream.writeBuffer(SideBandOutputStream.java:174)
    at org.eclipse.jgit.transport.SideBandOutputStream.write(SideBandOutputStream.java:153)
    at org.eclipse.jgit.internal.storage.pack.PackOutputStream.write(PackOutputStream.java:132)
    at org.eclipse.jgit.internal.storage.file.PackFile.copyAsIs2(PackFile.java:614)
    at org.eclipse.jgit.internal.storage.file.PackFile.copyAsIs(PackFile.java:433)
    at org.eclipse.jgit.internal.storage.file.WindowCursor.copyObjectAsIs(WindowCursor.java:221)
    at org.eclipse.jgit.internal.storage.pack.PackWriter.writeObjectImpl(PackWriter.java:1736)
    at org.eclipse.jgit.internal.storage.pack.PackWriter.writeObject(PackWriter.java:1713)
    at org.eclipse.jgit.internal.storage.pack.PackOutputStream.writeObject(PackOutputStream.java:171)
    at org.eclipse.jgit.internal.storage.file.WindowCursor.writeObjects(WindowCursor.java:229)
    at org.eclipse.jgit.internal.storage.pack.PackWriter.writeObjects(PackWriter.java:1701)
    at org.eclipse.jgit.internal.storage.pack.PackWriter.writeObjects(PackWriter.java:1689)
    at org.eclipse.jgit.internal.storage.pack.PackWriter.writePack(PackWriter.java:1248)
    at org.eclipse.jgit.transport.UploadPack.sendPack(UploadPack.java:2361)
    at org.eclipse.jgit.transport.UploadPack.sendPack(UploadPack.java:2197)
    at org.eclipse.jgit.transport.UploadPack.service(UploadPack.java:1111)
    at org.eclipse.jgit.transport.UploadPack.uploadWithExceptionPropagation(UploadPack.java:868)
    at org.eclipse.jgit.transport.UploadPack.upload(UploadPack.java:782)
    at com.google.gerrit.sshd.commands.Upload.runImpl(Upload.java:95)
    at com.google.gerrit.sshd.AbstractGitCommand.service(AbstractGitCommand.java:107)
    at com.google.gerrit.sshd.AbstractGitCommand.access$000(AbstractGitCommand.java:32)
    at com.google.gerrit.sshd.AbstractGitCommand$1.run(AbstractGitCommand.java:72)
    at com.google.gerrit.sshd.BaseCommand$TaskThunk.run(BaseCommand.java:469)
    at com.google.gerrit.server.logging.LoggingContextAwareRunnable.run(LoggingContextAwareRunnable.java:110)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
    at com.google.gerrit.server.git.WorkQueue$Task.run(WorkQueue.java:610)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
    Suppressed: java.net.SocketTimeoutException: waitForCondition(Window[server/remote](ChannelSession[id=74, recipient=2]-ServerSessionImpl[jenkinsci@/10.123.209.8:37966])) timeout exceeded: 3600000
        at org.apache.sshd.common.channel.Window.waitForCondition(Window.java:305)
        at org.apache.sshd.common.channel.Window.waitForSpace(Window.java:252)
        at org.apache.sshd.common.channel.ChannelOutputStream.flush(ChannelOutputStream.java:188)
        at org.apache.sshd.common.channel.ChannelOutputStream.write(ChannelOutputStream.java:127)
        at org.eclipse.jgit.transport.UploadPack$ResponseBufferedOutputStream.write(UploadPack.java:2420)
        at org.eclipse.jgit.transport.SideBandOutputStream.writeBuffer(SideBandOutputStream.java:174)
        at org.eclipse.jgit.transport.SideBandOutputStream.flushBuffer(SideBandOutputStream.java:127)
        at org.eclipse.jgit.transport.SideBandOutputStream.flush(SideBandOutputStream.java:133)
        at org.eclipse.jgit.transport.UploadPack$SideBandErrorWriter.writeError(UploadPack.java:2453)
        at org.eclipse.jgit.transport.UploadPack.upload(UploadPack.java:800)
        ... 14 more


The Gerrit Config

[gerrit]

basePath = git

canonicalWebUrl = https://gerrit.example.co.uk/

serverId = 17a63224-b838-4914-8dbb-8a3f3db2c594

[container]

user = gerrit

javaHome = /usr/lib/jvm/java-8-openjdk-amd64/jre

javaOptions = "-Dflogger.backend_factory=com.google.common.flogger.backend.log4j.Log4jBackendFactory#getInstance"

javaOptions = "-Dflogger.logging_context=com.google.gerrit.server.logging.LoggingContext#getInstance"

javaOptions = "-Dhttp.proxyHost=proxy-squid.example.co.uk -Dhttp.proxyPort=881 -Dhttp.nonProxyHosts=localhost|127.0.0.1|*.example.co.uk -Dhttps.proxyHost=proxy-squid.example.co.uk -Dhttps.proxyPort=881 -Dhttps.nonProxyHosts=localhost^|127.0.0.1|*.example.co.uk -Djdk.http.auth.tunneling.disabledSchemes=''"

[index]

type = LUCENE

[auth]

type = LDAP

gitBasicAuthPolicy = LDAP

userNameToLowerCase = true

[ldap]

server = ldap://ad-ldap.example.co.uk

username = xxxxxx

accountBase = OU=Example Users,DC=EXAMPLE,DC=co,DC=uk

groupBase = OU=Example Groups,DC=EXAMPLE,DC=co,DC=uk

groupPattern = (&(objectClass=group)(cn=${groupname}))

[receive]

enableSignedPush = false

[sshd]

listenAddress = *:29418

advertisedAddress = gerrit:29418

batchThreads = 1

maxConnectionsPerUser = 128

idleTimeout = 4h

waitTimeout = 60m

maxAuthTries = 12

threads = 24 (set high to give us a bit more resilience against dead threads)

[core]

packedGitLimit = 4g

packedgitwindowsize = 32m

packedGitOpenFiles = 800

deltaBaseCacheLimit = 30m

[httpd]

listenUrl = proxy-https://localhost:8080/

[cache]

directory = cache

[sendemail]

smtpServer = mailhost.example.co.uk

[container]

heapLimit = 25g

[plugins]

allowRemoteAdmin = true

checkFrequency = 0

[gc]

startTime = Sat 10:00

interval = 7 days

[user]

email = ger...@example.com

[gitweb]

type = gitweb


Machine spec:
      32 GB RAM, Intel(R) Xeon(R) CPU E3-1220 v5 @ 3.00GHz (4 cores).

Any ideas? Maybe a way to get more logging?

Pete.


Matthias Sohn

Feb 21, 2020, 9:55:27 AM
to pete neal, Repo and Gerrit Discussion
sshd.threads = 24 is way too large for a server with 4 cores. Set it initially to 2 * number of cores
and monitor the sshd queue sizes and CPU load to see whether that is too high. If the load is too high,
queue sizes will keep increasing because the available CPUs cannot keep pace with incoming
requests. If the thread pool is configured too large you may overload the server: CPU load goes up,
the Java GC ratio increases, and GC eats CPU capacity you could otherwise use to process requests.
This effectively lowers throughput. Configure the thread pool size so that the average Java GC ratio
stays below around 10%. This should help to ensure you don't overload the server.
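
As a rough illustration of the sizing rule above, this sketch derives a starting value for sshd.threads from the core count. The helper name `initial_sshd_threads` is made up for this example, and the factor of 2 is only the suggested starting point, to be adjusted after watching queue sizes and the GC ratio.

```python
import os


def initial_sshd_threads(cores: int, factor: int = 2) -> int:
    """Starting value for sshd.threads: factor * cores (hypothetical helper)."""
    if cores < 1:
        raise ValueError("cores must be at least 1")
    return factor * cores


if __name__ == "__main__":
    # The server in this thread has 4 cores.
    print("4-core server:", initial_sshd_threads(4))
    # os.cpu_count() may return None, so fall back to 1.
    print("this machine:", initial_sshd_threads(os.cpu_count() or 1))
```

For the 4-core machine described in this thread the rule gives 8 threads instead of the configured 24.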

To move this threshold to a larger number of concurrent requests you need more CPUs,
either on the master, or you can offload read traffic to replica servers.

The JGit block cache configured via core.packedGitLimit caches pack file pages to reduce the overhead
of copying git objects from the file system cache. Its size is limited to at most 1/4 of the maximum heap size.
Leave a good fraction of the available memory for OS and file system caches and for other processes running on the server.
The number of CPUs needed depends mostly on the number of concurrent git requests your server has to serve.
Typically the load of a Gerrit server is dominated by upload-pack (fetch, clone) requests.
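
A small worked check of the 1/4-of-heap figure against the values posted in this thread (heapLimit = 25g, packedGitLimit = 4g). The helper names are hypothetical, and the size parser only covers the k/m/g suffixes that appear in the config above.

```python
def parse_size(text: str) -> int:
    """Parse sizes such as '4g', '32m' or '8k' into bytes (binary units)."""
    units = {"k": 1024, "m": 1024**2, "g": 1024**3}
    text = text.strip().lower()
    if text and text[-1] in units:
        return int(text[:-1]) * units[text[-1]]
    return int(text)


def max_packed_git_limit(heap_limit: str) -> int:
    """Upper bound for core.packedGitLimit: one quarter of the heap."""
    return parse_size(heap_limit) // 4


heap = "25g"   # container.heapLimit from the posted config
limit = "4g"   # core.packedGitLimit from the posted config
cap = max_packed_git_limit(heap)
print(f"cap = {cap / 1024**3:.2f} GiB, configured = {parse_size(limit) / 1024**3:.2f} GiB")
print("within cap:", parse_size(limit) <= cap)
```

With a 25g heap the bound is 6.25 GiB, so the configured 4g is inside it. Note that a 25g heap on a 32 GB machine leaves little memory for the OS and file system caches.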

Most performance issues originate from requests on large repositories. If you can, keep them smaller than 1GB.
Avoid large binary files in git repositories since they are less efficiently compressed by git's pack algorithms
which bloats repository size and slows down transports.

> [core]
> packedGitLimit = 4g
> packedgitwindowsize = 32m

Why did you set such a huge window size (the default is 8k)?

> [gc]
> startTime = Sat 10:00
> interval = 7 days

Run jgit gc in a separate process so that it does not dominate the heap usage of the Gerrit Java process while gc is running.
If you use git gc, make sure you configure it to generate bitmap indexes.

Maybe you need to run git gc more frequently; once a week is probably not enough. Try once a day.
Collect statistics about the number of loose objects and the number of pack files

$ git count-objects -vH

and loose refs 

$ find repo.git/refs/ -type f | wc -l

per repository to see if git gc needs to run more frequently.
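
The two commands above can be combined into one report per repository. This is a minimal sketch, assuming bare repositories passed as command-line arguments and `git` on the PATH; the function names are made up for illustration. It uses `git count-objects -v` (without `-H`) so that every value is a plain integer.

```python
import subprocess
import sys
from pathlib import Path


def parse_count_objects(output: str) -> dict:
    """Turn 'key: value' lines from `git count-objects -v` into a dict of ints."""
    stats = {}
    for line in output.splitlines():
        key, sep, value = line.partition(":")
        if sep and value.strip().isdigit():
            stats[key.strip()] = int(value)
    return stats


def count_loose_refs(repo: Path) -> int:
    """Count regular files under refs/, i.e. refs that are not packed."""
    return sum(1 for p in (repo / "refs").rglob("*") if p.is_file())


def repo_stats(repo: Path) -> dict:
    """Collect loose objects, pack count and loose refs for one bare repository."""
    out = subprocess.run(
        ["git", "--git-dir", str(repo), "count-objects", "-v"],
        check=True, capture_output=True, text=True,
    ).stdout
    stats = parse_count_objects(out)
    return {
        "loose_objects": stats.get("count", 0),
        "packs": stats.get("packs", 0),
        "loose_refs": count_loose_refs(repo),
    }


if __name__ == "__main__":
    # Usage: python repo_stats.py /path/to/a.git /path/to/b.git
    for arg in sys.argv[1:]:
        print(arg, repo_stats(Path(arg)))
```

Running it from cron and keeping the output over a few days shows how quickly loose objects and pack files accumulate between gc runs.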

When using Gerrit 3.x you need to run git gc a lot more frequently on the All-Users repository.
To gain more insight, install the javamelody plugin or set up an external monitoring system
via one of the metrics-reporter-x plugins.

-Matthias

pete neal

Feb 26, 2020, 12:08:35 PM
to Repo and Gerrit Discussion
We're a little at the mercy of AOSP for the size of the repositories, and they are quite large.
It would seem we're rather CPU bound, so we're going to move to some better hardware and see if the deadlock still occurs. I'll update when we know.

javamelody plugin installed.

thanks again.

pete neal

Mar 6, 2020, 7:33:40 AM
to Repo and Gerrit Discussion
Moved to some new hardware and had the same problems.
To cut a long story short, after much trial and error we fixed the issue by adding
sshd.rekeyBytesLimit = 0
sshd.rekeyTimeLimit = 0
to the gerrit config.

It seems that the SSH rekey process fails (probably mixing up packets) when it's under load with large repositories, e.g. "repo sync -j6" on AOSP, which has a 20G repository and quite a few 2G+ repositories.
There is an old reference to the issue here
We are using sshd with NIO2 in Gerrit 3.1.3.
I suggest this still persists as an issue in Gerrit.
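
A back-of-the-envelope sketch of why only the large transfers would hit a rekey. It assumes a rekey byte limit of 1 GiB; that figure is an assumption for illustration and is not taken from this thread, so check the documentation of your Gerrit version. The function name is hypothetical, and a limit of 0 is modelled as "rekeying disabled", as the settings above suggest.

```python
GIB = 1024**3


def rekeys_during_transfer(transfer_bytes: int, rekey_bytes_limit: int) -> int:
    """Number of key exchanges triggered by the byte limit alone during one transfer."""
    if rekey_bytes_limit <= 0:
        return 0  # modelled as "rekeying disabled"
    return transfer_bytes // rekey_bytes_limit


ASSUMED_LIMIT = 1 * GIB  # assumption, not a value from this thread

for size_gib in (2, 20):
    n = rekeys_during_transfer(size_gib * GIB, ASSUMED_LIMIT)
    print(f"{size_gib} GiB transfer: about {n} rekey(s)")
print("limit 0:", rekeys_during_transfer(20 * GIB, 0))
```

Under that assumption a small repository never reaches the limit within one connection, while a 20G clone would rekey many times in the middle of a pack transfer, which fits the observation that the problem shows up with the big AOSP repositories.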

Thanks to anyone who took the time to consider our issue,
especially Matthias, as our server is healthier after his tips and all the git GCs  :-)

Pete.

David Ostrovsky

Mar 8, 2020, 4:20:43 AM
to Repo and Gerrit Discussion

On Friday, March 6, 2020 at 13:33:40 UTC+1, pete neal wrote:
> Moved to some new hardware and had the same problems.
> To cut a long story short, after much trial and error we fixed the issue by adding
> sshd.rekeyBytesLimit = 0
> sshd.rekeyTimeLimit = 0
> to the gerrit config.


Thanks for letting us know. There is also this pending issue upstream,
where the SSHD maintainer asked us to upgrade to the recent SSHD release
2.4.0, which was done in [2], [3].


滕龙

Mar 8, 2020, 4:41:11 AM
to David Ostrovsky, Repo and Gerrit Discussion, David Pursehouse, Luca Milanesio
I think I may have run into this issue and solved it in my case.

I saw that you configured 'sshd.batchThreads = 1', which means you have
only 1 thread for Gerrit to handle SSH requests from
non-interactive users.


Gerrit's documentation for sshd.batchThreads:

```
sshd.batchThreads

Number of threads to allocate for SSH command requests from
non-interactive users. If equals to 0, then all non-interactive
requests are executed in the same queue as interactive requests.

Any other value will remove the number of threads from the queue
allocated to interactive users, and create a separate thread pool of
the requested size, which will be used to run commands from
non-interactive users.

If the number of threads requested for non-interactive users is larger
than the total number of threads allocated in sshd.threads, then the
value of sshd.threads is increased to accommodate the requested value.

By default is 1 on single core node, 2 otherwise.

```



Your logs show that the username is 'jenkinsci', which is very likely a
non-interactive user. If any other non-interactive users
are executing SSH commands (like the UploadPack operation),
the 'jenkinsci' operation will be blocked there.

So, is this user a non-interactive user, and are there any
other non-interactive users? You can execute
`watch -d ssh -p 29418 user@server gerrit show-queue`
to monitor the Gerrit backend queue.

If multiple non-interactive users are running SSH
commands, you could try increasing the value of
`sshd.batchThreads`, or set it to 0, to test whether the server load
allows it.
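
The pool split described in the quoted documentation can be sketched as follows. This is a simplified model, not Gerrit's implementation: the function name is made up, and the case where batchThreads is not smaller than threads is reduced to the assumption that at least one interactive worker always remains.

```python
from typing import Tuple


def split_ssh_pools(threads: int, batch_threads: int) -> Tuple[int, int]:
    """Return (interactive, batch) worker counts per the quoted documentation."""
    if batch_threads == 0:
        # One shared queue: non-interactive users run alongside interactive ones.
        return threads, 0
    # Batch workers are taken out of the interactive pool.
    # Assumption: at least one interactive worker always remains.
    return max(1, threads - batch_threads), batch_threads


print(split_ssh_pools(24, 1))  # the configuration posted in this thread
print(split_ssh_pools(24, 0))  # shared queue
print(split_ssh_pools(8, 2))
```

With threads = 24 and batchThreads = 1, every command from a non-interactive user goes through a single worker, so one long-running upload-pack from a CI account makes all other non-interactive requests wait.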


Hope this solves the problem you ran into.


I already uploaded a change for this case a few days ago
(https://gerrit-review.googlesource.com/c/gerrit/+/247352)