All git commands and ssh commands to gerrit hang. Web UI still working.

Dom Alessi

Mar 31, 2014, 2:35:43 PM
to repo-d...@googlegroups.com
Hi,

We had an issue on Gerrit 2.6.1 last week where eventually all git commands hung, and even ssh commands (show-queue, connections, cache, etc.) did too. So we upgraded to 2.7, placed all Jenkins users in the Non-Interactive group and gave them streaming permissions. Even so, we are facing the same issue, where eventually all git commands and ssh commands fail. I see some exceptions from Apache MINA, but I am not sure whether they are relevant. We are on Apache 2.2.15 and a PostgreSQL database. We do not have any OutOfMemory errors; we just see that some of our 8 CPUs are at 100%. We do not see any errors in the sshd_log. Our gerrit.config looks like this:

[database]
        type = POSTGRESQL
hostname = 127.0.0.1
database = reviewdb
username = gerrit
        poolLimit = 64
        poolMaxIdle = 12
        poolMaxWait = 60 seconds

[container]
user = gerrit
javaHome = /usr/java/jdk1.7.0_40
        heapLimit = 16g

javaOptions = -XX:+UseG1GC -XX:PermSize=128m -XX:MaxPermSize=128m -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -server -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/tmp -verbose:gc -XX:+PrintGCDateStamps -XX:+PrintGCDetails -Xloggc:/opt/collabnet/gerrit/logs/gerrit.gc.log -XX:MaxPermSize=128m -Dh2.useThreadContextClassLoader=true -Dlog4j.configuration=file:////opt/collabnet/gerrit/etc/log4j-ng.properties -javaagent:/opt/newrelic/newrelic.jar -Dnewrelic.config.file=/etc/newrelic/newrelic-gerrit.yml
[gitweb]
        cgi = /var/www/gitweb-caching/gitweb.cgi
[sshd]
        listenAddress = *:29418
        threads = 32
        batchThreads = 12
        maxAuthTries = 12
idleTimeout = 5m
maxConnectionsPerUser = 256

[core]
        packedGitWindowSize = 64k
        packedGitOpenFiles = 8192
        packedGitLimit = 10g # this is a fraction of java heap
        streamFileThreshold = 2047m
        deltaBaseCacheLimit = 50m
[pack]
        deltacompression = on
        threads = 0
bigFileThreshold = 20m
        indexVersion = 2
[cache "web_sessions"]
        memoryLimit = 4096
        maxAge = 23 hours
[cache "accounts"]
        memoryLimit = 7000
[cache "accounts_byemail"]
        memoryLimit = 7000
[cache "accounts_byname"]
        memoryLimit = 7000
[cache "sshkeys"]
        memoryLimit = 4096
[cache "projects"]
        memoryLimit = 4096
[cache "groups"]
        memoryLimit = 32768
[cache "groups_byinclude"]
        memoryLimit = 32768
[cache "diff"]
        memoryLimit = 128m
diskLimit = 256m
timeout = 5s
[cache "diff_intraline"]
        memoryLimit = 128m
        diskLimit = 256m
[cache "ctf_groups"]
        memoryLimit = 32768
        maxAge = 1 day
[cache "ctf_credentials"]
        memoryLimit = 32768
        maxAge = 1 day
[receive]
timeout = 4min
checkMagicRefs = false
maxObjectSizeLimit = 512m

The exceptions I see are :

java.io.IOException: Connection reset by peer
        at sun.nio.ch.FileDispatcherImpl.read0(Native Method)
        at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:39)
        at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:223)
        at sun.nio.ch.IOUtil.read(IOUtil.java:197)
        at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:379)
        at org.apache.mina.transport.socket.nio.NioProcessor.read(NioProcessor.java:273)
        at org.apache.mina.transport.socket.nio.NioProcessor.read(NioProcessor.java:44)
        at org.apache.mina.core.polling.AbstractPollingIoProcessor.read(AbstractPollingIoProcessor.java:690)
        at org.apache.mina.core.polling.AbstractPollingIoProcessor.process(AbstractPollingIoProcessor.java:664)
        at org.apache.mina.core.polling.AbstractPollingIoProcessor.process(AbstractPollingIoProcessor.java:653)
        at org.apache.mina.core.polling.AbstractPollingIoProcessor.access$600(AbstractPollingIoProcessor.java:67)
        at org.apache.mina.core.polling.AbstractPollingIoProcessor$Processor.run(AbstractPollingIoProcessor.java:1124)
        at org.apache.mina.util.NamePreservingRunnable.run(NamePreservingRunnable.java:64)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:724)


and

java.lang.IllegalStateException: Incorrect identification: bad line ending
        at org.apache.sshd.common.session.AbstractSession.doReadIdentification(AbstractSession.java:654)
        at org.apache.sshd.server.session.ServerSession.readIdentification(ServerSession.java:356)
        at org.apache.sshd.common.session.AbstractSession.messageReceived(AbstractSession.java:247)
        at org.apache.sshd.common.AbstractSessionIoHandler.messageReceived(AbstractSessionIoHandler.java:54)
        at org.apache.sshd.common.io.mina.MinaService.messageReceived(MinaService.java:94)
        at org.apache.mina.core.filterchain.DefaultIoFilterChain$TailFilter.messageReceived(DefaultIoFilterChain.java:690)
        at org.apache.mina.core.filterchain.DefaultIoFilterChain.callNextMessageReceived(DefaultIoFilterChain.java:417)
        at org.apache.mina.core.filterchain.DefaultIoFilterChain.access$1200(DefaultIoFilterChain.java:47)
        at org.apache.mina.core.filterchain.DefaultIoFilterChain$EntryImpl$1.messageReceived(DefaultIoFilterChain.java:765)
        at org.apache.mina.core.filterchain.IoFilterAdapter.messageReceived(IoFilterAdapter.java:109)
        at org.apache.mina.core.filterchain.DefaultIoFilterChain.callNextMessageReceived(DefaultIoFilterChain.java:417)
        at org.apache.mina.core.filterchain.DefaultIoFilterChain.fireMessageReceived(DefaultIoFilterChain.java:410)
        at org.apache.mina.core.polling.AbstractPollingIoProcessor.read(AbstractPollingIoProcessor.java:710)
        at org.apache.mina.core.polling.AbstractPollingIoProcessor.process(AbstractPollingIoProcessor.java:664)
        at org.apache.mina.core.polling.AbstractPollingIoProcessor.process(AbstractPollingIoProcessor.java:653)
        at org.apache.mina.core.polling.AbstractPollingIoProcessor.access$600(AbstractPollingIoProcessor.java:67)
        at org.apache.mina.core.polling.AbstractPollingIoProcessor$Processor.run(AbstractPollingIoProcessor.java:1124)
        at org.apache.mina.util.NamePreservingRunnable.run(NamePreservingRunnable.java:64)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:724)

and

java.lang.NullPointerException
at org.apache.sshd.common.session.AbstractSession.getSession(AbstractSession.java:184)
at org.apache.sshd.server.session.ServerSession.getActiveSessionCountForUser(ServerSession.java:564)
at org.apache.sshd.server.session.ServerSession.userAuth(ServerSession.java:482)
at org.apache.sshd.server.session.ServerSession.handleMessage(ServerSession.java:212)
at org.apache.sshd.common.session.AbstractSession.decode(AbstractSession.java:587)
at org.apache.sshd.common.session.AbstractSession.messageReceived(AbstractSession.java:253)
at org.apache.sshd.common.AbstractSessionIoHandler.messageReceived(AbstractSessionIoHandler.java:54)
at org.apache.sshd.common.io.mina.MinaService.messageReceived(MinaService.java:94)
at org.apache.mina.core.filterchain.DefaultIoFilterChain$TailFilter.messageReceived(DefaultIoFilterChain.java:690)
at org.apache.mina.core.filterchain.DefaultIoFilterChain.callNextMessageReceived(DefaultIoFilterChain.java:417)
at org.apache.mina.core.filterchain.DefaultIoFilterChain.access$1200(DefaultIoFilterChain.java:47)
at org.apache.mina.core.filterchain.DefaultIoFilterChain$EntryImpl$1.messageReceived(DefaultIoFilterChain.java:765)
at org.apache.mina.core.filterchain.IoFilterAdapter.messageReceived(IoFilterAdapter.java:109)
at org.apache.mina.core.filterchain.DefaultIoFilterChain.callNextMessageReceived(DefaultIoFilterChain.java:417)
at org.apache.mina.core.filterchain.DefaultIoFilterChain.fireMessageReceived(DefaultIoFilterChain.java:410)
at org.apache.mina.core.polling.AbstractPollingIoProcessor.read(AbstractPollingIoProcessor.java:710)
at org.apache.mina.core.polling.AbstractPollingIoProcessor.process(AbstractPollingIoProcessor.java:664)
at org.apache.mina.core.polling.AbstractPollingIoProcessor.process(AbstractPollingIoProcessor.java:653)
at org.apache.mina.core.polling.AbstractPollingIoProcessor.access$600(AbstractPollingIoProcessor.java:67)
at org.apache.mina.core.polling.AbstractPollingIoProcessor$Processor.run(AbstractPollingIoProcessor.java:1124)
at org.apache.mina.util.NamePreservingRunnable.run(NamePreservingRunnable.java:64)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:724)

Thanks,
Dom

Matthias Sohn

Apr 2, 2014, 8:32:04 AM
to Dom Alessi, Repo and Gerrit Discussion

Try the gerrit show-queue command [1] to check what your server is doing.
Additionally, take a couple of thread dumps (jstack) to get more information about what's going on.
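
In case it helps, here is a rough Python sketch of how a few thread dumps could be collected some seconds apart, so that threads stuck in the same place show up across dumps. The function names, the file naming and the default count and interval are my own assumptions, not anything Gerrit ships. It assumes jstack from the same JDK is on the PATH and that the script runs as the user owning the Gerrit JVM. The small nid helper is there because HotSpot thread dumps on JDK 7 print the native thread id in hex (nid=0x...), which can be matched against the per-thread ids shown by a tool such as top -H to find the threads burning CPU.

```python
import subprocess
import time
from pathlib import Path


def jstack_command(pid):
    """Build the jstack invocation; -l also prints lock information."""
    return ["jstack", "-l", str(pid)]


def nid_for_thread(tid):
    """Format a decimal OS thread id the way a HotSpot thread dump shows it."""
    return "nid=" + hex(tid)


def collect_dumps(pid, count=3, interval_s=10.0, out_dir="."):
    """Write `count` thread dumps of JVM `pid`, `interval_s` seconds apart.

    File names (jstack-<pid>-<n>.txt) are a hypothetical convention.
    Returns the list of files written.
    """
    paths = []
    for i in range(count):
        # check=True raises if jstack fails, e.g. wrong user or dead pid.
        result = subprocess.run(
            jstack_command(pid), capture_output=True, text=True, check=True
        )
        path = Path(out_dir) / "jstack-{}-{}.txt".format(pid, i)
        path.write_text(result.stdout)
        paths.append(path)
        if i < count - 1:
            time.sleep(interval_s)
    return paths
```

Calling collect_dumps(<gerrit pid>) while the server is hanging would leave three files to compare; searching them for the string returned by nid_for_thread(<busy thread id>) should point at the stack of a thread that is pinning a CPU.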


--
Matthias

Dom Alessi

Apr 11, 2014, 8:32:11 AM
to repo-d...@googlegroups.com, Dom Alessi
Hi Matthias,

In the end it was a plugin we wrote that caused the issue. Sorry for the noise. But I did take the opportunity to raise the following NPE with Apache MINA and it has been fixed, so perhaps Gerrit can include the fix in a future release.


BR,
Dom