SSH sessions hanging - affecting some users


Darragh Bailey

Jun 14, 2017, 7:43:51 AM
to Repo and Gerrit Discussion
Hi,


We're currently running Gerrit 2.13.4 (via a docker container) and one of our users has reported numerous hangs where multiple attempts are required to push a patch via ssh for review.

Looking at show-connections and show-queue I'm able to see a number of connections/tasks stuck corresponding with the times they reported the issue.

Session    User            Remote Host
--------------------------------------------------------------
decdfa53   user1           10.37.229.16
2b2c2314   user1           10.37.229.16
920d4eb5   user1           10.37.229.16
71b69c75   user1           10.37.229.16
60fc8833   user1           10.37.229.16
ea4521d1   user1           10.37.229.16
218b64e8   user1           10.37.229.16
75f708e7   user1           10.37.229.16
bce9fb98   user2           10.37.229.130
d3cce359   user2           10.37.229.130
7178972e   user2           10.37.229.130
0f6722b6   <me>            10.37.78.218
--
SSHD Backend: nio2


Task     State        StartTime         Command
------------------------------------------------------------------------------
de789af3              Jun-07 12:28      git-receive-pack ... (user1)
ab17b3c3              Jun-07 12:37      git-receive-pack ... (user1)
d21746c7              Jun-07 12:41      git-receive-pack ... (user1)
b19f34e4              Jun-07 12:45      git-receive-pack ... (user1)
20345011              Jun-07 12:50      git-receive-pack ... (user1)
ea50c19c              Jun-08 15:47      git-receive-pack ... (user1)
61807cc5              Jun-08 15:51      git-receive-pack ... (user1)
b50100b8              Jun-09 12:03      git-receive-pack ... (user1)
fcf3f38a              Jun-09 12:23      git-receive-pack ... (user2)
13c6fb3b              Jun-09 12:26      git-receive-pack ... (user2)
f196c782              Jun-09 12:32      git-receive-pack ... (user2)
c9e00b24 23:00:00.004 Jun-07 08:38      Log File Compressor
------------------------------------------------------------------------------
  15 tasks


I've pruned some connections relating to our CI systems, because they are watching stream events.
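To tally the stuck pushes per user from show-queue output like the table above, a minimal Python sketch could look like the following. It assumes the column layout shown in this message (task id, optional state, start time, command); the function name `stuck_pushes_by_user` is made up for this example, and the sample rows are copied from the table above.

```python
import re
from collections import Counter

# Row layout assumed from the show-queue output in this thread:
# <task id> [<state>] <Mon-DD HH:MM> <command>
ROW_RE = re.compile(
    r"^(?P<task>[0-9a-f]{8})\s+"
    r"(?:(?P<state>\S+)\s+)?"
    r"(?P<start>[A-Z][a-z]{2}-\d{2} \d{2}:\d{2})\s+"
    r"(?P<command>.*)$"
)
# The user name appears in parentheses at the end of the command column.
USER_RE = re.compile(r"\((?P<user>[^()]+)\)\s*$")


def stuck_pushes_by_user(lines):
    """Count git-receive-pack tasks per user in show-queue output."""
    counts = Counter()
    for line in lines:
        row = ROW_RE.match(line.strip())
        if not row or not row.group("command").startswith("git-receive-pack"):
            continue  # header, separator, or a non-push task
        user = USER_RE.search(row.group("command"))
        counts[user.group("user") if user else "unknown"] += 1
    return dict(counts)


# Rows copied from the output above (a subset, for brevity).
SAMPLE_QUEUE = """\
Task     State        StartTime         Command
de789af3              Jun-07 12:28      git-receive-pack ... (user1)
ab17b3c3              Jun-07 12:37      git-receive-pack ... (user1)
fcf3f38a              Jun-09 12:23      git-receive-pack ... (user2)
c9e00b24 23:00:00.004 Jun-07 08:38      Log File Compressor
""".splitlines()

if __name__ == "__main__":
    for user, count in sorted(stuck_pushes_by_user(SAMPLE_QUEUE).items()):
        print(user, count)
```

A push that is simply still in progress would be counted too, so the start times need checking against the current time before treating a task as stuck.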

Looking at the SSH log I can see the connections on Jun 7th hanging:

[2017-06-07 08:09:22,253 +0000] f3e92ad2 user1 a/109 LOGIN FROM 10.37.229.16
[2017-06-07 08:09:23,206 +0000] f3e92ad2 user1 a/109 git-upload-pack.... 7ms 489ms 0
[2017-06-07 08:09:23,546 +0000] f3e92ad2 user1 a/109 LOGOUT
[2017-06-07 08:09:24,890 +0000] f80b47ac user1 a/109 LOGIN FROM 10.37.229.16
[2017-06-07 08:13:26,583 +0000] f80b47ac user1 a/109 LOGOUT
[2017-06-07 08:13:26,583 +0000] f80b47ac user1 a/109 git-receive-pack.... 0ms 241238ms killed
[2017-06-07 08:13:31,829 +0000] a2fb5e96 user1 a/109 LOGIN FROM 10.37.229.16
[2017-06-07 08:13:32,770 +0000] a2fb5e96 user1 a/109 git-upload-pack.... 7ms 479ms 0
[2017-06-07 08:13:33,106 +0000] a2fb5e96 user1 a/109 LOGOUT
[2017-06-07 08:13:34,515 +0000] c2ef72d2 user1 a/109 LOGIN FROM 10.37.229.16
[2017-06-07 08:13:35,438 +0000] c2ef72d2 user1 a/109 git-receive-pack.... 1ms 467ms 0 git/2.7.4
[2017-06-07 08:13:35,774 +0000] c2ef72d2 user1 a/109 LOGOUT
[2017-06-07 08:18:27,645 +0000] e53165bc user1 a/109 LOGIN FROM 10.37.229.16
[2017-06-07 08:18:28,593 +0000] e53165bc user1 a/109 git-upload-pack.... 7ms 484ms 0
[2017-06-07 08:18:28,930 +0000] e53165bc user1 a/109 LOGOUT
[2017-06-07 08:18:30,417 +0000] a54e0d31 user1 a/109 LOGIN FROM 10.37.229.16
[2017-06-07 08:18:31,329 +0000] a54e0d31 user1 a/109 git-receive-pack.... 0ms 457ms 0 git/2.7.4
[2017-06-07 08:18:31,665 +0000] a54e0d31 user1 a/109 LOGOUT
[2017-06-07 08:19:06,584 +0000] aaefbad9 user1 a/109 LOGIN FROM 10.37.229.16
[2017-06-07 08:19:07,576 +0000] aaefbad9 user1 a/109 git-upload-pack.... 7ms 529ms 0
[2017-06-07 08:19:07,913 +0000] aaefbad9 user1 a/109 LOGOUT
[2017-06-07 08:19:09,594 +0000] ea3132bc user1 a/109 LOGIN FROM 10.37.229.16
[2017-06-07 08:19:10,574 +0000] ea3132bc user1 a/109 git-receive-pack.... 1ms 524ms 0 git/2.7.4
[2017-06-07 08:19:10,914 +0000] ea3132bc user1 a/109 LOGOUT
[2017-06-07 08:19:42,131 +0000] 2a1aeadd user1 a/109 LOGIN FROM 10.37.229.16
[2017-06-07 08:19:43,077 +0000] 2a1aeadd user1 a/109 git-upload-pack.... 7ms 483ms 0
[2017-06-07 08:19:43,415 +0000] 2a1aeadd user1 a/109 LOGOUT
[2017-06-07 08:19:44,736 +0000] 3f030e92 user1 a/109 LOGIN FROM 10.37.229.16
[2017-06-07 08:19:45,983 +0000] 3f030e92 user1 a/109 git-receive-pack.... 0ms 788ms 0 git/2.7.4
[2017-06-07 08:19:46,323 +0000] 3f030e92 user1 a/109 LOGOUT
[2017-06-07 12:28:35,642 +0000] 5e926a3c user1 a/109 LOGIN FROM 10.37.229.16
[2017-06-07 12:28:36,629 +0000] 5e926a3c user1 a/109 git-upload-pack.... 5ms 521ms 0
[2017-06-07 12:28:36,968 +0000] 5e926a3c user1 a/109 LOGOUT
[2017-06-07 12:28:38,303 +0000] decdfa53 user1 a/109 LOGIN FROM 10.37.229.16
[2017-06-07 12:37:54,941 +0000] eb4a8bc3 user1 a/109 LOGIN FROM 10.37.229.16
[2017-06-07 12:37:55,974 +0000] eb4a8bc3 user1 a/109 git-upload-pack.... 7ms 539ms 0
[2017-06-07 12:37:56,337 +0000] eb4a8bc3 user1 a/109 LOGOUT
[2017-06-07 12:37:57,706 +0000] 2b2c2314 user1 a/109 LOGIN FROM 10.37.229.16
[2017-06-07 12:41:12,567 +0000] 062d9f10 user1 a/109 LOGIN FROM 10.37.229.16
[2017-06-07 12:41:13,648 +0000] 062d9f10 user1 a/109 git-upload-pack.... 8ms 593ms 0
[2017-06-07 12:41:13,987 +0000] 062d9f10 user1 a/109 LOGOUT
[2017-06-07 12:41:15,328 +0000] 920d4eb5 user1 a/109 LOGIN FROM 10.37.229.16
[2017-06-07 12:45:37,841 +0000] d1c508c7 user1 a/109 LOGIN FROM 10.37.229.16
[2017-06-07 12:45:38,796 +0000] d1c508c7 user1 a/109 git-upload-pack.... 7ms 492ms 0
[2017-06-07 12:45:39,133 +0000] d1c508c7 user1 a/109 LOGOUT
[2017-06-07 12:45:40,542 +0000] 71b69c75 user1 a/109 LOGIN FROM 10.37.229.16
[2017-06-07 12:50:07,232 +0000] 40ae2428 user1 a/109 LOGIN FROM 10.37.229.16
[2017-06-07 12:50:08,178 +0000] 40ae2428 user1 a/109 git-upload-pack.... 6ms 484ms 0
[2017-06-07 12:50:08,516 +0000] 40ae2428 user1 a/109 LOGOUT
[2017-06-07 12:50:09,838 +0000] 60fc8833 user1 a/109 LOGIN FROM 10.37.229.16
[2017-06-07 12:54:20,431 +0000] 9dcb283f user1 a/109 LOGIN FROM 10.37.229.16
[2017-06-07 12:54:21,376 +0000] 9dcb283f user1 a/109 git-upload-pack.... 5ms 483ms 0
[2017-06-07 12:54:21,716 +0000] 9dcb283f user1 a/109 LOGOUT
[2017-06-07 12:54:23,097 +0000] 3d41dcae user1 a/109 LOGIN FROM 10.37.229.16
[2017-06-07 12:54:36,584 +0000] 3d41dcae user1 a/109 git-receive-pack.... 0ms 12997ms 0 git/2.7.4
[2017-06-07 12:54:36,920 +0000] 3d41dcae user1 a/109 LOGOUT

The connections that hung:
[2017-06-07 12:28:38,303 +0000] decdfa53 user1 a/109 LOGIN FROM 10.37.229.16
[2017-06-07 12:37:57,706 +0000] 2b2c2314 user1 a/109 LOGIN FROM 10.37.229.16
[2017-06-07 12:41:15,328 +0000] 920d4eb5 user1 a/109 LOGIN FROM 10.37.229.16
[2017-06-07 12:45:40,542 +0000] 71b69c75 user1 a/109 LOGIN FROM 10.37.229.16
[2017-06-07 12:50:09,838 +0000] 60fc8833 user1 a/109 LOGIN FROM 10.37.229.16
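
The hung sessions above were picked out by eye: each has a LOGIN line but no matching LOGOUT. A minimal Python sketch of the same check follows, assuming the log line format shown in this message (bracketed timestamp, session id, user, account id, event). The function name `find_open_sessions` is illustrative, and the sample lines are taken from the log excerpt above.

```python
import re

# Layout assumed from the SSH log excerpt in this thread:
# [timestamp] <session id> <user> <account> <event...>
LINE_RE = re.compile(
    r"^\[(?P<ts>[^\]]+)\]\s+(?P<session>[0-9a-f]{8})\s+"
    r"(?P<user>\S+)\s+\S+\s+(?P<event>.*)$"
)


def find_open_sessions(lines):
    """Return {session: (timestamp, user)} for LOGINs with no later LOGOUT."""
    open_sessions = {}
    for line in lines:
        m = LINE_RE.match(line.strip())
        if not m:
            continue
        event = m.group("event")
        if event.startswith("LOGIN"):
            open_sessions[m.group("session")] = (m.group("ts"), m.group("user"))
        elif event.startswith("LOGOUT"):
            # Session closed normally (or was killed); forget it.
            open_sessions.pop(m.group("session"), None)
    return open_sessions


# Lines taken from the log excerpt above.
SAMPLE_LOG = """\
[2017-06-07 12:28:35,642 +0000] 5e926a3c user1 a/109 LOGIN FROM 10.37.229.16
[2017-06-07 12:28:36,629 +0000] 5e926a3c user1 a/109 git-upload-pack.... 5ms 521ms 0
[2017-06-07 12:28:36,968 +0000] 5e926a3c user1 a/109 LOGOUT
[2017-06-07 12:28:38,303 +0000] decdfa53 user1 a/109 LOGIN FROM 10.37.229.16
[2017-06-07 12:37:54,941 +0000] eb4a8bc3 user1 a/109 LOGIN FROM 10.37.229.16
[2017-06-07 12:37:56,337 +0000] eb4a8bc3 user1 a/109 LOGOUT
""".splitlines()

if __name__ == "__main__":
    for session, (ts, user) in find_open_sessions(SAMPLE_LOG).items():
        print(session, user, ts)
```

Sessions that are legitimately long-lived (for example CI clients watching stream events) would also be reported, so the result still needs cross-checking against show-connections.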


I haven't seen anything obvious in the changelog for 2.13.5+ that indicates a fix for a bug that would cause these hangs, and I'm unsure how to debug further. So far about 2 out of 60+ users appear to be affected.

We've asked the user affected the most to run 'git-review' with GIT_SSH_COMMAND='ssh -vvv'  for a while to see if there is a clear indication as to what might be happening from the client perspective. But given these connections appear to be permanently stuck on the server, it doesn't seem likely that will provide much more info.

Any suggestions on what to look for next? Any debug that can be turned on that might be useful?

--
Darragh Bailey

Saša Živkov

Jun 14, 2017, 8:26:46 AM
to Darragh Bailey, Repo and Gerrit Discussion
As always: take a thread dump of the Gerrit process and check what these threads (the ones processing receive-pack, i.e. "git push") are doing.
 


--
--
To unsubscribe, email repo-discuss+unsubscribe@googlegroups.com
More info at http://groups.google.com/group/repo-discuss?hl=en

---
You received this message because you are subscribed to the Google Groups "Repo and Gerrit Discussion" group.
To unsubscribe from this group and stop receiving emails from it, send an email to repo-discuss+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Darragh Bailey

Jun 14, 2017, 10:52:04 AM
to Saša Živkov, Repo and Gerrit Discussion
On 14 June 2017 at 13:25, Saša Živkov <ziv...@gmail.com> wrote:


As always: take a thread dump of the Gerrit process and check what these threads (the ones processing receive-pack, i.e. "git push") are doing.
 


All look like the following:


"SSH git-receive-pack '/<repo>/<name>.git' (user1)" #830 prio=1 os_prio=0 tid=0x000055e2d49bd000 nid=0x526 waiting on condition [0x00007f89fadc0000]
   java.lang.Thread.State: WAITING (parking)
        at sun.misc.Unsafe.park(Native Method)
        - parking to wait for  <0x00000000a551fee8> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
        at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2039)
        at org.apache.sshd.common.channel.ChannelPipedInputStream.read(ChannelPipedInputStream.java:136)
        at org.eclipse.jgit.transport.PackParser.fill(PackParser.java:1173)
        at org.eclipse.jgit.transport.PackParser.readPackHeader(PackParser.java:874)
        at org.eclipse.jgit.transport.PackParser.parse(PackParser.java:512)
        at org.eclipse.jgit.internal.storage.file.ObjectDirectoryPackParser.parse(ObjectDirectoryPackParser.java:195)
        at org.eclipse.jgit.transport.BaseReceivePack.receivePack(BaseReceivePack.java:1308)
        at org.eclipse.jgit.transport.BaseReceivePack.receivePackAndCheckConnectivity(BaseReceivePack.java:1045)
        at org.eclipse.jgit.transport.ReceivePack.service(ReceivePack.java:250)
        at org.eclipse.jgit.transport.ReceivePack.receive(ReceivePack.java:206)
        at com.google.gerrit.sshd.commands.Receive.runImpl(Receive.java:97)
        at com.google.gerrit.sshd.AbstractGitCommand.service(AbstractGitCommand.java:101)
        at com.google.gerrit.sshd.AbstractGitCommand.access$000(AbstractGitCommand.java:32)
        at com.google.gerrit.sshd.AbstractGitCommand$1.run(AbstractGitCommand.java:70)
        at com.google.gerrit.sshd.BaseCommand$TaskThunk.run(BaseCommand.java:442)
        - locked <0x00000000a5522ef0> (a com.google.gerrit.sshd.BaseCommand$TaskThunk)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
        at com.google.gerrit.server.git.WorkQueue$Task.run(WorkQueue.java:417)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)


I tried searching the rest of the thread dump for 7f89fadc0000, a551fee8 & a5522ef0 and found no other matches, so I have no idea why it got stuck.

I'll try attaching a copy of the dump and see if it will be included and whether someone else might know what it means.
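
For anyone wanting to pull the same threads out of a larger dump, here is a minimal Python sketch. It assumes a jstack-style text dump in which threads are separated by blank lines, as in the excerpt above; how the dump is obtained is outside the sketch. The function name `receive_pack_threads` is made up for this example, and the second thread in the sample is hypothetical, added only to show the filtering.

```python
def receive_pack_threads(dump_text):
    """Return (name, state, first non-JDK frame) for git-receive-pack threads."""
    results = []
    for block in dump_text.split("\n\n"):
        lines = [line.strip() for line in block.strip().splitlines()]
        # Thread blocks start with the quoted thread name.
        if not lines or not lines[0].startswith('"'):
            continue
        if "git-receive-pack" not in lines[0]:
            continue
        name = lines[0].split('"')[1]
        state = next(
            (l.split(":", 1)[1].strip() for l in lines
             if l.startswith("java.lang.Thread.State:")),
            "UNKNOWN",
        )
        # Skip JDK frames to find where the application code is waiting.
        frame = next(
            (l[3:] for l in lines
             if l.startswith("at ") and not l.startswith(("at java.", "at sun."))),
            None,
        )
        results.append((name, state, frame))
    return results


# First thread abbreviated from the dump above; second one is hypothetical.
SAMPLE_DUMP = """\
"SSH git-receive-pack '/<repo>/<name>.git' (user1)" #830 prio=1 os_prio=0 waiting on condition
   java.lang.Thread.State: WAITING (parking)
        at sun.misc.Unsafe.park(Native Method)
        at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
        at org.apache.sshd.common.channel.ChannelPipedInputStream.read(ChannelPipedInputStream.java:136)
        at org.eclipse.jgit.transport.PackParser.fill(PackParser.java:1173)

"hypothetical-other-thread" #1 prio=5 os_prio=0 runnable
   java.lang.Thread.State: RUNNABLE
"""

if __name__ == "__main__":
    for name, state, frame in receive_pack_threads(SAMPLE_DUMP):
        print(name, "|", state, "|", frame)
```

If every stuck push reports the same first non-JDK frame, as here, that points at a single wait site rather than several unrelated problems.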
 



--
Darragh Bailey
"Nothing is foolproof to a sufficiently talented fool"
gerrit-stacktrace.out.gz

Luca Milanesio

Jun 15, 2017, 1:02:39 AM
to Darragh Bailey, Saša Živkov, Repo and Gerrit Discussion
It is stuck in Apache Mina SSHD, waiting (forever) for data.
Have you configured any SSH timeout in Gerrit?

Luca.
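
One way to answer the timeout question is to look at the [sshd] section of the site's gerrit.config. The Python sketch below is a deliberately small reader for that git-config-style file, not a full parser. `sshd.idleTimeout` is an option described in Gerrit's configuration documentation; whether setting it would have closed the sessions in this thread is not established here. The helper names and the sample config contents are hypothetical.

```python
def read_section(config_text, section):
    """Return {key: value} for one simple [section] of a git-config-style file."""
    values = {}
    current = None
    for raw in config_text.splitlines():
        line = raw.strip()
        if not line or line.startswith(("#", ";")):
            continue  # blank line or comment
        if line.startswith("[") and line.endswith("]"):
            current = line[1:-1].strip().lower()
            continue
        if current == section.lower() and "=" in line:
            key, value = line.split("=", 1)
            values[key.strip().lower()] = value.strip()
    return values


def ssh_idle_timeout(config_text):
    """Return the configured sshd.idleTimeout value, or 'unset' if absent."""
    return read_section(config_text, "sshd").get("idletimeout", "unset")


# Hypothetical config contents, for illustration only.
SAMPLE_CONFIG = """\
[sshd]
    listenAddress = *:29418
    idleTimeout = 10 min
"""

if __name__ == "__main__":
    print("sshd.idleTimeout:", ssh_idle_timeout(SAMPLE_CONFIG))
```

An "unset" result means Gerrit falls back to its default for that option, which should be checked in the documentation for the version in use.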

 
