Git timeouts on jenkins git fetch from gerrit slave.

Christian Goetze

Jun 21, 2014, 3:05:28 PM
to repo-d...@googlegroups.com
We recently switched to a master/slave setup using Gerrit 2.7, and I've been observing a drastic increase in "git fetch" timeouts on Jenkins.

Has anyone seen something like this, or have any ideas about possible causes? The error is not generally reproducible: the next build on the same slave, using the same git repo in the same workspace, works fine.

Building remotely on centos5-x64-bld-slave-51 (admin-linux2 centos5 operational) in workspace /build/jenkins/workspace/codebase@0
09:16:04 Fetching changes from the remote Git repository
09:16:04 Fetching upstream changes from ssh://jenk...@gerrit-slave.corp.appdynamics.com:29418/codebase.git
10:16:05 ERROR: Timeout after 60 minutes
10:16:05 FATAL: Failed to fetch from ssh://jenk...@gerrit-slave.corp.appdynamics.com:29418/codebase.git
10:16:05 hudson.plugins.git.GitException: Failed to fetch from ssh://jenk...@gerrit-slave.corp.appdynamics.com:29418/codebase.git
10:16:05 	at hudson.plugins.git.GitSCM.fetchFrom(GitSCM.java:623)
10:16:05 	at hudson.plugins.git.GitSCM.retrieveChanges(GitSCM.java:855)
10:16:05 	at hudson.plugins.git.GitSCM.checkout(GitSCM.java:880)
10:16:05 	at hudson.model.AbstractProject.checkout(AbstractProject.java:1320)
10:16:05 	at hudson.model.AbstractBuild$AbstractBuildExecution.defaultCheckout(AbstractBuild.java:609)
10:16:05 	at jenkins.scm.SCMCheckoutStrategy.checkout(SCMCheckoutStrategy.java:88)
10:16:05 	at hudson.model.AbstractBuild$AbstractBuildExecution.run(AbstractBuild.java:518)
10:16:05 	at hudson.model.Run.execute(Run.java:1689)
10:16:05 	at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:43)
10:16:05 	at hudson.model.ResourceController.execute(ResourceController.java:88)
10:16:05 	at hudson.model.Executor.run(Executor.java:231)
10:16:05 Caused by: hudson.plugins.git.GitException: Command "git fetch --tags --progress ssh://jenk...@gerrit-slave.corp.appdynamics.com:29418/codebase.git +refs/heads/master:refs/remotes/origin/master" returned status code 143:
10:16:05 stdout: 
10:16:05 stderr: Killed by signal 15.
10:16:05 
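
For reading the last lines of the log: status code 143 follows the usual shell convention of 128 plus the signal number, which matches the "Killed by signal 15" on stderr. In other words, the fetch did not fail on its own; it was still running when the 60 minute timeout terminated it. A minimal sketch of that decoding, where the helper name `describe_exit_status` is made up for illustration:

```python
import signal


def describe_exit_status(status: int) -> str:
    """Decode a shell-style exit status (hypothetical helper, not part of Jenkins)."""
    if status > 128:
        # By shell convention, 128 + N means the process was ended by signal N.
        signum = status - 128
        try:
            name = signal.Signals(signum).name
        except ValueError:
            name = f"signal {signum}"
        return f"terminated by {name}"
    return f"exited with code {status}"


# 143 - 128 = 15, which is SIGTERM.
print(describe_exit_status(143))
```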

Christian Goetze

Jun 21, 2014, 4:49:47 PM
to repo-d...@googlegroups.com
I increased the MaxStartups setting for sshd on the Gerrit slave. My theory: about 20 builds start whenever replication takes place, and that burst of simultaneous fetches may cause sshd to deliberately drop connections, as documented under MaxStartups in the sshd_config manpage.

We'll have to see if that helps.
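
To make the MaxStartups theory concrete, here is a rough sketch of the "random early drop" behaviour that sshd_config(5) describes for the three-part `start:rate:full` form: below `start` unauthenticated connections nothing is dropped, at `start` a new connection is refused with probability `rate` percent, and the probability rises linearly until every connection is refused at `full`. The values 10:30:100 are illustrative assumptions only; the thread does not say what the server was actually configured with, and the function name is made up.

```python
def drop_probability(unauthenticated: int,
                     start: int = 10,
                     rate: int = 30,
                     full: int = 100) -> float:
    """Approximate chance that sshd refuses a new connection.

    Models the MaxStartups start:rate:full rule as described in
    sshd_config(5). The default arguments are assumed example values,
    not the settings from the server discussed in this thread.
    """
    if unauthenticated < start:
        return 0.0
    if unauthenticated >= full:
        return 1.0
    # Linear ramp from `rate` percent at `start` to 100 percent at `full`.
    percent = rate + (100 - rate) * (unauthenticated - start) / (full - start)
    return percent / 100.0


# Roughly 20 builds opening connections at once, as in the theory above.
for pending in (5, 10, 20, 50, 100):
    print(pending, round(drop_probability(pending), 3))
```

Under these assumed values, a burst of 20 connections that are all still unauthenticated would see a bit more than a third of new attempts refused, which is at least consistent with failures that are intermittent and not reproducible on the next build.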