| We intermittently see nodes fail during Git checkout with a stack trace that looks like this:

09:45:25 java.lang.NoClassDefFoundError: Could not initialize class jenkins.model.Jenkins$MasterComputer
09:45:25     at org.jenkinsci.plugins.gitclient.AbstractGitAPIImpl.withRepository(AbstractGitAPIImpl.java:29)
09:45:25     at org.jenkinsci.plugins.gitclient.CliGitAPIImpl.withRepository(CliGitAPIImpl.java:71)
09:45:25     at jdk.internal.reflect.GeneratedMethodAccessor51.invoke(Unknown Source)
09:45:25     at jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
09:45:25     at java.lang.reflect.Method.invoke(Method.java:564)
09:45:25     at hudson.remoting.RemoteInvocationHandler$RPCRequest.perform(RemoteInvocationHandler.java:922)
09:45:25     at hudson.remoting.RemoteInvocationHandler$RPCRequest.call(RemoteInvocationHandler.java:896)
09:45:25     at hudson.remoting.RemoteInvocationHandler$RPCRequest.call(RemoteInvocationHandler.java:853)
09:45:25     at hudson.remoting.UserRequest.perform(UserRequest.java:207)
09:45:25     at hudson.remoting.UserRequest.perform(UserRequest.java:53)
09:45:25     at hudson.remoting.Request$2.run(Request.java:358)
09:45:25     at hudson.remoting.InterceptingExecutorService$1.call(InterceptingExecutorService.java:72)
09:45:25     at java.util.concurrent.FutureTask.run(FutureTask.java:264)
09:45:25     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1167)
09:45:25     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:641)
09:45:25     at hudson.remoting.Engine$1$1.run(Engine.java:98)
09:45:25     at java.lang.Thread.run(Thread.java:844)

Full log: https://ci.pytorch.org/jenkins/job/pytorch-builds/job/pytorch-macos-10.13-py3-build-test/2797//console

Retrying the build does not resolve the problem; however, subsequent builds on the same node often do succeed (in the case above, another build succeeded three hours later).
The error is highly reminiscent of https://issues.jenkins-ci.org/browse/JENKINS-19453, but that issue was fixed in the Jenkins 1.x series, and we are running a much more recent version of Jenkins. Additionally, the error doesn't seem to be persistent: restarting the worker is not necessary to resolve the problem. This is also not just an OS X worker problem; we've had it happen on Linux workers too (although the missing class is different): https://ci.pytorch.org/jenkins/job/pytorch-builds/job/pytorch-linux-trusty-py2.7.9-build/3214/console I'm not really sure how to go about making a reproducing test case. Let me know if you have any ideas. |