Upgrade 1.598 -> 1.609.1 breaks jobs on 1 slave, revert doesn't fix

Ross Oliver

Jun 10, 2015, 6:16:14 PM
to jenkins...@googlegroups.com
Greetings,

I am running a Jenkins master and several slaves, all on Mac OS 10.10.  Yesterday I attempted an upgrade from 1.598 to 1.609.1, and immediately all jobs on one slave started failing, unable to check out from git.  The other slaves were unaffected and continued to operate normally.

After some basic troubleshooting, I decided I needed to get these jobs working again, so I reverted to 1.598.  I also had to manually restore the config.xml from the previous day, as the storage of node data in 1.609.1 was incompatible with 1.598.  The revert seemed successful, but the jobs on that one slave were still failing with the same errors.  The jobs would run fine if moved to another slave.  Still, other than the slave.jar file, I could not find anything on the slave that the upgrade/revert might have changed.

For troubleshooting, I created a very simple freestyle job that executed a single shell command, "echo this is here".  That job failed with the following log message (slave host name removed per company confidentiality policy):

Started by user ross_oliver
[EnvInject] - Loading node environment variables.
Building remotely on <slave host name> (java7 master2) in workspace <workspace path>
FATAL: command execution failed
java.io.IOException: Remote call on <slave host name> failed
	at hudson.remoting.Channel.call(Channel.java:760)
	at hudson.Launcher$RemoteLauncher.launch(Launcher.java:916)
	at hudson.Launcher$ProcStarter.start(Launcher.java:381)
	at hudson.tasks.CommandInterpreter.perform(CommandInterpreter.java:97)
	at hudson.tasks.CommandInterpreter.perform(CommandInterpreter.java:66)
	at hudson.tasks.BuildStepMonitor$1.perform(BuildStepMonitor.java:20)
	at hudson.model.AbstractBuild$AbstractBuildExecution.perform(AbstractBuild.java:761)
	at hudson.model.Build$BuildExecution.build(Build.java:199)
	at hudson.model.Build$BuildExecution.doRun(Build.java:160)
	at hudson.model.AbstractBuild$AbstractBuildExecution.run(AbstractBuild.java:536)
	at hudson.model.Run.execute(Run.java:1718)
	at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:43)
	at hudson.model.ResourceController.execute(ResourceController.java:89)
	at hudson.model.Executor.run(Executor.java:240)
Caused by: java.lang.NoSuchFieldError: NULL_OUTPUT_STREAM
	at hudson.Launcher$ProcStarter.<init>(Launcher.java:152)
	at hudson.Launcher.launch(Launcher.java:408)
	at hudson.Launcher$RemoteLaunchCallable.call(Launcher.java:1129)
	at hudson.Launcher$RemoteLaunchCallable.call(Launcher.java:1101)
	at hudson.remoting.UserRequest.perform(UserRequest.java:121)
	at hudson.remoting.UserRequest.perform(UserRequest.java:49)
	at hudson.remoting.Request$2.run(Request.java:324)
	at hudson.remoting.InterceptingExecutorService$1.call(InterceptingExecutorService.java:68)
	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
	at java.lang.Thread.run(Thread.java:745)
	at ......remote call to <slave host name>(Native Method)
	at hudson.remoting.Channel.attachCallSiteStackTrace(Channel.java:1356)
	at hudson.remoting.UserResponse.retrieve(UserRequest.java:221)
	at hudson.remoting.Channel.call(Channel.java:752)
	... 13 more
Build step 'Execute shell' marked build as failure
Finished: FAILURE

This job also succeeds on other slaves.  I've tried disconnecting and reconnecting the slave several times, and verified that the slave.jar file was removed and replaced each time.  I cleared the .jenkins jar cache on both the master and the slave.  A lot of Google searching hasn't turned up anything directly relevant.  I've run out of ideas as to where to look for what else might have changed.  Any suggestions for additional troubleshooting would be greatly appreciated.

Thanks,
Ross Oliver

Ross Oliver

Jun 11, 2015, 9:44:49 PM
to jenkins...@googlegroups.com
Root cause turned out to be an outdated commons-io jar I had installed for another project in the Jenkins user's ~/Library/Java/Extensions directory.  When I updated to the latest version, both Jenkins and the other project were happy.  The problem manifested at the time of the upgrade attempt because that was the first time the Jenkins slave on that host had been restarted since the installation of the outdated jar.
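
For anyone chasing a similar NoSuchFieldError, here is a rough sketch of how the stray jar could be tracked down. It is not something I ran during the original troubleshooting. It looks through Java extension directories for jars that bundle the commons-io NullOutputStream class and reports whether the text NULL_OUTPUT_STREAM appears in that class file. The directory list, the class path (inferred from the field name in the stack trace), and the function name find_jars_with_class are all assumptions for illustration, and the byte search is only a heuristic.

```python
import zipfile
from pathlib import Path

# Assumed class behind the NoSuchFieldError in the log (path inside a jar).
CLASS_ENTRY = "org/apache/commons/io/output/NullOutputStream.class"
FIELD_NAME = b"NULL_OUTPUT_STREAM"

# Assumed Java extension directories on Mac OS X; adjust for your hosts.
DEFAULT_EXT_DIRS = [
    Path.home() / "Library" / "Java" / "Extensions",
    Path("/Library/Java/Extensions"),
]


def find_jars_with_class(directories, class_entry=CLASS_ENTRY, field_name=FIELD_NAME):
    """Return (jar_path, has_field) for every jar that bundles class_entry."""
    results = []
    for directory in directories:
        directory = Path(directory)
        if not directory.is_dir():
            continue  # directory does not exist on this host
        for jar in sorted(directory.glob("*.jar")):
            try:
                with zipfile.ZipFile(jar) as archive:
                    if class_entry not in archive.namelist():
                        continue
                    class_bytes = archive.read(class_entry)
            except zipfile.BadZipFile:
                continue  # not a readable jar; skip it
            # Heuristic: a declared field's name is stored as text in the
            # class file, so a plain byte search hints at its presence.
            results.append((jar, field_name in class_bytes))
    return results


if __name__ == "__main__":
    for jar, has_field in find_jars_with_class(DEFAULT_EXT_DIRS):
        status = "has" if has_field else "LACKS"
        print(f"{jar}: {status} NULL_OUTPUT_STREAM")
```

Run as the Jenkins user on the affected slave. Any jar reported as lacking the field would be a candidate for shadowing the copy of commons-io that Jenkins expects.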