Node getting CancellationException during startup

Sverre Moe

Feb 20, 2019, 5:20:36 AM
to Jenkins Users
I am having problems connecting to some nodes; the launch fails with a CancellationException.

    [02/20/19 11:15:03] [SSH] Starting sftp client.
    [02/20/19 11:15:03] [SSH] Copying latest remoting.jar...
    [02/20/19 11:15:03] [SSH] Copied 789,283 bytes.
    Expanded the channel window size to 4MB
    [02/20/19 11:15:03] [SSH] Starting agent process: cd "/home/build/jenkins" && /usr/java/latest/bin/java  -jar remoting.jar -workDir /home/build/jenkins
    Feb 20, 2019 10:43:39 AM org.jenkinsci.remoting.engine.WorkDirManager initializeWorkDir
    INFO: Using /home/build/jenkins/remoting as a remoting work directory
    Both error and output logs will be printed to /home/build/jenkins/remoting
    ERROR: null
    java.util.concurrent.CancellationException
        at java.util.concurrent.FutureTask.report(FutureTask.java:121)
        at java.util.concurrent.FutureTask.get(FutureTask.java:192)
        at hudson.plugins.sshslaves.SSHLauncher.launch(SSHLauncher.java:902)
        at hudson.slaves.SlaveComputer$1.call(SlaveComputer.java:294)
        at jenkins.util.ContextResettingExecutorService$2.call(ContextResettingExecutorService.java:46)
        at jenkins.security.ImpersonatingExecutorService$2.call(ImpersonatingExecutorService.java:71)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)
    [02/20/19 10:47:05] Launch failed - cleaning up connection
    Slave JVM has not reported exit code. Is it still running?
    [02/20/19 10:47:05] [SSH] Connection closed.

I am not sure what this is about. I have plenty of other nodes that work fine, but not these:
SUSE Linux Enterprise Server 11 SP3 x32
SUSE Linux Enterprise Server 11 SP3 x64
Both the Jenkins instance and the nodes are running the same version of Java, Oracle JDK 8u102.

Could it have something to do with the build user's home directory being mounted from a network location?

Ivan Fernandez Calvo

Feb 20, 2019, 2:14:39 PM
to Jenkins Users
Check the date and time on both hosts. The log is not consistent: there is a clock difference of more than an hour.

Sverre Moe

Feb 21, 2019, 8:21:51 AM
to Jenkins Users
That was my fault. I forgot some lines and had pasted them in from a different attempt.


    [02/21/19 14:16:46] [SSH] Starting sftp client.
    [02/21/19 14:16:47] [SSH] Copying latest remoting.jar...
    [02/21/19 14:16:47] [SSH] Copied 789,283 bytes.
    Expanded the channel window size to 4MB
    [02/21/19 14:16:47] [SSH] Starting agent process: cd "/home/build/jenkins" && /usr/java/latest/bin/java -jar remoting.jar -workDir /home/build/jenkins
    Feb 21, 2019 2:16:48 PM org.jenkinsci.remoting.engine.WorkDirManager initializeWorkDir
    INFO: Using /home/build/jenkins/remoting as a remoting work directory
    Both error and output logs will be printed to /home/build/jenkins/remoting
    ERROR: null
    java.util.concurrent.CancellationException
        at java.util.concurrent.FutureTask.report(FutureTask.java:121)
        at java.util.concurrent.FutureTask.get(FutureTask.java:192)
        at hudson.plugins.sshslaves.SSHLauncher.launch(SSHLauncher.java:902)
        at hudson.slaves.SlaveComputer$1.call(SlaveComputer.java:294)
        at jenkins.util.ContextResettingExecutorService$2.call(ContextResettingExecutorService.java:46)
        at jenkins.security.ImpersonatingExecutorService$2.call(ImpersonatingExecutorService.java:71)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)
    [02/21/19 14:20:16] Launch failed - cleaning up connection
    Slave JVM has not reported exit code. Is it still running?
    [02/21/19 14:20:16] [SSH] Connection closed.




Sverre Moe

Feb 21, 2019, 8:23:46 AM
to Jenkins Users
It tries again and again, but fails with the same error each time. Each time, I am left with two new instances of java still running on the node, and they pile up.

    build    32256  0.1  0.3 5774136 50148 ?       Sl   13:51   0:02 /usr/java/latest/bin/java -jar remoting.jar -workDir /home/build/jenkins
    build    32688  0.0  0.0  12912  1772 ?        Ss   14:13   0:00 bash -c cd "/home/build/jenkins" && /usr/java/latest/bin/java  -jar remoting.jar -workDir /home/build/jenkins

kuisathaverat

Feb 21, 2019, 9:10:03 AM
to jenkins...@googlegroups.com
There are about 4 minutes between the start of the connection and its termination: the agent never opens the channel and is killed when the ping thread checks the connection. Judging by the log, it is an old version of ssh-slaves-plugin. Which version is it? Which Jenkins core version? Which authentication method do you use on those agents? You need to gather more information about the issue; see https://github.com/jenkinsci/ssh-slaves-plugin/blob/master/doc/TROUBLESHOOTING.md#common-info-needed-to-troubleshooting-a-bug

> Could it have something to do with the build user's home directory being mounted from a network location?

It does copy remoting.jar, but it is possible. You can try changing the root FS in the agent configuration to /tmp and test again.
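
For readers puzzled by the stack trace itself: the top frames show `FutureTask.get` throwing `CancellationException`, which is what the JDK does when `get()` is called on a future that has already been cancelled. The following is only a minimal, self-contained sketch of that JDK behavior. The class name `CancelledLaunchDemo`, the sleeping task, and the direct `cancel(true)` call are hypothetical stand-ins; they are not the plugin's code, and what actually cancels the launch inside Jenkins is not shown here.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.CancellationException;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

public class CancelledLaunchDemo {

    /** Returns the simple name of whatever get() throws after the future is cancelled. */
    public static String outcomeOfCancelledGet() {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        try {
            // Hypothetical stand-in for an agent launch that never finishes.
            Callable<Boolean> stuckLaunch = () -> {
                TimeUnit.HOURS.sleep(1);
                return true;
            };
            Future<Boolean> launch = pool.submit(stuckLaunch);

            // Hypothetical stand-in for whatever gives up on the launch.
            launch.cancel(true);

            try {
                launch.get(); // a cancelled future never yields a result
                return "completed";
            } catch (CancellationException | InterruptedException | ExecutionException e) {
                return e.getClass().getSimpleName();
            }
        } finally {
            pool.shutdownNow(); // do not leave the worker thread behind
        }
    }

    public static void main(String[] args) {
        System.out.println(outcomeOfCancelledGet());
    }
}
```

In other words, the exception only says that something cancelled the launch while it was being waited on; it does not say why the agent failed to open the channel, which is why the additional information requested above is needed.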

--
You received this message because you are subscribed to a topic in the Google Groups "Jenkins Users" group.
To unsubscribe from this topic, visit https://groups.google.com/d/topic/jenkinsci-users/A7kFTrmdEr8/unsubscribe.
To unsubscribe from this group and all its topics, send an email to jenkinsci-use...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/jenkinsci-users/98d7a97f-f611-4c4c-8878-abd2ac2f544f%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Björn Pedersen

Feb 21, 2019, 9:22:31 AM
to Jenkins Users

We have setups with more than one agent on one host, but they do not share the root. Both agents hold a lock on <root>/remoting/remoting.log.0.lck

So you would probably need to ensure that each remoting instance uses a different root dir (do not share it with other nodes).

Björn
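
To illustrate why a shared root can be a problem, here is a rough sketch of two "agents" trying to take an exclusive lock on `<root>/remoting/remoting.log.0.lck`. It assumes the lock behaves like an ordinary `java.nio` file lock; how remoting actually creates and handles this lock file is not confirmed by this thread. The class and method names are hypothetical, and both "agents" run inside one JVM here, whereas the real agents are separate processes (in that case `tryLock()` would return `null` rather than throw).

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.channels.FileChannel;
import java.nio.channels.FileLock;
import java.nio.channels.OverlappingFileLockException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class SharedRootLockDemo {

    /** Tries to lock <root>/remoting/remoting.log.0.lck; null means it is already locked. */
    static FileLock tryLock(Path root) throws IOException {
        Path dir = Files.createDirectories(root.resolve("remoting"));
        FileChannel channel = FileChannel.open(dir.resolve("remoting.log.0.lck"),
                StandardOpenOption.CREATE, StandardOpenOption.WRITE);
        FileLock lock = null;
        try {
            lock = channel.tryLock(); // null if another process holds the lock
        } catch (OverlappingFileLockException e) {
            // thrown when this same JVM already holds a lock on the file
        }
        if (lock == null) {
            channel.close();
        }
        return lock;
    }

    /** Returns true if a second agent can lock its log file while the first holds its own. */
    public static boolean secondAgentGetsLock(boolean sharedRoot) {
        try {
            // Temporary directories stand in for the agents' root dirs (not cleaned up here).
            Path rootA = Files.createTempDirectory("agent-a");
            Path rootB = sharedRoot ? rootA : Files.createTempDirectory("agent-b");
            FileLock first = tryLock(rootA);
            FileLock second = tryLock(rootB);
            boolean acquired = second != null;
            // Closing a channel releases the lock taken through it.
            if (second != null) second.channel().close();
            if (first != null) first.channel().close();
            return acquired;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("shared root:    " + secondAgentGetsLock(true));
        System.out.println("separate roots: " + secondAgentGetsLock(false));
    }
}
```

With a shared root the second attempt cannot get the lock, while separate roots never contend for the same file, which is the point of giving every node its own root dir.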

kuisathaverat

Feb 21, 2019, 9:27:30 AM
to jenkins...@googlegroups.com
Yep, I usually recommend not sharing the user either.
