Remote Jenkins Node never brought offline


Alexandru Băluț

Apr 5, 2018, 12:06:22 PM
to jenkins...@googlegroups.com
I'm using Jenkins ver. 2.107.1 and I created a Node. See the screenshot below for the configuration details of the Node.

The node is launched by a "Launch command" that starts an expensive cloud instance, starts the agent remotely through ssh, and shuts the machine down once the agent process stops.

In the agent log I see "Agent successfully connected and online", but even though Availability is set to "Take this agent online when in demand, and offline when idle" and Idle delay is 1, the agent is never taken offline. I even explicitly marked the agent as offline by clicking the button on its Status page, but nothing changed.

I tried killing the "java -jar agent.jar" process, and the machine was shut down as expected, but then it was brought up again, even though the Build Queue is empty ("No builds in the queue").

Why does Jenkins keep bringing it up?

Thanks,
Alex







Alexandru Băluț

Apr 10, 2018, 9:40:57 AM
to jenkins...@googlegroups.com
On 5 April 2018 at 15:44, Alexandru Băluț <alexand...@gmail.com> wrote:
> I'm using Jenkins ver. 2.107.1 and I created a Node. See the screenshot below for the configuration details of the Node.




The problem I reported seems to be gone, but now there is another one. In the start-worker.sh script I have:

#!/bin/sh

I=$1
P=...
Z=...

# According to `gcloud compute instances start --help` this is sync.
gcloud compute --project $P instances start --zone $Z $I || exit 1

finish() {
  # Shutdown.
  echo 3 >> /tmp/x
  gcloud compute --project $P instances stop --zone $Z $I >> /tmp/x
}
trap finish EXIT

# "What Jenkins expects from your script is that, in the end, it has to execute
# the agent program like java -jar agent.jar, on the right computer, and have
# its stdin/stdout connect to your script's stdin/stdout."
# https://wiki.jenkins.io/display/JENKINS/Distributed+builds#Distributedbuilds-WriteyourownscripttolaunchJenkinsagents
echo 1 >> /tmp/x
gcloud compute --project $P ssh --zone $Z $I --command 'wget "http://10.132.0.20:8080/jnlpJars/agent.jar" -O agent.jar && java -jar agent.jar'

echo 2 >> /tmp/x



The agent starts fine and the job being run fails as expected, but then "echo 2" is never executed. Not even "echo 3" runs, although it should be executed when the script exits. It seems as if the script process is killed with kill -9. This is a problem because I cannot bring down the instance I started using the nice synchronous mechanism provided by "gcloud compute instances start/stop".
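One way to narrow down how the script dies is to log every trappable signal as well as the EXIT trap. The following is only a diagnostic sketch, not the original start-worker.sh: the log, finish and on_signal helpers and the LOG variable are names made up for illustration, and the command to run (for example the "gcloud compute ... ssh" line above) is assumed to be passed as arguments.

```shell
#!/bin/sh
# Diagnostic sketch, not the original start-worker.sh.
# LOG is a hypothetical variable; /tmp/x matches the scratch file used above.
LOG=${LOG:-/tmp/x}

log() {
  # Timestamp every entry so it can be matched against the Jenkins node log.
  echo "$(date '+%H:%M:%S') $*" >> "$LOG"
}

finish() {
  log "EXIT trap ran"
}

on_signal() {
  log "received SIG$1"
  exit 1   # leaving through exit still runs the EXIT trap
}

trap finish EXIT
for sig in HUP INT TERM; do
  trap "on_signal $sig" "$sig"
done

log "launcher started, pid $$"

# Run the agent command given as arguments, if any.
if [ "$#" -gt 0 ]; then
  "$@"
  log "agent command returned $?"
fi
```

If the log ends at "launcher started" with no signal entry and no "EXIT trap ran", that is consistent with the process being ended by SIGKILL, which cannot be trapped. It is not proof, though: the shell runs a trap only after the foreground command returns, so a signal that arrives while the ssh command is still running is handled late, or never if a SIGKILL follows.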

This is what I see in the node log:

[...]
Connection terminated
channel stopped
[04/10/18 13:13:47] Launching agent
$ /var/lib/jenkins/start-worker.sh instance-eval-worker-template
+ I=instance-eval-worker-template
+ P=...
+ Z=...
+ gcloud compute --project ... instances start --zone ... instance-eval-worker-template
Starting instance(s) instance-eval-worker-template...
.done.
Updated [https://www.googleapis.com/compute/v1/projects/.../instances/instance-eval-worker-template].
+ trap finish EXIT
+ echo 1
+ gcloud compute --project ... ssh --zone ... instance-eval-worker-template --command wget "http://10.132.0.20:8080/jnlpJars/agent.jar" -O agent.jar && java -jar agent.jar
Updating project ssh metadata...
.....................failed.
--2018-04-10 13:14:14--  http://10.132.0.20:8080/jnlpJars/agent.jar
Connecting to 10.132.0.20:8080... connected.
HTTP request sent, awaiting response... 200 OK
Length: 762466 (745K) [application/java-archive]
Saving to: ‘agent.jar’

     0K .......... .......... .......... .......... ..........  6%  102M 0s
   700K .......... .......... .......... .......... ....      100%  206M=0.006s

2018-04-10 13:14:14 (131 MB/s) - ‘agent.jar’ saved [762466/762466]

<===[JENKINS REMOTING CAPACITY]===>channel started
Remoting version: 3.17
This is a Unix agent
Evacuated stdout
Agent successfully connected and online
Connection terminated



Any idea what's going on? Should I file a bug or am I using it incorrectly?

Alexandru Băluț

Apr 18, 2018, 11:28:34 AM
to Jenkins Users
It seems the script is killed when the command-launcher plugin is done with the agent. I filed https://issues.jenkins-ci.org/browse/JENKINS-50842