[JIRA] (JENKINS-42811) Vsphere Cloud plugin stops working after a number of builds


jorge.pena@king.com (JIRA)

Mar 15, 2017, 12:39:02 PM
to jenkinsc...@googlegroups.com
Jorge Peña updated an issue
 
Jenkins / Bug JENKINS-42811
Vsphere Cloud plugin stops working after a number of builds
Change By: Jorge Peña
Summary: "On Demand calculateMaxAdditionalSlavesPermitted broken" changed to "Vsphere Cloud plugin stops working after a number of builds"
 
This message was sent by Atlassian JIRA (v7.3.0#73011-sha1:3c73d0e)

pjdarton@gmail.com (JIRA)

Sep 24, 2019, 6:53:03 AM
to jenkinsc...@googlegroups.com
pjdarton commented on Bug JENKINS-42811
 
Re: Vsphere Cloud plugin stops working after a number of builds

First, is this still a problem with the current version (2.20) of the plugin? If not, please close the issue.

If it is... when you say "the following error appears in the main Jenkins.log file", what error do you mean? The log shows no errors, only "INFO" and "WARNING".

As for your ingenious workaround...

  • creating a new vSphereCloud instance and passing in an instanceCap of 0 will result in an instanceCap of Integer.MAX_VALUE (2147483647); that's expected. It's ugly, sure, but it is expected. There's a lot of code in this plugin that's evolved over time and is "just about good enough" - if folks were to rewrite it today, coding against Jenkins as it is right now, it could look a lot neater, but this is old code, so it's got a few wrinkles.
  • By repeatedly re-creating the templates, you're defeating the plugin's ability to track what VMs exist, which may well result in duplicate slave/VM names and may cause a memory leak; that shouldn't be necessary and, as far as I am aware (as "it works for me"), is not necessary.
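
The instanceCap point in the first bullet can be illustrated with a minimal sketch. This is not the plugin's source; the function name is hypothetical, and it assumes that only a configured value of 0 is treated as "no cap", since that is the only case described above.

```python
# Illustrative only: mirrors the behaviour described in this thread,
# not the vsphere-cloud plugin's actual code.
JAVA_INTEGER_MAX = 2**31 - 1  # Integer.MAX_VALUE, i.e. 2147483647


def effective_instance_cap(configured_cap: int) -> int:
    """Return the cap that is actually enforced.

    Assumption: a configured cap of 0 means "no cap" and is stored as
    Integer.MAX_VALUE; any other value is used unchanged.
    """
    return JAVA_INTEGER_MAX if configured_cap == 0 else configured_cap


print(effective_instance_cap(0))   # 2147483647
print(effective_instance_cap(50))  # 50
```

Seen this way, 2147483647 in the configuration is just the stored form of "unlimited" rather than a sign of corruption.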

jorge.pena@king.com (JIRA)

Jan 10, 2020, 9:36:02 AM
to jenkinsc...@googlegroups.com

Hi pjdarton, thank you for replying.

Yes, when I wrote "error" I was referring to the warning printed in the post above. After creating the issue I realised that the value 2147483647 is just cosmetic.

We rely on this plugin for creating and destroying Jenkins agents on demand. This issue keeps happening with newer versions of the plugin, so we run a job periodically in our Jenkins instance to work around it. We haven't experienced memory issues. We did have issues with VMs with duplicated names, but I am not sure they were related to this workaround. We developed a small script that destroys the virtual machines on VMware that aren't present in Jenkins; after all, these are linked clones and are simply re-created.
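
The clean-up script itself is not shown in this thread. As a rough sketch of the idea only: given the VM names reported by vSphere and the agent names known to Jenkins, the orphans are the difference between the two. The function name, its parameters, and the prefix filter are assumptions added for illustration; how the two name lists are obtained is left out.

```python
from typing import Iterable, List


def find_orphan_vms(
    vsphere_vm_names: Iterable[str],
    jenkins_agent_names: Iterable[str],
    prefix: str,
) -> List[str]:
    """Return VM names that exist in vSphere but are unknown to Jenkins.

    Assumption: only VMs whose name starts with `prefix` were created by
    the cloud template, so everything else is left alone.
    """
    known = set(jenkins_agent_names)
    return sorted(
        name
        for name in vsphere_vm_names
        if name.startswith(prefix) and name not in known
    )


# Hypothetical inventory: one agent is still registered, one VM is orphaned.
vms = ["sod-xxx-011", "sod-xxx-012", "db-server"]
agents = ["sod-xxx-011"]
print(find_orphan_vms(vms, agents, "sod-xxx-01"))  # ['sod-xxx-012']
```

Restricting the candidates by prefix is a safety choice in this sketch, so that VMs unrelated to the Jenkins cloud are never considered for deletion.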

Currently we are running:

Jenkins version: 2.150.3
vsphere-cloud plugin: 2.21

This is an excerpt from the log at a point where it should have created a new Jenkins agent (the last one, since we set a hard limit of 50 agents) but didn't:

Jan 10, 2020 3:11:32 PM INFO org.jenkinsci.plugins.vSphereCloud provision
provision(xcode-10.1,20): Provisioning 0 new =[]
Jan 10, 2020 3:11:32 PM INFO org.jenkinsci.plugins.vSphereCloud calculateMaxAdditionalSlavesPermitted
There are 49 VMs in this cloud. The instance cap for the cloud is 50, so we have room for more
Jan 10, 2020 3:11:32 PM INFO org.jenkinsci.plugins.vSphereCloud provision
provision(xcode-11.1,6): 0 existing slaves (=0 executors), templates available are [Template[prefix=sod-xxx-01, provisioned=[sod-xxx-011, sod-xxx-0110, sod-xxx-012, sod-xxx-013, sod-xxx-014, sod-xxx-015, sod-xxx-016, sod-xxx-017, sod-xxx-018, sod-xxx-019], planned=[], unwanted={}, max=10, fullness=100.000%]]

That piece of log just keeps repeating over and over. After running the job with the mentioned workaround it starts creating agents again.

Regards,

pjdarton@gmail.com (JIRA)

Jan 10, 2020, 12:41:03 PM
to jenkinsc...@googlegroups.com
pjdarton commented on Bug JENKINS-42811

According to that log message, you've set a max=10 for instances from that template.
So, unless you've got other templates defined on that cloud, you're never going to reach 50 as that template is capped at 10 - that's why it's saying it's 100% full when it's got 10 instances provisioned with max=10.

You either need to allow that template to spawn an unlimited number of instances (capped only by the cloud instance cap), or to spawn a larger number of instances, or to define other templates such that the sum of all templates' max fields comes to at least the cloud's instance cap.
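
The arithmetic behind this can be sketched as follows. This is an illustration of the reasoning in this thread, not the plugin's implementation; the class, field, and function names are hypothetical, and the numbers are taken from the log excerpt above (49 VMs in the cloud, cloud cap 50, template max=10 with 10 provisioned).

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Template:
    prefix: str
    provisioned: int    # instances currently provisioned from this template
    max_instances: int  # the template's own cap ("max" in the log line)


def room_for_template(template: Template, vms_in_cloud: int, cloud_cap: int) -> int:
    """How many more instances this template may create.

    Both limits apply, so the answer is the smaller of the room left in
    the cloud and the room left in the template.
    """
    cloud_room = max(0, cloud_cap - vms_in_cloud)
    template_room = max(0, template.max_instances - template.provisioned)
    return min(cloud_room, template_room)


def templates_can_reach_cloud_cap(templates: List[Template], cloud_cap: int) -> bool:
    """True if the templates' caps add up to at least the cloud's cap."""
    return sum(t.max_instances for t in templates) >= cloud_cap


full = Template(prefix="sod-xxx-01", provisioned=10, max_instances=10)
print(room_for_template(full, vms_in_cloud=49, cloud_cap=50))  # 0
print(templates_can_reach_cloud_cap([full], cloud_cap=50))     # False
```

The cloud still has room for one more VM, but the template has none, so nothing is provisioned for that label.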

jorge.pena@king.com (JIRA)

Jan 12, 2020, 3:00:03 PM
to jenkinsc...@googlegroups.com
jpena commented on Bug JENKINS-42811

We have other templates defined on the cloud.

However, even with no cap defined globally or in the template, it still sometimes stops creating instances until the script is executed.

pjdarton@gmail.com (JIRA)

Jan 13, 2020, 5:53:03 AM
to jenkinsc...@googlegroups.com
pjdarton commented on Bug JENKINS-42811

Well, for the scenario you've given logs for, the behaviour is working-as-designed - it was told not to provision more than 10 of that template so that's what it did. It only broke that limit after you killed off the old cloud and replaced it with one with no memory of the old instances.

If you can reproduce the scenario where it's "not creating more instances" when there is "no cap" (and provide logs & other information) then that'd make the problem more solvable.
