Provision multiple instances of slaves with NodeProvisioner

yo...@ignissoft.com

Jul 16, 2014, 10:00:57 AM

to jenkin...@googlegroups.com

Hello,

I'm writing a Jenkins Cloud plug-in that should work similarly to the EC2 plug-in.
The sequence of events is as follows:
1. Jenkins decides that there are not enough slaves to run jobs.
2. Jenkins accesses a private resource manager system to reserve and launch a test set-up. The resource manager returns the IP address of the test set-up.
3. Jenkins launches a slave on the returned IP.
4. The job is executed on the slave.
5. Once the job is completed, Jenkins kills the slave and notifies the resource manager that the set-up is free.

I have extended Cloud, Slave, ComputerLauncher etc and I have a plugin that does all of the above.
However... Jenkins never provisions more than one slave from the resource manager cloud.

Looking into the code of NodeProvisioner.update, it seems that the condition
if (idle < MARGIN || needSomeWhenNoneAtAll) {
is never met once there is one slave in the pool.

When there are no slaves, needSomeWhenNoneAtAll == true and the if clause is executed.
When there is one slave, needSomeWhenNoneAtAll == false and idle > MARGIN so the if clause is not executed.

I've found out that MARGIN is set to 0.1 by default with the following code:
private static final float MARGIN = Integer.getInteger(NodeProvisioner.class.getName() + ".MARGIN", 10) / 100f;

I've changed this line to
private static final float MARGIN = Integer.getInteger(NodeProvisioner.class.getName() + ".MARGIN", 100) / 100f;
This forces idle < MARGIN, so the if clause always gets executed and Jenkins launches multiple slaves.

My questions are:
1. Why is MARGIN set to 0.1 by default, so that idle > MARGIN and Jenkins does not launch more than one slave?
2. How can I change this default behaviour so that Jenkins does not wait at all, i.e. as soon as it finds a waiting job it asks to provision a new slave?

Thanks,
Yoram

Stephen Connolly

Jul 16, 2014, 10:24:31 AM

to jenkin...@googlegroups.com

idle is based on a long term average.

There are two schools of thought when ramping up build resources.

You can optimize for maximum resource utilization

You can optimize for shortest waiting time

You cannot do both at the same time.

Since it is not instantaneous to start up a build node you can sometimes get a faster build if you just wait for one of the existing builds to complete... as a result firing up a build node for every job in the queue will end up wasting resources.

Similarly, the decommissioning strategy is likely to leave build nodes hanging around for a minute or two, so it would be good to have some work they could do rather than just throwing them away.

The default strategy in Jenkins is to try to keep the build queue length at approximately the number of build nodes. (Yes, this may shock you.) In other words, Jenkins wants there always to be a job ready for each node when it has finished its current work.

That strategy will give the maximum utilization of resources with a secondary minimum waiting time.

I intend in the near future (once I finish getting the scalability framework open sourced) to make the strategy pluggable, thus allowing people to, e.g.

Provision nodes while there is a demand for them - which will minimize waiting time at the expense of wasting resources

At present the trick is to play with MARGIN and MARGIN0 and the decay rate so that you get nodes provisioned faster in response to the exponentially weighted average number of idle nodes changing... but be warned, adjusting these values will cause idle nodes to get provisioned depending on the build load.
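To illustrate why the decay rate matters, here is a minimal sketch of an exponentially weighted average. It assumes the common form average = average * decay + sample * (1 - decay); the class name, method names and the decay values 0.9 and 0.2 are made up for illustration and are not taken from the Jenkins source. It counts how many updates it takes for a long-idle average of 1.0 to fall below a margin of 0.1 once the node becomes busy.

```java
public class IdleAverageSketch {

    /** One averaging step: keep `decay` of the old average, blend in the new sample. */
    static float update(float average, float sample, float decay) {
        return average * decay + sample * (1 - decay);
    }

    /**
     * Counts the updates needed until the average drops below the threshold.
     * Assumes sample < threshold and decay < 1, otherwise the loop would not end.
     */
    static int ticksUntilBelow(float start, float sample, float decay, float threshold) {
        float average = start;
        int ticks = 0;
        while (average >= threshold) {
            average = update(average, sample, decay);
            ticks++;
        }
        return ticks;
    }

    public static void main(String[] args) {
        // Hypothetical scenario: one node was idle for a long time (average 1.0),
        // then it becomes busy, so every new sample of "idle nodes" is 0.
        for (float decay : new float[] {0.9f, 0.2f}) {
            int ticks = ticksUntilBelow(1.0f, 0.0f, decay, 0.1f);
            System.out.println("decay=" + decay + " -> " + ticks + " updates until idle < 0.1");
        }
    }
}
```

In this sketch the slow decay (0.9) needs 22 updates before the average falls below 0.1, while the fast decay (0.2) needs 2, which is why a smaller decay value makes provisioning react sooner and also makes over-provisioning more likely.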

--
You received this message because you are subscribed to the Google Groups "Jenkins Developers" group.
To unsubscribe from this group and stop receiving emails from it, send an email to jenkinsci-de...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

yo...@ignissoft.com

Jul 19, 2014, 4:03:18 AM

to jenkin...@googlegroups.com

Thanks for the detailed response.

I will definitely wait for the updated mechanism, but meanwhile, I would like to have multiple instances even if sometimes they are wasted.

So what would be the best way to manipulate MARGIN and MARGIN0 without recompiling Jenkins (of course)?

Thanks,

Yoram

Stephen Connolly

Jul 19, 2014, 5:35:53 AM

to jenkin...@googlegroups.com

Just set the system properties on the JVM before you start Jenkins:

https://github.com/jenkinsci/jenkins/blob/8c4288ca59568e9c8d94022de7871e472f019a2f/core/src/main/java/hudson/slaves/NodeProvisioner.java#L373, e.g.

-Dhudson.slaves.NodeProvisioner.MARGIN=...
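As a rough sketch of how such a property is picked up, the snippet below reuses the Integer.getInteger expression quoted earlier in the thread. The class name and the value 50 are only illustrative, and System.setProperty stands in for passing the -D option on the JVM command line.

```java
public class MarginPropertySketch {

    // Property name that NodeProvisioner.class.getName() + ".MARGIN" evaluates to.
    static final String KEY = "hudson.slaves.NodeProvisioner.MARGIN";

    /** Same expression as quoted in the thread: an integer percentage turned into a fraction. */
    static float readMargin() {
        return Integer.getInteger(KEY, 10) / 100f;
    }

    public static void main(String[] args) {
        // No property set: falls back to 10, i.e. 0.1.
        System.out.println("default margin: " + readMargin());

        // Stand-in for starting the JVM with -Dhudson.slaves.NodeProvisioner.MARGIN=50
        // (50 is an illustrative value, not a recommendation).
        System.setProperty(KEY, "50");
        System.out.println("margin with MARGIN=50: " + readMargin());
    }
}
```

Because the real field is declared private static final, it is evaluated once when the class is loaded, which is why the property has to be set before Jenkins starts rather than changed at runtime.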

Ivan Kalinin

Jul 20, 2014, 2:48:24 PM

to jenkin...@googlegroups.com

Thumbs up for pluggable strategies!

Also, I volunteer to help if reasonable guidance is provided.

Nikolay Borisenko

Jun 4, 2015, 3:14:52 PM

to jenkin...@googlegroups.com

Guys, please advise what values I should set for these options to get as close as possible to the following behaviour: "Immediately provide a new node if all nodes are busy".

Thank you in advance.

Stephen Connolly

Jun 5, 2015, 5:13:51 AM

to jenkin...@googlegroups.com

You can get really aggressive provisioning with a fast clock decay rate and a high MARGIN and MARGIN0.

THE FOLLOWING ARE AT YOUR OWN RISK

-Dhudson.model.LoadStatistics.decay=0.2

-Dhudson.slaves.NodeProvisioner.MARGIN=50

-Dhudson.slaves.NodeProvisioner.MARGIN0=0.85

BE WARNED: the above values are VERY AGGRESSIVE and you will see over-provisioning.

If you use these with a cloud provider where you pay for provisioned nodes, e.g. EC2, then BE PREPARED FOR A BIG CREDIT CARD BILL.

If you want something less aggressive, perhaps start with

-Dhudson.model.LoadStatistics.decay=0.7

-Dhudson.slaves.NodeProvisioner.MARGIN=30

-Dhudson.slaves.NodeProvisioner.MARGIN0=0.6

Which may still result in over-provisioning, but should be less likely.

Over-provisioning is when you get nodes provisioned and then they sit there idle because by the time they are provisioned the job they were provisioned for has completed already.

With Jenkins versions prior to 1.607 you can even see many multiples of nodes provisioned for the same job.

This issue is further compounded by almost every single cloud implementation doing exactly the wrong thing: adding the node to Jenkins from the Callable<Node> rather than leaving that as the responsibility of NodeProvisioner, which is what the javadoc says they are supposed to do. As a result, the node is counted as provisioned but not available, and hence another node needs to be provisioned.


Nikolay Borisenko

Jun 5, 2015, 7:50:41 AM

to jenkin...@googlegroups.com

Stephen, thank you for the detailed answer! Your values work perfectly, as I expected.
