Pipeline node scheduling options


John Calsbeek

Oct 28, 2016, 11:06:15 PM
to Jenkins Users
We have a problem trying to get more control over how node() decides which node to allocate an executor on. Specifically, we have a pool of nodes with a specific label, all of which are capable of executing a given task, but with a strong preference to run each task on the same node that ran it before. (Note that these tasks are simply different pieces of code within a single pipeline, running in parallel.) This is what Jenkins does normally, at job granularity, but as JENKINS-36547 says, all tasks scheduled from any given pipeline are given the same hash, which means that the load balancer has no idea which tasks should be assigned to which node. In our situation, only a single pipeline ever assigns jobs to this pool of nodes.

So far we have worked around the issue by assigning a distinct label to each and every node in the pool in question, but this introduces a new problem: if any node in that pool goes down for any reason, its task will not be reassigned to any other node, and the whole pipeline will hang or time out.

We have worked around that by assigning each task to "my-node-pool-# || my-node-pool-fallback", where my-node-pool-fallback is a label which contains a few standby nodes, so that if one of the primary nodes goes down the pipeline as a whole can still complete. It will be slower (these tasks can take two to ten times longer when not running on the same node they ran last time), but it will at least complete.
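To make the setup concrete, here is a minimal sketch of that workaround as a scripted pipeline (label names, task count, and `run-task.sh` are hypothetical stand-ins, not our actual configuration):

```groovy
// Each task is pinned to "its" node by a per-node label, with a shared
// fallback label so the pipeline can still complete if that node is down.
def tasks = [:]
for (int i = 1; i <= 4; i++) {
    def idx = i  // capture the loop variable for the closure
    tasks["task-${idx}"] = {
        node("my-node-pool-${idx} || my-node-pool-fallback") {
            // runs fastest when it lands on the same node as last time
            // (warm caches, incremental state)
            sh "./run-task.sh ${idx}"
        }
    }
}
parallel tasks
```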

Unfortunately, the label expression doesn't actually mean "first try to schedule on the first node in the OR, then use the second one if the first one is not available." Instead, there will usually be some tasks that schedule on a fallback node even if the node they are "assigned" to is still available. As a result, almost every run of this pipeline ends up taking the worst-case time: it is likely that some task will wander away from its assigned node to run on a fallback, which leads the fallback nodes to be over-scheduled and leaves other nodes sitting idle.

The question is: what are our options? One hack we've considered is attempting to game the scheduler by using sleep()s: initially schedule all the fallback nodes with a task that does nothing but sleep(), then schedule all our real tasks (which will now go to their assigned machines whenever possible, because the fallback nodes are busy sleeping), and finally let the sleeps complete so that any tasks which couldn't execute on their assigned machines now execute on the fallbacks. A better solution would probably be to create a LoadBalancer plugin that codifies this somehow: preferentially scheduling tasks only on their assigned label, scheduling on fallbacks only after 30 seconds or a minute.
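The sleep() hack might look roughly like this (a hypothetical sketch only; we have not validated that the scheduler actually behaves this way, and it assumes one blocker branch per fallback node):

```groovy
// Occupy the fallback nodes with short sleeps so the real tasks are first
// offered to their assigned nodes; once the sleeps finish, any task still
// queued can migrate to a fallback.
def branches = [:]
['my-fallback-1', 'my-fallback-2'].each { name ->
    branches["block-${name}"] = {
        node(name) { sleep time: 60, unit: 'SECONDS' }
    }
}
for (int i = 1; i <= 4; i++) {
    def idx = i
    branches["task-${idx}"] = {
        node("my-node-pool-${idx} || my-node-pool-fallback") {
            sh "./run-task.sh ${idx}"
        }
    }
}
parallel branches
```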

Is anyone out there dealing with similar issues, or know of a solution that I have overlooked?

Thanks,
John Calsbeek

Michael Lasevich

Oct 29, 2016, 12:14:53 AM
to Jenkins Users
Is there a way to reduce the need for tasks to run on the same slave? I suspect the issue is having data from the last run; if that is the case, is there any shared storage solution that might reduce the time difference? If you can reduce the need to bind tasks to specific nodes, you bypass your entire headache.

As for your other approaches, 

A minor point: since a node can have multiple labels, your nodes can have individual labels AND a shared label, meaning your fallback pool can be shared among all the existing nodes.

But more to the point, if your main issue is that you are worried a node may be unavailable, you might consider some automatic node allocation. I am not sure if there are other examples, but the AWS node allocation, for instance, can automatically allocate a new node if no executors are available for a label. That may be a decent backup strategy. If you are not using AWS, look for another node-provisioning plugin that fits; if there is none, look at how they do it and write your own plugin.

But maybe I am overthinking it. In the end, if your primary concern is that a node may be down, remember that a pipeline is Groovy code, and that Groovy code has access to the Jenkins API and internals. You can write some code that checks the state of the slaves and selects a label to use before you even get to the node() statement. Sure, that will not fix the issue of a node going down in the middle of a job, but it may catch the job before it assigns a task to a dead node.
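A rough sketch of that idea, for a scripted pipeline (node and label names are hypothetical; this touches Jenkins internals, so it needs script approval or a trusted shared library):

```groovy
// Query the Jenkins API for the preferred node's status before calling
// node(), and fall back to the shared pool label if it is offline.
@NonCPS
boolean isNodeOnline(String nodeName) {
    def computer = jenkins.model.Jenkins.getInstance().getComputer(nodeName)
    return computer != null && computer.isOnline()
}

def label = isNodeOnline('my-node-pool-3') ? 'my-node-pool-3'
                                           : 'my-node-pool-fallback'
node(label) {
    sh './run-task.sh 3'
}
```

As noted below, there is a window between the check and the allocation in which the node can still go down, so this narrows the race rather than eliminating it.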

Alternatively, you can simply write another job, in lieu of a plugin, that scans all your tasks and nodes and, if it detects a node down with a task waiting for it, assigns that label to another node from the "standby" pool.

I realize all of this sounds hacky. The first and foremost task should really be to figure out whether you can bypass the problem in the first place.

-M

John Calsbeek

Oct 29, 2016, 12:36:49 AM
to Jenkins Users
On Friday, October 28, 2016 at 9:14:53 PM UTC-7, Michael Lasevich wrote:
Is there a way to reduce the need for tasks to run on the same slave? I suspect the issue is having data from the last run; if that is the case, is there any shared storage solution that might reduce the time difference? If you can reduce the need to bind tasks to specific nodes, you bypass your entire headache.

Shared storage is a potential option, yes, but the tasks in question are currently not very fault-tolerant when it comes to network hitches.
 
A minor point is that you may consider that since a node can have multiple labels, your nodes can have individual labels AND a shared label - meaning your fallback can be shared among all the existing nodes.

Yes, we do this in some cases already. But the core issue remains (a task that happens to schedule on a fallback node is not running on its preferred node).
 
But more to the point, if your main issue is that you are worried a node may be unavailable, you might consider some automatic node allocation. I am not sure if there are other examples, but the AWS node allocation, for instance, can automatically allocate a new node if no executors are available for a label. That may be a decent backup strategy. If you are not using AWS, look for another node-provisioning plugin that fits; if there is none, look at how they do it and write your own plugin.

Assuming that we have a fixed amount of computing resources, does this have any advantage over writing a LoadBalancer plugin?
 
But maybe I am overthinking it. In the end, if your primary concern is that a node may be down, remember that a pipeline is Groovy code, and that Groovy code has access to the Jenkins API and internals. You can write some code that checks the state of the slaves and selects a label to use before you even get to the node() statement. Sure, that will not fix the issue of a node going down in the middle of a job, but it may catch the job before it assigns a task to a dead node.

Ah, that's an interesting idea. Something that I forgot to mention in the original post is that if there was a node() function that allocates with a timeout, that would also be a building block that we could use to fix this problem. (If attempting to allocate a specific node fails with a timeout, then schedule on a fallback. timeout() doesn't work because that would apply the timeout to the task as well, not merely to the attempt to allocate the node.) We could indeed query the status of nodes directly. I have a niggling doubt that it would be possible to do this without a race condition (what if the node goes down between querying its status and scheduling on it?), but it's definitely something worth investigating.
 
Alternatively, you can simply write another job, in lieu of a plugin, that will scan all your tasks and nodes and if it detects a node down and a task waiting for it, assign the label to another node from the "standby" pool

This is an idea that we had considered, yeah, although I was considering it as a first step in the pipeline before scheduling, which made me nervous about race conditions. But if, as you suggest, it was a frequently run job which is always attempting to set up node allocations… that could definitely work. Good suggestion, thanks!

Thanks,
John Calsbeek

Michael Lasevich

Oct 29, 2016, 12:58:33 AM
to Jenkins Users


On Friday, October 28, 2016 at 9:36:49 PM UTC-7, John Calsbeek wrote:

Shared storage is a potential option, yes, but the tasks in question are currently not very fault-tolerant when it comes to network hitches.

Well, it would pay to make them more fault-tolerant :-) But even if you do not fix the process, you do not have to run the task from shared storage; just use it as storage. With a node-local mirror, you can rsync the data down from shared storage, run the task, then rsync it back. Assuming your data does not change much (which I understand is not always the case), you will soon have a relatively recent rsync copy on every node, reducing the amount of data moved. It may or may not work in your case, but it is something to consider.
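Something like this, as a pipeline stage (the `storage:` host, paths, and script name are all hypothetical):

```groovy
// Node-local mirror pattern: sync state down from shared storage, run the
// task locally, then sync results back. After a few runs, every node holds
// a near-current copy, so only deltas move over the network.
node('my-node-pool') {
    sh 'rsync -a --delete storage:/shared/task-3/ ./state/'
    sh './run-task.sh 3'
    sh 'rsync -a ./state/ storage:/shared/task-3/'
}
```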
 
 
 
But more to the point, if your main issue is that you are worried a node may be unavailable, you might consider some automatic node allocation. I am not sure if there are other examples, but the AWS node allocation, for instance, can automatically allocate a new node if no executors are available for a label. That may be a decent backup strategy. If you are not using AWS, look for another node-provisioning plugin that fits; if there is none, look at how they do it and write your own plugin.

Assuming that we have a fixed amount of computing resources, does this have any advantage over writing a LoadBalancer plugin?

If you are allocating your nodes instead of pre-creating them, you do not have to keep a big shared pool; specific nodes are allocated with the same label only as needed, and as dead nodes are decommissioned, their resources can rejoin the pool of available resources. Of course, given the affinity requirement, just using them all as a pool is probably easier.

 
But maybe I am overthinking it. In the end, if your primary concern is that a node may be down, remember that a pipeline is Groovy code, and that Groovy code has access to the Jenkins API and internals. You can write some code that checks the state of the slaves and selects a label to use before you even get to the node() statement. Sure, that will not fix the issue of a node going down in the middle of a job, but it may catch the job before it assigns a task to a dead node.

Ah, that's an interesting idea. Something that I forgot to mention in the original post is that if there was a node() function that allocates with a timeout, that would also be a building block that we could use to fix this problem. (If attempting to allocate a specific node fails with a timeout, then schedule on a fallback. timeout() doesn't work because that would apply the timeout to the task as well, not merely to the attempt to allocate the node.) We could indeed query the status of nodes directly. I have a niggling doubt that it would be possible to do this without a race condition (what if the node goes down between querying its status and scheduling on it?), but it's definitely something worth investigating.

I am wondering if you can do some weird combination of parallel + sleep + failFast + try/catch to emulate a timeout for a specific task.
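Spelling that combination out, it might look roughly like the following. This is entirely hypothetical and untested against real Jenkins scheduling semantics (labels and the script name are stand-ins), but the shape is: race the allocation against a timer, and let failFast abort the losing branch.

```groovy
// Emulate a per-allocation timeout: if node() has not started work within
// 60 seconds, the timer branch errors out, failFast aborts the waiting
// allocation, and the catch block reschedules on the fallback pool.
def gotNode = false
try {
    parallel(
        failFast: true,
        allocate: {
            node('my-node-pool-3') {
                gotNode = true
                sh './run-task.sh 3'
            }
        },
        timer: {
            sleep time: 60, unit: 'SECONDS'
            if (!gotNode) {
                error 'allocation timed out'
            }
        }
    )
} catch (err) {
    if (!gotNode) {
        node('my-node-pool-fallback') { sh './run-task.sh 3' }
    }
}
```

Note one wrinkle with this sketch: if the task finishes quickly, the parallel step still waits out the timer's sleep before returning.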
 
Alternatively, you can simply write another job, in lieu of a plugin, that will scan all your tasks and nodes and if it detects a node down and a task waiting for it, assign the label to another node from the "standby" pool

This is an idea that we had considered, yeah, although I was considering it as a first step in the pipeline before scheduling, which made me nervous about race conditions. But if, as you suggest, it was a frequently run job which is always attempting to set up node allocations… that could definitely work. Good suggestion, thanks!
 
Throw enough things against a wall, something will stick ;-)  Glad to be of help.

Good luck.

 -M

Paul van der Ende

May 29, 2017, 1:11:18 PM
to Jenkins Users
Same issue here. Our incremental builds suffer from very poor locality because the pipeline always starts on a completely different set of nodes than last time, and we have quite a few nodes. I came up with the same tricks you mention, but they did not help much. Did you find a better solution, besides managing nodes by hand?

My preferred solution is as follows: ideally, every node() in my pipeline should stick to the same Jenkins node whenever possible. They should also end up in the same workspace to improve reuse of cached artifacts, but different node()s of the same pipeline should ideally end up in different workspaces to avoid conflicts.

If I compare node()s with freestyle jobs, they should basically work the same way. I cannot believe this is not the case.

John Calsbeek

May 30, 2017, 10:30:10 AM
to Jenkins Users
We are currently using Michael's suggestion of running a job periodically which looks for labels with 0 nodes and assigns that label to a node in the "fallback" pool. We are continuing to manage locality by manually juggling label assignments. So now we have guaranteed incremental builds, which is nice, but every time we add or remove a pipeline, we have to redistribute labels somehow.
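For anyone else landing here, the repair job is conceptually something like the following system Groovy sketch (label names are hypothetical, persistence details vary by Jenkins version, and it assumes the nodes are Slave instances so setLabelString is available):

```groovy
// Find pool labels that currently have no online node, and move each such
// label onto an online node from the standby pool.
import jenkins.model.Jenkins

def jenkins = Jenkins.getInstance()
def poolLabels = (1..4).collect { "my-node-pool-${it}" }
def standby = jenkins.getNodes().findAll { n ->
    n.getLabelString().contains('my-node-pool-fallback') &&
    n.toComputer()?.isOnline()
}

poolLabels.each { label ->
    def hasOnlineNode = jenkins.getLabel(label).getNodes().any {
        it.toComputer()?.isOnline()
    }
    if (!hasOnlineNode && !standby.isEmpty()) {
        def target = standby.remove(0)
        target.setLabelString(target.getLabelString() + ' ' + label)
    }
}
```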