Scaling - executors vs slaves


Bruce Epstein

Jul 27, 2016, 2:42:42 PM
to Jenkins Users
Hi -

I'm an experienced Jenkins user (writing Ant scripts, using plugins, etc.) but not an IT administrator, and my IT department is not that familiar with Jenkins scaling.

If anyone can point me to a comprehensive discussion of the best way to scale, please provide a URL.

Current architecture:

Only one master with just a single executor.
All jobs are run on the master
Running Jenkins 1.652
The load is not heavy. We probably never have more than 2 or 3 users needing Jenkins at the same time, and usually it is just one. 95% of the time we don't have a scaling issue, so I don't want to over-engineer the solution.
We have three or four development teams, and sometimes queue conflicts arise. We want to scale up a bit for future growth.

Current problems:
1. Some jobs (with three or four sub-jobs) monopolize the queue for 30+ minutes, preventing other jobs from running. One in particular is a library built in response to an SVN change, which then triggers four other apps to rebuild. These are separate Jenkins jobs, yet they hog the queue and prevent other users from running any jobs, even "in between" each app being rebuilt.

2. Some multi-configuration jobs (that build, say, 30 war files) can take about 90 minutes to run (3 minutes per iteration). We'd like to cut that down, but at least they allow other jobs to run (i.e. they don't monopolize the queue). These wars can be built in parallel (there is no need to run them in series, which I assume is the default for multi-configuration jobs).
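
As a rough back-of-the-envelope check on problem 2, here is a small sketch of the expected wall-clock time if the 30 iterations were spread over N executors. It assumes every iteration takes the same 3 minutes, that the iterations are fully independent, and that nothing else competes for the executors. The function name is made up for illustration; it is not part of Jenkins.

```python
import math

def matrix_wall_clock_minutes(iterations, minutes_each, executors):
    """Estimate wall-clock time when independent, equal-length
    iterations are spread evenly over a fixed number of executors."""
    if executors < 1:
        raise ValueError("need at least one executor")
    # Builds run in "waves": each wave starts up to `executors` iterations.
    waves = math.ceil(iterations / executors)
    return waves * minutes_each

# Assumed figures from the post: 30 war files, about 3 minutes each.
for n in (1, 2, 3, 5):
    print(n, "executor(s):", matrix_wall_clock_minutes(30, 3, n), "minutes")
```

Under those assumptions, going from one executor to three would cut the run from about 90 minutes to about 30. Real builds that share disk or CPU on the same machine will likely do worse than this estimate.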

Things I've tried:
1. No matter how I've tried to configure the queue-hogging job, I can't get it to "play nicely". Once it starts, it runs all the way through (say, 4 subjobs, each taking about 8 minutes). So, configuring the master to use, say, 2 or 3 executors seems to be one way to allow other jobs to run without being shut out.

2. Increasing the number of executors "works" for some use cases, but it also seems to cause jobs to run in parallel that I need to run in sequence. I'm unclear on how to prevent multiple executors from being used when I want one job to wait for another. Is this just how executors work? How do I ensure the extra executors are assigned to other jobs and not just used in parallel for the queue-hogging job?

Possible solutions:
1. Add slaves?  (see below)
2. Use multiple executors with BuildFlow or similar plugins to prevent jobs being triggered to run in parallel? Even BuildFlow seems to require at least two executors, or it hangs up trying to launch the first subjob in the flow.

Proposed solution:

1. Stick with only one master. Creating multiple masters seems unnecessary at our size.
2. Don't build jobs on the master...leave that to the slaves. (This seems to be the best practice?)
3. Create two slaves eventually (one is enough for now while we are still performing builds on master too)
4. Configure one slave to use only one executor. Configure the second slave to use multiple executors.
5. Configure certain jobs to run on the appropriate slave (single-executor or multi-executor) depending on the job's needs.

6. Should I be looking at CloudBees or plugins like EC2, Heavy Job, or One-Shot Executor?


I need someone who has "been there, done that" to give me a reality check or alert me to any blindspots before I ask IT to acquire more hardware and configure it. I want to have some confidence this will solve the problem without being overkill.


Any insights appreciated.


In gratitude, I'm happy to answer any Flex questions. :-)


Thanks,

Bruce



Jackson, Randy

Jul 27, 2016, 3:41:48 PM
to jenkins...@googlegroups.com

Bruce,

 

If you need to expand the number of executors on your master and prevent the jobs you need to run in sequence from running in parallel, I would look into using the Parameterized Trigger Plugin. You can create a control job that fires off the jobs one after the other: add a trigger build step for each of the jobs and select the “Block until the triggered projects finish their builds” option. Note that the control job will occupy one of the executors while each of the triggered jobs is running, so you will need at least 2 executors to run your sequence of jobs. We keep 10 executors open even though we don’t run many jobs, and we don’t run into any problems.
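
To illustrate why the control-job approach needs a second executor, here is a toy model in Python. It is not Jenkins code and does not use any Jenkins API: the function and the job names are hypothetical, and it assumes the control job holds its executor for the entire run while each triggered job needs a free executor of its own.

```python
def run_control_job(num_executors, downstream_jobs):
    """Toy model: a control job takes one executor, then triggers each
    downstream job in turn and blocks until it finishes.
    Returns the downstream jobs that managed to complete."""
    free = num_executors
    if free < 1:
        return []      # the control job itself can never start
    free -= 1          # the control job holds an executor for the whole run
    completed = []
    for job in downstream_jobs:
        if free < 1:
            # No executor left for the triggered job: it waits in the queue
            # while the control job waits on it, so the sequence stalls.
            break
        free -= 1      # triggered job starts
        completed.append(job)
        free += 1      # triggered job finishes and releases its executor
    return completed

# Hypothetical job names: one library followed by four dependent apps.
jobs = ["library", "app1", "app2", "app3", "app4"]
print(run_control_job(1, jobs))  # stalls: no downstream job ever runs
print(run_control_job(2, jobs))  # all five run, strictly one after another
```

In this model the sequence only ever uses two executors, so any further executors stay free for other users' jobs.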

 

Hope that helps.

 

Randy Jackson

Software Build Engineer

Indiana Farm Bureau Insurance

--
You received this message because you are subscribed to the Google Groups "Jenkins Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to jenkinsci-use...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/jenkinsci-users/3a038f29-5221-4830-80ae-ca5a70c7ccc7%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.



Stephen Connolly

Jul 27, 2016, 3:58:01 PM
to jenkins...@googlegroups.com
You really want a few build agents.

FTR you are not even close to scaling limits... I'm giving a talk on how to scale Jenkins at this year's Jenkins World... I don't want to spoil the fun of that talk, so I'm not giving out details, but if you cannot get to Jenkins World it will be available online... And hopefully it will be a good talk with interesting things for everyone, no matter how big or small their Jenkins is.



John Mellor

Jul 27, 2016, 3:59:06 PM
to jenkins...@googlegroups.com

The system load associated with any given build varies extremely widely, so I do not think that it is possible to provide guidelines for how many executors to configure. Some jobs just do a lot of disk I/O, while others use all possible processor threads to build (e.g., make -j steps). It may be possible to build some scheduling heuristics that would put the appropriate jobs on the various slave machines, but it would be insanely complicated and error-prone.

 

I stopped worrying about it, and just set the number of executors on each machine to one.  Nothing runs on the master, as it is busy enough with its normal task load.  If that is not efficient, so what?  Hardware is cheap in comparison – especially VMs and containerized slaves.  Use something like Kubernetes if you want to manage the total Jenkins load on the cluster and spin up more slaves as required.

 


Baptiste Mathus

Jul 27, 2016, 4:02:15 PM
to jenkins...@googlegroups.com

Good practices to scale, IMO:
* don't build on the master
* Yes. Add agents/slaves, see below.
* Never put more than one executor on an agent (the term "slave" is now deprecated). That keeps builds from stepping on each other's toes, and engineering time is far more expensive than having N agents.
** We initially used static agents running in a corporate ESX. Now it mostly runs on a Docker Swarm cluster, handling about 60 hours of builds a day and ~1000 active jobs.
* IMO, go directly to two or three agents rather than just one. That way you may avoid having your users design builds that depend on a specific machine.
** Corollary: never use node names as labels.
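
A toy illustration of that corollary, in Python rather than anything Jenkins-specific: the agent names and labels below are invented, and the matching rule (a job can run on any agent that carries all of its required labels) is a simplified assumption, not the actual Jenkins label-expression syntax.

```python
# Toy model, not the Jenkins API: agent names and labels are made up.
AGENTS = {
    "agent-01": {"linux", "jdk8", "ant"},
    "agent-02": {"linux", "jdk8", "ant"},
    "agent-03": {"linux", "flex"},
}

def eligible_agents(required_labels, agents=AGENTS):
    """Return the names of the agents that carry every required label."""
    return sorted(
        name for name, labels in agents.items()
        if required_labels <= labels  # subset test on sets
    )

print(eligible_agents({"jdk8", "ant"}))  # two interchangeable agents
print(eligible_agents({"flex"}))         # only one agent can take this job
```

A job tied to capability labels such as "jdk8" and "ant" can land on whichever matching agent is free. A job tied to a machine's name can only ever run on that one machine, which is the dependency the advice above tries to avoid.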

My 2 cents

-- Baptiste


Bruce Epstein

Aug 2, 2016, 1:25:22 AM
to Jenkins Users, m...@batmat.net
Thank you to everyone for the cogent replies to my question.