New Job Cacher plugin to cache dependencies of builds on Docker-based executors


Peter Hayes

Nov 30, 2016, 10:18:07 AM
to Jenkins Developers
Hi,

We are using CloudBees Private SaaS Edition, which uses Docker containers as executors. A side effect is that each job run starts with a fresh container without any previously cached dependencies (we generally use Gradle). This lengthens the build and adds network traffic to our Artifactory instance. I looked around for existing plugins but didn't find any, so I have started a plugin[1] based on SimpleBuildWrapper that stores a configured set of files on the master at the end of the build and then, on the next build, downloads them from the master to their original location.
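To make the idea concrete, here is a minimal sketch of the save/restore cycle. It is not the plugin code: the plugin itself is built on SimpleBuildWrapper, while this only models the behavior with local directories. The names `save_cache`, `restore_cache`, `master-store` and the `.gradle/caches` path are assumptions made up for illustration.

```python
import shutil
import tempfile
from pathlib import Path
from typing import Iterable


def save_cache(workspace: Path, store: Path, cached_dirs: Iterable[str]) -> None:
    """End of build: copy each configured directory from the workspace to the store."""
    for rel in cached_dirs:
        src = workspace / rel
        if not src.is_dir():
            continue  # this build produced nothing for that path
        dest = store / rel
        if dest.exists():
            shutil.rmtree(dest)  # replace the snapshot from the previous build
        shutil.copytree(src, dest)


def restore_cache(workspace: Path, store: Path, cached_dirs: Iterable[str]) -> None:
    """Start of next build: copy stored directories back to their original location."""
    for rel in cached_dirs:
        src = store / rel
        if src.is_dir():
            shutil.copytree(src, workspace / rel, dirs_exist_ok=True)


def demo() -> str:
    """Simulate two builds in fresh workspaces that share one store."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        store = root / "master-store"  # hypothetical stand-in for storage on the master

        first = root / "build-1"
        (first / ".gradle" / "caches").mkdir(parents=True)
        (first / ".gradle" / "caches" / "dep.jar").write_text("cached")
        save_cache(first, store, [".gradle/caches"])

        second = root / "build-2"  # fresh container: starts out empty
        second.mkdir()
        restore_cache(second, store, [".gradle/caches"])
        return (second / ".gradle" / "caches" / "dep.jar").read_text()
```

In the real setup the copy in each direction would cross the channel between the executor and the master rather than the local filesystem.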

I still have work remaining, but before investing more time I wanted to check with this group whether it makes sense to complete this or whether there is a better option. I also saw a post[2] on the users list a few months ago asking for a similar capability; it didn't turn up anything.

Thanks,
Pete

Jesse Glick

Nov 30, 2016, 2:04:03 PM
to Jenkins Dev
On Wed, Nov 30, 2016 at 10:18 AM, Peter Hayes <pete...@gmail.com> wrote:
> each time you run a job, you
> start with a fresh container without any previously cached dependencies (we
> use gradle generally). This increases the length of the build and adds
> network traffic to our Artifactory instance. I looked around for existing
> plugins but didn't find any so I have started a plugin[1] based on
> SimpleBuildWrapper that stores a configured set of files on the master at
> the end of the build and then on the next build downloads them to master in
> the original location.

This seems like a poor approach; rather than overloading Artifactory,
you will be overloading the Jenkins master. Archiving artifacts via
the Remoting channel can already wreck performance; you are talking
about potentially orders of magnitude more traffic than that.

There are two basic approaches to this kind of problem. One, which
assumes that the agents reuse workspaces between builds, is to set the
local repository/cache location to a workspace location. The
`docker-workflow` demo does this:

https://github.com/jenkinsci/docker-workflow-plugin/blob/46432bbe36af17dac93cfedcc93ffa51beba1343/demo/repo/flow.groovy#L20-L22
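As a rough illustration of the workspace-local approach (a sketch, not a copy of what the linked demo does), the build command can be pointed at a cache directory inside the workspace. `-Dmaven.repo.local` and `--gradle-user-home` are existing Maven and Gradle options; the function name and the `.m2repo` and `.gradle-home` directory names are assumptions chosen for this example.

```python
import posixpath
from typing import List


def workspace_cache_command(tool: str, workspace: str, goals: List[str]) -> List[str]:
    """Build a command line whose dependency cache lives inside the workspace."""
    if tool == "maven":
        # Hypothetical directory name; any path under the workspace would do.
        repo = posixpath.join(workspace, ".m2repo")
        return ["mvn", f"-Dmaven.repo.local={repo}", *goals]
    if tool == "gradle":
        # Hypothetical directory name; Gradle keeps its caches under the user home.
        home = posixpath.join(workspace, ".gradle-home")
        return ["gradle", "--gradle-user-home", home, *goals]
    raise ValueError(f"unsupported build tool: {tool}")
```

This only helps when the agent reuses the same workspace on the next build, which is the assumption stated above.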

The other approach is to mount a volume containing the cache, letting
the Docker daemon handle the storage, which the
`parallel-test-executor` demo does:

https://github.com/jenkinsci/parallel-test-executor-plugin/blob/3961df3784045df1f6f285bc2b685ead4bc8593b/demo/Makefile#L3-L27

The volume-based approach is probably the more scalable, though there
are two points to beware: at least Maven’s `install:install` will dump
locally built artifacts into the repository alongside downloaded
releases (probably Gradle does something similar); and Maven’s Aether
repository manager is by default not thread-safe (Takari fixes this).
Maven 5 may allow the cache to be properly separated (again I am not
sure how Gradle fares here); in the meantime you may need to ensure
that there is a distinct volume for every potentially concurrent
build, for example keyed by `${JOB_NAME}/${EXECUTOR_NUMBER}`.
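A minimal sketch of that keying, assuming the job name may contain characters such as `/` (folders) that are not valid in a Docker volume name. The helper names, the `cache-` prefix, and the mount point and image in the example are hypothetical.

```python
import re
from typing import List


def cache_volume_name(job_name: str, executor_number: str) -> str:
    """Derive one volume name per (job, executor) pair.

    Characters outside [A-Za-z0-9_.-] are replaced with "_". Note that this
    mapping is lossy: "a/b" and "a_b" end up with the same volume name.
    """
    raw = f"cache-{job_name}-{executor_number}"
    return re.sub(r"[^A-Za-z0-9_.-]", "_", raw)


def docker_run_args(job_name: str, executor_number: str,
                    mount_point: str, image: str) -> List[str]:
    """Arguments for `docker run` with the per-build cache volume mounted."""
    volume = cache_volume_name(job_name, executor_number)
    return ["docker", "run", "--rm", "-v", f"{volume}:{mount_point}", image]
```

For example, `docker_run_args("team/app", "1", "/root/.m2", "maven:3")` mounts the volume `cache-team_app-1` at `/root/.m2`, so two concurrent builds of the same job on different executors never write to the same cache.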

At any rate the exact solution chosen is going to depend on details of
how agents are provisioned and workspaces managed, so at root this
might simply be an RFE for CJP-PSE.

Peter Hayes

Nov 30, 2016, 3:36:43 PM
to Jenkins Developers
Thanks for the insight. I do see that this will put a burden on the master node. Since we are using CJP-PSE, that is mitigated somewhat: we will be running quite a few masters, so the ratio of jobs to masters won't be terribly high.

Reusing workspaces isn't an option for us at the moment due to the architecture of CJP-PSE. I actually did start with an externally mounted volume but, as you note, we would run into concurrency issues with shared caches on the host instance, and there is no reliable way to separate the caches while still benefiting from them, since there is no distinct executor number (it is always 1). If CJP were enhanced to transparently manage workspaces across executors (and support parallel build execution), we could look at that. I raised this with the PSE team a while back in any event, and I imagine it will need to be addressed, as it is a step back in performance from classic persistent Jenkins executors.

The other thought that crossed my mind, since we are running in AWS, is to leverage a more scalable file store within AWS such as S3. Both artifact archiving and dependency caching could be good candidates. It would be cool if there were an S3 backing for the FilePath abstraction that plugin developers could access seamlessly via Project.getStoragePath() or something like that. Then a plugin like the one I am proposing could provide a more scalable solution without being hardwired to S3. I'm guessing I'm not the first to think of this, so there are likely challenges in doing so.

Peter Hayes

Nov 30, 2016, 4:21:46 PM
to Jenkins Developers
Thinking about this a bit more: the FilePath abstraction doesn't look like it would work, as it assumes a computer on the other end. What if there were an "External Storage Plugin" extension point that could be backed by S3 and leveraged by other plugins to manage large files associated with jobs? Ideally it would share the job lifecycle, so that when jobs are renamed or deleted, the related external storage area would be managed as well. Is there an extension point for something like that?
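To show the shape of what I have in mind, here is a purely hypothetical sketch. None of these names exist in Jenkins; an in-memory dictionary stands in for S3, and the only point is that storage is keyed by job and follows rename and delete events.

```python
from abc import ABC, abstractmethod
from typing import Dict


class ExternalJobStorage(ABC):
    """Hypothetical extension point: per-job blob storage with lifecycle hooks."""

    @abstractmethod
    def put(self, job: str, key: str, data: bytes) -> None: ...

    @abstractmethod
    def get(self, job: str, key: str) -> bytes: ...

    @abstractmethod
    def on_job_renamed(self, old_name: str, new_name: str) -> None: ...

    @abstractmethod
    def on_job_deleted(self, job: str) -> None: ...


class InMemoryJobStorage(ExternalJobStorage):
    """Stand-in backend; an S3-backed one would implement the same four methods."""

    def __init__(self) -> None:
        self._blobs: Dict[str, Dict[str, bytes]] = {}

    def put(self, job: str, key: str, data: bytes) -> None:
        self._blobs.setdefault(job, {})[key] = data

    def get(self, job: str, key: str) -> bytes:
        return self._blobs[job][key]  # raises KeyError if nothing is stored

    def on_job_renamed(self, old_name: str, new_name: str) -> None:
        if old_name in self._blobs:
            self._blobs[new_name] = self._blobs.pop(old_name)

    def on_job_deleted(self, job: str) -> None:
        self._blobs.pop(job, None)  # drop everything stored for the job
```

A caching plugin would call `put` at the end of a build and `get` at the start of the next one, without knowing which backend is configured.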

Jesse Glick

Nov 30, 2016, 5:59:04 PM
to Jenkins Dev
Probably you are looking for the External Workspace Manager plugin.
Last I checked, it had not been extended to really support clouds. I
suggest you raise this as an RFE with the PSE team, rather than
discussing it here—unless you intend to try developing such an
extension yourself, in which case you would likely want to hang out in
https://gitter.im/jenkinsci/external-workspace-manager-plugin and ask
for advice.

Peter Hayes

Nov 30, 2016, 6:19:08 PM
to Jenkins Developers
Ok. Thanks.