Gradle Enterprise for Jenkins development

Mark Waite

Dec 1, 2021, 3:24:20 PM
to Jenkins Infrastructure
During DevOps World 2021 there was a presentation by Gradle Enterprise that highlighted their build and test acceleration capabilities for Maven projects and Gradle projects.  The presentation and later discussions give me hope that the Jenkins project might benefit significantly from Gradle Enterprise.

Gradle Enterprise offers a license for open source projects.  I've confirmed with Etienne Studer, the Gradle Enterprise SVP of development, that the Jenkins project is eligible for the license.

Etienne described the Gradle Enterprise build and test acceleration process as a series of clearly defined experiments that identify areas that can be accelerated; improvements are then applied and the results are evaluated.  We'd run those experiments, see what we learn, and report the results.

Based on my discussion with Etienne, I suggested that it might be good to invite him to a meeting of the Jenkins infrastructure team for a 10-15 minute presentation followed by question and answer from the infrastructure team.

Would the infrastructure team be willing to dedicate most of a meeting to a discussion of build and test acceleration with Gradle Enterprise?

They have both a hosted configuration and a self-hosted configuration.  The hosted configuration is usually used by smaller projects (like JUnit) while the self-hosted configuration is used by larger projects (like the Spring framework).

Mark Waite

Jesse Glick

Dec 1, 2021, 3:50:16 PM
to jenkin...@googlegroups.com
On Wed, Dec 1, 2021 at 3:24 PM Mark Waite <mark.ea...@gmail.com> wrote:
The presentation and later discussions give me hope that the Jenkins project might benefit significantly from Gradle Enterprise.

Based on what I can see of their materials I am skeptical this would be of much use to the Jenkins project. Our main overhead is running large functional & acceptance test suites. https://gradle.com/gradle-enterprise-solution-overview/build-cache-test-distribution/#test-distribution claims to address this by parallelizing test suites, something we already do in Jenkins and would have to replicate in a completely different environment, duplicating a lot of subtle build configuration in the process. The test distribution system might be useful outside CI, if lots of developers run lots of local tests, but is that really the case?

https://www.launchableinc.com/predictive-test-selection feels more likely to offer a material improvement in test times, and would probably be a lot less intrusive to set up. And of course we would be assured of first-class tech support. :-)

At any rate, it is worth exploring various options, so long as it is understood that pushing nontrivial configuration into the source base (incl. CI system setup) would require broader discussion of the value vs. complexity.

Kohsuke Kawaguchi

Dec 1, 2021, 5:05:12 PM
to jenkin...@googlegroups.com
Yeah, I'd love to get Launchable deployed for Jenkins! Obviously we'd be happy to do it for free -- I'd just like to get some visibility and adoption.

I want to make sure I understand how you envision Launchable being useful to Jenkins, though. The most straightforward thing our users do is to run us in the pull request validation phase to get quick feedback. Statistically there's always a chance that the subsetting misses a regression because we skip some tests, but those would be caught post merge. This works very well when you have a team of people developing big software -- they have multiple layers of tests in the delivery pipeline and some regressions always slip through per merge anyway.

For typical open-source projects, however, it's a little different. There are a large number of people who are contributing one-off changes, and the project as a whole doesn't really have that much incentive to speed those up. Their changes sit in the PR phase for a long time, too, and compute cost for CI is also often given for free (e.g., GitHub Actions).

In the case of the Jenkins project, CI is not free; in fact, as I understand it, it's quite costly, so perhaps from the computational cost perspective this might be worthwhile just for that. Or maybe we have less of a long tail of contributors now, and a speed boost for the small number of core contributors might make this worthwhile.

Or maybe the idea is to apply this to ATH so that we can "left shift" a subset of ATH during PR review.

In parallel, if I can instrument the CI process for Jenkins pre-merge validation, I can get the data collection going to train a model, which is a non-invasive process. From that we can see the performance of the model -- how many regressions we can catch by running which subset.
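Roughly, that data collection is a thin wrapper around the existing Maven invocation on ci.jenkins.io, along the lines of the sketch below. This is only an outline: the exact CLI flags and report paths would need to be checked against the current Launchable docs, $BUILD_TAG is the standard Jenkins build identifier, and the API key would come from a credential exposed as LAUNCHABLE_TOKEN.

# tell Launchable which commit(s) this CI run is testing
launchable record build --name "$BUILD_TAG" --source .

# run the test suite exactly as today
mvn test

# upload the JUnit reports so the model can learn which changes break which tests
launchable record tests --build "$BUILD_TAG" maven './**/target/surefire-reports'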


Jesse Glick

Dec 1, 2021, 5:48:58 PM
to jenkin...@googlegroups.com
On Wed, Dec 1, 2021 at 5:05 PM Kohsuke Kawaguchi <k...@kohsuke.org> wrote:
Or maybe the idea is to apply this to ATH so that we can "left shift" a subset of ATH during PR review.

That, or PCT (esp. if redesigned as per JENKINS-45047), or just big in-repo tests in the case of core. I agree that we need to carefully consider what we actually want out of something like this.

Kohsuke Kawaguchi

Dec 1, 2021, 10:15:17 PM
to jenkin...@googlegroups.com
When you say "in-repo tests", are we talking unit tests that run with "mvn test"? 


Jesse Glick

Dec 2, 2021, 8:37:27 AM
to jenkin...@googlegroups.com
On Wed, Dec 1, 2021 at 10:15 PM Kohsuke Kawaguchi <k...@kohsuke.org> wrote:
When you say "in-repo tests", are we talking unit tests that run with "mvn test"?

Well, not necessarily “unit” tests, but yes. 

Mark Waite

Dec 15, 2021, 4:56:35 AM
to Jenkins Infrastructure
Etienne Studer presented a demonstration of Gradle Enterprise and how it could help development on Jenkins core.


We'll host a developer Meetup in January 2022 so that he can present the material and so that the Jenkins developer community can ask questions.

I think we should do something similar with Launchable.

Mark Waite

Kohsuke Kawaguchi

Dec 15, 2021, 10:37:42 PM
to jenkin...@googlegroups.com
Sure, thanks for the opportunity, and happy to do it! What would be the right venue?

Just so that I can contextualize what Launchable does better, what is the problem we are trying to solve with Gradle Enterprise or Launchable?

Given the way build acceleration works, I find it rather unlikely that it helps Core build in any meaningful way. The cache is effective if the program being built has a "wide" dependency tree; there, when you touch one module, you can expect a fair number of modules independent from that, so you save. Jenkins core has a very narrow but tall dependency tree; and as Jesse wrote, basically all the time is spent on running tests. As Mark asks in the video, if there's no change you can skip building and testing altogether so you will be done in two minutes, but the moment you touch anything in core, you end up building all the chunky parts.

Test distribution & parallelization would be impactful in reducing the turn-around time, though it sounds from Jesse like that's something already done in Jenkins core; is that so? That's news to me. Test distribution outside CI can really only be enabled for trusted developers, because opening it up to everybody is basically allowing anybody to run arbitrary code on our infra. I bet that's not something we want to do.



Mark Waite

Dec 16, 2021, 7:33:17 AM
to Jenkins Infrastructure
On Wed, Dec 15, 2021 at 8:37 PM Kohsuke Kawaguchi wrote:
Sure, thanks for the opportunity, and happy to do it! What would be the right venue?


A meeting of the Jenkins infra team would be a great first context if using Launchable would require infrastructure changes.  The infrastructure team meetings are each Tuesday at 2:30 PM UTC.

After the session with the Jenkins infra team, a Jenkins Online Meetup focused on discussing the idea with developers would be the next step.
 
Just so that I can contextualize what Launchable does better, what is the problem we are trying to solve with Gradle Enterprise or Launchable?


I want to reduce the time we spend running tests on ci.jenkins.io, both to reduce the time required for a developer to know that their pull request is valid and to reduce the infrastructure cost of running automated tests that are not affected by the changes since the most recent successful build.  A single run of Jenkins core on ci.jenkins.io takes at least 60 minutes.  Pull requests that change a relatively unused area of code take the same 60 minutes as pull requests that affect much wider areas of code.  As an example of a wide impact change, see PR-5923.  As an example of a narrow impact change, see PR-5905.  As a rarer example of a very narrow impact change, see PR-6057.

I believe deeply in the automated tests used in Jenkins core.  They help us keep the core stable and reliable.  I'd love to run tests that are affected by changes in the build and not run tests that are unaffected by the changes in the build.

As justification for the proposal, the ci.jenkins.io job is configured to interrupt the current run if a new change arrives on the branch during the run.  The arrival of changes on the master branch is fast enough now that 50% of jobs on the master branch are stopped before they complete because another pull request has been merged to the master branch while the current build is running.  I think it is exactly the right behavior to stop the current job if a new change has arrived, but it would be even better if the master branch builds and pull request builds would only run tests that actually evaluate the changed code since the previous successful build.

I would love to have an improvement in the time that a developer spends running tests on their development machine, but do not consider that as the primary goal.  Developers can select the subset of tests they run through their IDE.  I consider a reduction of developer testing time as a much lower priority than reducing the time spent by jobs on ci.jenkins.io.
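(For completeness, that same selection is also available outside the IDE through Surefire's test filtering; the class name below is just a placeholder for whatever test a developer happens to be iterating on:

mvn test -Dtest=TheTestClassYouAreWorkingOn

so local test time is already largely in the developer's control.)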

I would love to have an improvement in the time that plugins spend in their test automation (like the git plugin's 30+ minutes of automated tests), but I think that has much less impact on the community as a whole than focusing our attention on Jenkins core and its test automation.
 
Given the way build acceleration works, I find it rather unlikely that it helps Core build in any meaningful way. The cache is effective if the program being built has a "wide" dependency tree; there, when you touch one module, you can expect a fair number of modules independent from that, so you save. Jenkins core has a very narrow but tall dependency tree; and as Jesse wrote, basically all the time is spent on running tests. As Mark asks in the video, if there's no change you can skip building and testing altogether so you will be done in two minutes, but the moment you touch anything in core, you end up building all the chunky parts.

Test distribution & parallelization would be impactful in reducing the turn-around time, though it sounds from Jesse like that's something already done in Jenkins core; is that so? That's news to me. Test distribution outside CI can really only be enabled for trusted developers, because opening it up to everybody is basically allowing anybody to run arbitrary code on our infra. I bet that's not something we want to do.


The ci.jenkins.io job for Jenkins core runs four parallel tasks: two variants of the acceptance test harness, a JDK 8 automated test variant, and a JDK 11 automated test variant.  In that sense, we're already running parallel testing on the ci.jenkins.io job.

The ci.jenkins.io job for the full acceptance test harness splits its tests into 8+ parallel threads of execution (and does it for both JDK 8 and JDK 11).  I'd love to be able to avoid tests in that environment that are not affected by the changes since the most recent passing build, but I think that is outside the scope of this initial experiment.
 
Mark Waite

Jesse Glick

Dec 16, 2021, 9:09:30 AM
to jenkin...@googlegroups.com
On Thu, Dec 16, 2021 at 7:33 AM Mark Waite <mark.ea...@gmail.com> wrote:
not run tests that are unaffected by the changes in the build.

It is probably unusual to be able to say definitively that this is so. PR-6057 would be one such case of course, but it would be very hard to prove that any change touching `pom.xml`, `src/`, or `.mvn/` could not possibly fail some test. Projects which do mostly unit testing and do not use reflection can use tools which analyze classes actually loaded by tests (or even finer-grained analysis) to determine that most changes do not affect most tests. But most of our test time is in `JenkinsRule`-based tests which load a large subset of the test classpath; and Jenkins of course uses reflection heavily.

So we could use ML-based tools such as Launchable (or maybe Gradle Enterprise—I could not get a clear idea of what their test prediction feature consists of), but with the caveat that sometimes PRs would slip through that legitimately regress some test and we would need to revert or clean up after merge. You give up some safety in exchange for faster turnaround. We would have to experiment to see how reliable such predictions are. (I am personally often surprised by test failures that as an experienced contributor I would never have thought to check before filing a PR. Which is a good thing! Tests that only verify the functionality you thought you were modifying do not do much to prevent regressions.)

All of the above applies not just to Jenkins core, but to any plugin with a heavy test suite—`git` as you mentioned, `workflow-cps`, `pipeline-model-definition`, and a few others. And it would apply more if we tried harder to run downstream tests (JENKINS-45047 or `plugin-compat-tester`, and/or parts of `acceptance-test-harness`).

The arrival of changes on the master branch is fast enough now that 50% of jobs on the master branch are stopped before they complete because another pull request has been merged to the master branch while the current build is running.

That is fixable trivially by just changing our branch configuration, as I have been pleading for years: INFRA-1633. Again there is a trade-off in that you would occasionally merge a passing PR which results in a trunk test failure, though I expect this is pretty uncommon in practice, especially compared to simple flaky tests. (And if you suspect there will be such a semantic conflict, you can always `git pull origin master:master && git push` to err on the side of caution.) There are also batch-merge tools which solve this problem without giving up trunk safety; GitHub recently announced a private beta which I think we are on the waiting list for.

The ci.jenkins.io job for the full acceptance test harness splits its tests into 8+ parallel threads of execution

Which we could also do for the Jenkins core build. Would not reduce machine costs on ci.jenkins.io (would actually increase them a little), but could reduce wall clock time for core builds (specifically `test` submodule), giving us faster turnaround. (KK is certainly familiar with this system since he is the initial author of the plugin that enables it!)

`jenkinsci/bom` builds are parallelized by plugin and core baseline; not currently by test split (which would mostly benefit the bottleneck plugins mentioned above).

Tim Jacomb

Dec 16, 2021, 12:02:54 PM
to jenkin...@googlegroups.com
GitHub recently announced a private beta which I think we are on the waiting list for.

I signed us up for it, no response though


Mark Waite

Dec 17, 2021, 8:17:48 AM
to Jenkins Infrastructure


On Thursday, December 16, 2021 at 7:09:30 AM UTC-7 Jesse Glick wrote:
On Thu, Dec 16, 2021 at 7:33 AM Mark Waite wrote:

The arrival of changes on the master branch is fast enough now that 50% of jobs on the master branch are stopped before they complete because another pull request has been merged to the master branch while the current build is running.

That is fixable trivially by just changing our branch configuration, as I have been pleading for years: INFRA-1633. Again there is a trade-off in that you would occasionally merge a passing PR which results in a trunk test failure, though I expect this is pretty uncommon in practice, especially compared to simple flaky tests. (And if you suspect there will be such a semantic conflict, you can always `git pull origin master:master && git push` to err on the side of caution.) There are also batch-merge tools which solve this problem without giving up trunk safety; GitHub recently announced a private beta which I think we are on the waiting list for.


I thought that the arrival rate of changes on the master branch would not be affected by your proposed change (INFRA-1633) of not building pull requests when the target branch (master) changes.

In the cases that I'm seeing, a new change is merged to the master branch less than 60 minutes after the previous change was merged to the master branch.  The arrival of the new change starts the build on the master branch before the preceding build on the master branch has completed all its tests.  If the build of the master branch could be "smarter" so that it only runs tests that are affected by the change, that might allow tests to complete even during times when changes are arriving on the master branch more frequently.
 
Can you help me understand further?

Mark Waite

Jesse Glick

Dec 17, 2021, 10:46:28 AM
to jenkin...@googlegroups.com
On Fri, Dec 17, 2021 at 8:17 AM Mark Waite <mark.ea...@gmail.com> wrote:
In the cases that I'm seeing, a new change is merged to the master branch less than 60 minutes after the previous change was merged to the master branch.

Sorry, you are talking about something unrelated to what I thought.

If the build of the master branch could be "smarter" so that it only runs tests that are affected by the change

Well, yes, in theory. This is essentially the same as the desire to only run tests in a PR build which could be affected by the PR diff. As written above, I doubt this is a realistic goal for Jenkins core, unless you explicitly decide to rely on ML heuristics (like Launchable) and accept that some genuine regressions will just be lost—to be caught later by daily builds, or just prior to release.

To give another perspective on what KK said earlier:

The cache is effective if the program being built has a "wide" dependency tree; there, when you touch one module, you can expect a fair number of modules independent from that, so you save. Jenkins core has a very narrow but tall dependency tree

Imagine that we decided to merge Jenkins core and some large number of plugins into a giant monorepo, so that they were all governed by a single version number and released simultaneously, preferably with some plugin manager change to keep all included plugins updated in lockstep, but retained their current dependency structure (a fairly complex directed acyclic graph) just converted to use in-reactor snapshot dependencies like

<version>${project.version}</version>

There are various practical and social difficulties with such a move (who decides which plugins are included and which are “out of tree”? how would you move plugins in or out later?), but just focus on the impact on the build and test process—both locally for developers and on CI. On the one hand, there is a huge advantage that complex changes spanning core and multiple (in-tree) plugins can be done as a single monorepo PR, perhaps even an atomic commit; no need for all the machinery we have built to handle cross-repository dependencies and version skew (PCT, JEP-305, etc.). On the other hand, `mvn verify` in this tree could take hours even with `-T`, running tens of thousands of tests, rendering it impossible for CI builds to offer timely feedback; and developers trying to work on just one plugin

mvn -am -pl my-plugin -Pquick-build install
mvn -pl my-plugin hpi:run

could be frustrated by the first command taking several minutes, and the need to remember to

mvn -pl some-plugin-my-plugin-depends-on -Pquick-build install

whenever making upstream changes (though some IDEs handle this for you at least for Java source code changes).

That scenario is one where the remote caching feature in Gradle Enterprise (or for that matter, more monorepo-focused build tools like Bazel) would be really invaluable. You could say with confidence that a patch to `plugins/lockable-resources/src/main/java/org/jenkins/plugins/lockableresources/LockStep.java` could only possibly affect `plugins/lockable-resources/src/test/java/**/*Test.java` and tests in plugins downstream of it in the dependency tree, so a CI build of such a PR would be reasonably quick (same for the merge commit to `master`). Locally,

mvn verify

would (re-)build and test just the components you actually have local modifications to, and components depending on them, and just skip 95% of the work as it would duplicate results already cached by a CI build of some recent public commit.

This is not the scenario we have, however. Our “build & test result cache” for portions of the dependency graph is effectively released binaries in Artifactory. JEP-229 & Dependabot reduce the friction for percolating changes through the system but do not change the fundamentally distributed workflow.

Disclaimer: I am just speculating on what Gradle Enterprise does, as I have not worked with it, based on https://docs.gradle.com/enterprise/maven-build-cache/#cache_key and the like. Also here I am focussing on the cache feature, not other features like uploaded scans, failure analysis, test distribution (duplicating what we already do with Jenkins agents AFAICT), flakiness reporting, etc.

Kohsuke Kawaguchi

Dec 17, 2021, 6:00:46 PM
to jenkin...@googlegroups.com
Thanks for all the context, Mark and Jesse! It's been a while for me, but now all the memories came back :-)

Please sign me up for the upcoming Jenkins infra team meeting. 14:30 UTC is 6:30am PT, it's early enough I can do just about any week.

Based on this conversation, this is what I'm seeing:
  • Goal #1: Reduce CI spend. Our infra costs a lot of money to run!
  • Goal #2: Reduce the turn-around time for developers to get a green light to merge changes
In Jenkins, these problems are deeply intertwined. The key solution that I think everyone agrees on is that we need to avoid running meaningless tests, so that every CPU hour spent counts more. There are multiple places this can be done, and this thread hashed out all the major vectors:
  • Means #1: Predictive test selection (PTS), which selects tests more intelligently and avoids running less valuable tests.
  • Means #2: Do something with less valuable PR merge builds. We can stop them altogether ala INFRA-1633, or we can employ PTS so that they take less time.
  • Means #3: Do something with master branch builds; they get aborted too often and aborted builds are a waste of CPU time. Again we can tweak what we choose to build, or we can employ PTS so that they take less time.
When we only test what matters, by combining these approaches, we can save resources, and that saving can be deployed toward goal #1 & #2.
  • Some of the CPU hour savings can be translated directly into cost savings.
  • One common source of frustration for developers is when the build queue skyrockets, creating a long wait before their workload gets served. If the total work going into that queue goes down, congestion eases and the turn-around time goes down (means #1/#2)
  • Reducing PR validation tests translates directly into turn-around time reduction. That is, the 1hr that it takes to test any commit goes down to whatever time we'd be willing to spend.
  • And perhaps more importantly, the CPU hour savings can be redeployed to run tests that we are not running today, thereby improving the quality overall.
The question isn't "when we skip some tests, doesn't that create some regressions?" The question is, "if we get 1hr of CPU time, which tests will contribute the most to our confidence in this change?" I can guarantee you that 80% of the unit tests are providing very little value toward this goal; we just don't know which 80% :-)  I'm also willing to bet that if we stop running those tests and instead use the time to run some additional ATH tests, we could very well be cutting time and improving quality at the same time.
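Concretely, in a PR build that could look something like the sketch below, where the budget, the launchable flags, and the Surefire includesFile hookup are illustrative rather than a tested configuration:

# ask for the most valuable ~25% of tests given the changes in this build
launchable subset --build "$BUILD_TAG" --target 25% maven src/test/java > launchable-subset.txt

# run only that subset
mvn test -Dsurefire.includesFile=launchable-subset.txt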

Launchable is far from perfect, but I believe we formulated the right question, and that's really half the battle. There are so many ways in which PTS can be intelligent that we haven't yet gotten to -- we run all unit tests on two JDK versions, but for most tests it's probably contributing very little value to run them on two platforms, for example.

And above all, how to test what matters should be a simple data driven decision. At the end of the day, in this entire thread we are "just" debating different heuristics that we think provide the best bang for the buck, when really it's just a "simple" optimization problem that can be left to machines. That is the ultimate problem I'm trying to solve with Launchable.

As Jesse knows, my experience with Jenkins is a major contributing factor for my choosing to tackle this very problem, so it means a lot to me that I can help this project and grow with it.
