Performance regression due to symbol annotations


Marcin Zajączkowski

Aug 3, 2016, 4:33:54 AM
to job-dsl-plugin
Hi,

After migrating to Jenkins 2.11 (from 1.642.4) some time ago, we noticed a significant performance regression in the execution time of our Job DSL seeds. For example, the time to generate ~3500 jobs in one seed doubled (from ~7 minutes to ~15).

I analyzed the problem and, after profiling and bisecting the code, I was able to find the reason. The "regression" was introduced in Jenkins 2.2 and affects only Jenkins 2.2+ combined with Job DSL 1.46+. Job DSL 1.46 started to use @Symbol, via SymbolLookup from the Structs plugin, to find DescribableModels (in the DescribableHelper class). In general symbols are good, but the lookup also seems to be much slower.

This is especially painful in a seed which creates dozens or hundreds of the same pipelines for different µservices/products/realms (for example based on custom configuration descriptors resolved in a Git repo): the same lookup operations are repeated over and over. It begs to be cached. A quick PoC (with global caching of methods in DescribableHelper) reduced the phase that builds the internal representation of the jobs to be generated in Job DSL (for the seed mentioned above) from ~11 minutes 30 seconds to just ~30 seconds.
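
To illustrate why repeated lookups benefit so much from caching, here is a minimal sketch in plain Java. It is not the PoC itself and does not use the Job DSL or Structs API: `MemoizedLookup`, the step names and the placeholder result are all hypothetical stand-ins for an expensive symbol-based lookup.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

// Hypothetical stand-in for an expensive lookup; not the Job DSL or Structs API.
final class MemoizedLookup<K, V> {
    private final Map<K, V> cache = new ConcurrentHashMap<>();
    private final Function<K, V> slowLookup;

    MemoizedLookup(Function<K, V> slowLookup) {
        this.slowLookup = slowLookup;
    }

    // Runs the slow lookup at most once per key; later calls are served from the map.
    V get(K key) {
        return cache.computeIfAbsent(key, slowLookup);
    }
}

final class MemoizedLookupDemo {
    public static void main(String[] args) {
        AtomicInteger slowCalls = new AtomicInteger();
        MemoizedLookup<String, String> lookup = new MemoizedLookup<>(name -> {
            slowCalls.incrementAndGet();   // count how often the expensive path runs
            return "model-for-" + name;    // placeholder result
        });
        // Simulate a seed generating many jobs that all use the same two step types.
        for (int job = 0; job < 3500; job++) {
            lookup.get("gitScm");
            lookup.get("shellStep");
        }
        System.out.println("slow lookups performed: " + slowCalls.get());
    }
}
```

With 3500 simulated jobs and two distinct keys, the expensive path runs only twice; every other call is a map read.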

Of course, indefinitely caching everything globally is not the best possible option (I did it just to verify my thesis). However, the ability to cache those calls per seed execution seems to be a very interesting way to mitigate the issue and, by the way, improve generation performance even more. This feature could be optionally disabled/enabled in the seed job configuration.
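
The per-execution idea could be sketched roughly as follows, assuming a cache object that is created when a seed run starts and thrown away when it ends. `SeedRunCache` and its `enabled` flag are hypothetical names for illustration only; nothing here is existing Job DSL configuration, and the sketch assumes a single-threaded seed run.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical per-run cache: created when a seed run starts, discarded when it ends.
final class SeedRunCache implements AutoCloseable {
    private final boolean enabled;   // assumed to come from the seed job configuration
    private final Map<String, Object> entries = new HashMap<>();

    SeedRunCache(boolean enabled) {
        this.enabled = enabled;
    }

    Object lookup(String key, Function<String, Object> slowLookup) {
        if (!enabled) {
            return slowLookup.apply(key);   // caching switched off: always take the slow path
        }
        return entries.computeIfAbsent(key, slowLookup);
    }

    int size() {
        return entries.size();
    }

    @Override
    public void close() {
        entries.clear();   // nothing outlives the seed run
    }
}

final class SeedRunCacheDemo {
    public static void main(String[] args) {
        int[] slowCalls = {0};
        Function<String, Object> slow = key -> {
            slowCalls[0]++;
            return "model-for-" + key;
        };
        try (SeedRunCache cache = new SeedRunCache(true)) {
            for (int i = 0; i < 100; i++) {
                cache.lookup("gitScm", slow);
            }
        }   // entries are dropped here, at the end of the run
        System.out.println("slow calls with caching: " + slowCalls[0]);
    }
}
```

Scoping the cache to one run trades some speed (each run pays for the first lookup again) for not having to reason about stale entries between runs.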

I'm not very knowledgeable about the internal Jenkins model and maybe there are some important issues/side effects that I'm not aware of. What do you think about that idea?


As an alternative approach, some optimizations could be done at the Jenkins level (Job DSL is probably not the only user of that new mechanism).

Marcin

P.S. I've created a corresponding issue in Jira to track progress - https://issues.jenkins-ci.org/browse/JENKINS-37138

--
http://blog.solidsoft.info/ - Solid Soft - Working code is not enough

Daniel Spilker

Aug 9, 2016, 4:38:47 PM
to job-dsl-plugin
Thanks for investigating!

Do you have a pointer to the commit in Jenkins 2.2 that introduced the performance problem?

And have you pushed your caching PoC to GitHub so that I can take a look? It should be OK to cache the lookups. The cache only needs to be invalidated when a new plugin is installed. I'm already doing that for the embedded API viewer:
https://github.com/jenkinsci/job-dsl-plugin/blob/job-dsl-1.48/job-dsl-plugin/src/main/groovy/javaposse/jobdsl/plugin/JobDslPlugin.groovy#L21-L26
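
The invalidation rule described above (keep cached lookups until a plugin is installed) might look roughly like this. This is only a sketch under assumptions: `InvalidatingLookupCache` and `onPluginInstalled` are hypothetical names, and the code does not reproduce the linked JobDslPlugin.groovy or any Jenkins listener API.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Hypothetical cache; names are illustrative and not taken from Job DSL or Jenkins.
final class InvalidatingLookupCache {
    private final Map<String, Object> cache = new ConcurrentHashMap<>();

    Object get(String key, Function<String, Object> slowLookup) {
        return cache.computeIfAbsent(key, slowLookup);
    }

    // Assumed hook: call this from whatever callback reports a plugin installation.
    void onPluginInstalled() {
        cache.clear();
    }

    int size() {
        return cache.size();
    }
}

final class InvalidatingLookupCacheDemo {
    public static void main(String[] args) {
        InvalidatingLookupCache cache = new InvalidatingLookupCache();
        int[] installedPlugins = {1};
        // The "model" depends on the installed plugins, so a cached copy can go stale.
        Function<String, Object> slow = key -> key + "@plugins=" + installedPlugins[0];

        System.out.println(cache.get("gitScm", slow));   // computed
        installedPlugins[0]++;
        System.out.println(cache.get("gitScm", slow));   // still the cached, now stale value
        cache.onPluginInstalled();
        System.out.println(cache.get("gitScm", slow));   // recomputed after invalidation
    }
}
```

The second call shows the risk of a lifelong cache with no invalidation; clearing it on the install event makes the third call pick up the new state.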

Daniel

Daniel Spilker

Aug 10, 2016, 8:18:23 AM
to job-dsl...@googlegroups.com
I did some promising experiments with caching. Can you test https://github.com/jenkinsci/job-dsl-plugin/pull/891 in your environment to see if there is a benefit?

Daniel

--
You received this message because you are subscribed to the Google Groups "job-dsl-plugin" group.
To unsubscribe from this group and stop receiving emails from it, send an email to job-dsl-plugin+unsubscribe@googlegroups.com.
To post to this group, send email to job-dsl-plugin@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/job-dsl-plugin/b323e1d2-7444-411f-becd-87eb6c6887da%40googlegroups.com.

For more options, visit https://groups.google.com/d/optout.

al...@hollytree.co.uk

Aug 10, 2016, 11:41:28 AM
to job-dsl-plugin
For what it's worth, I am also seeing approximately 6x longer processing times with 1.48 than with 1.45 on Jenkins 1.658.

Mike Rooney

Aug 12, 2016, 2:56:08 PM
to job-dsl-plugin
I'm happy to test it but not sure how to build the hpi/jpi. I've got your branch checked out but can't find a directory where "mvn install" works like other plugins. Any tips?

We are also seeing much longer job generation times since upgrading from 1.40 => 1.48 (at the same time as we upgraded from Jenkins 1.651 LTS to Jenkins 2.7.1 LTS).

Mike Rooney

Aug 12, 2016, 4:44:50 PM
to job-dsl...@googlegroups.com
Got it, "./gradlew jpi" from the repo root :)

Tested locally first, then deployed to our production instance, which uses job-dsl to generate a few hundred jobs. It's awesome! We have 4 different jobs that generate different DSL from different repos; they went from ~15 min to ~1 min or less. Great work! Would love to see this in 1.49 for the next release.


Marcin Zajączkowski

Aug 13, 2016, 6:51:01 PM
to job-dsl...@googlegroups.com
On 2016-08-09 22:38, Daniel Spilker wrote:
> Thanks for investigating!
>
> Do you have a pointer to the commit in Jenkins 2.2 that introduced the
> performance problem?

Two commits by Kohsuke adding @Symbol annotations and even more @Symbol
annotations to Jenkins core. Without them, scanning for DescribableModel
(even with the new version of Job DSL) was significantly faster;
processing all those annotations is just slow in Jenkins itself.

https://github.com/jenkinsci/jenkins/commit/3d439015d822b4a3e4d4b111eb938af589b7abe3
https://github.com/jenkinsci/jenkins/commit/26f824632aa33b8ce7c2bd9cf3b34a8ede018c94

> And have you pushed your caching PoC to GitHub so that I can take a look?
> It should be OK to cache the lookups. The cache only needs to be
> invalidated when a new plugin is installed. I'm already doing that for the
> embedded API viewer:
> https://github.com/jenkinsci/job-dsl-plugin/blob/job-dsl-1.48/job-dsl-plugin/src/main/groovy/javaposse/jobdsl/plugin/JobDslPlugin.groovy#L21-L26

I had something similar as a quick & dirty patch to verify the result on
our Jenkins (and it helped a lot). However, first I wanted to ask you
about possible side effects, like those models changing at runtime (are
they mutable on the Jenkins side?). Is it safe to keep them cached
"forever" for the whole server instance? Because of that I was
thinking about rewriting it to have them cached for a given job's
execution only. Anyhow, I don't know the Jenkins model and maybe I'm too
careful.

I see that you already made and merged the PR. My bad, I was on holidays
and I lost my chance to contribute such a great performance boost to
job-dsl :(.

I will try to test your code the following week.

Marcin
--
http://blog.solidsoft.info/ - Working code is not enough

Daniel Spilker

Aug 15, 2016, 3:00:30 AM
to job-dsl...@googlegroups.com
Ah, sorry. I was on vacation before your vacation, so I couldn't reply earlier.

But I'm glad that we were able to fix the performance problems.

Daniel



Marcin Zajączkowski

Aug 15, 2016, 3:46:07 PM
to job-dsl...@googlegroups.com
On 2016-08-15 09:00, Daniel Spilker wrote:
> Ah, sorry. I was on vacation before your vacation, so I couldn't reply
> earlier.

No problem, I should have used ";(" instead of ":(". The effect seems to be
achieved and I'm happy about that :).

In general, I'm glad that the issue with @Symbol occurred. It drove me
to dive into the problem and in the end, thanks to caching, the overall
performance is much better than it was even before the regression. It is
especially visible in seed jobs generating hundreds or thousands of
jobs!

Before this fix I was planning to look into parallelizing seed job
execution, but now the ROI just seems to be too low.

Anyway, I'm still a little bit concerned about lifelong caching of all
those Jenkins DescribableModels. Can they be removed in Jenkins (by some
other operations outside job-dsl), or can the reference be changed,
leaving the job-dsl plugin with outdated elements?

Marcin
http://blog.solidsoft.info/ - Working code is not enough
