Prevent building branches/PRs existing before the first branch indexing (Branch API plugin)


Adam Gabryś

Apr 15, 2020, 9:11:54 AM
to Jenkins Developers

Hello,

we provide CI for many teams in a big company. Unfortunately, we have a huge problem with server load after the spin-up process, after restarts, and after job modifications. Multibranch jobs execute branch indexing on every single change and directly after job creation. That is great, because people see what is available. On the other hand, when people don't delete old branches, we waste a lot of resources building old stuff. In our case it completely blocks servers for days. We tried to use build strategies, but none of them solved it. The closest solution was SkipInitialBuildOnFirstBranchIndexing, but unfortunately it:

  • blocks all new branch and PR builds triggered by webhooks (it always skips the first build, no matter how it is triggered)
  • won’t solve the problem when branch indexing is executed a second time (again, everything is built)

For now we use the patch visible in this PR (I'm aware that it won't be merged - it is just a workaround, I skipped tests). I wanted to ask - do you have any ideas on how to solve this properly? We are open to implement the proposed solution 🙂

For now, I have two ideas to make it configurable:

  1. add an option to disable automatic builds on branch indexing
  2. extend BranchBuildStrategy class and pass information about the trigger cause

Neither solution reproduces the workaround's behavior, because they only allow skipping all builds caused by branch indexing (we would lose the ability to build branches/PRs for which no webhook was received, for example because it was lost due to network problems).


Disable Automatic build on branch indexing

The first solution requires adding a new property to the multibranch project and simply skipping the build if causeFactory instanceof IndexingCauseFactory. I don't know how to add this configurable property yet, but I can find out 😉

This approach does not change the API, just adds a new parameter.
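
A minimal sketch of that check, using stand-in types rather than the real Branch API classes. `CauseFactory`, `EventCauseFactory`, `IndexingBuildGate` and the `suppressIndexingBuilds` flag are hypothetical names introduced only for illustration; the plugin's actual internals may look different.

```java
// Stand-in types for illustration only; they are NOT the real Branch API classes.
interface CauseFactory {}

// Marks a build requested by branch indexing.
final class IndexingCauseFactory implements CauseFactory {}

// Marks a build requested by a webhook event (hypothetical name).
final class EventCauseFactory implements CauseFactory {}

final class IndexingBuildGate {
    // Hypothetical per-project option: "do not build automatically on branch indexing".
    private final boolean suppressIndexingBuilds;

    IndexingBuildGate(boolean suppressIndexingBuilds) {
        this.suppressIndexingBuilds = suppressIndexingBuilds;
    }

    /** Returns true if a build should be scheduled for the given trigger. */
    boolean shouldSchedule(CauseFactory causeFactory) {
        if (suppressIndexingBuilds && causeFactory instanceof IndexingCauseFactory) {
            return false; // the branch is still indexed, only the build is skipped
        }
        return true; // webhook-triggered builds are unaffected
    }
}
```

With the option off, the gate behaves exactly like today, which is why this variant would not need an API change.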


Extend BranchBuildStrategy API

This approach is backward compatible, but introduces new methods to the API (in a class which is extended by many plugins).

How could we implement it:

  • add a new BranchBuildStrategy#isAutomaticBuild method which takes one more parameter (the trigger cause - indexing vs. webhook):
public boolean isAutomaticBuild(@NonNull TriggerCause cause,
                                @NonNull SCMSource source,
                                @NonNull SCMHead head,
                                @NonNull SCMRevision currRevision,
                                @CheckForNull SCMRevision lastBuiltRevision,
                                @CheckForNull SCMRevision lastSeenRevision,
                                @NonNull TaskListener listener) {
    // by default delegate to the version without the cause
    return isAutomaticBuild(source, head, currRevision, lastBuiltRevision, lastSeenRevision, listener);
}
  • add a new BranchBuildStrategy#automaticBuild which will be executed by the MultiBranchProject:
public final boolean automaticBuild(@NonNull TriggerCause cause,
                                    @NonNull SCMSource source,
                                    @NonNull SCMHead head,
                                    @NonNull SCMRevision currRevision,
                                    @CheckForNull SCMRevision lastBuiltRevision,
                                    @CheckForNull SCMRevision lastSeenRevision,
                                    @NonNull TaskListener listener) {
    // Use the cause-aware overload only if the concrete strategy overrides it
    if (Util.isOverridden(BranchBuildStrategy.class, getClass(), "isAutomaticBuild", TriggerCause.class,
            SCMSource.class, SCMHead.class, SCMRevision.class, SCMRevision.class, SCMRevision.class,
            TaskListener.class)) {
        return isAutomaticBuild(cause, source, head, currRevision, lastBuiltRevision, lastSeenRevision, listener);
    }
    // Otherwise fall back to the existing cause-less entry point
    return automaticBuild(source, head, currRevision, lastBuiltRevision, lastSeenRevision, listener);
}


It falls back to the original automaticBuild method (backward compatible).

  • create a new build strategy (in new or existing plugin) SkipBuildOnBranchIndexing
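
A rough sketch of what such a strategy could look like, written against simplified stand-in types so that it compiles on its own. `BuildStrategySketch`, the two-value `TriggerCause` enum and `MasterOnlyStrategy` are assumptions made for this example; the real BranchBuildStrategy methods also take SCMSource, SCMHead, SCMRevision and TaskListener arguments, which are reduced to a branch name here.

```java
// Hypothetical: the proposal does not define what TriggerCause looks like.
enum TriggerCause { INDEXING, EVENT }

// Simplified stand-in for BranchBuildStrategy (NOT the real class).
abstract class BuildStrategySketch {
    /** Existing-style decision that knows nothing about the trigger. */
    boolean isAutomaticBuild(String head) {
        return true;
    }

    /** Proposed overload: by default it delegates to the cause-less version. */
    boolean isAutomaticBuild(TriggerCause cause, String head) {
        return isAutomaticBuild(head);
    }
}

/** Builds on webhook events, never on branch indexing. */
final class SkipBuildOnBranchIndexing extends BuildStrategySketch {
    @Override
    boolean isAutomaticBuild(TriggerCause cause, String head) {
        return cause != TriggerCause.INDEXING;
    }
}

/** A legacy strategy that only overrides the old method keeps working unchanged. */
final class MasterOnlyStrategy extends BuildStrategySketch {
    @Override
    boolean isAutomaticBuild(String head) {
        return "master".equals(head);
    }
}
```

The legacy strategy illustrates the backward-compatibility argument: plugins that never heard of the cause parameter are still consulted through the default delegation.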

Let me know what you think.


Kind regards
Adam Gabryś

Tony Noble

Apr 16, 2020, 6:25:58 AM
to jenkin...@googlegroups.com
From my experience with this, I sympathise with your frustration.  In our situation (not sure if this maps to yours) the ideal sequence of events would be:

1. Project is created
2. Branches are indexed, but not built
3. Polling (or hooks) now enabled
4. Scheduled indexing now auto-builds new branches if selected.  Hooks will trigger builds

The main thing is that when a multibranch project is created for a repository with a significant number of old branches, not everything gets built at once. Like you, we found this would quite effectively bring a Jenkins instance to its knees.  Once the initial creation and first branch index are complete, however, existing functionality is fine.

Tony

--
You received this message because you are subscribed to the Google Groups "Jenkins Developers" group.
To unsubscribe from this group and stop receiving emails from it, send an email to jenkinsci-de...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/jenkinsci-dev/751d8c6f-50cd-4d10-bb90-861690f5d1fc%40googlegroups.com.

Adam Gabryś

Apr 16, 2020, 5:03:15 PM
to Jenkins Developers
Hi Tony,
Sorry, but I don't understand. You wrote that the existing functionality is fine, but the current behavior is as follows:
  1. Project is created
  2. Branches are indexed, all branches are built ← kills the server
  3. Polling (or hooks) now enabled
  4. Scheduled indexing now auto-builds new branches if selected. Hooks will trigger builds

My workaround changes it to what you described:
  1. Project is created
  2. Branches are indexed, but not built
  3. Polling (or hooks) now enabled
  4. Scheduled indexing now auto-builds new branches if selected. Hooks will trigger builds

I think such a feature is required by many people, so I would like to implement a solution which could be merged into the plugin. Patching every plugin release is very inefficient and error-prone.

My workaround was declined (I was expecting that), so I'm looking for a way to implement it that will be approved by the maintainers.

Kind regards
Adam Gabryś

      James Nord

      Apr 18, 2020, 4:44:48 AM
      to Jenkins Developers
      What SCM provider are you using?

      I fixed this by configuring the filter so that only branches matching 'master Stable-* PR-*' are built.

      This builds only the branches matching any one of those patterns. PR-* matches pull requests in GitHub (even if they come from a branch called wibble). Obviously you should tune master and Stable-* to match your use case.
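
To illustrate how a space-separated wildcard list like this can select heads, here is a small standalone sketch. `WildcardNameFilter` is a hypothetical helper, not the filter class Jenkins uses, and it assumes case-sensitive matching where `*` stands for any run of characters.

```java
// Hypothetical helper for illustration; not the Jenkins implementation.
final class WildcardNameFilter {
    private final java.util.regex.Pattern pattern;

    /** @param includes space-separated wildcards, e.g. "master Stable-* PR-*" */
    WildcardNameFilter(String includes) {
        StringBuilder regex = new StringBuilder();
        for (String wildcard : includes.trim().split("\\s+")) {
            if (regex.length() > 0) {
                regex.append('|'); // a head may match any one of the wildcards
            }
            // Quote the literal pieces and turn every '*' into ".*"
            String[] literals = wildcard.split("\\*", -1);
            for (int i = 0; i < literals.length; i++) {
                if (i > 0) {
                    regex.append(".*");
                }
                regex.append(java.util.regex.Pattern.quote(literals[i]));
            }
        }
        this.pattern = java.util.regex.Pattern.compile(regex.toString());
    }

    /** Returns true if the whole head name matches one of the wildcards. */
    boolean matches(String headName) {
        return pattern.matcher(headName).matches();
    }
}
```

With the list from the post, `master`, `Stable-1.0` and `PR-42` would be selected, while a branch such as `feature/wibble` would only be built through its pull request head.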

      Jesse Glick

      Apr 18, 2020, 9:33:42 AM
      to Jenkins Dev
      On Wed, Apr 15, 2020 at 9:11 AM Adam Gabryś <adam....@live.com> wrote:
      > when people don't delete old branches

      FYI, for GitHub at least you can address this directly:

      https://help.github.com/en/github/administering-a-repository/managing-the-automatic-deletion-of-branches

      tony....@gmail.com

      Apr 18, 2020, 11:48:15 AM
      to jenkin...@googlegroups.com
      I think the point is to prevent an automatic build of a shedload of old pre-existing branches when creating a multi branch pipeline job for an older repo, even though from then onwards, you want new branches to be auto-discovered and built. Regexes can’t always help with this.

      It’s not really an argument about whether repositories *should* be in that state, it’s that sometimes they are and you need to be able to work with them without crippling a Jenkins server for hours at a time when creating new jobs.


      Daniel Beck

      Apr 19, 2020, 6:40:07 AM
      to jenkin...@googlegroups.com

      Adam Gabryś

      Apr 19, 2020, 9:32:36 AM
      to Jenkins Developers
      Thank you for your comments. Answers to all proposals and questions below:

      > what scm prilovider are you using?

      At this point we support only GitHub, but we may support other SCMs in the future too.


      > I fixed this by adding the only build matching branches to be 'master Stable-* PR-*'

      > Regexes can’t always help with this.

      This doesn't prevent building branches after the job creation. If I inform developers that only `ABC-*` branches and PRs are built, then all branches will be called `ABC-`, because developers need CI results (the build executes a lot of additional tools, such as SonarQube or WhiteSource).


      > for GitHub at least you can address this directly

      SCM services are managed by other people. Even if all teams always delete old branches, the load is still huge. Example: take 35 teams, each with 10 branches (develop, master, 5 features, 2 epic branches and 1 for the previous release) and 2 PRs. Assume an average build takes about 1 hour (monolithic applications; some need 20 minutes, others 2 hours). After a server is created (K8S environment, server fully provisioned), I have 35 * 12 = 420 builds. We have 120 dynamic agents, and 420 / 120 = 3.5, which means that the server is blocked for 3.5 hours. We also waste money, because the server is hosted in the cloud.
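
The estimate above can be reproduced with a few lines of code. This is only a back-of-the-envelope model: `BuildStormEstimate` is a hypothetical helper, and it assumes that all 120 build agents stay busy the whole time and that every build takes the average duration.

```java
// Hypothetical helper reproducing the arithmetic from the post.
final class BuildStormEstimate {
    /**
     * Wall-clock hours until the initial build storm is drained, assuming every
     * agent stays busy and each build takes avgBuildHours.
     */
    static double hoursBlocked(int teams, int headsPerTeam, int agents, double avgBuildHours) {
        int builds = teams * headsPerTeam; // 35 * 12 = 420 in the example
        return builds * avgBuildHours / agents;
    }
}
```

Calling `BuildStormEstimate.hoursBlocked(35, 12, 120, 1.0)` gives 3.5, matching the figure in the post; real queues would likely take longer because build times vary.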

      Of course it is a good idea to always ask developers to enable this option, just to keep repositories clean.
      We cannot rely on it because:
      1) If developers need to build an older version of any application, they should be able to do it (for many reasons, such as deploying it somewhere). I would rather not force them to push commits like "empty, show the branch on CI"
      2) The application is under heavy development; even building only branches and PRs less than 1 week old kills the server

      James Nord

      Apr 19, 2020, 9:44:30 AM
      to Jenkins Developers
      > > I fixed this by adding the only build matching branches to be 'master Stable-* PR-*'

      > > Regexes can’t always help with this.

      > This doesn't prevent building branches after the job creation.

      Correct, but it will build interesting branches, not dangling ones.

      > If I inform developers that only `ABC-*` branches and PR are build, then all branches will be called `ABC-`, because developers need CI results (it executes a lot of additional tools like: SonarQube or WhiteSource)

      If a developer wants a build, they can still open a draft PR in GitHub. If you tell them that, why would they create branches called ABC-?

      Adam Gabryś

      Apr 19, 2020, 10:44:47 AM
      to Jenkins Developers
      There are two strategies for building PRs:
      1) test the PR before the merge operation - the same as building the source branch
      2) test the PR after the (virtual) merge operation - presents the state after merging

      We use the second strategy, because the source branch could work standalone, but break everything once merged with the latest changes.

      It means that if we force developers to create a draft PR just to execute a build, they will have to:
      * create a new branch from the branch they are interested in
      * create a PR from the new branch to the existing branch
      I don't believe any developer would like to use such a flow.

      > correct but it will build interesting branches not dangling ones.

      I'm confused about these "interesting" branches and filtering by regex. In my company people use Git-Flow. Branches are created to work on new features or bug fixes. I would classify all code changes as interesting to execute on the CI.

      Liam Newman

      Apr 23, 2020, 3:55:17 PM
      to Jenkins Developers

      Adam, 
      I think the feature/behavior you're suggesting is definitely worth implementing.  It would let people safely create new projects without dealing with build storms.

      So to reiterate - the behavior we're looking for is:
      1. Project is created
      2. Branches are indexed, but not built
      3. Polling (or hooks) now enabled
      4. Scheduled indexing now auto-builds new branches if selected. Hooks will trigger builds
      You mentioned SkipInitialBuildOnFirstBranchIndexing.  I think something like that is the way to go here.  Only instead of skipping on the "first indexing of each branch", it would be "Skip Build on First Job Indexing".  I know there are several possible ways to detect the state "this is the first time a project has been indexed", but I'm not sufficiently knowledgeable to be able to point you to the exact right spot.  Sorry. But I firmly believe this should be doable without changing the SCM or Branch API.  We just need to do some digging to find it.
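
Conceptually, "Skip Build on First Job Indexing" boils down to a one-way flag per project. The sketch below only illustrates that state machine: `FirstIndexingGate` is a hypothetical class, and where such a flag would actually live (and be persisted) in the plugin is exactly the open question in this message.

```java
// Hypothetical illustration of the "first job indexing" state; not plugin code.
final class FirstIndexingGate {
    // In a real plugin this would have to be persisted together with the project.
    private boolean firstIndexingDone;

    /** Decide whether a branch discovered by indexing should be built. */
    boolean shouldBuildOnIndexing() {
        return firstIndexingDone;
    }

    /** Call once an indexing run has completed. */
    void indexingCompleted() {
        firstIndexingDone = true;
    }
}
```

Everything found by the first run is indexed without building; branches that show up in later scheduled runs are built as usual.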

      Sound good? 

      -L. 

      Jon Brohauge

      Apr 24, 2020, 3:48:12 AM
      to Jenkins Developers
      We have a lot of activity around spinning up new Jenkins instances with preconfigured pipeline jobs, and thus have a lot of "first builds". These are IMHO just a waste of agent capacity, and I would love to be able to trigger an "index, but don't build".

      Regards,
      Jon

      Daniel Beck

      Apr 24, 2020, 7:07:57 AM
      to Jenkins Developers


      > On 24. Apr 2020, at 09:48, Jon Brohauge <jonbr...@gmail.com> wrote:
      >
      > I would love to be able to trigger a "index, but don't build".

      Isn't this possible with a suitable build strategy configuration? For example, use "Named Branches" with a branch name that doesn't exist and never will. It should also be possible to write a plugin that just builds nothing automatically, if one doesn't exist yet.

      It's a bit unfortunate that having no build strategy doesn't do what one would expect.

      Tony Noble

      Apr 24, 2020, 7:23:01 AM
      to jenkin...@googlegroups.com
      This gets my vote - exactly the sequence that I've wished existed for some time now.




      Jon Brohauge

      Apr 24, 2020, 7:42:38 AM
      to jenkin...@googlegroups.com
      If implementing a solution like the one proposed by Adam works, then why not do it? It gives everyone choice, without the need to create yet another plugin that is basically an "on/off"-switch.

      In our case, our developer teams order a Jenkins instance that, at first boot, gets prepopulated with the GitHub Organizations the team requested. This is fully automatic, using some groovy-init-scripts in a dockerized Jenkins instance.

      Our Jenkins instances contain no state, since everything is configured by a combination of JCasC and Jenkinsfiles. To keep the Jenkins instances from growing stale, we treat them as cattle, and as such these instances have a relatively short lifetime.

      Whenever a developer creates a new instance, they must wait for all the builds of all the branches in all the repositories in all their organizations to finish before they can actually move forward. They have webhooks in their GitHub Organizations that trigger new builds on commit. This creates a lot of builds that IMO are a waste of time, since all the code in those repositories has probably already been built. Having a flag that does what has been proposed solves this issue for us.

      We do have a build strategy, albeit a simple one. Deployable artifacts are built from master-branch. Whatever processes, rules, and occult rites each developer team has to perform getting their code to master, is beyond our concern.

      We do not believe in requiring developer teams to name their branches in a certain way. It is the responsibility of the developer team to have a naming strategy for their organizations, repositories, and branches. We treat our developers as professionals and thus expect them to be just that, professional. Being a professional implicitly means that you are a responsible employee, and thus do not need to be micromanaged.

      Adam Gabryś

      Apr 24, 2020, 8:39:24 AM
      to Jenkins Developers
      @Liam Newman

      It is impossible to implement a new strategy in a nice way without changing the BranchBuildStrategy class. I have access only to the things that are passed to it. I wouldn't like to implement a class that requires configuring the job name and then uses the Jenkins API to get the item by name, etc.

      "Skip Build on First Job Indexing" is not enough. Users are able to configure indexing on a regular basis (for example once per day). This is useful: if a webhook is not delivered, then on the next day developers can execute the job from the Jenkins UI. Without indexing, the branch would never become available in such a case.

      I think the cleanest solution is to pass information about the build reason to the strategy (indexing or event). This makes it possible to implement simple strategies like "SkipBuildOnJobIndexing", or more advanced ones if anybody needs them. For us, and I think for most people, "SkipBuildOnJobIndexing" is enough (probably most people use hooks instead of polling).


      @Daniel Beck

      The BranchBuildStrategy class gives access to SCMHead, so you are able to get a branch name. There is no information about the build cause.


      @Jon Brohauge

      My solution works, but it is not clean. As I wrote above, the cleanest solution is to add a build cause to the BranchBuildStrategy class. Then people may implement simple and complex strategies required by them.

      Finally, we have exactly the same situation as you: I provide CI as a Service, and I don't own or influence the teams' repositories.

      Tony Noble

      Apr 24, 2020, 9:07:51 AM
      to jenkin...@googlegroups.com
      I'm confused.

      As far as I was aware, the 'Suppress automatic SCM triggering' property already does what you require - branches will be indexed, but no builds will ever be triggered by an SCM poll

      Or perhaps there's three different requirements here - as I've already said, I'm a big fan of the "different behaviour on first indexing" approach, but it looks like your requirement is a little more specific.




      Adam Gabryś

      Apr 24, 2020, 12:16:16 PM
      to Jenkins Developers
      "Suppress automatic SCM triggering" skips all builds executed by SCM change - indexing and hooks (events). Only jobs manually executed are built. See source code.

      However, your suggestion is very good. I could make this branch property parameterized:
      • all - skip both (default)
      • indexing - skip only indexing
      • events - skip only webhooks
      My current workaround skips builds of branches found during the job creation. It means that if you create a job, then a branch, and the webhook is not delivered, my workaround still builds the branch on indexing. Unfortunately, implementing this properly requires extending the API in more places. The idea of extending the branch property does not require many changes and protects against building the whole world after the job creation. I think it solves the problem well enough for a first attempt.
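
The three proposed modes could be modelled roughly as follows. This is a sketch under assumptions: `SuppressionScope` and `TriggerKind` are hypothetical names, and the real "Suppress automatic SCM triggering" property may be structured quite differently.

```java
// Hypothetical model of what triggered a build request.
enum TriggerKind { INDEXING, EVENT, MANUAL }

// Hypothetical parameter for the suppression property.
enum SuppressionScope {
    ALL,       // skip both indexing and webhook builds (current behavior, default)
    INDEXING,  // skip only builds triggered by branch indexing
    EVENTS;    // skip only builds triggered by webhooks

    /** Returns true if a build with the given trigger should be suppressed. */
    boolean suppresses(TriggerKind trigger) {
        switch (this) {
            case ALL:
                return trigger != TriggerKind.MANUAL; // manual builds always run
            case INDEXING:
                return trigger == TriggerKind.INDEXING;
            case EVENTS:
                return trigger == TriggerKind.EVENT;
            default:
                throw new AssertionError(this);
        }
    }
}
```

Keeping `ALL` as the default would preserve the behavior of existing job configurations.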

      I'll propose this in the PR.

      Thanks :)

      Adam Gabryś

      Apr 24, 2020, 12:56:21 PM
      to Jenkins Developers
      I added a new comment to the PR with a proposed solution.