[JIRA] (JENKINS-48571) checkout scm fails silently after "Could not determine exact tip revision of <branch>" in logs


tristan@techsoft3d.com (JIRA)

Feb 12, 2018, 4:25:03 PM
to jenkinsc...@googlegroups.com
Tristan Lewis commented on Bug JENKINS-48571
 
Re: checkout scm fails silently after "Could not determine exact tip revision of <branch>" in logs

The PR which worked around the issue got closed

This issue is giving us a huge headache. We have a system of dozens of multibranch pipelines that are generated by a jobDSL script. Every time the jobs are re-seeded after updates to the jobDSL script, they all end up in a broken state where none of them can perform an SCM checkout step, failing with this "could not determine exact tip revision" error. I've had to create a utility job to force re-indexing of all the jobs. This fixes the broken state, but because of how many multibranch pipeline jobs we have and how many branches each of them indexes, only about half the jobs get indexed before our git provider (Bitbucket) starts rate-limiting us for making too many API requests during branch indexing.

This message was sent by Atlassian JIRA (v7.3.0#73011-sha1:3c73d0e)

andrew.bayer@gmail.com (JIRA)

Feb 12, 2018, 5:08:03 PM
to jenkinsc...@googlegroups.com

alexander.suter@axonivy.com (JIRA)

Feb 16, 2018, 3:53:02 PM
to jenkinsc...@googlegroups.com
Alex Suter commented on Bug JENKINS-48571
 

This problem is really annoying. After jenkins master restart, we have to rescan all builds, which builds all branches! And we quite often restart jenkins to update jenkins itself and also the system on which jenkins is running. The closed PR would at least solve the problem for the moment. In the meantime, can we work on a better solution?

ygorth@gmail.com (JIRA)

Feb 16, 2018, 4:14:04 PM
to jenkinsc...@googlegroups.com

I'm in the same boat, Alex Suter. My team is already talking about a possible migration to GoCD. Like you, we are always updating Jenkins and its plugins and this problem is causing a series of unexpected deployments. Anyways, total chaos in my environment right now.

ygorth@gmail.com (JIRA)

Feb 16, 2018, 4:16:04 PM
to jenkinsc...@googlegroups.com
Ygor Almeida edited a comment on Bug JENKINS-48571
I'm in the same boat, [~alexsuter]. My team is already talking about a possible migration to GoCD. Like you, we are always updating Jenkins and its plugins and this problem is causing a series of unexpected deployments. Anyways, total chaos in my environment right now.  At this point, even a workaround is very welcome.

ygorth@gmail.com (JIRA)

Feb 16, 2018, 4:17:03 PM
to jenkinsc...@googlegroups.com
Ygor Almeida edited a comment on Bug JENKINS-48571
I'm in the same boat, [~alexsuter]. My team is already talking about a possible migration to GoCD. Like you, we are always updating Jenkins and its plugins. This issue is causing a series of unexpected deployments. Anyways, total chaos in my environment right now. At this point, even a workaround is very welcome.

mneale@cloudbees.com (JIRA)

Feb 16, 2018, 9:58:03 PM
to jenkinsc...@googlegroups.com

mneale@cloudbees.com (JIRA)

Feb 16, 2018, 9:58:05 PM
to jenkinsc...@googlegroups.com
Michael Neale commented on Bug JENKINS-48571
 

ugh this sounds bad. I have a server that I update every day and have not seen this, but Tyler has a similar setup and has.

stephen.alan.connolly@gmail.com (JIRA)

Feb 17, 2018, 4:42:03 AM
to jenkinsc...@googlegroups.com

So I suspect the affected users are not configuring jobs through the UI.

When I configure the job via the UI, then the ID gets assigned correctly.

So unless somebody can show otherwise, this smells a lot like "user error" (not necessarily that it is the user's fault for the error mind)

Case in point, I took my own Jenkins and created a fresh multibranch project and added a GitSCMSource via the UI and saved =>

I inspected the config.xml on-disk manually and we have:

  <sources class="jenkins.branch.MultiBranchProject$BranchSourceList" plugin="branc...@2.0.18">
    <data>
      <jenkins.branch.BranchSource>
        <source class="jenkins.plugins.git.GitSCMSource" plugin="g...@3.7.0">
          <id>d3e70531-9f4d-4a7b-972f-339296b80997</id>
          <remote></remote>
          <credentialsId></credentialsId>
          <traits>
            <jenkins.plugins.git.traits.BranchDiscoveryTrait/>
          </traits>
        </source>
        <strategy class="jenkins.branch.DefaultBranchPropertyStrategy">
          <properties class="empty-list"/>
        </strategy>
      </jenkins.branch.BranchSource>
    </data>
    <owner class="org.jenkinsci.plugins.workflow.multibranch.WorkflowMultiBranchProject" reference="../.."/>
  </sources>

which matches exactly the id that is in memory.
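For anyone who wants to run the same check against their own jobs without opening each file by hand, here is a minimal Java sketch using only the JDK's XML parser. The class and method names are hypothetical, and it assumes the config.xml content has already been read into a string and has the same shape as the fragment above (a source element with a direct id child).

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

/** Hypothetical helper: lists source elements in config.xml content that lack a non-empty id. */
final class MissingSourceIdCheck {

    /** Returns the class attribute of every source element without a non-empty direct id child. */
    static List<String> sourcesWithoutId(String configXml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(configXml)));
            List<String> missing = new ArrayList<>();
            NodeList sources = doc.getElementsByTagName("source");
            for (int i = 0; i < sources.getLength(); i++) {
                Element source = (Element) sources.item(i);
                if (!hasNonEmptyId(source)) {
                    missing.add(source.getAttribute("class"));
                }
            }
            return missing;
        } catch (Exception e) {
            // Parsing problems are rethrown unchecked to keep the sketch short.
            throw new IllegalArgumentException("Could not parse config.xml content", e);
        }
    }

    /** True if the element has a direct child named id with non-blank text. */
    private static boolean hasNonEmptyId(Element source) {
        NodeList children = source.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            Node child = children.item(i);
            if (child.getNodeType() == Node.ELEMENT_NODE
                    && "id".equals(child.getNodeName())
                    && !child.getTextContent().trim().isEmpty()) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        String withId = "<sources><data><jenkins.branch.BranchSource>"
                + "<source class=\"jenkins.plugins.git.GitSCMSource\">"
                + "<id>d3e70531-9f4d-4a7b-972f-339296b80997</id><remote></remote>"
                + "</source></jenkins.branch.BranchSource></data></sources>";
        String withoutId = withId.replaceAll("<id>.*?</id>", "");
        System.out.println(sourcesWithoutId(withId));
        System.out.println(sourcesWithoutId(withoutId));
    }
}
```

If the theory in this thread holds, any source this reports is a candidate for the problem after the next restart.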

My initial suspect would be code like the job-dsl plugin. If the user was relying on an accidental side-effect that resulted in the id getting queried before the save, then that would explain things.

As I understand, CodeValet uses some automatic configuration mechanism to define jobs... looks like that mechanism is not assigning an id before configuring the list of BranchSources.

Now in https://github.com/jenkinsci/scm-api-plugin/blob/master/docs/consumer.adoc#scmsourceowner-contract we have

Ensure that SCMSource.setOwner(owner) has been called before any SCMSource instance is returned from either SCMSourceOwner.getSCMSources() or SCMSourceOwner.getSCMSource(id).

We should probably make it a more explicit part of the contract of an SCMSourceOwner that all SCMSource instances added to the owner must have an id assigned before ownership is set (or we can fall back to ensuring an id has been assigned as a (minimal) side-effect of calling SCMSource.setOwner(owner)).

That would minimize the issue for users, but probably should be a separate ticket.
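To make the suspected failure mode concrete, here is a toy Java model. It assumes, as the reasoning above implies, that an id is only generated the first time getId() is called, and that a save writes whatever the field holds at that moment. The classes are hypothetical stand-ins, not the real scm-api types.

```java
import java.util.UUID;

/** Toy model of a source whose id is generated lazily; not the real SCMSource. */
final class LazyIdModel {

    static final class Source {
        private String id; // stays null until assigned or generated

        Source(String id) {
            this.id = id;
        }

        /** Generates a random id on first use if none was assigned. */
        String getId() {
            if (id == null) {
                id = UUID.randomUUID().toString();
            }
            return id;
        }

        /** What a save would write: the raw field, without triggering generation. */
        String persistedId() {
            return id;
        }
    }

    /** Simulates save followed by a restart: the source is rebuilt from what was persisted. */
    static Source saveAndReload(Source source) {
        return new Source(source.persistedId());
    }

    /** Returns true if the id seen before the restart is the same one seen after it. */
    static boolean idSurvivesRestart(boolean callGetIdBeforeSave) {
        Source inMemory = new Source(null);
        if (callGetIdBeforeSave) {
            inMemory.getId(); // the side-effect that pins the id before saving
        }
        Source reloaded = saveAndReload(inMemory);
        return inMemory.getId().equals(reloaded.getId());
    }

    public static void main(String[] args) {
        System.out.println("id queried before save: " + idSurvivesRestart(true));
        System.out.println("id never queried before save: " + idSurvivesRestart(false));
    }
}
```

In this model the id only survives the simulated restart when something touched getId() before the save, which is the accidental side-effect described above.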

I also thought I had documented somewhere (but I cannot find where... and https://github.com/jenkinsci/branch-api-plugin/blob/master/docs/user.adoc seems considerably more anemic than I thought I had left it) that it was critical - if using stuff like JobDSL - that you must assign an id...

Hmmm https://github.com/jenkinsci/scm-api-plugin/blob/master/docs/implementation.adoc#implementing-jenkinsscmapiscmsource has a note on IDs...

SCMSource IDs
The SCMSource’s IDs are used to help track the SCMSource that a SCMHead instance originated from.

If - and only if - you are certain that you can construct a definitive ID from the configuration details of your SCMSource then implementations are encouraged to use a computed ID.

When instantiating an SCMSource from a SCMNavigator the navigator is responsible for assigning IDs such that two observations of the same source will always have the same ID.

In all other cases, implementations should use the default generated ID mechanism when the ID supplied to the constructor is null.

An example of how a generated ID could be definitively constructed would be:

Start with the definitive URL of the server including the port

Append the name of the source

Append a SHA-1 hash of the other configuration options (this is because users can add the same source with different configuration options)

If users add the same source with the same configuration options twice to the same owner, with the above ID generation scheme, it should not matter as both sources would be idempotent.

By starting with the server URL and then appending the name of the source we might be able to more quickly route events.

The observant reader will spot the issue above, namely that we need to start from an URL that is definitive. Most SCM systems can be accessed via multiple URLs. For example, GitHub can be accessed at both https://github.com/ and https://github.com./. For internal source control systems, this can get even more complex as some users may configure using the IP address, some may configure using a hostname without a domain, some may configure using a fully qualified hostname…​ also ID generation should not require a network connection or any external I/O.
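As a rough illustration of the ID scheme quoted above, here is a minimal Java sketch. The class name, the "::" separator, and the idea of flattening the remaining options into one string are all assumptions made for this example, not what any particular plugin does.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

/** Hypothetical sketch of a computed source id: server URL + source name + SHA-1 of other options. */
final class ComputedSourceId {

    /** Builds the id; the "::" separator is an arbitrary choice for this sketch. */
    static String compute(String serverUrl, String sourceName, String otherConfig) {
        return serverUrl + "::" + sourceName + "::" + sha1Hex(otherConfig);
    }

    /** Lower-case hex SHA-1 of the UTF-8 bytes of the given text. */
    static String sha1Hex(String text) {
        try {
            MessageDigest digest = MessageDigest.getInstance("SHA-1");
            byte[] hash = digest.digest(text.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder(hash.length * 2);
            for (byte b : hash) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // SHA-1 must be provided by every Java platform implementation.
            throw new IllegalStateException("SHA-1 not available", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(compute("https://github.com/", "example-repo", "traits=branches"));
        System.out.println(compute("https://github.com./", "example-repo", "traits=branches"));
    }
}
```

The two calls in main produce different ids for what is really the same server, which is the "definitive URL" problem the quoted text warns about. No network access or other I/O is needed to compute the id.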

But that was not the note I thought I wrote.

stephen.alan.connolly@gmail.com (JIRA)

Feb 17, 2018, 5:05:06 AM
to jenkinsc...@googlegroups.com

Action Items

  • R. Tyler Croy Can you provide details as to how you create the multibranch projects, specifically the code snippet where you set the sources on the multibranch project?
  • Leandro Narosky / Tristan Lewis / Ygor Almeida Can you confirm/deny whether you are using something like the JobDSL or another non-UI mechanism to create the affected multibranch projects? (If my theory is correct, the solution is completely in your hands: fix your scripts to assign a non-null ID. The ID can be anything, e.g. dummy, as long as it is unique within that SCMSourceOwner. The purpose of the ID is to allow determining branch take-over in the case where multiple sources have been configured in the multi-branch project, which results from the original vision that things like pull requests would be discovered by using a second source rather than having the primary source discover them through traits.)
  • Andrew Bayer The correct hack fix, if we need a fix (see note), should be in branch-api, we should be able to ensure that we call getId() on all sources before the save() which would prevent the issue on behalf of users that are unaware of the requirements to assign IDs.
    • There are two ways you could achieve this: 1. BranchSource's constructors could just call getId(). 2. MultiBranchProject's save() could just iterate all the sources calling getId(). I'll let you decide whether you want to do one or both of these. With this done, you should have the minimum fix.
    • You may want to consider adding a reflection based pre-emptive fix for the special case where there is one and only one source without an ID.
    • Basically, in onLoad you could look at the sources and, by reflection, peek at id. If exactly one of those inspected ids is null, then you can look at all the child branches and see if there is exactly one unknown id among all the child projects (and that id should not be the special "dead branch" id). If that set of conditions is met, then you can assign the discovered id and trigger a save().
    • The above could be a lot of work, but it would benefit users as anyone affected by this issue and in the majority case of exactly one source would get their jobs fixed automatically on restart.
    • The above would be the maximal fix, which would be nicer in that it fixes things for users.
    • NOTE: R. Tyler Croy / Leandro Narosky / Tristan Lewis / Ygor Almeida if Andrew Bayer implements this then you will get build storms every time you reconfigure the project using whatever mechanism you are using to update the job configuration.

Assuming my theory is correct and you are overwriting the sources periodically: since the sources you are overwriting with do not have an id assigned, we will keep assigning new ones, so all the branches will be rebuilt as there was a "takeover"... but the events will be picked up correctly.

This is the reason why I did not like the fix that Andrew Bayer attempted in https://github.com/jenkinsci/scm-api-plugin/pull/49 although I could not articulate it properly at the time.
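The conditions for the "maximal fix" in the action items could be sketched roughly like this in Java. Everything here is hypothetical: the class and method names, the use of plain strings for ids, and the dead-branch id being passed in as a parameter (its real value is not given in this thread). A real implementation would have to read the id fields by reflection in onLoad rather than receive them as a list.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Optional;
import java.util.Set;

/** Hypothetical sketch of the "exactly one source without an id" recovery check. */
final class OrphanIdRecovery {

    /**
     * sourceIds: raw ids of the configured sources (null where none was persisted).
     * branchSourceIds: the source id recorded on each child branch project.
     * Returns the id to assign to the single id-less source, or empty if the case is ambiguous.
     */
    static Optional<String> recoverId(List<String> sourceIds,
                                      List<String> branchSourceIds,
                                      String deadBranchId) {
        long missing = sourceIds.stream().filter(id -> id == null).count();
        if (missing != 1) {
            return Optional.empty(); // only the single-missing-id case is handled
        }
        Set<String> known = new HashSet<>();
        for (String id : sourceIds) {
            if (id != null) {
                known.add(id);
            }
        }
        Set<String> unknown = new HashSet<>();
        for (String id : branchSourceIds) {
            if (id != null && !id.equals(deadBranchId) && !known.contains(id)) {
                unknown.add(id);
            }
        }
        // Exactly one unknown id among the children means it can be adopted safely.
        return unknown.size() == 1
                ? Optional.of(unknown.iterator().next())
                : Optional.empty();
    }

    public static void main(String[] args) {
        List<String> sources = Arrays.asList((String) null);
        List<String> branches = Arrays.asList("abc", "abc", "dead-branch");
        System.out.println(recoverId(sources, branches, "dead-branch"));
    }
}
```

Returning empty in every ambiguous case keeps the check conservative: a job is only repaired when there is one possible answer.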

stephen.alan.connolly@gmail.com (JIRA)

Feb 17, 2018, 6:40:03 AM
to jenkinsc...@googlegroups.com

tyler@monkeypox.org (JIRA)

Feb 17, 2018, 8:39:02 AM
to jenkinsc...@googlegroups.com

Stephen Connolly none of the Multibranch Pipelines I configure are "automated." They are however largely GitHub Organization Folders (see also).

alexander.suter@axonivy.com (JIRA)

Feb 17, 2018, 9:58:03 AM
to jenkinsc...@googlegroups.com

Hi Stephen Connolly

I always create my multibranch build pipelines the following way:

  1. Blue Ocean UI
  2. Create new pipeline
  3. Choose Git
  4. Enter a git repository url from Bitbucket
  5. Create

So I always use the UI. Please let me know if I can give any other information. In my environment it is always reproducible.

alexander.suter@axonivy.com (JIRA)

Feb 17, 2018, 9:59:02 AM
to jenkinsc...@googlegroups.com
Alex Suter edited a comment on Bug JENKINS-48571
Hi [~stephenconnolly]


I always create my multibranch build pipelines the following way:
# Blue Ocean UI
# Create new pipeline
# Choose Git
# Enter a git repository url from Bitbucket
# Create

So I always use the UI. Please let me know if I can give any other information. In my environment it is always reproducible.
(And I always use declarative pipelines)

stephen.alan.connolly@gmail.com (JIRA)

Feb 17, 2018, 10:39:02 AM
to jenkinsc...@googlegroups.com

So looking at https://github.com/jenkinsci/blueocean-plugin/blob/master/blueocean-git-pipeline/src/main/java/io/jenkins/blueocean/blueocean_git_pipeline/GitPipelineCreateRequest.java

I note that it appears BlueOcean does not provide an ID when creating jobs (and BlueOcean bypasses the classic UI screen, so should be responsible for setting an ID)

In the case of GitHub, BlueOcean is providing an ID: https://github.com/jenkinsci/blueocean-plugin/blob/master/blueocean-github-pipeline/src/main/java/io/jenkins/blueocean/blueocean_github_pipeline/GithubPipelineCreateRequest.java

(I cannot get line links from my phone, so I can only point at the files and I may be misreading on a small screen)

R. Tyler Croy by any chance are you creating the jobs through BlueOcean?

@all is it only GitSCMSource that is affected?

stephen.alan.connolly@gmail.com (JIRA)

Feb 17, 2018, 4:27:06 PM
to jenkinsc...@googlegroups.com

So this is how BlueOcean creates its multibranch projects: https://github.com/jenkinsci/blueocean-plugin/blob/efb9de930e73454ebcda7625f168a426bc04f416/blueocean-pipeline-scm-api/src/main/java/io/jenkins/blueocean/scm/api/AbstractMultiBranchCreateRequest.java#L77-L78

Now this should only show up on Multibranch projects. If it is an org folder then the SCMNavigator is supposed to be assigning IDs based on a strict formula, e.g. https://github.com/jenkinsci/github-branch-source-plugin/blob/d60cc7617ee9ad56fd3ea3a3c3ad2569dc07c827/src/main/java/org/jenkinsci/plugins/github_branch_source/GitHubSCMNavigator.java#L1560-L1564 and https://github.com/jenkinsci/bitbucket-branch-source-plugin/blob/9f5551b9c05e3bb51c9046204f8871157804401b/src/main/java/com/cloudbees/jenkins/plugins/bitbucket/BitbucketSCMNavigator.java#L874-L883

My analysis

There are two separate issues here:

  1. The case of Multibranch Projects created by BlueOcean. In these cases BlueOcean is not assigning an ID and as a result, until the job has been reconfigured in the classic UI (it should be sufficient to just open & save the job), the job will have the issue on every restart. IOW I claim a workaround of opening and resaving the multibranch project in the classic UI. Please demonstrate otherwise.
  2. The case of Job DSL plugin being used incorrectly (no blame, just a statement of fact). This should be fixable by users assigning a static ID in the job definition. IOW I claim not a defect - at least once JENKINS-49610 has documented this as being a requirement.

We need to identify if there are any other issues being lumped in.

Vivek Pandey BlueOcean should be assigning an ID to all SCMSource instances it creates... I think blueocean would be the perfect ID to set. Can probably just do that by changing: https://github.com/jenkinsci/blueocean-plugin/blob/efb9de930e73454ebcda7625f168a426bc04f416/blueocean-pipeline-scm-api/src/main/java/io/jenkins/blueocean/scm/api/AbstractMultiBranchCreateRequest.java#L77 from

SCMSource source = createSource(project, scmConfig);

to

SCMSource source = createSource(project, scmConfig).withId("blueocean");

stephen.alan.connolly@gmail.com (JIRA)

Feb 18, 2018, 2:38:03 AM
to jenkinsc...@googlegroups.com

I think JENKINS-46290 is demonstrating the same issue as the Job DSL half of this. Andrew Bayer if you need to replicate a JobDSL configuration that is leaving the SCMSource.id == null I believe https://github.com/samrocketman/jervis/blob/b25af324cce229255fd34c9070f32da4d0d8b393/jobs/jenkins_job_multibranch_pipeline.groovy#L35-L62 is such an example. The fix to that should just be adding id 'some-value' within the github section.

alexander.suter@axonivy.com (JIRA)

Feb 18, 2018, 5:25:02 AM
to jenkinsc...@googlegroups.com

I can confirm that when I save the multibranch build pipeline in the traditional jenkins ui (changing the description), the problem does not occur anymore after restart of jenkins.

stephen.alan.connolly@gmail.com (JIRA)

Feb 18, 2018, 6:00:02 AM
to jenkinsc...@googlegroups.com

Alex Suter you shouldn’t even need to change the description. Just clicking “Save” on the classic UI screen should suffice if BlueOcean created the job.

You will need to repeat if BlueOcean updates the job config... and at that point you will get a rebuild storm (because BlueOcean is not round-tripping the ID)

stephen.alan.connolly@gmail.com (JIRA)

Feb 18, 2018, 12:24:05 PM
to jenkinsc...@googlegroups.com

Vivek Pandey so thinking on this some more, my suggested simple fix for BlueOcean actually needs to be slightly more complex. There are already a significant number of existing jobs that were created by BlueOcean and either have a null id on disk (and are suffering from this issue) or have a non-null id on disk.

If BlueOcean creates a new job, the simple fix is fine.

If BlueOcean updates a job, it needs to round-trip the existing id if and only if the SCMSource type remains the same, otherwise it will trigger a rebuild storm on restart. To be clear, the rebuild storm on restart is an issue right now with BlueOcean even if using a version of the git plugin that ensures a non-null id in the constructor.

So irrespective of everything else, BlueOcean needs to fix the round tripping of IDs during a configuration update... I claim BlueOcean is supposed to assign an ID during initial creation, but if you feel you have a strong argument to counter I am happy to hear it
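The round-trip rule described here ("keep the existing id if and only if the SCMSource type remains the same") can be written down as a tiny Java sketch. The class and method names are hypothetical, and source types are represented as plain class-name strings purely for illustration.

```java
/** Hypothetical sketch of choosing the id when a job's source is rewritten on update. */
final class IdRoundTrip {

    /**
     * Reuses the existing id only when one exists and the source type is unchanged;
     * otherwise falls back to the id used for newly created sources.
     */
    static String idForUpdatedSource(String existingType, String existingId,
                                     String newType, String fallbackId) {
        if (existingId != null && existingType != null && existingType.equals(newType)) {
            return existingId;
        }
        return fallbackId;
    }

    public static void main(String[] args) {
        // Same type and an id on disk: the id is kept, so no takeover is triggered.
        System.out.println(idForUpdatedSource(
                "jenkins.plugins.git.GitSCMSource", "d3e70531",
                "jenkins.plugins.git.GitSCMSource", "blueocean"));
        // No id on disk: the fallback is used.
        System.out.println(idForUpdatedSource(
                "jenkins.plugins.git.GitSCMSource", null,
                "jenkins.plugins.git.GitSCMSource", "blueocean"));
    }
}
```

Keeping the old id when the type is unchanged is what avoids the rebuild storm: the branches still point at a source id that exists after the update.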

mneale@cloudbees.com (JIRA)

Feb 19, 2018, 3:52:03 AM
to jenkinsc...@googlegroups.com

Stephen Connolly  Vivek Pandey I think open/save from classic is a perfectly fine workaround for blue users. If blue is patched so newly created pipelines are saved correctly, I still think it is ok to apply that fix/workaround of open and save for older pipelines.

Whilst this issue was flagged originally by Tyler with codevalet and blue ocean creation, it seems most comments are from jobDSL users (who I think can fix it in the scripts?). So perhaps all that is needed is for blue ocean to do the right thing for new pipelines (older ones can be worked around), and perhaps some warning for jobDSL users that they need to set an id? 

Also - nice sleuthing! this is a tricky one. 

 

mneale@cloudbees.com (JIRA)

Feb 19, 2018, 4:00:02 AM
to jenkinsc...@googlegroups.com
Michael Neale edited a comment on Bug JENKINS-48571
[~stephenconnolly]  [~vivek] I think open/save from classic is a perfectly fine workaround for blue users. If blue is patched so *newly* created pipelines are saved correctly, I still think it is ok to apply that fix/workaround of open and save for *older* pipelines.


Whilst this issue was flagged originally by Tyler with codevalet and blue ocean creation, it seems most comments are from jobDSL users (who I think can fix it in the scripts?). So perhaps all that is needed is for blue ocean to do the right thing for new pipelines (older ones can be worked around), and perhaps some warning for jobDSL users that they need to set an id? 

Also - nice sleuthing! this is a tricky one. 

 

vivek.pandey@gmail.com (JIRA)

Feb 19, 2018, 3:45:05 PM
to jenkinsc...@googlegroups.com

>If BlueOcean updates a job, it needs to round-trip the existing id if and only if the SCMSource type remains the same, otherwise it will trigger a rebuild storm on restart.

BlueOcean doesn't update a job. Even using the API, if they try to create a new job for a github/bitbucket/git repo, it errors out if a job with the same name exists. Once a user creates a blueocean pipeline job, the most they can do is trigger re-indexing. So a simple solution of creating a blueocean specific id should be ok.

vivek.pandey@gmail.com (JIRA)

Feb 19, 2018, 3:55:03 PM
to jenkinsc...@googlegroups.com

vivek.pandey@gmail.com (JIRA)

Feb 19, 2018, 6:19:02 PM
to jenkinsc...@googlegroups.com

mneale@cloudbees.com (JIRA)

Feb 19, 2018, 8:45:03 PM
to jenkinsc...@googlegroups.com

Leandro Narosky Tristan Lewis Alex Suter Ygor Almeida

Sorry for the drawn out things. 

For users of JobDSL the solution/work around is to always set an id for the SCM, something like: 

source {
    github {
        // Always set a stable id that is unique within the project
        id "owner-${project_folder}:repo-${project_name}"
        // ... rest of the github settings ...
    }
}

This will then correct the behavior. Does this work for you? (similarly for other SCMs)

If the jobs were created via some other way, then opening the config and saving it (no change needed) will fix it, and blue ocean has been patched to set the id correctly. 

 

See Sam Gleske's last comment here: https://issues.jenkins-ci.org/browse/JENKINS-46290?focusedCommentId=329162&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-329162 and his fix here: https://github.com/samrocketman/jervis/commit/e4cd6324ff22c3593d7e6feab88dff79e516e14b for an example of a jobDSL using it. 

mneale@cloudbees.com (JIRA)

Feb 19, 2018, 8:47:02 PM
to jenkinsc...@googlegroups.com

R. Tyler Croy so far this looks like jobs created by blue ocean (before the fix) and jobDSLs without an id set get this, but as for the github organisation folder - are you seeing it there ? (it may have a related issue)

 

Stephen Connolly do you know if github organisation folders could be bitten by this, not setting the id correctly? 

tyler@monkeypox.org (JIRA)

Feb 20, 2018, 12:55:03 PM
to jenkinsc...@googlegroups.com

I could have sworn I saw this with the GitHub Organization Folders on ci.jenkins.io as well as from my Code Valet instances, but I cannot find any record of it.

I might just be seeing ghosts in the machine, disregard

mneale@cloudbees.com (JIRA)

Feb 20, 2018, 7:01:04 PM
to jenkinsc...@googlegroups.com

ok good to know - I might close this now given the 'id' solution and the blue ocean fix. there is a linked follow on ticket to make the api (specifically for jobDSL) clearer here. Feel free to reopen if new information. 

R. Tyler Croy ok - well that fits with the theory, so that is good. 

mneale@cloudbees.com (JIRA)

Feb 20, 2018, 7:01:07 PM
to jenkinsc...@googlegroups.com