[JIRA] (JENKINS-58692) Change in treatment of Success - Stable vs. Unstable

bcooksley@kde.org (JIRA)

Jul 28, 2019, 5:36:01 AM
to jenkinsc...@googlegroups.com
Ben Cooksley created an issue
 
Jenkins / Bug JENKINS-58692
Change in treatment of Success - Stable vs. Unstable
Issue Type: Bug
Assignee: Unassigned
Components: core
Created: 2019-07-28 09:35
Priority: Major
Reporter: Ben Cooksley

We've recently noticed on our Jenkins instance (at build.kde.org) that builds which are unstable are no longer considered "Successful" by Jenkins.

This means that all of our views are now broken, because we've used "Successful" as meaning it successfully built (even if tests failed). Our expectations appear to align with the Jenkins terminology guide ( https://wiki.jenkins.io/display/JENKINS/Terminology )

This behaviour appeared sometime after Jenkins 2.184, and can be viewed at https://build.kde.org/job/Applications/view/Everything%20-%20stable-kf5-qt5/job/kopete/job/stable-kf5-qt5%20SUSEQt5.12/

(Note that only Build #1 is considered Successful, even though all builds of that job had the result of being Unstable. The correct behaviour in this instance should be for the latest Successful activity for that job to be Build #4 - as it did complete successfully, even if it is unstable)
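The expectation described above can be sketched as a small model. This is plain Groovy rather than Jenkins code: the `Result` enum and the `lastMatching` helper are hypothetical names, and the four-build history is invented to mirror the job linked above, where every build is Unstable. It assumes the reading of the terminology guide given in this report, in which "Successful" covers both stable and unstable builds.

```groovy
// Illustrative model only; these names are hypothetical, not Jenkins APIs.
enum Result { SUCCESS, UNSTABLE, FAILURE }

// Number of the most recent build whose result satisfies the predicate, or -1 if none does.
def lastMatching(Map<Integer, Result> builds, Closure<Boolean> accept) {
    def hit = builds.findAll { number, result -> accept(result) }.keySet().max()
    hit == null ? -1 : hit
}

// Four builds, all Unstable, as in the job linked above.
def builds = [1: Result.UNSTABLE, 2: Result.UNSTABLE, 3: Result.UNSTABLE, 4: Result.UNSTABLE]

// "Successful" here means the build completed, whether or not tests failed.
def lastSuccessful = lastMatching(builds) { it in [Result.SUCCESS, Result.UNSTABLE] }
// "Stable" means the build completed and nothing marked it unstable.
def lastStable = lastMatching(builds) { it == Result.SUCCESS }

println "lastSuccessfulBuild ${lastSuccessful}"
println "lastStableBuild ${lastStable}"
```

Under this model the latest Successful build is #4 and there is no Stable build at all, which is the behaviour the views were relying on.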

This message was sent by Atlassian Jira (v7.11.2#711002-sha1:fdc329d)

bcooksley@kde.org (JIRA)

Aug 7, 2019, 5:40:02 AM
to jenkinsc...@googlegroups.com
Ben Cooksley commented on Bug JENKINS-58692
 
Re: Change in treatment of Success - Stable vs. Unstable

Over the past week we've started receiving additional complaints that a number of projects were not getting builds triggered. Examination of the Polling logs would show something like the following:

Started on Aug 7, 2019 8:33:46 AM
Using strategy: Default
[poll] Last Built Revision: Revision 597ffa6a5e89b7e05180ccb3517973b3867d72fa (refs/remotes/origin/Applications/19.08)
No credentials specified
 > git --version # timeout=10
 > git ls-remote -h git://anongit.kde.org/konsole # timeout=10
Found 54 remote heads on git://anongit.kde.org/konsole
[poll] Latest remote head revision on refs/heads/Applications/19.04 is: 550cd447bc4bb79cc8920a147e84f7afb35406d6 - already built by 2
no polling baseline in /home/jenkins/workspace/Applications/konsole/stable-kf5-qt5 SUSEQt5.12 on Docker Swarm-353db413f644
no polling baseline in /home/jenkins/workspace/Applications/konsole/stable-kf5-qt5 SUSEQt5.12 on Docker Swarm-353db413f644
no polling baseline in /home/jenkins/workspace/Applications/konsole/stable-kf5-qt5 SUSEQt5.12 on Docker Swarm-353db413f644
no polling baseline in /home/jenkins/workspace/Applications/konsole/stable-kf5-qt5 SUSEQt5.12 on Docker Swarm-353db413f644
no polling baseline in /home/jenkins/workspace/Applications/konsole/stable-kf5-qt5 SUSEQt5.12 on Docker Swarm-353db413f644
Done. Took 0.1 sec
No changes

Examining the Jenkins Core changelog indicated that maintenance of symlinks within Jenkins Core for jobs/projects had been removed and transferred to a plugin. After installing that plugin and running the jobs again, we've found that correct functionality has been restored (both in terms of the branches being polled by Jenkins and the views being updated).

As such this now appears to be a regression, and given it prevents Git polling from working properly in certain cases (when it's managed as part of a Declarative Pipeline) it actually breaks core functionality of Jenkins.

 

bcooksley@kde.org (JIRA)

Aug 7, 2019, 5:41:02 AM
to jenkinsc...@googlegroups.com

mark.earl.waite@gmail.com (JIRA)

Aug 18, 2019, 12:20:02 AM
to jenkinsc...@googlegroups.com
Mark Waite commented on Bug JENKINS-58692
 
Re: Change in treatment of Success - Stable vs. Unstable

Ben Cooksley I don't understand how the change away from using symlinks has affected the git polling. Could you provide more details so that I can better understand? We may need help from Jesse Glick on differences related to the removal of symlinks.

bcooksley@kde.org (JIRA)

Aug 18, 2019, 4:15:01 AM
to jenkinsc...@googlegroups.com

Mark Waite I'm not sure how it managed to make an impact either, however the behaviour we were seeing prior to the restoration of the maintenance of the symlinks by the plugin was the behaviour noted above - namely, that changes to the branch name weren't being picked up.

Interestingly, it picked up that the last successful build was 'Applications/19.08', yet for reasons unknown continued to poll an older branch - 'Applications/19.04'.
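The mismatch can be spotted mechanically in a polling log. The following is a rough sketch in plain Groovy, not anything Jenkins provides: the variable names are hypothetical, the two log lines are copied from the polling log earlier in this thread, and the patterns assume the remote is named `origin` as it is there.

```groovy
// Hypothetical check for the symptom in a polling log; not part of Jenkins.
def log = '''\
[poll] Last Built Revision: Revision 597ffa6a5e89b7e05180ccb3517973b3867d72fa (refs/remotes/origin/Applications/19.08)
[poll] Latest remote head revision on refs/heads/Applications/19.04 is: 550cd447bc4bb79cc8920a147e84f7afb35406d6 - already built by 2
'''

// Branch recorded for the last build, with the remote-tracking prefix stripped.
def built = (log =~ /Last Built Revision: Revision \p{XDigit}+ \(refs\/remotes\/origin\/(.+)\)/)
// Branch whose remote head was actually polled.
def polled = (log =~ /Latest remote head revision on refs\/heads\/(\S+) is:/)

def builtBranch = built.find() ? built.group(1) : null
def polledBranch = polled.find() ? polled.group(1) : null

// Report only when both lines were found and they name different branches.
if (builtBranch && polledBranch && builtBranch != polledBranch) {
    println "Mismatch: last built ${builtBranch}, but polled ${polledBranch}"
}
```

A check like this only detects the symptom; it says nothing about why the older branch is still being polled.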

You can find copies of the Pipeline templates, along with the Job DSL scripts we use to provision all the jobs on our Jenkins instance at https://invent.kde.org/sysadmin/ci-tooling/tree/master/ (running of helpers/gather-jobs.py is required prior to trying to evaluate the dsl/*.groovy scripts)

To provide a bit of background, we reuse the same jobs when the stable branches for our software change, and just update the job to refer to the new branches as needed. This functionality had worked reliably until the release of Jenkins 2.185/2.186 (we jumped straight from 2.184 to 2.186 due to the Trilead SSH issues in 2.185).

The solution for us was to install the Symlink plugin, after which normal functionality was restored with 2.186+.

jglick@cloudbees.com (JIRA)

Aug 19, 2019, 9:44:02 AM
to jenkinsc...@googlegroups.com

So this is hypothesized to be a regression from JENKINS-37862? I cannot think offhand of any reason why that would be so; SCM plugins should not be relying on the existence of symlinks to resolve logical permalinks. The change in question did change how permalinks are cached, so as not to read symlinks for this purpose (now a plain text file is used instead), but the build-symlink plugin does not override this new mechanism, so it should not be able to fix any regression from that aspect.

Is there any known way to reproduce this bug, from scratch, using minimal instructions?

jglick@cloudbees.com (JIRA)

Aug 19, 2019, 9:59:03 AM
to jenkinsc...@googlegroups.com
Jesse Glick edited a comment on Bug JENKINS-58692
So this is hypothesized to be a regression from JENKINS-37862? I cannot think offhand of any reason why that would be so; {{workflow-job}} (the source of the {{no polling baseline in …}} message noted above) does not rely on the existence of symlinks to resolve logical permalinks. The change in question _did_ change how permalinks are cached, so as not to read symlinks for this purpose (now a plain text file is used instead), but the {{build-symlink}} plugin does not override this new mechanism, so it should not be able to _fix_ any regression from that aspect. The existence of the {{RunListener}} in that plugin could perhaps be forcing a cache update that would not otherwise occur, but I do not see how that could be so either, since {{PeepholePermalink}} already updates the cache for every standard permalink at the end of every build.

Is there any known way to reproduce this bug, from scratch, using minimal instructions?

bcooksley@kde.org (JIRA)

Aug 20, 2019, 7:31:03 AM
to jenkinsc...@googlegroups.com

I'm afraid I've not attempted to reproduce this bug, and experimenting with returning our production systems to a potentially broken state isn't really an option.

The only thing I could recommend in this case would be using https://invent.kde.org/sysadmin/ci-tooling/blob/master/pipeline-templates/SUSEQt5.12.template as a starting point.

The only Stage that matters in that job from the perspective of this bug is the Checkout Sources stage, so you can probably delete the rest without too much impact (although it may be worth forcing the build to always be UNSTABLE).

It is worth noting that we were also experiencing issues with job runs not being considered Successful by Jenkins unless they were also Stable, which impacted views as noted above. As only some jobs were experiencing the issue of not having the correct branches polled, it is possible that these two issues are somehow related, especially given that both disappeared once the plugin was installed.

Is it possible that the plugin is causing side effects within Jenkins - so it isn't the symlinks themselves that matter - but rather something else that it causes within Jenkins when performing the symlink update - which resolves our problem here?

The additional Groovy declarations you'll need to include to use the above template are as follows:

```
def repositoryUrl = "git://anongit.kde.org/konsole"
def browserUrl = "https://cgit.kde.org/konsole.git"
def branchToBuild = "master"
def productName = "Applications"
def projectName = "konsole"
def branchGroup = "kf5-qt5"
def currentPlatform = "SUSEQt5.12"
def ciEnvironment = "production"
def buildFailureEmails = "konsol...@kde.org"
def unstableBuildEmails = ""
```

jglick@cloudbees.com (JIRA)

Aug 20, 2019, 2:42:02 PM
to jenkinsc...@googlegroups.com

> Is it possible that the plugin is causing side effects within Jenkins - so it isn't the symlinks themselves that matter - but rather something else that it causes within Jenkins when performing the symlink update - which resolves our problem here?

That would be my guess, which is why I suspect that the exact sequence of operations matters.

bcooksley@kde.org (JIRA)

Aug 21, 2019, 6:13:03 AM
to jenkinsc...@googlegroups.com

In this case our sequence could broadly be summarised as:

1) Jobs created, using the Pipeline templates as noted above, with an initial branch of Applications/19.04
2) Jobs are then run, at which point Jenkins becomes aware that the jobs are tracking Applications/19.04
3) Jobs are subsequently updated by re-running the DSL Job, which updates the Pipeline templates to refer to Applications/19.08
4) Jobs are run again manually, which should have updated Jenkins to make it aware that Applications/19.08 should now be tracked
5) Subsequent polling results in Applications/19.04 still being polled...

Our Pipeline templates (aside from the additional declarations I've posted above) have remained broadly the same for quite some time.

 

jglick@cloudbees.com (JIRA)

Aug 21, 2019, 9:36:05 AM
to jenkinsc...@googlegroups.com

I was unable to reproduce such a problem in a new installation of 2.186 using a very simple setup. I made a local Git repo with one file and two branches a and b. I made a Pipeline like

node {
    git url: '/…/JENKINS-58692-repo', branch: 'a'
    sh 'cat file'
}

with SCM polling set to happen every minute. I did an initial build, then made a change to the a branch and waited. Build 2 ran as expected. Then I changed the branch in the script to b, ran another manual build, edited the b branch and waited. Build 4 ran as expected. The SCM polling log showed the expected things. The permalinks file had the expected contents at the end:

lastFailedBuild -1
lastStableBuild 4
lastSuccessfulBuild 4
lastUnstableBuild -1
lastUnsuccessfulBuild -1

Adding

currentBuild.result = 'UNSTABLE'

to the end of the script and doing it all over did not break anything; now the permalinks are

lastFailedBuild -1
lastStableBuild 4
lastSuccessfulBuild 8
lastUnstableBuild 8
lastUnsuccessfulBuild 8

as expected. (Yes, lastUnsuccessfulBuild is confusing: JENKINS-21706.)
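For anyone comparing their own permalinks files against the listings above, a check along these lines may help. It is only a sketch in plain Groovy: `parsePermalinks` is a hypothetical helper, and the "name number" line format is assumed from the contents quoted above rather than taken from any Jenkins specification.

```groovy
// Sketch only: parses text in the "name number" shape quoted above; not Jenkins code.
def parsePermalinks(String text) {
    // Skip blank lines, then turn each "name number" line into a map entry.
    def lines = text.readLines().findAll { it.trim() }
    lines.collectEntries { line ->
        def (name, number) = line.trim().split(/\s+/)
        [(name): Integer.parseInt(number)]
    }
}

// Contents as reported after the run that forced every build to UNSTABLE.
def links = parsePermalinks('''\
lastFailedBuild -1
lastStableBuild 4
lastSuccessfulBuild 8
lastUnstableBuild 8
lastUnsuccessfulBuild 8
''')

// If unstable builds count as successful, the last successful build
// can never be older than the last stable one.
assert links.lastSuccessfulBuild >= links.lastStableBuild
println links
```

A file in which lastSuccessfulBuild lags behind the most recent unstable build would match the symptom in the original report.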

I was about to ask whether you would be willing to install an experimental build which merely adds more detailed messages to the SCM polling log that might help narrow down the problem, but

> experimenting with returning our production systems to a potentially broken state isn't really an option

Of course; but do you have some sort of staging server available where a mirror of at least a representative subset of jobs could be installed, without interfering with production workflows? If not, and Mark Waite has no further ideas for reproducing, then I am afraid we would need to close this in the absence of any similar reports.

mark.earl.waite@gmail.com (JIRA)

Aug 24, 2019, 12:49:02 PM
to jenkinsc...@googlegroups.com

mark.earl.waite@gmail.com (JIRA)

Aug 26, 2019, 7:40:02 AM
to jenkinsc...@googlegroups.com
Mark Waite commented on Bug JENKINS-58692
 
Re: Change in treatment of Success - Stable vs. Unstable

I don't have any further ideas to offer.

o.v.nenashev@gmail.com (JIRA)

Aug 27, 2019, 6:05:02 AM
to jenkinsc...@googlegroups.com

bcooksley@kde.org (JIRA)

Aug 28, 2019, 6:21:02 AM
to jenkinsc...@googlegroups.com
Ben Cooksley commented on Bug JENKINS-58692
 
Re: Change in treatment of Success - Stable vs. Unstable

Unfortunately we don't have a test environment for this (due to the size of our Jenkins instance and the resources involved in operating even a small number of the jobs on it) and it would need to run jobs in order to try to reproduce this.

I'll report back if this issue reoccurs, but for now this can be closed. Thanks for investigating.

 

mark.earl.waite@gmail.com (JIRA)

Aug 28, 2019, 8:19:03 AM
to jenkinsc...@googlegroups.com
Change By: Mark Waite
Status: Open -> Fixed but Unreleased
Resolution: Cannot Reproduce
