Stuck build numbers

Marius Gedminas

Sep 16, 2014, 1:11:28 AM
to jenkins...@googlegroups.com
I've got a Jenkins job 'ivija-my360' that shows builds #1-#19 in the
history sidebar, runs a new build #20 every time I trigger a build, and
then immediately forgets about it.

It also says "last stable build" is #24.

$ ls -l /var/lib/jenkins/jobs/ivija-my360/
total 28
drwxr-xr-x 26 jenkins jenkins 4096 Sep 16 07:23 builds
-rw-r--r-- 1 jenkins jenkins 2412 Sep 15 22:38 config.xml
-rw-r--r-- 1 jenkins jenkins 5130 Sep 16 07:23 disk-usage.xml
lrwxrwxrwx 1 jenkins jenkins 22 Sep 15 23:08 lastStable -> builds/lastStableBuild
lrwxrwxrwx 1 jenkins jenkins 26 Sep 15 23:08 lastSuccessful -> builds/lastSuccessfulBuild
-rw-r--r-- 1 jenkins jenkins 3 Sep 15 23:08 nextBuildNumber
drwxr-xr-x 4 jenkins jenkins 4096 Sep 15 19:49 outOfOrderBuilds
-rw-r--r-- 1 jenkins jenkins 244 Sep 16 07:36 scm-polling.log

$ cat /var/lib/jenkins/jobs/ivija-my360/nextBuildNumber
21

$ ls -l /var/lib/jenkins/jobs/ivija-my360/builds/
total 96
lrwxrwxrwx 1 jenkins jenkins 19 Feb 22 2013 1 -> 2013-02-22_16-31-37
lrwxrwxrwx 1 jenkins jenkins 19 Feb 28 2013 10 -> 2013-02-28_08-21-38
lrwxrwxrwx 1 jenkins jenkins 19 Mar 20 2013 11 -> 2013-03-20_07-04-11
lrwxrwxrwx 1 jenkins jenkins 19 Apr 9 2013 12 -> 2013-04-09_14-22-59
lrwxrwxrwx 1 jenkins jenkins 19 Apr 9 2013 13 -> 2013-04-09_17-14-00
lrwxrwxrwx 1 jenkins jenkins 19 Apr 9 2013 14 -> 2013-04-09_19-02-26
lrwxrwxrwx 1 jenkins jenkins 19 Apr 9 2013 15 -> 2013-04-09_20-46-00
lrwxrwxrwx 1 jenkins jenkins 19 Apr 16 2013 16 -> 2013-04-16_15-40-00
lrwxrwxrwx 1 jenkins jenkins 19 Apr 19 2013 17 -> 2013-04-19_09-24-25
lrwxrwxrwx 1 jenkins jenkins 19 Sep 15 19:59 19 -> 2014-09-15_19-59-26
lrwxrwxrwx 1 jenkins jenkins 19 Feb 22 2013 2 -> 2013-02-22_16-31-55
lrwxrwxrwx 1 jenkins jenkins 19 Sep 15 23:08 20 -> 2014-09-15_23-08-09
drwxr-xr-x 2 jenkins jenkins 4096 Feb 22 2013 2013-02-22_16-31-37
drwxr-xr-x 2 jenkins jenkins 4096 Feb 22 2013 2013-02-22_16-31-55
drwxr-xr-x 2 jenkins jenkins 4096 Feb 22 2013 2013-02-22_16-32-55
drwxr-xr-x 2 jenkins jenkins 4096 Feb 22 2013 2013-02-22_16-33-55
drwxr-xr-x 2 jenkins jenkins 4096 Feb 22 2013 2013-02-22_18-56-57
drwxr-xr-x 2 jenkins jenkins 4096 Feb 22 2013 2013-02-22_19-30-01
drwxr-xr-x 2 jenkins jenkins 4096 Feb 22 2013 2013-02-22_20-22-16
drwxr-xr-x 2 jenkins jenkins 4096 Feb 22 2013 2013-02-22_20-56-34
drwxr-xr-x 2 jenkins jenkins 4096 Feb 27 2013 2013-02-22_21-04-49
drwxr-xr-x 2 jenkins jenkins 4096 Mar 12 2013 2013-02-28_08-21-38
drwxr-xr-x 2 jenkins jenkins 4096 Apr 5 2013 2013-03-20_07-04-11
drwxr-xr-x 2 jenkins jenkins 4096 Apr 9 2013 2013-04-09_14-22-59
drwxr-xr-x 3 jenkins jenkins 4096 Apr 9 2013 2013-04-09_17-14-00
drwxr-xr-x 3 jenkins jenkins 4096 Apr 9 2013 2013-04-09_19-02-26
drwxr-xr-x 3 jenkins jenkins 4096 Apr 15 2013 2013-04-09_20-46-00
drwxr-xr-x 3 jenkins jenkins 4096 Apr 16 2013 2013-04-16_15-40-00
drwxr-xr-x 3 jenkins jenkins 4096 May 29 2013 2013-04-19_09-24-25
drwxr-xr-x 3 jenkins jenkins 4096 Jul 15 2013 2013-06-27_06-10-43
drwxr-xr-x 3 jenkins jenkins 4096 Jul 16 2013 2013-07-16_13-56-38
drwxr-xr-x 2 jenkins jenkins 4096 Jul 16 2013 2013-07-16_15-33-53
drwxr-xr-x 2 jenkins jenkins 4096 Jul 16 2013 2013-07-16_17-47-12
drwxr-xr-x 2 jenkins jenkins 4096 Mar 16 2014 2013-07-17_15-40-49
drwxr-xr-x 2 jenkins jenkins 4096 Sep 16 00:02 2014-09-15_19-59-26
drwxr-xr-x 2 jenkins jenkins 4096 Sep 16 07:23 2014-09-15_23-08-09
lrwxrwxrwx 1 jenkins jenkins 19 Jul 16 2013 21 -> 2013-07-16_15-33-53
lrwxrwxrwx 1 jenkins jenkins 19 Jul 16 2013 22 -> 2013-07-16_17-47-12
lrwxrwxrwx 1 jenkins jenkins 19 Jul 17 2013 24 -> 2013-07-17_15-40-49
lrwxrwxrwx 1 jenkins jenkins 19 Feb 22 2013 3 -> 2013-02-22_16-32-55
lrwxrwxrwx 1 jenkins jenkins 19 Feb 22 2013 4 -> 2013-02-22_16-33-55
lrwxrwxrwx 1 jenkins jenkins 19 Feb 22 2013 5 -> 2013-02-22_18-56-57
lrwxrwxrwx 1 jenkins jenkins 19 Feb 22 2013 6 -> 2013-02-22_19-30-01
lrwxrwxrwx 1 jenkins jenkins 19 Feb 22 2013 7 -> 2013-02-22_20-22-16
lrwxrwxrwx 1 jenkins jenkins 19 Feb 22 2013 8 -> 2013-02-22_20-56-34
lrwxrwxrwx 1 jenkins jenkins 19 Feb 22 2013 9 -> 2013-02-22_21-04-49
lrwxrwxrwx 1 jenkins jenkins 2 Sep 16 01:06 lastFailedBuild -> 19
lrwxrwxrwx 1 jenkins jenkins 2 Jul 17 2013 lastStableBuild -> 24
lrwxrwxrwx 1 jenkins jenkins 2 Jul 17 2013 lastSuccessfulBuild -> 24
lrwxrwxrwx 1 jenkins jenkins 2 Dec 11 2013 lastUnstableBuild -> -1
lrwxrwxrwx 1 jenkins jenkins 2 Sep 16 07:23 lastUnsuccessfulBuild -> 20

$ ls -l /var/lib/jenkins/jobs/ivija-my360/outOfOrderBuilds/
drwxr-xr-x 3 jenkins jenkins 4096 May 31 2013 2013-05-29_12-19-08
drwxr-xr-x 3 jenkins jenkins 4096 Jul 17 2013 2013-07-17_14-15-20


I have seen the "some builds are out of order, do you want to sweep them
under the rug" warning in Manage Jenkins once or twice, and pushed the
button. This warning doesn't show up now.

I'm running Jenkins 1.580.


This is similar to https://issues.jenkins-ci.org/browse/JENKINS-15156.
Unlike in the comments there, restarting Jenkins doesn't
bring the missing builds back. I'm also pretty sure my on-disk data
structures are messed up and I should fix them first -- or maybe the
out-of-order build monitor thingy should be fixed so that it can do that
correctly?

I found https://github.com/docwhat/jenkins-job-checker in the comments
of that bug and ran it against my job directory. Its output:

**** PROBLEMS ****
ivija-my360:
Problem: ORDER: The link builds/19 -> 2014-09-15_19-59-26 is out of order.
Problem: ORDER: The link builds/20 -> 2014-09-15_23-08-09 is out of order.
Problem: STOLEN: The date build builds/2013-06-27_06-10-43 had its number stolen by builds/19 -> 2014-09-15_19-59-26
Problem: STOLEN: The date build builds/2013-07-16_13-56-38 had its number stolen by builds/20 -> 2014-09-15_23-08-09
Problem: NEXT: The nextBuildNumber is set to 21 but I expected at least 25
Proposal: Archive out-of-order builds/19 -> 2014-09-15_19-59-26
Proposal: Archive out-of-order builds/20 -> 2014-09-15_23-08-09
Proposal: Relink 19 to builds/2013-06-27_06-10-43
Proposal: Archive newer build builds/2014-09-15_19-59-26
Proposal: Relink 20 to builds/2013-07-16_13-56-38
Proposal: Archive newer build builds/2014-09-15_23-08-09
Proposal: Reset nextBuildNumber
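
For readers who want to see what the ORDER and STOLEN checks above amount to, here is a minimal Python sketch. It is not how jenkins-job-checker is implemented; it only assumes the layout shown in the listing earlier (numeric symlinks pointing at `YYYY-MM-DD_HH-MM-SS` directories). The function names `scan_builds` and `out_of_order` are made up for illustration.

```python
import os
import re

NUMBER = re.compile(r"^[0-9]+$")
STAMP = re.compile(r"^\d{4}-\d{2}-\d{2}_\d{2}-\d{2}-\d{2}$")


def scan_builds(builds_dir):
    """Return (links, orphans).

    links maps build number -> timestamp directory name (the symlink target);
    orphans lists timestamp directories that no numeric symlink points at.
    """
    links = {}
    stamps = set()
    for name in os.listdir(builds_dir):
        path = os.path.join(builds_dir, name)
        if NUMBER.match(name) and os.path.islink(path):
            links[int(name)] = os.path.basename(os.readlink(path))
        elif STAMP.match(name) and os.path.isdir(path):
            stamps.add(name)
    orphans = sorted(stamps - set(links.values()))
    return links, orphans


def out_of_order(links):
    """Return build numbers whose timestamp is later than that of a higher number.

    Timestamp names in this format sort chronologically as plain strings.
    """
    bad = []
    numbers = sorted(links)
    for i, n in enumerate(numbers):
        later = [links[m] for m in numbers[i + 1:]]
        if later and links[n] > min(later):
            bad.append(n)
    return bad
```

Run against the `builds/` listing above, this logic would flag #19 and #20 as out of order and report `2013-06-27_06-10-43` and `2013-07-16_13-56-38` as unlinked, which agrees with the tool's output. It only reads the directory and changes nothing.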

The proposals seemed sane, so I ran jobber.rb --solve, stumbled upon
some bug[*], reran it and now the build directory seems fine.

[*] https://github.com/docwhat/jenkins-job-checker/issues/3

Trying to schedule a new build, of course, scheduled build #21 while
completely ignoring the fact that build #21 was now showing up in history.
Asking Jenkins to reload configuration from disk fixed that (I canceled
the scheduled build and scheduled a new one, which got number #25).

I'll see whether the fix works once my Jenkins gets through the build
queue, which will take a few hours (I'm moving to new slaves in fresh
VMs and am rebuilding EVERYTHING to make sure they're set up correctly).

It seems that OutOfOrderBuildMonitor ought to be a bit more stringent in
its checks, especially with the incorrect nextBuildNumber setting.
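
As a rough illustration of the nextBuildNumber check suggested here, the Python sketch below compares the number stored in the job's `nextBuildNumber` file with the highest numeric entry under `builds/`. It is only a sketch under the directory layout shown earlier, not what OutOfOrderBuildMonitor does; `expected_next_build_number` and `check_next_build_number` are hypothetical names.

```python
import os
import re


def expected_next_build_number(builds_dir):
    """Smallest number that does not collide with any numbered entry in builds/."""
    numbers = [int(n) for n in os.listdir(builds_dir) if re.fullmatch(r"[0-9]+", n)]
    return max(numbers, default=0) + 1


def check_next_build_number(job_dir):
    """Return (recorded, expected, ok) for a job directory."""
    with open(os.path.join(job_dir, "nextBuildNumber")) as f:
        recorded = int(f.read().strip())
    expected = expected_next_build_number(os.path.join(job_dir, "builds"))
    return recorded, expected, recorded >= expected
```

For the job above this would give a recorded value of 21 against an expected minimum of 25, the same mismatch jenkins-job-checker reported.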

Marius Gedminas
--
UNIX is user friendly. It's just selective about who its friends are.

Marius Gedminas

Sep 16, 2014, 11:45:39 AM
to jenkins...@googlegroups.com
On Tue, Sep 16, 2014 at 08:11:16AM +0300, Marius Gedminas wrote:
> I've got a Jenkins job 'ivija-my360' that shows builds #1-#19 in the
> history sidebar, builds a build #20 every time I try to build it, and
> then forgets about it immediately.
...
> This is similar to https://issues.jenkins-ci.org/browse/JENKINS-15156.
> What's different from the comments is that restarting Jenkins doesn't
> bring the missing builds back. I'm also pretty sure my on-disk data
> structures are messed up and I should fix them first -- or maybe the
> out-of-order build monitor thingy should be fixed to be able to do that,
> correctly?
>
> I found https://github.com/docwhat/jenkins-job-checker in the comments
> of that bug and ran it against my job directory
...
> The proposals seemed sane, so I ran jobber.rb --solve, stumbled upon
> some bug[*], reran it and now the build directory seems fine.
...
> Trying to schedule a new build, of course, scheduled build #21 while
> completely ignoring the fact build #21 was now showing up in history.
> Asking Jenkins to reload configuration from disk fixed that (I canceled
> the scheduled build and scheduled a new one, which got number #25).
>
> I'll see whether the fix works once my Jenkins gets through the build
> queue, which will take a few hours (I'm moving to new slaves in fresh
> VMs and am rebuilding EVERYTHING to make sure they're set up correctly).

The fix works.

A side effect of reloading configuration from disk is that any[+] jobs that
were running get lost! They're not listed in history or anywhere else, and
accessing them gives you 404 pages, both while they're still running and
after they've finished.

[+] extra fun: most of the running jobs got lost this way, but one
persistent job did not exhibit any problems, even though it was
running during multiple config reloads.

Reloading configuration from disk again, once the running job is done,
makes it appear in history again -- but last stable build symlinks do
not get updated and point to a previous successful build.
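
One way to see where the permalink symlinks point after a reload is to list them, as in the Python sketch below. It assumes the `last*Build` symlinks live in `builds/` as in the listing in the first message; `permalink_report` is a hypothetical helper. It cannot tell on its own whether a target is stale, only what each link points at and whether that target resolves.

```python
import os


def permalink_report(builds_dir):
    """Map each last*Build symlink to (raw target, whether the target resolves)."""
    report = {}
    for name in sorted(os.listdir(builds_dir)):
        path = os.path.join(builds_dir, name)
        if name.startswith("last") and os.path.islink(path):
            # os.path.exists follows the whole symlink chain, so a dangling
            # target such as "-1" is reported as False.
            report[name] = (os.readlink(path), os.path.exists(path))
    return report
```

In the listing from the first message, `lastUnstableBuild -> -1` would show up as not resolving, which appears to be how "no such build" is recorded there. Comparing the report before and after a reload should show whether `lastStableBuild` moved.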

(I may just have found a way to reproduce one case of JENKINS-15156.)

Marius Gedminas
--
"Linux: the operating system with a CLUE... Command Line User Environment".
(seen in a posting in comp.software.testing)

Ritica R

Apr 19, 2018, 2:06:42 PM
to Jenkins Users
I am having a similar issue with Jenkins and have been trying to solve it for days now.
I posted the details on Stack Overflow:
https://stackoverflow.com/questions/49661191/jenkins-symlinks-permalinks-broken-after-restart

Do you have any suggestions on how to fix this?