Jenkins failing to restart properly.

Doug Whitfield

Sep 9, 2021, 4:54:28 PM
to Jenkins Users
Hi folks

My team has been working on an issue on-and-off since July 23rd.

I think we might have hit the jackpot in terms of trying to reproduce the issue that affected us initially on July 23rd. Here’s what happened:

  • Once the copy of the Prod Jenkins Home finished, I started Jenkins in quiet mode (I didn’t want a prod deployment that runs on a schedule to run in stage by mistake). Jenkins started without issues.
  • Then, I disabled all the jobs (again, to prevent a job from running by mistake when I took Jenkins out of quiet mode).
  • Then, since we were running stage with production’s config, the stage controller actually connected to the prod AWS account to create the agents there. Oops.
  • Since having stage create its agents in the wrong AWS account is not ideal, I ran my ansible configuration playbook in stage. It took three restarts, and Jenkins didn’t crash in any of them. Stage configuration was successful!
  • From the UI, I disabled quiet mode, but I noticed the builds were not starting. The log showed:


2021-09-07 20:19:11.628+0000 [id=29] SEVERE hudson.triggers.SafeTimerTask#run: Timer task hudson.model.Queue$MaintainTask@7a94f7bb failed
java.lang.IllegalStateException: The class jenkins.security.QueueItemAuthenticatorConfiguration was not found, potentially not yet loaded
    at hudson.ExtensionList.getInstance(ExtensionList.java:166)
    at jenkins.security.QueueItemAuthenticatorConfiguration.get(QueueItemAuthenticatorConfiguration.java:61)
    at jenkins.security.QueueItemAuthenticatorConfiguration$ProviderImpl.getAuthenticators(QueueItemAuthenticatorConfiguration.java:70)
    at jenkins.security.QueueItemAuthenticatorProvider$IteratorImpl.hasNext(QueueItemAuthenticatorProvider.java:44)
    at hudson.model.Queue$Item.authenticate(Queue.java:2331)
    at hudson.model.Node.canTake(Node.java:401)
    at hudson.model.Queue.makeFlyWeightTaskBuildable(Queue.java:1736)
    at hudson.model.Queue.makeBuildable(Queue.java:1698)
    at hudson.model.Queue.maintain(Queue.java:1546)
    at hudson.model.Queue$MaintainTask.doRun(Queue.java:2902)
    at hudson.triggers.SafeTimerTask.run(SafeTimerTask.java:91)
    at jenkins.security.ImpersonatingScheduledExecutorService$1.run(ImpersonatingScheduledExecutorService.java:58)
    at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
    at java.base/java.util.concurrent.FutureTask.runAndReset(FutureTask.java:305)
    at java.base/java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:305)
    at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
    at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
    at java.base/java.lang.Thread.run(Thread.java:829)

  • So I restarted Jenkins one more time (again with the same configuration my playbook had left in place after the previous restart, no changes), and this time startup failed with:


java.lang.IllegalStateException: Expected 1 instance of jenkins.security.s2m.AdminWhitelistRule but got 0
    at hudson.ExtensionList.lookupSingleton(ExtensionList.java:451)
    at io.jenkins.plugins.casc.core.AdminWhitelistRuleConfigurator.instance(AdminWhitelistRuleConfigurator.java:59)
    at io.jenkins.plugins.casc.core.AdminWhitelistRuleConfigurator.instance(AdminWhitelistRuleConfigurator.java:42)
    at io.jenkins.plugins.casc.BaseConfigurator.check(BaseConfigurator.java:286)
    at io.jenkins.plugins.casc.BaseConfigurator.configure(BaseConfigurator.java:351)
    at io.jenkins.plugins.casc.BaseConfigurator.check(BaseConfigurator.java:287)
    at io.jenkins.plugins.casc.ConfigurationAsCode.lambda$checkWith$8(ConfigurationAsCode.java:777)
    at io.jenkins.plugins.casc.ConfigurationAsCode.invokeWith(ConfigurationAsCode.java:713)
    at io.jenkins.plugins.casc.ConfigurationAsCode.checkWith(ConfigurationAsCode.java:777)
    at io.jenkins.plugins.casc.ConfigurationAsCode.configureWith(ConfigurationAsCode.java:762)
    at io.jenkins.plugins.casc.ConfigurationAsCode.configureWith(ConfigurationAsCode.java:638)
    at io.jenkins.plugins.casc.ConfigurationAsCode.configure(ConfigurationAsCode.java:307)
    at io.jenkins.plugins.casc.ConfigurationAsCode.init(ConfigurationAsCode.java:299)

This issue has shown up before. Usually another restart fixes it, but I’ve now restarted Jenkins about 4 times and it still shows the same error. I’m hoping this will let us investigate what’s going on in more detail.
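
For anyone comparing logs across restarts, here is a minimal sketch of how the two failures above could be tallied from a Jenkins log. It is only an illustration: the regular expressions are assumptions based on the two messages quoted in this post, not a complete list of extension-loading errors, and the function name and the log path passed on the command line are hypothetical.

```python
import re
import sys
from collections import Counter

# Assumed patterns, taken from the two messages quoted in this post.
# Other Jenkins versions or plugins may word these errors differently.
PATTERNS = [
    re.compile(r"IllegalStateException: The class (\S+) was not found"),
    re.compile(r"IllegalStateException: Expected 1 instance of (\S+) but got 0"),
]


def missing_extensions(log_text):
    """Count how often each extension class is reported missing in the log text."""
    counts = Counter()
    for line in log_text.splitlines():
        for pattern in PATTERNS:
            match = pattern.search(line)
            if match:
                counts[match.group(1)] += 1
                break  # a line matches at most one pattern
    return counts


if __name__ == "__main__" and len(sys.argv) > 1:
    # Hypothetical usage: python scan_log.py /path/to/jenkins.log
    with open(sys.argv[1], encoding="utf-8", errors="replace") as handle:
        for class_name, count in missing_extensions(handle.read()).most_common():
            print(f"{count:5d}  {class_name}")
```

Running it against the log from each restart would show whether the same extension classes go missing every time or whether the set changes between restarts.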

I have the GC logs, logs, thread dumps and an SOS report from stage. The latest PID is 2058587, so the most recent GC log is gc-2058587-2021-09-07_16-11-45.log.

Some of those would need to be sanitized before I can share, but let me know if any of that would be useful.

First and foremost, is there a fix for this? Secondly, is this a known bug?

Best Regards,

Doug Whitfield
