[JIRA] (JENKINS-56140) Failed to count the # of live instances on Kubernetes because of expired bearer token

filipbrychta@gmail.com (JIRA)

Feb 14, 2019, 8:59:02 AM
to jenkinsc...@googlegroups.com
Filip Brychta created an issue
 
Jenkins / Bug JENKINS-56140
Failed to count the # of live instances on Kubernetes because of expired bearer token
Issue Type: Bug
Assignee: Carlos Sanchez
Components: kubernetes-plugin
Created: 2019-02-14 13:58
Environment: Jenkins ver. 2.150.2
kubernetes-plugin 1.14.3
OpenShift Master: v3.9.30
Kubernetes Master: v1.9.1+a0ce1bc657
Priority: Major
Reporter: Filip Brychta

After upgrading kubernetes-plugin from 1.12.7 to 1.14.3, everything worked fine at first, but after 24 hours no new slave pods were created and the following exception appeared in the log:

Feb 11, 2019 3:53:01 PM okhttp3.internal.platform.Platform log
INFO: {"kind":"Status","apiVersion":"v1","metadata":{},"status":"Failure","message":"Unauthorized","reason":"Unauthorized","code":401}

Feb 11, 2019 3:53:01 PM okhttp3.internal.platform.Platform log
INFO: <-- END HTTP (129-byte body)
Feb 11, 2019 3:53:01 PM org.csanchez.jenkins.plugins.kubernetes.KubernetesCloud provision
WARNING: Failed to count the # of live instances on Kubernetes
io.fabric8.kubernetes.client.KubernetesClientException: Failure executing: GET at: https://b22.jonqe.lab.eng.bos.redhat.com:8443/api/v1/namespaces/jenkins-slaves/pods?labelSelector=jenkins%3Dslave. Message: Unauthorized. Received status: Status(apiVersion=v1, code=401, details=null, kind=Status, message=Unauthorized, metadata=ListMeta(_continue=null, resourceVersion=null, selfLink=null, additionalProperties={}), reason=Unauthorized, status=Failure, additionalProperties={}).
at io.fabric8.kubernetes.client.dsl.base.OperationSupport.requestFailure(OperationSupport.java:478)
at io.fabric8.kubernetes.client.dsl.base.OperationSupport.assertResponseCode(OperationSupport.java:417)
at io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleResponse(OperationSupport.java:381)
at io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleResponse(OperationSupport.java:344)
at io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleResponse(OperationSupport.java:328)
at io.fabric8.kubernetes.client.dsl.base.BaseOperation.listRequestHelper(BaseOperation.java:193)
at io.fabric8.kubernetes.client.dsl.base.BaseOperation.list(BaseOperation.java:618)
at io.fabric8.kubernetes.client.dsl.base.BaseOperation.list(BaseOperation.java:68)
at org.csanchez.jenkins.plugins.kubernetes.KubernetesCloud.addProvisionedSlave(KubernetesCloud.java:505)
at org.csanchez.jenkins.plugins.kubernetes.KubernetesCloud.provision(KubernetesCloud.java:458)
at hudson.slaves.NodeProvisioner$StandardStrategyImpl.apply(NodeProvisioner.java:715)
at hudson.slaves.NodeProvisioner.update(NodeProvisioner.java:320)
at hudson.slaves.NodeProvisioner.access$000(NodeProvisioner.java:61)
at hudson.slaves.NodeProvisioner$NodeProvisionerInvoker.doRun(NodeProvisioner.java:809)
at hudson.triggers.SafeTimerTask.run(SafeTimerTask.java:72)
at jenkins.security.ImpersonatingScheduledExecutorService$1.run(ImpersonatingScheduledExecutorService.java:58)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)

 

 

Restarting the Jenkins master makes everything work again, but after 24 hours it fails again. The following appears in the OpenShift log:

Feb 14 07:23:45 b22 atomic-openshift-master-api: E0214 07:23:45.537998 2177 authentication.go:64] Unable to authenticate the request due to an error: [invalid bearer token, [invalid bearer token, oauthaccesstokens.oauth.openshift.io "7BM2kmQ6wu8GZx9vOFVupO8W-a5Wc9Unf2ltJtogg2c" not found]]

 

It seems that the new version of the plugin uses a single client with a single token, which by default expires after 24 hours (OpenShift config: accessTokenMaxAgeSeconds: 86400). When Jenkins is restarted, a new client with a new token is created; it works for another 24 hours and then the token expires again.

 

Increasing accessTokenMaxAgeSeconds on the OpenShift side is possible, but it is not a good workaround: Jenkins still has to be restarted whenever the token expires.

This message was sent by Atlassian Jira (v7.11.2#711002-sha1:fdc329d)

jenkins-ci@carlossanchez.eu (JIRA)

Feb 14, 2019, 9:09:01 AM
to jenkinsc...@googlegroups.com
Carlos Sanchez started work on Bug JENKINS-56140
 
Change By: Carlos Sanchez
Status: Open -> In Progress

filipbrychta@gmail.com (JIRA)

Feb 14, 2019, 12:12:01 PM
to jenkinsc...@googlegroups.com
Filip Brychta commented on Bug JENKINS-56140
 
Re: Failed to count the # of live instances on Kubernetes because of expired bearer token

Thanks a lot for the quick response. IIUC, PR#429 forces cached clients to be flushed after 24 hours by default, so that newly created clients get a new token. This looks like a good solution to me.
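
To illustrate the idea of flushing cached clients by age, here is a minimal sketch. It is not the plugin's actual code from the PR: the class name ExpiringClientCache, the factory function, and the injectable clock are all hypothetical, and a plain String stands in for a real Kubernetes client. The cache rebuilds a client once the cached one has reached the configured maximum age, which would give the replacement a chance to obtain a fresh token.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.Function;
import java.util.function.LongSupplier;

/** Hypothetical cache that replaces a client once it has reached maxAgeMillis. */
class ExpiringClientCache<C> {

    /** A cached client together with the time it was created. */
    private static final class Entry<C> {
        final C client;
        final long createdAt;

        Entry(C client, long createdAt) {
            this.client = client;
            this.createdAt = createdAt;
        }
    }

    private final Map<String, Entry<C>> entries = new ConcurrentHashMap<>();
    private final Function<String, C> factory; // builds a new client (assumed to fetch a new token)
    private final long maxAgeMillis;
    private final LongSupplier clock;          // injectable so expiry can be exercised without waiting

    ExpiringClientCache(Function<String, C> factory, long maxAgeMillis, LongSupplier clock) {
        this.factory = factory;
        this.maxAgeMillis = maxAgeMillis;
        this.clock = clock;
    }

    /** Returns the cached client for cloudName, rebuilding it if it is missing or too old. */
    C get(String cloudName) {
        final long now = clock.getAsLong();
        Entry<C> entry = entries.compute(cloudName, (name, existing) -> {
            if (existing == null || now - existing.createdAt >= maxAgeMillis) {
                return new Entry<C>(factory.apply(name), now);
            }
            return existing;
        });
        return entry.client;
    }

    public static void main(String[] args) {
        AtomicLong fakeNow = new AtomicLong(0);
        AtomicInteger built = new AtomicInteger();
        ExpiringClientCache<String> cache = new ExpiringClientCache<>(
                name -> name + "-client-" + built.incrementAndGet(),
                TimeUnit.HOURS.toMillis(24),
                fakeNow::get);

        System.out.println(cache.get("openshift")); // openshift-client-1
        fakeNow.set(TimeUnit.HOURS.toMillis(23));
        System.out.println(cache.get("openshift")); // openshift-client-1 (still cached)
        fakeNow.set(TimeUnit.HOURS.toMillis(24));
        System.out.println(cache.get("openshift")); // openshift-client-2 (rebuilt)
    }
}
```

In a setup like the one reported here, the maximum age would presumably need to be shorter than accessTokenMaxAgeSeconds; if both are exactly 24 hours, a request could still be sent with a token that has just expired.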

Thank you
