jenkins & service account failure

jer...@saasindustries.com

Nov 22, 2016, 12:12:20 AM
to fabric8
We had just set up a new deployment, and after deploying some apps successfully, Jenkins was suddenly unable to spin up worker pods. It kept reporting that the agents were offline.

I dumped the Jenkins logs, and many of the entries near the end pointed to a service account issue:

Nov 22, 2016 3:44:02 AM org.csanchez.jenkins.plugins.kubernetes.KubernetesCloud$ProvisioningCallback call
SEVERE: Error in provisioning; slave=KubernetesSlave name: kubernetes-45a3293ad7d84770a6ace9aa571584ab-7acb7b6b299, template=org.csanchez.jenkins.plugins.kubernetes.PodTemplate@b1d042f
io.fabric8.kubernetes.client.KubernetesClientException: Failure executing: POST at: https://kubernetes.default/api/v1/namespaces/default/pods. Message: Forbidden!Configured service account doesn't have access. Service account may have been revoked..
    at io.fabric8.kubernetes.client.dsl.base.OperationSupport.requestFailure(OperationSupport.java:310)
    at io.fabric8.kubernetes.client.dsl.base.OperationSupport.assertResponseCode(OperationSupport.java:261)
    at io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleResponse(OperationSupport.java:232)
    at io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleCreate(OperationSupport.java:207)
    at io.fabric8.kubernetes.client.dsl.base.BaseOperation.handleCreate(BaseOperation.java:547)
    at io.fabric8.kubernetes.client.dsl.base.BaseOperation.create(BaseOperation.java:243)
    at org.csanchez.jenkins.plugins.kubernetes.KubernetesCloud$ProvisioningCallback.call(KubernetesCloud.java:573)
    at org.csanchez.jenkins.plugins.kubernetes.KubernetesCloud$ProvisioningCallback.call(KubernetesCloud.java:553)
    at jenkins.util.ContextResettingExecutorService$2.call(ContextResettingExecutorService.java:46)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)

I checked, and the jenkins service account was still there, as were all of its secrets, which contained the certificates.

I have two namespaces, each with its own Jenkins deployment. Both were affected, and neither could initialize its agents.

I could not figure out how to recover from this, so I redid the deployment with a fresh install, which resolved the issue. I have not been able to reproduce it since.

Any idea on what may have caused this or why jenkins would be reporting a service account being revoked? Any idea how I might recover from this in the future?

James Strachan

Nov 22, 2016, 7:47:28 AM
to Jeremy Wilson, fabric8
I've not seen this before. I wonder if the service account token was regenerated or something? 

All I can think of is to bounce the master jenkins pod and see if that helps? If that fixes it, we may need a jenkins controller to bounce the master whenever the ServiceAccount gets a newly generated token or something (or we could try to detect the ServiceAccount token getting regenerated)?

Still no idea what causes the regeneration though! Sounds like something that could break any app.

Anyone else got any ideas what could cause this?
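
For what it's worth, detecting a regenerated token could be sketched roughly like this. It is only an illustration, not anything fabric8 or the kubernetes-plugin ships: it fingerprints the token file at the standard in-pod mount path and compares fingerprints taken at two points in time. The function names are made up for the example.

```python
import hashlib
from pathlib import Path
from typing import Optional

# Standard path where Kubernetes mounts the service account token in a pod.
TOKEN_FILE = Path("/var/run/secrets/kubernetes.io/serviceaccount/token")


def token_fingerprint(path: Path = TOKEN_FILE) -> Optional[str]:
    """Return a SHA-256 hex digest of the token file, or None if unreadable."""
    try:
        return hashlib.sha256(path.read_bytes()).hexdigest()
    except OSError:
        return None


def token_changed(previous: Optional[str], current: Optional[str]) -> bool:
    """True only when both fingerprints are known and they differ."""
    return previous is not None and current is not None and previous != current


if __name__ == "__main__":
    # Hypothetical usage: record a fingerprint at startup, compare again later.
    at_startup = token_fingerprint()
    later = token_fingerprint()
    print("token regenerated:", token_changed(at_startup, later))
```

Hashing the file avoids keeping the token itself in memory or logs; a controller could poll this and restart the master when the value changes.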

--
You received this message because you are subscribed to the Google Groups "fabric8" group.
To unsubscribe from this group and stop receiving emails from it, send an email to fabric8+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.



--
James
-------
Red Hat

Twitter: @jstrachan
Email: james.s...@gmail.com
Blog: https://medium.com/@jstrachan/

open source microservices platform

James Strachan

Nov 22, 2016, 7:48:18 AM
to Jeremy Wilson, fabric8
Ah hang on - it's trying to create the pod in the 'default' namespace. I'm guessing that's wrong and that you've got 2 namespaces outside of 'default', right?

So it sounds like the jenkins pod has got itself confused and is not using the current namespace for new pods?
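
A quick way to check for this kind of mismatch is sketched below, assuming you can run Python inside the jenkins pod. It pulls the namespace out of the request URL in the error message and compares it with the namespace file that Kubernetes mounts into the pod. The helper names are made up for the example.

```python
import re
from pathlib import Path
from typing import Optional

# Standard path where Kubernetes mounts the pod's own namespace.
NAMESPACE_FILE = Path("/var/run/secrets/kubernetes.io/serviceaccount/namespace")


def namespace_from_error(message: str) -> Optional[str]:
    """Extract <ns> from an API URL of the form .../namespaces/<ns>/..."""
    match = re.search(r"/namespaces/([^/\s]+)/", message)
    return match.group(1) if match else None


def current_namespace(path: Path = NAMESPACE_FILE) -> Optional[str]:
    """Return the namespace this pod runs in, or None when not inside a pod."""
    try:
        return path.read_text().strip()
    except OSError:
        return None


if __name__ == "__main__":
    error = (
        "Failure executing: POST at: "
        "https://kubernetes.default/api/v1/namespaces/default/pods. "
        "Message: Forbidden!"
    )
    requested = namespace_from_error(error)
    actual = current_namespace()
    print(f"requested namespace: {requested}, pod namespace: {actual}")
    if actual is not None and requested != actual:
        print("Mismatch: pods are being requested outside the pod's own namespace.")
```

If the two values differ, the Forbidden error is probably about the target namespace rather than a revoked service account.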


jer...@saasindustries.com

Nov 22, 2016, 3:48:48 PM
to fabric8, jer...@saasindustries.com
Yeah, you're right. I did not make that connection when I saw it happening. Normally the pods are created in the namespace that hosts Jenkins, but in this instance they were being created in default, which is incorrect.

Replaying the events that may have contributed to the problem: I went into Jenkins admin and changed the max workers from 1 to 5. I wonder if the Kubernetes namespace option got messed up on that UI screen. Maybe it loaded some defaults in the UI that overwrote however the integration was set up? I am building a second test cluster to experiment in, so I will try that again and see if it causes the same issue. I bet that is it.

James Rawlings

Jan 24, 2017, 2:55:41 PM
to fabric8, jer...@saasindustries.com
You were totally right Jeremy!  

I hit the same problem as you today and found that whenever I changed a simple config via the UI, I started getting these errors too. I've proved that if you hit save on the Jenkins system config UI, it adds the 'default' namespace setting for the kubernetes-plugin, which causes this issue.

To track the problem I've raised https://issues.jenkins-ci.org/browse/JENKINS-41388
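
For anyone wanting to check whether their instance has been affected, here is a rough sketch that looks for the namespace saved on the Kubernetes cloud in the Jenkins config XML. The element names (a cloud element named after the KubernetesCloud class, with name and namespace children) are my assumption about how the plugin serializes its settings, and the sample XML is a hypothetical, trimmed-down stand-in, so check both against your own config.xml.

```python
import xml.etree.ElementTree as ET
from typing import Dict

# Assumption: the cloud is serialized as an element named after its Java class.
CLOUD_TAG = "org.csanchez.jenkins.plugins.kubernetes.KubernetesCloud"

# Hypothetical, trimmed-down sample of what the saved config might look like.
SAMPLE = """
<hudson>
  <clouds>
    <org.csanchez.jenkins.plugins.kubernetes.KubernetesCloud>
      <name>kubernetes</name>
      <namespace>default</namespace>
    </org.csanchez.jenkins.plugins.kubernetes.KubernetesCloud>
  </clouds>
</hudson>
"""


def kubernetes_cloud_namespaces(config_xml: str) -> Dict[str, str]:
    """Map each Kubernetes cloud name to its configured namespace ('' if unset)."""
    root = ET.fromstring(config_xml)
    result: Dict[str, str] = {}
    for cloud in root.iter(CLOUD_TAG):
        name = cloud.findtext("name", default="(unnamed)")
        namespace = cloud.findtext("namespace", default="")
        result[name] = namespace.strip()
    return result


if __name__ == "__main__":
    for name, namespace in kubernetes_cloud_namespaces(SAMPLE).items():
        print(f"cloud {name!r} uses namespace {namespace!r}")
```

If a cloud reports 'default' while Jenkins runs in another namespace, that matches the behaviour described in this thread.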