Kubernetes Plugin - Fails When Executing Commands on Second Container

Simon Young

Dec 10, 2018, 11:00:46 AM
to Jenkins Users
Hi,

We are trying to use the Kubernetes Plugin to run tests in a different container, but *almost* every time we try to execute a command in the second container, the build fails immediately. I say "almost" because the build always succeeds right after restarting the Jenkins master (or possibly after it has been idle for a long time).

Here's the Jenkinsfile we're using to test:

def label = "k8s-test-${UUID.randomUUID().toString()}"

podTemplate(
    cloud: 'kubernetes-test.k8s.local',
    namespace: 'mynamespace',
    label: label,
    yaml: """
apiVersion: v1
kind: Pod
spec:
  containers:
  - name: busybox
    image: busybox
    command: ['cat']
    tty: true
"""
    ) {

    node (label) {
        stage ('Get Agent Info') {
            sh "java -version"
        }
        stage ('Run tests') {
            container('busybox') {
                sh "echo foo"
                sh "netstat -tln"
                sh "df -h"
            }
        }
        stage ('Grab Logs') {
            containerLog('jnlp')
        }
    }
}

This fails at the "echo foo" step. The error reported by the pipeline is:

Task okhttp3.RealCall$AsyncCall@b5277fa rejected from java.util.concurrent.ThreadPoolExecutor@6c383ec7[Terminated, pool size = 0, active threads = 0, queued tasks = 0, completed tasks = 3]

What we know so far:

* Both containers are successfully deployed to the same pod, as expected.
* If we don't try to run the shell commands on the 'busybox' container, the build completes successfully.
* If Jenkins is restarted, the next build passes, but subsequent builds fail.
* The Agent logs contain similar exceptions whether the builds pass or fail, so it's hard to pinpoint a cause.

The ThreadPool exception may imply that *something* has run out of Executors, but it's not clear what, or how to rectify the situation.
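
For reference, the "[Terminated, pool size = 0, ...]" part of that message is how a java.util.concurrent.ThreadPoolExecutor describes itself after it has been shut down, as opposed to a pool that is merely busy. The following is a minimal standalone Java sketch, not code from the plugin or okhttp3 (the class and method names are made up for illustration), that produces the same kind of rejection by submitting a task to an executor that has already terminated:

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.TimeUnit;

// Hypothetical demo class; it only illustrates the JDK behaviour,
// not anything inside the Kubernetes plugin.
public class TerminatedPoolDemo {

    // Runs a few tasks, shuts the pool down, then tries to submit one more.
    // Returns the rejection message, or "accepted" if the task was taken.
    static String submitAfterShutdown() {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        for (int i = 0; i < 3; i++) {
            pool.execute(() -> { /* no-op task */ });
        }

        pool.shutdown();
        try {
            // Wait until the worker threads have exited.
            pool.awaitTermination(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }

        try {
            pool.execute(() -> { /* submitted too late */ });
            return "accepted";
        } catch (RejectedExecutionException e) {
            // Message has the form "Task ... rejected from ...[Terminated, ...]"
            return e.getMessage();
        }
    }

    public static void main(String[] args) {
        System.out.println(submitAfterShutdown());
    }
}
```

If that reading applies here, the message would point at something still using a client whose executor has already been closed, rather than at a queue that filled up. That is only a guess based on the message format.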

Has anyone seen anything like this? Any ideas how to get to the bottom of it? As per the plugin's README, I've created Jenkins log recorders for org.csanchez.jenkins.plugins.kubernetes and okhttp3, but there are no obvious problems reported in the logs.

Software Versions:

Jenkins: v2.150.1
Kubernetes Plugin: 1.13.7
Kubernetes: 1.10.11 (same behaviour observed on 1.8.7)

All suggestions appreciated!

Thanks,

Simon.

Carlos Sanchez

Dec 10, 2018, 11:10:27 AM
to jenkins...@googlegroups.com
You may be hitting the limit of concurrent connections to the Kubernetes API; see https://github.com/jenkinsci/kubernetes-plugin/blob/master/CHANGELOG.md#1136

--
You received this message because you are subscribed to the Google Groups "Jenkins Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to jenkinsci-use...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/jenkinsci-users/141b526e-f00b-434d-9f43-ec97fdb23df7%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Simon Young

Dec 11, 2018, 3:49:14 AM
to Jenkins Users
Hi Carlos,

Thanks for the suggestion. I have tried changing the "Max connections to Kubernetes API" and this does have an effect, but not in a way I'd expect... I've found that if I change that value in any way at all (either increase or decrease), then the next build always passes, and all subsequent builds fail.

So for example, I tried raising the value to 200 via the Jenkins UI. I hit 'Apply' and re-ran the job and it passed, but it failed the next time it ran, so I raised the value to 400. Again, the next job passed, but any after that failed. At this point, I suspected that changing the value in any way might have this effect, so I reduced the value back to 32. Same thing happened - pass, then subsequent fails. I reduced the value all the way down to 1, and observed the same symptoms.

I tried changing other plugin configuration values such as "Connection Timeout" and "Container Cap", but this did not have the same effect.

It's also worth noting that if I don't run the job for about an hour, the next build passes and subsequent builds fail. The same happens if I restart Jenkins (but not if I just reload the config from disk).

So it seems like some internal counter on the Jenkins master is reaching a limit, and the counter is reset when the config changes or after a period of inactivity. Neither master nor slave appears to be running out of physical resources, and this is the only job we're running on the master.

Have you any idea what could be going on here?

Simon.

Carlos Sanchez

Dec 11, 2018, 10:36:58 AM
to jenkins...@googlegroups.com
I would roll back to 1.13.5 for now, then.

Simon Young

Dec 14, 2018, 3:30:56 AM
to Jenkins Users
Hi Carlos,

Rolling back to 1.13.5 seems to have fixed the issue. I can now run multiple consecutive builds and they all pass.

I'm a bit concerned that we're unable to upgrade to the latest version, though. Do you have enough information to identify the problem?

Thanks,

Simon.

Carlos Sanchez

Dec 14, 2018, 6:39:59 AM
to jenkins...@googlegroups.com
This should be fixed in 1.13.8

Simon Young

Dec 14, 2018, 8:08:55 AM
to Jenkins Users
Unfortunately it's not fixed in 1.13.8. We now see this:

Connection was rejected, you should increase the Max connections to Kubernetes API

I tried increasing Max connections to 400. The next build passed, but subsequent builds are failing, just like we saw with 1.13.7.

I have to roll back to 1.13.5.

Sardar Junaid Mukhtar

Sep 9, 2019, 4:43:11 AM
to Jenkins Users
Hi Carlos

I have upgraded the plugin to the latest version, 1.18.3, and we are still hitting the same issue, but with a slightly different error message.

Any chance that you can look into this?

Thanks,
Junaid