[JIRA] (JENKINS-53790) Kubernetes plugin shows failing templates to only admins


ataylor@cloudbees.com (JIRA)

Sep 26, 2018, 9:23:02 AM
to jenkinsc...@googlegroups.com
Alex Taylor created an issue
 
Jenkins / Improvement JENKINS-53790
Kubernetes plugin shows failing templates to only admins
Issue Type: Improvement
Assignee: Carlos Sanchez
Components: kubernetes-plugin
Created: 2018-09-26 13:22
Priority: Major
Reporter: Alex Taylor

Background:
With the advent of CJE2/Core on modern platforms, we started leveraging the Kubernetes plugin to define agents using Kubernetes templates. This is a great new feature, but it allows non-admins to generate new templates, even within their pipelines. Since these non-admins do not have access to the Kubernetes back end or the logging within Jenkins, they do not see when or why one of these templates fails.

Issue:
When a non-admin user creates a badly formed k8s template, they are unable to see that the container/pod is failing, because the job is just "waiting on $LABEL".

Steps to reproduce:

1. Create a pipeline job
2. Create a template in that job with a badly defined Docker image name
3. Watch the job fail to start because it cannot find its label
4. If you are not an admin, you cannot see why the container/pod is failing to start, because you cannot access the k8s logs or the `Manage Jenkins > System Log` area of Jenkins to create a custom logger and see the cause of the failure

Resolution:
We need a way, in the job or similar, to see why the container is failing to start, perhaps just a return code from Kubernetes. Otherwise, we need to disallow defining templates at the job level so that non-admins cannot create templates at all.

This message was sent by Atlassian Jira (v7.11.2#711002-sha1:fdc329d)

ataylor@cloudbees.com (JIRA)

Sep 26, 2018, 9:51:03 AM
to jenkinsc...@googlegroups.com
Alex Taylor commented on Improvement JENKINS-53790
 
Re: Kubernetes plugin shows failing templates to only admins

Right now the only workaround is some strangeness within a pipeline: you create a pipeline that runs the template in parallel with a Groovy script (in my case I included it in a global shared library).

So here are the steps I used:
1. Create a system log called "KubernetesLog" (the name the script below looks up) with the `org.csanchez.jenkins.plugins.kubernetes` logger set to ALL
2. Create a pipeline job with a parallel statement which runs the template and, alongside it, loops through the following script until the template spins up:

import hudson.logging.LogRecorder
import hudson.logging.LogRecorderManager
import java.util.logging.LogRecord
import jenkins.model.Jenkins

// searchString is supplied by the caller and should be the name of your template
def agentName = searchString
List<LogRecord> records = new ArrayList<LogRecord>()

// Grabs the log manager
LogRecorderManager mgr = Jenkins.instance.getLog()

// Grabs the records from the "KubernetesLog" recorder
mgr.logRecorders.each {
  if (it.getValue().getName() == "KubernetesLog") {
    records = it.getValue().getLogRecords()
  }
}

// Iterates over the record messages looking for the agent name
for (LogRecord r : records) {
  // getMessage() can return null, so use the null-safe operator
  if (r.getMessage()?.contains(agentName)) {
    println(r.getMessage())
  }
}

// Clears the logger so the next pass of the loop only sees new records
mgr.logRecorders.each {
  if (it.getValue().getName() == "KubernetesLog") {
    it.getValue().doClear()
  }
}

This will show the messages relating to `searchString` which should be the name of your template.

Pierson_Yieh@intuit.com (JIRA)

Feb 7, 2019, 6:17:02 PM
to jenkinsc...@googlegroups.com

We implemented a change to the kubernetes-plugin that checks the message of containers in the Waiting state. The message will contain the string "Back-off pulling image" when Kubernetes can't locate the Docker image (e.g. a badly defined Docker image). We then grab the corresponding build job from the Jenkins Queue, print a message to the build's console output to notify users that they've specified a bad Docker image, then cancel the build. Canceling the job, rather than simply labeling it as failed, was necessary; otherwise Jenkins would continuously retry creating the Kubernetes pod with the bad Docker image and fail.
Our solution solves two problems: customers not knowing why their job is stuck in perpetual limbo due to a bad Docker image (and not knowing how, or not having permission, to view the Kubernetes logs), and jobs being stuck in that perpetual waiting state due to a malformed Docker image.
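
As a rough illustration only (this is not the code from the actual change), the check could look something like the sketch below. `ContainerState` and `BadImageCheck` are hypothetical names standing in for the container status the plugin reads from Kubernetes; the only detail taken from the description above is the "Back-off pulling image" message.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;

/** Hypothetical stand-in for one container's state as reported by Kubernetes. */
final class ContainerState {
    final String name;
    final boolean waiting;
    final String message; // may be null when no message is reported

    ContainerState(String name, boolean waiting, String message) {
        this.name = name;
        this.waiting = waiting;
        this.message = message;
    }
}

final class BadImageCheck {
    // Message fragment quoted in the description above.
    static final String BACK_OFF = "Back-off pulling image";

    /** Returns the first waiting container whose message reports an image pull back-off. */
    static Optional<ContainerState> findBadImage(List<ContainerState> states) {
        return states.stream()
                .filter(s -> s.waiting)
                .filter(s -> s.message != null && s.message.contains(BACK_OFF))
                .findFirst();
    }

    /** Builds the line a caller could print to the build's console output. */
    static String consoleMessage(ContainerState s) {
        return "Container '" + s.name + "' cannot start: " + s.message;
    }

    public static void main(String[] args) {
        List<ContainerState> states = Arrays.asList(
                new ContainerState("jnlp", false, null),
                new ContainerState("build", true, "Back-off pulling image \"no-such/image\""));
        // Per the description above, this is where the build would be notified and cancelled.
        findBadImage(states).ifPresent(s -> System.out.println(consoleMessage(s)));
    }
}
```

Matching on the message text keeps the sketch close to the description, but it does assume Kubernetes keeps using that wording.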

We are currently in the process of refining it and will submit a formal PR once that's ready. Any suggestions and comments would be appreciated.



jenkins-ci@carlossanchez.eu (JIRA)

Mar 5, 2019, 2:09:02 PM
to jenkinsc...@googlegroups.com
Carlos Sanchez updated an issue
 
Change By: Carlos Sanchez
Background:
We started leveraging the Kubernetes plugin to define agents using Kubernetes templates. This is a great new feature, but it allows non-admins to generate new templates, even within their pipelines. Since these non-admins do not have access to the Kubernetes back end or the logging within Jenkins, they do not see when or why one of these templates fails.



jglick@cloudbees.com (JIRA)

Mar 27, 2019, 1:37:03 PM
to jenkinsc...@googlegroups.com
Jesse Glick assigned an issue to Pierson Yieh
Change By: Jesse Glick
Assignee: Carlos Sanchez -> Pierson Yieh

jglick@cloudbees.com (JIRA)

Mar 27, 2019, 1:37:04 PM
to jenkinsc...@googlegroups.com
Jesse Glick started work on Improvement JENKINS-53790
 
Change By: Jesse Glick
Status: Open -> In Progress


vincent@latombe.net (JIRA)

Feb 19, 2020, 4:45:04 AM
to jenkinsc...@googlegroups.com
Change By: Vincent Latombe
Status: In Review -> Resolved
Resolution: Fixed
Released As: kubernetes 1.24.0