Docker-Plugin spawns more containers than needed


Alexis Morelle

Jan 6, 2015, 6:24:59 AM1/6/15
to jenkin...@googlegroups.com
Hello,

I've been playing around at work with the Jenkins Docker plugin, and right before making it available, I hit a strange behavior that keeps me from using it widely for now.

I have a pretty simple setup with a few images I built myself: I install only the tools I need, set up a jenkins user, add our public key, and finally start an SSH server. The plugin is configured to contact a Docker daemon hosted on a CoreOS instance.
It's very straightforward and works fine. I can even start a container from one of these images manually and create a new node out of it if a persistent slave is needed.

But from time to time (quite often, in fact), the plugin takes a while to start a container, or maybe to SSH into it; I can't really tell. The build stays in a "waiting" state "because all the slaves are offline". When that happens, a container spawns, but the build still isn't attached to it, as if it were offline. Then another container is spawned... and another one... and another one. It's generally no more than 5 or 6 for one build, but the containers are left there and never killed or removed. I can of course stop them manually from the interface or on the host, but I don't think that's expected behavior. It's not a display glitch, since the containers really are alive on the host. It also happens when the image is already present on the Docker host, so the "offline" time does not seem related to the time spent downloading the image.

It might have something to do with the tags associated with the images. It hasn't happened since I assigned a single unique tag to each image. Maybe, if multiple images can match one build (through shared tags), that is what triggers this behavior.

I've seen this behavior mentioned once in a previous thread from a while ago, but there was no further discussion about it. I'm not sure where to start looking for answers; let me know if this belongs in the Jenkins Developers group.

Has anybody experienced this as well, and maybe has some answers/explanations?

Thanks in advance for your answers.
Alexis.

Stephen Connolly

Jan 6, 2015, 6:36:08 AM1/6/15
to jenkin...@googlegroups.com
This is most likely because the plugin is probably not keeping track of its state outside of the classes instantiated by the Jenkins UI.

My considered opinion is that the Jenkins Cloud provider API is almost impossible to implement correctly. There are maybe 1 or 2 almost-correct implementations (I believe our operations-center-cloud implementation is correct; I just found out that there is an issue in our nectar-vmware implementation, and all the other implementations that I have looked at are in worse shape).

A large chunk of the issues will not show until you start to stress your cloud implementation. Then subtle deadlocks and over-/under-provisioning behaviour start to kick in, followed by leaking resources, etc.
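The over-provisioning failure mode described above can be sketched with a toy simulation (this is not Jenkins code; the cycle counts and names are made up for illustration): if the provisioner only counts online slaves and forgets the containers it has already asked for, it launches one more container per scheduling cycle until the first one finally connects, which matches the 5-or-6-containers-per-build pattern Alexis reports.

```python
# Toy simulation of a cloud provisioner (illustrative only, not Jenkins code).
# Assumption: one queued build, and a container takes STARTUP_ROUNDS scheduling
# cycles between being spawned and coming online as a slave.

STARTUP_ROUNDS = 5

def provision(track_in_flight: bool) -> int:
    """Run scheduling cycles until the build is served; return containers spawned."""
    queued_builds = 1
    in_flight = []   # ages (in cycles) of containers still starting up
    online = 0       # slaves that have connected
    spawned = 0
    for _ in range(20):  # scheduling cycles
        # Containers that finished starting become online slaves.
        in_flight = [age + 1 for age in in_flight]
        online += sum(1 for age in in_flight if age >= STARTUP_ROUNDS)
        in_flight = [age for age in in_flight if age < STARTUP_ROUNDS]
        if online >= queued_builds:
            break  # demand satisfied; the build can run
        # Capacity the provisioner "knows about" when deciding to spawn more.
        known_capacity = online + (len(in_flight) if track_in_flight else 0)
        if queued_builds > known_capacity:
            in_flight.append(0)  # spawn another container
            spawned += 1
    return spawned
```

With `track_in_flight=True` the provisioner spawns exactly one container for the one build; with `track_in_flight=False` it spawns one per cycle (here, `STARTUP_ROUNDS` of them) before the first slave ever connects, and the extras are never cleaned up. The fix Stephen alludes to is essentially making the plugin remember its pending provisioning requests between scheduling rounds.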

I am hoping that with the new pluggable NodeProvisioner strategy changes I have made post-1.580, I will be able to create a new and simpler cloud API for Jenkins that will meet people's needs much better and allow simpler implementations of clouds.

--
You received this message because you are subscribed to the Google Groups "Jenkins Developers" group.
To unsubscribe from this group and stop receiving emails from it, send an email to jenkinsci-de...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/jenkinsci-dev/b0de2b52-cf02-4c39-bb9b-03c60a48debd%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Alexis Morelle

Jan 7, 2015, 5:40:09 AM1/7/15
to jenkin...@googlegroups.com
Oh, I see. Thank you for your answer, Stephen.

Alexis.