Image garbage collection issues

Andrew Smith

Oct 6, 2016, 9:56:06 AM
to kubernet...@googlegroups.com
Hi

We're running Kubernetes 1.2 and are finding cases where nodes fill up with images. When I check the logs I can see ImageManager attempting to remove images (and sometimes failing), while the disk slowly but surely fills up completely.

In today's particular instance I can see there are some images that are weeks old and not in use by any containers. However, when I check the ImageManager logs, I see stuff like:

Oct 06 11:22:32 featuretest-12 kubelet[4206]: I1006 11:22:32.436162    4206 image_manager.go:282] [ImageManager]: Removing image "5b0bde439b3f53bee6e341cd07caba7b3db7eb7863e871294ee0fa8b43c11e63" to free 276839339 bytes
Oct 06 11:22:34 featuretest-12 kubelet[4206]: E1006 11:22:34.386336    4206 kubelet.go:956] Image garbage collection failed: API error (409): Conflict, cannot delete 5b0bde439b3f because the running container e79a884a0ba1 is using it, stop it and use -f to force

Which seems pretty odd - there are plenty of images there that could be deleted and aren't being used - but it's picked one that is being used.
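
For reference, a rough way to cross-check which images are actually in use on the node (assuming the Docker CLI) - anything in the second list but not the first should be a safe GC candidate:

# image IDs referenced by running containers
docker ps -q | xargs docker inspect --format '{{.Image}}'
# all image IDs present on the node
docker images -q --no-trunc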

After doing this dance for a bit we hit 100% usage and the box falls over. Our threshold is 80%.

Has anyone experienced anything similar? I know we are on an older version and intend to upgrade, but I've failed to find any relevant bugs on GitHub.

Thanks
--
Andy Smith

Andrew Smith

Oct 6, 2016, 9:57:20 AM
to kubernet...@googlegroups.com
Oops, actually our thresholds are:

--image-gc-high-threshold=80
--image-gc-low-threshold=70
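
For reference, those are plain kubelet command-line flags; an illustrative invocation (other flags elided) looks roughly like:

kubelet \
  --image-gc-high-threshold=80 \
  --image-gc-low-threshold=70 \
  ...

i.e. image GC should start once disk usage goes above 80% and keep deleting images until usage drops back below 70%.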

Rodrigo Campos

Oct 6, 2016, 10:29:07 AM
to kubernet...@googlegroups.com
Are you sure it's not some container (like Grafana) not using an attached volume and filling up the node's disk?
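
A quick way to check, assuming the Docker CLI on the node and the default data directory, is to look at per-container writable-layer sizes and at where the space under Docker's data root is going:

# the SIZE column shows each container's writable layer
docker ps -s
# where the space is going under Docker's default data root
sudo du -sh /var/lib/docker/*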

If you are using a cluster on AWS created with kube-up, you should be aware of this. It may affect some other providers too.

If it's this, newer versions of Kubernetes handle it correctly: the kubelet will also kill pods based on disk usage. But the proper fix is to not have pods doing that in the first place (either write to an external persistent volume, don't run that workload, or something along those lines - see the sketch below).
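
As a minimal sketch of what I mean by writing to a volume instead of the container's own filesystem (all names here are made up):

apiVersion: v1
kind: Pod
metadata:
  name: grafana                     # hypothetical pod name
spec:
  containers:
  - name: grafana
    image: grafana/grafana          # hypothetical image
    volumeMounts:
    - name: data
      mountPath: /var/lib/grafana   # the app writes here instead of the container layer / node disk
  volumes:
  - name: data
    persistentVolumeClaim:
      claimName: grafana-data       # hypothetical pre-created PVC backed by external storage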

I have a 1.2 cluster too and haven't seen images fail to be deleted. But it might have been luck :-)



Thanks,
Rodrigo

Andrew Smith

Oct 6, 2016, 10:31:48 AM
to kubernet...@googlegroups.com
I don't think so. Even if it were a container using a lot of the space, I'm mystified as to why three-week-old images that are not in use would not be garbage collected.

We set up Kubernetes ourselves on bare metal.

Yu-Ju Hong

Oct 6, 2016, 1:01:18 PM
to kubernet...@googlegroups.com
Do your images have multiple tags? There was a bug where deleting images by ID would fail if the image has multiple tags (issue: https://github.com/kubernetes/kubernetes/issues/28491, fix: https://github.com/kubernetes/kubernetes/pull/29316). The fix was cherry-picked to Kubernetes 1.3.3+. If that was the case, you should be able to find error messages (e.g., "cannot delete image <id> because it is tagged in multiple repositories") in the docker logs. 
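
You can reproduce the underlying Docker behaviour outside of the kubelet with something like this (tag names made up):

docker tag myapp:1.0 myapp:latest   # two tags now point at the same image ID
docker rmi <image-id>               # refused: the ID is referenced in multiple repositories (would need -f)
docker rmi myapp:latest             # removing by name only untags; deleting the last tag removes the image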

Andrew Smith

Oct 8, 2016, 7:05:32 AM
to kubernet...@googlegroups.com
We tag with a :x.x version and :latest, but I don't see the multiple-tags error - and I've definitely seen some of the :x.x/:latest images get garbage collected.
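
To double-check, I can list every repo:tag that points at one of the affected image IDs on the node with:

docker images --no-trunc | grep <image-id>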