flannel:v0.6.2 + kubewrapper eats up disk space on k8 controller


Andrew Webber

Jan 17, 2017, 1:16:38 PM1/17/17
to CoreOS User
When running a Kubernetes controller set up similarly to the coreos-baremetal Ignition example (link below), the root file system runs out of space over time.

Flannel config - {"Network":"11.2.0.0/16","Backend":{"Type":"udp"}}
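For context, flannel reads this network config from etcd. A minimal sketch of how it gets set, assuming the default /coreos.com/network/config key and the etcd v2 etcdctl API:

# store flannel's network config in etcd (default key prefix assumed)
etcdctl set /coreos.com/network/config '{"Network":"11.2.0.0/16","Backend":{"Type":"udp"}}'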

Once I restart flanneld and run rkt gc, gigabytes of space are reclaimed.
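For anyone hitting the same symptom, this is roughly the cleanup that reclaims the space (a sketch; it assumes flanneld runs as a flanneld.service systemd unit, as in the coreos-baremetal examples):

# restart flanneld so its rkt pod exits, then collect exited pods immediately
sudo systemctl restart flanneld.service
sudo rkt gc --grace-period=0s
# confirm what rkt still has on disk
rkt list --full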

Initial state:
Filesystem       Size  Used Avail Use% Mounted on
devtmpfs         3.9G     0  3.9G   0% /dev
tmpfs            3.9G     0  3.9G   0% /dev/shm
tmpfs            3.9G  1.1M  3.9G   1% /run
tmpfs            3.9G     0  3.9G   0% /sys/fs/cgroup
/dev/sda9        901G  377G  487G  44% /
/dev/mapper/usr  985M  625M  309M  67% /usr
/dev/sda1        128M   72M   57M  56% /boot
tmpfs            3.9G     0  3.9G   0% /media
tmpfs            3.9G     0  3.9G   0% /tmp
/dev/sda6        108M   52K   99M   1% /usr/share/oem
overlay          901G  377G  487G  44% /var/lib/rkt/pods/run/1809dc07-e44e-4b15-beb8-e2f97ff0ebc2/stage1/rootfs
overlay          901G  377G  487G  44% /var/lib/rkt/pods/run/1809dc07-e44e-4b15-beb8-e2f97ff0ebc2/stage1/rootfs/opt/stage2/flannel/rootfs
tmpfs            3.9G     0  3.9G   0% /var/lib/rkt/pods/run/1809dc07-e44e-4b15-beb8-e2f97ff0ebc2/stage1/rootfs/opt/stage2/flannel/rootfs/tmp
overlay          901G  377G  487G  44% /var/lib/rkt/pods/exited-garbage/678c3c72-05ef-4fbf-bfbc-f4e7b0185675/stage1/rootfs
overlay          901G  377G  487G  44% /var/lib/rkt/pods/exited-garbage/678c3c72-05ef-4fbf-bfbc-f4e7b0185675/stage1/rootfs/opt/stage2/flannel/rootfs
tmpfs            3.9G     0  3.9G   0% /var/lib/rkt/pods/exited-garbage/678c3c72-05ef-4fbf-bfbc-f4e7b0185675/stage1/rootfs/opt/stage2/flannel/rootfs/tmp
overlay          901G  377G  487G  44% /var/lib/rkt/pods/run/38590297-9a97-4f8c-aa91-b320a23298b2/stage1/rootfs
overlay          901G  377G  487G  44% /var/lib/rkt/pods/run/38590297-9a97-4f8c-aa91-b320a23298b2/stage1/rootfs/opt/stage2/hyperkube/rootfs
tmpfs            3.9G     0  3.9G   0% /var/lib/rkt/pods/run/38590297-9a97-4f8c-aa91-b320a23298b2/stage1/rootfs/opt/stage2/hyperkube/rootfs/tmp
overlay          901G  377G  487G  44% /var/lib/docker/overlay/760b5c301571769239637d50c3fe06c56e6e2c6b7629d6c8ab03206566ce8357/merged
overlay          901G  377G  487G  44% /var/lib/docker/overlay/9f273054346dd874157dad48e5b3c8146c757a2ed35426db2f617e799360c11c/merged
overlay          901G  377G  487G  44% /var/lib/docker/overlay/844964f511a0de1d834d214669dcfb1c60e97b45b448e1b39e3ed981dcaf9886/merged
overlay          901G  377G  487G  44% /var/lib/docker/overlay/7243d1964cd2c4c20c9282edae2d36d26a1a277bc97cd44534f388d6fe61710b/merged
shm               64M     0   64M   0% /var/lib/docker/containers/9402e2432e8e98629559ca7caea4f93d460efebd54fecb424d6b55ac211f3a69/shm
shm               64M     0   64M   0% /var/lib/docker/containers/fde9b5b8fe4511254c3211fdfc48993fda724459c1310d474338415c01408a05/shm
shm               64M     0   64M   0% /var/lib/docker/containers/acfbc1e928fb2517dfc3c4eec7af4979e76ba4157595e2622552fc70a2a25142/shm
shm               64M     0   64M   0% /var/lib/docker/containers/03cad4faeddc5c3624fd9188c55a497033f23bf6bf03df53145c0334078c2844/shm
overlay          901G  377G  487G  44% /var/lib/docker/overlay/1621e7931c4cbaebe0025557fa1a9bb309f290fe01971e4f47b7f0b0252966e6/merged
tmpfs            3.9G   12K  3.9G   1% /var/lib/kubelet/pods/bca96648-dc9d-11e6-bfad-001e4f520ea4/volumes/kubernetes.io~secret/default-token-d9ws1
tmpfs            3.9G   12K  3.9G   1% /var/lib/kubelet/pods/bd7fca73-dc9d-11e6-bfad-001e4f520ea4/volumes/kubernetes.io~secret/default-token-mrlsv
tmpfs            3.9G   12K  3.9G   1% /var/lib/kubelet/pods/bce3d820-dc9d-11e6-bfad-001e4f520ea4/volumes/kubernetes.io~secret/default-token-d9ws1
overlay          901G  377G  487G  44% /var/lib/docker/overlay/00b1bf333a87dce77bffc4eca61db8b6cb360da4fd15102debc37b6d99f43c5b/merged
overlay          901G  377G  487G  44% /var/lib/docker/overlay/32f32ad49d53ac81f8d4468299f38252c57b1d700b544d6647cbfd349abed444/merged
overlay          901G  377G  487G  44% /var/lib/docker/overlay/ec9b1b5380eb857ac4b04758551c7ae6fa18d18f9a1c4dabd5b90f4e82d55822/merged
shm               64M     0   64M   0% /var/lib/docker/containers/ca70160e02c643438f6b106630b12a26efeb151903c1ca4e3838729f22ea7b0b/shm
shm               64M     0   64M   0% /var/lib/docker/containers/99774ac304849d2eac5e65b48c3bb65a227c2f7314f1c2043c7ef7abb3cd6645/shm
shm               64M     0   64M   0% /var/lib/docker/containers/18c5bad3016d4c7f80885164e05d993a63c383d096817c5e248f004315f8b824/shm
overlay          901G  377G  487G  44% /var/lib/docker/overlay/12ce562d242907c966f37a01556564bef1018bd0a6f804af4aeebe9fbfeb524c/merged
overlay          901G  377G  487G  44% /var/lib/docker/overlay/07798efcb878b86f961dfd23a3d902b972146c56ce2ee09afd82acdde103af1f/merged
overlay          901G  377G  487G  44% /var/lib/docker/overlay/0f9af53f5be4964bfb431ebb0aabd287ff51bad5dd945747d9fc73d14fd3444a/merged
overlay          901G  377G  487G  44% /var/lib/docker/overlay/d0ba21203b24f107fbe020959b5e5073371b46f0590c13442435894c1b2579f8/merged
overlay          901G  377G  487G  44% /var/lib/docker/overlay/a8dc87351d5f8b4ce6e498f80731e71052bc2a1c35769e96b577b23ed4b5c5df/merged
overlay          901G  377G  487G  44% /var/lib/docker/overlay/15a61a45514f106c8a5286758e19913db3f274ef988e6eb38ab8b958dd8fc0f0/merged
tmpfs            786M     0  786M   0% /run/user/0


After killing and removing some of the Kubernetes Docker containers, we are left with only the flannel and hyperkube rkt containers:
Filesystem       Size  Used Avail Use% Mounted on
devtmpfs         3.9G     0  3.9G   0% /dev
tmpfs            3.9G     0  3.9G   0% /dev/shm
tmpfs            3.9G 1008K  3.9G   1% /run
tmpfs            3.9G     0  3.9G   0% /sys/fs/cgroup
/dev/sda9        901G  391G  473G  46% /
/dev/mapper/usr  985M  625M  309M  67% /usr
/dev/sda1        128M   72M   57M  56% /boot
tmpfs            3.9G     0  3.9G   0% /media
tmpfs            3.9G     0  3.9G   0% /tmp
/dev/sda6        108M   52K   99M   1% /usr/share/oem
overlay          901G  391G  473G  46% /var/lib/rkt/pods/run/1809dc07-e44e-4b15-beb8-e2f97ff0ebc2/stage1/rootfs
overlay          901G  391G  473G  46% /var/lib/rkt/pods/run/38590297-9a97-4f8c-aa91-b320a23298b2/stage1/rootfs

After restarting flannel and running rkt gc:
Filesystem       Size  Used Avail Use% Mounted on
devtmpfs         3.9G     0  3.9G   0% /dev
tmpfs            3.9G     0  3.9G   0% /dev/shm
tmpfs            3.9G  1.1M  3.9G   1% /run
tmpfs            3.9G     0  3.9G   0% /sys/fs/cgroup
/dev/sda9        901G   29G  835G   4% /
/dev/mapper/usr  985M  625M  309M  67% /usr
/dev/sda1        128M   72M   57M  56% /boot
tmpfs            3.9G     0  3.9G   0% /media
tmpfs            3.9G     0  3.9G   0% /tmp
/dev/sda6        108M   52K   99M   1% /usr/share/oem
overlay          901G   29G  835G   4% /var/lib/rkt/pods/run/38590297-9a97-4f8c-aa91-b320a23298b2/stage1/rootfs
overlay          901G   29G  835G   4% /var/lib/rkt/pods/run/da96664f-b7ee-44d4-8b8f-8249feb15497/stage1/rootfs

After restarting hyperkube:
Filesystem       Size  Used Avail Use% Mounted on
devtmpfs         3.9G     0  3.9G   0% /dev
tmpfs            3.9G     0  3.9G   0% /dev/shm
tmpfs            3.9G  1.1M  3.9G   1% /run
tmpfs            3.9G     0  3.9G   0% /sys/fs/cgroup
/dev/sda9        901G   35G  829G   5% /
/dev/mapper/usr  985M  625M  309M  67% /usr
/dev/sda1        128M   72M   57M  56% /boot
tmpfs            3.9G     0  3.9G   0% /media
tmpfs            3.9G     0  3.9G   0% /tmp
/dev/sda6        108M   52K   99M   1% /usr/share/oem
overlay          901G   35G  829G   5% /var/lib/rkt/pods/run/da96664f-b7ee-44d4-8b8f-8249feb15497/stage1/rootfs

Hyperkube and flannel have now been restarted; unfortunately, disk space is still getting eaten up (a du sketch for tracking this down follows the listing below):
Filesystem       Size  Used Avail Use% Mounted on
devtmpfs         3.9G     0  3.9G   0% /dev
tmpfs            3.9G     0  3.9G   0% /dev/shm
tmpfs            3.9G  1.1M  3.9G   1% /run
tmpfs            3.9G     0  3.9G   0% /sys/fs/cgroup
/dev/sda9        901G   36G  828G   5% /
/dev/mapper/usr  985M  625M  309M  67% /usr
/dev/sda1        128M   72M   57M  56% /boot
tmpfs            3.9G     0  3.9G   0% /media
tmpfs            3.9G     0  3.9G   0% /tmp
/dev/sda6        108M   52K   99M   1% /usr/share/oem
overlay          901G   36G  828G   5% /var/lib/rkt/pods/run/da96664f-b7ee-44d4-8b8f-8249feb15497/stage1/rootfs
overlay          901G   36G  828G   5% /var/lib/rkt/pods/run/dc3dce80-fc51-4840-86f4-44c58ef4b0c0/stage1/rootfs
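To see where the space is actually going while it grows, a quick check like this helps (a sketch; these are just the usual CoreOS locations, adjust as needed):

# largest consumers under the usual suspects
sudo du -sh /var/lib/rkt /var/lib/docker /var/log 2>/dev/null
# drill into the running rkt pod overlays
sudo du -sh /var/lib/rkt/pods/run/* 2>/dev/null | sort -h | tail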


Andrew Webber

Jan 17, 2017, 4:22:07 PM1/17/17
to CoreOS User
I later found out this was an issue with etcd-operator killing my controller manager by making it create infinite log entries:

E0117 21:18:00.595128       1 garbagecollector.go:593] Error syncing item &garbagecollector.node{identity:garbagecollector.objectReference{OwnerReference:metatypes.OwnerReference{APIVersion:"coreos.com/v1", Kind:"EtcdCluster", UID:"6c7b92d5-dc11-11e6-9ff0-001e4f520ea4", Name:"etcd-client", Controller:(*bool)(0xc422af1a28)}, Namespace:"rhino-ci"}, dependentsLock:sync.RWMutex{w:sync.Mutex{state:0, sema:0x0}, writerSem:0x0, readerSem:0x0, readerCount:0, readerWait:0}, dependents:map[*garbagecollector.node]struct {}{(*garbagecollector.node)(0xc4223c9c20):struct {}{}, (*garbagecollector.node)(0xc421eeeab0):struct {}{}, (*garbagecollector.node)(0xc420bc94d0):struct {}{}, (*garbagecollector.node)(0xc422b64000):struct {}{}, (*garbagecollector.node)(0xc422b9d320):struct {}{}, (*garbagecollector.node)(0xc422a05e60):struct {}{}, (*garbagecollector.node)(0xc4213c2870):struct {}{}}, owners:[]metatypes.OwnerReference(nil)}: unable to get REST mapping for kind: EtcdCluster, version: coreos.com/v1
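If you need to confirm which container is producing the spam and keep it bounded in the meantime, something like this works (a sketch; it assumes Docker's default json-file log driver and daemon.json support on the host):

# find the biggest container log files
sudo du -sh /var/lib/docker/containers/*/*-json.log | sort -h | tail

# cap log size via /etc/docker/daemon.json (requires a Docker restart)
{
  "log-driver": "json-file",
  "log-opts": { "max-size": "50m", "max-file": "3" }
}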

Xiang Li

Jan 17, 2017, 4:59:59 PM1/17/17
to CoreOS User
Hi Andrew,

This is a problem with the controller manager itself: it fails to understand TPRs (ThirdPartyResources) correctly and falls into a busy loop. There is an upstream issue tracking this regression: https://github.com/kubernetes/kubernetes/issues/39816.

Thanks,
Xiang

Andrew Webber

Jan 18, 2017, 5:51:02 PM1/18/17
to CoreOS User
Thanks Xiang,

We are following etcd-operator and got stung by this.
Thank you for managing this issue.

kind regards,

Andrew