Re: [kubernetes/kubernetes] Nodes with DiskPressure=True while using low disk space/inodes (#52336)


Dawn Chen

Oct 10, 2017, 7:50:22 PM
to kubernetes/kubernetes, k8s-mirror-storage-bugs, Team mention

This is a serious regression in 1.7 for disk management. We should first patch both 1.7 and 1.8 to disable the feature through the LocalStorageCapacityIsolation feature gate.

cc/ @kubernetes/sig-storage-bugs



Kubernetes Submit Queue

Oct 10, 2017, 7:52:06 PM

[MILESTONENOTIFIER] Milestone Labels Incomplete

@dashpole @felipejfc @jingxu97

Action required: This issue requires label changes. If the required changes are not made within 2 days, the issue will be moved out of the v1.8 milestone.

priority: Must specify exactly one of priority/critical-urgent, priority/important-longterm or priority/important-soon.


Kubernetes Submit Queue

Oct 10, 2017, 7:53:38 PM

[MILESTONENOTIFIER] Milestone Issue Needs Approval

@dashpole @felipejfc @jingxu97 @kubernetes/sig-node-bugs @kubernetes/sig-storage-bugs

Action required: This issue must have the status/approved-for-milestone label applied by a SIG maintainer. If the label is not applied within 6 days, the issue will be moved out of the v1.8 milestone.

Issue Labels
  • sig/node sig/storage: Issue will be escalated to these SIGs if needed.
  • priority/important-soon: Escalate to the issue owners and SIG owner; move out of milestone after several unsuccessful escalation attempts.
  • kind/bug: Fixes a bug discovered during the current release.

Kubernetes Submit Queue

Oct 10, 2017, 7:53:46 PM

[MILESTONENOTIFIER] Milestone Issue Current

@dashpole @felipejfc @jingxu97

Dawn Chen

Oct 10, 2017, 7:54:15 PM

@dashpole Can we reproduce the issue with Overlay?

David Ashpole

Oct 10, 2017, 7:58:08 PM

I am currently testing with aufs, since that was the first alternate image I could find. I can test with overlay soon.

David Ashpole

Oct 10, 2017, 8:07:18 PM

It appears to work correctly on aufs on ubuntu xenial:
imageFsInfo.CapacityBytes: 20.7 GB, imageFsInfo.AvailableBytes: 1.91 GB
rootFsInfo.CapacityBytes: 20.7 GB, rootFsInfo.AvailableBytes: 1.91 GB
Pod: container-disk-hog-pod
--- summary Container: container-disk-hog-container UsedBytes: 16.9 GB
Pod: innocent-pod
--- summary Container: innocent-container UsedBytes: 12 KB
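
One way to read the numbers above is as a consistency check: the per-container UsedBytes values should add up to no more than the space actually used on the filesystem (capacity minus available). The sketch below is an illustration only. The helper name is hypothetical, it assumes the report uses decimal gigabytes, and it is not part of the kubelet or cAdvisor.

```python
GB = 10**9  # assumption: the report uses decimal units
KB = 10**3


def container_usage_is_plausible(capacity, available, container_used):
    """Return True if summed container usage fits within the filesystem's used space."""
    fs_used = capacity - available
    return sum(container_used.values()) <= fs_used


# Values copied from the aufs report above.
stats = {
    "container-disk-hog-container": 16.9 * GB,
    "innocent-container": 12 * KB,
}
print(container_usage_is_plausible(20.7 * GB, 1.91 * GB, stats))  # True
```

A result of False would suggest that usage is being double counted or charged to the wrong container.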

David Ashpole

Oct 10, 2017, 8:56:26 PM

Looks like overlay on COS 59 (docker 1.11) is fine as well:
imageFsInfo.CapacityBytes: 16.7 GB, imageFsInfo.AvailableBytes: 1.62 GB
rootFsInfo.CapacityBytes: 16.7 GB, rootFsInfo.AvailableBytes: 1.62 GB
Pod: container-disk-hog-pod
--- summary Container: container-disk-hog-container UsedBytes: 13.8 GB
Pod: innocent-pod
--- summary Container: innocent-container UsedBytes: 49 KB

David Ashpole

Oct 10, 2017, 8:59:10 PM

My best guess is that this is an issue with docker 1.13, or an issue with overlay2. I'll do some more testing tomorrow.
@felipejfc @thomas-riccardi can you share your docker storage driver and docker version?

Thomas Riccardi

Oct 11, 2017, 4:15:11 AM

Another takeaway from this bug is that it is confusing to have some eviction signals and thresholds be in the format:
Signal < eviction threshold
and have the allocatable signals be
Signal - eviction threshold < 0

This is indeed confusing, and inconsistent with the kubelet CLI parameters that control eviction, as explained in a previous comment.
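
To make the two shapes concrete, here is a small sketch. For identical inputs the two comparisons are algebraically the same; the confusion comes from having to know which form a given signal uses. The function names and numbers are hypothetical, and this is not kubelet code.

```python
def crosses_plain_threshold(signal: int, threshold: int) -> bool:
    """Form used by most eviction signals: signal < eviction threshold."""
    return signal < threshold


def crosses_allocatable_threshold(signal: int, threshold: int) -> bool:
    """Form described for allocatable signals: signal - eviction threshold < 0."""
    return signal - threshold < 0


# Hypothetical values in bytes: 1 GiB available, 2 GiB threshold.
available = 1 * 1024**3
threshold = 2 * 1024**3
print(crosses_plain_threshold(available, threshold))        # True
print(crosses_allocatable_threshold(available, threshold))  # True
```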

As for my docker version and info (I did not change anything from the default there):

$ docker version
Client:
 Version:      1.12.6
 API version:  1.24
 Go version:   go1.7.6
 Git commit:   a82d35e
 Built:        Wed Sep 20 22:27:13 2017
 OS/Arch:      linux/amd64

Server:
 Version:      1.12.6
 API version:  1.24
 Go version:   go1.7.6
 Git commit:   a82d35e
 Built:        Wed Sep 20 22:27:13 2017
 OS/Arch:      linux/amd64
$ docker info
Containers: 51
 Running: 37
 Paused: 0
 Stopped: 14
Images: 16
Server Version: 1.12.6
Storage Driver: overlay
 Backing Filesystem: extfs
Logging Driver: json-file
Cgroup Driver: cgroupfs
Plugins:
 Volume: local
 Network: null bridge host overlay
Swarm: inactive
Runtimes: runc
Default Runtime: runc
Security Options: seccomp selinux
Kernel Version: 4.12.14-coreos
Operating System: Container Linux by CoreOS 1465.8.0 (Ladybug)
OSType: linux
Architecture: x86_64
CPUs: 8
Total Memory: 29.46 GiB
Name: gpu-europe-west-1-v177-minion-1
ID: 4JER:JG7I:KIYV:D36C:HCK3:QVCD:OPST:WX3H:EVGV:VB5F:BDIO:KDKL
Docker Root Dir: /var/lib/docker
Debug Mode (client): false
Debug Mode (server): false
Registry: https://index.docker.io/v1/
Insecure Registries:
 127.0.0.0/8

Jing Xu

Oct 11, 2017, 11:42:58 AM

@dashpole I think we can check the feature gate here to skip the allocatable feature for local storage:
https://github.com/kubernetes/kubernetes/blob/master/pkg/kubelet/cm/container_manager_linux.go#L597
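
The idea of guarding the local-storage part of allocatable behind the feature gate can be sketched as follows. This is not the code in container_manager_linux.go; the function name, the dictionary layout, and the "storage" resource key are assumptions made for illustration.

```python
def allocatable_resources(capacity: dict, reserved: dict, feature_gates: dict) -> dict:
    """Compute capacity minus reserved, skipping local storage when the gate is off."""
    gate_on = feature_gates.get("LocalStorageCapacityIsolation", False)
    result = {}
    for name, total in capacity.items():
        if name == "storage" and not gate_on:
            continue  # gate disabled: do not enforce allocatable for local storage
        result[name] = total - reserved.get(name, 0)
    return result


# Hypothetical node values in bytes.
capacity = {"memory": 32 * 1024**3, "storage": 20 * 1024**3}
reserved = {"memory": 1 * 1024**3, "storage": 2 * 1024**3}
gates_off = {"LocalStorageCapacityIsolation": False}
print(sorted(allocatable_resources(capacity, reserved, gates_off)))  # ['memory']
```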

Dawn Chen

Oct 11, 2017, 1:55:59 PM

Re: #52336 (comment)

That is my guess as well. We had an issue accounting for overlay2's disk usage in cAdvisor. But that doesn't explain the overlay issue on the Container Linux (CoreOS) image reported by @thomas-riccardi above.

David Ashpole

Oct 11, 2017, 2:27:18 PM

@thomas-riccardi strange, I cannot reproduce this on our testing coreos image:
coreos-alpha-1122-0-0-v20160727

David Ashpole

Oct 11, 2017, 3:49:40 PM

I believe this is an issue with stats collection on overlay2

David Ashpole

Oct 11, 2017, 4:29:37 PM

I have tested a fix, and can confirm that it fixes issues found in my testing. However, the cause of @thomas-riccardi's issues is still unknown.

Jing Xu

Oct 16, 2017, 8:04:58 PM

@davidopp, which of @thomas-riccardi's issues are you referring to in your last comment? Thanks!

Derek Carr

Oct 18, 2017, 11:40:23 AM

Thomas Riccardi

Oct 18, 2017, 11:53:02 AM

Maybe reopen for other impacted storage drivers?
See google/cadvisor#1770 (comment)

Thomas Riccardi

Oct 23, 2017, 8:49:28 AM

@dashpole @derekwaynecarr ping: shouldn't we reopen this issue for other impacted storage drivers? cf google/cadvisor#1770 (comment)

David Ashpole

Oct 23, 2017, 2:27:09 PM

@thomas-riccardi I need to take a closer look at overlay, but at first glance it looks like both overlay and aufs report correct disk stats. See my earlier comment

William Chang

Dec 14, 2017, 1:03:32 AM

👍

KeithTt

Dec 20, 2017, 4:05:35 AM

same problem...

The node was low on resource: nodefs
The node was low on resource: [DiskPressure].

kube version: 1.8.5
docker version: 17.06.2-ce

David Ashpole

Dec 20, 2017, 1:09:13 PM

@KeithTt can you open a new issue? This particular issue was fixed in 1.8.5, but you may have run into something different.

Benjamin Zaitlen

Jan 2, 2018, 1:17:24 PM

Will this fix also be patched into the 1.7.X branch?

KeithTt

Jan 2, 2018, 1:37:22 PM

@dashpole I am sorry I missed your message. I have opened a new issue and cc'd you.

David Ashpole

Jan 2, 2018, 6:37:46 PM

@quasiben this was fixed by #54958 in 1.7.11

Kavya Bhadre Gowda

Aug 11, 2021, 11:34:06 AM

Is there any fix for this issue? I am still facing the same issue with Kubernetes v1.20.4.
"Kubectl is evicting pods throwing failed to release ephemeral-storage and node is under disk pressure."

