Re: [kubernetes/kubernetes] DeletionTimeStamp not set for some evicted pods (#54525)


k8s-ci-robot

Oct 24, 2017, 8:01:21 PM
to kubernetes/kubernetes, k8s-mirror-api-machinery-misc, Team mention

@patrickshan: Reiterating the mentions to trigger a notification:
@kubernetes/sig-api-machinery-misc

In response to this:

@kubernetes/sig-api-machinery-misc

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.


You are receiving this because you are on a team that was mentioned.
Reply to this email directly, view it on GitHub, or mute the thread.

Klaus Ma

Oct 25, 2017, 1:24:57 AM
to kubernetes/kubernetes, k8s-mirror-api-machinery-misc, Team mention

/sig node

Di Xu

Oct 25, 2017, 10:38:37 AM
to kubernetes/kubernetes, k8s-mirror-api-machinery-misc, Team mention

/assign

Daniel Smith

Oct 26, 2017, 3:49:23 PM
to kubernetes/kubernetes, k8s-mirror-api-machinery-misc, Team mention

Most likely this is sig-node.

Clayton Coleman

Oct 26, 2017, 7:45:45 PM
to kubernetes/kubernetes, k8s-mirror-api-machinery-misc, Team mention

Deletion timestamp is set by the apiserver; it cannot be set by clients.

Jordan Liggitt

Oct 26, 2017, 7:47:39 PM
to kubernetes/kubernetes, k8s-mirror-api-machinery-misc, Team mention

Are they actually evicted/deleted or do they just have failed status?

Patrick Shan

Oct 26, 2017, 9:34:07 PM
to kubernetes/kubernetes, k8s-mirror-api-machinery-misc, Team mention

They are evicted and marked with "Failed" status first. For pods created through a DaemonSet, their DeletionTimeStamp gets set after the "Failed" status is set. But for pods created through a Deployment, their DeletionTimeStamp just keeps its zero value and is never set.
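The asymmetry can be checked directly on the API objects: an evicted DaemonSet pod ends up with a deletionTimestamp, while an evicted Deployment pod does not. A minimal illustration on assumed sample objects (the shapes mimic `kubectl get pods -o json` output, trimmed to the relevant fields; the pod names and timestamp are hypothetical):

```python
import json

# Hypothetical pod objects as returned by the apiserver, trimmed down.
pods_json = """
{"items": [
  {"metadata": {"name": "ds-pod", "deletionTimestamp": "2017-10-26T00:00:00Z"},
   "status": {"phase": "Failed", "reason": "Evicted"}},
  {"metadata": {"name": "deploy-pod"},
   "status": {"phase": "Failed", "reason": "Evicted"}}
]}
"""

for pod in json.loads(pods_json)["items"]:
    # An absent deletionTimestamp is the "zero value" case described above.
    ts = pod["metadata"].get("deletionTimestamp")
    print(f"{pod['metadata']['name']}: deletionTimestamp={ts or 'unset'}")
```

Both pods show phase Failed / reason Evicted, but only the DaemonSet-owned pod carries a deletion timestamp.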

Clayton Coleman

Oct 26, 2017, 10:00:06 PM
to kubernetes/kubernetes, k8s-mirror-api-machinery-misc, Team mention
That means that no one deleted them. The ReplicaSet controller is responsible for performing that deletion.

Di Xu

Oct 27, 2017, 11:12:14 AM
to kubernetes/kubernetes, k8s-mirror-api-machinery-misc, Team mention

That means that no one deleted them. The ReplicaSet controller is responsible for performing that deletion.

I reproduced this. But I found the pods were successfully evicted and deleted only from the kubelet, not the apiserver. The apiserver still kept a copy, with nil DeletionTimeStamp and ContainerStatuses. @liggitt Do you know why? Quite abnormal.

Jordan Liggitt

Oct 27, 2017, 11:35:09 AM
to kubernetes/kubernetes, k8s-mirror-api-machinery-misc, Team mention

I see the kubelet sync loop construct a pod status like what you describe if an internal module decides the pod should be evicted:

https://github.com/kubernetes/kubernetes/blob/b00c15f1a40162d46fc4b96f4e6714f20aef9e6c/pkg/kubelet/kubelet_pods.go#L1293-L1305

The kubelet then syncs status back to the API server:
https://github.com/kubernetes/kubernetes/blob/b00c15f1a40162d46fc4b96f4e6714f20aef9e6c/pkg/kubelet/status/status_manager.go#L437-L488

But unless the pod's deletion timestamp is already set, the kubelet won't delete the pod:
https://github.com/kubernetes/kubernetes/blob/b00c15f1a40162d46fc4b96f4e6714f20aef9e6c/pkg/kubelet/status/status_manager.go#L504-L509

@kubernetes/sig-node-bugs it doesn't seem like the kubelet does a complete job of evicting the pod, from the API's perspective

Yu-Ju Hong

Oct 27, 2017, 1:24:06 PM
to kubernetes/kubernetes, k8s-mirror-api-machinery-misc, Team mention

@kubernetes/sig-node-bugs it doesn't seem like the kubelet does a complete job of evicting the pod, from the API's perspective. Would you expect the kubelet to delete a pod directly in that case, or to still go through posting a pod eviction (should pod disruption budget be honored in cases where the kubelet is out of resources)?

I think this is intentional.

AFAIK, kubelet's pod eviction includes failing the pod (i.e., setting the pod status) and reclaiming the resources used by the pod on the node. There is no "deleting the pod from the apiserver" involved in the eviction. Users/controllers can check the pod status to know what happened to the pod if needed.

/cc @derekwaynecarr @dashpole

David Ashpole

Oct 27, 2017, 1:33:01 PM
to kubernetes/kubernetes, k8s-mirror-api-machinery-misc, Team mention

Yes, this is intentional. In order for evicted pods to be inspected after eviction, we do not remove the pod API object. Otherwise it would appear that the pod simply disappeared.
We do still stop and remove all containers, clean up cgroups, unmount volumes, etc., to ensure that we reclaim all resources that were in use by the pod.
I don't think we set the deletion timestamp for daemon set pods. I suspect that the daemon set controller deletes evicted pods.

fejta-bot

Jan 25, 2018, 5:33:20 PM
to kubernetes/kubernetes, k8s-mirror-api-machinery-misc, Team mention

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

Janet Kuo

Feb 22, 2018, 2:06:55 PM
to kubernetes/kubernetes, k8s-mirror-api-machinery-misc, Team mention

In order for evicted pods to be inspected after eviction, we do not remove the pod API object.

If the controller that creates the evicted pod is scaled down, it should kill those evicted pods first before killing any others, right? Most workload controllers don't do that today.

I don't think we set the deletion timestamp for daemon set pods. I suspect that the daemon set controller deletes evicted pods.

DaemonSet controller actively deletes failed pods (#40330), to ensure that DaemonSet can recover from the transient error (#36482). Evicted DaemonSet pods get killed because they're also failed pods.

/remove-lifecycle stale

Anthony Yeh

Feb 22, 2018, 5:32:35 PM
to kubernetes/kubernetes, k8s-mirror-api-machinery-misc, Team mention

In order for evicted pods to be inspected after eviction, we do not remove the pod API object.

If the controller that creates the evicted pod is scaled down, it should kill those evicted pods first before killing any others, right? Most workload controllers don't do that today.

For something like StatefulSet, it's actually necessary to immediately delete any Pods evicted by kubelet, so the Pod name can be reused. As @janetkuo also mentioned, DaemonSet does this as well. For such controllers, you're thus not gaining anything from kubelet leaving the Pod record.

Even for something like ReplicaSet, it probably makes the most sense for the controller to delete Pods evicted by kubelet (though it doesn't do that now, see #60162) to avoid carrying along Failed Pods indefinitely.

So I would argue that in pretty much all cases, Pods with restartPolicy: Always that go to Failed should be expediently deleted by some controller, so users can't expect such Pods to stick around.

If we can agree that some controller should delete them, the only question left is which controller? I suggest that the Node controller makes the most sense: delete any Failed Pods with restartPolicy: Always that are scheduled to me. Otherwise, we effectively shift the responsibility to "all Pod/workload controllers that exist or ever will exist." Given the explosion of custom controllers that's coming thanks to CRD, I don't think it's prudent to put that responsibility on every controller author.

Otherwise it would appear that the pod simply disappeared

With the /eviction subresource and Node drains, we have already set the precedent that your Pods might simply disappear (if the eviction succeeds, the Pod is deleted from the API server) at any time, without a trace.

fejta-bot

May 23, 2018, 7:20:08 PM
to kubernetes/kubernetes, k8s-mirror-api-machinery-misc, Team mention

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

fejta-bot

Jun 22, 2018, 8:07:30 PM
to kubernetes/kubernetes, k8s-mirror-api-machinery-misc, Team mention

Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.

/lifecycle rotten
/remove-lifecycle stale

David J. M. Karlsen

Jun 23, 2018, 3:47:18 AM
to kubernetes/kubernetes, k8s-mirror-api-machinery-misc, Team mention

/remove-lifecycle rotten

fejta-bot

Sep 21, 2018, 4:41:21 AM
to kubernetes/kubernetes, k8s-mirror-api-machinery-misc, Team mention

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.

/lifecycle stale

fejta-bot

Oct 21, 2018, 4:57:43 AM
to kubernetes/kubernetes, k8s-mirror-api-machinery-misc, Team mention

Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.

/lifecycle rotten

k8s-ci-robot

Nov 20, 2018, 4:44:50 AM
to kubernetes/kubernetes, k8s-mirror-api-machinery-misc, Team mention

@fejta-bot: Closing this issue.

In response to this:

Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.


Mark the issue as fresh with /remove-lifecycle rotten.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

k8s-ci-robot

Nov 20, 2018, 4:44:56 AM
to kubernetes/kubernetes, k8s-mirror-api-machinery-misc, Team mention

Closed #54525.

fejta-bot

Nov 20, 2018, 4:44:58 AM
to kubernetes/kubernetes, k8s-mirror-api-machinery-misc, Team mention

Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/close

Kubernetes Prow Robot

Jul 2, 2019, 5:38:23 AM
to kubernetes/kubernetes, k8s-mirror-api-machinery-misc, Team mention

Reopened #54525.

Tomáš Nožička

Jul 2, 2019, 5:38:23 AM
to kubernetes/kubernetes, k8s-mirror-api-machinery-misc, Team mention

/reopen
/remove-lifecycle rotten
/sig apps

Yeah, some controllers like DS delete evicted pods on their own, and StatefulSet needs it because of pod identity.

If the controller that creates the evicted pod is scaled down, it should kill those evicted pods first before killing any others, right? Most workload controllers don't do that today.

I think at some point we made Deployments not count Evicted pods in their state, as it was causing problems.

So I would argue that in pretty much all cases, Pods with restartPolicy: Always that go to Failed should be expediently deleted by some controller, so users can't expect such Pods to stick around.

Except when someone creates the pod manually (not by a controller); then they likely care about it being evicted.

How about we make the workload controllers (that use restartPolicy: Always) default .spec.ttlSecondsAfterFinished to some reasonable value? That would clean those up and also give a chance to see them for a while if desired.
ref: https://github.com/kubernetes/enhancements/blob/master/keps/sig-apps/0026-ttl-after-finish.md#finished-pods

Kubernetes Prow Robot

Jul 2, 2019, 5:38:34 AM
to kubernetes/kubernetes, k8s-mirror-api-machinery-misc, Team mention

@tnozicka: Reopened this issue.


Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

fejta-bot

Sep 30, 2019, 6:09:07 AM
to kubernetes/kubernetes, k8s-mirror-api-machinery-misc, Team mention

Issues go stale after 90d of inactivity.

Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

fejta-bot

Oct 30, 2019, 6:52:15 AM
to kubernetes/kubernetes, k8s-mirror-api-machinery-misc, Team mention

Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.

/lifecycle rotten



Oliver L Schoenborn

Mar 5, 2020, 2:45:14 PM
to kubernetes/kubernetes, k8s-mirror-api-machinery-misc, Team mention

@dashpole is this documented anywhere in the online docs?

David Ashpole

Mar 5, 2020, 6:43:33 PM
to kubernetes/kubernetes, k8s-mirror-api-machinery-misc, Team mention

@schollii I suppose we don't document the list of things that do not happen during evictions. The out-of-resource documentation says "it terminates all of its containers and transitions its PodPhase to Failed". It doesn't explicitly call out that it does not set the deletion timestamp.

Some googling says you can reference evicted pods with --field-selector=status.phase=Failed. You should be able to list, delete, etc., with that.

Oliver L Schoenborn

Mar 6, 2020, 8:26:59 AM
to kubernetes/kubernetes, k8s-mirror-api-machinery-misc, Team mention

@dashpole I saw the mentions of --field-selector=status.phase=Failed, but the problem there is that the "reason" field is actually what says "Evicted", so there could be failed pods that were not evicted. And you cannot select on status.reason; I tried. So we are left with grepping and awking the output of get pods -o wide. This needs fixing, e.g. make status.reason selectable, or add a phase called Evicted (although I doubt this is acceptable because it's not backwards compatible), or just add a command like kubectl delete pods --evicted-only. If it can be fixed by a newbie to the k8s code base, I'd be happy to do it.
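Since status.reason is not a supported field selector, the filtering has to happen client-side. A sketch of that filtering on assumed sample data (against a live cluster the input would come from `kubectl get pods -A -o json`; the pod names here are made up):

```python
import json

# Assumed sample of a pod list; real objects carry many more fields.
pod_list = json.loads("""
{"items": [
  {"metadata": {"namespace": "default", "name": "web-1"},
   "status": {"phase": "Failed", "reason": "Evicted"}},
  {"metadata": {"namespace": "default", "name": "web-2"},
   "status": {"phase": "Failed", "reason": "Error"}},
  {"metadata": {"namespace": "default", "name": "web-3"},
   "status": {"phase": "Running"}}
]}
""")

# Keep only pods that are Failed *and* whose reason is "Evicted" --
# phase alone also matches other failure modes (e.g. reason "Error").
evicted = [
    f"{p['metadata']['namespace']}/{p['metadata']['name']}"
    for p in pod_list["items"]
    if p["status"].get("phase") == "Failed"
    and p["status"].get("reason") == "Evicted"
]
print(evicted)  # -> ['default/web-1']
```

The resulting namespace/name pairs could then be fed to `kubectl delete pod` one by one.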

barney-s

Mar 14, 2020, 3:30:30 AM
to kubernetes/kubernetes, k8s-mirror-api-machinery-misc, Team mention

should we be explicit in setting the deletion timestamp?

Daniel Smith

Mar 16, 2020, 11:47:54 AM
to kubernetes/kubernetes, k8s-mirror-api-machinery-misc, Team mention

grepping and awking the output of get pods -o wide

Use jq and -o json for stuff like this.

There's a "podgc" controller which deletes old pods; is it not triggering for evicted pods? How many do you accumulate? Why is it problematic?

should we be explicit in setting the deletion timestamp?

I am not sure what the contract between kubelet / scheduler / controller is for evictions. Which entity is supposed to delete the pod? I assume they are not deleted by kubelet to give signal to scheduler/controller about the lack of fit?

Ning Xie

Feb 5, 2021, 12:08:23 AM
to kubernetes/kubernetes, k8s-mirror-api-machinery-misc, Team mention

Should Deployment check and delete Failed pods like what has been done in StatefulSet and DaemonSet?

Ning Xie

Feb 5, 2021, 12:09:48 AM
to kubernetes/kubernetes, k8s-mirror-api-machinery-misc, Team mention

Or should pod GC come in and cover this for other resources besides StatefulSet and DaemonSet?

Ning Xie

Feb 5, 2021, 12:12:45 AM
to kubernetes/kubernetes, k8s-mirror-api-machinery-misc, Team mention

Just for anyone also interested in how failed pod deletion is done in the StatefulSet controller: https://github.com/kubernetes/kubernetes/blob/c5759ab86d9813269bd61108dec43ef36a993e02/pkg/controller/statefulset/stateful_set_control.go#L384-L394

Matthias Bertschy

Jun 25, 2021, 9:45:16 AM
to kubernetes/kubernetes, k8s-mirror-api-machinery-misc, Team mention

/triage accepted
/help
/priority important-longterm

Kubernetes Prow Robot

Jun 25, 2021, 9:45:18 AM
to kubernetes/kubernetes, k8s-mirror-api-machinery-misc, Team mention

@matthyx:
This request has been marked as needing help from a contributor.

Please ensure the request meets the requirements listed here.

If this request no longer meets these requirements, the label can be removed
by commenting with the /remove-help command.

In response to this:

/triage accepted
/help
/priority important-longterm

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

Bryan Boreham

Dec 15, 2021, 5:57:52 AM
to kubernetes/kubernetes, k8s-mirror-api-machinery-misc, Team mention

According to StackOverflow:

the evicted pods will hang around until the number of them reaches the terminated-pod-gc-threshold limit (it's an option of kube-controller-manager and is equal to 12500 by default)
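A toy model of that GC rule, with the threshold lowered to 2 for illustration (the real default is 12500, set via the kube-controller-manager flag `--terminated-pod-gc-threshold`; the pod names and timestamps below are made up). Once terminated pods exceed the threshold, the oldest ones beyond it become deletion candidates:

```python
# Toy model of the PodGC threshold behavior, assuming oldest-first deletion.
threshold = 2  # stands in for --terminated-pod-gc-threshold (default 12500)

terminated = [
    {"name": "a", "creationTimestamp": "2020-01-01T00:00:00Z"},
    {"name": "b", "creationTimestamp": "2020-01-03T00:00:00Z"},
    {"name": "c", "creationTimestamp": "2020-01-02T00:00:00Z"},
]

# Keep the newest `threshold` pods; everything older is a deletion candidate.
by_age = sorted(terminated, key=lambda p: p["creationTimestamp"], reverse=True)
to_delete = [p["name"] for p in by_age[threshold:]]
print(to_delete)  # -> ['a']
```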



Kubernetes Triage Robot

Feb 8, 2023, 3:27:52 AM
to kubernetes/kubernetes, k8s-mirror-api-machinery-misc, Team mention

This issue has not been updated in over 1 year, and should be re-triaged.

You can:

  • Confirm that this issue is still relevant with /triage accepted (org members only)
  • Close this issue with /close

For more details on the triage process, see https://www.kubernetes.dev/docs/guide/issue-triage/

/remove-triage accepted



Kubernetes Prow Robot

Feb 8, 2023, 3:27:56 AM
to kubernetes/kubernetes, k8s-mirror-api-machinery-misc, Team mention

This issue is currently awaiting triage.

If a SIG or subproject determines this is a relevant issue, they will accept it by applying the triage/accepted label and provide further guidance.

The triage/accepted label can be added by org members by writing /triage accepted in a comment.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.



Kevin Hannon

Dec 11, 2024, 10:48:48 AM
to kubernetes/kubernetes, k8s-mirror-api-machinery-misc, Team mention

/close

This ticket is old and the information seems outdated.

Please try to reproduce on a newer cluster and open a new ticket if the problem still exists.



Kubernetes Prow Robot

Dec 11, 2024, 10:48:50 AM
to kubernetes/kubernetes, k8s-mirror-api-machinery-misc, Team mention

Closed #54525 as completed.


