[RFC] Protecting users of kubectl delete


Eddie Zaneski

May 27, 2021, 3:35:23 PM
to kuberne...@googlegroups.com, kubernete...@googlegroups.com

Hi Kuberfriendos,


We wanted to start a discussion about mitigating some of the potential footguns in kubectl.


Over the years we've heard stories from users who accidentally deleted resources in their clusters. This trend seems to be rising lately as newer folks venture into the Kubernetes/DevOps/Infra world.


First some background.


When a namespace is deleted it also deletes all of the resources under it. The deletion runs without further confirmation, and can be devastating if accidentally run against the wrong namespace (e.g. thanks to hasty tab completion use).


```

kubectl delete namespace prod-backup

```


When all namespaces are deleted essentially all resources are deleted. This deletion is trivial to do with the `--all` flag, and it also runs without further confirmation. It can effectively wipe out a whole cluster.


```

kubectl delete namespace --all

```


The difference between `--all` and `--all-namespaces` can be confusing.


There are certainly things cluster operators should be doing to help prevent this user error (like locking down permissions) but we'd like to explore what we can do to help end users as maintainers.


There are a few changes we'd like to propose to start discussion. We plan to introduce this as a KEP but wanted to gather early thoughts.


Change 1: Require confirmation when deleting with --all and --all-namespaces


Confirmation when deleting with `--all` and `--all-namespaces` is a long requested feature but we've historically determined this to be a breaking change and declined to implement. Existing scripts would require modification or break. While it is indeed breaking, we believe this change is necessary to protect users.


We propose moving towards requiring confirmation for deleting resources with `--all` and `--all-namespaces` over 3 releases (1 year). This gives us ample time to warn users and communicate the change through blogs and release notes.

  • Alpha

    • Introduce a flag like `--ask-for-confirmation | -i` that requires confirmation when deleting ANY resource. For example the `rm` command to delete files on a machine has this built in with `-i`. This provides a temporary safety mechanism for users to start using now.

    • Add a flag to enforce the current behavior and skip confirmation. `--force` is already used for removing stuck resources (see change 3 below) so we may want to use `--auto-approve` (inspired by Terraform). Usage of `--ask-for-confirmation` will always take precedence and ignore `--auto-approve`. We can see this behavior with `rm -rfi` (see the sketch after this list).

 -i          Request confirmation before attempting to remove each file, regardless of the file's permissions, or whether or not the standard input device is a terminal.  The -i option overrides any previous -f options.

    • Begin warning to stderr that by version x.x.x deleting with `--all` and `--all-namespaces` will require interactive confirmation or the `--auto-approve` flag.

    • Introduce a 10 second sleep when deleting with `--all` or `--all-namespaces` before proceeding to give the user a chance to react to the warning and interrupt their command.

  • Beta

    • Address user feedback from alpha.

  • GA

    • Deleting with `--all` or `--all-namespaces` now requires interactive confirmation as the default unless `--auto-approve` is passed.

    • Remove the 10-second deletion delay introduced in the alpha, and stop printing the deletion warning when interactive mode is disabled.
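
To make the proposed flag interaction concrete, here is a minimal sketch in Go of how the precedence between a confirmation flag and a skip-confirmation flag could be resolved. The flag names (`--ask-for-confirmation`, `--auto-approve`) come from the proposal above; the helper itself is purely illustrative and is not kubectl's actual implementation.

```go
// Illustrative only: a sketch of the confirmation gate proposed above.
// Flag names mirror the proposal (--ask-for-confirmation / --auto-approve);
// none of this is kubectl's real code.
package main

import (
	"bufio"
	"fmt"
	"os"
	"strings"
)

// confirmDelete returns true if the deletion should proceed.
// askForConfirmation always wins over autoApprove, mirroring `rm -rfi`,
// where -i overrides any previous -f.
func confirmDelete(askForConfirmation, autoApprove, allOrAllNamespaces bool, target string) bool {
	interactive := askForConfirmation || (allOrAllNamespaces && !autoApprove)
	if !interactive {
		return true // current behavior: delete immediately
	}
	fmt.Fprintf(os.Stderr, "You are about to delete %s. Continue? (y/N): ", target)
	reply, _ := bufio.NewReader(os.Stdin).ReadString('\n')
	return strings.EqualFold(strings.TrimSpace(reply), "y")
}

func main() {
	if confirmDelete(false, false, true, "ALL namespaces") {
		fmt.Println("deleting...")
	} else {
		fmt.Println("aborted")
	}
}
```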


Change 2: Throw an error when --namespace provided to cluster-scoped resource deletion


Since namespaces are a cluster resource using the `--namespace | -n` flag when deleting them should error. This flag has no effect on cluster resources and confuses users. We believe this to be an implementation bug that should be fixed for cluster scoped resources. Although it is true that this may break scripts that are incorrectly including the flag on intentional mass deletion operations, the inconvenience to those users of removing the misused flag must be weighed against the material harm this implementation mistake is currently causing to other users in production. This will follow a similar rollout to above.
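
For illustration only, a check like the one proposed here would key off whether the target resource is cluster-scoped. The sketch below is hypothetical (it is not the actual kubectl code path) and assumes a resolved RESTMapping from the apimachinery libraries is already available.

```go
// Hypothetical sketch of the proposed Change 2 validation, assuming a
// resolved RESTMapping is available; not the actual kubectl code path.
package sketch

import (
	"fmt"

	"k8s.io/apimachinery/pkg/api/meta"
)

// validateNamespaceFlag returns an error when an explicit --namespace was
// passed for a cluster-scoped resource (e.g. `kubectl delete ns foo -n bar`).
func validateNamespaceFlag(mapping *meta.RESTMapping, namespaceExplicit bool, namespace string) error {
	if namespaceExplicit && mapping.Scope.Name() == meta.RESTScopeNameRoot {
		return fmt.Errorf("resource %s is cluster-scoped; the --namespace=%s flag has no effect and is rejected",
			mapping.Resource.Resource, namespace)
	}
	return nil
}
```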


Change 3: Rename related flags that commonly cause confusion


The `--all` flag should be renamed to `--all-instances`. This makes it entirely clear which "all" it refers to. This would follow a 3-release rollout as well, starting with the new flag and warning about deprecation.


The `--force` flag is also a frequent source of confusion, and users do not understand what exactly is being forced. Alongside the `--all` change (in the same releases), we should consider renaming `--force` to something like `--force-reference-removal`.


These are breaking changes that shouldn't be taken lightly. Scripts, docs, and applications will all need to be modified. Putting on our empathy hats we believe that the benefits and protections to users are worth the hassle. We will do all we can to inform users of these impending changes and follow our standard guidelines for deprecating a flag.


Please see the following for examples of users requesting or running into this. This is a sample from a 5 minute search.


From GitHub:

From StackOverflow:

Eddie Zaneski - on behalf of SIG CLI

Tim Hockin

May 27, 2021, 3:47:41 PM
to Eddie Zaneski, Kubernetes developer/contributor discussion, kubernetes-sig-cli
On Thu, May 27, 2021 at 12:35 PM Eddie Zaneski <eddi...@gmail.com> wrote:

Hi Kuberfriendos,


We wanted to start a discussion about mitigating some of the potential footguns in kubectl.


Over the years we've heard stories from users who accidentally deleted resources in their clusters. This trend seems to be rising lately as newer folks venture into the Kubernetes/DevOps/Infra world.


First some background.


When a namespace is deleted it also deletes all of the resources under it. The deletion runs without further confirmation, and can be devastating if accidentally run against the wrong namespace (e.g. thanks to hasty tab completion use).


```

kubectl delete namespace prod-backup

```


When all namespaces are deleted essentially all resources are deleted. This deletion is trivial to do with the `--all` flag, and it also runs without further confirmation. It can effectively wipe out a whole cluster.


```

kubectl delete namespace --all

```


The difference between `--all` and `--all-namespaces` can be confusing.


There are certainly things cluster operators should be doing to help prevent this user error (like locking down permissions) but we'd like to explore what we can do to help end users as maintainers.


There are a few changes we'd like to propose to start discussion. We plan to introduce this as a KEP but wanted to gather early thoughts.


Change 1: Require confirmation when deleting with --all and --all-namespaces


Confirmation when deleting with `--all` and `--all-namespaces` is a long requested feature but we've historically determined this to be a breaking change and declined to implement. Existing scripts would require modification or break. While it is indeed breaking, we believe this change is necessary to protect users.


We propose moving towards requiring confirmation for deleting resources with `--all` and `--all-namespaces` over 3 releases (1 year). This gives us ample time to warn users and communicate the change through blogs and release notes.


Can we start with a request for confirmation when the command is run interactively and a printed warning (and maybe the sleep). 
 

Change 2: Throw an error when --namespace provided to cluster-scoped resource deletion


Since namespaces are a cluster resource using the `--namespace | -n` flag when deleting them should error. This flag has no effect on cluster resources and confuses users. We believe this to be an implementation bug that should be fixed for cluster scoped resources. Although it is true that this may break scripts that are incorrectly including the flag on intentional mass deletion operations, the inconvenience to those users of removing the misused flag must be weighed against the material harm this implementation mistake is currently causing to other users in production. This will follow a similar rollout to above.


The "material harm" here feels very low and I am not convinced it rises to the level of breaking users. 
 

Change 3: Rename related flags that commonly cause confusion


The `--all` flag should be renamed to `--all-instances`. This makes it entirely clear which "all" it refers to. This would follow a 3-release rollout as well, starting with the new flag and warning about deprecation.


I think 3 releases is too aggressive to break users.  We know that it takes months or quarters for releases to propagate into providers' stable-channels.  In the meantime, docs and examples all over the internet will be wrong.

If we're to undertake any such change I think it needs to be more gradual.  Consider 6 to 9 releases instead.  Start by adding new forms and warning on use of the old forms.  Then add small sleeps to the deprecated forms.  Then make the sleeps longer and the warnings louder.  By the time it starts hurting people there will be ample information all over the internet about how to fix it.  Even then, the old commands will still work (even if slowly) for a long time.  And in fact, maybe we should leave it in that state permanently.  Don't break users, just annoy them.
 
Tim

Brian Topping

May 27, 2021, 3:54:40 PM
to Eddie Zaneski, kuberne...@googlegroups.com, kubernete...@googlegroups.com
Please also consider this issue

There are good examples of solving this issue in Rook and Gardener. My personal preference from those projects is requiring an annotation to be placed on critical resources before any deletion workflow is allowed to start. If these annotation requirements could be defined declaratively, projects and users could create the constraints on installation. As well, the constraints could be removed if they became onerous in dev / test environments. 

Creating basic safeguards is not just about junior users: I have deleted massive amounts of infrastructure several times because I was in the wrong kubectl context. I can't judge whether I am an idiot or not.

I have started giving windows with critical resources present obnoxious backgrounds that are unmistakable. Another idea in this genre is better kubectl support for PS1 resources. Contexts and/or namespaces could contain API resources with PS1 sequences that are played when a context is activated. Again, this would be easily modified or removed when they aren't desired.


Tim Hockin

May 27, 2021, 3:58:58 PM
to Brian Topping, Eddie Zaneski, Kubernetes developer/contributor discussion, kubernetes-sig-cli
default context is a good point.

I'd like a way to set my kubeconfig to not have defaults, and to REQUIRE me to specify --context or --cluster and --namespace.  I have absolutely flubbed this many times.

Jordan Liggitt

May 27, 2021, 3:59:43 PM
to Tim Hockin, Eddie Zaneski, Kubernetes developer/contributor discussion, kubernetes-sig-cli
I appreciate the desire to help protect users, but I agree with Tim that rollouts take way longer than you expect, and that the bar for breaking existing users that are successful is very high.

The project's deprecation periods are the minimum required. For the core options of the core commands of a tool like kubectl which is used as a building block, I don't think we should ever break compatibility if we can possibly avoid it.


On Thu, May 27, 2021 at 3:47 PM 'Tim Hockin' via Kubernetes developer/contributor discussion <kuberne...@googlegroups.com> wrote:
On Thu, May 27, 2021 at 12:35 PM Eddie Zaneski <eddi...@gmail.com> wrote:

Change 1: Require confirmation when deleting with --all and --all-namespaces


Confirmation when deleting with `--all` and `--all-namespaces` is a long requested feature but we've historically determined this to be a breaking change and declined to implement. Existing scripts would require modification or break. While it is indeed breaking, we believe this change is necessary to protect users.


We propose moving towards requiring confirmation for deleting resources with `--all` and `--all-namespaces` over 3 releases (1 year). This gives us ample time to warn users and communicate the change through blogs and release notes.


Can we start with a request for confirmation when the command is run interactively and a printed warning (and maybe the sleep). 

+1 for limiting behavior changes to interactive runs, and starting with warnings and maybe sleeps.
 
 

Change 2: Throw an error when --namespace provided to cluster-scoped resource deletion


Since namespaces are a cluster resource using the `--namespace | -n` flag when deleting them should error. This flag has no effect on cluster resources and confuses users. We believe this to be an implementation bug that should be fixed for cluster scoped resources. Although it is true that this may break scripts that are incorrectly including the flag on intentional mass deletion operations, the inconvenience to those users of removing the misused flag must be weighed against the material harm this implementation mistake is currently causing to other users in production. This will follow a similar rollout to above.


The "material harm" here feels very low and I am not convinced it rises to the level of breaking users. 

Setting the namespace context of an invocation is equivalent to putting a default namespace in your kubeconfig file. I don't think we should break compatibility with this option. It is likely to disrupt tools that wrap kubectl and set common options on all kubectl invocations.

 
 

Change 3: Rename related flags that commonly cause confusion


The `--all` flag should be renamed to `--all-instances`. This makes it entirely clear which "all" it refers to. This would follow a 3-release rollout as well, starting with the new flag and warning about deprecation.


I think 3 releases is too aggressive to break users.  We know that it takes months or quarters for releases to propagate into providers' stable-channels.  In the meantime, docs and examples all over the internet will be wrong.

If we're to undertake any such change I think it needs to be more gradual.  Consider 6 to 9 releases instead.  Start by adding new forms and warning on use of the old forms.  Then add small sleeps to the deprecated forms.  Then make the sleeps longer and the warnings louder.  By the time it starts hurting people there will be ample information all over the internet about how to fix it.  Even then, the old commands will still work (even if slowly) for a long time.  And in fact, maybe we should leave it in that state permanently.  Don't break users, just annoy them.

If we wanted to add parallel flag names controlling the same variables and hide the old flags, that could be ok, but we should never remove the old flags. Even adding parallel flags means the ecosystem gets fragmented between scripts written against the latest kubectl and ones written using previous flags.

Clayton Coleman

May 27, 2021, 4:07:02 PM
to Eddie Zaneski, kubernetes-dev, kubernetes-sig-cli
This is somewhat terrifying to me from a backward compatibility perspective.  We have never changed important flags like this, and we have in fact explicitly stated we should not.  I might almost argue that if we were to do this, we'd create a new CLI that has different flags
 



These are breaking changes that shouldn't be taken lightly. Scripts, docs, and applications will all need to be modified. Putting on our empathy hats we believe that the benefits and protections to users are worth the hassle. We will do all we can to inform users of these impending changes and follow our standard guidelines for deprecating a flag.


Please see the following for examples of users requesting or running into this. This is a sample from a 5 minute search.


From GitHub:

From StackOverflow:

Eddie Zaneski - on behalf of SIG CLI


Brendan Burns

May 27, 2021, 4:21:44 PM
to ccoleman, Eddie Zaneski, kubernetes-dev, kubernetes-sig-cli
I'd like to suggest an alternate approach that is more opt-in and is also backward compatible.

We can add an annotation ("k8s.io/confirm-delete: true") to a Pod and if that annotation is present, prompt for confirmation of the delete. We might also consider "k8s.io/lock" which actively blocks the delete.

We could also support those annotations at a namespace level if we wanted to.

This is similar to Management Locks that we introduced in Azure (https://docs.microsoft.com/en-us/rest/api/resources/managementlocks) for similar reasons to prevent accidental deletes and force an explicit action (remove the lock) for a delete to proceed.

--brendan
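
As a rough illustration of how the annotation approach above could be enforced server-side, here is a sketch of a validating admission webhook handler that denies DELETE requests for objects carrying a lock-style annotation. The annotation key and handler are hypothetical; this is not an existing Kubernetes or OpenKruise API, just one way such a guard might look.

```go
// Hypothetical sketch: a validating admission webhook that denies DELETE
// for objects annotated with a lock-style annotation, in the spirit of the
// "k8s.io/lock" idea above. Not an existing Kubernetes API or component.
package main

import (
	"encoding/json"
	"net/http"

	admissionv1 "k8s.io/api/admission/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

const lockAnnotation = "example.k8s.io/delete-lock" // hypothetical key

func handleDelete(w http.ResponseWriter, r *http.Request) {
	var review admissionv1.AdmissionReview
	if err := json.NewDecoder(r.Body).Decode(&review); err != nil || review.Request == nil {
		http.Error(w, "malformed AdmissionReview", http.StatusBadRequest)
		return
	}

	// For DELETE, the object being removed arrives in OldObject.
	var obj struct {
		Metadata metav1.ObjectMeta `json:"metadata"`
	}
	_ = json.Unmarshal(review.Request.OldObject.Raw, &obj)

	resp := &admissionv1.AdmissionResponse{UID: review.Request.UID, Allowed: true}
	if obj.Metadata.Annotations[lockAnnotation] == "true" {
		resp.Allowed = false
		resp.Result = &metav1.Status{
			Message: "object is locked against deletion; remove the " + lockAnnotation + " annotation first",
		}
	}

	review.Response = resp
	_ = json.NewEncoder(w).Encode(review)
}

func main() {
	http.HandleFunc("/validate", handleDelete)
	// A real webhook must serve TLS; plain HTTP here keeps the sketch short.
	_ = http.ListenAndServe(":8443", nil)
}
```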




Jordan Liggitt

May 27, 2021, 4:23:08 PM
to Brendan Burns, ccoleman, Eddie Zaneski, kubernetes-dev, kubernetes-sig-cli
I like the "opt into deletion protection" approach. That got discussed a long time ago (e.g. https://github.com/kubernetes/kubernetes/pull/17740#issuecomment-217461024), but didn't get turned into a proposal/implementation

There's a variety of ways that could be done... server-side and enforced, client-side as a hint, etc.

Daniel Smith

May 27, 2021, 5:11:24 PM
to Jordan Liggitt, Brendan Burns, ccoleman, Eddie Zaneski, kubernetes-dev, kubernetes-sig-cli
I'm in favor of server-side enforced deletion protection, however it's not clear how that will protect a single "locked" item in a namespace if someone deletes the entire namespace.

The last deletion protection mechanism conversation that comes to mind got bogged down in, well what if multiple actors all want to lock an object, how do you know that they have all unlocked it? I can imagine a mechanism like Finalizers (Brian suggested this--"liens"), but I'm not convinced the extra complexity (and implied delay agreeing on / building something) is worth it.

I think I disagree with all those who don't want to make kubectl safer for fear of breaking users, because I think there's probably some middle ground, e.g. I can imagine something like: detect if a TTY is present; if so, give warnings / make them confirm destructive action; otherwise, assume it's a script that's already been tested and just execute it.




Benjamin Elder

May 27, 2021, 5:20:04 PM
to Daniel Smith, Jordan Liggitt, Brendan Burns, ccoleman, Eddie Zaneski, kubernetes-dev, kubernetes-sig-cli
This is perhaps veering a bit off topic, but FWIW detecting actual interactivity can be tricky ... e.g. when operating in travis-ci you will detect a TTY because it intentionally spoofs one to get the same output from tools that developers see in their terminals.

https://github.com/kubernetes-sigs/kind/pull/1479/files
https://github.com/travis-ci/travis-ci/issues/8193
https://github.com/travis-ci/travis-ci/issues/1337

I wouldn't recommend the TTY detection route in particular.
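
For reference, TTY detection along the lines discussed here typically boils down to something like the sketch below (using golang.org/x/term). The CI environment-variable check is only an illustration of the kind of vendor-specific workaround kind resorted to; it is not exhaustive and not a recommendation.

```go
// Sketch of the kind of interactivity check being discussed; illustrative only.
package main

import (
	"fmt"
	"os"

	"golang.org/x/term"
)

// probablyInteractive guesses whether a human is at the other end.
// The TTY check alone is unreliable: some CI providers (e.g. Travis)
// deliberately allocate a pseudo-terminal, so an environment-variable
// check is shown here as an example of a vendor-specific workaround.
func probablyInteractive() bool {
	if os.Getenv("CI") != "" || os.Getenv("TRAVIS") == "true" {
		return false // looks like a CI pipeline even if a TTY is present
	}
	return term.IsTerminal(int(os.Stdin.Fd())) && term.IsTerminal(int(os.Stdout.Fd()))
}

func main() {
	fmt.Println("interactive:", probablyInteractive())
}
```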

Antonio Ojea

May 27, 2021, 6:07:57 PM
to Benjamin Elder, Daniel Smith, Jordan Liggitt, Brendan Burns, ccoleman, Eddie Zaneski, kubernetes-dev, kubernetes-sig-cli

raghvenders raghvenders

May 28, 2021, 12:12:56 AM
to Antonio Ojea, Benjamin Elder, Brendan Burns, Daniel Smith, Eddie Zaneski, Jordan Liggitt, ccoleman, kubernetes-dev, kubernetes-sig-cli
Is it worth considering an old-school substitution, where a delete archives the resource so it can be restored later, or could the same be achieved through a scheduled eviction process for deletes?

Regards,
Raghvender

abhishek....@gmail.com

May 28, 2021, 6:56:53 AM
to Kubernetes developer/contributor discussion
My +1 to this proposal.

As much as we care about giving utility to all users, it is also a basic need to provide some cover from accidental disasters. RBAC is a very wide topic and I understand Kubernetes administrators have the responsibility to restrict access.
At the same time there are cases where a cluster is very big, with many applications on it, and admin access to a namespace has to be given to different people to ease some work. In the end we are all human; a single "--all" or "-A | --all-namespaces" is all it takes to bring down an otherwise running cluster with a 'delete' call.
I would say it is very possible for anyone to make such a mistake, but the price must not be the whole cluster going down.
That's the same reason "rm" in Linux has an "--interactive|-i" option: even experts sometimes make such mistakes.
I am in total favor of having something like "--interactive|-i" or "--ask-for-confirmation" in place as an alpha feature with a warning at first, and then slowly graduating it to GA. That would give everyone a lot of time to change any breaking automation scripts.

Siyu Wang

May 28, 2021, 6:56:53 AM
to Eddie Zaneski, kuberne...@googlegroups.com, kubernete...@googlegroups.com
Hi, you may look at the OpenKruise project. The latest v0.9.0 version provides a feature called Deletion Protection, which can protect not only namespaces from cascading deletion but also other resources like workloads and CRDs.

The webhook-based defense also protects against deletion operations coming from kubectl or any other API source.



Rory McCune

May 28, 2021, 6:56:53 AM
to Kubernetes developer/contributor discussion
Hi All, 

Looking at this, and seeing that making changes to the operation of kubectl will take a while, would it make sense to start with some more guidance for cluster operators around least privilege RBAC designs and using things like impersonation to reduce the risk of mistakes being made?

If I relate this back to other setups like Windows domain admin, standard good practice is for them not to use their domain admin account for day to day administration but to have a separate account to use where destructive actions are needed. Then of course in Linux we have sudo.

If cluster operators made use of read-only accounts for standard troubleshooting and then had impersonation rights to an account with deletion rights, it may reduce the likelihood of accidents happening as an additional switch would need to be provided.

Kind Regards

Rory

Douglas Schilling Landgraf

May 28, 2021, 7:48:57 AM
to Jordan Liggitt, Brendan Burns, ccoleman, Eddie Zaneski, kubernetes-dev, kubernetes-sig-cli
On Thu, May 27, 2021 at 4:23 PM 'Jordan Liggitt' via
kubernetes-sig-cli <kubernete...@googlegroups.com> wrote:
>
> I like the "opt into deletion protection" approach. That got discussed a long time ago (e.g. https://github.com/kubernetes/kubernetes/pull/17740#issuecomment-217461024), but didn't get turned into a proposal/implementation
>

+1 Recently I talked with a coworker looking for such a feature.

> There's a variety of ways that could be done... server-side and enforced, client-side as a hint, etc.
>
> On Thu, May 27, 2021 at 4:21 PM 'Brendan Burns' via Kubernetes developer/contributor discussion <kuberne...@googlegroups.com> wrote:
>>
>> I'd like to suggest an alternate approach that is more opt-in and is also backward compatible.
>>
>> We can add an annotation ("k8s.io/confirm-delete: true") to a Pod and if that annotation is present, prompt for confirmation of the delete. We might also
> consider "k8s.io/lock" which actively blocks the delete.

Annotation seems a pretty straightforward approach IMO if such a
feature was enabled in the cluster by the user.

Tim Hockin

May 28, 2021, 10:31:16 AM
to Douglas Schilling Landgraf, Jordan Liggitt, Brendan Burns, ccoleman, Eddie Zaneski, kubernetes-dev, kubernetes-sig-cli
There are lots of good ideas here.  I look forward to a solution that takes the best parts of each of them :)

Zizon Qiu

May 28, 2021, 10:58:23 AM
to Brendan Burns, ccoleman, Eddie Zaneski, kubernetes-dev, kubernetes-sig-cli
On Fri, May 28, 2021 at 4:21 AM 'Brendan Burns' via Kubernetes developer/contributor discussion <kuberne...@googlegroups.com> wrote:
I'd like to suggest an alternate approach that is more opt-in and is also backward compatible.

We can add an annotation ("k8s.io/confirm-delete: true") to a Pod and if that annotation is present, prompt for confirmation of the delete. We might also consider "k8s.io/lock" which actively blocks the delete.
 
Or abuse the existing finalizer mechanism.  

Daniel Smith

May 28, 2021, 11:35:14 AM
to Zizon Qiu, Brendan Burns, ccoleman, Eddie Zaneski, kubernetes-dev, kubernetes-sig-cli
Finalizers prevent a deletion from finishing, not from starting.

Tim Hockin

May 28, 2021, 12:14:40 PM
to Zizon Qiu, Brendan Burns, ccoleman, Eddie Zaneski, kubernetes-dev, kubernetes-sig-cli
On Fri, May 28, 2021 at 7:58 AM Zizon Qiu <zzd...@gmail.com> wrote:
On Fri, May 28, 2021 at 4:21 AM 'Brendan Burns' via Kubernetes developer/contributor discussion <kuberne...@googlegroups.com> wrote:
I'd like to suggest an alternate approach that is more opt-in and is also backward compatible.

We can add an annotation ("k8s.io/confirm-delete: true") to a Pod and if that annotation is present, prompt for confirmation of the delete. We might also consider "k8s.io/lock" which actively blocks the delete.
 
Or abuse the existing finalizer mechanism.  

Finalizers are not "deletion inhibitors" just "deletion delayers".  Once you delete, the finalizer might stop it from happening YET but it *is* going to happen.  I'd rather see a notion of opt-in delete-inhibit.  It is not clear to me what happens if I have a delete-inhibit on something inside a namespace and then try to delete the namespace - we don't have transactions, so we can't abort the whole thing - it would be stuck in a weird partially-deleted state and I expect that to be a never-ending series of bug reports.

 

Tim Hockin

May 28, 2021, 12:55:32 PM
to Zizon Qiu, Brendan Burns, ccoleman, Eddie Zaneski, kubernetes-dev, kubernetes-sig-cli


On Fri, May 28, 2021 at 9:21 AM Zizon Qiu <zzd...@gmail.com> wrote:
I'm thinking of finalizers as some kind of reference counter, like smart pointers in C++ or something like that.

Resources are deallocated when the counter drops to zero (no more finalizers),
and kept alive whenever the counter > 0 (with any arbitrary finalizer).

That's correct, but there's a fundamental difference between "alive" and "waiting to die".  A delete operation moves an object, irrevocably from "alive" to "waiting to die".  That is a visible "state" (the deletionTimestamp is set) and there's no way to come back from it.  Let's not abuse that to mean something else.
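
To make the "alive" versus "waiting to die" distinction concrete, the state described here is visible directly on object metadata. The helpers below are just an illustration using the standard apimachinery types, not an API anyone is proposing.

```go
// Illustration of the "alive" vs "waiting to die" distinction using the
// standard object metadata fields; not a proposed API.
package sketch

import (
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

// isTerminating reports whether a delete has already been accepted for the
// object. Once deletionTimestamp is set, finalizers can only delay removal
// (the object is "waiting to die"); they cannot move it back to "alive".
func isTerminating(meta metav1.ObjectMeta) bool {
	return meta.DeletionTimestamp != nil
}

// isBlockedByFinalizers reports whether removal is currently being delayed.
func isBlockedByFinalizers(meta metav1.ObjectMeta) bool {
	return isTerminating(meta) && len(meta.Finalizers) > 0
}
```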

Tabitha Sable

May 28, 2021, 1:54:48 PM
to Rory McCune, Kubernetes developer/contributor discussion
I really love this suggestion, Rory. I've heard it come up in other contexts before and I think it's really smart.

WDYT about taking this idea to our friends at sig-security-docs?

Tabitha

Abhishek Tamrakar

May 29, 2021, 1:13:41 AM
to Tim Hockin, Zizon Qiu, Brendan Burns, ccoleman, Eddie Zaneski, kubernetes-dev, kubernetes-sig-cli
The current deletion strategy is easy but very risky without any gates; a deletion could put the whole cluster at risk, and this is where it needs some cover.
The reason I would still prefer the client-side approach mentioned in the original proposal is that the decision to delete a certain object or objects should remain in the control of the end user, while at the same time giving them the safest way to operate the cluster.



Zizon Qiu

May 29, 2021, 1:13:46 AM
to Tim Hockin, Brendan Burns, ccoleman, Eddie Zaneski, kubernetes-dev, kubernetes-sig-cli
I'm thinking of finalizers as some kind of reference counter, like smart pointers in C++ or something like that.

Resources are deallocated when the counter drops to zero (no more finalizers),
and kept alive whenever the counter > 0 (with any arbitrary finalizer).

raghvenders raghvenders

May 29, 2021, 1:13:50 AM
to Zizon Qiu, Brendan Burns, ccoleman, Eddie Zaneski, kubernetes-dev, kubernetes-sig-cli
Or Finalizing through consensus :)

raghvenders raghvenders

Jun 1, 2021, 11:25:25 AM
to Abhishek Tamrakar, Tim Hockin, Zizon Qiu, Brendan Burns, ccoleman, Eddie Zaneski, kubernetes-dev, kubernetes-sig-cli
Since the client-side changes would potentially take about 6-9 releases, as mentioned by Tim, and are potentially breaking, a server-side solution would be a reasonable and worthy option to consider and finalize.

Quickly Summarizing the options discussed so far (Server-side):
  • Annotation and Delete Prohibitors
  • Finalizers
  • RBAC and Domain accounts/sudo-like
Please add anything I missed, or correct me if something is not an option.

And in parallel, continuing with the proposed kubectl client-side changes - Change 1 (interactive), Change 2, and Change 3 - for the targeted release timelines.


I would be curious to see what this will look like: choosing one of the three options or combining them, then a WBS, stakeholder approvals, component changes, and release rollouts.

Regards,
Raghvender


Josh Berkus

Jun 1, 2021, 12:26:34 PM
to Tim Hockin, Eddie Zaneski, Kubernetes developer/contributor discussion, kubernetes-sig-cli
On 5/27/21 12:47 PM, 'Tim Hockin' via Kubernetes developer/contributor
discussion wrote:
>
> If we're to undertake any such change I think it needs to be more
> gradual.  Consider 6 to 9 releases instead.  Start by adding new forms
> and warning on use of the old forms.  Then add small sleeps to the
> deprecated forms.  Then make the sleeps longer and the warnings louder.
> By the time it starts hurting people there will be ample information all
> over the internet about how to fix it.  Even then, the old commands will
> still work (even if slowly) for a long time.  And in fact, maybe we
> should leave it in that state permanently.  Don't break users, just
> annoy them.

My experience is that taking more releases to roll out a breaking change
doesn't really make any difference ... users just ignore the change
until it goes GA, regardless.

Also consider that releases are currently 4 months, so 6 to 9 releases
means 2 to 3 years.

What I would rather see here is a switch that supports the old behavior
in the kubectl config. Then deprecate that over 3 releases or so. So:

Alpha: feature gate
Beta: feature gate, add config switch (on if not set)
GA: on by default, config switch (off if not set)
GA +3: drop config switch -- or not?

... although, now that I think about it, is it *ever* necessary to drop
the config switch? As a scriptwriter, I prefer things I can put into my
.kube config to new switches.

Also, of vital importance here is: how many current popular CI/CD
platforms rely on automated namespace deletion? If the answer is
"several" then that's gonna slow down rollout.

--
-- Josh Berkus
Kubernetes Community Architect
OSPO, OCTO

Tim Hockin

Jun 1, 2021, 12:52:52 PM
to raghvenders raghvenders, Abhishek Tamrakar, Zizon Qiu, Brendan Burns, ccoleman, Eddie Zaneski, kubernetes-dev, kubernetes-sig-cli
On Tue, Jun 1, 2021 at 8:20 AM raghvenders raghvenders
<raghv...@gmail.com> wrote:
>
> Since client-side changes would potentially go about 6-9 releases as mentioned by Tim and potentially breaking changes, a server-side solution would be a reasonable and worthy option to consider and finalize.

To be clear - the distinction isn't really client vs. server. It's
about breaking changes without users EXPLICITLY opting in. You REALLY
can't make something that used to work suddenly stop working, whether
that is client or server implemented.

On the contrary, client-side changes like "ask for confirmation" and
"print stuff in color" are easier because they can distinguish between
interactive and non-interactive execution.

Adding a confirmation to interactive commands should not require any
particular delays in rollout.

Eddie Zaneski

Jun 1, 2021, 4:58:02 PM
to kubernetes-dev, Abhishek Tamrakar, Zizon Qiu, Brendan Burns, ccoleman, raghvenders raghvenders, Tim Hockin, kubernetes-sig-cli
Thanks to everyone for the great thoughts and discussion so far!

There are some good ideas throughout this thread (please keep them coming) that could probably stand alone as KEPs. I believe anything opt-in/server-side is orthogonal to what we're currently trying to achieve.

I think the big takeaway so far is that the flag and error changes should be separated from the warning/delay/confirmation changes.

We're thinking in the context of an imperative CLI that takes user input and executes administrative actions. Users don't intend to delete the resources they are accidentally deleting - it's not that there are things that should never be deleted. It doesn't matter how many mistakes have to pile up to create a perfect storm of a bad thing, because we're allowing a bad thing to happen without a confirmation gate.

With confirmation in place we significantly lower the chances of accidentally deleting everything in your cluster. This will most likely be the scope of our starting point.

If you want to join us for more we will be discussing during the SIG-CLI call tomorrow (Wednesday 9am PT).


Eddie Zaneski


On Tue, Jun 01, 2021 at 10:52 AM, Tim Hockin <tho...@google.com> wrote:

On Tue, Jun 1, 2021 at 8:20 AM raghvenders raghvenders
<raghvenders@gmail.com> wrote:

Since client-side changes would potentially go about 6-9 releases as mentioned by Tim and potentially breaking changes, a server-side solution would be a reasonable and worthy option to consider and finalize.

To be clear - the distinction isn't really client vs. server. It's about breaking changes without users EXPLICITLY opting in. You REALLY can't make something that used to work suddenly stop working, whether that is client or server implemented.

On the contrary, client-side changes like "ask for confirmation" and
"print stuff in color" are easier because they can distinguish between interactive and non-interactive execution.

Adding a confirmation to interactive commands should not require any particular delays in rollout.



Brian Topping

Jun 1, 2021, 5:48:40 PM
to Eddie Zaneski, kubernetes-dev, Abhishek Tamrakar, Zizon Qiu, Brendan Burns, ccoleman, raghvenders raghvenders, Tim Hockin, kubernetes-sig-cli
An especially dangerous situation is one where Ceph storage is managed by Rook. Rook itself is incredibly reliable, but hostStorage is used for the critical Placement Group (PG) maps on monitor nodes (stored in RocksDB). Loss of PG maps would result in loss of *all* PV data in the storage cluster! 

IMO this is more critical than loss of the API object store – assuming they are both backed up, restoring etcd and waiting for reconciliation is several orders of magnitude less downtime than restoring TB/PB/EB of distributed storage. Some resilient application architectures are designed not to need backup, but cannot tolerate a complete storage failure. 

Raising this observation in case it’s worth considering hierarchical confirmation gates with something basic like reference counting. It should be *even harder* to delete PV storage providers, cluster providers or other items that have multiple dependencies.

Maybe this indicates a “deletion provider interface” for pluggable tools. Default no-op implementations echo existing behavior, advanced implementations might be installed with Helm, use LDAP for decision processing and automatically archive deleted content. Let the community build these implementations instead of trying to crystal ball the best semantics. This also pushes tooling responsibility out to deployers. 

$0.02...


fillz...@gmail.com

Jun 2, 2021, 1:18:54 AM
to Kubernetes developer/contributor discussion
A server-side solution is reasonable. However, finalizers can only protect the resource itself from being deleted in etcd; the resources belonging to it will still be deleted.
A webhook might be a better way to extend this.

OpenKruise, one of the CNCF sandbox projects, already provides protection for cascading deletion.



Fury kerry

Jun 2, 2021, 1:19:06 AM
to Eddie Zaneski, kubernetes-dev, Abhishek Tamrakar, Zizon Qiu, Brendan Burns, ccoleman, raghvenders raghvenders, Tim Hockin, kubernetes-sig-cli
Server-side deletion protections are already implemented in OpenKruise (https://openkruise.io/en-us/docs/deletion_protection.html), which covers both namespace and workload cascading deletion.

--
Please consider the environment before you print this mail
Zhen Zhang
Zhejiang University
Yuquan Campus
MSN:Fury_...@hotmail.com

Eddie Zaneski

Jun 2, 2021, 5:02:43 PM
to Kubernetes developer/contributor discussion
Thanks to everyone who joined the call today and provided valuable input!

If you'd like to watch the recording you can find it here.

In summary we want to balance protecting users with breaking users. We will propose KEP 2775 to add two coupled changes:
  • Add a new `--interactive | -i` flag to kubectl delete that will require confirmation before deleting resources. This flag will be false by default.
  • `kubectl delete [--all | --all-namespaces]` will warn about the destructive action that will be performed and artificially delay for x seconds allowing users a chance to abort.
These changes are not breaking and immediately provide users a way to mitigate accidental deletions.
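
As a rough sketch of the warn-then-delay behavior described above (the wording, function names, and timing are illustrative; they are not what the KEP will necessarily specify):

```go
// Illustrative sketch of the proposed warn-then-delay behavior for
// `kubectl delete --all` / `--all-namespaces`; not actual kubectl code.
package main

import (
	"fmt"
	"os"
	"time"
)

func warnAndDelay(target string, delay time.Duration) {
	fmt.Fprintf(os.Stderr,
		"Warning: you are about to delete %s. Press Ctrl-C within %s to abort.\n",
		target, delay)
	// A plain sleep is enough: an interrupt (Ctrl-C) kills the process before
	// any delete request is sent.
	time.Sleep(delay)
}

func main() {
	warnAndDelay("ALL namespaces in the cluster", 10*time.Second)
	fmt.Println("proceeding with deletion...")
}
```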

An opt-in mechanism to default to the new interactive behavior through user config files will be a fast follow.

Once these measures are in place we will re-visit and address community feedback.

Tim Hockin

Jun 2, 2021, 7:15:35 PM
to Eddie Zaneski, Kubernetes developer/contributor discussion
Was the idea of demanding interactive confirmation when the command is
executed interactively discarded?

Tim Hockin

Jun 2, 2021, 8:08:55 PM
to Eddie Zaneski, Kubernetes developer/contributor discussion
Rephrasing for clarity:

Did we discard the idea of demanding interactive confirmation when a
"dangerous" command is executed in an interactive session? If so,
why? To me, that seems like the most approachable first vector and
likely to get a good return on investment.

Eddie Zaneski

Jun 3, 2021, 1:40:57 AM
to Kubernetes developer/contributor discussion
On Wednesday, June 2, 2021 at 6:08:55 PM UTC-6 Tim Hockin wrote:
Rephrasing for clarity:

Did we discard the idea of demanding interactive confirmation when a
"dangerous" command is executed in an interactive session? If so,
why? To me, that seems like the most approachable first vector and
likely to get a good return on investment.

Wasn't discarded. I'll include in the KEP and we can dig in for more cases of platforms doing funny things with spoofing TTY's.

Eddie Zaneski

Jun 3, 2021, 7:32:28 PM
to Kubernetes developer/contributor discussion
On Wednesday, June 2, 2021 at 11:40:57 PM UTC-6 Eddie Zaneski wrote:
On Wednesday, June 2, 2021 at 6:08:55 PM UTC-6 Tim Hockin wrote:
Rephrasing for clarity:

Did we discard the idea of demanding interactive confirmation when a
"dangerous" command is executed in an interactive session? If so,
why? To me, that seems like the most approachable first vector and
likely to get a good return on investment.

Wasn't discarded. I'll include in the KEP and we can dig in for more cases of platforms doing funny things with spoofing TTY's.

I've done a bit more digging into Ben's comments about TTY detection and think we may need to discard that route.

TravisCI is one provider we know of spoofing a TTY to trick tools into outputting things like color and status bars. It looks like CircleCI may do this as well.

kind hardcoded some vendor specific environment variables to get around this with Travis but I don't think we can/want to do that for all the vendors.

If we can't reliably detect TTY's inside these pipelines we will indeed break scripts.

Thoughts?

Tim Hockin

Jun 3, 2021, 8:11:39 PM
to Eddie Zaneski, Kubernetes developer/contributor discussion
On Thu, Jun 3, 2021 at 4:32 PM Eddie Zaneski <eddi...@gmail.com> wrote:
>
> On Wednesday, June 2, 2021 at 11:40:57 PM UTC-6 Eddie Zaneski wrote:
>>
>> On Wednesday, June 2, 2021 at 6:08:55 PM UTC-6 Tim Hockin wrote:
>>>
>>> Rephrasing for clarity:
>>>
>>> Did we discard the idea of demanding interactive confirmation when a
>>> "dangerous" command is executed in an interactive session? If so,
>>> why? To me, that seems like the most approachable first vector and
>>> likely to get a good return on investment.
>>
>>
>> Wasn't discarded. I'll include in the KEP and we can dig in for more cases of platforms doing funny things with spoofing TTY's.
>
>
> I've done a bit more digging into Ben's comments about TTY detection and think we may need to discard that route.
>
> TravisCI is one provider we know of spoofing a TTY to trick tools into outputting things like color and status bars. It looks like CircleCI may do this as well.

Well. That's unfortunate.

> kind hardcoded some vendor specific environment variables to get around this with Travis but I don't think we can/want to do that for all the vendors.
>
> If we can't reliably detect TTY's inside these pipelines we will indeed break scripts.

Yes, that's the conclusion I come to, also. Harumph.

> Thoughts?
>

Jordan Liggitt

Jun 3, 2021, 10:48:32 PM
to Tim Hockin, Eddie Zaneski, Kubernetes developer/contributor discussion
On Thu, Jun 3, 2021 at 8:11 PM 'Tim Hockin' via Kubernetes developer/contributor discussion <kuberne...@googlegroups.com> wrote:
On Thu, Jun 3, 2021 at 4:32 PM Eddie Zaneski <eddi...@gmail.com> wrote:
>
> On Wednesday, June 2, 2021 at 11:40:57 PM UTC-6 Eddie Zaneski wrote:
>>
>> On Wednesday, June 2, 2021 at 6:08:55 PM UTC-6 Tim Hockin wrote:
>>>
>>> Rephrasing for clarity:
>>>
>>> Did we discard the idea of demanding interactive confirmation when a
>>> "dangerous" command is executed in an interactive session? If so,
>>> why? To me, that seems like the most approachable first vector and
>>> likely to get a good return on investment.
>>
>>
>> Wasn't discarded. I'll include in the KEP and we can dig in for more cases of platforms doing funny things with spoofing TTY's.
>
>
> I've done a bit more digging into Ben's comments about TTY detection and think we may need to discard that route.
>
> TravisCI is one provider we know of spoofing a TTY to trick tools into outputting things like color and status bars. It looks like CircleCI may do this as well.

Well.  That's unfortunate.

> kind hardcoded some vendor specific environment variables to get around this with Travis but I don't think we can/want to do that for all the vendors.
>
> If we can't reliably detect TTY's inside these pipelines we will indeed break scripts.

Yes, that's the conclusion I come to, also.  Harumph.

If we insert a hard wait for confirmation for auto-detected TTYs, I agree that's potentially breaking (I have other questions about using stdin for confirmation when combined with other stdin uses like `-f -` or credential plugins that make use of stdin, but I'll save those questions for the KEP).

If instead of a hard wait for confirmation, we insert a stderr warning + delay for specific super-destructive things on detected TTY to give time to abort, that seems potentially acceptable.


Tim Hockin

unread,
Jun 4, 2021, 1:02:07 AMJun 4
to Jordan Liggitt, Eddie Zaneski, Kubernetes developer/contributor discussion
Potentially acceptable... Anyone using kubectl in CI will slow down... But maybe that's ok if we limit the automatic interaction to just "very scary" things and not any old "apply".

Jordan Liggitt

Jun 4, 2021, 1:04:43 AM
to Tim Hockin, Eddie Zaneski, Kubernetes developer/contributor discussion
Yeah, I'm envisioning scoping to commands like:

# delete all the things
kubectl delete namespaces --all

# ineffective namespace scoping
kubectl delete persistentvolumes --all --namespace=foo

Abhishek Tamrakar

Jun 4, 2021, 1:53:08 AM
to Jordan Liggitt, Tim Hockin, Eddie Zaneski, Kubernetes developer/contributor discussion
Strongly agree - if we limit this to only potential "purge everything" commands, it would be good to have.


Brian Topping

Jun 4, 2021, 2:10:53 AM
to Abhishek Tamrakar, Jordan Liggitt, Tim Hockin, Eddie Zaneski, Kubernetes developer/contributor discussion
This is the same theme with `kubeadm reset` when the cluster only has a single node as well...


Paco俊杰Junjie 徐Xu

Jun 4, 2021, 4:31:26 AM
to Kubernetes developer/contributor discussion
+1 for interactive confirmation. The deletion protection for Linux is `alias rm='rm -iv'`.

  • This starts rm (remove files or directories) in interactive and verbose mode, to avoid deleting files by mistake.
  • Note that the -f (force) option ignores the interactive option.
alias krm='kubectl delete -i '

Bill WANG

Jun 5, 2021, 8:50:29 AM
to Kubernetes developer/contributor discussion
Frankly I hope we don't support the `--all` option at all. If we really need this function, I can run a simple `for` loop to go through all namespaces and delete them. `--all` with delete is evil.

What I need is: when delete is run with `--all`, it should provide some warnings, such as listing some of the resources that will be deleted, waiting for confirmation, showing a different color if it can, and so on.

For changes 2 & 3, I don't really care.

Bill

raghvenders raghvenders

Jun 6, 2021, 8:56:16 PM
to Bill WANG, Kubernetes developer/contributor discussion
I am totally on board with the approach behind interactive and delete-all.

But I wish to log my initial thoughts so we may consider them for future reference or scope.
My initial thought around delete is a consensus-based approach. Below is just an example; it is not limited to this and need not look exactly the same.
For example, to delete a namespace, the same delete command has to be executed twice: by a cluster admin, followed by the namespace owner. Likewise for delete all.


The overall idea here is sharing power in a decentralized way, unlike a delete performed against a single Linux machine or file system.
It may demand a state-managed store like etcd, or another store, where we have to maintain the state for the delete consensus/rule.

Note: this is not just a change for the client; of course, backward compatibility and other component changes come in, along with changes in operational aspects for users,
and I don't know how this would behave in pipelines and other automation approaches.

If you feel this is a meaningful and worthwhile approach to consider, I can log it against the scope for future enhancements.


Regards,
Raghvender