Dear Yafang,
> I'm interested in your idea, but not sure whether I have understood this correctly:
Thanks so much for your questions and interest! I think you have
pointed out an important limitation that I wasn't aware of.
> > Furthermore, we want to select pods to be deallocated based on the value in that second metric.
>
> Do you mean that you want to delete some specific pods first when scaling down, and that which one to delete first is determined by the value of the additional metric?
> As far as I know, Knative implements scaling by modifying the replicas of a k8s deployment, and it cannot decide the order of deleting pods. If you want to do that, you might need to modify the k8s deployment controller.
Yes, that is what we want to do. Thank you for noticing this problem -
indeed, this is an important limitation.
Fortunately, the recent k8s release (1.22) includes a new feature for
assigning deletion costs to pods via the
"controller.kubernetes.io/pod-deletion-cost" annotation. When serving
user applications, does Knative use ReplicaSets or a bare
ReplicationController?
The ReplicaSet controller is supposed to delete the pods with the
lowest cost first. This is done on a best-effort basis, but our model
can tolerate that. The documentation also warns against updating costs
too often, but I think this should be fine, since we will decrease the
cost of a selected pod only when performing the actual down-scaling:
"Users should avoid updating the annotation frequently, such as
updating it based on a metric value, because doing so will generate a
significant number of pod updates on the apiserver."
https://kubernetes.io/docs/concepts/workloads/controllers/replicaset/
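To make the idea concrete, here is a minimal sketch of what I have in
mind. The helper names and the metric-to-victim mapping are my own
assumptions, not anything Knative provides; actually applying the
patch would go through the Kubernetes API (e.g. the official Python
client's patch_namespaced_pod), which I only indicate in a comment:

```python
# Sketch: pick the pod to deallocate based on a per-pod metric and
# build the patch that marks it cheapest to delete. Helper names are
# hypothetical; only the annotation key is real Kubernetes API.

def deletion_cost_patch(cost: int) -> dict:
    """Patch body setting the pod-deletion-cost annotation on a pod."""
    return {
        "metadata": {
            "annotations": {
                # Read by the ReplicaSet controller (beta since k8s 1.22);
                # pods with lower cost are deleted first, best-effort.
                "controller.kubernetes.io/pod-deletion-cost": str(cost)
            }
        }
    }

def pick_victim(metric_by_pod: dict) -> str:
    """Choose the pod with the lowest metric value as the scale-down victim."""
    return min(metric_by_pod, key=metric_by_pod.get)

# Example: just before down-scaling, give the selected pod a negative
# cost so the controller prefers it; the others keep the default (0).
pods = {"app-abc": 12.0, "app-def": 3.5, "app-ghi": 9.1}
victim = pick_victim(pods)
patch = deletion_cost_patch(-1000)
# Applying it (not run here) would be something like:
#   kubernetes.client.CoreV1Api().patch_namespaced_pod(victim, "default", patch)
```

Since the cost is updated only once per down-scaling event, this
should stay well clear of the "frequent updates" caveat from the
documentation quoted above.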
I'm not aware of better alternatives. Looking at the k8s
documentation, it seems that I could always remove the replication
controller, manage the pods manually, and then recreate the
controller. However, that is a hack, and I don't know whether Knative
would let us recreate the controller.
> > we want to make the scaling policy dependent on additional metrics in each pod.
>
> Do you mean that you want to decide when to scale up or scale down based on two or more metrics? Or do you mean you just want to use some custom metrics to make decisions instead of using concurrency or rps?
It's the latter, mostly.
> > However, we would like to retain the
> > scale-to-zero and scale-from-zero semantics and it seems to me that
> > this is a rather complex process involving a switch to/from the
> > activator.
>
> Maybe you could implement a new scaler, add a websocket connection to the scaler in activator, and use annotation in revision to determine which scaler to notify when scale-from-zero? Just a rough idea.
Do you mean that we could modify HPA for our needs and use it for
regular scaling, but use KPA when handling the edge case of scaling
from/to zero?
Best regards,
Marcin