Questions on adjusting autoscaling


Marcin Copik

Nov 27, 2021, 7:42:35 PM
to Knative Users
Hello everyone!

We want to adjust the implementation of KPA since our design cannot be expressed purely in Knative configuration: we want to make the scaling policy dependent on additional metrics in each pod. In particular, we don't want to scale down if the new metric shows high activity, even if the standard metric of requests per second is low enough to trigger pod deallocation. Furthermore, we want to select pods to be deallocated based on the value in that second metric.
The documentation on the KPA algorithm and autoscaler design is well written, but I've had a bit of trouble matching the algorithm components to the codebase.

After looking at the documentation (serving/docs/scaling/SYSTEM.md), it seems that I need to take the following actions (a rough sketch of the scale-down gating we have in mind follows the list):
- Adjust user pod to report additional metrics.
- Change Autoscaler/Metric to report a new type of data.
- Add a new scaling algorithm to the Autoscaler.
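
Something along these lines - a minimal sketch of the gating logic, where gateScaleDown, the activity map, and busyThreshold are placeholder names of ours and not existing Knative code:

package main

import "fmt"

// gateScaleDown returns the pod count we would actually apply, given the
// count suggested by the standard request-based algorithm and the per-pod
// values of the additional metric: we never go below the number of pods
// that the second metric still marks as busy.
func gateScaleDown(suggested int32, activity map[string]float64, busyThreshold float64) int32 {
    var busy int32
    for _, v := range activity {
        if v >= busyThreshold {
            busy++
        }
    }
    if suggested < busy {
        return busy
    }
    return suggested
}

func main() {
    activity := map[string]float64{"pod-a": 0.9, "pod-b": 0.1, "pod-c": 0.8}
    fmt.Println(gateScaleDown(1, activity, 0.5)) // prints 2: two pods are still busy
}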

(1) Does this sound feasible in Knative or would you rather recommend something else? One of the previous discussions on this mailing list suggested writing a new scaler as an option. This is supposed to be supported by Knative, but I couldn't find any documentation on that.

(2) I'm not certain where the code responsible for the autoscaling policy is located. After looking at the logs of my Knative deployment, it seems that at least part of the logic for a single pod is implemented in the "pkg/reconciler/autoscaling/kpa/" package. The metrics collector is in "pkg/autoscaler/metrics/collector.go", and the rest of the scaling logic is implemented in "pkg/autoscaler/scaling/autoscaler.go".
Is that correct, or have I missed something?

Thanks for the answers and help!
Best,
Marcin

Evan Anderson

Nov 27, 2021, 11:01:14 PM
to Marcin Copik, Knative Users
I'm wondering whether it would make more sense to extend the HPA support to include HPA custom metric support, and then use an external metrics service to determine the proper number of replicas.

The code to create the HPA is here: https://github.com/knative/serving/blob/main/pkg/reconciler/autoscaling/hpa/hpa.go#L52; if you look in resources.MakeHPA here, you'll see that currently only "CPU" and "Memory" types are supported. If you're looking to contribute your changes upstream, I'd spend a bit of time thinking about what the "custom metrics" interface would look like before implementing; if you're working on your own fork, obviously you can go whichever way you want. 😉
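
To make the "custom metrics" idea a bit more concrete, here is a rough sketch of what a Pods-type metric entry could look like if MakeHPA were extended beyond CPU/Memory; the metric name "my-activity-metric" is a placeholder, and this is an illustration against the autoscaling/v2beta2 API rather than existing Knative code:

package hpasketch

import (
    autoscalingv2beta2 "k8s.io/api/autoscaling/v2beta2"
    "k8s.io/apimachinery/pkg/api/resource"
)

// customMetricSpec targets an average per-pod value of a custom metric,
// which an HPA can consume through the custom metrics API (e.g. via a
// Prometheus adapter) rather than the built-in CPU/Memory resources.
func customMetricSpec(target int64) autoscalingv2beta2.MetricSpec {
    return autoscalingv2beta2.MetricSpec{
        Type: autoscalingv2beta2.PodsMetricSourceType,
        Pods: &autoscalingv2beta2.PodsMetricSource{
            Metric: autoscalingv2beta2.MetricIdentifier{Name: "my-activity-metric"},
            Target: autoscalingv2beta2.MetricTarget{
                Type:         autoscalingv2beta2.AverageValueMetricType,
                AverageValue: resource.NewQuantity(target, resource.DecimalSI),
            },
        },
    }
}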

As for how things are actuated; my reading of the code is that each Revision creates an (abstract) PodAutoscaler resource, which can be either the HPA or KPA type. Which controller to use is decided by an annotation:
HPA: https://github.com/knative/serving/blob/main/pkg/reconciler/autoscaling/hpa/controller.go#L55
KPA: https://github.com/knative/serving/blob/main/pkg/reconciler/autoscaling/kpa/controller.go#L61

You could use similar logic to create your own autoscaling resource / controller and reconcile it if you do want to implement your own scaler.
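
For example (a sketch only; the "myscaler.example.com" class value is made up, and the filter helper below mirrors what I believe the HPA/KPA controllers use to select "their" PodAutoscalers):

package myscaler

import (
    pkgreconciler "knative.dev/pkg/reconciler"
    "knative.dev/serving/pkg/apis/autoscaling"
)

// onlyMyClass keeps only the PodAutoscalers whose
// "autoscaling.knative.dev/class" annotation selects the custom scaler,
// so the new controller ignores KPA- and HPA-class resources.
var onlyMyClass = pkgreconciler.AnnotationFilterFunc(
    autoscaling.ClassAnnotationKey, "myscaler.example.com", false /* allowUnset */)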

Another thing you might be able to do (untested) would be to modify the `minScale` annotation on the existing revision based on the observed metrics. This would have a slower feedback loop and you might occasionally have a scale-down race, but it might also be much quicker to write.
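
Untested sketch of that last idea: patch the minScale annotation on the Revision from some external process, based on the observed value of your extra metric. The dynamic-client wiring and the namespace/revision names are placeholders, and note that the annotation value has to be a string:

package minscale

import (
    "context"
    "fmt"

    metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
    "k8s.io/apimachinery/pkg/runtime/schema"
    "k8s.io/apimachinery/pkg/types"
    "k8s.io/client-go/dynamic"
    "k8s.io/client-go/rest"
)

// setMinScale merge-patches autoscaling.knative.dev/minScale on a Revision.
func setMinScale(ctx context.Context, ns, revision string, minScale int) error {
    cfg, err := rest.InClusterConfig()
    if err != nil {
        return err
    }
    dyn, err := dynamic.NewForConfig(cfg)
    if err != nil {
        return err
    }
    gvr := schema.GroupVersionResource{Group: "serving.knative.dev", Version: "v1", Resource: "revisions"}
    patch := []byte(fmt.Sprintf(
        `{"metadata":{"annotations":{"autoscaling.knative.dev/minScale":"%d"}}}`, minScale))
    _, err = dyn.Resource(gvr).Namespace(ns).Patch(ctx, revision, types.MergePatchType, patch, metav1.PatchOptions{})
    return err
}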


Marcin Copik

Nov 28, 2021, 9:01:08 AM
to Evan Anderson, Knative Users
Dear Evan,

Thanks for the swift reply!

> I'm wondering whether it would make more sense to extend the HPA support to include HPA custom metric support, and then use an external metrics service to determine the proper number of replicas.
>
> The code to create the HPA is here: https://github.com/knative/serving/blob/main/pkg/reconciler/autoscaling/hpa/hpa.go#L52; if you look in resources.MakeHPA here, you'll see that currently only "CPU" and "Memory" types are supported. If you're looking to contribute your changes upstream, I'd spend a bit of time thinking about what the "custom metrics" interface would look like before implementing; if you're working on your own fork, obviously you can go whichever way you want. 😉

This sounds interesting. However, we would like to retain the
scale-to-zero and scale-from-zero semantics, and it seems to me that
this is a rather complex process involving a switch to/from the
activator. Thus, I'm not certain whether it would be easier to adjust
the KPA for our needs. Porting the integration with the activator to
the HPA might not be an easy task, and as you can see, we're quite new
to Knative :-)

> As for how things are actuated; my reading of the code is that each Revision creates an (abstract) PodAutoscaler resource, which can be either the HPA or KPA type. Which controller to use is decided by an annotation:
> HPA: https://github.com/knative/serving/blob/main/pkg/reconciler/autoscaling/hpa/controller.go#L55
> KPA: https://github.com/knative/serving/blob/main/pkg/reconciler/autoscaling/kpa/controller.go#L61
>
> You could use similar logic to create your own autoscaling resource / controller and reconcile it if you do want to implement your own scaler.

Thank you! This is a good tip, I think - if there's one instance of
PodAutoscaler per Revision, then we should be able to deploy a new
version for our application. However, if my understanding of the
system is correct, we would still need to change the "Decider" part of
our PA to avoid continuous invocations of the reconciliation, right?

> Another thing you might be able to do (untested) would be to modify the `minScale` annotation on the existing revision based on the observed metrics. This would have a slower feedback loop and you might occasionally have a scale-down race, but it might also be much quicker to write.

Do you mean that we should change this option on the fly, so that the
new value would be picked up by the KPA and we could prevent
unnecessary scaling down?

Best regards,
Marcin

Marcin Copik

Nov 29, 2021, 3:50:03 PM
to Yafang Wu, Knative Users
Dear Yafang,

> I'm interested in your idea, but I'm not sure whether I have understood it correctly:

Thanks so much for your questions and interest! I think you already
raised an important limitation that I wasn't aware of.

> > Furthermore, we want to select pods to be deallocated based on the value in that second metric.
>
> Do you mean that you want to delete some specific pods first when scaling down, and which ones to delete first would be determined based on the value of the additional metric?
> As far as I know, Knative implements scaling by modifying the replicas of the k8s Deployment, and it cannot decide the order in which pods are deleted. If you want to do that, you might need to modify the k8s Deployment controller.

Yes, that is what we want to do. Thank you for noticing this problem -
indeed, this is an important limitation.
Fortunately, the recent k8s release (1.22) includes a new feature for
assigning deletion costs to pods. When serving user applications, does
Knative use ReplicaSets or just a replication controller?

The controller is supposed to delete the pods with the lowest cost
first. This is done on a best-effort basis, but our model can tolerate
that. Furthermore, the documentation warns against updating costs too
often, but I think this should be fine, since we will decrease the cost
for a selected pod only when performing the actual down-scaling:
"Users should avoid updating the annotation frequently, such as
updating it based on a metric value, because doing so will generate a
significant number of pod updates on the apiserver."
https://kubernetes.io/docs/concepts/workloads/controllers/replicaset/
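
For illustration, here is roughly what I have in mind for marking the selected pod right before the actual down-scaling. This is an untested sketch using plain client-go, not existing Knative code; the cost value and the way the pod is chosen would come from our second metric:

package deletioncost

import (
    "context"

    metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
    "k8s.io/apimachinery/pkg/types"
    "k8s.io/client-go/kubernetes"
)

// markForDeletion sets controller.kubernetes.io/pod-deletion-cost so that
// the ReplicaSet controller prefers (best effort) to delete this pod first
// on the next scale-down.
func markForDeletion(ctx context.Context, kc kubernetes.Interface, ns, pod string) error {
    patch := []byte(`{"metadata":{"annotations":{"controller.kubernetes.io/pod-deletion-cost":"-100"}}}`)
    _, err := kc.CoreV1().Pods(ns).Patch(ctx, pod, types.StrategicMergePatchType, patch, metav1.PatchOptions{})
    return err
}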

I'm not aware of better alternatives. Looking at the k8s
documentation, it seems to me that I could always remove the
controller, manage the pods manually, and then recreate the
controller. However, it's a hack, and I don't know if Knative would
allow us to replace the controller.

> > we want to make the scaling policy dependent on additional metrics in each pod.
>
> Do you mean that you want to decide when to scale up or scale down based on two or more metrics? Or do you mean you just want to use some custom metrics to make decisions instead of using concurrency or rps?

It's the latter, mostly.

> > However, we would like to retain the
> > scale-to-zero and scale-from-zero semantics, and it seems to me that
> > this is a rather complex process involving a switch to/from the
> > activator.
>
> Maybe you could implement a new scaler, add a websocket connection to the new scaler in the activator, and use an annotation in the revision to determine which scaler to notify when scaling from zero? Just a rough idea.

Do you mean that we could modify HPA for our needs and use it for
regular scaling, but use KPA when handling the edge case of scaling
from/to zero?

Best regards,
Marcin

Yafang Wu

Nov 29, 2021, 9:18:22 PM
to Knative Users
Dear Marcin,

I am glad to learn about this new feature - thanks so much!

> Fortunately, the recent k8s release (1.22) includes a new feature for
> assigning deletion costs to pods. When serving user applications, does
> Knative use ReplicaSets or just a replication controller?

I noticed that Knative uses a duck type named PodScalable. Both Deployment and ReplicaSet implement it. Although the Knative revision controller hard-codes the scaleTargetRef Kind of the PodAutoscaler resource as Deployment, I think it is easy to change the scaleTargetRef Kind to the kind you need (maybe a ReplicaSet?). But I am wondering how to update the controller.kubernetes.io/pod-deletion-cost annotation on the pods from Knative. Maybe when scaling down, we could get the pods of the Deployment or ReplicaSet that the PodAutoscaler refers to, and update the annotation of the pods one by one?

// PodScalable is a duck type that the resources referenced by the
// PodAutoscaler's ScaleTargetRef must implement. They must also
// implement the `/scale` sub-resource for use with `/scale` based
// implementations (e.g. HPA), but this further constrains the shape
// the referenced resources may take.
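
Just to make the "update them one by one" idea concrete, maybe something like this. It is an untested sketch: I am assuming the revision's pods can be selected with the serving.knative.dev/revision label (please double-check), and the activity values would come from your own metric:

package podcosts

import (
    "context"
    "fmt"

    metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
    "k8s.io/apimachinery/pkg/types"
    "k8s.io/client-go/kubernetes"
)

// updateDeletionCosts lists the pods backing a revision and writes a
// deletion cost derived from the per-pod activity metric, so that less
// active pods are deleted first on scale-down.
func updateDeletionCosts(ctx context.Context, kc kubernetes.Interface, ns, revName string, activity map[string]float64) error {
    pods, err := kc.CoreV1().Pods(ns).List(ctx, metav1.ListOptions{
        LabelSelector: "serving.knative.dev/revision=" + revName,
    })
    if err != nil {
        return err
    }
    for _, p := range pods.Items {
        cost := int(activity[p.Name] * 100) // higher activity => more expensive to delete
        patch := []byte(fmt.Sprintf(
            `{"metadata":{"annotations":{"controller.kubernetes.io/pod-deletion-cost":"%d"}}}`, cost))
        if _, err := kc.CoreV1().Pods(ns).Patch(ctx, p.Name, types.StrategicMergePatchType, patch, metav1.PatchOptions{}); err != nil {
            return err
        }
    }
    return nil
}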


> Do you mean that we could modify HPA for our needs and use it for
> regular scaling, but use KPA when handling the edge case of scaling
> from/to zero?

Not exactly. As the docs say, PodAutoscaler is a Knative abstraction that encapsulates the interface by which Knative components instantiate autoscalers. It is an abstraction that may be backed by multiple implementations. For more information, see the Knative Pluggability presentation: https://docs.google.com/presentation/d/10KWynvAJYuOEWy69VBa6bHJVCqIsz1TNdEKosNvcpPY/edit

Maybe we could implement a new scaler instead of the KPA and HPA - just something new. The new scaler would get the autoscaling params from the PodAutoscaler and be notified by the activator when scaling from zero.