App Engine doesn't downscale when CPU utilisation is below target_utilization

Harshit Dwivedi

Aug 28, 2019, 12:32:00 PM
to Google App Engine
Hi all, my current setup runs App Engine in the flexible environment with the following yaml file:

runtime: go
env: flexible

automatic_scaling:
  cool_down_period_sec: 80
  min_num_instances: 3
  max_num_instances: 25
  cpu_utilization:
    target_utilization: 0.999

resources:
  cpu: 1
  memory_gb: 0.85

The number of deployed instances ranges from 8 to 14 depending on the incoming traffic (which ranges from 2,500 to 4,200 requests per second).

The problem I'm facing is that App Engine won't downscale my instances even when the average CPU utilization is far lower than what I have specified in my yaml file.
To validate this, I set target_utilization to 0.999 in my yaml as a test. In the screenshot below, you can see that App Engine decides to run 13 instances even when the average CPU load is 90-92%.

Screenshot (3).png


Ideally, the instances should be scaled down until the average CPU utilization reaches 99.9% again.

Can anyone help me with this? I'm not sure why it is happening.

George (Cloud Platform Support)

Aug 28, 2019, 7:16:15 PM
to Google App Engine
Hello Harshit, 

You seem to imply that downscaling depends strictly on CPU utilization levels. In fact, this is not exactly the case: more factors are involved, and the following policies may be adopted:

- Average CPU utilization (not identical to target utilization)
- HTTP load balancing serving capacity, which can be based on either utilization or requests per second.
- Stackdriver Monitoring metrics. 

Scaling algorithms are not easy to describe. One factor that comes to mind is historical in nature: the CPU load over a certain past period, for instance. You may find related detail in the Scaling characteristics section of the "App Engine Flexible Environment for Users of App Engine Standard Environment" page, which refers to the autoscaling policy and target utilization.

Harshit Dwivedi

Aug 29, 2019, 12:36:08 PM
to google-a...@googlegroups.com
Thanks, George, for the explanation.
Is there a way I can tweak the autoscaling parameters for my App Engine flex deployment?

As far as I can see, it only allows me to modify the scaling based on CPU utilization.

--
You received this message because you are subscribed to the Google Groups "Google App Engine" group.
To unsubscribe from this group and stop receiving emails from it, send an email to google-appengi...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/google-appengine/9e9c9047-5b13-4a9e-8cf7-b161124d3d8e%40googlegroups.com.

Diogo Almeida

Aug 29, 2019, 9:51:20 PM
to Google App Engine
Other than the parameter you are already using, I recommend you check how autoscaling services use dynamic instances and which parameters you can set in the app.yaml.


Harshit Dwivedi

Aug 30, 2019, 12:38:10 AM
to google-a...@googlegroups.com
Thanks, but this seems to be available only for the standard environment.
I'm currently using App Engine in the flexible environment.


Diogo Almeida

Aug 30, 2019, 6:54:22 PM
to Google App Engine
In that case, you are already using all the app.yaml parameters made available for the flexible environment.


chuda mani

Sep 3, 2019, 8:54:41 AM
to Google App Engine
Hi, I need a suggestion. I have to deploy an Angular 8 application with an Apache server on GCP App Engine. Is it possible? If so, please forward any reference documents. Thank you.

Diogo Almeida

Sep 3, 2019, 10:52:21 AM
to Google App Engine
You can use the App Engine custom runtime to deploy an application in any language.

I did not find any documents about Angular 8, but as a starting point you could take a look at this tutorial for deploying Angular 6.

As not all use cases can be covered in the App Engine documents, I recommend you request development assistance on Stack Overflow, where the community of developers will be able to help you with your Angular coding.

tz martin

Jul 19, 2020, 7:26:46 PM
to Google App Engine
Does anyone know whether health checks (liveness_check or readiness_check), if configured, count as requests that would prevent downscaling instances? I'm wondering if health check "bursts" would keep one or all instances alive.

wesley c

Jul 20, 2020, 5:55:57 PM
to google-a...@googlegroups.com
This question only applies to Flex users. According to the documentation, health check HTTP requests are *not* sent to the container, so they don't affect autoscaling (up or down) the way "real" traffic to your app does. The exception is if you "extended" your health checks to your app by providing a path to the endpoint you wish to be hit when a health check is performed. I'm sure you also know that Flex requires *at least* 1 instance to be up, so it will never downscale to zero (see this page regarding scaling, health checks, and other differences between the standard and flexible environments).

Hope this helps!
--Wesley
--
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
"A computer never does what you want... only what you tell it."
    wesley chun :: @wescpy :: Software Architect & Engineer
    Developer Advocate at Google Cloud by day; at night...
    Python training & consulting : http://CyberwebConsulting.com
    "Core Python" books : http://CorePython.com
    Python blog: http://wescpy.blogspot.com

Luca de Alfaro

Jul 20, 2020, 8:27:01 PM
to google-a...@googlegroups.com
I don't understand your example.  You have an average CPU load of 12.7, and 13 instances are used -- this is actually near optimal, what did you expect? 
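
The "near optimal" point can be checked with a little arithmetic. The sketch below assumes a simple proportional model (instances needed = total load divided by target utilization, rounded up), which is an illustration only and not App Engine's documented algorithm. The function name and its defaults are hypothetical; the figures come from the original post (13 instances, 90-92% average CPU, target 0.999, min 3, max 25).

```python
import math


def instances_needed(current_instances, avg_utilization, target_utilization,
                     min_instances=3, max_instances=25):
    """Instances needed to carry the current load at the target utilization.

    Assumption: a plain proportional model, not App Engine's real autoscaler.
    """
    # Total load expressed in "fully busy instance" units.
    total_load = current_instances * avg_utilization
    needed = math.ceil(total_load / target_utilization)
    # Respect the min/max bounds from the yaml file.
    return max(min_instances, min(max_instances, needed))


# Figures from the original post: 13 instances at 90-92% average CPU.
for utilization in (0.90, 0.91, 0.92):
    print(utilization, instances_needed(13, utilization, 0.999))
```

Under this model the figures in the post leave room to remove at most one instance, so running 13 is already close to the minimum that could carry the load.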

As for the scaling down, what do you expect and what do you observe, in terms of utilization and latency in scaling down? You give no data on that. How long are the requests? How long does the utilization need to stay low before one instance is shut down? Obviously there is some margin below the target utilization that you need to reach before instances are shut down.

The other thing is that you may be thinking in terms of "maximum utilization" measured in 1-second (or whatever) intervals, but possibly, to provide good latency, App Engine is measuring "maximum utilization" in (say) 100-millisecond intervals, and scaling down only when many consecutive such intervals are below the target. Utilization under random requests is a stochastic quantity, and so it depends on the time granularity at which you observe it.
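
To make the granularity point concrete, here is a small sketch. Everything in it is hypothetical: the trace is made up (one sample per 100 ms), and the rule "scale down only after N consecutive quiet windows" is the guess from the paragraph above, not a documented App Engine behaviour.

```python
def window_averages(samples, window):
    """Average utilization over consecutive, non-overlapping windows."""
    return [sum(samples[i:i + window]) / window
            for i in range(0, len(samples) - window + 1, window)]


def can_scale_down(samples, window, target, required_quiet_windows):
    """True if the most recent `required_quiet_windows` windows are all below target."""
    averages = window_averages(samples, window)
    recent = averages[-required_quiet_windows:]
    return (len(recent) == required_quiet_windows
            and all(average < target for average in recent))


# Hypothetical trace, one sample per 100 ms: bursty, about 60% busy on average.
trace = [1.0, 1.0, 1.0, 0.2, 0.2, 1.0, 1.0, 0.2, 0.2, 0.2] * 3  # 3 seconds

coarse = window_averages(trace, 10)  # 1-second windows
fine = window_averages(trace, 1)     # 100 ms windows

print("peak over 1 s windows:   ", round(max(coarse), 3))
print("peak over 100 ms windows:", round(max(fine), 3))
print("scale down (1 s view):   ", can_scale_down(trace, 10, 0.9, 3))
print("scale down (100 ms view):", can_scale_down(trace, 1, 0.9, 30))
```

The same trace looks comfortably below a 0.9 target when averaged per second, yet it repeatedly hits full utilization at 100 ms resolution, so a scaler watching the finer intervals would hold on to its instances.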

Luca
