Need help understanding why so many instances are spinning up

python

Rob Curtis

Mar 21, 2020, 12:46:21 AM

to Google App Engine

Hi,

We have a GAE flex project using Docker.

The "problem" we're experiencing is that the number of instances spikes to 20 with a single request (as far as I can tell).

In the example below I had a task that kept retrying its call to the service. The service attempts to do some file processing but fails. When I stopped the retrying task, the usage went back to normal.

I would like to understand why so many instances spin up each time, rather than, say, twice the number that was in use. I've included the gunicorn configuration and app.yaml below.

#gunicorn config:

import multiprocessing

workers = multiprocessing.cpu_count() * 2 + 1
worker_class = 'sync'
timeout = 120
graceful_timeout = 120

#-----------------------------

#app.yaml

runtime: custom
env: flex
entrypoint: gunicorn -b :$PORT main:app
resources:
  cpu: 1
  memory_gb: 4.0
service: conversion-service 

automatic_scaling:
  min_num_instances: 1
  cool_down_period_sec: 120



Thanks
Rob

Katayoon (Cloud Platform Support)

Mar 23, 2020, 3:46:47 PM

to Google App Engine

Hi Rob,

My assumption is that your task is CPU-intensive and that the default setting in your yaml file, 1 core, is not enough for the task to run successfully. Note that the default target_utilization is 0.5. That means that when the average CPU usage across all running instances reaches 50%, the autoscaler fires up more instances.
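To make the arithmetic concrete, here is a rough sketch of how a utilization-based autoscaler can climb from 1 instance to 20. It is a simplified proportional model, not the actual App Engine algorithm. The function name `scale_steps` is hypothetical, and the cap of 20 is an assumption chosen to match the spike you observed (I believe it is also the flexible environment default for max_num_instances, but please verify). It also assumes the retrying task keeps every instance pinned at full CPU.

```python
import math


def scale_steps(start_instances, avg_cpu, target_utilization=0.5, max_instances=20):
    """Return the instance counts a simple proportional autoscaler would step through.

    Simplified illustration only (NOT the real App Engine autoscaler):
    each round it asks for enough instances to bring the average CPU
    down to the target, assuming avg_cpu stays constant per instance.
    """
    steps = [start_instances]
    current = start_instances
    while True:
        wanted = math.ceil(current * avg_cpu / target_utilization)
        # Clamp to the allowed range [1, max_instances].
        wanted = max(1, min(wanted, max_instances))
        if wanted == current:
            break  # Stable: either under target or at the cap.
        steps.append(wanted)
        current = wanted
    return steps


# Every instance stuck at 100% CPU with a 0.5 target: the count doubles until capped.
print(scale_steps(1, avg_cpu=1.0))  # [1, 2, 4, 8, 16, 20]

# Average CPU below the target: no scale-up.
print(scale_steps(1, avg_cpu=0.4))  # [1]
```

Under these assumptions the count never settles at "twice the number in use", because adding instances does not lower the per-instance CPU while the failing task keeps retrying.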

So, when your task kicks in, the autoscaler sees that the CPU is not enough to run the task and aggressively fires up more instances. However, as you mentioned, the task doesn't finish successfully; it gets stuck because it needs at least one powerful CPU to complete. I recommend tweaking your settings and using more cores in your yaml file to see if that resolves the issue. If the issue persists, you may create a PRIVATE report in the GCP Issue Tracker and provide us with your project ID, service ID, and version ID so that we can dig into the issue.
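As a starting point, the tweak could look something like the app.yaml fragment below. The specific values (2 cores, a cap of 5 instances, a 0.7 target) are illustrative assumptions and not tested recommendations for your workload. The keys themselves are the standard flexible environment scaling settings.

```yaml
runtime: custom
env: flex
service: conversion-service

resources:
  cpu: 2              # was 1; illustrative value, adjust to your workload
  memory_gb: 4.0

automatic_scaling:
  min_num_instances: 1
  max_num_instances: 5        # illustrative cap so a failing task cannot fan out to 20
  cool_down_period_sec: 120
  cpu_utilization:
    target_utilization: 0.7   # illustrative; the default is 0.5
```

Raising the core count addresses the CPU shortage itself, while max_num_instances and target_utilization only limit how far and how eagerly the autoscaler reacts to it.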
