F1 instance, changed behavior? quitting on terminated signal

Carlo Contavalli

Sep 10, 2021, 8:58:35 AM
to google-a...@googlegroups.com
Hello,

We have a Go 1.15 app. It takes under 1 second to start (~200 ms), normally handles requests in under 10 ms, and benchmarks show it can easily handle several thousand QPS. It has a handler for /_ah/{start,stop,warmup} that always returns 200, and it has very few external dependencies (it rarely uses Datastore; most of the time it just signs tokens or verifies signatures).
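For reference, the lifecycle handlers described above could look roughly like the sketch below. This is not our actual code: the names newMux and statusFor are made up for illustration, and the in-process check with httptest stands in for real traffic so the example terminates on its own.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// lifecyclePaths are the App Engine lifecycle endpoints mentioned in the post.
var lifecyclePaths = []string{"/_ah/start", "/_ah/stop", "/_ah/warmup"}

// newMux (hypothetical name) registers each lifecycle endpoint with a handler
// that always answers 200.
func newMux() *http.ServeMux {
	mux := http.NewServeMux()
	ok := func(w http.ResponseWriter, r *http.Request) {
		w.WriteHeader(http.StatusOK)
	}
	for _, p := range lifecyclePaths {
		mux.HandleFunc(p, ok)
	}
	return mux
}

// statusFor (hypothetical name) sends a GET for path through the handler
// in-process and returns the resulting status code.
func statusFor(h http.Handler, path string) int {
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, httptest.NewRequest(http.MethodGet, path, nil))
	return rec.Code
}

func main() {
	mux := newMux()
	for _, p := range lifecyclePaths {
		fmt.Println(p, statusFor(mux, p))
	}
	// In a deployment the mux would instead be passed to http.ListenAndServe
	// on the port the platform assigns (assumed here to come from $PORT).
}
```

Paths that are not registered fall through to the mux's default 404, so only the three lifecycle endpoints are affected by this sketch.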

Given the low volume of traffic and startup speed, and how many requests a single instance can handle, we are totally ok with having 0 instances when there is no traffic, and up to 1 instance with traffic.

Now... we started seeing 500s. I think I tracked this down to autoscaling? 

On one terminal, I have:

while :; do
  curl -I "$APP_URL"  # HEAD request; APP_URL stands in for the app's URL, which is not shown here
  sleep 30
done

so, doing a request every 30 seconds. This should keep our one instance alive, right?

But in Logs Explorer I can see that roughly every 10 minutes the instance is terminated, regardless of requests. For example, I see:

2021/09/09 21:23:10 Opening port 8081 --> START of APP
[...] many requests handled successfully
2021-09-09 14:32:45.740 HEAD curl 7.72.0 --> LAST CURL REQUEST
2021-09-09 14:33:11.747 PDT [start] 2021/09/09 21:33:11.747340 Quitting on terminated signal

After it quits, I see a window of 5-15 mins during which the application is not running, and 500s are always returned. Then it starts again, and the cycle repeats.
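To make the "Quitting on terminated signal" line concrete: in Go, SIGTERM prints as "terminated", so a log line of that shape is what a server would emit when it catches SIGTERM and shuts down. The sketch below shows one way that can be wired up. It is an assumption about the general pattern, not our actual code; quitMessage, serveUntilSignal, and demo are illustrative names, and the demo feeds itself a synthetic signal on a local ephemeral port so it exits on its own.

```go
package main

import (
	"context"
	"fmt"
	"log"
	"net"
	"net/http"
	"os"
	"os/signal"
	"syscall"
	"time"
)

// quitMessage (hypothetical name) formats the log line for a received signal.
func quitMessage(sig os.Signal) string {
	return fmt.Sprintf("Quitting on %v signal", sig)
}

// serveUntilSignal serves on ln until a signal arrives on sigCh, then gives
// in-flight requests up to grace to finish before returning.
func serveUntilSignal(srv *http.Server, ln net.Listener, sigCh <-chan os.Signal, grace time.Duration) error {
	errCh := make(chan error, 1) // buffered so the Serve goroutine never blocks
	go func() { errCh <- srv.Serve(ln) }()

	select {
	case err := <-errCh:
		return err // the server failed before any signal arrived
	case sig := <-sigCh:
		log.Print(quitMessage(sig))
	}

	ctx, cancel := context.WithTimeout(context.Background(), grace)
	defer cancel()
	return srv.Shutdown(ctx)
}

// demo starts a server on an ephemeral local port and delivers a synthetic
// SIGTERM through the channel so the example terminates without outside help.
func demo() (string, error) {
	ln, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		return "", err
	}
	sigCh := make(chan os.Signal, 2) // room for the synthetic signal plus a real one
	signal.Notify(sigCh, syscall.SIGTERM)
	defer signal.Stop(sigCh)
	sigCh <- syscall.SIGTERM // synthetic; a platform would send the real signal

	srv := &http.Server{Handler: http.NotFoundHandler()}
	err = serveUntilSignal(srv, ln, sigCh, 5*time.Second)
	return quitMessage(syscall.SIGTERM), err
}

func main() {
	msg, err := demo()
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println(msg)
}
```

The point of the sketch is only that the log line reflects the process receiving SIGTERM from outside; it says nothing about why the signal was sent.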

It seems like App Engine thinks the application is not healthy, so it terminates it and reschedules it later after some form of exponential backoff?

Note that the application always returns 200 to /_ah/* requests.

Note also that the application has seen very few changes, and we've been running it for almost a year. Looking at Logs Explorer, I can see that the last deployment was around Aug 24th, and we started seeing a high volume of 500s on September 7th, which makes a change on our side seem very unlikely. Before then, as far back as I can go in Logs Explorer, I don't see 500s in the graph.

Ideas? Suggestions?

Thank you,

[1]:
env: standard
runtime: go115
instance_class: F1

automatic_scaling:
  max_instances: 1
  min_instances: 0

inbound_services:
- warmup
