From what I understand, your instance is overloaded during traffic spikes. My suggestion is to set the target_concurrent_requests [1] parameter under automatic_scaling to set a target for the number of concurrent requests per instance. Once instances are handling more requests than that target, the autoscaler starts new instances to absorb the load.
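As a rough sketch, the relevant part of app.yaml for the flexible environment could look like the fragment below. The numbers are placeholders I picked for illustration, not recommendations; tune them against what one instance can actually handle.

```yaml
# Fragment of app.yaml (flexible environment). All values are illustrative assumptions.
automatic_scaling:
  min_num_instances: 2            # keep some capacity warm ahead of spikes
  max_num_instances: 10           # upper bound on scale-out
  target_concurrent_requests: 20  # target concurrent requests per instance
```

A lower target makes the service scale out earlier, at the cost of running more instances.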
In the standard environment there is also max_concurrent_requests [2], which specifies how many concurrent requests an instance can accept before a new instance is started.
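For the standard environment, the equivalent setting might look like this fragment of app.yaml. Again, the values are only assumptions for illustration.

```yaml
# Fragment of app.yaml (standard environment). Values are illustrative assumptions.
automatic_scaling:
  max_concurrent_requests: 20  # requests one instance accepts before another is started
  max_instances: 10            # optional cap on the number of instances
```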
[1] https://cloud.google.com/appengine/docs/flexible/python/reference/app-yaml?hl=en#automatic_scaling
[2] https://cloud.google.com/appengine/docs/standard/nodejs/config/appref#scaling_elements