Requests that hit your application are placed in a pending queue until an instance becomes available. If a request sits in that queue for 10 seconds, it is cancelled and the client receives an error.
The App Engine scheduler may or may not try to spin up a new instance under this pressure; the rules are a black box and change from time to time. If it does spin one up, it sends the request to the "cold" instance (one that has not received a warm-up request), though there is much debate (e.g.,
https://code.google.com/p/googleappengine/issues/detail?id=7865) about whether this should happen, especially for Java apps.
The only way to avoid these errors is to allocate Min Idle Instances, which keeps resident instances around for exactly this case. Resident instances can be somewhat confusing because they are really only used to serve requests when no dynamic instance is available, i.e., under the pressure situation you've outlined. Under a smoother load, you'll find that the resident instances can be very under-utilized, since they mostly sit idle waiting for spikes. This too is the subject of much debate.
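To get a feel for why extra idle capacity matters during a spike, here is a rough toy model of a pending queue with a 10-second cutoff. It is only a sketch under simplifying assumptions: the function name `simulate`, the fixed service time, and the fixed instance count are all made up for illustration, and it does not model the real scheduler (which may also start new instances).

```python
import heapq

MAX_PENDING_SECONDS = 10.0  # pending-queue cutoff described above


def simulate(arrivals, service_time, instances):
    """Toy FIFO pending queue; returns (served, cancelled).

    arrivals:     sorted request arrival times, in seconds
    service_time: seconds each request occupies an instance (assumed constant)
    instances:    number of instances already running (must be >= 1);
                  a hypothetical stand-in for idle/resident capacity
    """
    # Min-heap holding the time at which each instance next becomes free.
    free_at = [0.0] * instances
    heapq.heapify(free_at)

    served = cancelled = 0
    for t in arrivals:
        # The request starts when it arrives or when the earliest instance
        # frees up, whichever is later.
        start = max(t, free_at[0])
        if start - t > MAX_PENDING_SECONDS:
            # Waited too long in the queue: cancelled, no instance consumed.
            cancelled += 1
            continue
        heapq.heapreplace(free_at, start + service_time)
        served += 1
    return served, cancelled


if __name__ == "__main__":
    burst = [0.0] * 20  # 20 requests arriving at the same moment
    print("1 instance: ", simulate(burst, 2.0, 1))
    print("4 instances:", simulate(burst, 2.0, 4))
```

With a burst of 20 simultaneous 2-second requests, a single instance in this model serves only the first few before the rest age out, while four instances drain the whole burst in time. Real traffic and the real scheduler are messier, but the shape of the problem is the same.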