Typically, when operating an in-memory cache this way, you will need some way for your cache layers to signal information to each other. Every caching solution, including one you might roll yourself from, say, a few hash tables in memory, will have its own semantics and methods for this.
One solution in this situation would be lazy loading: each request for a resource checks the cache's validity via an in-memory flag and serves from the memory cache if it is still valid. Requests that notify the instance of cache invalidation events will be quick to service when interleaved with user requests. Invalidation events can be communicated in a simple syntax, such as a list of keys to invalidate, or in more complex, nested data structures whose meaning your instance's memory cache knows how to interpret in some more specific way.
The problem with such a method, especially under automatic scaling where instances are anonymous and not addressable, is that each instance's memory cache needs its own way of getting notified.
Memcache itself can be used to coordinate memory caching across instances, since it is a datacenter service common to all of them.
You could also set up a memcached box on Compute Engine.
Another method, as you identified, would be to run a coroutine that polls for cache invalidation events and updates the in-memory cache appropriately. The issue, as you've noticed, is that in vanilla App Engine, regular background processing outside the lifetime of a request is somewhat limited. You can check out func RunInBackground() for manual scaling modules.
The way to schedule regular "events" prompting the instances holding memory caches to revalidate would be the cron service. Again, though, since cron works via HTTP requests that are each handled by a single instance, it cannot address cache invalidation notifications to every instance under automatic scaling, only to the instance that caught the request. You could use basic scaling to overcome that limitation: it provides a scalable yet finite and addressable pool of instances that can receive cache invalidation notifications according to your custom specification, perhaps triggered by any one of them catching the cron request and notifying its siblings in turn, waiting for responses and handling errors as needed.
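For reference, a cron entry of that kind is declared in cron.yaml; the handler path here is a hypothetical endpoint in your app:

```yaml
cron:
- description: periodic cache revalidation ping
  url: /internal/revalidate-cache   # hypothetical handler that fans out to siblings
  schedule: every 1 minutes
```

The handler behind that URL would then carry out whatever notification scheme you have designed for the instance pool.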
A final option is to use Managed VMs, which do allow full access to threading, process control, the filesystem, and the network interface, and will enable pretty much any pattern you can think of implementing.
I hope this broad discussion of background processing and cache invalidation on Cloud Platform is a helpful introduction. Be aware that there are other patterns you could put together which have desirable properties along some axes while suffering limitations along others.