In the case you described, I second what Derek says: fetching 100 entities of moderate size is reasonable to run on GAE as a single operation, as long as you don't hit any of the limits: the datastore request size, the number of entities fetched, or the datastore RPC timeout.
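For that simple case, a single query call is enough, e.g. (kind and struct names here are placeholders):

    // One shot: fetch up to 100 entities in a single datastore call.
    var entities []MyEntity
    keys, err := datastore.NewQuery("MyEntity").Limit(100).GetAll(ctx, &entities)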
But when those limits are not enough for a single operation, I usually try to visualize the problem as a set of steps: read, process, update. Usually, I run a query to fetch N entries in batches of size M and, for each entity fetched, launch a goroutine to process it. Then, as they get processed, I send them to a channel to "buffer" them and batch put when the buffer reaches size M. The pattern is something like this:
    in := make(chan MyEntity)  // MyEntity.ID is used to build the Key; set when fetching, used when putting.
    out := make(chan MyEntity)
    wg := &sync.WaitGroup{}
    wg.Add(3)
    go fetch(ctx, in, wg)      // ctx is your request context
    go process(in, out, wg)
    go put(ctx, out, wg)
    wg.Wait()
- fetch does GetAll in batches (so we don't hit deadlines or limits), sending the entities to process via in; fetch closes in when the query is done.
- process is a for range over in that mutates/processes each entity, then sends it to out. You can process items here serially, then close out when done; you need more synchronization if you want one goroutine per entity.
- put, in turn, is a for loop reading from out that appends to a slice and calls PutMulti whenever the slice reaches M entities, flushing whatever is left once the channel is closed. (A sketch of all three functions follows this list.)
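Here is a minimal sketch of those three functions, assuming the classic google.golang.org/appengine/datastore and .../log packages, a batch size M of 100, and a hypothetical MyEntity kind whose ID field carries the key's integer ID (adapt the names to your model):

    import (
        "context"
        "sync"

        "google.golang.org/appengine/datastore"
        "google.golang.org/appengine/log"
    )

    const M = 100 // batch size; roughly the upper bound for in-memory work

    type MyEntity struct {
        ID int64 `datastore:"-"` // filled from the key on fetch, not stored
        // ... your stored fields ...
    }

    // fetch runs GetAll in key-ordered batches of M so no single call hits
    // the deadline or size limits, and streams the results into in.
    func fetch(ctx context.Context, in chan<- MyEntity, wg *sync.WaitGroup) {
        defer wg.Done()
        defer close(in) // tells process that nothing more is coming
        var lastKey *datastore.Key
        for {
            q := datastore.NewQuery("MyEntity").Order("__key__").Limit(M)
            if lastKey != nil {
                q = q.Filter("__key__ >", lastKey)
            }
            var batch []MyEntity
            keys, err := q.GetAll(ctx, &batch)
            if err != nil {
                log.Errorf(ctx, "fetch: %v", err)
                return
            }
            for i := range batch {
                batch[i].ID = keys[i].IntID()
                in <- batch[i]
            }
            if len(keys) < M {
                return // last batch
            }
            lastKey = keys[len(keys)-1]
        }
    }

    // process consumes in serially, mutates each entity, and forwards it.
    func process(in <-chan MyEntity, out chan<- MyEntity, wg *sync.WaitGroup) {
        defer wg.Done()
        defer close(out) // tells put that nothing more is coming
        for e := range in {
            // ... mutate e here ...
            out <- e
        }
    }

    // put buffers entities and writes them M at a time with PutMulti,
    // flushing whatever is left once out is closed.
    func put(ctx context.Context, out <-chan MyEntity, wg *sync.WaitGroup) {
        defer wg.Done()
        var keys []*datastore.Key
        var buf []MyEntity
        flush := func() {
            if len(buf) == 0 {
                return
            }
            if _, err := datastore.PutMulti(ctx, keys, buf); err != nil {
                log.Errorf(ctx, "put: %v", err)
            }
            keys, buf = keys[:0], buf[:0]
        }
        for e := range out {
            keys = append(keys, datastore.NewKey(ctx, "MyEntity", "", e.ID, nil))
            buf = append(buf, e)
            if len(buf) == M {
                flush()
            }
        }
        flush()
    }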
The WaitGroup is used to synchronize the work, i.e., we wait until all entities have been read, processed, and saved; just add defer wg.Done() at the beginning of each function. Handling errors is a bit more complicated: I usually log them, but you may as well add an error channel that all three goroutines use, then aggregate and handle the errors after wg.Wait().
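A minimal sketch of that error-channel variant, assuming each goroutine sends at most one error before returning (so a buffer of 3 means no send ever blocks):

    errc := make(chan error, 3) // one slot per goroutine
    // inside a worker, instead of logging:
    //     errc <- fmt.Errorf("fetch: %v", err)
    wg.Wait()
    close(errc)
    for err := range errc {
        // handle the aggregated errors here
    }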
The benefit of the pattern above is that your program doesn't block on reading from or writing to the datastore: the three stages overlap. In theory, this pattern allows you to process any number of items (N) in batches of size M, concurrently. M is your upper bound for in-memory state (roughly).
This works well for small data sets (N between 1 and 10,000), but it depends on how much time you need to process each entry. For larger data sets, I usually split the work into ranges (basically, by changing the query in the fetch() func) and run them as multiple parallel task queue tasks (fan-out). Fan-in is more complicated and use-case dependent; I can usually come up with a single datastore entity that tracks progress across the multiple tasks by updating a "done" field or something, as sketched below.
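Something like this, with hypothetical names (JobProgress, markShardDone) and a transaction so concurrent tasks don't lose updates; it's a sketch of the idea, not a drop-in implementation:

    // JobProgress is one entity per overall job; each parallel task
    // increments Done when its shard finishes.
    type JobProgress struct {
        Total int // number of shards/tasks launched
        Done  int // number of shards finished so far
    }

    // markShardDone transactionally bumps the counter and reports whether
    // this was the last shard, so the caller can kick off follow-up work.
    func markShardDone(ctx context.Context, jobKey *datastore.Key) (last bool, err error) {
        err = datastore.RunInTransaction(ctx, func(tc context.Context) error {
            var p JobProgress
            if err := datastore.Get(tc, jobKey, &p); err != nil {
                return err
            }
            p.Done++
            last = p.Done == p.Total
            _, err := datastore.Put(tc, jobKey, &p)
            return err
        }, nil)
        return last, err
    }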
Hope this helps!