Alright folks - I've gotten to the bottom of this. Here's the explanation, in terms of the benchmark code in question:

db.put(tests)          # batch write of the Test entities
Test.all().fetch(500)  # query that fetches up to 500 Test entities
How writes and reads work in Master-Slave
---------------------
1. We write the entity to the datastore by writing to a log. As far as we are concerned, the write has now completed, and we return to the user.
2. In the background, the write is applied to the actual datastore based on the data in the log. Writes are applied atomically per entity group; you can think of this as the commit step.
The time between steps 1 and 2 is tiny - for all practical purposes, it's instantaneous. While it's theoretically possible for a read to arrive between steps 1 and 2, in practice this essentially never happens, so the datastore appears consistent.
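To make the two steps concrete, here's a toy model of the Master-Slave write path. None of these names are real App Engine APIs - it's purely an illustration of the log-then-apply sequence:

class ToyMasterSlaveDatastore(object):
    """Hypothetical illustration only; not a real App Engine API."""

    def __init__(self):
        self.log = []    # step 1: writes land here and "complete"
        self.data = {}   # step 2: log entries get applied here

    def put(self, key, value):
        # Step 1: append to the log and return to the user immediately.
        self.log.append((key, value))

    def apply_log(self):
        # Step 2: the background commit; atomic per entity group.
        while self.log:
            key, value = self.log.pop(0)
            self.data[key] = value

    def get(self, key):
        # A read issued between steps 1 and 2 misses the new value, but
        # that window is so small you essentially never see it.
        return self.data.get(key)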
How writes and reads work in High Replication
---------------------
1. The new data is written to the log. To ensure consistency, we must write the data to the logs of a majority of data centers (this is how Paxos works). This is the source of the additional write latency HR pays to guarantee consistency. One or more data centers may not receive the log update; that's okay, because a majority did, and that majority constitutes consensus.
2. The datastore in each data center independently applies the changes from its log in the background.
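Here's a rough sketch of what "write to a majority of logs" in step 1 looks like, in toy form. Again, this is hypothetical illustration code, not the actual replication machinery:

def replicate_write(datacenter_logs, entry):
    """Toy quorum write: succeed once a majority of logs accept the entry."""
    acks = 0
    for log in datacenter_logs:
        try:
            log.append(entry)  # write the log entry in this data center
            acks += 1
        except IOError:
            pass  # an unreachable data center simply misses this update
    # Consensus only needs a majority of acknowledgements; without one,
    # the write has to be rejected.
    return acks * 2 > len(datacenter_logs)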
When a batch get is issued, it can get routed to a data center that hasn't yet received the write in its log. For this reason, we have to transact on the log entry across data centers to guarantee we return the freshest data. In the sample code above there are 500 root entities, so 500 of these transactions need to occur. This is the key difference between a get in Master-Slave replication and one in High Replication: in HR, it must happen in a transaction.

If you remember how entity group transactions work, you'll recall that they operate by reading the log entry for the entity group's root entity. That means that if the code above did a batch get of 500 entities in the same entity group, it would run considerably faster, because only one transactional read would need to take place.
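For example, if your data model allows it, you can give all of the test entities a common ancestor so they fall into a single entity group. (Test is the benchmark model from above; the 'TestRoot' parent key is made up for illustration.)

# All 500 entities share one parent key, so they form a single entity
# group, and a strongly consistent batch get needs only one transactional
# log read instead of 500.
parent_key = db.Key.from_path('TestRoot', 'benchmark')
tests = [Test(parent=parent_key) for _ in xrange(500)]
keys = db.put(tests)
db.get(keys)  # one entity group -> one transactional read

Keep in mind the usual trade-off: writes to a single entity group are serialized, so this layout only makes sense for data that is read together far more often than it is written.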
You can improve get performance at the expense of consistency by setting the read_policy to eventually consistent. Here's a code sample:
from time import time

t = time()
# An eventually consistent read skips the cross-data-center transactional
# check on each entity group's log entry, at the risk of returning stale data.
config = db.create_config(deadline=5, read_policy=db.EVENTUAL_CONSISTENCY)
db.get(keys, config=config)
print "Time it takes to perform a batch fetch with eventual consistency:"
print time() - t
The performance of the code sample above will be similar to that of the Master-Slave datastore.
In the original benchmark, you might notice that query performance across entity groups is significantly faster - that's because queries assume eventual consistency and aren't executed in a transaction the way get-by-key operations are. See the Usage Notes in the High Replication documentation for more details (http://code.google.com/appengine/docs/python/datastore/hr/overview.html).
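You can see the difference directly by timing the two access paths side by side. This assumes the same Test model and keys list as the earlier samples:

from time import time

t = time()
Test.all().fetch(500)  # query: eventually consistent, no per-group transactions
print "Query:", time() - t

t = time()
db.get(keys)           # batch get: one transactional read per entity group
print "Strongly consistent batch get:", time() - t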
We're working on improving the performance of this case by making a batch get read from multiple data centers, but this will likely not be available until a future SDK release.
Feel free to ask any questions you may have about the explanation.