Hi Eric,
I don't fully understand your second paragraph. Is "it can read faster" because of the parallel reads during prefetch?
Here is my understanding of prefetch and the replIndexPrefetch parameter. Please correct me if I'm wrong anywhere. Thank you!
Without prefetch, a thread takes the write lock and is ready to do the DML, but then, unfortunately, a page fault happens. It takes a "long" time to read the data from disk into memory before the operation can be applied. Because the write lock is held, no other thread can acquire it during this time.
With prefetch, no write lock is needed, so many threads can read the data into memory *in parallel*. After that, the apply thread can apply the operations without hitting a page fault.
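To make sure I have the two phases right, here is a minimal Python sketch of what I described above. It is only a toy model, not how mongod is implemented: DISK, CACHE, write_lock, load_page, prefetch and apply_batch are all names I made up for illustration, and a "page fault" is simulated as a cache miss.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-ins: DISK is slow storage, CACHE is what is in memory.
DISK = {f"doc{i}": {"_id": i, "n": 0} for i in range(8)}
CACHE = {}
cache_guard = threading.Lock()  # protects the CACHE dict only
write_lock = threading.Lock()   # stands in for the write lock


def load_page(doc_id):
    """Simulate the disk read that brings one document into memory."""
    doc = dict(DISK[doc_id])
    with cache_guard:
        CACHE.setdefault(doc_id, doc)


def prefetch(batch, workers=4):
    """Phase 1: load every document the batch touches, in parallel, without the write lock."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        list(pool.map(load_page, {op["id"] for op in batch}))


def apply_batch(batch):
    """Phase 2: apply ops serially under the write lock; return the misses seen while holding it."""
    faults = 0
    with write_lock:
        for op in batch:
            if op["id"] not in CACHE:
                faults += 1  # a "page fault" while everyone else is blocked
                load_page(op["id"])
            CACHE[op["id"]]["n"] += op["inc"]
    return faults
```

Calling apply_batch on a cold cache reports one miss per distinct document, all of them paid while the lock is held. Calling prefetch first moves those reads outside the lock, so apply_batch reports zero.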
Prefetch includes index prefetch and data prefetch. For index prefetch, the default is to prefetch all index entries for the updated documents. But the indexed fields may not have changed in a given oplog entry, in which case prefetching those indexes is unnecessary. That is the case replIndexPrefetch is meant for.
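As I understand it, replIndexPrefetch accepts none, _id_only and all. A rough Python sketch of how I picture the setting narrowing the set of indexes to prefetch follows. The function indexes_to_prefetch and the example index names are hypothetical, and this is my mental model rather than the server's actual code.

```python
def indexes_to_prefetch(mode, index_names):
    """Return which indexes would be prefetched for a given replIndexPrefetch mode (illustrative only)."""
    if mode == "none":
        return []
    if mode == "_id_only":
        # "_id_" is the default name of the index on _id.
        return [name for name in index_names if name == "_id_"]
    if mode == "all":
        return list(index_names)
    raise ValueError(f"unknown replIndexPrefetch mode: {mode!r}")


# Example: a collection with the _id index plus two secondary indexes (made-up names).
names = ["_id_", "user_1", "created_-1"]
for mode in ("none", "_id_only", "all"):
    print(mode, indexes_to_prefetch(mode, names))
```

So if the secondary indexes are rarely affected by the replicated updates, _id_only would avoid loading them.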
That is the end of my understanding.
-----
However, prefetch (of both data and indexes) will hurt performance in the following cases:
1 the data is already in memory (hot data).
2 most of the DML is inserts.
To prefetch, we first need to parse every oplog entry and find its index entries and data record location. Then we parse the oplog again to apply the operations.
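To illustrate the overhead I am worried about, here is a small Python sketch that counts how many prefetch lookups in a batch would be wasted under my two assumptions above (the data is already hot, or the op is an insert). The simplified op format, resident_ids and wasted_prefetch_lookups are all hypothetical, and treating inserts as always wasted is my assumption, not something I have measured.

```python
def wasted_prefetch_lookups(batch, resident_ids):
    """Count ops whose prefetch pass does work that brings nothing new into memory (toy model)."""
    wasted = 0
    for op in batch:
        is_insert = op["op"] == "i"          # assumed: no existing record to load
        is_hot = op["id"] in resident_ids    # assumed: already in memory
        if is_insert or is_hot:
            wasted += 1
    return wasted


# Simplified ops: "i" insert, "u" update, "d" delete (ids are made up).
batch = [
    {"op": "i", "id": "a"},
    {"op": "u", "id": "b"},
    {"op": "u", "id": "c"},
    {"op": "d", "id": "d"},
]
print(wasted_prefetch_lookups(batch, resident_ids={"b"}))
```

For an insert-heavy or fully hot workload nearly every lookup lands in the wasted count, while the first parsing pass is still paid in full.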
I think prefetch is costly.
Would it make sense to provide an option that lets users enable or disable it themselves?