Hi
Is this expected behaviour for MongoDB, or is there a possible optimisation that can be done on the index update when only a single element of the array is inserted, deleted, or updated?
Do you absolutely require the field array.z to be an array, or is there any alternative schema design that you could consider for your use case?
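For illustration only, one common alternative is to promote each array element into its own document in a separate collection. This is just a sketch with hypothetical field names (parent_id, z), not a recommendation for your exact data model:

```python
# Current design: one parent document embedding a large array.
embedded = {"_id": 1, "array": [{"z": 10}, {"z": 20}, {"z": 30}]}

# Hypothetical alternative: one document per former array element, linked
# back to the parent by a reference field. Inserting or updating a single
# element then touches one small document instead of the large parent.
flattened = [{"parent_id": embedded["_id"], "z": e["z"]} for e in embedded["array"]]
print(len(flattened))  # one document per element: 3
```

With this shape, a single-element change is a small insert/update, at the cost of needing a query (or $lookup) to reassemble the full set.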
I wouldn’t call it a problem, but having a large array with thousands of entries might limit your options in the future, since there is a 16 MB limit on BSON document size (see Limitations and Thresholds). If this array could grow to an undetermined size, hitting that limit is a real possibility.
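As a rough back-of-envelope check, you can estimate how many array elements fit under the 16 MB limit. The per-element size and overhead below are assumptions for illustration, not measurements of your data:

```python
# Hypothetical estimate: how many array elements fit in one BSON document.
BSON_MAX_BYTES = 16 * 1024 * 1024  # hard 16 MB limit per BSON document

def max_elements(bytes_per_element, overhead_bytes=1024):
    # Leave some assumed headroom for the document's other fields.
    return (BSON_MAX_BYTES - overhead_bytes) // bytes_per_element

print(max_elements(100))  # at ~100 bytes per element: 167761
```

So at roughly 100 bytes per element you run out of room somewhere in the low hundreds of thousands of elements; larger elements hit the ceiling proportionally sooner.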
Another potential issue is the way multikey indexes work. MongoDB creates a multikey index by essentially creating one index key per array element. Thus, if your document contains an array with 2000 elements, MongoDB will create 2000 index entries for that single document, which can become an issue for index size.
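To illustrate the multikey behaviour, here is a toy model of index key generation (not MongoDB's actual implementation): an index on array.z conceptually produces one (value, document id) entry per array element.

```python
# Toy model: a multikey index stores one (value, doc id) entry per array
# element, so a 2000-element array yields 2000 index entries for one doc.
def multikey_entries(doc):
    return [(elem["z"], doc["_id"]) for elem in doc["array"]]

doc = {"_id": 1, "array": [{"z": i} for i in range(2000)]}
print(len(multikey_entries(doc)))  # 2000
```

Updating one element therefore also means maintaining the corresponding index entry, on top of the document write itself.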
I noted from your extensive experiments that the growth in operation timing between array sizes is roughly linear. For example, without an index, updating a document with an array of 1000 elements took approximately twice as long as updating a document with 500 elements, and updating a document with 2000 elements took about twice as long as updating a document with 1000 elements. It took longer with indexes, but that’s because the indexes need updating as well. This linear scaling occurs because WiredTiger is a no-overwrite storage engine, which means that a small update translates to a full document rewrite inside WiredTiger. If your use case requires small updates to large documents, you could see faster performance with the MMAPv1 storage engine. Having said that, WiredTiger provides better concurrency, compression, and more efficient disk use.
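The linear scaling is consistent with a no-overwrite design. As a toy sketch (not WiredTiger's actual code), here is why a one-element update costs a full-document rewrite:

```python
# Toy no-overwrite store: even a one-byte update copies the whole document
# to a new location, so update cost grows linearly with document size.
def update_one_byte(doc_bytes, index, new_byte):
    rewritten = bytearray(doc_bytes)  # full copy = full document rewrite
    rewritten[index] = new_byte
    return bytes(rewritten)

small, large = bytes(500), bytes(2000)
# Updating the 2000-byte doc copies 4x as much data as the 500-byte doc,
# matching the roughly linear growth seen in the experiments.
print(len(update_one_byte(large, 0, 1)))  # 2000
```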
Regarding schema design, you might want to consider the following links:
Best regards,
Kevin