I have small sensors that collect information and they all simultaneously insert data into a single database/collection.
I batch process this data using another program. Right now I set my batch size limit to 50,000 documents and use a combination of $skip and $limit to pick up where I left off after the previous batch.
I need to return my batches sorted by a 'start_time' field. AFAIK, when I sort or use $orderby, the entire collection gets sorted first and only then is the query result returned. Is that correct?
Example of how I run my batch queries:
batch # : query
0 : find({yadda yadda yadda}).skip(0).limit(50000) ~ let's say this returns 300 documents, so I pick back up at 300
1 : find({yadda yadda yadda}).skip(300).limit(50000) ~ let's say this returns 8000 documents, so I pick back up at 8300
2 : find({yadda yadda yadda}).skip(8300).limit(50000) ... and so on
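The offset bookkeeping in the batches listed above can be sketched roughly as follows. This is only an illustration under stated assumptions: a plain in-memory array stands in for the collection so the snippet runs without a database, and `findBatch`, `runBatch`, and `matchAll` are hypothetical helper names, not part of MongoDB's API. It also assumes documents come back in insertion order, which is what the skip offsets rely on.

```javascript
// Hypothetical in-memory stand-in for the collection; a real run would call
// find(query).skip(offset).limit(BATCH_LIMIT) on the collection instead.
const BATCH_LIMIT = 50000;

// Mimics find(query).skip(offset).limit(limit) on a plain array.
function findBatch(docs, matches, offset, limit) {
  return docs.filter(matches).slice(offset, offset + limit);
}

// Runs one batch and returns the documents plus the offset to resume from.
function runBatch(docs, matches, offset) {
  const batch = findBatch(docs, matches, offset, BATCH_LIMIT);
  return { batch, nextOffset: offset + batch.length };
}

const collection = [];
const matchAll = () => true; // placeholder for {yadda yadda yadda}
let offset = 0;

// Sensors insert 300 documents before batch 0 runs.
for (let i = 0; i < 300; i++) collection.push({ start_time: i });
let result = runBatch(collection, matchAll, offset); // batch 0
offset = result.nextOffset; // 300

// Sensors insert 8000 more documents before batch 1 runs.
for (let i = 300; i < 8300; i++) collection.push({ start_time: i });
result = runBatch(collection, matchAll, offset); // batch 1
offset = result.nextOffset; // 8300
```

Each batch only needs the previous `nextOffset` to resume, which is the single piece of state carried between runs in this scheme.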
Is this the best way to batch process?
Is there a better way to get each batch sorted, or to do some kind of insertion sort so the documents are already ordered when they are inserted?