Here's the relevant snippet of code. $casrns is a list generated by a call to mongo using the JavaScript print() method (instead of printjson(), which would give me a JSON list, which isn't useful here). xargs first chunks the list into lines of at most 1000 items each. I then use sed on each line to:
1. remove blank lines
2. replace the delimiters in the list (blank spaces) with quote-comma-quote
3. add an opening bracket and quote to the beginning of the list, and a quote and closing bracket to the end
Each resulting line is then passed to mongorestore as the value of an $in filter.
/usr/bin/mongo --host $mongodb_ip $db --eval 'db.substances.drop()'
/usr/bin/mongo --host $mongodb_ip $db --eval 'db.datapoints.drop()'
for casrn_line in $(echo $casrns | xargs -n1000 -d , | sed '/^\s*$/d; s/ /","/g; s/^/\["/; s/$/"\]/'); do
    mongorestore --host $mongodb_ip --db $db --collection substances $cdr/substances.bson --filter "{_id: {\$in: $casrn_line }}"
    mongorestore --host $mongodb_ip --db $db --collection datapoints $cdr/datapoints.bson --filter "{substance_id: {\$in: $casrn_line }}"
done
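The chunking step above can be tried in isolation, without touching mongo. This is only a minimal sketch under some assumptions: the sample ids are chosen for illustration, the list is assumed to be comma-separated (which is what `-d ,` implies), `chunk_ids` is a hypothetical helper name, and the chunk size is 2 instead of 1000 so the output is easy to read. Like the original, it relies on GNU xargs (`-d`) and GNU sed (`\s`).

```shell
#!/bin/sh
# Sample ids for illustration only; the real $casrns comes from the mongo print() call.
casrns='50-00-0,64-17-5,67-56-1,71-43-2,108-88-3'

# chunk_ids SIZE (hypothetical helper): reads a comma-separated list on stdin and
# prints one bracketed, quoted array per line, each holding at most SIZE ids.
chunk_ids() {
    # xargs splits on commas and echoes SIZE items per line, separated by spaces.
    # sed drops blank lines, turns each space into "," and wraps the line in [" ... "].
    xargs -d , -n "$1" | sed '/^\s*$/d; s/ /","/g; s/^/["/; s/$/"]/'
}

# The trailing newline mimics echo; it is what produces the blank line that sed removes.
chunks=$(printf '%s\n' "$casrns" | chunk_ids 2)
printf '%s\n' "$chunks"
```

With this input the three lines printed are `["50-00-0","64-17-5"]`, `["67-56-1","71-43-2"]` and `["108-88-3"]`. Because none of them contains whitespace, the `for` loop sees each line as a single word, which is what lets it be dropped straight into the `$in` filter.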