ARGUMENT LIST TOO LONG: mongorestore filter

jamieorc

Aug 9, 2013, 2:42:11 PM
to mongod...@googlegroups.com
I use mongo --eval to get a list of distinct items, and later pass that list as a filter to mongorestore --filter. The problem is that the list sometimes has over 10,000 items, so I get an ARGUMENT LIST TOO LONG error.

Looking for an approach to work around this problem. Perhaps some kind of chunking? 

Importing all the data first and then deleting what I don't need isn't a good option, as I usually need only a few hundred or a few thousand items, and one of the two lists I filter has over a million items.

Jamie
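
For context on the error above: it comes from the operating system, not from mongorestore. The kernel limits how much argument data can be passed to a new process, and on Linux a single argument is usually also capped at 128 KiB. The following is a rough sketch for checking those numbers. The ids are made up for illustration, and it assumes GNU seq and paste are available.

```shell
# Upper bound (in bytes) on the combined size of arguments and environment.
getconf ARG_MAX

# Hypothetical data: 10,000 made-up ids, joined into one --filter string.
ids=$(seq -f '"id-%08.0f"' 1 10000 | paste -sd, -)
filter="{_id: {\$in: [$ids]}}"

# Size of the single filter argument in bytes; compare it with the limits above.
printf '%s' "$filter" | wc -c
```

If the filter string is larger than the per-argument limit, the shell cannot start mongorestore at all, which is why splitting the list into smaller chunks helps.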

Asya Kamsky

Aug 9, 2013, 7:59:04 PM
to mongodb-user
Can you explain exactly what your use case is? Maybe there is a
different (more efficient) way to achieve the same thing.
I'm actually having trouble visualizing how you can use the output of
distinct this way (or rather why you would need to).

Asya

Jamie Orchard-Hays

Aug 9, 2013, 10:23:30 PM
to mongod...@googlegroups.com
I solved it using xargs and a for loop. The use case is that I have a master db sharing data with many other dbs, but I only want to import the relevant data.

Jamie Orchard-Hays

Aug 10, 2013, 11:05:50 AM
to mongod...@googlegroups.com
Here's the relevant snippet of code. $casrns is a list generated by a call to mongo using the JavaScript print() method (instead of printjson(), which would give me a JSON list, which isn't useful here). I use sed to:

1. remove blank lines
2. replace the delimiters in each chunk (blank spaces) with quote-comma-quote
3. add a bracket and quote to the beginning and end of each chunk

xargs chunks it all into lists of at most 1000 items.

# Drop the existing collections before restoring.
/usr/bin/mongo --host $mongodb_ip $db --eval 'db.substances.drop()'
/usr/bin/mongo --host $mongodb_ip $db --eval 'db.datapoints.drop()'

# Split $casrns on commas into chunks of at most 1000 ids; sed turns each chunk into a JSON array.
for casrn_line in `echo $casrns |xargs -n1000 -d , | sed '/^\s*$/d; s/ /","/g; s/^/\["/; s/$/"\]/'`; do
  mongorestore --host $mongodb_ip --db $db --collection substances $cdr/substances.bson --filter "{_id: {\$in: $casrn_line }}";
  mongorestore --host $mongodb_ip --db $db --collection datapoints $cdr/datapoints.bson --filter "{substance_id: {\$in: $casrn_line }}";
done
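
To see what the loop actually iterates over, the chunking steps can be tried on a small list without touching mongorestore. This is only a sketch: the sample ids and the chunk_ids helper are made up for illustration, the chunk size is 2 instead of 1000, and it assumes GNU xargs (for -d) and GNU sed (for \s).

```shell
# Hypothetical sample input: a comma-separated list of ids.
casrns='50-00-0,64-17-5,67-56-1,71-43-2,108-88-3'

# chunk_ids LIST N: split LIST on commas into groups of at most N ids
# and print each group as a JSON array, one array per line.
chunk_ids() {
  printf '%s' "$1" \
    | xargs -n"$2" -d , \
    | sed '/^\s*$/d; s/ /","/g; s/^/["/; s/$/"]/'
}

chunk_ids "$casrns" 2
# ["50-00-0","64-17-5"]
# ["67-56-1","71-43-2"]
# ["108-88-3"]
```

Each printed line is what ends up in $casrn_line on one pass of the loop. This only works as long as the ids contain no whitespace, since the for loop splits on it.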