mongorestore and upsert


Mark Kwan

Mar 3, 2011, 2:59:54 AM
to mongodb-user
Back in another thread Eliot mentioned that upsert was being added as
an option to mongorestore:
http://groups.google.com/group/mongodb-user/browse_thread/thread/a32188843134d711/b0d9efcb88f000ad

Is that just the default behavior now, or was this ever implemented?
I don't see it in Jira and it's not listed as an option in the docs:
http://www.mongodb.org/display/DOCS/Import+Export+Tools#ImportExportTools-mongorestore

If mongorestore doesn't support upserts, what's the recommended way of
moving a collection or database which is currently being written to?
Or do you have to shut down the writers, move the collections and then
start them back up again?

Eliot Horowitz

Mar 3, 2011, 3:12:08 AM
to mongod...@googlegroups.com
Can't remember what the history is, but as of now it does an insert.
The insert won't overwrite documents with the same _id, so if you
restore a file to a collection being written to, only missing docs
(by _id) will be inserted.
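
As a rough illustration of the insert-only behavior described above, here is a small Python model. It is not mongorestore itself: collections are plain dicts keyed by _id, and restore_insert_only is a hypothetical helper name.

```python
def restore_insert_only(dump, target):
    """Model of an insert-only restore: a document whose _id already
    exists in target is skipped, never overwritten."""
    inserted = []
    for doc in dump:
        if doc["_id"] not in target:
            target[doc["_id"]] = dict(doc)
            inserted.append(doc["_id"])
    return inserted


# The target already holds a newer version of _id 1.
target = {1: {"_id": 1, "value": 2}}
dump = [{"_id": 1, "value": 1}, {"_id": 2, "value": 1}]

inserted = restore_insert_only(dump, target)
print(inserted)    # _ids that were actually added
print(target[1])   # the pre-existing document, left as it was
```

Only the missing _id is added; the document that was already present keeps its current contents.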


Mark Kwan

Mar 3, 2011, 7:53:51 PM
to mongodb-user
Is there any recommendation on how to update those documents that
already exist?

I want to move a collection from one machine to another. That
collection is currently in use and being updated.
I can:
1) copy the collection to the other database using either
cloneCollection or a mongodump/mongorestore
2) point the writers to the new database
3) mongodump/mongorestore again (does cloneCollection do the same as
mongodump/mongorestore?)

But any updates on the old database will be lost. Does Mongo support
any way of synchronizing those updates? I'm not even going to worry
about the case where updates were made to a document in both the new
and old databases.


The only way I can think of is to change the writers to save an
update_time (or a dirty flag), do the steps above, then query for
those updated records. This seems somewhat... dirty.
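
A minimal sketch of the update_time idea, assuming the writers stamp every document with an update_time field and that the time the first dump started is known. The data is in-memory and the helper names (changed_since, apply_changes) are hypothetical; against a real server the selection would be a query along the lines of {"update_time": {"$gte": cutoff}}.

```python
from datetime import datetime


def changed_since(docs, cutoff):
    """Return the documents whose update_time is at or after cutoff."""
    return [d for d in docs if d["update_time"] >= cutoff]


def apply_changes(changes, target):
    """Add or overwrite each changed document in target, keyed by _id."""
    for doc in changes:
        target[doc["_id"]] = dict(doc)


dump_started = datetime(2011, 3, 3, 12, 0)

# Old database: _id 1 was updated and _id 3 inserted after the dump began.
old_db = [
    {"_id": 1, "value": 2, "update_time": datetime(2011, 3, 3, 12, 5)},
    {"_id": 2, "value": 1, "update_time": datetime(2011, 3, 3, 11, 0)},
    {"_id": 3, "value": 1, "update_time": datetime(2011, 3, 3, 12, 10)},
]
# New database as it looks right after the initial copy.
new_db = {
    1: {"_id": 1, "value": 1, "update_time": datetime(2011, 3, 3, 10, 0)},
    2: {"_id": 2, "value": 1, "update_time": datetime(2011, 3, 3, 11, 0)},
}

changes = changed_since(old_db, dump_started)
apply_changes(changes, new_db)
print(sorted(new_db))
```

This only catches up changes made on the old database; it does nothing about a document modified on both sides.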



Eliot Horowitz

Mar 3, 2011, 9:00:16 PM
to mongod...@googlegroups.com
Some of your statements seem a tad contradictory.

So you have a current collection with data in it - server A.
You have another collection with some of the same data, which is
missing some documents, and in which there are some conflicts - server B.

The simplest thing to do is do a mongodump from server B, mongorestore
to server A.
Then you know that you have a copy of every document (by _id).

Now the question is what do you want to do if there is the same _id on
A and B with different data.
I can help you detect those, but the real question is what do you want
to do, as that has to be handled completely by you.
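
One possible way to detect such conflicts, sketched in Python over in-memory copies of the two collections (dicts keyed by _id). The function name find_conflicts and the sample data are illustrative assumptions, not anything provided by MongoDB.

```python
def find_conflicts(a, b):
    """Return the sorted _ids present in both collections whose
    documents differ."""
    return sorted(_id for _id in a.keys() & b.keys() if a[_id] != b[_id])


server_a = {
    1: {"_id": 1, "value": 2},
    2: {"_id": 2, "value": 1},
}
server_b = {
    1: {"_id": 1, "value": 1},   # same _id as on A, different data
    2: {"_id": 2, "value": 1},   # identical on both servers
    3: {"_id": 3, "value": 1},   # only on B, so not a conflict
}

conflicts = find_conflicts(server_a, server_b)
print(conflicts)
```

Deciding what to do with each reported _id is still application-specific.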

Mark Kwan

Mar 3, 2011, 9:11:40 PM
to mongodb-user
What I'm trying to do is to move a collection that is in use from one
database (A) to another (B).

If mongorestore supported upserts, I would do something like:
1) mongodump from A
2) mongorestore to B
3) change writers to point to B
4) mongodump from A
5) mongorestore with upserts to B

There are obviously race conditions here, like if an _id was updated
in both A and B between steps 1 and 5. I'm not even trying to address
those, that seems overly complicated especially without some
application-specific logic.

I just want to make sure the updates that were made in A between steps
1 and 3 are applied (upsert-style) to B.
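
Since mongorestore has no such option in this thread, the upsert-style second pass would have to be done by a script. A hedged sketch: the call shape mirrors replace_one(filter, replacement, upsert=True) as found on collections in current PyMongo, but the InMemoryCollection class is a hypothetical stand-in so the example runs without a server, and restore_with_upsert is an illustrative name.

```python
class InMemoryCollection:
    """Hypothetical stand-in exposing only the replace_one call used below."""

    def __init__(self):
        self.docs = {}

    def replace_one(self, filter, replacement, upsert=False):
        _id = filter["_id"]
        # Replace a matching document, or insert it when upsert is set.
        if _id in self.docs or upsert:
            self.docs[_id] = dict(replacement)


def restore_with_upsert(dump, collection):
    """Write every dumped document, replacing any existing document
    that has the same _id."""
    for doc in dump:
        collection.replace_one({"_id": doc["_id"]}, doc, upsert=True)


b = InMemoryCollection()

# Steps 1-2: first dump from A restored into B.
restore_with_upsert([{"_id": 1, "value": 1}], b)

# Steps 4-5: second dump from A, containing one update and one new document.
restore_with_upsert([{"_id": 1, "value": 2}, {"_id": 2, "value": 1}], b)

print(b.docs)
```

As noted in the post, this blindly overwrites B, so anything written to B after step 3 for the same _id would be replaced by A's older copy.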

Eliot Horowitz

Mar 3, 2011, 9:43:45 PM
to mongod...@googlegroups.com
What exactly do you mean by upsert?
The default behavior of mongorestore is insert.
If there is already a document with the same _id, then it won't be inserted.

Mark Kwan

Mar 3, 2011, 10:01:50 PM
to mongodb-user
I meant upsert as in, new _id's would be added, changed _id's would be
updated. I realize that mongorestore just does inserts, which is the
problem I'm trying to solve.


Okay, suppose I have a collection in use and I want to move it. The
simplest attempt to move the database would be a simple mongodump and
mongorestore. Assume there are inserts and updates being performed
on the database during this time:
1) insert to A {_id: 1, value: 1}
2) mongodump from A
3) insert to A {_id: 2, value: 1}
4) update A {_id: 1, value: 2}
5) mongorestore to B
6) change writers to point to B

Obviously in this case, my database B is now missing the changes in
steps 3 and 4.

What I could do is follow up with another mongodump/mongorestore.
7) mongodump from A
8) mongorestore to B

Now my database B is still missing the change in step 4.

If mongorestore had supported upsert, then database B would have all
the inserts and updates that were performed.

Since mongorestore doesn't support upsert, how would I move a
collection without losing the updates between steps 2 and 6?
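
The eight steps above can be replayed with a small Python model to see exactly what B ends up holding. This is a simulation only: databases are dicts keyed by _id, and dump and restore are hypothetical stand-ins for mongodump and an insert-only mongorestore.

```python
import copy


def dump(db):
    """Snapshot of every document, detached from later changes."""
    return copy.deepcopy(list(db.values()))


def restore(dump_docs, db):
    """Insert-only restore: an _id that already exists is skipped."""
    for doc in dump_docs:
        db.setdefault(doc["_id"], doc)


a, b = {}, {}

a[1] = {"_id": 1, "value": 1}     # 1) insert to A
first_dump = dump(a)              # 2) mongodump from A
a[2] = {"_id": 2, "value": 1}     # 3) insert to A
a[1]["value"] = 2                 # 4) update A
restore(first_dump, b)            # 5) mongorestore to B
after_first = copy.deepcopy(b)    # 6) writers now point to B

second_dump = dump(a)             # 7) mongodump from A
restore(second_dump, b)           # 8) mongorestore to B

print(after_first)
print(b)
```

After step 5, B holds only the original _id 1. After step 8 it also has _id 2, but _id 1 still carries value 1, so the update from step 4 is missing.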

Mark Kwan

Mar 3, 2011, 10:19:51 PM
to mongodb-user
Although I guess a mongorestore with upserts would still fail to
preserve all the inserts and updates if there is:
6.5) update B {_id: 2, value: 2}

In that case database B would lose that change after step 8 upserts
the old data over it.
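
The same kind of model shows the loss. This is a sketch under the assumption that an upserting restore simply overwrites by _id; restore_upsert is a hypothetical function, and the starting states are the ones A and B would have after steps 4 and 5.

```python
def restore_upsert(dump_docs, db):
    """Hypothetical upserting restore: every dumped document replaces
    whatever the target holds under the same _id."""
    for doc in dump_docs:
        db[doc["_id"]] = dict(doc)


# A after steps 1-4; B after the first restore in step 5.
a = {1: {"_id": 1, "value": 2}, 2: {"_id": 2, "value": 1}}
b = {1: {"_id": 1, "value": 1}}

# 6) writers now point to B
b[2] = {"_id": 2, "value": 2}        # 6.5) a writer sets _id 2 to value 2 on B

second_dump = [dict(d) for d in a.values()]   # 7) mongodump from A
restore_upsert(second_dump, b)                # 8) upserting restore to B

print(b)
```

B gains the step 4 update to _id 1, but the write from step 6.5 is replaced by A's older copy of _id 2.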


Perhaps the only way to move an in-use collection without losing
anything is to change the writers to set a 'dirty' bit on the
documents during migration and then query for those documents and
apply their changes to the new database.

Eliot Horowitz

Mar 4, 2011, 3:05:19 AM
to mongod...@googlegroups.com
Right.
The other option is just to write a program to copy the data which can
determine how to handle conflicts.
Based on the app, there may be ways to handle it easily.
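
A minimal sketch of what such a copy program might look like, written in Python over in-memory dicts keyed by _id. The names copy_collection and keep_highest_value are hypothetical, and the example policy is an assumption that only makes sense for an app where value never decreases.

```python
def copy_collection(source, target, resolve):
    """Copy source into target. When an _id exists on both sides with
    different contents, resolve(src_doc, dst_doc) picks the document to keep.
    Returns the number of conflicts that were resolved."""
    conflicts = 0
    for _id, src_doc in source.items():
        if _id not in target:
            target[_id] = dict(src_doc)
        elif target[_id] != src_doc:
            target[_id] = dict(resolve(src_doc, target[_id]))
            conflicts += 1
    return conflicts


def keep_highest_value(src_doc, dst_doc):
    """Example policy: keep whichever side has the larger value."""
    return src_doc if src_doc["value"] > dst_doc["value"] else dst_doc


a = {1: {"_id": 1, "value": 2}, 2: {"_id": 2, "value": 1}, 3: {"_id": 3, "value": 5}}
b = {1: {"_id": 1, "value": 1}, 2: {"_id": 2, "value": 2}}

resolved = copy_collection(a, b, keep_highest_value)
print(resolved, b)
```

The resolver is the only application-specific part; a timestamp comparison or a field-by-field merge could be swapped in without touching the copy loop.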