Hierarchical Architecture with MongoDB


Christopher Förster

May 28, 2015, 10:21:24 AM
to mongod...@googlegroups.com
Hi,
my scenario is quite simple. I have multiple MongoDB databases. Now I would like to create a kind of central MongoDB which collects (not syncs) all data from the other MongoDB databases, so it is a kind of archive. Are there existing mechanisms in MongoDB which support such a hierarchical architecture? I did not find anything like that.

I found read-only shards, but that is only for one MongoDB, not for multiple.
I found backup and restore, but a live data transfer would be preferred.

BR Chris

Stephen Steneker

May 28, 2015, 2:04:28 PM
to mongod...@googlegroups.com, cf.alt...@googlemail.com
Hi Chris,

There is no standard mechanism for archiving data from multiple MongoDB deployments into a single separate MongoDB deployment. The in-built mechanisms are for scaling out logical deployments via replication and sharding.

You could copy data into other deployments via:

 - Mongo Connector or a similar data connector API/tool: https://github.com/10gen-labs/mongo-connector

 - mongodump/mongorestore (as you noted): http://docs.mongodb.org/manual/tutorial/backup-with-mongodump/


For live replication of data you will want to enable the replication oplog (http://docs.mongodb.org/manual/core/replica-set-oplog/) and use that as the basis for syncing changes via a tailable cursor (http://docs.mongodb.org/manual/tutorial/create-tailable-cursor/). That is the approach used by MongoDB's built-in replication features as well as by tools like Mongo Connector.
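
As a rough illustration of the oplog approach, here is a minimal Python sketch. It assumes the PyMongo driver and a source deployment running as a replica set (so the oplog lives in local.oplog.rs). The function names and the in-memory archive dict are hypothetical, chosen only for this example, and only inserts, deletes, and simple $set or whole-document updates are handled; real oplog entries can take other forms depending on the server version.

```python
import time


def apply_oplog_entry(entry, archive):
    """Apply one oplog entry to an in-memory archive shaped {namespace: {_id: doc}}.

    Hypothetical helper for illustration; a real archive would write to a
    central MongoDB deployment instead of a dict.
    """
    op = entry.get("op")
    if op not in ("i", "u", "d"):
        return archive  # ignore no-ops, commands, and anything else
    docs = archive.setdefault(entry["ns"], {})
    if op == "i":
        # insert: "o" holds the full document
        docs[entry["o"]["_id"]] = dict(entry["o"])
    elif op == "d":
        # delete: "o" identifies the document by _id
        docs.pop(entry["o"]["_id"], None)
    else:
        # update: "o2" identifies the document, "o" describes the change
        _id = entry["o2"]["_id"]
        change = entry["o"]
        if "$set" in change:
            docs.setdefault(_id, {"_id": _id}).update(change["$set"])
        elif not any(key.startswith("$") for key in change):
            docs[_id] = {**change, "_id": _id}  # whole-document replacement
    return archive


def tail_oplog(uri, archive, start_ts=None):
    """Follow the oplog of one source deployment and feed entries to the archive."""
    import pymongo  # third-party driver, imported lazily so the helper above has no dependencies

    oplog = pymongo.MongoClient(uri).local["oplog.rs"]
    query = {"ts": {"$gt": start_ts}} if start_ts is not None else {}
    cursor = oplog.find(query, cursor_type=pymongo.CursorType.TAILABLE_AWAIT)
    while cursor.alive:
        for entry in cursor:
            apply_oplog_entry(entry, archive)
        time.sleep(1)  # nothing new yet; wait before polling again
```

One tail_oplog loop per source deployment (for example, one thread each) would give the "many sources, one archive" shape asked about, though a production setup would also need to persist the last seen ts so it can resume after a restart.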

Regards,
Stephen

badi

Jun 1, 2015, 4:45:12 AM
to mongod...@googlegroups.com
Hi, 

in my company we run this solution under the name Central Archive.
We run 10+ very small applications (up to 1 GB storage), and there was demand to have a copy of all DBs in one place, the Central Archive (for statistical and backup purposes).

The solution is based on the master-slave feature in MongoDB.

Every database is set as a master. The central storage runs one mongod process and is set as a slave for EVERY database.

It has been running smoothly for more than 2 years. The max lag (2-3 secs) between the masters and the central slave depends on the utilization (number of actions in the oplog) of the masters, as the slave reads from only one master at a time and the rest of the masters must wait.
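
For readers who want to picture the setup: in the legacy master-slave mode (removed in MongoDB 4.0), a slave learns about its masters from documents in the sources collection of its local database, each with a host field and an optional only field naming a single database to replicate. The Python sketch below only builds and registers such documents. The helper names and the hostnames in the usage note are hypothetical, and it assumes PyMongo plus a mongod that was started in slave mode.

```python
def build_source_docs(masters):
    """Turn (host, database) pairs into documents for the slave's local.sources.

    Pass None as the database to replicate everything from that master.
    Duplicate pairs are dropped so each master is registered once.
    """
    docs, seen = [], set()
    for host, database in masters:
        if (host, database) in seen:
            continue
        seen.add((host, database))
        doc = {"host": host}
        if database:
            doc["only"] = database  # restrict replication to this database
        docs.append(doc)
    return docs


def register_sources(slave_uri, masters):
    """Write the source documents to the central slave (legacy deployments only)."""
    import pymongo  # third-party driver, imported lazily

    docs = build_source_docs(masters)
    if docs:
        pymongo.MongoClient(slave_uri).local.sources.insert_many(docs)
    return docs
```

For example, build_source_docs([("app1.example.net:27017", "app1"), ("app2.example.net:27017", "app2")]) yields one source document per application database, which matches the "one slave, many masters" layout described above.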

It also works perfectly with WiredTiger.

Unfortunately, the preferred replica set solution has no such feature.

I hope the MongoDB company doesn't plan to sunset master-slave in upcoming versions of this wonderful database.

Marian

Christopher Förster

Jun 12, 2015, 1:14:17 AM
to mongod...@googlegroups.com
Hi Marian,
thank you very much for sharing this experience. I will try this mechanism, as it seems to be very transparent. Do your small applications partly have the same collections? For instance, what happens if two applications each have a "sales" collection and both push this collection's data to the central "sales" collection on the central slave?

BR Chris