Re: Large number of collections and "too many namespaces/collections" replication issues

Sam Helman

Nov 1, 2012, 10:49:07 AM
to mongod...@googlegroups.com
Hello,

As a general rule, using one collection per user in a user-oriented database is not recommended. Each collection carries overhead that does not exist at the document level, and collections are intended to group similar data together. A "users" collection would be a good way to go (indexed appropriately by username).
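
To put a rough number on that overhead, here is a minimal back-of-the-envelope sketch. It assumes every collection and every index consumes one namespace, that each GridFS bucket brings two collections and four indexes, and that the default per-database budget is roughly 24,000 namespaces. All three figures are assumptions for illustration and should be checked against the server version in use; the class and method names are hypothetical.

```java
// Rough namespace count for "one GridFS bucket per user" versus one shared bucket.
// Every constant below is an assumption for illustration, not a measurement.
final class NamespaceEstimate {
    // Assumed: each bucket has <prefix>.files and <prefix>.chunks.
    static final int COLLECTIONS_PER_BUCKET = 2;
    // Assumed: each of the two collections has an _id index plus one GridFS index.
    static final int INDEXES_PER_BUCKET = 4;
    // Assumed approximate default namespace budget per database.
    static final long ASSUMED_DEFAULT_LIMIT = 24_000L;

    // Namespaces consumed by the given number of buckets.
    static long namespaces(long buckets) {
        return buckets * (COLLECTIONS_PER_BUCKET + INDEXES_PER_BUCKET);
    }

    // Whether that many buckets stays within the assumed budget.
    static boolean fits(long buckets) {
        return namespaces(buckets) <= ASSUMED_DEFAULT_LIMIT;
    }

    public static void main(String[] args) {
        System.out.println("100k per-user buckets: " + namespaces(100_000) + " namespaces");
        System.out.println("one shared bucket:     " + namespaces(1) + " namespaces");
    }
}
```

Under these assumptions, 100k per-user buckets need several hundred thousand namespaces, while a single shared bucket needs a handful regardless of how many users there are.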

Do you have any questions about your redesign I could help with?


George Niculae

Nov 1, 2012, 12:05:23 PM
to mongod...@googlegroups.com
Thanks for your reply!

Briefly, the requirement is to store voicemail files (typically not very large, since they're just voicemails, in a configurable mp3 or wav format) for sipXecs users (100k). Each user can have several voicemails in a mailbox (I've seen users with 6k messages, but I consider that an exception; I'd say an average of 100 files per user). Each voicemail has additional details such as a heard/unheard flag, from field, unique message id, priority, duration, and folder label (inbox, saved, deleted). Users can query mailbox details and listen to voicemails via the web interface or phone.

In the current implementation I create one bucket per user. It stores each voicemail as a file, with the voicemail details as metadata associated with that file. When a user wants to query, retrieve, or listen to voicemails, I query only that user's bucket and read the data from it.

Given your response, I think I'll have to move all files into a single bucket and add a new metadata field for the user who owns the voicemail. All queries will then go to that single bucket and be filtered by the user who is querying (an index on user is probably needed too?).

Do you see any drawback with this approach?

Thanks so much for your help!
George

Sam Helman

Nov 1, 2012, 1:55:27 PM
to mongod...@googlegroups.com
I am not sure I fully understand, but I think what you are suggesting is:

A users collection, with information about the users.
A voicemails collection, where each document has a "user" field stating which user the voicemail belongs to.  If you index this field, finding all voicemails for a given user should be easy.  You can also use compound indexes to speed up queries such as finding all unread voicemails for a given user.
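
As a rough in-memory analogy of why a compound index helps here, the sketch below keeps a sorted map keyed on (owner, unheard flag). It is not how MongoDB stores indexes internally; the class name, method names, and field choices are hypothetical, and it assumes owner names never contain the NUL character used as a separator.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.NavigableMap;
import java.util.TreeMap;

// In-memory analogy of a compound index on (owner, unheard).
final class VoicemailIndexSketch {
    // Sorted map from composite key to the message ids stored under it.
    private final NavigableMap<String, List<String>> index = new TreeMap<>();

    // Composite key: owner first, then the flag, separated by NUL
    // (assumes owner names never contain '\u0000').
    private static String key(String owner, boolean unheard) {
        return owner + '\u0000' + (unheard ? '1' : '0');
    }

    void add(String owner, boolean unheard, String messageId) {
        index.computeIfAbsent(key(owner, unheard), k -> new ArrayList<>()).add(messageId);
    }

    // Equality on both fields: a single key lookup.
    List<String> find(String owner, boolean unheard) {
        return index.getOrDefault(key(owner, unheard), Collections.<String>emptyList());
    }

    // Equality on the leading field only: a range scan over that owner's keys.
    List<String> findAll(String owner) {
        List<String> out = new ArrayList<>();
        for (List<String> ids : index.subMap(owner + '\u0000', true, owner + '\u0001', false).values()) {
            out.addAll(ids);
        }
        return out;
    }
}
```

Both lookups touch only the entries for one owner, which is the property that keeps a single shared collection practical as the number of users grows.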

This is definitely the way I would recommend.

George Niculae

Nov 1, 2012, 3:07:03 PM
to mongod...@googlegroups.com
Sorry I wasn't clear enough; let me give some additional info.

I only need to store voicemail files in GridFS and associate them with external users (users that are not stored in MongoDB). For this I use the GridFS Java API, which allows me to do something like:

            GridFSInputFile audioFile = vmFS.createFile(file);
            audioFile.setFilename(file.getName());
            audioFile.setMetaData(metadata);
            audioFile.save();

where the metadata object carries additional info such as the user who owns the voicemail, who it was received from, and the urgent and unheard flags, e.g.

        BasicDBObject metadata = new BasicDBObject();
        metadata.put(FROM, sender);
        metadata.put(OWNER, userName);
        metadata.put(NEW, newVm);
        metadata.put(URGENT, urgent);

When I need to query, for example, new voicemails for a particular user, I can do something like:

        BasicDBObject query = new BasicDBObject();
        query.put(OWNER, userName);
        query.put(NEW, true);
        vmFS.find(query);
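
One detail worth checking in a query like the one above: the Java driver stores the object passed to setMetaData under a "metadata" key in the files document, so the query keys need dotted paths unless the OWNER and NEW constants already include that prefix. The constants' values are not shown in this thread, so the bare names "owner" and "new" below are assumptions, and the helper is a hypothetical stdlib-only sketch rather than part of the driver.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Builds query keys that reach into the "metadata" sub-document of a GridFS file.
final class MetadataQuery {
    // Prefix each bare field name with "metadata." so it addresses the nested field.
    static Map<String, Object> onMetadata(Map<String, Object> fields) {
        Map<String, Object> query = new LinkedHashMap<>();
        for (Map.Entry<String, Object> e : fields.entrySet()) {
            query.put("metadata." + e.getKey(), e.getValue());
        }
        return query;
    }

    public static void main(String[] args) {
        Map<String, Object> fields = new LinkedHashMap<>();
        fields.put("owner", "george"); // assumed value of the OWNER constant
        fields.put("new", true);       // assumed value of the NEW constant
        System.out.println(onMetadata(fields));
    }
}
```

The resulting entries could then be copied into the BasicDBObject that is passed to vmFS.find, and the same dotted names would be the ones to index.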

Right now I am creating one files/chunks collection pair for each user (e.g. user george will have 2 collections, george.files and george.chunks, and so on) by constructing the GridFS object as
        GridFS vmFS = new GridFS(getVmdb(), username);

I could use the same approach but create only the default collections, fs.files and fs.chunks, by constructing the GridFS object as
        GridFS vmFS = new GridFS(getVmdb());

Hope this makes sense

Thanks!
George

Sam Helman

Nov 7, 2012, 10:13:30 AM
to mongod...@googlegroups.com
This makes sense to me, and definitely seems like the reasonable way to go.

George Niculae

Nov 8, 2012, 1:10:24 PM
to mongod...@googlegroups.com
Thanks! I'm moving on to load tests now and will post back.

George