Collection partitioning

tracey eubanks

Oct 29, 2010, 5:26:28 PM

to mongod...@googlegroups.com

Is there a way to partition a collection sort of how MySQL handles partitioning on tables?

Thanks
Tracey Eubanks
tra...@pmamediagroup.com
Nova ESP (PMA Media Group)

Markus Gattol

Oct 29, 2010, 6:14:48 PM

to mongodb-user

that's basically sharding with MongoDB

Eliot Horowitz

Oct 29, 2010, 10:08:21 PM

to mongod...@googlegroups.com

Not really.
We're probably going to allow you to "hint" to the storage layer about
how to cluster data.

Andrew Rollins

Nov 12, 2010, 12:43:58 AM

to mongod...@googlegroups.com

I'm also interested in partitioning. Now that sharding is production ready, it's pretty much the last big thing I really want to see in MongoDB.

I really want to mimic MySQL partitioning so that I can easily drop parts of a collection in an efficient manner. For example, having timestamps in documents and partitions based on those timestamps would allow me to easily drop old documents. This is separate from sharding in that partitions don't dictate how documents are split among shards, instead they dictate how the storage is split out within each shard.

I guess you can partition yourself by just using different collections, but it'd be so much nicer to have it done in the DB.

- Andrew
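The do-it-yourself approach mentioned above (one collection per time bucket, dropping whole collections to expire data) can be sketched roughly as follows. This is only an illustration under assumptions: the monthly bucket size, the `events_YYYY_MM` naming scheme, and the helper names `bucket_name` and `expired_buckets` are hypothetical and do not come from MongoDB or from this thread.

```python
from datetime import date


def bucket_name(prefix: str, day: date) -> str:
    """Name of the (hypothetical) collection holding documents for day's month."""
    return f"{prefix}_{day:%Y_%m}"


def expired_buckets(names, prefix, today, keep_months):
    """Return the bucket names that fall outside the newest keep_months months."""
    # Oldest month that is still kept, as a (year, month) tuple.
    total = today.year * 12 + (today.month - 1) - (keep_months - 1)
    cutoff = (total // 12, total % 12 + 1)
    stale = []
    for name in names:
        if not name.startswith(prefix + "_"):
            continue  # not one of our buckets
        parts = name[len(prefix) + 1:].split("_")
        if len(parts) != 2 or not all(p.isdigit() for p in parts):
            continue  # does not match the prefix_YYYY_MM pattern
        if (int(parts[0]), int(parts[1])) < cutoff:
            stale.append(name)
    return sorted(stale)
```

The application would write each document to the collection named by `bucket_name` and periodically drop whatever `expired_buckets` returns, for example with a driver call such as pymongo's `db.drop_collection(name)`. Dropping a whole collection avoids deleting documents one by one, which is the property being asked for here.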

Ted

Nov 12, 2010, 8:59:55 PM

to mongodb-user

Me too. This one is huge for me. I need an easy way to drop timestamp'ed data without having to perform Mongo admin voodoo to reclaim space. The lack of this feature may drive us back to a SQL solution.

I know creating separate databases could be a solution but can one query across databases easily? My GUI developer is going to kill me if he has to create separate queries for each day. And this becomes a bigger problem if there are several months of data.

Eliot Horowitz

Nov 12, 2010, 10:44:49 PM

to mongod...@googlegroups.com

Not sure why you need this for that case?
If you delete objects, that space should be reclaimed pretty normally.

Eliot Horowitz

Nov 15, 2010, 9:58:54 AM

to mongod...@googlegroups.com

Right, we're not planning true partitioning right now.
Clustering gets a lot of the benefit since you won't be disk bound
most of the time.
Still some effort cleaning up indexes, but that's it.
The problem with partitioning in general is that you need to do a
lookup in N indexes instead of 1 for secondary indexes ( N is the
number of partitions ).
Something to consider in the future, but not short term.

Have you considered using capped collections?
We're also going to be doing TTL collections, so things are removed
continuously rather than have to be done in 1 large delete.
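
The secondary-index cost described in the message above can be illustrated with a toy model. This is a sketch only, not how MongoDB stores data: the `Partition` class, the `by_user` index, and `find_by_user` are hypothetical names, and each per-partition index is just an in-memory dictionary.

```python
from collections import defaultdict


class Partition:
    """One partition that keeps its own secondary index on the 'user' field."""

    def __init__(self):
        self.docs = []
        self.by_user = defaultdict(list)

    def insert(self, doc):
        self.docs.append(doc)
        self.by_user[doc["user"]].append(doc)


def find_by_user(partitions, user):
    """Query by a non-partition key: every partition's index must be consulted."""
    found, lookups = [], 0
    for part in partitions:
        lookups += 1  # one index lookup per partition
        found.extend(part.by_user.get(user, []))
    return found, lookups
```

With N partitions the function always performs N index lookups, even when only one partition holds matching documents. A query on the partitioning key itself (for example a timestamp range) could skip partitions, but a secondary-index query cannot.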

Ted

Nov 15, 2010, 9:58:51 AM

to mongod...@googlegroups.com

Andrew - Thanks! That is what I meant to say! Exactly the feature I am looking for in MongoDB. This is what we are currently doing in Oracle today.

The only other possible solution I know is to use multiple collections on different databases but I don't think one can easily query across databases. Eliot, please correct me if I'm wrong.

Markus Gattol

Nov 15, 2010, 11:17:15 AM

to mongodb-user

You can't query across collections/databases. You would have to
maintain different connection (pools) to both of your databases and
handle the return sets in your application.
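
Handling the return sets in the application, as described above, might look roughly like this. It is a minimal sketch under assumptions: each per-collection query is assumed to have already returned its documents sorted by a `ts` field, and the function name `merged_query` is made up for illustration.

```python
import heapq
from itertools import islice
from operator import itemgetter


def merged_query(result_sets, sort_key="ts", limit=None):
    """Merge several result sets, each already sorted by sort_key, into one list."""
    # heapq.merge streams the inputs lazily, so a limit avoids reading everything.
    merged = heapq.merge(*result_sets, key=itemgetter(sort_key))
    return list(islice(merged, limit))
```

The GUI code can then call one function instead of writing a separate query per day. The per-collection queries still have to be issued individually, one for each collection (and each connection, if the collections live in different databases).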

Andrew Rollins

Nov 15, 2010, 12:34:17 PM

to mongod...@googlegroups.com

On Mon, Nov 15, 2010 at 9:58 AM, Eliot Horowitz <elioth...@gmail.com> wrote:

> Right, we're not planning true partitioning right now.
> Clustering gets a lot of the benefit since you won't be disk bound
> most of the time.
> Still some effort cleaning up indexes, but that's it.
>
> The problem with partitioning in general is that you need to do a
> lookup in N indexes instead of 1 for secondary indexes ( N is the
> number of partitions ).

True, but I'm ok with that because N is probably only 2 in my case. Even if it's larger, N is constant and I can plan around that.

> Something to consider in the future, but not short term.
>
> Have you considered using capped collections?

The docs say that you can't shard capped collections. I want something I know I can grow by adding nodes (already doing thousands of ops a second now). Is there a way to get around this limitation?

> We're also going to be doing TTL collections, so things are removed
> continuously rather than have to be done in 1 large delete.

TTL would also get me what I need (dropping old data).

- Andrew

Ted

Nov 15, 2010, 12:37:12 PM

to mongod...@googlegroups.com

Any ETA on the TTL and clustering feature?

Markus Gattol

Nov 15, 2010, 12:45:32 PM

to mongodb-user

- it says 1.7.X for TTL http://jira.mongodb.org/browse/SERVER-211
- note that by voting you can influence what priority a feature gets

Ted

Nov 15, 2010, 12:49:04 PM

to mongod...@googlegroups.com

Thanks guys for your time.

Andrew Rollins

Nov 15, 2010, 12:51:16 PM

to mongod...@googlegroups.com

Also want to say thanks. Always appreciate the responsiveness and willingness to discuss different approaches.

- Andrew

Markus Gattol

Nov 15, 2010, 1:16:20 PM

to mongodb-user

Yes, good and fruitful discussion. I created a ticket
http://jira.mongodb.org/browse/SERVER-2097
You are welcome to vote :)
