creating large indexes (larger than memory)

Zac Witte

Feb 6, 2012, 6:29:14 PM

to mongodb-user

I have a very large collection (much larger than memory, which is
unavoidable) and I'm trying to create an index, which is also going to
be much larger than the amount of memory. Most of it is archival so
I'm not worried about access times and when data is accessed it will
be sequentially in large chunks in the order of the index I'm trying
to create. The index is a single text field in each document not more
than about 50 characters in length.

When I start the index (foreground operation, not background) it goes
fairly fast and steadily slows until some 12 hours later it's only
moving forward about 100 documents per MINUTE and reading some
300-400MB/s from disk. Each document is less than 1k.

Is mongo just incapable of handling large indexes like this? It
doesn't seem like a problem that should be that hard to solve. I know
if I create the index when I create the collection and add documents
incrementally I get good performance out of it (both inserts and
queries). I had a similar collection about 4x the size that grew in
this manner. It's just the one-time creation I've had problems with.

Wes Freeman

Feb 6, 2012, 9:00:05 PM

to mongod...@googlegroups.com

I don't have an answer, but I'm curious--how big is your collection?

Wes

Zac Witte

Feb 6, 2012, 9:08:58 PM

to mongod...@googlegroups.com

Currently the collection is spread across 3 shards, and each shard has about 300 million documents with a storage size of 185GB. I expect it will grow at least 10x. This is an internally used database for real-time(ish) incremental analytics, so it's OK to take it down every once in a while.

Zac

Wes Freeman

Feb 7, 2012, 12:33:34 AM

to mongod...@googlegroups.com

Since nobody else has responded yet, I wonder how long it would take
to copy the collection to a new collection (where the index is already
created). Considering your 100/minute rate after 12 hours, I think
this might be the only way to get it done. Of course, it would require
enough space for a copy of the collection.
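
To put the 100/minute figure in perspective, here is a rough back-of-the-envelope calculation. It assumes the rate stays flat at 100 documents per minute (the thread suggests it was still degrading) and uses the roughly 300 million documents per shard mentioned earlier; the helper name `daysAtRate` is made up for this sketch.

```javascript
// How long would a build take if it stayed at a fixed rate?
// 100 docs/minute is the rate reported in the thread; the document
// counts below are illustrative inputs.
function daysAtRate(docCount, docsPerMinute) {
  const minutes = docCount / docsPerMinute;
  return minutes / 60 / 24;
}

const perMillion = daysAtRate(1e6, 100); // one million documents
const perShard = daysAtRate(300e6, 100); // ~300 million documents on one shard

console.log(perMillion.toFixed(1) + " days per million documents");
console.log((perShard / 365).toFixed(1) + " years for 300 million documents");
```

At that rate each remaining million documents costs about a week, which is why copying into a pre-indexed collection looks attractive even though it needs extra disk space.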

Wes

NPSF3000

Feb 7, 2012, 3:47:30 AM

to mongodb-user

Obviously mongodb could look at their index building algorithm -
however the issue here appears to be you're creating an index bigger
than your RAM.

I'm struggling to figure out why.

You spend ~600 bytes on each document - but are only indexing say 50
bytes per document. So you only need ~16GB of RAM.
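
The estimate above can be reproduced with a few lines of arithmetic. This is only a sketch: it counts raw key bytes, using the per-shard document count and the 50-byte upper bound from the thread, and the function name `rawKeyBytes` is made up for the example. A real B-tree index also stores per-entry overhead, so the on-disk size will be larger than this figure.

```javascript
// Rough index-size estimate: raw key bytes only.
// Real indexes add per-entry overhead (record pointers, bucket headers,
// partially filled buckets), so treat this as a lower bound.
function rawKeyBytes(docCount, keyBytesPerDoc) {
  return docCount * keyBytesPerDoc;
}

const GB = 1e9;
const docsPerShard = 300e6; // figure quoted in the thread
const keyBytes = 50;        // upper bound on the indexed field, per the thread

const estimateGB = rawKeyBytes(docsPerShard, keyBytes) / GB;
console.log(estimateGB + " GB of raw keys per shard");
```

That comes to about 15 GB of raw keys per shard, in the same range as the ~16GB quoted above.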

So yeah, either index early [as you indicate this works fine], get a
bigger server, or investigate sharding.

Adam C

Feb 7, 2012, 7:07:54 AM

to mongodb-user

With an index size that significantly outstrips your RAM, you are going
to run into a lot of disk-based I/O as data is swapped to and from
disk. This is going to be very slow, especially if it is fighting
with other operations (replication, writes, reads, etc.) and/or other
processes beyond the index build.

To speed things up, you can try building the index on the secondaries
by following the procedure here (and you can switch around secondary/
primary etc.):

http://www.mongodb.org/display/DOCS/Building+indexes+with+replica+sets

The main advantage is that the DB will not have to deal with
replication while the build is going on, because you have essentially
removed it from the replica set. However, the building of the indexes
will *have to take less time than the duration of your oplog* or it
won't be able to catch up to the primary once you add it back in to
the set.
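
The oplog constraint can be checked with simple arithmetic before starting. The sketch below is illustrative only: the function names, the 50 GB oplog, the 1 GB/hour write rate and the 0.5 safety factor are all assumptions, not values from this thread. On a real replica set, the mongo shell's `db.printReplicationInfo()` reports the oplog size and the time span it currently covers.

```javascript
// Will a secondary taken out of the set for an index build still be able
// to catch up afterwards? All numbers here are illustrative assumptions.
function oplogWindowHours(oplogSizeBytes, oplogBytesPerHour) {
  return oplogSizeBytes / oplogBytesPerHour;
}

function canCatchUp(buildHours, windowHours, safetyFactor = 0.5) {
  // Require the build to fit in a fraction of the window, leaving headroom
  // for write bursts that shrink the effective window.
  return buildHours <= windowHours * safetyFactor;
}

const windowHours = oplogWindowHours(50e9, 1e9); // 50 GB oplog, 1 GB/hour of writes
console.log("oplog window: " + windowHours + " hours");
console.log("12-hour build ok: " + canCatchUp(12, windowHours));
console.log("30-hour build ok: " + canCatchUp(30, windowHours));
```

If the expected build time does not fit comfortably inside the window, the oplog would need to be larger before attempting this procedure.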

Depending on your architecture, you could then also use snapshot type
backups of the data files to recreate your expensive index build
elsewhere without going through a rebuild (indexes are preserved with
this type of backup), meaning you only have to go through the pain
once. This is described in detail in the Backups section of the Docs:
http://www.mongodb.org/display/DOCS/Backups - see the fsync and lock
or shutdown options.

Adam.

Zac Witte

Feb 7, 2012, 1:37:12 PM

to mongod...@googlegroups.com

Hey Adam, thanks for the reply. In this case, because it's mostly an internally used database, I was able to take it offline for the index build, so there were no other operations interfering (no replication or any other queries). 100% of the RAM was dedicated to the effort, but if the index is large enough, that is still a problem.

Without actually knowing the internals, I'm guessing mongo builds indexes with the assumption that the whole thing can fit in memory. For each individual document it probably tries to insert the record directly into the B+ tree, which means it needs to page in that section of the tree, shuffle things around a little, then page in the next document in the table scan and continue. Wouldn't a smarter approach be to build smaller sections of the index at a time and use something like merge sort to assemble the index when complete? Let me know if this makes any sense or if my understanding is wrong.

For the time being, it sounds like needing indexes to fit in memory is just a technical limitation of mongodb, so to get around it I've created a new collection that has the index to begin with and am in the process of inserting all the documents from my old collection without the index. For whatever reason this is going faster even though it requires much more reading and writing to disk.

Zac

Wes Freeman

Feb 7, 2012, 3:11:47 PM

to mongod...@googlegroups.com

I think you're right. It seems to just build the index in RAM, rather
than allowing it to be written partly to disk, the way it works once the
index has been created. It's probably faster that way for indexes that fit
in RAM, but there is a pretty big downside when you're working with
larger stuff. Maybe there should be an option, if it's possible to
change the way it works: ensureIndex(..., {in_ram:false});

What sort of per second copy rate are you getting as you copy your
collection? I tested last night and was amazed that it was fairly
quick: 3-4k docs/second according to mongostat, including a check to
make sure the _id didn't exist, so I could cancel and resume. My test
was nowhere near as big as your total... just curious.
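
The cancel-and-resume idea can be sketched without a database. In the example below, two `Map` objects stand in for the source and target collections, and `copyResumable` and its `maxDocs` parameter are hypothetical names for illustration. In a real copy the target would be a MongoDB collection with the index already created, and the existence check would be a lookup by `_id`.

```javascript
// Stand-ins for two collections: Map of _id -> document.
// A document already present in the target is skipped, so the copy can be
// cancelled and restarted without duplicating work.
function copyResumable(source, target, maxDocs = Infinity) {
  let copied = 0;
  for (const [id, doc] of source) {
    if (copied >= maxDocs) break; // simulates cancelling part-way through
    if (target.has(id)) continue; // already copied in an earlier run
    target.set(id, doc);
    copied++;
  }
  return copied;
}

const source = new Map();
for (let i = 0; i < 10; i++) source.set(i, { _id: i, key: "k" + i });
const target = new Map();

const firstRun = copyResumable(source, target, 4); // interrupted after 4 docs
const secondRun = copyResumable(source, target);   // resumes, skips those 4
console.log(firstRun, secondRun, target.size);
```

The second run still scans the already-copied documents, so the check adds a lookup per document; that cost is included in the rate quoted above.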

Wes

Eliot Horowitz

Feb 7, 2012, 5:37:54 PM

to mongod...@googlegroups.com

The index is not built in ram.
The data is first put into sorted external files from which a merge
sort is done at the end to construct the final b-tree.
Generally works well for indexes much larger than RAM.
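
The two phases described above (sorted runs, then a merge) are the general external merge sort technique. The sketch below illustrates that technique only; it is not MongoDB's actual implementation. Arrays stand in for the sorted external files, and the function names and the run size of 3 are arbitrary choices for the example.

```javascript
// Phase 1: split the keys into runs small enough to sort in memory.
// Each sorted array stands in for one sorted external file.
function makeSortedRuns(keys, runSize) {
  const runs = [];
  for (let i = 0; i < keys.length; i += runSize) {
    runs.push(keys.slice(i, i + runSize).sort());
  }
  return runs;
}

// Phase 2: k-way merge. Only the head of each run is examined at a time, so
// working memory depends on the number of runs, not the number of keys.
// (A real build would stream the output into the tree instead of an array,
// and would use a heap rather than a linear scan when there are many runs.)
function mergeRuns(runs) {
  const pos = runs.map(() => 0);
  const out = [];
  for (;;) {
    let best = -1;
    for (let r = 0; r < runs.length; r++) {
      if (pos[r] >= runs[r].length) continue; // this run is exhausted
      if (best === -1 || runs[r][pos[r]] < runs[best][pos[best]]) best = r;
    }
    if (best === -1) return out; // every run is exhausted
    out.push(runs[best][pos[best]++]);
  }
}

const keys = ["pear", "apple", "fig", "kiwi", "date", "plum", "lime"];
const sorted = mergeRuns(makeSortedRuns(keys, 3));
console.log(sorted.join(" "));
```

Because the merged output arrives in key order, the final tree can be written sequentially rather than by random inserts, which is the property that makes this approach suitable for indexes that do not fit in RAM.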

Wes Freeman

Feb 11, 2012, 9:04:39 PM

to mongod...@googlegroups.com

I tested this myself, and was able to create a 17GB index (from a 300M
record collection) with 4GB of RAM available in about 30 minutes. It
does open a lot of files--mongo crashed the first try because it
opened too many files, so I had to increase the OS limit. It did do a
fair amount of RAM paging in, so I'm not sure if maybe it tries to do
too much at once, or if that's just how it looks when you're using
memory mapped files.

Anyway, I've seen this complaint before, but I'm not sure why it seems
to cause issues sometimes and not others. Could sharding have
something to do with it? I wasn't using sharding for my test.

Wes

Eliot Horowitz

Feb 11, 2012, 10:13:04 PM

to mongod...@googlegroups.com

Sharding shouldn't be an issue at all.
There were issues with older versions of mongod using more RAM than they
should for creating indexes, but 2.0 should be good.
