Data sets outside of RAM

Terry Smith

unread,

Aug 14, 2010, 3:59:21 PM8/14/10

to mongodb-user, l...@jaxified.com

Hi guys,

We've been playing with MongoDB for a start-up we're doing. We're
looking at a very very (insert many more here) large data set (on the
order of 1 TB a month) and it needs to be searchable going back at
least 3 months, preferably more. It seems near impossible that we're
going to be able to fit the data into RAM and we're seeing awful
performance pass that boundary.

We're running on 1.6.0 and have 3 shards set up (no replication yet).
To test the constraints, each server only has 1 GB of RAM with RAID10
drives that get very good performance. We're doing about 10,000
writes/s and about 200 updates/s and until it hits that RAM limit it's
been great. Once it hits that RAM limit, the servers seem to take
turns sitting at 100+ faults/s and blocking all write queries and most
read queries. Every now and then they let through a burst of inserts
before going back to having fault parties.

On Monday or Tuesday we're moving to our production servers which will
see us have 4 shards set up in pairs (8 servers total), 8 GB of RAM in
each. We'll be running single 300 GB SAS drives in each server. I
expect we'll hit that RAM ceiling pretty quickly and performance will
be much the same as it is now.

Is there any solution or advice for running a data set that doesn't
fit in RAM? Ideally we'll get there, but we'd need 200 - 300 boxes to
do so which is outside of our start-up budget. I love the paradigm/
document schema we're using in MongoDB as it fits well with our
project but I need it to get at least the performance we're getting
now and preferably higher (like 3x higher).

Thanks in advance for any help and for taking the time to read this.

Sincerely,

Terry

Eliot Horowitz

unread,

Aug 14, 2010, 7:57:48 PM8/14/10

to mongod...@googlegroups.com

The only reason writes get slow with large data sets are indexes.
How many indexes do you have and how big are they?
Also - if they're right balanced (new items always go on one side)
then the working set is small.

db.foo.stats() will give us a lot of info about index sizes, etc...

> --
> You received this message because you are subscribed to the Google Groups "mongodb-user" group.
> To post to this group, send email to mongod...@googlegroups.com.
> To unsubscribe from this group, send email to mongodb-user...@googlegroups.com.
> For more options, visit this group at http://groups.google.com/group/mongodb-user?hl=en.
>
>

Markus Gattol

unread,

Aug 15, 2010, 2:41:00 PM8/15/10

to mongod...@googlegroups.com

Terry> Hi guys, We've been playing with MongoDB for a start-up we're
Terry> doing. We're looking at a very very (insert many more here)
Terry> large data set (on the order of 1 TB a month) and it needs to be
Terry> searchable going back at least 3 months, preferably more. It
Terry> seems near impossible that we're going to be able to fit the
Terry> data into RAM and we're seeing awful performance pass that
Terry> boundary.

You do not need to be able to fit all your data into RAM, just the
working set size and indexes would be the ideal case.

http://www.markus-gattol.name/ws/mongodb.html#what_is_the_so-called_working_set_size

In a nutshell, you need to have the right indexes for your use case
plus, if you do sharding, the right sharding key for your use case.

[skipping a lot of lines ...]

Terry> On Monday or Tuesday we're moving to our production servers
Terry> which will see us have 4 shards set up in pairs (8 servers
Terry> total), 8 GB of RAM in each. We'll be running single 300 GB SAS
Terry> drives in each server.

Given the amount of data you are talking about, without the right
indexes and sharding keys for your use case, you will always run into
problems ... even on hardware very big boxes. On the other hand, even
1GB of RAM might do if you have the right metrics figured out for your
use case.

Terry Smith

unread,

Aug 15, 2010, 10:07:12 PM8/15/10

to mongodb-user

Here are the stats:

> db.urls.stats()
{
"sharded" : true,
"ns" : "namespace.urls",
"count" : 5456135,
"size" : 2375506864,
"avgObjSize" : 435.38271395410857,
"storageSize" : 3087559424,
"nindexes" : 5,
"nchunks" : 65,
"shards" : {
"shard0000" : {
"ns" : "namespace.urls",
"count" : 1620085,
"size" : 612342952,
"avgObjSize" : 377.96964480258754,
"storageSize" : 736285696,
"numExtents" : 21,
"nindexes" : 5,
"lastExtentSize" : 129238272,
"paddingFactor" : 1.0099999999972917,
"flags" : 0,
"totalIndexSize" : 800195520,
"indexSizes" : {
"_id_" : 105578496,
"array.string_1" : 371696576,
"updated_1" : 85082112,
"priority_1" : 88334336,
"hash_1" : 149504000
},
"ok" : 1
},
"shard0001" : {
"ns" : "namespace.urls",
"count" : 2132475,
"size" : 1058353448,
"avgObjSize" : 496.30286310507745,
"storageSize" : 1568785152,
"numExtents" : 25,
"nindexes" : 5,
"lastExtentSize" : 267987712,
"paddingFactor" : 1.0099999999910436,
"flags" : 0,
"totalIndexSize" : 1427662720,
"indexSizes" : {
"_id_" : 120954880,
"array.string_1" : 825959360,
"updated_1" : 135913472,
"priority_1" : 153174016,
"hash_1" : 191660992
},
"ok" : 1
},
"shard0002" : {
"ns" : "namespace.urls",
"count" : 1703575,
"size" : 704810464,
"avgObjSize" : 413.7243526114201,
"storageSize" : 782488576,
"numExtents" : 19,
"nindexes" : 5,
"lastExtentSize" : 138832384,
"paddingFactor" : 1.0099999999961091,
"flags" : 0,
"totalIndexSize" : 911475648,
"indexSizes" : {
"_id_" : 109780992,
"array.string_1" : 444580800,
"updated_1" : 104898560,
"priority_1" : 92971008,
"hash_1" : 159244288
},
"ok" : 1
}
},
"ok" : 1

> > For more options, visit this group athttp://groups.google.com/group/mongodb-user?hl=en.- Hide quoted text -
>
> - Show quoted text -

Terry Smith

unread,

Aug 15, 2010, 9:55:35 PM8/15/10

to mongodb-user

Thanks for the replies. I'll try to go into a little more detail,
without divulging any secrets. If you guys can't answer with the
information provided here, let me know and I'll talk to my sysadmin
about putting some consulting time on the books.

So our working set is pretty small, maybe 1 - 5% of our documents
needs to be accessed at any time. But our indexes are large, we're
indexing almost the whole document. It's all in a single collection
spread across 3 shards.

Basically we're indexing web pages (not storing any text though). So
for each document/page we're keying off of and sharding on a hash of
the URL which is also a unique key in our database. This gives us a
nice random key and gives us a relatively even distribution across our
shards. For every page we find/crawl we insert a new document which
is made up of the _id, the hash (index), the URL itself (not indexed),
the last time we crawled it (index), and the priority of that page to
be re-indexed (index for sorting). We're also storing a list of sub-
objects (each one includes a string and a long) for each document and
the string portion of those sub-objects is indexed as well (this is by
far the largest index but also the most important).

When we start crawling, everything gets send to one shard until it
splits. We're using a small chunk size (16 MB) to make sure things
get distributed. Our insert rate is so high that we've seen
versioning errors between shards at large chunk sizes. Once
everything has split, the writes get distributed relatively evenly.
At this stage when we get to about 5 - 10M documents one of the slaves
will start losing it and just sit there faulting and blocking all of
the queries. Now, our indexes do not fit in RAM and I don't expect
them to. For now, that's impossible. We just can't buy enough
hardware to make it work. But mongo literally crawls to a halt when
we hit that RAM limit and, like I said, almost every read/write just
starts blocking.

How can we differentiate working from non-working documents? Given
the information above do you see anything wrong with our keys, outside
of the fact that there's a lot of them? By far the most important
index is the largest one (the index of the string portion in the list
of sub-objects) and that being indexable and queryable is one of the
major reasons I picked mongo in the first place so it's pretty
important that that stay.

Let me know if there's any other information I can provide. If it's
not enough info, I'll see about getting some consulting time to
discuss this in a little more of a private forum.

Thanks again,

Terry

On Aug 15, 2:41 pm, Markus Gattol <markus.gat...@sunoano.org> wrote:
> Terry> Hi guys, We've been playing with MongoDB for a start-up we're
> Terry> doing. We're looking at a very very (insert many more here)
> Terry> large data set (on the order of 1 TB a month) and it needs to be
> Terry> searchable going back at least 3 months, preferably more. It
> Terry> seems near impossible that we're going to be able to fit the
> Terry> data into RAM and we're seeing awful performance pass that
> Terry> boundary.
>
> You do not need to be able to fit all your data into RAM, just the
> working set size and indexes would be the ideal case.
>

> http://www.markus-gattol.name/ws/mongodb.html#what_is_the_so-called_w...

J Greely

unread,

Aug 16, 2010, 2:29:42 AM8/16/10

to mongodb-user

On Aug 15, 6:55 pm, Terry Smith <tj.hackin.sm...@gmail.com> wrote:
> At this stage when we get to about 5 - 10M documents one of the slaves
> will start losing it and just sit there faulting and blocking all of
> the queries.

In your shard server logs, what sort of times are showing up for
"Finding median for index" and "Finding size for ns"? Your description
matches what I've seen in my low-memory sharding test (150 million
records, 2 shards running on a single machine with 4 GB of RAM), where
the chunk splits cause insert performance to degrade severely once the
indexes start paging out. Without sharding, insert performance
degrades much more gradually as your indexes exceed available RAM.

With the volume of incoming data and extensive indexing you describe,
I'd say your Production cluster will perform well for inserting the
first 100 million records, and then fall apart the same way your test
cluster does. Less if you have a lot of those sub-objects to index.

By the way, if you're replacing the entire object each time you crawl
a page (rather than updating the existing one), your new _id field
will contain the last-crawled timestamp, saving you an index.

> How can we differentiate working from non-working documents?

The obvious idea is multiple collections. For my 150 million records/
day, I eliminated an index and a data field by inserting into
collections whose name contained the date. As I rotate collated data
into a GridFS, I just drop the oldest collections. (which admittedly
aren't that old right now, because I only have 24 GB of RAM on the
server...)

-j

Adam Greene

unread,

Aug 16, 2010, 5:21:54 PM8/16/10

to mongodb-user

Hi Terry,

we ran into the *exact* same limitation. We are working around it
with a few general approaches:
* we keep a lot of very small (5-7 field) records. Instead of having
each record have its own row, we are looking to batch them up by day
and a bit of meta data (ownership, data source, data type, etc). This
compacts the index by almost 4 orders of magnitude. Searching for
small amounts of data inter-day is a bit slower as we have to pull
back the entire day record, but it speeds up inserts drastically and
helps keep our indexes in memory. And most of our searches are multi-
day, so the slowness only comes up in an edge case..and it is still
pretty fast.

* for real searchability, perhaps not as detailed as you described, we
are looking to use sphinx instead of indexing the hell out of the
collections. This is an approach we actually did with mysql to keep
the indexes from getting out of hand, and for the same reason as
mongo, we wanted to keep the indexes in memory for maximum
performance. That might be an approach worth checking out?

Adam

Eliot Horowitz

unread,

Aug 16, 2010, 5:42:07 PM8/16/10

to mongod...@googlegroups.com

There is an issue with how we compute split points.
Going to be making better soon.

Adam Greene

unread,

Aug 16, 2010, 8:31:21 PM8/16/10

to mongodb-user

hi elliot,
what do you mean exactly? 'Split Points' sounds like something to do
with sharding, but it seems like this thread is more about the rapid
drop-off in performance when the index doesn't fit in memory. Or am I
mistaking and does split point have something to do with deciding what
part of the index (most recently and commonly used) to keep in memory?

Thanks!
Adam

Eliot Horowitz

unread,

Aug 16, 2010, 10:28:06 PM8/16/10

to mongod...@googlegroups.com

Split points has to do with sharding.

If your'e having issues with non-sharding, that's different.
Though what you said about compacting is best.

Reply all

Reply to author

Forward