MongoDB Out of Resident Memory Failure during Index Creation

Reid

Apr 22, 2011, 9:01:52 AM
to mongodb-user
We have a replica set and were in the process of adding the secondary
server when, about 12 hours into the initial sync, the secondary ran
out of resident (physical) memory. Other processes on the box failed with:
"java.io.IOException: error=12, Cannot allocate memory".

The secondary had already copied across all the data and was in the
process of building the indexes when the system started failing and
mongod was killed.

The master server is behaving as expected: it is using almost all of
the resident memory and about 400GB of virtual memory. I read
somewhere that we need enough resident (physical) memory to hold the
indexes. Is there some way to determine how much space the indexes need?
We also noted a 20-fold drop in insert performance after adding 10
indexes to the collection of over 100 million documents, so we are
guessing that we need more physical memory to hold the indexes.
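
A rough starting point, assuming the index totals reported by the shell's
stats helpers are what has to fit in RAM, would be something like:

    db.responses.stats().totalIndexSize   // total size (bytes) of all indexes on the big collection
    db.stats().indexSize                  // the same number rolled up across the whole database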

I am guessing that it ran out of resident (physical) memory during
the index build phase on the secondary server. I expect everything to
return to normal once the indexes have been built, at which point the
regular OS virtual memory management would kick in. That is what we
see on the master server in the replica set, which is running fine
with the same configuration.

The Master and Secondary servers are configured as follows:
- RHEL 5
- MongoDB 1.8.1
- 24GB RAM
- 128GB SSD for Journals
- Separate 1TB HDD just for data

Thanks to the new journaling feature in 1.8 the secondary server
recovered quickly, but since the mongod instance was killed and
restarted during the initial sync (during the final index creation
steps), it has discarded everything and is now restarting the initial
sync from the beginning.

Scott Hernandez

Apr 22, 2011, 9:20:17 AM
to mongod...@googlegroups.com
It sounds like the OOM killer took the process down. Is there anything
in your system logs to confirm this?

Also, you may want to read this:
http://www.mongodb.org/display/DOCS/Checking+Server+Memory+Usage

Do you have a swap file/partition?
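
For a quick check from the shell, the fields below should line up with
what that page describes (values are in MB):

    db.serverStatus().mem
    // e.g. { "resident" : ..., "virtual" : ..., "mapped" : ..., "supported" : true }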

Reid

Apr 22, 2011, 10:30:24 AM
to mongodb-user
Yes, both master and slave servers have 2GB swap partitions. These
physical servers are built and managed by the Linux sys-admins at
RackSpace.

We also monitor the servers with both Hyperic HQ and Nagios. Nagios
indicated that the system was running low on resident (physical)
memory and started swapping out what little resident memory was left
for the OS itself, until the system slowed to a near halt.

To monitor and alert on MongoDB I wrote a plugin for Hyperic HQ
(http://www.hyperic.org):
https://github.com/ClarityServices/hyperic-mongodb

The Hyperic plugin for MongoDB allows us to monitor and alert on
everything in db.serverStatus(), as well as all the mongod process
stats (CPU, memory, threads, etc.) and the number of seconds any
secondary server is behind the master. Hyperic HQ already allows us to
monitor disk activity and disk space by mount. For the data and
journal mounts we also monitor the number of bytes read and written
per minute.
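
For reference, a rough sketch of how the seconds-behind number can be
derived in the mongo shell (the plugin may compute it differently):

    // Compare each secondary's optime against the primary's
    var s = rs.status();
    var primary = null;
    s.members.forEach(function(m) { if (m.state == 1) primary = m; });
    s.members.forEach(function(m) {
        if (primary && m.state == 2)
            print(m.name + " is " + (primary.optimeDate - m.optimeDate) / 1000 + "s behind");
    });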

The first time we turned on this monitoring it was a little scary for
our admins until we read up on how MongoDB uses OS Virtual Memory
Management for everything. Brilliant solution!

It was only after adding so many indexes that we have seen the
resident memory for mongod go this high. Usually it sits at about
18GB, but it is now consistently around 21GB. We are going to ask
RackSpace for another 12 or 24GB of RAM in each of our Mongo servers.

Reid

Apr 23, 2011, 1:49:51 PM
to mongodb-user
We just confirmed that this is a bug in MongoDB 1.8.1.

The problem was reproduced by repeating the initial sync of a
secondary server; it crashed again when it ran out of physical
memory.

More importantly, reviewing "free memory + buffers/cache" shows a
steady drop in that figure starting shortly after the index creation
phase begins, until the memory is eventually exhausted. Unfortunately
this forum does not support pictures, so I cannot show the monitoring
chart from the Hyperic HQ Enterprise monitoring system.

"free memory + buffers/cache" should not be confused with plain "free
memory", which is always very low because of the way MongoDB uses OS
virtual memory; adding back buffers/cache excludes that file-system
cache usage from the figure. On our primary server "free memory +
buffers/cache" sits around 21GB. The secondary also sits at 21GB
until the initial sync index creation starts.

Eliot Horowitz

Apr 23, 2011, 3:49:14 PM
to mongod...@googlegroups.com
Can you run mongostat while doing this and tell us what options you
started mongod with?

Reid

Apr 25, 2011, 9:45:17 AM
to mongodb-user
We successfully synced the secondary server this weekend by stopping
the master server, copying its data files to the secondary server, and
then starting it with the --fastsync option.

Below is a link to an image of the Free Memory metric from Hyperic
HQ, including buffers/cache so as to exclude memory used by the OS
virtual memory management on behalf of mongod:

https://github.com/ClarityServices/hyperic-mongodb/wiki/free_memory_with_buffers.png

- The red line shows the Free Memory (+buffers) for the master server;
the blue line shows the secondary server.
During its initial sync the secondary started building its indexes
just before its Free Memory (+buffers) started dropping, and it kept
dropping until it was finally exhausted.

After a stock install of MongoDB 1.8.1 on RHEL 5, the following
changes were made:
- Data partition is using an ext4 file system with atime disabled
- Not using large VM pages, and the file descriptor limit was raised above 4k
- The following lines were added to /etc/mongod.conf:
    rest=true
    replSet=production
    journal=true
    notablescan=true
- dbpath changed to the 1TB ext4 drive
- The journal directory is a soft link to the SSD drive
- Then the server was added to the replica set using the mongo shell (roughly as sketched below)
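
For completeness, that last step was along these lines from a shell
connected to the primary (the hostname here is a placeholder, not our
real one):

    rs.add("secondary1.example.com:27017")
    rs.status()   // then watch the new member work through its initial sync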

Eliot Horowitz

Apr 26, 2011, 1:24:58 AM
to mongod...@googlegroups.com
In order to really see what's going on, we need the internal memory
stats from mongo: mongostat, or the munin plugins.
Those give us a breakdown of mapped, resident, heap, etc.
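
If mongostat is awkward to capture while this is running, roughly the
same breakdown should be visible from the shell, assuming extra_info is
populated on your platform:

    db.serverStatus().mem          // mapped, resident, virtual (MB)
    db.serverStatus().extra_info   // heap_usage_bytes and page_faults on Linux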

Reid

May 16, 2011, 5:56:44 PM
to mongodb-user
We added another secondary server to the replica set and, instead of
crashing when out of resident memory, it now fails with:
    replSet initial sync exception 10092 too may dups on index build with dropDups=true

Below is an extract of the log files to give a better picture.

Sat May 14 00:24:48 [replica set sync] ******
Sat May 14 00:24:48 [replica set sync] creating replication oplog of size: 88974MB...
Sat May 14 00:24:48 [FileAllocator] done allocating datafile /mongodata/mongo/local.1, size: 128MB, took 0.192 secs
Sat May 14 00:24:48 [FileAllocator] allocating new datafile /mongodata/mongo/local.2, filling with zeroes...
.......
Mon May 16 16:52:11 [replica set sync] 107673282 objects cloned so far from collection production.responses
Mon May 16 16:52:25 [replica set sync] building new index on { _id: 1 } for production.responses
        288900/107686930        0%
        738100/107686930        0%
......
        107682700/107686930     99%
Mon May 16 17:53:54 [replica set sync] external sort used : 108 files in 3689 secs
        521500/107686930        0%
......
        77410500/107686930      71%
        77416100/107686930      71%
Mon May 16 18:49:50 [replica set sync] replSet initial sync exception 10092 too may dups on index build with dropDups=true


We are running MongoDB 1.8.1. The master database was built on MongoDB
1.8 and then upgraded to 1.8.1 when it was released.

I found the error message in the source code:
https://github.com/mongodb/mongo/blob/master/db/pdfile.cpp on line 1222:

    /* we could queue these on disk, but normally there are very few dups,
       so instead we keep in ram and have a limit. */
    dupsToDrop.push_back(d.second);
    uassert( 10092 , "too may dups on index build with dropDups=true",
             dupsToDrop.size() < 10000 );

It is clear now why it was crashing previously. Now there is an
assertion to make sure it does not run out of memory.

No dropped dups were found in the log file.

How do we fix this issue with the _id key? I did not think we could
insert multiple documents into a mongo collection with the same _id
value.
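
In case it helps narrow this down, one (slow) way to look for duplicate
values of an indexed field from the shell might be a map/reduce along
these lines; the output collection name here is made up:

    // Count occurrences of each _id (or substitute the field of any unique index)
    db.responses.mapReduce(
        function() { emit(this._id, 1); },
        function(key, values) { return Array.sum(values); },
        { out: { replace: "dup_counts" } }
    );
    db.dup_counts.find({ value: { $gt: 1 } }).limit(10)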

Eliot Horowitz

May 16, 2011, 6:27:04 PM
to mongod...@googlegroups.com
Can you send getIndexKeys() from all the collections on the master?

Reid

May 17, 2011, 7:51:07 AM
to mongodb-user
Hi Eliot,

Thank you for your prompt response. We have two collections: a smaller
tradelines collection, and a responses collection with over 100
million records. Below are the results from both collections:

production:PRIMARY> db.tradelines.getIndexKeys()
[ { "_id" : 1 }, { "id" : 1 } ]
production:PRIMARY> db.responses.getIndexKeys()
[
    { "_id" : 1 },
    { "id" : 1 },
    { "tracking_number" : 1 },
    { "inquiry.cell_phone" : 1 },
    { "inquiry.number" : 1, "inquiry.state" : 1 },
    { "inquiry.email_address" : 1 },
    { "inquiry.encrypted_number" : 1 },
    { "inquiry.home_phone" : 1 },
    { "inquiry.inquiry_purpose_type" : 1 },
    { "inquiry.inquiry_received_at" : 1 },
    { "inquiry.location_id" : 1 },
    { "inquiry.routing_number" : 1, "inquiry.encrypted_account_number" : 1 },
    { "inquiry.upload_batch_tracking_number" : 1 },
    { "inquiry.location_id" : 1, "inquiry.created_at" : 1, "inquiry.file_name" : 1 }
]
production:PRIMARY>

Thank you,
Reid

Eliot Horowitz

May 17, 2011, 8:23:49 AM
to mongod...@googlegroups.com
I'm sorry, I meant getIndexes(), not getIndexKeys().

Reid

May 17, 2011, 1:06:40 PM
to mongodb-user
production:PRIMARY> db.tradelines.getIndexes()
[
    {
        "name" : "_id_",
        "ns" : "production.tradelines",
        "key" : { "_id" : 1 },
        "v" : 0
    },
    {
        "_id" : ObjectId("4db97dfa4ed562dfa4718aad"),
        "ns" : "production.tradelines",
        "key" : { "id" : 1 },
        "name" : "id_1",
        "unique" : true,
        "v" : 0
    }
]

production:PRIMARY> db.responses.getIndexes()
[
    {
        "name" : "_id_",
        "ns" : "production.responses",
        "key" : { "_id" : 1 },
        "v" : 0
    },
    {
        "name" : "id_1",
        "ns" : "production.responses",
        "key" : { "id" : 1 },
        "unique" : true,
        "background" : false,
        "v" : 0
    },
    {
        "name" : "tracking_number_1",
        "ns" : "production.responses",
        "key" : { "tracking_number" : 1 },
        "background" : false,
        "v" : 0
    },
    {
        "name" : "inquiry.cell_phone_1",
        "ns" : "production.responses",
        "key" : { "inquiry.cell_phone" : 1 },
        "sparse" : true,
        "background" : false,
        "v" : 0
    },
    {
        "name" : "inquiry.number_1_inquiry.state_1",
        "ns" : "production.responses",
        "key" : { "inquiry.number" : 1, "inquiry.state" : 1 },
        "sparse" : true,
        "background" : false,
        "v" : 0
    },
    {
        "name" : "inquiry.email_address_1",
        "ns" : "production.responses",
        "key" : { "inquiry.email_address" : 1 },
        "sparse" : true,
        "background" : false,
        "v" : 0
    },
    {
        "_id" : ObjectId("4daedd543fdf2f8826d3ab0f"),
        "ns" : "production.responses",
        "key" : { "inquiry.encrypted_number" : 1 },
        "name" : "inquiry.encrypted_number_1",
        "v" : 0
    },
    {
        "_id" : ObjectId("4daef7ab3fdf2f8826d3ab10"),
        "ns" : "production.responses",
        "key" : { "inquiry.home_phone" : 1 },
        "name" : "inquiry.home_phone_1",
        "sparse" : true,
        "v" : 0
    },
    {
        "_id" : ObjectId("4daf10823fdf2f8826d3ab11"),
        "ns" : "production.responses",
        "key" : { "inquiry.inquiry_purpose_type" : 1 },
        "name" : "inquiry.inquiry_purpose_type_1",
        "sparse" : true,
        "v" : 0
    },
    {
        "_id" : ObjectId("4daf25d93fdf2f8826d3ab12"),
        "ns" : "production.responses",
        "key" : { "inquiry.inquiry_received_at" : 1 },
        "name" : "inquiry.inquiry_received_at_1",
        "sparse" : true,
        "v" : 0
    },
    {
        "_id" : ObjectId("4daf38c33fdf2f8826d3ab13"),
        "ns" : "production.responses",
        "key" : { "inquiry.location_id" : 1 },
        "name" : "inquiry.location_id_1",
        "sparse" : true,
        "v" : 0
    },
    {
        "_id" : ObjectId("4db80675185c148044f00f48"),
        "ns" : "production.responses",
        "key" : { "inquiry.routing_number" : 1, "inquiry.encrypted_account_number" : 1 },
        "name" : "inquiry.routing_number_1_inquiry.encrypted_account_number_1",
        "sparse" : true,
        "v" : 0
    },
    {
        "_id" : ObjectId("4db82fd5d73c8df4a04e0f38"),
        "ns" : "production.responses",
        "key" : { "inquiry.upload_batch_tracking_number" : 1 },
        "name" : "inquiry.upload_batch_tracking_number_1",
        "sparse" : true,
        "v" : 0
    },
    {
        "_id" : ObjectId("4db84656d7923c20aeebe373"),
        "ns" : "production.responses",
        "key" : { "inquiry.location_id" : 1, "inquiry.created_at" : 1, "inquiry.file_name" : 1 },
        "name" : "inquiry.location_id_1_inquiry.created_at_1_inquiry.file_name_1",
        "sparse" : true,
        "v" : 0
    }
]

Eliot Horowitz

May 17, 2011, 2:52:29 PM
to mongod...@googlegroups.com
Do you do a lot of updates that could increase the size of documents?

Reid

May 17, 2011, 3:54:46 PM
to mongodb-user
The documents are mostly read-only, but we do pre-calculate some
fields that were added to the existing documents afterwards. Such
additions would not add more than 1K of data to each document. Each
document is between 4K and 20K of BSON.
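
If it helps, a quick way to see how much that growth is costing in
practice (assuming these fields are reported by this version's
collection stats):

    var s = db.responses.stats();
    s.avgObjSize      // average document size in bytes
    s.paddingFactor   // noticeably above 1 would suggest documents are moved on update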

Eliot Horowitz

May 18, 2011, 2:32:27 AM
to mongod...@googlegroups.com
Have you tried bringing up the secondary again?

Reid

May 18, 2011, 9:54:23 AM
to mongodb-user
Yes, it just restarts the initial sync from scratch every time the
secondary server is started. It copies all the data over again and
then fails during the index build phase.

We are starting another replication today after increasing the memory
in the secondary server to 72GB. We will see if that helps the index
creation phase. The database is now over 700GB, so it will take a few
days to finish.

Reid

May 19, 2011, 2:58:49 PM
to mongodb-user
The secondary server core dumped with the following output in the
mongo log file:

Wed May 18 22:15:24 [initandlisten] connection accepted from 192.168.112.120:53146 #760
Wed May 18 22:15:24 [conn760] end connection 192.168.112.120:53146
Wed May 18 22:15:59 [dur] lsn set 30706940
Wed May 18 22:16:13 [replica set sync] 66251710 objects cloned so far from collection production.responses
Wed May 18 22:16:14 [dur] old journal file will be removed: /mongodata/mongo/journal/j._197
Wed May 18 22:16:29 [FileAllocator] allocating new datafile /mongodata/mongo/production.308, filling with zeroes...
Wed May 18 22:16:29 Invalid access at address: 0x2bf831dcdc00

Wed May 18 22:16:29 Got signal: 11 (Segmentation fault).

Wed May 18 22:16:29 Backtrace:
0x8a2e59 0x8a3430 0x3bc720eb10 0x76d400 0x771b18 0x7719e4 0x7719e4
0x7719e4 0x7719e4 0x7719e4 0x7719e4 0x7719e4 0x7719e4 0x7719e4
0x7719e4 0x7719e4 0x7719e4 0x7719e4 0x7719e4 0x7719e4
/usr/bin/mongod(_ZN5mongo10abruptQuitEi+0x399) [0x8a2e59]
/usr/bin/mongod(_ZN5mongo24abruptQuitWithAddrSignalEiP7siginfoPv+0x220) [0x8a3430]
/lib64/libpthread.so.0 [0x3bc720eb10]
/usr/bin/mongod(_ZN5mongo6Extent4initEPKciii+0x30) [0x76d400]
/usr/bin/mongod(_ZN5mongo13MongoDataFile12createExtentEPKcibi+0x268) [0x771b18]
/usr/bin/mongod(_ZN5mongo13MongoDataFile12createExtentEPKcibi+0x134) [0x7719e4]
/usr/bin/mongod(_ZN5mongo13MongoDataFile12createExtentEPKcibi+0x134) [0x7719e4]
/usr/bin/mongod(_ZN5mongo13MongoDataFile12createExtentEPKcibi+0x134) [0x7719e4]
/usr/bin/mongod(_ZN5mongo13MongoDataFile12createExtentEPKcibi+0x134) [0x7719e4]
/usr/bin/mongod(_ZN5mongo13MongoDataFile12createExtentEPKcibi+0x134) [0x7719e4]
/usr/bin/mongod(_ZN5mongo13MongoDataFile12createExtentEPKcibi+0x134) [0x7719e4]
/usr/bin/mongod(_ZN5mongo13MongoDataFile12createExtentEPKcibi+0x134) [0x7719e4]
/usr/bin/mongod(_ZN5mongo13MongoDataFile12createExtentEPKcibi+0x134) [0x7719e4]
/usr/bin/mongod(_ZN5mongo13MongoDataFile12createExtentEPKcibi+0x134) [0x7719e4]
/usr/bin/mongod(_ZN5mongo13MongoDataFile12createExtentEPKcibi+0x134) [0x7719e4]
/usr/bin/mongod(_ZN5mongo13MongoDataFile12createExtentEPKcibi+0x134) [0x7719e4]
/usr/bin/mongod(_ZN5mongo13MongoDataFile12createExtentEPKcibi+0x134) [0x7719e4]
/usr/bin/mongod(_ZN5mongo13MongoDataFile12createExtentEPKcibi+0x134) [0x7719e4]
/usr/bin/mongod(_ZN5mongo13MongoDataFile12createExtentEPKcibi+0x134) [0x7719e4]
/usr/bin/mongod(_ZN5mongo13MongoDataFile12createExtentEPKcibi+0x134) [0x7719e4]

Wed May 18 22:16:29 ERROR: Client::~Client _context should be null but is not; client:replica set sync
Wed May 18 22:16:29 ERROR: Client::shutdown not called: replica set sync


We are now copying the data files manually from another slave to see
if that works.

Reid

May 20, 2011, 10:33:47 AM
to mongodb-user
Copying the data files from the primary to the slave and then starting
it with --fastsync worked.