Received: by 10.58.69.11 with SMTP id a11mr2380638veu.30.1349443181702; Fri, 05 Oct 2012 06:19:41 -0700 (PDT)
X-BeenThere: mongodb-user@googlegroups.com
Received: by 10.220.141.4 with SMTP id k4ls3044900vcu.8.gmail; Fri, 05 Oct 2012 06:19:32 -0700 (PDT)
Received: by 10.52.29.225 with SMTP id n1mr1551360vdh.5.1349443172412; Fri, 05 Oct 2012 06:19:32 -0700 (PDT)
Date: Fri, 5 Oct 2012 06:19:32 -0700 (PDT)
From: Gianfranco
To: mongodb-user@googlegroups.com
Message-Id:
In-Reply-To: <9d15fe35-07f6-4ab4-b069-13274ca550c6@googlegroups.com>
References: <0c4eca80-4052-408d-9dff-2cff2be99c16@googlegroups.com>
 <8ad827d5-09e4-40fb-8c4d-e08be7fc121a@t20g2000yqn.googlegroups.com>
 <8687F49F-4AB2-419B-8019-FD3BC40F15E4@gmail.com>
 <7bb71769-bd12-4e8d-bd3c-c0352ec3d802@30g2000yqi.googlegroups.com>
 <5E321CD693F54C73BB1DE1AEE46F2615@gmail.com>
 <6AEE2A807DD942E78259A3439EFE6AA1@gmail.com>
 <21DDBA49-09FF-4BAB-8233-B454D4BEF172@gmail.com>
 <9d15fe35-07f6-4ab4-b069-13274ca550c6@googlegroups.com>
Subject: Re: [mongodb-user] Re: Replica Set Full Re-sync Out of Memory During Index Building
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="----=_Part_671_28845010.1349443172067"

------=_Part_671_28845010.1349443172067
Content-Type: multipart/alternative; boundary="----=_Part_672_3663415.1349443172067"

------=_Part_672_3663415.1349443172067
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 7bit

What version are you using, Dave?
There has been a fix since 2.0.7, https://jira.mongodb.org/browse/SERVER-6414,
which introduces changes that are 'much better for memory consumption and performance'.
It is related to this previous issue.

On Friday, September 28, 2012 9:54:30 PM UTC+1, David K Storrs wrote:
>
> Did you ever find an answer? We are having the same issue.
>
> Dave
>
> On Monday, June 25, 2012 4:36:07 PM UTC-7, jnunemaker wrote:
>>
>> The good news is I managed to get one of the two new machines that I need
>> to sync up to date last night (after like try 5).
>> Going to try syncing the
>> other one tonight.
>>
>> I've had this sync problem before as well. Would love to get it sorted
>> out so I'm not so nervous about losing machines. I can't currently do file
>> system snapshots so I kind of have to do full syncs. Let me know if you
>> need anything else from me. Happy to help.
>>
>> On Mon, Jun 25, 2012 at 10:13 AM, Sid wrote:
>>
>>> Thanks for filing the ticket along with the logs. Will look into it and
>>> update the relevant ticket accordingly. Much thanks for reporting this to
>>> us.
>>>
>>> On Friday, June 22, 2012 2:03:59 PM UTC-4, jnunemaker wrote:
>>>>
>>>> Nah, swap space didn't really help. It made it to the last data file
>>>> once without swap.
>>>>
>>>> On Jun 22, 2012, at 1:36 PM, Sid wrote:
>>>>
>>>> So adding extra swap space did help it in making it move forward. As
>>>> for the logs, yes, can you please create a ticket in community private and
>>>> attach the logs there.
>>>>
>>>> Thanks.
>>>>
>>>> On Friday, June 22, 2012 10:26:19 AM UTC-4, jnunemaker wrote:
>>>>>
>>>>> Failed again last night with 8GB of RAM and 6GB of swap. Got to data
>>>>> file 49 of 52.
>>>>>
>>>>> I'd prefer not to post the logs publicly. Should I send them directly
>>>>> to you or drop them in jira community private or something?
>>>>>
>>>>> On Thursday, June 21, 2012 at 10:56 PM, John Nunemaker wrote:
>>>>>
>>>>> Bumped up swap to 6GB and changed log verbosity to 2. I'll check on
>>>>> it in the morning (EST) and post the results.
>>>>>
>>>>> On Thursday, June 21, 2012 at 11:24 AM, Sid wrote:
>>>>>
>>>>> Can you please try with a larger swap file? Also, can you please try
>>>>> reproducing the issue with log level 2 on the node that you are trying to
>>>>> resync and post the logs? I am interested in seeing the logs specifically
>>>>> from the time when it's building the index. To run a mongo instance with
>>>>> higher verbosity level just pass an extra argument -vv on the command line
>>>>> when you start mongo.
>>>>>
>>>>> On Wednesday, June 20, 2012 3:54:26 PM UTC-4, jnunemaker wrote:
>>>>>
>>>>> 1) 6GB of RAM. Just upped to 8GB of RAM and still failed. Latest
>>>>> stable Ubuntu.
>>>>> 2) Data size is like 54GB. Index size is ~34GB.
>>>>> 3) Disk is not saturated. All the swap gets used too. It just died
>>>>> again, so I couldn't run free -m, but the kernel log shows free swap as nothing
>>>>> so I'm assuming it is burning that as well.
>>>>>
>>>>> It seems to fail at a similar point, right around data file 47, and is
>>>>> always during index building.
>>>>>
>>>>> The service is analytics, so our active set is relatively small
>>>>> compared to all the data/index size.
>>>>>
>>>>> Most data is partitioned into a collection per month as well, so only the
>>>>> latest collections actually receive writes. The sync does not appear to be
>>>>> getting to that point yet. Seems to be dying when it gets to a collection a
>>>>> few months back, or at least that is what is in the log.
>>>>>
>>>>> On Wednesday, June 20, 2012 at 12:41 PM, Sid wrote:
>>>>>
>>>>> Ok. Can you post the following please:
>>>>>
>>>>> i) Details about the machine (RAM, platform, OS) etc.
>>>>> ii) How big is the data size and what is the size of the indexes etc.
>>>>> iii) Output from free -m while indexing is going on. Also, is the disk
>>>>> saturated when it happens?
>>>>>
>>>>> On Jun 20, 12:10 pm, John Nunemaker wrote:
>>>>>
>>>>> Yes. 2GB.
>>>>>
>>>>> On Jun 20, 2012, at 11:55 AM, Sid wrote:
>>>>>
>>>>> Do you have any swap space? If yes, how much swap space are you
>>>>> running it with?
>>>>>
>>>>> On Jun 20, 10:30 am, jnunemaker wrote:
>>>>>
>>>>> I've tried 3 times with one machine and 1 time with another to add another
>>>>> replica to a set. Each time it gets through 45-47 data files out of 52 and
>>>>> then starts rapidly using memory until it eventually gets sniped by the OOM
>>>>> killer.
>>>>>
>>>>> For now I've added an arbiter so I have 2 full copies and an arbiter. I
>>>>> need to get the new machines synced though as one of the 2 full copies has
>>>>> half the hardware as we are in the middle of transitioning to new hardware.
>>>>>
>>>>> According to our host, we can't snapshot only the data directory; it would
>>>>> have to be the whole server, which would be a mess for config. Pretty sure
>>>>> this means we have to do a full re-sync and they keep running out of memory.
>>>>>
>>>>> It seems to always happen in the index building phase. Let me know if any
>>>>> more information would help (servers, logs, etc.).
>>>>>
>>>>> --
>>>>> You received this message because you are subscribed to the Google
>>>>> Groups "mongodb-user" group.
>>>>> To post to this group, send email to mongod...@googlegroups.com
>>>>> To unsubscribe from this group, send email to
>>>>> mongodb-user...@googlegroups.com
>>>>> See also the IRC channel -- freenode.net#mongodb
>>>>>
>>>>> --
>>>> You received this message because you are subscribed to the Google
>>>> Groups "mongodb-user" group.
>>>> To post to this group, send email to mongod...@googlegroups.com
>>>> To unsubscribe from this group, send email to
>>>> mongodb-user...@googlegroups.com
>>>> See also the IRC channel -- freenode.net#mongodb
>>>>
>>
>>

------=_Part_672_3663415.1349443172067--
------=_Part_671_28845010.1349443172067--