I've tried 3 times with one machine and 1 time with another to add another replica to a set. Each time it gets through 45-47 data files out of 52 and then starts rapidly using memory until it eventually gets sniped by the OOM killer.
For now I've added an arbiter, so I have 2 full copies and an arbiter. I need to get the new machines synced, though, as one of the 2 full copies has half the hardware; we are in the middle of transitioning to new hardware.
According to our host, we can't snapshot only the data directory; it would have to be the whole server, which would be a mess for config. I'm pretty sure this means we have to do a full re-sync, and the new machines keep running out of memory.
It seems to always happen in the index building phase. Let me know if any more information would help (servers, logs, etc.).
On Jun 20, 2012, at 11:55 AM, Sid <siddharth.si...@10gen.com> wrote:
Do you have any swap space? If so, how much swap are you running with?
On Jun 20, 12:10 pm, John Nunemaker <nunema...@gmail.com> wrote:
Yes. 2gb.
> --
> You received this message because you are subscribed to the Google
> Groups "mongodb-user" group.
> To post to this group, send email to mongodb-user@googlegroups.com
> To unsubscribe from this group, send email to
> mongodb-user+unsubscribe@googlegroups.com
> See also the IRC channel -- freenode.net#mongodb
On Wednesday, June 20, 2012 at 12:41 PM, Sid wrote:
Ok. Can you post the following please:
i) Details about the machine (RAM, platform, OS, etc.).
ii) How big is the data, and how big are the indexes?
iii) Output from free -m while indexing is going on. Also, is the disk saturated when it happens?
On Wednesday, June 20, 2012 3:54:26 PM UTC-4, jnunemaker wrote:
1) 6GB of RAM. Just upped to 8GB of RAM and it still failed. Latest stable Ubuntu.
2) Data size is about 54GB. Index size is ~34GB.
3) Disk is not saturated. All the swap gets used too. It just died again, so I couldn't run free -m, but the kernel log shows no free swap, so I'm assuming it is burning through that as well.
It seems to fail at a similar point, right around data file 47, and always during index building.
The service is analytics, so our active set is relatively small compared to the total data/index size.
Most data is partitioned into one collection per month as well, so only the latest collections actually receive writes. The sync does not appear to be getting to those yet. It seems to be dying when it gets to a collection from a few months back, or at least that is what is in the log.
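Since the process was killed before free -m could be run, one option is to record memory and swap continuously while the resync runs, so the numbers survive the OOM kill. The following is only a minimal sketch, assuming a Linux host where /proc/meminfo is readable; the function names, the memory.log file name, and the 10-second interval are all made up for illustration and are not part of MongoDB.

```python
import time


def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of numeric values.

    Most fields are reported in kB; a few (such as HugePages counts) are
    plain numbers and are stored unchanged.
    """
    values = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue
        fields = rest.split()
        if fields and fields[0].isdigit():
            values[name.strip()] = int(fields[0])
    return values


def summarize(values):
    """Return a one-line summary in MB for free memory and swap."""
    def mb(key):
        return values.get(key, 0) // 1024
    return "MemFree=%dMB SwapTotal=%dMB SwapFree=%dMB" % (
        mb("MemFree"), mb("SwapTotal"), mb("SwapFree"))


def log_memory(path="memory.log", interval_seconds=10, samples=None):
    """Append a timestamped summary to `path` every `interval_seconds`.

    samples=None loops until interrupted; an integer stops after that
    many lines have been written.
    """
    count = 0
    while samples is None or count < samples:
        with open("/proc/meminfo") as source:
            line = summarize(parse_meminfo(source.read()))
        with open(path, "a") as log:
            log.write("%s %s\n" % (time.strftime("%Y-%m-%d %H:%M:%S"), line))
        count += 1
        if samples is None or count < samples:
            time.sleep(interval_seconds)
```

Calling log_memory() in a separate session before starting the resync would leave a trail of readings up to the moment the process is killed, which could then be posted in place of the free -m output.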
On Thursday, June 21, 2012 at 11:24 AM, Sid wrote:
Can you please try with a larger swap file? Also, can you please try reproducing the issue with log level 2 on the node that you are trying to resync, and post the logs? I am specifically interested in the logs from the time when it's building the index. To run a mongod instance with a higher verbosity level, just pass an extra argument -vv on the command line when you start it.
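The -vv suggestion above can be sketched as a small launcher. This is an illustration only: the replica set name and the paths in the usage note are hypothetical placeholders, and the helper functions are invented for the example. The flags themselves (--replSet, --dbpath, --logpath, --logappend, and -v repeated for more verbosity) are standard mongod options.

```python
import subprocess


def mongod_command(replica_set, dbpath, logpath, verbosity=2):
    """Build a mongod argument list with the requested verbosity.

    verbosity=2 yields "-vv", which corresponds to log level 2.
    """
    if verbosity < 0:
        raise ValueError("verbosity must be non-negative")
    command = [
        "mongod",
        "--replSet", replica_set,
        "--dbpath", dbpath,
        "--logpath", logpath,
        "--logappend",
    ]
    if verbosity:
        # mongod accepts -v, -vv, -vvv, ... for increasing log detail.
        command.append("-" + "v" * verbosity)
    return command


def start_mongod(command):
    """Start mongod in the background and return the process handle."""
    return subprocess.Popen(command)
```

For example, start_mongod(mongod_command("rs0", "/data/db", "/var/log/mongod.log")) would launch the resyncing node with -vv, using placeholder values that need to be replaced with the real set name and paths.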
On Thursday, June 21, 2012 at 10:56 PM, John Nunemaker wrote:
Bumped up swap to 6GB and changed log verbosity to 2. I'll check on it in the morning (EST) and post the results.
On Friday, June 22, 2012 10:26:19 AM UTC-4, jnunemaker wrote:
Failed again last night with 8GB of RAM and 6GB of swap. Got to data file 49 of 52.
I'd prefer not to post the logs publicly. Should I send them directly to you or drop them in Jira community private or something?
On Jun 22, 2012, at 1:36 PM, Sid <siddharth.si...@10gen.com> wrote:
So adding extra swap space did help it move further along. As for the logs, yes, can you please create a ticket in community private and attach the logs there?
Thanks.
On Friday, June 22, 2012 2:03:59 PM UTC-4, jnunemaker wrote:
Nah, swap space didn't really help. It made it to the last data file once without swap.
> On Jun 22, 2012, at 1:36 PM, Sid <siddharth.si...@10gen.com> wrote:
> So adding extra swap space did help it in making it move forward. As for
> the logs, yes can you please create a ticket in community private and
> attach the logs there.
> Thanks.
> On Friday, June 22, 2012 10:26:19 AM UTC-4, jnunemaker wrote:
>> Failed again last night with 8GB of RAM and 6GB of swap. Got to data
>> file 49 of 52.
>> I'd prefer not to post the logs publicly. Should I send them directly to
>> you or drop them in jira community private or something?
>> On Thursday, June 21, 2012 at 10:56 PM, John Nunemaker wrote:
>> Bumped up swap to 6GB and changed log verbosity to 2. I'll check on it
>> in the morning (EST) and post the results.
>> On Thursday, June 21, 2012 at 11:24 AM, Sid wrote:
>> Can you please try with a larger swap file. Also, can you please try
>> reproducing the issue with log level 2 on the node that you are trying to
>> resync and post the logs. I am interested in seeing the logs specifically
>> from the time when it's building the index. To run a mongo instance with
>> higher verbosity level just pass an extra argument -vv on the command line
>> when you start mongo.
>> On Wednesday, June 20, 2012 3:54:26 PM UTC-4, jnunemaker wrote:
>> 1) 6GB of RAM. Just upped to 8GB of RAM and still failed. Latest stable
>> ubuntu.
>> 2) data size is like 54GB. Index size is ~34GB.
>> 3) Disk is not saturated. All the swap gets used too. It just died again,
>> so I couldn't free -m, but the kernel log shows free swap as nothing so I'm
>> assuming it is burning that as well.
>> It seems to fail at a similar point, right around data file 47 and is
>> always during index building.
>> The service is analytics, so our active set is relatively small compared
>> to all the data/index size.
>> Most data is partitioned a collection per month as well, so only the
>> latest collections actually receive writes. It does not appear to be
>> getting to this point yet. Seems to be dying when it gets to a collection a
>> few months back, or at least that is what is in the log.
>> On Wednesday, June 20, 2012 at 12:41 PM, Sid wrote:
>> Ok. Can you post the following please :
>> i) details about the machine (RAM, platform, OS) etc.
>> ii) How big is the data size and what is the size of the indexes etc.
>> iii) Output from free -m while indexing is going on. Also is the disk
>> saturated when it happens ?
>> On Jun 20, 12:10 pm, John Nunemaker <nunema...@gmail.com> wrote:
>> Yes. 2gb.
The good news is that I managed to get one of the two new machines fully
synced last night (on roughly the fifth attempt). I'm going to try syncing
the other one tonight.
I've had this sync problem before as well. I would love to get it sorted out
so I'm not so nervous about losing machines. I can't currently do file
system snapshots, so I more or less have to do full syncs. Let me know if
you need anything else from me. Happy to help.
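Earlier in the thread the node was OOM-killed before `free -m` could be captured during index building. One way to have that data on hand next time is to log it continuously while the sync runs. This is a minimal sketch, assuming a Linux host where /proc/meminfo is readable; the function names and the log path are hypothetical and not part of any MongoDB tooling.

```python
import time


def parse_meminfo(text):
    """Parse /proc/meminfo-style text into {field: integer value}.

    Most fields are reported in kB; a few (e.g. HugePages_*) are plain counts.
    """
    values = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        if sep and parts and parts[0].isdigit():
            values[name.strip()] = int(parts[0])
    return values


def format_sample(values):
    """Build one log line with free RAM and free swap in MB, similar to `free -m`."""
    mem_free_mb = values.get("MemFree", 0) // 1024
    swap_free_mb = values.get("SwapFree", 0) // 1024
    return "mem_free=%dMB swap_free=%dMB" % (mem_free_mb, swap_free_mb)


def log_memory(log_path, interval_seconds=10, samples=None):
    """Append a timestamped sample every interval.

    With samples=None this loops until the process is stopped.
    """
    taken = 0
    while samples is None or taken < samples:
        with open("/proc/meminfo") as f:
            values = parse_meminfo(f.read())
        stamp = time.strftime("%Y-%m-%d %H:%M:%S")
        with open(log_path, "a") as out:
            out.write("%s %s\n" % (stamp, format_sample(values)))
        taken += 1
        time.sleep(interval_seconds)
```

Calling `log_memory("/tmp/resync-mem.log")` (a hypothetical path) in a separate terminal before starting the resync leaves a record of free RAM and swap right up to the point the OOM killer fires. Each sample is written and closed immediately, so the last lines survive even if the machine goes down.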
On Mon, Jun 25, 2012 at 10:13 AM, Sid <siddharth.si...@10gen.com> wrote:
> Thanks for filing the ticket along with the logs. Will look into it and
> update the relevant ticket accordingly. Much thanks for reporting this to
> us.
> On Friday, June 22, 2012 2:03:59 PM UTC-4, jnunemaker wrote:
>> Nah, swap space didn't really help. It made it to the last data file once
>> without swap.
What version are you using, Dave? There has been a fix since 2.0.7 (https://jira.mongodb.org/browse/SERVER-6414) which is described as 'much better for memory consumption and performance'. It is related to the issue discussed earlier in this thread.
On Friday, September 28, 2012 9:54:30 PM UTC+1, David K Storrs wrote:
> Did you ever find an answer? We are having the same issue.
> Dave
On Oct 5, 6:19 am, Gianfranco <gianfra...@10gen.com> wrote:
> What version are you using Dave?
Gianfranco,
Pardon the long lag time. We have resolved this now after much
beating of heads. It finally turned out that we had several problems
going on:
- There may have been a piece of faulty hardware on the secondary we
were using; it would reboot randomly when Mongo had issues. After a
hardware swap, this issue stopped; all data synced but indexes were
not built.
- We took the machine out of the replica set (RS), built the indexes
manually, and then added it back. That probably solved it.
- Additionally, the week before my post we upgraded from 2.0 to 2.2,
and one of the config servers was missed; it was still running 2.0
which was keeping the cluster metadata read-only and preventing the
balancer from running. Not related to the replica set issue, but
annoying and it confused the issue with the RS for a time.
Oh, I forgot to add: we were also using a stock CentOS install,
which had a 1024 file handle limit and an ext3 filesystem. After reading
http://www.mongodb.org/display/DOCS/Production+Notes we raised the limit
to 8k, switched to ext4, and applied some of the other tweaks described
there. That was probably the biggest piece.
Dave
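The 1024 file handle limit mentioned above can be checked before starting mongod. This is a minimal sketch, assuming a Unix host and that the check is run as the same user and from the same shell or init environment that launches mongod, since the limit reported is the one for the calling process. The function names are hypothetical, and the 8192 threshold is one reading of the "8k" figure above.

```python
import resource


def limit_ok(soft_limit, minimum=8192):
    """True if the soft open-file limit is unlimited or at least `minimum`."""
    return soft_limit == resource.RLIM_INFINITY or soft_limit >= minimum


def current_nofile_limits():
    """Return the (soft, hard) open-file limits of the current process."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


def describe_nofile(minimum=8192):
    """Summarize the current limits and whether the soft limit is high enough."""
    soft, hard = current_nofile_limits()
    status = "ok" if limit_ok(soft, minimum) else "too low"
    return "open files: soft=%s hard=%s (%s, want >= %d)" % (
        soft, hard, status, minimum)
```

Printing `describe_nofile()` on a stock install like the one described would be expected to report a soft limit of 1024 as too low. The check only reads the limit; raising it is still done through the OS configuration covered in the Production Notes.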