Message from discussion
Slow-down for a large inserts-only job
Date: Fri, 11 May 2012 10:12:24 -0700 (PDT)
Subject: Re: Slow-down for a large inserts-only job
From: Zack Shoylev <zack.shoy...@gmail.com>
To: mongodb-user <mongodb-user@googlegroups.com>
Alright, I have made some progress.
First, it seems there is some kind of numeric overflow when specifying
large chunk sizes (such as a chunkSize of 20000). chunkSize is set in
megabytes, and the overflow happens when this value is converted to
bytes for the splitVector call. However, this did not seem to be
causing the slow-down.
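
To illustrate the arithmetic behind the overflow described above, here
is a minimal sketch. It assumes the byte count is held in a signed
32-bit integer; the thread does not confirm which type mongos actually
uses, and the helper names (to_int32, chunk_size_bytes_int32) are
hypothetical and not part of MongoDB.

```python
# Assumption: the chunk size in bytes is stored in a signed 32-bit integer.
INT32_MAX = 2**31 - 1
BYTES_PER_MB = 1024 * 1024


def to_int32(value: int) -> int:
    """Wrap an arbitrary integer the way a signed 32-bit int would."""
    value &= 0xFFFFFFFF
    return value - 2**32 if value >= 2**31 else value


def chunk_size_bytes_int32(chunk_size_mb: int) -> int:
    """Convert megabytes to bytes, then truncate to 32 bits (hypothetical)."""
    return to_int32(chunk_size_mb * BYTES_PER_MB)


# Largest chunkSize (in MB) whose byte count still fits in 32 signed bits.
largest_safe_mb = INT32_MAX // BYTES_PER_MB

if __name__ == "__main__":
    print(20000 * BYTES_PER_MB)           # 20971520000, the true byte count
    print(chunk_size_bytes_int32(20000))  # -503316480 after 32-bit wraparound
    print(largest_safe_mb)                # 2047
```

Under that assumption, any chunkSize above 2047 MB would wrap, and a
chunkSize of 20000 would turn into a negative byte count.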
One slow-down I seem to have traced to the Balancer. The slow-down
seems to occur while the balancer holds its distributed lock. For
example:
Thu May 10 17:33:55 [Balancer] distributed lock 'balancer/mongo:
27017:1336695564:1804289383' acquired, ts : 4fac5e7384265a7e635f8cbb
Thu May 10 17:33:57 [Balancer] distributed lock 'balancer/mongo:
27017:1336695564:1804289383' unlocked.
During those 2 seconds my inserts drop to 0, which is very upsetting
because under heavy inserts this seems to happen often.
Fixed by disabling the balancer.
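
To check how long the balancer holds its lock over a whole log file, a
small script can pair up the "acquired" and "unlocked" lines. This is a
rough sketch: the regular expression is written against the two log
lines quoted above (with their line wraps rejoined), the year is an
assumption taken from the date of this thread because the log
timestamps carry none, and lock_hold_times is a hypothetical helper,
not a MongoDB tool.

```python
import re
from datetime import datetime

# Matches lines such as:
#   Thu May 10 17:33:55 [Balancer] distributed lock '<name>' acquired, ts : ...
LOCK_LINE = re.compile(
    r"^(?P<stamp>\w{3} \w{3} +\d+ \d{2}:\d{2}:\d{2}) \[Balancer\] "
    r"distributed lock '(?P<name>[^']+)' (?P<event>acquired|unlocked)"
)


def lock_hold_times(lines, year=2012):
    """Return (acquired_at, seconds_held) for each acquire/unlock pair.

    The year is an assumption: the log timestamps do not include one.
    """
    acquired = {}
    holds = []
    for line in lines:
        m = LOCK_LINE.match(line)
        if not m:
            continue
        when = datetime.strptime(f"{year} {m['stamp']}", "%Y %a %b %d %H:%M:%S")
        if m["event"] == "acquired":
            acquired[m["name"]] = when
        elif m["name"] in acquired:
            start = acquired.pop(m["name"])
            holds.append((start, (when - start).total_seconds()))
    return holds


# The two log lines from the message, with the wrapped halves rejoined.
SAMPLE = [
    "Thu May 10 17:33:55 [Balancer] distributed lock "
    "'balancer/mongo:27017:1336695564:1804289383' acquired, "
    "ts : 4fac5e7384265a7e635f8cbb",
    "Thu May 10 17:33:57 [Balancer] distributed lock "
    "'balancer/mongo:27017:1336695564:1804289383' unlocked.",
]

if __name__ == "__main__":
    for start, seconds in lock_hold_times(SAMPLE):
        print(start.time(), seconds)
```

Running the same function over the full mongos log and comparing the
hold times with the insert rate reported by mongostat would show
whether every stall lines up with a balancer round.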
However, after a few minutes, my rates still drop to almost 0, and I
will have to trace that. The slowdown happens because mongodb pushes
a lot of small writes to disk (why?)
On May 9, 4:31 pm, Zack Shoylev <zack.shoy...@gmail.com> wrote:
> I have also tried:
>
> Turning off the balancer
> Reducing the shards to 8
> Turning off durability (--nojournal for the shards)
>
> With these modifications, I still experience a slowdown after about
> half a minute or less.
>
> On May 9, 2:57 pm, Zack Shoylev <zack.shoy...@gmail.com> wrote:
>
> > http://pastiebin.com/?page=p&id=4faae52ecbce6
>
> > The key is a random UUID such as ad35665a-4942-45c1-1b1c-c5d01a4cc169
>
> > I did go over the best practices. However, for now, I am trying to do
> > some testing on a large server. It seems that to get inserts faster
> > than 20k/sec I *must* use multiple shards running on the same server
> > (because of the write lock).
>
> > I also noticed that before I started splitting manually and increased
> > the chunksize, auto-splitting was ridiculously slow.
>
> > I don't think it's fragmentation either; I only see 30+ extents for
> > some of the journal files.
>
> > On May 8, 2:29 pm, Scott Hernandez <scotthernan...@gmail.com> wrote:
>
> > > Can you run mongostat --discover against a mongos and iostat -xm 2 on
> > > each primary to collect stats? Please post these to
> > > gist/pastie/pastebin/etc.
>
> > > What is your sharded collection's shard key, and are the values you
> > > are importing for the shard key the same or different?
>
> > > What does sh.status() show?
>
> > > Have you followed the best practices and used the suggested
> > > configurations? http://www.mongodb.org/display/DOCS/Production+Notes
>
> > > On Tue, May 8, 2012 at 2:17 PM, Zack Shoylev <zack.shoy...@gmail.com> wrote:
> > > > The case:
> > > > 32-core server running 32 mongod shards, a config server, and a
> > > > mongos. 300GB RAM and a large raid disk system (with very high
> > > > throughput).
> > > > Parallel mongoimport jobs start at about 100k inserts/sec total,
> > > > but quickly slow down to 0 to 5k/sec.
>
> > > > Logs show a lot of
> > > > Tue May  8 13:47:46 [conn21] warning: could have autosplit on
> > > > collection: test.test1 but: splitVector command failed: { errmsg:
> > > > "need to specify the desired maxchunksize(maxChunkSize or
> > > > maxChunkSizeBytes)", ok: 0.0 }
> > > > and slow inserts:
> > > > Tue May  8 13:47:50 [conn22] insert test.test1 1320ms
> > > > Tue May  8 13:47:50 [conn31] insert test.test1 1423ms
>
> > > > I have chunkSize set to 20000, and 32 chunks (1 per shard) with fully
> > > > distributed splitting of data.
>
> > > > I need 100k min consistent inserts, but 300k+ would be preferable.
>
> > > > My questions are:
> > > > What's the deal with the splitVector? I am running 2.0.4 and made sure
> > > > to restart everything after setting the chunkSize. Is this what's
> > > > causing the slow-down?
> > > > If not, what could be causing the slow-down? CPU usage is low, and so
> > > > is memory, only disk activity is high. Is mongodb using a "safe mode"
> > > > by default? (flushing to disk?)
>