Slow-down for a large inserts-only job


Zack Shoylev

May 8, 2012, 5:17:30 PM
to mongodb-user
The case:
32-core server running 32 mongod shards, a config server, and a
mongos. 300GB RAM and a large raid disk system (with very high
throughput).
Parallel mongoimport jobs start at about 100k inserts/sec total,
but quickly slow down to between 0 and 5k/sec.

Logs show a lot of
Tue May 8 13:47:46 [conn21] warning: could have autosplit on
collection: test.test1 but: splitVector command failed: { errmsg:
"need to specify the desired max chunk size (maxChunkSize or
maxChunkSizeBytes)", ok: 0.0 }
and slow inserts:
Tue May 8 13:47:50 [conn22] insert test.test1 1320ms
Tue May 8 13:47:50 [conn31] insert test.test1 1423ms

I have chunkSize set to 20000, and 32 chunks (1 per shard) with fully
distributed splitting of data.

I need a sustained minimum of 100k inserts/sec, but 300k+ would be
preferable.

My questions are:
What's the deal with the splitVector? I am running 2.0.4 and made sure
to restart everything after setting the chunkSize. Is this what's
causing the slow-down?
If not, what could be causing the slow-down? CPU usage is low, and so
is memory usage; only disk activity is high. Is mongodb using a "safe
mode" by default (flushing to disk)?

Scott Hernandez

May 8, 2012, 5:29:34 PM
to mongod...@googlegroups.com
Can you run mongostat --discover against a mongos and iostat -xm 2 on
each primary to collect stats? Please post these to
gist/pastie/pastebin/etc.

What is your sharded collection's shard key, and are the values you
are importing for the shard key the same or different?

What does sh.status() show?

Have you followed the best practices and used the suggested
configurations? http://www.mongodb.org/display/DOCS/Production+Notes

Zack Shoylev

May 9, 2012, 5:57:55 PM
to mongodb-user
http://pastiebin.com/?page=p&id=4faae52ecbce6

The key is a random UUID such as ad35665a-4942-45c1-1b1c-c5d01a4cc169
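
For anyone wanting to reproduce the "1 chunk per shard" layout with a key like this, here is a minimal Python sketch of how evenly spaced split points could be computed. It assumes the shard key is stored as a lowercase hex UUID string like the one above. The function name and the 4-character prefix length are made up for illustration, and the sketch only computes boundary values; it does not talk to MongoDB.

```python
def uuid_split_points(num_chunks: int, prefix_len: int = 4) -> list[str]:
    """Return num_chunks - 1 evenly spaced hex prefixes usable as split points."""
    space = 16 ** prefix_len  # number of distinct hex prefixes of this length
    if not 1 <= num_chunks <= space:
        raise ValueError("num_chunks must be between 1 and 16**prefix_len")
    # For fixed-width lowercase hex, string order matches numeric order,
    # so each prefix sorts before every UUID string that starts with it.
    return [
        format(i * space // num_chunks, f"0{prefix_len}x")
        for i in range(1, num_chunks)
    ]


points = uuid_split_points(32)
print(len(points), points[:3], points[-1])
# 31 ['0800', '1000', '1800'] f800
```

With random UUIDs each of the 32 resulting ranges should receive roughly the same share of inserts, which is what makes pre-splitting worthwhile for this kind of key.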

I did go over the best practices. However, for now, I am trying to do
some testing on a large server. It seems that to get inserts faster
than 20k/sec I *must* use multiple shards running on the same server
(because of the write lock).

I also noticed that before I started splitting manually and increased
the chunk size, auto-splitting was ridiculously slow.

I don't think it's fragmentation either; I only see 30+ extents for
some of the journal files.


Zack Shoylev

May 9, 2012, 7:31:18 PM
to mongodb-user
I have also tried:

Turning off the balancer
Reducing the shards to 8
Turning off durability (--nojournal for the shards)

With these modifications, I still experience a slowdown after about
half a minute or less.

Zack Shoylev

May 11, 2012, 1:12:24 PM
to mongodb-user
Alright, I have made some progress.
First, it seems there is some kind of numeric overflow when specifying
large chunk sizes (such as a chunkSize of 20000). As this is set in
megabytes, the overflow happens because this value is converted to
bytes for the splitVector call. However this did not seem to be
causing a slow-down.
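
To illustrate the arithmetic, here is a small Python sketch. It assumes the byte count ends up in a signed 32-bit integer, which is my guess at the mechanism and not something I have confirmed in the mongos source. The helper names are made up for the example.

```python
INT32_MAX = 2**31 - 1  # largest value a signed 32-bit integer can hold


def chunk_size_bytes(chunk_size_mb: int) -> int:
    """Convert a chunkSize given in megabytes to bytes."""
    return chunk_size_mb * 1024 * 1024


def wrap_int32(value: int) -> int:
    """Simulate two's-complement wraparound of a signed 32-bit integer."""
    return (value + 2**31) % 2**32 - 2**31


requested = chunk_size_bytes(20000)
print(requested)               # 20971520000
print(requested > INT32_MAX)   # True: does not fit in 32 bits
print(wrap_int32(requested))   # -503316480 after wraparound

# Largest chunkSize (in MB) whose byte count still fits (assumption: int32).
largest_safe_mb = INT32_MAX // (1024 * 1024)
print(largest_safe_mb)         # 2047
```

Under that assumption, any chunkSize of 2048 or more would wrap around when converted to bytes.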

I seem to have traced one slow-down to the balancer. It occurs while
the balancer holds its distributed lock, it seems. For example:

Thu May 10 17:33:55 [Balancer] distributed lock 'balancer/mongo:27017:1336695564:1804289383' acquired, ts : 4fac5e7384265a7e635f8cbb
Thu May 10 17:33:57 [Balancer] distributed lock 'balancer/mongo:27017:1336695564:1804289383' unlocked.

During those 2 seconds my inserts drop to 0, which is a real problem
because under heavy inserts this seems to happen often.
Disabling the balancer fixed it.
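
To check how often and how long the balancer lock is held, a rough Python sketch like the one below could scan a mongos log for these acquire/unlock pairs. It assumes the log format shown above, with each entry on a single line (the two sample lines are the ones from this post, re-joined where the mail client wrapped them). The function name is made up, and the year is taken from the date of this post because the log lines carry none.

```python
import re
from datetime import datetime

# Matches e.g. "Thu May 10 17:33:55 [Balancer] distributed lock '...' acquired"
LOCK_EVENT = re.compile(
    r"^\w{3} (\w{3} +\d+ \d\d:\d\d:\d\d) \[Balancer\] "
    r"distributed lock .*?(acquired|unlocked)"
)


def balancer_lock_holds(lines, year=2012):
    """Return the number of seconds each balancer lock was held."""
    holds = []
    acquired_at = None
    for line in lines:
        match = LOCK_EVENT.search(line)
        if not match:
            continue  # not a balancer lock event
        stamp = datetime.strptime(
            f"{year} {match.group(1)}", "%Y %b %d %H:%M:%S"
        )
        if match.group(2) == "acquired":
            acquired_at = stamp
        elif acquired_at is not None:
            holds.append((stamp - acquired_at).total_seconds())
            acquired_at = None
    return holds


sample = [
    "Thu May 10 17:33:55 [Balancer] distributed lock "
    "'balancer/mongo:27017:1336695564:1804289383' acquired, "
    "ts : 4fac5e7384265a7e635f8cbb",
    "Thu May 10 17:33:57 [Balancer] distributed lock "
    "'balancer/mongo:27017:1336695564:1804289383' unlocked.",
]
print(balancer_lock_holds(sample))  # [2.0]
```

Lining these intervals up against the insert rates from mongostat would show whether every stall coincides with a lock hold or only some of them.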

However, after a few minutes, my rates still drop to almost 0, and I
will have to trace that. That slowdown happens because mongodb pushes
a lot of small writes to disk (why?).