I have a collection with ~165M documents and have started sharding it, but the migration is very slow.
My configuration:
First machine: mongod, mongos, mongod --configsvr
Second machine: mongod
There are ~3120 chunks, and each moveChunk takes about 350-400 secs (so moving all the chunks would take ~10 days). The speed between the shards is about 70 MiB/s. The shard key is "{key: 1, _id: 1}", where "key" is an md5 hash and "_id" is an ObjectId. An index on "{key: 1, _id: 1}" exists.
I suspected that concurrent reads/writes to the db were the cause (I use a 2.1 development build, which does not yet fully support per-database locking), but when I stopped all reads and writes to the db the speed did not increase.
Why is it so slow? And can I run "sh.moveChunk()" manually to migrate chunks in parallel?
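For reference, a manual migration is requested from the mongo shell with sh.moveChunk(namespace, query, destinationShard), which sends the "moveChunk" admin command. The sketch below only builds that command document in plain JavaScript. The namespace "mydb.mycoll", the shard name "shard0001", the key values, and the helper name buildMoveChunkCommand are placeholders for illustration, not values from this thread. Whether several such commands can usefully run at the same time is not answered in the thread.

```javascript
// Build the document for the "moveChunk" admin command.
//   ns   - full namespace, "<database>.<collection>"
//   find - query on the shard key; the chunk containing it is the one moved
//   to   - name of the destination shard
// buildMoveChunkCommand is a hypothetical helper, not part of MongoDB.
function buildMoveChunkCommand(ns, find, to) {
  if (typeof ns !== "string" || ns.indexOf(".") < 1) {
    throw new Error("ns must look like '<database>.<collection>'");
  }
  return { moveChunk: ns, find: find, to: to };
}

// Placeholder values (assumptions). In the mongo shell the _id would be
// an ObjectId rather than a plain string.
var moveCmd = buildMoveChunkCommand(
  "mydb.mycoll",
  { key: "d41d8cd98f00b204e9800998ecf8427e", _id: "placeholder-object-id" },
  "shard0001"
);

// In the mongo shell, connected to a mongos, this would be sent with:
//   db.adminCommand(moveCmd)
// which is what sh.moveChunk(ns, find, to) does for you.
console.log(JSON.stringify(moveCmd));
```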
step1 (copy indexes) - fast
step2 (delete any data already in range) took ~10 secs
step3 (initial bulk clone) took ~128 secs
step4 (do bulk of mods) - fast
step5 (wait for commit) took ~200 secs
total ~338 secs
I use a development version:
$ git describe
r2.1.0-2093-g31bdfd7
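As a rough check of the figures above, the following sketch adds up the per-step timings and estimates the total migration time, assuming chunks are moved strictly one after another. The "half the chunks" case is an assumption (that balancing two shards needs about half of the chunks moved), not something stated in the post.

```javascript
// Figures quoted in the post.
var chunks = 3120;
var secsPerChunkLow = 350;
var secsPerChunkHigh = 400;

// Per-step timings in seconds for one moveChunk; "fast" steps counted as 0.
var steps = { step1: 0, step2: 10, step3: 128, step4: 0, step5: 200 };

function sumSteps(s) {
  var total = 0;
  for (var name in s) {
    if (Object.prototype.hasOwnProperty.call(s, name)) total += s[name];
  }
  return total;
}

// Days needed to move n chunks sequentially at secs seconds each.
function daysToMove(n, secs) {
  return (n * secs) / 86400;
}

var total = sumSteps(steps);           // 10 + 128 + 200
var step5Share = steps.step5 / total;  // fraction spent waiting for commit

// Assumption: with two shards, roughly half of the chunks have to move.
var chunksToMove = chunks / 2;

console.log("seconds per chunk:", total);
console.log("step5 share:", step5Share.toFixed(2));
console.log("all chunks, days:",
  daysToMove(chunks, secsPerChunkLow).toFixed(1), "to",
  daysToMove(chunks, secsPerChunkHigh).toFixed(1));
console.log("half the chunks, days:",
  daysToMove(chunksToMove, secsPerChunkLow).toFixed(1), "to",
  daysToMove(chunksToMove, secsPerChunkHigh).toFixed(1));
```

Moving every chunk works out to roughly 12.6 to 14.4 days and moving half of them to roughly 6.3 to 7.2 days, so the "~10 days" estimate sits between the two. Step5 (wait for commit) alone accounts for about 59% of each migration.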
On Thursday, May 3, 2012 4:16:20 PM UTC+4, Azat Khuzhin wrote:
> How can it have written 529 GiB when the whole db is only 173 GiB?
> The mongo daemon has been running for about 2 days, no more
> To post to this group, send email to mongodb-user@googlegroups.com.
> To unsubscribe from this group, send email to
> mongodb-user+unsubscribe@googlegroups.com.
> For more options, visit this group at
> http://groups.google.com/group/mongodb-user?hl=en.
> To view this discussion on the web visit
> https://groups.google.com/d/msg/mongodb-user/-/2X8IxhWvpyMJ.
On Thu, May 3, 2012 at 9:17 PM, Azat Khuzhin <a3at.m...@gmail.com> wrote:
> "avgObjSize" : 680.34193653284
> $ iostat -x 2 (from the first machine, where mongos runs)
> Linux 2.6.32-5-amd64 (na4) 05/03/2012 _x86_64_ (16 CPU)
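Combining avgObjSize with the figures from the first post gives a rough idea of how much document data one chunk holds and how fast it is effectively being moved. This is a back-of-the-envelope sketch: it assumes documents are spread evenly over the chunks, uses 375 secs as the midpoint of the reported 350-400 secs, and counts document bytes only (no indexes).

```javascript
var docs = 165e6;                  // "~165m documents" (first post)
var avgObjSize = 680.34193653284;  // bytes per document (stats above)
var chunks = 3120;                 // chunk count (first post)
var secsPerChunk = 375;            // assumed midpoint of 350-400 secs
var linkMiBps = 70;                // reported speed between the shards

var MiB = 1024 * 1024;

// Document data per chunk, assuming an even spread across chunks.
var mibPerChunk = (docs * avgObjSize) / chunks / MiB;

// Effective rate of one migration versus the raw link speed.
var effectiveMiBps = mibPerChunk / secsPerChunk;
var secsAtLinkSpeed = mibPerChunk / linkMiBps;

console.log("MiB per chunk:", mibPerChunk.toFixed(1));
console.log("effective MiB/s:", effectiveMiBps.toFixed(3));
console.log("seconds per chunk at link speed:", secsAtLinkSpeed.toFixed(2));
```

Under these assumptions a chunk holds about 34 MiB and moves at under 0.1 MiB/s, while the link could carry it in well under a second, which suggests the time is going somewhere other than the network transfer.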
I'm sorry for pasting the whole iostat output in my previous message.
I ran some tests on an Amazon m1.small instance, which has the following
configuration:
1.7 GB memory
1 EC2 Compute Unit (1 virtual core with 1 EC2 Compute Unit)
160 GB instance storage
64-bit platform
The results are below. There were no concurrent reads/writes in any of the tests.
By id on 40 million rows, sharded by "_id":
step3 ~ X, 7, 28, 7, 7 secs
step5 ~ 53, 97, 70, 43, 55 secs
By id on 1 million rows, sharded by "_id":
step3 ~ 8, 9, 9 secs
step5 ~ 12, 15, 13 secs
iostat was not run
By id on 1 million rows, sharded by "{key: 1, _id: 1}", where key is an md5
hash (so a string of 32 chars):
step3 ~ 14, 8, 7 secs
step5 ~ 36, 36, 40 secs
iostat was not run
As we can see, moveChunk time increases when the key is "complicated" or when
there is a "large" number of documents (though I don't really think that
40 million documents is huge).
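To make the three runs easier to compare, the sketch below averages the step3 and step5 timings per test. The numbers are copied from the results above; the unknown "X" sample is simply left out, and the sample counts are small, so the averages are only indicative.

```javascript
// Step timings in seconds, copied from the test results.
// The unknown "X" sample in the first test is omitted.
var results = [
  { test: '40M rows, shard key "_id"',
    step3: [7, 28, 7, 7], step5: [53, 97, 70, 43, 55] },
  { test: '1M rows, shard key "_id"',
    step3: [8, 9, 9], step5: [12, 15, 13] },
  { test: '1M rows, shard key "{key: 1, _id: 1}"',
    step3: [14, 8, 7], step5: [36, 36, 40] }
];

function mean(xs) {
  var sum = 0;
  for (var i = 0; i < xs.length; i++) sum += xs[i];
  return sum / xs.length;
}

var summary = results.map(function (r) {
  return { test: r.test, step3: mean(r.step3), step5: mean(r.step5) };
});

summary.forEach(function (s) {
  console.log(s.test + ": step3 ~" + s.step3.toFixed(1) +
    " s, step5 ~" + s.step5.toFixed(1) + " s");
});
```

Step3 stays in the same range across the three tests (about 9 to 12 secs on average), while step5 averages roughly 64, 13, and 37 secs, so the difference between the runs is almost entirely in step5 (wait for commit).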
> And we can see here, moveChunk time increase if key is "complicated" or we
> have a "large" (I don`t really think that 40 million of documents is huge)
> number of documents
I suspect this is just to do with working set - with a larger number
of documents, the documents to move are more scattered on disk
(there's no guarantee they're ordered like the shard key index). With
a larger key size, the index will be larger (2x?) and so less likely
to be in memory. The filesystem will also have a significant impact -
what filesystem are you using?
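To put rough numbers on the key-size point, the sketch below counts only the raw key bytes per index entry: 12 bytes for an ObjectId and 32 bytes for an md5 hex string. It ignores all per-entry and B-tree overhead, so real indexes are larger and the ratio between them shrinks once that overhead is added. Treat it as an order-of-magnitude aid under those assumptions, not a measurement.

```javascript
var OBJECT_ID_BYTES = 12; // an ObjectId is 12 bytes
var MD5_HEX_BYTES = 32;   // md5 as a hex string, one byte per character

// Raw key bytes for an index with one entry per document.
// Assumption: no per-entry or B-tree overhead is counted.
function rawKeyBytes(docCount, bytesPerKey) {
  return docCount * bytesPerKey;
}

function toMiB(bytes) {
  return bytes / (1024 * 1024);
}

var cases = [
  { label: "1M docs, {_id: 1}", docs: 1e6, keyBytes: OBJECT_ID_BYTES },
  { label: "1M docs, {key: 1, _id: 1}", docs: 1e6,
    keyBytes: MD5_HEX_BYTES + OBJECT_ID_BYTES },
  { label: "40M docs, {_id: 1}", docs: 40e6, keyBytes: OBJECT_ID_BYTES },
  { label: "165M docs, {key: 1, _id: 1}", docs: 165e6,
    keyBytes: MD5_HEX_BYTES + OBJECT_ID_BYTES }
];

cases.forEach(function (c) {
  console.log(c.label + ": ~" +
    toMiB(rawKeyBytes(c.docs, c.keyBytes)).toFixed(0) + " MiB of raw keys");
});

// Ratio of compound-key bytes to _id-only key bytes (raw keys only).
var rawRatio = (MD5_HEX_BYTES + OBJECT_ID_BYTES) / OBJECT_ID_BYTES;
console.log("raw key ratio:", rawRatio.toFixed(2));
```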
On May 5, 6:24 am, Azat Khuzhin <a3at.m...@gmail.com> wrote:
> On Fri, May 4, 2012 at 12:18 PM, Azat Khuzhin <a3at.m...@gmail.com> wrote:
> > Avg time to *moveChunk* grows up to *450*
Filesystem: ext3 (RAID10 of 6 SAS HDDs), but I don't think the filesystem
can have a significant impact, because we don't have a huge number of files
or anything like that.
Anyway, if it is always this slow on a large number of documents, it can't
be used in production.
Also, when I cancelled the sharding I ran "removeshard", which was also very slow:
https://groups.google.com/forum/?fromgroups#!topic/mongodb-user/R7XS1...
But when I stopped that process and migrated the data myself (using
find()/insert()/remove()), it was much faster than this command (~20-50x,
or even more than 50x), so I guess that if I ran queries to migrate all my
data instead of using "moveChunk", it would also be ~20-50x faster (or more
than 50x).
I understand that "moveChunk" does some additional work, but surely not that much.
And can you explain why it writes so many bytes to disk? It seems to
rewrite the whole database every week or something like that.
On Thu, May 10, 2012 at 6:50 AM, Greg Studer <g...@10gen.com> wrote:
> > And we can see here, moveChunk time increase if key is "complicated" or we
> > have a "large" (I don`t really think that 40 million of documents is huge)
> > number of documents
> I suspect this is just to do with working set - with a larger number
> of documents, the documents to move are more scattered on disk
> (there's no guarantee they're ordered like the shard key index). With
> a larger key size, the index will be larger (2x?) and so less likely
> to be in memory. The filesystem will also have a significant impact -
> what filesystem are you using?
You should move to ext4 and/or xfs - there's definitely impact,
particularly when migrating data to new shards, which almost always is
something you do because of data growth. It's in the
www.mongodb.org/display/DOCS/Production+Notes.
Not sure what happened earlier with removeShard, but it seems like
your migrations got hung up on something else - hard to say though
without logs there.
> I understand that "moveChunk" do some other job, but not so huge
In general, we try to be as low-impact as possible with moveChunk,
yielding whenever possible, which sacrifices speed.
> And can you explain me why ot write so many bytes to disk? Seems like it
> rewrite all database each week or something like this
Hard to say without knowing more about your traffic and schema. Multi-
key indices, frequent updates to indexed fields, journaling, etc, can
cause lots of repeated I/O. You can also change your disk flush
parameters if you'd like with syncdelay, if you have lots of updates
to the same data - this may also impact step5 of the migrate.
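As an illustration of the syncdelay suggestion: syncdelay is the interval, in seconds, between background flushes of the data files to disk (60 by default). It can be given at startup with mongod --syncdelay, or changed at runtime with the setParameter admin command. The sketch below only builds that command document in plain JavaScript. The value 30 and the helper name buildSyncDelayCommand are arbitrary choices for illustration, not a recommendation from this thread.

```javascript
// Build the setParameter command document that changes syncdelay.
// buildSyncDelayCommand is a hypothetical helper, not part of MongoDB.
function buildSyncDelayCommand(seconds) {
  if (typeof seconds !== "number" || !isFinite(seconds) || seconds <= 0) {
    throw new Error("seconds must be a positive number");
  }
  return { setParameter: 1, syncdelay: seconds };
}

// 30 is an arbitrary illustration value (assumption).
var syncCmd = buildSyncDelayCommand(30);

// In the mongo shell, connected directly to a mongod, this would be sent with:
//   db.adminCommand(syncCmd)
// The equivalent startup option is:
//   mongod --syncdelay 30
console.log(JSON.stringify(syncCmd));
```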
On May 10, 2:54 am, Azat Khuzhin <a3at.m...@gmail.com> wrote:
> Filesystem: ext3(raid10 from 6 sas hdd), but I don`t think that filesystem
> can have a significant impact, because we don`t have huge number of files,
> or what else
> Any way if such speed on huge number of documents will be always, you can`t
> use it in production
> Also when I canceling sharding, I run "removeshard", which also very slow https://groups.google.com/forum/?fromgroups#!topic/mongodb-user/R7XS1...
> But when I stop this process, and migrate data myself (using
> find()/insert()/remove()) it was much faster than this command (~20-50x
> times, or greater than 50x), so I guess if I run query to migrate all my
> data, instead of "moveChunk", it will be faster ~20-50x times (or greater
> than 50x)
> I understand that "moveChunk" do some other job, but not so huge
> And can you explain me why ot write so many bytes to disk? Seems like it
> rewrite all database each week or something like this
Azat Khuzhin.
Sent from phone.
On May 10, 2012 9:29 PM, "Greg Studer" <g...@10gen.com> wrote:
> You should move to ext4 and/or xfs - there's definitely impact,
> particularly when migrating data to new shards, which almost always is
> something you do because of data growth. It's in the
> www.mongodb.org/display/DOCS/Production+Notes.
I'll check within the hour which filesystem the Amazon EC2 instances use.
My dedicated server uses ext3.
> Not sure what happened earlier with removeShard, but it seems like
> your migrations got hung up on something else - hard to say though
> without logs there.
> > I understand that "moveChunk" do some other job, but not so huge
> In general, we try to be as low-impact as possible with moveChunk,
> yielding whenever possible, which sacrifices speed.
Is it possible to add a config option for this?
> > And can you explain me why ot write so many bytes to disk? Seems like it
> > rewrite all database each week or something like this
> Hard to say without knowing more about your traffic and schema. Multi-
> key indices, frequent updates to indexed fields, journaling, etc, can
> cause lots of repeated I/O. You can also change your disk flush
> parameters if you'd like with syncdelay, if you have lots of updates
There are many inserts, but thanks for the hint, I'll try it.
> to the same data - this may also impact step5 of the migrate.
As I wrote before, I tried stopping all other queries, and it didn't help.
> On May 10, 2:54 am, Azat Khuzhin <a3at.m...@gmail.com> wrote:
> > Filesystem: ext3 (RAID 10 across 6 SAS HDDs), but I don't think the
> > filesystem can have a significant impact, because we don't have a huge
> > number of files or anything like that.
> > Anyway, if migration is always this slow on a large number of documents,
> > you can't use it in production.
> > Also, when I was canceling sharding I ran "removeshard", which was also
> > very slow: https://groups.google.com/forum/?fromgroups#!topic/mongodb-user/R7XS1...
> > But when I stopped that process and migrated the data myself (using
> > find()/insert()/remove()), it was much faster than this command (~20-50x,
> > or even more than 50x), so I guess that if I run queries to migrate all my
> > data instead of using "moveChunk", it will be ~20-50x faster (or more).
> > I understand that "moveChunk" does some extra work, but not that much.
> > And can you explain why it writes so many bytes to disk? It seems to
> > rewrite the whole database each week or so.
> > On Thu, May 10, 2012 at 6:50 AM, Greg Studer <g...@10gen.com> wrote:
> > > > And we can see here that moveChunk time increases if the key is
> > > > "complicated" or we have a "large" number of documents (I don't really
> > > > think that 40 million documents is huge).
> > > I suspect this is just to do with working set - with a larger number
> > > of documents, the documents to move are more scattered on disk
> > > (there's no guarantee they're ordered like the shard key index). With
> > > a larger key size, the index will be larger (2x?) and so less likely
> > > to be in memory. The filesystem will also have a significant impact -
> > > what filesystem are you using?
> > > On May 5, 6:24 am, Azat Khuzhin <a3at.m...@gmail.com> wrote:
> > > > I'm sorry for putting the whole iostat output in the previous message.
> > > > I *ran some tests* on an Amazon *m1.small* instance, which has the
> > > > following configuration:
> > > > 1.7 GB memory
> > > > 1 EC2 Compute Unit (1 virtual core with 1 EC2 Compute Unit)
> > > > 160 GB instance storage
> > > > 64-bit platform
> > > > It gave the following results:
> > > > *No concurrent reads/writes in any test*
> > > > *By id on 40 million rows, sharded by "_id":*
> > > > step3 ~ X, 7, 28, 7, 7 secs
> > > > step5 ~ 53, 97, 70, 43, 55 secs
> > > > *By id on 1 million rows, sharded by "_id":*
> > > > step3 ~ 8, 9, 9 secs
> > > > step5 ~ 12, 15, 13 secs
> > > > iostat not run
> > > > *By id on 1 million rows, sharded by "{key: 1, _id: 1}", where key is an
> > > > md5 hash (so a 32-character string):*
> > > > step3 ~ 14, 8, 7 secs
> > > > step5 ~ 36, 36, 40 secs
> > > > iostat not run
> > > > So we can see here that moveChunk time increases if the key is
> > > > "complicated" or we have a "large" number of documents (I don't really
> > > > think that 40 million documents is huge).
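To make the comparison in the m1.small tests above easier to read, here is a small sketch that averages the reported step3 and step5 timings per scenario. The numbers are copied from the quoted message; the run marked "X" has no value and is left out, and the scenario labels are my own shorthand.

```javascript
// Step timings in seconds, copied from the m1.small tests quoted above.
// The "X" run for step3 in the 40M test has no value and is omitted.
const runs = [
  { name: "40M docs, shard key _id", step3: [7, 28, 7, 7], step5: [53, 97, 70, 43, 55] },
  { name: "1M docs, shard key _id", step3: [8, 9, 9], step5: [12, 15, 13] },
  { name: "1M docs, shard key {key, _id}", step3: [14, 8, 7], step5: [36, 36, 40] },
];

// Arithmetic mean of a non-empty array of numbers.
function mean(values) {
  return values.reduce((sum, v) => sum + v, 0) / values.length;
}

// Average step3 and step5 for one scenario.
function summarize(run) {
  return { name: run.name, step3: mean(run.step3), step5: mean(run.step5) };
}

for (const row of runs.map(summarize)) {
  console.log(row.name + ": step3 ~" + row.step3.toFixed(1) + "s, step5 ~" + row.step5.toFixed(1) + "s");
}
```

On these samples step5 averages about 63.6 s, 13.3 s and 37.3 s respectively, so the compound md5 key costs roughly 2.8x and the 40M collection roughly 4.8x the step5 time of the 1M "_id" case. With only 3-5 runs per scenario this is an indication, not a benchmark.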
On Thu, May 10, 2012 at 9:44 PM, Azat Khuzhin <a3at.m...@gmail.com> wrote:
> Azat Khuzhin.
> Send from phone.
> On May 10, 2012 9:29 PM, "Greg Studer" <g...@10gen.com> wrote:
> > You should move to ext4 and/or xfs - there's definitely impact,
> > particularly when migrating data to new shards, which almost always is
> > something you do because of data growth. It's in the
> > www.mongodb.org/display/DOCS/Production+Notes.
> I'll check in hour what fs in Amazon ec2 instances
> Ext 3 in my dedicated server
On Amazon EC2 it is ext3 too.
Yes, I know that ext4 has extents, so preallocation is faster with it.
But I remember that when I copied all the data to my current dedicated server
from another one, the replication took ~24 hours.
So I don't think that preallocation is what slows down the migration.
BTW, the chunk size is 64 MB, so even if preallocation were slow, it would not
slow down every moveChunk command, because the preallocation size is ~2 GB.
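The argument above can be checked with simple arithmetic. A rough sketch, assuming every migrated chunk is a full 64 MB and each preallocated data file is 2 GB (index and padding overhead ignored); the function names and the 60-second preallocation time in the comment are hypothetical, not measured values.

```javascript
// Sizes taken from the message above: 64 MB chunks, ~2 GB preallocated data files.
const CHUNK_MB = 64;
const DATAFILE_MB = 2048;

// How many full-size chunk migrations fit into one preallocated data file.
function chunksPerDataFile(chunkMb, dataFileMb) {
  return Math.floor(dataFileMb / chunkMb);
}

// Preallocation cost spread evenly over those migrations (seconds per moveChunk).
function amortizedPreallocSeconds(preallocSeconds, chunkMb, dataFileMb) {
  return preallocSeconds / chunksPerDataFile(chunkMb, dataFileMb);
}

console.log(chunksPerDataFile(CHUNK_MB, DATAFILE_MB)); // 32
// Hypothetical: if preallocating one file took 60 s, that is under 2 s per moveChunk.
console.log(amortizedPreallocSeconds(60, CHUNK_MB, DATAFILE_MB));
```

So under these assumptions a new data file is needed only about once per 32 migrated chunks, which cannot by itself explain 350-400 seconds on every moveChunk.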
> Is it possible to add some config option for this stuff
Yeah, we're working on this, though ideally this would be as seamless
as possible
One way forward here would be to get a timed iostat for step 5 -
correlate that with the log timing, to see definitively if it's io or
something else causing the problem. If it's not I/O, it potentially
could be config server negotiation taking a long time for some reason
- though we'd need to see the full logs of all involved shards and the
config server to track deeper.
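The correlation described above could be sketched roughly as follows: take timestamped iostat samples (for example from sysstat's `iostat -t`), keep only those that fall inside the step 5 window read from the mongod log, and average %iowait over them. The data layout, field names and numbers here are assumptions made for illustration only; they are not measurements from this thread.

```javascript
// Hypothetical input shape: each sample has a timestamp (ms since epoch)
// and the %iowait value reported by iostat at that moment.
function samplesInWindow(samples, startMs, endMs) {
  return samples.filter((s) => s.timeMs >= startMs && s.timeMs <= endMs);
}

// Mean %iowait over the given samples, or null if none were captured.
function averageIowait(samples) {
  if (samples.length === 0) return null;
  return samples.reduce((sum, s) => sum + s.iowait, 0) / samples.length;
}

// Illustrative numbers only.
const samples = [
  { timeMs: 1000, iowait: 0.5 },
  { timeMs: 2000, iowait: 1.0 },
  { timeMs: 3000, iowait: 0.5 },
  { timeMs: 4000, iowait: 12.0 },
];
// Start and end of step 5 as read from the log (also made up here).
const step5 = { startMs: 1500, endMs: 3500 };

const inStep5 = samplesInWindow(samples, step5.startMs, step5.endMs);
console.log("avg %iowait during step 5:", averageIowait(inStep5));
```

If the average stays low for the whole step 5 window, disk I/O is unlikely to be the bottleneck and the config server negotiation becomes the next thing to look at.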
On May 10, 2:03 pm, Azat Khuzhin <a3at.m...@gmail.com> wrote:
> On Thu, May 10, 2012 at 9:44 PM, Azat Khuzhin <a3at.m...@gmail.com> wrote:
> > Azat Khuzhin.
> > Send from phone.
> > On May 10, 2012 9:29 PM, "Greg Studer" <g...@10gen.com> wrote:
> > > You should move to ext4 and/or xfs - there's definitely impact,
> > > particularly when migrating data to new shards, which almost always is
> > > something you do because of data growth. It's in the
> > >www.mongodb.org/display/DOCS/Production+Notes.
> > I'll check in hour what fs in Amazon ec2 instances
> > Ext 3 in my dedicated server
> On amazon ec2 - ext3 too
> Yes I know that ext4 have extends, so prealocation works more fast with it,
> But I remember that when I copy all data to my dedicated server(current),
> from another the replication took ~ 24 hours
> So I don't think that prealocation is slowdown migration
> BTW chunk size is 64m, and if prealocation slowdowns than It won't slowdown
> at every moveChunk command, because prealocation size ~ 2 G
If you want to track this deeper, feel free to open a SUPPORT or
SERVER ticket with logs during the migration periods - think there's
too much context here to handle via the newsgroup. We're happy to
help you track this down further - but we'd need to start correlating
the ops in the logs and seeing where the delay actually is detected,
which is probably easier to do via a SUPPORT ticket.
On May 12, 5:45 am, Azat Khuzhin <a3at.m...@gmail.com> wrote:
> I didn't have the possibility to measure iostat.
> But I can say that iowait wasn't greater than 1 during the 2 days the
> migration was running.
> But when I run an MR job, it can go up to 15.
> So I think the migration is not using all the resources, unlike the MR job
> in this example.
> On Sat, May 12, 2012 at 1:38 AM, Greg Studer <g...@10gen.com> wrote:
> > > Is it possible to add some config option for this stuff
> > Yeah, we're working on this, though ideally this would be as seamless
> > as possible
> > One way forward here would be to get a timed iostat for step 5 -
> > correlate that with the log timing, to see definitively if it's io or
> > something else causing the problem. If it's not I/O, it potentially
> > could be config server negotiation taking a long time for some reason
> > - though we'd need to see the full logs of all involved shards and the
> > config server to track deeper.
On Tue, May 15, 2012 at 6:45 AM, Greg Studer <g...@10gen.com> wrote:
> If you want to track this deeper, feel free to open a SUPPORT or
> SERVER ticket with logs during the migration periods - think there's
> too much context here to handle via the newsgroup. We're happy to
> help you track this down further - but we'd need to start correlating
> the ops in the logs and seeing where the delay actually is detected,
> which is probably easier to do via a SUPPORT ticket.
> On May 12, 5:45 am, Azat Khuzhin <a3at.m...@gmail.com> wrote:
> > I didn't have the possibility to measure iostat.
> > But I can say that iowait wasn't greater than 1 during the 2 days the
> > migration was running.
> > But when I run an MR job, it can go up to 15.
> > So I think the migration is not using all the resources, unlike the MR job
> > in this example.