We tried _id (ObjectIds) as well as our preferred keys
> - Indexing after is better than indexing beforehand
So far we have been trying to index while importing.
We can give that another try.
> - If you already preshard the data, turn the balancer off first
I would shut down config server and mongos for the import.
Is that what you mean?
> - You should break the import data in the same way that you preshard
Of course.
> and use mongoimport to load them up
> - Your data should be sorted by shard key if possible
OK
Biggest question: will it be worth it?
cheers,
Torsten
Why is that?
The ObjectIds should be quite different across the machines and so
hopefully fall into different chunks.
> - You can leave your config server and mongos up and do the import via
> mongos.
Confused - that's what I was doing before.
mongo1: shardsrv mongos 2*mongoimport configsrv
mongo2: shardsrv mongos 2*mongoimport configsrv
mongo3: shardsrv mongos 2*mongoimport configsrv
mongo4: shardsrv mongos 2*mongoimport
mongo5: shardsrv mongos 2*mongoimport
mongo6: shardsrv mongos 2*mongoimport
Or do you mean...
Splitting up the pre-sharded dataset across the nodes. Then turn off
balancing. But instead of using --dbpath use mongos? Wouldn't --dbpath
be faster? Wouldn't writes still get routed to other shards with
mongos?
> - To turn off balancer,
> > use config
> > db.settings.update({_id:"balancer"},{$set : {stopped:true}}, true)
Ah ... OK.
cheers,
Torsten
True ... but even with our preferred sharding key [user, time] it
doesn't behave much better.
> - You can use --dbpath but you have to take mongod offline.
That's fine.
> I just recommended another way without taking down mongod. As you will
> perform mongoimport split by shard key, mongos should route
> requests to one server per mongoimport.
But doesn't that depend on what chunks are configured in the config server?
> - Do you have mongostat, iostat, db.stats() during import process?
Certainly. With the current non-pre-sharded import...
- mongostat shows looong "holes" with no ops at all. I assume that's
the balancer - but not sure. Numbers were much better in the beginning
of the import.
- iostat shows quite uneven activity across the nodes.
- db.stats() we are monitoring over time. The following shows the
objects graphed:
2 main options:
- try 1.7.5
- pre-split the collection into a lot of chunks, let the balancer
move them around, then insert.
this will prevent migrates.
I would not mess with --dbpath or turning off the balancer, that's
much more complicated than you need.
1.6.5
> You should shard on user,time as you want to do.
_id was just for testing whether that improves things.
> The speed is probably because of migrations.
>
> 2 main options:
> - try 1.7.5
Anything particular why? ...or just trying the latest and greatest?
> - pre-split the collection into a lot of chunks, let the balancer
> move them around, then insert.
> this will prevent migrates.
The collection data is already split up in various files:
collectionA-1.json.gz
collectionA-2.json.gz
...
We are iterating over those files and handing them to mongoimport.
How would I "let the balancer move them around" before insert?
> I would not mess with --dbpath or turning off the balancer, that's
> much more complicated than you need.
Certainly not eager to. It would be a last resort measure.
But if we were to go down that road - would it be faster?
cheers,
Torsten
Lots of sharding improvements in 1.7.x
Though you should probably wait for 1.7.6 or use the nightly.
>> - pre-split the collection into a lot of chunks, let the balancer
>> move them around, then insert.
>> this will prevent migrates.
>
> The collection data is already split up in various files:
>
> collectionA-1.json.gz
> collectionA-2.json.gz
> ...
>
> We are iterating over those files and handing them to mongoimport.
> How would I "let the balancer move them around" before insert?
That's not quite what I meant.
I meant actual split the mongo chunks up:
http://www.mongodb.org/display/DOCS/Splitting+Chunks
So you could call split 1000 times (make sure you pick the points reasonably).
Then mongo will balance those 1000 chunks.
Once it's done, start the import again.
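To make the "call split 1000 times" step concrete, here is a minimal sketch that builds the command documents for evenly spaced split points. The namespace, the numeric `u` shard key, and the key-space bound are all assumptions for illustration; in the mongo shell each document would be passed to db.adminCommand(...).

```javascript
// Build split-command documents for numChunks evenly spaced chunks.
// ns, maxUser and the field name "u" are placeholders, not real names.
function buildSplitCommands(ns, maxUser, numChunks) {
  const commands = [];
  const step = maxUser / numChunks;
  for (let i = 1; i < numChunks; i++) {
    // Same shape as the shell command: db.adminCommand({split: ns, middle: {u: point}})
    commands.push({ split: ns, middle: { u: Math.floor(i * step) } });
  }
  return commands;
}

// e.g. 1000 chunks over an assumed keyspace of 1,000,000 users:
const cmds = buildSplitCommands("mydb.collectionA", 1000000, 1000);
// In the shell you would then loop: cmds.forEach(c => db.adminCommand(c))
```

Picking the points from the real key distribution (rather than assuming it is uniform, as this sketch does) gives more evenly sized chunks.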
>> I would not mess with --dbpath or turning off the balancer, that's
>> much more complicated than you need.
>
> Certainly not eager to. It would be a last resort measure.
> But if we were to go down that road - would it be faster?
This would not be faster than doing the pre-splitting.
That's going to give you the best results.
Let's assume I start a fresh import. The db is empty.
I have 6 machines and say 600 users.
I could do 5 splits then, on 100, 200, 300, 400, 500 which would give
me 6 segments in the key space.
The balancer then would assign those evenly to my 6 shards.
And when I now import everything should get distributed evenly without
any costly migrations.
Did I get that idea right?
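The arithmetic in that example can be sketched as a tiny helper (the function name is made up; the numbers are the ones from the example above):

```javascript
// For numShards equal segments over the key space [0, maxKey),
// you need numShards - 1 split points: maxKey/n, 2*maxKey/n, ...
function evenSplitPoints(maxKey, numShards) {
  const points = [];
  for (let i = 1; i < numShards; i++) {
    points.push((maxKey / numShards) * i);
  }
  return points;
}

// 600 users across 6 machines -> 5 splits, giving 6 segments:
console.log(evenSplitPoints(600, 6)); // [100, 200, 300, 400, 500]
```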
admin.runCommand( { split : obj.shardcollection , middle : { _id : key } } )
100 times to pre-split the key space into smaller chunks.
Looking at db.printShardingStatus() though there is only one big chunk
for every collection
{ "u" : { $minKey : 1 } } -->> { "u" : { $maxKey : 1 } } on :
shard0000 { "t" : 1000, "i" : 0 }
and all key are pointing to a single shard.
What am I missing?
(And what's the { "t" : 1000, "i" : 0 } btw)
cheers,
Torsten
shard0001 { "t" : 34000, "i" : 0 }
What's "t" and what's "i" ?
I thought that's what the balancer is for - to distribute the chunks
evenly across the nodes.
In 1.8 we've done some work to make the initial loading of a collection faster.
You may want to try with the 1.7 nightly to compare.
Now I am confused. Isn't that what the pre-sharding was for?
I know the keyspace and I evenly distributed it across the nodes.
> In 1.8 we've done some work to make the initial loading of a collection faster.
> You may want to try with the 1.7 nightly to compare.
How confident would you be to go into production with a 1.7 nightly at
this stage?
cheers,
Torsten
Now I'm confused :) Yes, that's what pre-sharding is for.
Was there an issue after you did that?
If you pre-split into a large number of chunks initially, were they
evenly distributed?
Long thread I know.
>> In 1.8 we've done some work to make the initial loading of a collection faster.
>> You may want to try with the 1.7 nightly to compare.
>
> How confident would you be to go into production with a 1.7 nightly at
> this stage?
>
> cheers,
> Torsten
>
Really depends on your comfort and how much testing you can do beforehand.
I certainly wouldn't just throw it into production without testing it first.
:)
I think where the confusion started for me was that after splitting I
still had to manually move chunks (of data that is not even imported yet).
> Was there an issue after you did that?
Still quite early in the import but so far it looks better.
> If you pre-split into a large number of chunks initially, were they
> evenly distributed?
Well, I did the distribution by splitting and assigning the chunks to
the shards - so yes :)
cheers,
Torsten
How many splits did you do?
Once there were more than 8 chunks, it should have started moving them.
Did that not happen?
How long did you wait?
Can you send the mongos logs from that period?
I created 100 splits.
> Once there were more than 8 chunks, it should have started moving them.
When exactly? After splitting ...or after I started the import?
> Did that not happen?
> How long did you wait?
Here is what I did:
I started with a fresh mongo cluster. Then did the 100 pre-splits.
Looked at the config server. Distribution of the chunks was not good.
Waited a couple of minutes (5?). Distribution of the chunks was still not good.
Then moved the chunks manually so the config server showed a good
distribution of chunks across the shards.
And then I started the import.
> Can you send the mongos logs from that period?
Sure can do ....or is the above expected behavior?
So it would have taken 25 minutes to be even.
After 5 minutes, it should have been 90/10 or so.
db.locks.find()
shows that the balancer is getting in the way.
Now I have turned off balancing
db.settings.update({_id:"balancer"},{$set : {stopped:true}}, true)
Is there a way of aborting current migration operations? I still see
the locks in the config db.
cheers,
Torsten
Are you sure your shard key is optimal?
What is the shard key?
What is the data distribution?
What order are you inserting in?
Pretty sure it's not optimal yet.
> What is the shard key?
Currently it's only user. It should better be user, time.
> What is the data distribution?
Not evenly enough I guess.
Just so I understand correctly: once the size of the data of all the
docs that belong to a certain key range becomes bigger than the defined
chunk size, the balancer will kick in, split the chunk in the
middle and transfer half of the documents onto the emptiest
shard. Is that correct?
> What order are you inserting in?
No particular order. We have many files. Within these files the shard
key will probably be monotonically increasing though.
>> Is there a way of aborting current migration operations? I still see
>> the locks in the config db.
Is there? I just want to confirm it's the migrations.
cheers,
Torsten
As for killing migrates, you can kill the "moveChunk" command on the shard.
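One way to find those operations: on each shard, db.currentOp() returns the in-progress ops, and the moveChunk ones can be picked out and passed to db.killOp(opid). A sketch of the filtering logic, run here against made-up fixture data since there is no live server:

```javascript
// Given the "inprog" array from db.currentOp(), return the opids of
// running moveChunk commands.
function findMoveChunkOpids(inprog) {
  return inprog
    .filter(op => op.query && op.query.moveChunk !== undefined)
    .map(op => op.opid);
}

// Fixture shaped roughly like db.currentOp().inprog (fields illustrative):
const inprog = [
  { opid: 12, op: "query", query: { moveChunk: "mydb.collectionA" } },
  { opid: 13, op: "insert", query: {} }
];
console.log(findMoveChunkOpids(inprog)); // [12]
// In the shell you would then call db.killOp(12) on that shard.
```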
Not quite. I have 2 imports per machine in a 3 node cluster setup.
So a total of 6 imports. They should be working mostly on different ranges.
Disk activity graphs seem to back up that assumption.
> So a couple of things you can try.
> - pre-splitting a lot more so you don't have to split during insertion
(nod)
> - running multiple import jobs at the same time from different ends
> of the ranges
see above.
Seems like we are peaking at 40,000 inserts/s until the memory is
full. When we hit the disk it seems to go down to 12,000 inserts/s.
https://skitch.com/tcurdt/rqfw4/stats
Which probably is something we could live with. I just would like to
get to the bottom of where "mongostat" shows phases of 0 inserts/s
over several seconds. Right now I am just assuming it's the
migrations.
> As for killing migrates, you can kill the "moveChunk" command on the shard.
Is that an individual process? Or how do I do that?
cheers,
Torsten
- a mongostat excerpt showing the insert pauses
- the logs of all 3 nodes
- stats.tsv which is count/objects total/per collection/per shard over time
- stats.html which is a munin dashboard
If you graph the stats.tsv you can see nicely the bump when migrations
were turned off.
The pauses are still there though. And the insert speed is again
flattening after a while.
(Which now could be because of an imbalance when comparing to the munin stats.)
My next try would be to pre-shard into more chunks (1000) and then
leave the balancer off for the import.
But the insert pauses are still concerning.
cheers,
Torsten
1) Balancer - this should be much better in 1.7.5 so may want to test
with 1.8.0 when it comes out (or 1.8.0-rc0)
2) Hitting machine ram/index limits - are you adding indexes on the
data before you insert? Do you have mongostat from the shards, or
just mongos?
When is the 1.8.0-rc0 expected to come out?
> 2) Hitting machine ram/index limits - are you adding indexes on the
> data before you insert?
Yes, this import has indexes before the insert.
The idea was that since there should be not much balancing because of
the pre-sharding, having the index during the import should not make
that much of a difference on the overall time. (Which might not have
been a good idea) Indexing while importing makes the process a bit
more predictable though.
> Do you have mongostat from the shards, or
> just mongos?
I just realized this could be the problem. It's a mongostat from one
mongos - there are 3.
But IIRC mongostat in 1.6.5 does not support a list of mongos yet.
Couple of weeks
>> 2) Hitting machine ram/index limits - are you adding indexes on the
>> data before you insert?
>
> Yes, this import has indexes before the insert.
>
> The idea was that since there should be not much balancing because of
> the pre-sharding, having the index during the import should not make
> that much of a difference on the overall time. (Which might not have
> been a good idea) Indexing while importing makes the process a bit
> more predictable though.
If the indexes or data are bigger than ram, then creating the indexes
after will be much faster.
You may want to try that.
>> Do you have mongostat from the shards, or
>> just mongos?
>
> I just realized this could be the problem. It's a mongostat from one
> mongos - there are 3.
>
> But IIRC mongostat in 1.6.5 does not support a list of mongos yet.
You can use mongostat from 1.7.5 with servers from 1.6.5
Alright ... another test just finished.
1000 splits up front and indexes last.
The import took only 6 hours. Which is fantastic! Pretty much linear
import speed.
Creating the indexes took about 4h. Still good!
But splitting up the keyspace up front (on an empty database!) took
almost 6 hours. Which is neither acceptable nor understandable. How
can moving zero data take so long? I would have expected that to be
instant.
Glad you see it that way too :)
> Do you have the log for that?
Logs are on their way.
cheers,
Torsten
Also, can you send the code that you used for pre-splitting?
On its way via email.
> Also, can you send the code that you used for pre-splitting?
https://gist.github.com/626ff7549261d83e8855
cheers,
Torsten
sent via email
cheers,
Torsten
thanks for the long-ish reply :)
> However when you have to do a massive data load - like what you are
> doing, we have to think outside the box.
That's how the idea of the manual pre-sharding came to life.
<snip/>
> In this approach, we continuously process new/changed OLTP data to
> load it into data warehouses and such,
> without trying to read the whole transactional system from the
> begining or a high watermark point every time.
Right. We do not necessarily want to do this continuously.
But we need it for iterations on the data model and one has to test
worst case recovery scenarios.
<snip/>
> Given this, let me ask you a few questions -
> - What is your requirement to complete the data load? What are the
> driving factors for that time constraint?
Please, see below.
> - Are there any other constraints? (e.g. throughput, storage space,
> hardware/network, etc.)?
Well, I guess in the real world all those constraint apply - at least somehow :)
We currently have the following objectives
- find maximal app throughput with our current hardware
- find the fastest way to fill the cluster from our json document
...from scratch
> - Can you give some idea on your document structure?
At the stage of the initial import it's really simple and flat. Along
the lines of
{
  user: 1,
  a: 1,
  b: 2,
  c: 3,
  d: 4
}
> - You mentioned using the objectid to shard your data? Is that a
> requirement?
No, we had big problems with the data and load distribution across the
shards. So we started experimenting. Using _id for sharding was one of
the tests. We really want to shard on the userid. And the last
pre-sharded test was based on the userid.
That said: our application performance dropped from 2ms per query to
200ms per query when switching from _id to userid. Which I don't fully
understand yet. We have large indexes - too large for keeping them in
all in memory. I assume that will be related. But that's for a
different thread.
> - Do you have any other application based option for a unique way to
> identify a row?
> (in other words, do you have a user/primary/application key)
Yes, we do. Most documents can be identified uniquely. There are a few
that also vary too much - we would need to use an md5 for those. We
also thought about creating a custom/non-objectid _id key for that
very reason. This means though, that quite some logic needs to be moved
into the application layer. It would be great to avoid that.
> - What are the specs on your hardware?
Mid-range servers with 32GB of RAM. Need to look up the exact specs if
of interest.
> - Is there any option to reconfigure/reallocate your hardware in a
> different way if required?
WDYM? Well, I am certain we could add more shards and improve the
situation. But I fear if we cannot support this app/data combination
with a 6 machine cluster then mongo isn't the right tool.
> - How often do you need to this kind of data load?
Ideally? Once ...but we're still testing various data models. So more
realistically - we are iterating. A turnaround of 10 hours would be
nice.
> From what you implied in your post, it seems you have already done
> it at least once, which took 24 hours.
Indeed. The last import was faster though.
pre-sharding: 6h (which in my book should take 0h)
importing: 6h
indexing: 4-5h
If only the pre-sharding (on an empty db!) would not take that long -
I would be happy.
The import and the indexing was perfectly distributed when pre-sharded.
> So is this a one time thing or a repeated thing?
In theory it's a one time thing - but this would also need to work as
a repeated thing.
> - Once you have loaded the data how do you expect to use it?
Lots of upserts and reads. Not finally sure about the ratio yet.
> - Do you know the query pattern?
Yes, we do. We are replaying a full day of user actions as fast as
possible to find the maximum throughput we can sustain.
> - Is there any subsequent append or update of the data?
> - Is the data being loaded from one or more clients or from the Mongo
> cluster?
This is only for the app itself. This thread so far focused only on
the initial import.
> So here's what I would suggest on what I know - it can be refined once
> we know more from the answers to the above questions.
> I am assuming that you
> - Select a user-defined key from your data if possible (or generate
> one if necessary).
Done.
> - Use an appropriate hashing algorithm that is able to generate a good
> distribution. I have found MD5 to give pretty good results.
This is another option we also thought about. (see above)
> - Next pre-shard your data - since we know that the key will always be
> 32 characters with each character's domain being 0-9A-F
> One algorithm can be '0xxxxx..' through '200000000..' should go to
> shard 1 and so on. Or you can do round-robin shard - by having
> '0xxxx...' to go to shard 1, '1xxxxx....' to go to shard 2, etc.
> (where xxxx = any character - so you have to create the range
> appropriately)
That's how we did the last import.
> - I would suggest creating the auxiliary indexes upfront and not in
> the end.
What do you mean by "auxiliary indexes"?
cheers,
Torsten
Just out of curiosity with no pre-splitting and using _id as sharding
key. No indexes up front.
While it started OK the performance is now terrible again.
insert query update delete getmore command mapped vsize res faults locked % idx miss % netIn netOut conn repl time
ams-db007 0 0 0 0 0 1 0m 395m 9m 0 0 0 970b 771b 4 RTR 08:58:00
ams-db008 8 0 0 0 0 1 0m 371m 8m 0 0 0 942b 771b 3 RTR 08:58:00
ams-db009 0 0 0 0 0 1 0m 234m 4m 0 0 0 62b 767b 1 RTR 08:58:00
ams-db007 0 0 0 0 0 1 0m 395m 9m 0 0 0 1k 771b 4 RTR 08:58:01
ams-db008 0 0 0 0 0 1 0m 371m 7m 0 0 0 1k 771b 3 RTR 08:58:01
ams-db009 0 0 0 0 0 1 0m 234m 4m 0 0 0 62b 767b 1 RTR 08:58:01
While the above is an exceptionally poor sample, insert performance
generally hovers at around 160 inserts/s - across the cluster.
Seems like also in 1.8 the balancing is just killing the performance.
I don't see the sharding improvements I was hoping for.
cheers,
Torsten
But even then it seems the impact of balancing is affecting the system
way too much.
I was hoping things had improved in that regard going from 1.6.5 to 1.8
Happy to share the munin graphs if you like.
Would love to ...but that's nothing I can just switch on, right? Needs
to happen in the app layer I presume.
>> You can see from the logs for example if migrates were happening and
>> how long they were taking if any were running.
Well, even if those are migrations... shouldn't they be as fast as
possible if the cluster is idle?
What I was wondering too - should one see migration activity in
mongostat? ...or is that activity kept out of those stats?
cheers,
Torsten
Yes, though we're going to add an option to do it in the server:
http://jira.mongodb.org/browse/SERVER-2001
> Well, even if those are migrations... shouldn't they be as fast as
> possible if the cluster is idle?
Yes, the question is what the bound is.
Did you check disk/io/network on both sides of the migrate?
> What I was wondering too - should one see migration activity in
> mongostat? ...or is that activity kept out of those stats?
You should see a lot of commands, but there is no explicit migrate
column, nor do the inserts show up.