Increase in replication time due to increase in opLog size


Gambitg

May 18, 2011, 11:43:42 PM
to mongodb-user
I did some tests and it seems that the time to replicate
increases as the size of the opLog increases.
For example, a write (with w=ALL) that takes x seconds with a small
opLog takes about 1.3x seconds with a large opLog. There is no
catch-up happening; the huge opLog alone is impacting the write speed.

Is this true in general? What is the reason, and is there a rule of
thumb for the amount of impact?

Thanks.

Eliot Horowitz

May 19, 2011, 2:06:51 AM
to mongod...@googlegroups.com
The size of the oplog shouldn't cause slowdowns like that.
Can you show how you're measuring?


Gambitg

May 19, 2011, 10:25:27 AM
to mongodb-user
See my results at: http://bit.ly/jkscTr

I am running a single thread of writes. Each write is 100KB, and I
measure the time for 10K sequential writes. I repeat this pattern until
I fill the oplog (each run is about 1GB: 100KB * 10000).
I cap my oplog at 50GB. I also cap the collection I am writing to (as
I have limited space).
It is a replica set of 3 machines and I set safe=True, w=3, i.e. the
write time is measured as the time for the write to reach all the replicas.

You can see distinctly that once the oplog reaches its 50GB cap,
the write times jump and stay high.
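For reference, here is a minimal sketch of that kind of measurement. This is not the original script: it uses the modern pymongo WriteConcern API rather than the old safe=True, w=3 driver flags, and the replica-set name "rs0" and collection names are placeholders.

import time
from pymongo import MongoClient, WriteConcern

# Connect to the replica set; "rs0" is a placeholder set name.
client = MongoClient("localhost", 27017, replicaSet="rs0")

# w=3 blocks each insert until all three members have acknowledged it,
# which matches the safe=True, w=3 setting described above.
coll = client.test.get_collection("stress",
                                  write_concern=WriteConcern(w=3))

payload = "x" * 100000           # ~100KB document body
start = time.time()
for _ in range(10000):           # one run: 10K sequential writes, ~1GB
    coll.insert_one({"payload": payload})
print("run time: %.1fs" % (time.time() - start))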

Eliot Horowitz

May 19, 2011, 10:40:38 AM
to mongod...@googlegroups.com
Can you send the code you are using for this?

Gambitg

May 19, 2011, 11:24:14 AM
to mongodb-user
The script is at: http://dl.dropbox.com/u/7258508/mongo_stress.py
Cassandra ships with a stress script; I basically took that and
modified it to work with MongoDB.
You will find some parts of the script that do not apply to MongoDB.

The options I ran the script with are: -n 10240 -t 1 -c 1 -S 100000 -d 123 -o insert -k

-n 10240: 10240 total writes.
-t 1: only one thread.
-c 1: ignore it; it does not matter for MongoDB.
-S 100000: about 100KB document size.
-d 123: ignore it; it does not matter for MongoDB.
-o insert: insert operation.
-k: continue even on errors (btw, I did not get any errors).

I run the above in a loop:

for i in `seq 1 100`; do
  python mongo_stress.py -n 10240 -t 1 -c 1 -S 100000 -d 123 -o insert -k >> op.txt
done

I captured the total time for each 1GB run (i.e. each 10240 writes)
and plotted it.
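For what it's worth, that outer loop with per-run timing can also be sketched in Python; mongo_stress.py and its flags are the poster's, and this wrapper is illustrative only.

import subprocess
import time

# The poster's stress-script invocation, unchanged.
CMD = ["python", "mongo_stress.py", "-n", "10240", "-t", "1", "-c", "1",
       "-S", "100000", "-d", "123", "-o", "insert", "-k"]

with open("op.txt", "a") as log:
    for run in range(1, 101):
        start = time.time()
        subprocess.check_call(CMD, stdout=log)   # one ~1GB run
        print("run %3d: %.1fs" % (run, time.time() - start))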

Adam Fields

May 19, 2011, 1:48:23 PM
to mongod...@googlegroups.com

What kind of hardware did you do this test on? Which OS / mongo version?

Gambitg

May 20, 2011, 1:47:21 PM
to mongodb-user
Amazon AWS: 3 large instances in the same AZ, with EBS disks.
Linux, MongoDB 1.8.1.

Adam Fields

May 22, 2011, 12:05:52 PM
to mongod...@googlegroups.com

This is a production system running at peak capacity, so I can't do an independent stress test, but I can now confirm this anecdotally.

I had an oplog size of 100GB, and two days ago I dropped it down to 40GB. That machine is now performing noticeably better with the smaller oplog (and no other changes I'm aware of). If I had to estimate, I'd say overall performance is up by about 50%; I can only estimate because the machine's throughput metrics aren't solely dependent on write performance. The change also seems to have had no effect on whether the primary can serve the oplog to the secondary fast enough - this machine is still completely unable to participate in replication under load.

I'd say this is worth investigating further.
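For anyone checking their own deployment, the configured oplog size can be read from the local database. A minimal sketch with modern pymongo (the host and port are assumptions):

from pymongo import MongoClient

# Connect to a replica-set member; host/port are placeholders.
client = MongoClient("localhost", 27017)

# The oplog is a capped collection in the "local" database; its
# creation options include the configured maximum size in bytes.
opts = client.local["oplog.rs"].options()
print("oplog max size (bytes):", opts.get("size"))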
