Sharding stress test: some insertions failed (err:10429 setShardVersion failed)


Shi Shei

Apr 7, 2011, 12:58:51 PM
to mongodb-user
I'm encountering the following error when inserting a lot of documents
by several threads in parallel:

err:10429 setShardVersion failed! { "oldVersion" : { "t" : 51000 , "i" : 0 }, "assertion" : "assertion s/d_state.cpp:501", "errmsg" : "db assertion failure", "ok" : 0 }

My environment:
- 3 shards each one running on a dedicated physical server
- 3 config servers, all running on the same physical server (different
ports, though)
- 1 router running on a dedicated physical server
- mongodb-linux-x86_64-1.8.0 running on every server
- java driver 2.5.3 used by the stress application
- each server has 24 cores and 48 GB RAM

The stress test application inserts documents into one sharded
collection. There are no indexes. _id is uniquely generated by the
stress application. Each document is about 1KB. 50 threads are
hammering mongo. From time to time I get the error above. I can
reproduce it on nearly every run, though the timing varies.
Any hints or suggestions?
Thanks!

Gaetan Voyer-Perrault

Apr 7, 2011, 2:15:43 PM
to mongod...@googlegroups.com
There were a couple of issues with the mongoS that were fixed in 1.8.1.

Would you be able to retry your tests with 1.8.1?

If the error recurs, please file a ticket in JIRA. (jira.mongodb.org)
You will need to log into JIRA with your account.
If you do not have an account, you can create one for free.

======

Additionally, it looks like you are performing some form of stress testing.

It looks like you're only running one mongoS process. Based on past experience, sharded configs tend to do better with more routers (mongos processes).

You may get better throughput by running more.

- Gates




Gaetan Voyer-Perrault

Apr 7, 2011, 5:34:34 PM
to mongod...@googlegroups.com
Also, do you have any steps you can share to help us reproduce the issue you're seeing?

Are you hammering with "random" data?
Are there some scripts that could be run in the lab to reproduce?


Shi Shei

Apr 7, 2011, 6:46:37 PM
to mongodb-user
Thank you for your fast reply! So, I'll try v1.8.1.

Yes, it's a form of stress testing and I'm aware that I could get more
throughput when using more than 1 mongos but I wanted to test first
the worst case when only 1 mongos would survive.

I'm inserting documents of strictly the same document schema. No
fields are added or removed or altered during the stress test. However
field values change for every document. For simplicity, most field
values are random doubles or timestamps (long).
I can post an example of my test document tomorrow.
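
To make the "same schema, different values" setup concrete, here is a minimal Java sketch of a document generator. It is an illustration only and not the code from the stress project: the class and method names are hypothetical, the field names are a small subset borrowed from the sample document posted later in this thread, and the value types are guesses. A plain Map stands in for the driver's document class so the sketch has no dependencies.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Random;

// Hypothetical sketch: not taken from the original stress project.
class OfferDocSketch {

    /** Builds a document whose keys never change; only the values are random. */
    static Map<String, Object> randomOffer(long id, Random rnd) {
        Map<String, Object> doc = new LinkedHashMap<>();
        doc.put("_id", id);                    // shard key, supplied by the caller
        doc.put("rebuild", rnd.nextBoolean());
        doc.put("shopId", rnd.nextLong());     // assumed to be a long
        doc.put("rank", rnd.nextInt());

        // Nested sub-document with a fixed set of keys, as in the sample.
        Map<String, Object> price = new LinkedHashMap<>();
        price.put("2", rnd.nextDouble());
        price.put("1", rnd.nextDouble());
        doc.put("price", price);

        doc.put("ean", "ean: " + rnd.nextLong());
        return doc;
    }

    public static void main(String[] args) {
        Random rnd = new Random();
        System.out.println(randomOffer(100L, rnd));
        System.out.println(randomOffer(101L, rnd));
    }
}
```

Both printed documents have the same keys in the same order; only the values differ.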

I think it's easy to reproduce:
- set up a sharded system (one sharded collection) as mentioned above
- shard key is _id
- no indexes, except _id
- I'm running 50 threads within one JVM
- each thread uses its own unique key range to create globally unique
_ids
- keys are well distributed among the entire key range, so all shards
are equally stressed
- each thread has its own connection to mongo
- the test stops when 50 million documents are inserted
- sometimes the first error arrives within some seconds, sometimes it
may take some minutes but it's very rare that the test terminates
without any error
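
The threading and key scheme in the steps above can be sketched in Java roughly as follows. This is a sketch under assumptions, not the actual stress project: the thread does not say how the per-thread key ranges are laid out, so the sketch assumes an interleaved scheme in which thread t owns every id congruent to t modulo the thread count. The names KeyRangeSketch, Sink, idFor and run are hypothetical, and the Sink interface stands in for the real insert through mongos.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch: none of these names come from the original project.
class KeyRangeSketch {

    /** Stand-in for the real insert; a real test would write to mongos here. */
    interface Sink {
        void insert(long id);
    }

    /** Assumed layout: thread t owns every id congruent to t modulo threadCount. */
    static long idFor(int threadIndex, long sequence, int threadCount) {
        return sequence * threadCount + threadIndex;
    }

    /** Inserts ids 0..totalDocs-1 exactly once and returns the number of failed inserts. */
    static long run(int threadCount, long totalDocs, Sink sink) {
        ExecutorService pool = Executors.newFixedThreadPool(threadCount);
        AtomicLong failures = new AtomicLong();
        for (int t = 0; t < threadCount; t++) {
            final int threadIndex = t;
            pool.execute(() -> {
                for (long seq = 0; ; seq++) {
                    long id = idFor(threadIndex, seq, threadCount);
                    if (id >= totalDocs) {
                        break;
                    }
                    try {
                        sink.insert(id);
                    } catch (RuntimeException e) {
                        // Count the error and keep going, as the stress app does.
                        failures.incrementAndGet();
                    }
                }
            });
        }
        pool.shutdown();
        try {
            if (!pool.awaitTermination(1, TimeUnit.HOURS)) {
                throw new IllegalStateException("workers did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for workers", e);
        }
        return failures.get();
    }

    public static void main(String[] args) {
        // An in-memory set replaces the sharded collection for this dry run.
        Set<Long> seen = ConcurrentHashMap.newKeySet();
        long failures = run(50, 100_000, seen::add);
        System.out.println("inserted=" + seen.size() + " failures=" + failures);
    }
}
```

With an interleaved layout every thread touches the whole key range, which matches the "all shards are equally stressed" point; a contiguous block per thread would also give unique ids but would spread load differently.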

If you need more information or log traces etc. just let me know.
Thanks!



Gaetan Voyer-Perrault

Apr 7, 2011, 8:22:28 PM
to mongod...@googlegroups.com
How much Java code is involved?
Is it possible to post the code itself? (say via a gist?)

- Gates

Shi Shei

Apr 8, 2011, 4:30:07 AM
to mongodb-user
Well, I installed mongodb-linux-x86_64-1.8.1 on all servers.
The error message has slightly changed:

err:assertion s/strategy_shard.cpp:175

I'll try to clean up the Java code and put it online asap.

My documents have the following structure. Only the values change for
every document:

{
  "_id": 100,
  "han": "",
  "mappedManufacturerId": 8.8241608370483e+18,
  "rebuild": true,
  "location": "ANP",
  "clipboardId": -8.5317594529609e+17,
  "shippingCosts": {
    "2": {
      "grp": 0.87988732138365,
      "etx": 0.34920080715597
    },
    "1": {
      "cab": 0.87478302949456,
      "cod": 0.97115492919134
    }
  },
  "downloadable": false,
  "rank": -951318255,
  "idealoProductIds": [
    -4.1126762472936e+18,
    -5.2936760236935e+18,
    -8.009793774518e+17
  ],
  "lastEditionTool": "last edition tool: 292459749548907687",
  "shopId": -7.8590571752773e+18,
  "categoryString": "shop category name",
  "description": {
    "2": "two",
    "1": "one"
  },
  "clickCount": {
    "2": 657634885,
    "1": 656406163
  },
  "bundle": true,
  "flagsLocked": true,
  "offerTS": "Mon, 28 Mar 2011 11:35:26 +0200",
  "offerTitle": {
    "2": "two",
    "1": "one"
  },
  "contract": false,
  "voucherCode": {
    "2": "two",
    "1": "one"
  },
  "brandSearchtext": "manufacturer text: -1350481523494814372",
  "bokey": "aRandomBokey_-6692115854028462425",
  "ignoredInWatchedProducts": false,
  "ean": "ean: -5121694881552227892",
  "quantity": -1747899725,
  "price": {
    "2": 0.46634895461405,
    "1": 0.37555621918319
  },
  "expireDate": "Mon, 28 Mar 2011 11:35:26 +0200",
  "url": {
    "2": "two",
    "1": "one"
  },
  "used": true,
  "basePrice": {
    "2": "two",
    "1": "one"
  },
  "whyFoundInfo": "product crawl reason: 7556396522554281396",
  "clipboardChangeTime": "Mon, 28 Mar 2011 11:35:26 +0200",
  "shippingCostInfo": {
    "2": "two",
    "1": "one"
  },
  "asin": "asin: 4588715988413285996",
  "lastEditor": "last editor: -7559178330682038611",
  "deliveryString": {
    "2": "two",
    "1": "one"
  },
  "version": 0,
  "currency": {
    "2": "two",
    "1": "one"
  },
  "shopDataFP": "finger print: -8901687323528589466",
  "lastShopDataChange": "Mon, 28 Mar 2011 11:35:26 +0200",
  "categoryBokey": "shop category bokey: -6524082929670383501",
  "iamsureType": 1420315045,
  "lastClipboardId": 3.5221401261158e+18,
  "lastModified": "Mon, 28 Mar 2011 11:35:26 +0200",
  "priceDeviation": {
    "2": 0.3014484490605,
    "1": 0.49124945556585
  },
  "onlineProductIds": [
    7.1966577424882e+18,
    5.1616023316237e+17
  ],
  "bulk": false,
  "smallPicture": {
    "2": "two",
    "1": "one"
  },
  "clipboardUser": "clipboard user: 2119358643810916732",
  "clipboardChangeEditor": "last clipboard editor: -7398359804329819924",
  "clusterId": -7.2559779366796e+18,
  "deliveryStatus": {
    "2": "MEDIUM",
    "1": "LONG"
  }
}



Shi Shei

Apr 8, 2011, 10:43:21 AM
to mongodb-user
I've built a standalone project which I can share with you. As I'm
already in e-mail contact with Gerry Treacy, I suggest that I send him
the stress test project so you can easily reproduce the error.

Here is the error trace that I get running mongo 1.8.0:

2011-04-08 16:10:15,231 INFO [main] (Stresser.java:80) - Status: Offers done: 46921 All offers/sec: 9384 Last offers/sec: 9384
2011-04-08 16:10:20,232 INFO [main] (Stresser.java:80) - Status: Offers done: 136699 All offers/sec: 13669 Last offers/sec: 17955
2011-04-08 16:10:23,838 ERROR [Thread-100] (OfferStorageMongo.java:77) - Error while inserting new offer 49009406
java.lang.IllegalArgumentException: insertNewOffer LastError: 10429 setShardVersion failed! { "oldVersion" : { "t" : 16000 , "i" : 3 }, "assertion" : "assertion s/d_state.cpp:501", "errmsg" : "db assertion failure", "ok" : 0 }
    at de.idealo.mongo.offerstore.OfferStorageMongo.insertNewOffer(OfferStorageMongo.java:72)
    at de.idealo.mongo.stress.UpdateWorker.batchInsert(UpdateWorker.java:105)
    at de.idealo.mongo.stress.UpdateWorker.run(UpdateWorker.java:354)
    at java.lang.Thread.run(Thread.java:662)
2011-04-08 16:10:25,866 INFO [main] (Stresser.java:80) - Status: Offers done: 272542 All offers/sec: 18169 Last offers/sec: 27168

The app continues after the error. Mongo is not dead either.
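
For anyone reading the status lines in the 1.8.0 trace: the figures are consistent with "All offers/sec" being the running total divided by whole elapsed seconds, and "Last offers/sec" being the offers added since the previous status line divided by a 5-second reporting interval, both truncated to integers. That is an inference from the logged numbers, not something confirmed by the Stresser source, and the class and method names below are made up for illustration.

```java
// Hypothetical sketch: reproduces the logged rates, not the real Stresser code.
class ThroughputSketch {

    /** Overall rate: total offers divided by whole elapsed seconds (integer division). */
    static long overallRate(long totalDone, long elapsedSeconds) {
        return totalDone / elapsedSeconds;
    }

    /** Interval rate: offers added since the previous status line, per second. */
    static long intervalRate(long totalDone, long previousDone, long intervalSeconds) {
        return (totalDone - previousDone) / intervalSeconds;
    }

    public static void main(String[] args) {
        // Second status line of the 1.8.0 trace: 136699 offers after about 10 seconds,
        // 46921 offers at the previous line, assumed 5-second interval.
        System.out.println(overallRate(136699, 10));        // prints 13669
        System.out.println(intervalRate(136699, 46921, 5)); // prints 17955
    }
}
```

The same arithmetic gives 18169 and 27168 for the status line logged after the error, so the failed insert does not show up as a dip in throughput.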

Running mongo 1.8.1, I get this one:

2011-04-08 16:21:20,942 INFO [main] (Stresser.java:80) - Status: Offers done: 3772478 All offers/sec: 20391 Last offers/sec: 23749
2011-04-08 16:21:22,126 ERROR [Thread-6] (OfferStorageMongo.java:77) - Error while inserting new offer 2195401
java.lang.IllegalArgumentException: insertNewOffer LastError: assertion s/strategy_shard.cpp:175
    at de.idealo.mongo.offerstore.OfferStorageMongo.insertNewOffer(OfferStorageMongo.java:72)
    at de.idealo.mongo.stress.UpdateWorker.batchInsert(UpdateWorker.java:105)
    at de.idealo.mongo.stress.UpdateWorker.run(UpdateWorker.java:354)
    at java.lang.Thread.run(Thread.java:662)
2011-04-08 16:21:25,943 INFO [main] (Stresser.java:80) - Status: Offers done: 3913481 All offers/sec: 20597 Last offers/sec: 28200
2011-04-08 16:21:31,385 INFO [main] (Stresser.java:80) - Status: Offers done: 4049774 All offers/sec: 20662 Last offers/sec: 27258
2011-04-08 16:21:36,386 INFO [main] (Stresser.java:80) - Status: Offers done: 4183116 All offers/sec: 20811 Last offers/sec: 26668

Scott Hernandez

Apr 8, 2011, 10:50:19 AM
to mongod...@googlegroups.com
Please create a "community private" JIRA issue and attach the code and
synopsis/repro steps there.

Shi Shei

Apr 8, 2011, 11:11:57 AM
to mongodb-user
Done:
https://jira.mongodb.org/browse/SUPPORT-112
