How to shard an existing database?


praveenj...@gmail.com

Dec 13, 2013, 4:31:32 AM
to mongod...@googlegroups.com
Hi All,

I have started working with sharding in MongoDB and have already posted a few questions here. What I still don't understand is the purpose of the config server when we shard.

I already have a database, and I would like to shard a particular collection named "test" into shard1 (A-K) and shard2 (L-Z). I have tried many times but am unable to shard the existing database. I have already posted a question about it, but I still don't understand how to configure it.

Please share your suggestions.


Thanks,

Praveen Jeganathan.



Asya Kamsky

Dec 15, 2013, 4:08:42 AM
to mongodb-user
I recommend you find a sharding tutorial and follow it exactly step-by-step.

The configdb lives on the config servers and it's where the information about how your data is partitioned and distributed across the shards is kept.   If you didn't have that information, mongos processes wouldn't know which shard to go to for which data.
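
To make the routing idea concrete, here is a minimal sketch in plain JavaScript. It is not MongoDB's actual implementation: the chunk table, its boundaries, the shard names, and the `findShard` helper are all made up for illustration. In a real cluster this metadata lives in the config database and mongos reads it from there.

```javascript
// Hypothetical chunk table, similar in spirit to what the config servers keep.
// Each chunk covers shard key values in [min, max); null stands for "unbounded".
const chunks = [
  { min: null, max: "E", shard: "shard2" },
  { min: "E", max: "I", shard: "shard3" },
  { min: "I", max: null, shard: "shard1" },
];

// Hypothetical router lookup: find the chunk whose range contains the key
// and return the shard that owns that chunk.
function findShard(chunkTable, key) {
  for (const chunk of chunkTable) {
    const aboveMin = chunk.min === null || key >= chunk.min;
    const belowMax = chunk.max === null || key < chunk.max;
    if (aboveMin && belowMax) {
      return chunk.shard;
    }
  }
  throw new Error("no chunk covers key " + key);
}

console.log(findShard(chunks, "Eff, GS")); // shard3
console.log(findShard(chunks, "Able"));    // shard2
```

Without a table like this, a router would have no way to decide which shard to ask, which is the point made above.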

Asya



--
--
You received this message because you are subscribed to the Google
Groups "mongodb-user" group.
To post to this group, send email to mongod...@googlegroups.com
To unsubscribe from this group, send email to
mongodb-user...@googlegroups.com
See also the IRC channel -- freenode.net#mongodb
 

praveenj...@gmail.com

Dec 18, 2013, 4:29:36 AM
to mongod...@googlegroups.com
Thanks @Asya. I went through a few more tutorials and have made some progress. Here is what I have done:

1. Actual database on localhost:30000 (used as shard1)
2. Another mongod on localhost:30001 (shard2)
3. Another mongod on localhost:30002 (shard3)

I added one config server and a router (mongos) pointing to the configdb.

Now, in the router:
1. mongo admin
2. sh.addShard("localhost:30000") through sh.addShard("localhost:30002")   // added 3 shards successfully
3. sh.enableSharding("myDatabasename in shard1")   // enabled successfully
4. sh.shardCollection(...)   // sharded the collection successfully

I checked each shard and can see that the data is split in ascending order, like this:

shard key starting with A-D in shard2 as chunk 1
shard key starting with E-H in shard3 as chunk 1
shard key starting with I-O in shard1 as chunk 1
shard key starting with P in shard1 as chunk 2   // the data only goes up to P


I connected to shard1 and inserted a new record with a shard key starting with E. It was stored in shard1 itself, whereas it should have gone to shard3.

When I need to insert a new record, which mongo instance should I insert it into?

Thanks
Praveen Jeganathan.

Asya Kamsky

Dec 19, 2013, 2:16:22 AM
to mongodb-user
You should only communicate with a sharded cluster through 'mongos'.

That means you do NOT insert anything into shard1 or shard3 - you only insert into mongos and it then figures out where the record needs to go.

Asya




praveenj...@gmail.com

Dec 19, 2013, 4:17:10 AM
to mongod...@googlegroups.com
Yes @Asya. I tried the same insert through mongos, but I'm getting an error like this:

error preparing documents for insert :: caused by :: tried to insert object with no valid shard key for { aff: 1.0 } : { _id: ObjectId('52b28a305bbcf6ab

aff is my shard key. I ran db.Col.insert({....., "aff": "Ell, SA",...}) on this collection and got the error above.

This is a sample of that collection Col.

{
"data": {
        ..............
"aff": "Effy, DA",
             ..............
}
} 

This is how I sharded the collection: sh.shardCollection("db.Col", {aff: 1}).


Thank you,
Praveen

Asya Kamsky

Dec 19, 2013, 10:05:12 AM
to mongodb-user
If you are getting that error, it means that a record you are trying to insert does *NOT* have the shard key set.
Are you sure the record you show is the one that is getting the error? Can you try to insert it through mongos and show the full attempt, from the insert to the error (with the full document visible)?

Asya




praveenj...@gmail.com

Dec 20, 2013, 2:17:05 AM
to mongod...@googlegroups.com
Thanks @Asya for those pointers.  Here is my full insert command.
 
db.BaseEvent.insert({
    "data": {
        "date": "12/17/2011 3:23:21 PM",
        "Status": "New",
        "guid": "2b6f0cc904d137be2e1730235f5664094b831186",
        "Email": "cit...@gmail.com",
        "user": "Joe",
        "loc": "Intersection of State and Main",
        "lat": "43.012658",
        "lon": "-87.837492",                
        "affi": "Eff, GS",
        "urls": [
            {                
                "url": "https://host.com/img1.jpg"
            },
            {                
                "url": "https://host.com/audio.mp3"
            }
        ]
    }
})

 Below is the error message:

error preparing documents for insert :: caused by :: tried to insert object with
 no valid shard key for { aff: 1.0 } : { _id: ObjectId('52b3edac0b30c055
7c514e36'), data: { date: "12/17/2011 3:23:21 PM", Status: "New", guid: "2b6f0cc
904d137be2e1730235f5664094b831186", Email: "cit...@gmail.com", user: "Joe", loc
: "Intersection of State and Main", lat: "43.012658", lon: "-87.837492", affilia
tion: "Erie, PA", urls: [ { url: "https://host.com/img1.jpg" }, { url: "https://

This is how I sharded the collection: sh.shardCollection("db.Col", {aff: 1}).
Read operations work, but I'm stuck on write operations as described above.

Please shed some light on this issue.

Thanks,
Praveen

praveenj...@gmail.com

Dec 20, 2013, 2:23:56 AM
to mongod...@googlegroups.com
Sorry @Asya, a small spelling mistake:

db.BaseEvent.insert({
    "data": {
        "date": "12/17/2011 3:23:21 PM",
        "Status": "New",
        "guid": "2b6f0cc904d137be2e1730235f5664094b831186",
        "Email": "cit...@gmail.com",
        "user": "Joe",
        "loc": "Intersection of State and Main",
        "lat": "43.012658",
        "lon": "-87.837492",                
        "aff": "Eff, GS",                       //remove i
        "urls": [
            {                
                "url": "https://host.com/img1.jpg"
            },
            {                
                "url": "https://host.com/audio.mp3"
            }
        ]
    }
})
------------------------------------------------------------------------------------------------
In another attempt, I changed the above insert to:

db.BaseEvent.insert({
        "date": "12/17/2011 3:23:21 PM",           //Remove the data: {}
        "Status": "New",
        "guid": "2b6f0cc904d137be2e1730235f5664094b831186",
        "Email": "cit...@gmail.com",
        "user": "Joe",
        "loc": "Intersection of State and Main",
        "lat": "43.012658",
        "lon": "-87.837492",                
        "aff": "Eff, GS",
        "urls": [
            {                
                "url": "https://host.com/img1.jpg"
            },
            {                
                "url": "https://host.com/audio.mp3"
            }
        ]
})

This inserts the data into shard3 as expected.

But this is not the right way. I hope this makes it easier for you to suggest something.

Thanks,
Praveen.

Lefyer Li

Dec 20, 2013, 3:22:22 AM
to mongod...@googlegroups.com
Hi,

Are you spelling the field name for the shard key correctly? In your error message it is "affiliation", not "aff":
affiliation: "Erie, PA"


Best Regards,
Feifei



praveenj...@gmail.com

Dec 20, 2013, 4:19:50 AM
to mongod...@googlegroups.com
@Lefyer Oh, I pasted the wrong error message. I'm experimenting with 2 different databases, which is the reason for the mix-up.

In one database I had "affiliation" as the shard key, whereas in the other I used a shorter version, "aff". That is the reason for the confusion.

sh.shardCollection("db.Col", {aff: 1})
----------------------------------------------------------------------
db.BaseEvent.insert({
        "date": "12/17/2011 3:23:21 PM",           //Remove the data: {}
        "Status": "New",
        "guid": "2b6f0cc904d137be2e1730235f5664094b831186",
        "Email": "cit...@gmail.com",
        "user": "Joe",
        "loc": "Intersection of State and Main",
        "lat": "43.012658",
        "lon": "-87.837492",                
        "aff": "Eff, GS",
        "urls": [
            {                
                "url": "https://host.com/img1.jpg"
            },
            {                
                "url": "https://host.com/audio.mp3"
            }
        ]
})
---------------------------------------------------------------------------------------------------------------
error preparing documents for insert :: caused by :: tried to insert object with
 no valid shard key for { aff: 1.0 } : { _id: ObjectId('52b3edac0b30c055
7c514e36'), data: { date: "12/17/2011 3:23:21 PM", Status: "New", guid: "2b6f0cc
904d137be2e1730235f5664094b831186", Email: "cit...@gmail.com", user: "Joe", loc
: "Intersection of State and Main", lat: "43.012658", lon: "-87.837492", aff: 
"Eff, GS", urls: [ { url: "https://host.com/img1.jpg" }, { url: "https://


It is not working :(. Sorry for the confusion.

Thanks 
Praveen 

praveenj...@gmail.com

Dec 21, 2013, 5:42:54 AM
to mongod...@googlegroups.com
@Asya, got it! I have found the problem :). Actually, it is not a problem at all :(. As I already said, I was unable to insert while keeping aff inside the data object. Before the insert, the data is processed by the Java Play framework, and only then is it inserted.

Sorry for wasting your time :(

Thanks and Regards,
Praveen.

Asya Kamsky

Dec 21, 2013, 10:15:11 AM
to mongodb-user
That's okay - if the document is in the form { Data: { ... } } then you can make the shard key "Data.aff" but then you have to make sure that this is the form of your documents always - and I recommend not switching between "aff" and "affiliation", that also causes confusion.
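
A rough sketch of why the nested form matters, in plain JavaScript rather than actual MongoDB code. The `shardKeyValue` helper and the two sample documents are hypothetical; the helper just walks a dotted path, the way a dotted field name addresses an embedded field.

```javascript
// Hypothetical helper: resolve a dotted field path such as "data.aff"
// against a document. Returns undefined when any step of the path is missing.
function shardKeyValue(doc, path) {
  let value = doc;
  for (const part of path.split(".")) {
    if (value === null || typeof value !== "object" || !(part in value)) {
      return undefined;
    }
    value = value[part];
  }
  return value;
}

// Shaped like the document in the error message: everything sits under "data".
const wrapped = { data: { user: "Joe", aff: "Eff, GS" } };
// Shaped like the insert that worked: "aff" is a top-level field.
const flat = { user: "Joe", aff: "Eff, GS" };

console.log(shardKeyValue(wrapped, "aff"));      // undefined: no top-level "aff"
console.log(shardKeyValue(wrapped, "data.aff")); // Eff, GS
console.log(shardKeyValue(flat, "aff"));         // Eff, GS
```

For documents shaped like the ones posted earlier in this thread (lowercase "data"), the path would be "data.aff" rather than "Data.aff", since field names are case-sensitive.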

Lastly, I'm not certain that this looks like a very good shard key. What if there is a very large number of documents associated with a single value of "aff"? That chunk would then not be splittable, and the auto-balancing process will have more trouble, as jumbo chunks (those bigger than chunkSize, 64MB by default) can't be moved/migrated to another shard.
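
To illustrate the concern, here is a small sketch in plain JavaScript; it is not anything MongoDB itself runs. It assumes a chunk limit counted in documents (the real limit is in bytes), and the `unsplittableValues` helper and the sample data are invented for the example. A chunk can only be split between two different shard key values, so any single value with more documents than the limit ends up in a chunk that cannot be split.

```javascript
// Hypothetical check: count documents per shard key value and return the
// values whose documents alone would exceed the (assumed) per-chunk limit.
function unsplittableValues(docs, keyField, maxDocsPerChunk) {
  const counts = new Map();
  for (const doc of docs) {
    const key = doc[keyField];
    counts.set(key, (counts.get(key) || 0) + 1);
  }
  const result = [];
  for (const [key, count] of counts) {
    if (count > maxDocsPerChunk) {
      result.push(key);
    }
  }
  return result;
}

// Invented sample: four documents share one "aff" value.
const sample = [
  { aff: "Eff, GS" }, { aff: "Eff, GS" }, { aff: "Eff, GS" }, { aff: "Eff, GS" },
  { aff: "Ell, SA" }, { aff: "Ess" },
];

// With an assumed limit of 3 documents per chunk, only "Eff, GS" is reported.
console.log(unsplittableValues(sample, "aff", 3));
```

Running something like this against a sample of the real data would give a first impression of whether a single "aff" value dominates.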

Asya




praveenj...@gmail.com

Dec 23, 2013, 6:12:22 AM
to mongod...@googlegroups.com
@Asya, you're right about the shard key. When I tried sharding with 2 instances (previously I was using 3 shard servers), it took a long time to split.

{
    "date": "12/17/2011 3:23:21 PM",
    "Status": "New",
    "guid": "2b6f0cc904d137be2e1730235f5664094b831186",
    "Email": "cit...@gmail.com",
    "user": "Joe",
    "loc": "Intersection of State and Main",
    "lat": "43.012658",
    "lon": "-87.837492",
    "affiliation": "Ess"
}

These are the mandatory fields in the JSON, so I chose affiliation (I have also created an index on it). I searched and found that a shard key can use two fields. Kindly advise on how I should choose the shard key.

Thank you,
Praveen.

Asya Kamsky

Dec 23, 2013, 11:08:48 AM
to mongodb-user
There is no way to choose a good shard key without understanding the data distribution and the read and write patterns.
There are many articles on the subject, see:


If you go through the sections on that page, you will have a much better idea of how to choose a good shard key.

Asya




Anbu Cheeralan

Feb 21, 2014, 7:38:16 PM
to mongod...@googlegroups.com
Praveen - What is the Mongo module you use with Play 2 for Java? 