Using mongo as a performance data store, what's a good shard key?
Chris Matta  
From: Chris Matta <cma...@gmail.com>
Date: Thu, 18 Oct 2012 21:40:48 -0700 (PDT)
Local: Fri, Oct 19 2012 12:40 am
Subject: Using mongo as a performance data store, what's a good shard key?

(Cross-posting my question from Stack Overflow here; I hope that's OK:
http://stackoverflow.com/questions/12961873/whats-a-good-mongodb-shar...
)

I'm storing performance metrics in the following schema:

{
        "_id" : ObjectId("5069d68700a2934015000000"),
        "port_name" : "CL1-A",
        "metric" : 340,
        "port_number" : "0",
        "datetime" : ISODate("2012-09-30T13:44:00Z"),
        "array_serial" : "12345"
}

Each array has 128 ports. The port names are CL1-A, CL2-A, CL3-A, etc., and
they correspond to port_numbers 0, 1, 2, etc. I'm storing per-minute data
for each metric, with one collection per metric. I'd like to shard the
collections, but I'm having trouble choosing a proper shard key and
working out a unique index strategy.
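
For a sense of scale, the figures above (128 ports per array, one sample per minute) can be turned into a rough document count per collection. This is only a sketch: the function name `docsPerDay` and the number of arrays passed to it are illustrative, not something stated in the post.

```javascript
// Rough volume estimate for one metric collection.
const PORTS_PER_ARRAY = 128;     // stated in the post
const SAMPLES_PER_DAY = 24 * 60; // one sample per port per minute

// Documents written per day for a given number of arrays (hypothetical input).
function docsPerDay(arrays) {
  return arrays * PORTS_PER_ARRAY * SAMPLES_PER_DAY;
}

console.log(docsPerDay(1)); // 184320
```

So a single array already produces about 184 thousand documents per day in each metric collection.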

Am I correct that the only way to enforce a unique key on a sharded
collection is through the shard key? If I want to enforce uniqueness on
(array_serial, port_name, datetime), would that compound key make an
acceptable shard key? Would it provide enough cardinality while still
allowing query localization and manageable chunks?
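
The cardinality part of this question can be checked on synthetic data. The sketch below is plain JavaScript and does not talk to MongoDB: it counts distinct shard key values for `port_name` alone and for the compound key. The second serial number, the one-hour window, and the extension of the `CL<n>-A` naming to all 128 ports are assumptions made only for the example.

```javascript
// Serialize the value of a (possibly compound) shard key for one document.
function shardKeyValue(doc, fields) {
  return JSON.stringify(fields.map((f) => doc[f]));
}

// Count how many distinct shard key values a set of documents contains.
function distinctKeys(docs, fields) {
  return new Set(docs.map((d) => shardKeyValue(d, fields))).size;
}

// Synthetic sample: 2 arrays x 128 ports x 60 one-minute samples.
function sampleDocs() {
  const docs = [];
  const start = Date.UTC(2012, 8, 30, 13, 44); // 2012-09-30T13:44:00Z
  for (const serial of ["12345", "12346"]) {   // second serial is made up
    for (let port = 0; port < 128; port++) {
      for (let minute = 0; minute < 60; minute++) {
        docs.push({
          array_serial: serial,
          port_name: `CL${port + 1}-A`,        // assumed naming pattern
          port_number: String(port),
          datetime: new Date(start + minute * 60000).toISOString(),
          metric: 0,
        });
      }
    }
  }
  return docs;
}

const docs = sampleDocs();
const byPort = distinctKeys(docs, ["port_name"]);
const byCompound = distinctKeys(docs, ["array_serial", "port_name", "datetime"]);
console.log(byPort, byCompound, docs.length); // 128 15360 15360
```

The number of distinct values matters because a chunk cannot be split inside a single shard key value: `port_name` alone never yields more than 128 values however much data accumulates, while the compound key is distinct for every document. In the mongo shell, a compound shard key with a uniqueness constraint is declared with `sh.shardCollection(namespace, key, true)`, where the namespace is whichever database and collection hold the metric.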

Or should I shard only on port_name, so that the records are spread evenly
across the cluster? If that is the shard key, do I have to keep a proxy
collection made up of documents like this:

{
        "_id" : ObjectId("5069d68700a2934015000000"),
        "key" : "12345,CL1-A,<dateinmilliseconds>"
}

And only write to the sharded collection if a write to the above proxy
collection succeeds? That seems like a lot of extra overhead.
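
The write path described here can be sketched in memory to make the extra step visible. Nothing below is a MongoDB API: `proxyKeys` stands in for the proxy collection with a unique index on `key`, `metrics` stands in for the sharded collection, and the function names are hypothetical.

```javascript
// In-memory stand-ins for the two collections (illustration only).
const proxyKeys = new Set(); // proxy collection: one unique "key" per sample
const metrics = [];          // sharded metrics collection

// Build the proxy key: serial, port name, timestamp in milliseconds.
function proxyKey(doc) {
  return [doc.array_serial, doc.port_name, doc.datetime.getTime()].join(",");
}

// Write to the metrics store only if the proxy key was not seen before.
function insertMetric(doc) {
  const key = proxyKey(doc);
  if (proxyKeys.has(key)) {
    return false;            // duplicate: the proxy write would be rejected
  }
  proxyKeys.add(key);        // first write: claim the key
  metrics.push(doc);         // second write: store the sample itself
  return true;
}

const sample = {
  port_name: "CL1-A",
  metric: 340,
  port_number: "0",
  datetime: new Date("2012-09-30T13:44:00Z"),
  array_serial: "12345",
};

const first = insertMetric(sample);
const second = insertMetric({ ...sample });
console.log(first, second, metrics.length); // true false 1
```

Every accepted sample costs two writes instead of one, which is the overhead the question refers to. In a real deployment the two writes would also not be atomic, so a failure between them would need separate handling.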

Sorry, this is all a bit new and confusing.


 
Mark Hillick  
From: Mark Hillick <m...@10gen.com>
Date: Fri, 19 Oct 2012 05:43:17 -0700 (PDT)
Local: Fri, Oct 19 2012 8:43 am
Subject: Re: Using mongo as a performance data store, what's a good shard key?

Hi Chris,

I answered your question (as best I could with the information available)
on Stack Overflow earlier; I didn't see it here until now.

Let me know your thoughts.

Thanks

Mark


 