Thanks, Jignesh
> --
>
> You received this message because you are subscribed to the Google Groups "mongodb-user" group.
> To post to this group, send email to mongod...@googlegroups.com.
> To unsubscribe from this group, send email to mongodb-user...@googlegroups.com.
> For more options, visit this group at http://groups.google.com/group/mongodb-user?hl=en.
>
>
>
Has anyone figured out a way to make the _id consistent and unique,
using a custom solution that is?
On Dec 19, 8:05 am, Eliot Horowitz <eliothorow...@gmail.com> wrote:
> You can add whatever _id field you want to the object you save.
> In terms of doing auto-increment, the reason we don't do it is that
> it's hard to do with sharding.
> If you want to do it, you'll need to figure out your own way to keep
> that consistent.
>
> On Sat, Dec 19, 2009 at 2:31 AM, jignesh patel
>
At the moment, I can't think of why I would need a custom _id, but
let's take twitter for example.
Here's the URL of a tweet I sent to you. http://twitter.com/sdotsen/status/6847722531
If I wanted to do the same w/ my own app, how would I go about making
the tweet unique? Besides the username field, how would I
differentiate between all my tweets?
I understand one can use fancy URLs, but that could get out of hand.
Am I missing something w/ the _ids generated by mongodb?
On Dec 19, 10:14 pm, Eliot Horowitz <eliothorow...@gmail.com> wrote:
> why don't you want to use the auto-generated _ids?
> they're a bit ugly, but it forces you to make urls nice anyway :)
> and this way it will work with sharding, etc..
>
On Dec 19, 10:47 pm, Eliot Horowitz <eliothorow...@gmail.com> wrote:
> they are 24 characters in hex, compared to the 11 of the twitter url.
> if you base64 encode, it's 17 characters, and it's sharding safe.
>
> On Sat, Dec 19, 2009 at 10:23 PM, sdotsen <samnang....@gmail.com> wrote:
> > isn't the auto-generated _ids like 20+ characters long? Am I missing
> > something?
>
> > At the moment, I can't think of why I would need a custom _id, but
> > let's take twitter for example.
> > Here's the URL of a tweet I sent to you. http://twitter.com/sdotsen/status/6847722531
if you encode in base64,
64^17 > 2^96
so that's one option for putting it in urls
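A minimal sketch of that base64 option, assuming Python and its standard base64 module (the variable names and the sample hex string are only for illustration). Since 12 bytes is a multiple of 3, the encoding needs no padding and comes out at exactly 16 characters (64^16 = 2^96), so 17 is a safe upper bound rather than the exact length.

```python
import base64

# Example ObjectId in its usual 24-character hex form (illustrative value).
oid_hex = "4b29ec56641b4f75d67e029e"
raw = bytes.fromhex(oid_hex)  # the 12 raw bytes behind the hex string

# URL-safe base64 uses '-' and '_' instead of '+' and '/'.
# 12 bytes is a multiple of 3, so no '=' padding is produced.
token = base64.urlsafe_b64encode(raw).decode("ascii")


def to_hex(tok):
    """Turn the URL token back into the 24-character hex form."""
    return base64.urlsafe_b64decode(tok).hex()


print(len(oid_hex), len(token))  # prints: 24 16
```

The token round-trips, so the URL form can be converted back to the original id before querying.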
I use sequential ids for my application - but I don't care if they are
rigidly chronological or if there are gaps in the sequences from time
to time.
I use an unsharded collection of sequences (one per collection), and
have a stored javascript function that increments and returns the
value for a given sequence. I call the function in a db.eval so it is
executed atomically.
It's actually slightly more complex than that. For performance reasons
each web server requests and caches a block of 100 ids at a time
whenever they run out. This pretty closely mimics the way Oracle Grid
sequences work (or so I've been told). Gaps in the sequence do arise
if servers restart - but that doesn't happen very often.
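Roughly, the block-caching idea could look like the sketch below. This is not my actual code: CounterStore is a hypothetical in-memory stand-in for the unsharded sequences collection (its lock plays the role of the atomic server-side increment), and the names BlockAllocator, reserve and next_id are made up for illustration.

```python
import threading


class CounterStore:
    """In-memory stand-in (hypothetical) for the unsharded sequences collection.
    The lock plays the role of the atomic server-side increment."""

    def __init__(self):
        self._lock = threading.Lock()
        self._values = {}

    def reserve(self, name, count):
        """Advance sequence `name` by `count` and return the first reserved id."""
        with self._lock:
            start = self._values.get(name, 0) + 1
            self._values[name] = start + count - 1
            return start


class BlockAllocator:
    """Per-web-server cache that hands out ids from a reserved block."""

    def __init__(self, store, name, block_size=100):
        self._store = store
        self._name = name
        self._block_size = block_size
        self._next = 0
        self._end = -1  # empty block: forces a reservation on first use
        self._lock = threading.Lock()

    def next_id(self):
        with self._lock:
            if self._next > self._end:
                # Block exhausted: one round trip reserves the next block.
                self._next = self._store.reserve(self._name, self._block_size)
                self._end = self._next + self._block_size - 1
            value = self._next
            self._next += 1
            return value
```

Two allocators sharing one store get disjoint blocks (1 to 100 and 101 to 200), and whatever is left in a block is simply lost when a server restarts, which is where the gaps come from.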
On Dec 19, 9:15 pm, Eliot Horowitz <eliothorow...@gmail.com> wrote:
> object ids are 12 bytes, so
> 2^(8*12)
>
> if you encode in base64,
> 64^17 > 2^96
>
> so that's one option for putting it in urls
>
You can get a huge number of ids in 8 bytes without running into
embarrassing cuss words by using 0-9, a-z, A-Z but without the vowels
(technically I guess you can get "fcksht" but probably acceptable).
You then get 52^8 combinations which is about 53 trillion ids. With 7
letters, you can still get 1 trillion. 52 is because there are 21
letters without vowels, x2 for the capitalized versions and 10 more
numeric values.
These make pretty nice URLs
/forum/thread/b8xf23Nm
I'd recommend creating the IDs sequentially but make the actual id
in reverse order to improve sort times. This will give a nice
distribution on the first character used for sorting.
Reserving the ids 100 at a time is a good idea. I think I did mine
1000 at a time in an implementation for IDs in postgresql. There is a
good article on distributed ID generation here.
http://horicky.blogspot.com/2007/11/distributed-uuid-generation.html
Funny this got brought up because I was thinking of doing this for
MongoDB as well precisely because of the long URLs. 17 characters is
still pretty long. 7 is quite nice. 24 felt very long.
Note also that if they are sequential, there is no need to pad them, so
they can start out small and grow to fill any number of
characters as required. So your first 7 million ids will actually be
at most 4 characters long.
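Here is a small sketch of the encoding, assuming Python and assuming "reverse order" means writing the fastest-changing symbol first. The names ALPHABET, encode and decode are illustrative, not from any library.

```python
import string

VOWELS = set("aeiouAEIOU")
# Digits first, then consonants in lower and upper case: 10 + 21 + 21 = 52 symbols.
ALPHABET = string.digits + "".join(
    c for c in string.ascii_lowercase + string.ascii_uppercase if c not in VOWELS
)
BASE = len(ALPHABET)


def encode(n):
    """Encode a non-negative integer, least significant symbol first.
    Emitting symbols in this order gives the 'reversed' form directly."""
    if n < 0:
        raise ValueError("n must be non-negative")
    if n == 0:
        return ALPHABET[0]
    out = []
    while n:
        n, rem = divmod(n, BASE)
        out.append(ALPHABET[rem])
    return "".join(out)


def decode(s):
    """Inverse of encode: the last character is the most significant symbol."""
    n = 0
    for ch in reversed(s):
        n = n * BASE + ALPHABET.index(ch)
    return n
```

Because nothing is padded, every value below 52^4 = 7,311,616 encodes to at most 4 characters, and consecutive ids differ in their first character.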
Mathias, on the idea of assigning prefixes, the only problem I have
with that is coming up with a scheme to guarantee the prefixes are
unique. This probably means manually assigning them. I found grabbing
a pool of IDs was nice because it didn't matter whether I had 1 or
1000 servers and whether the number of servers changed often or
never, the algorithm is exactly the same.
Sunny Hirai
Regards
ChX
The prefix could be based on a guaranteed unique number like the IP
address but then I'd have to give up the compactness of the ids. I'd
lose bytes storing the unique prefix. Also, multiple applications
running simultaneously on the same server (same IP address, different
applications) will need to be handled.
Going fully random increases the overhead as one would need to check
to see if the id already exists in the database before inserting.
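For reference, the check can be skipped by letting the unique index on _id reject duplicates and retrying, which is roughly what Mathias describes in the message quoted below. A minimal sketch, assuming Python; FakeCollection and DuplicateKey are hypothetical in-memory stand-ins, not a real driver API.

```python
import secrets

# 0-9 plus consonants in both cases (52 symbols).
ALPHABET = "0123456789bcdfghjklmnpqrstvwxyzBCDFGHJKLMNPQRSTVWXYZ"


class DuplicateKey(Exception):
    """Raised by the stand-in store when an id is already taken."""


class FakeCollection:
    """In-memory stand-in (hypothetical) for a collection with a unique _id."""

    def __init__(self):
        self._docs = {}

    def insert(self, doc):
        if doc["_id"] in self._docs:
            raise DuplicateKey(doc["_id"])
        self._docs[doc["_id"]] = doc


def random_id(length=8):
    """Pick `length` random symbols from the vowel-free alphabet."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


def insert_with_random_id(coll, doc, make_id=random_id, max_tries=10):
    """Try the insert directly; on a duplicate key pick a fresh id and retry."""
    for _ in range(max_tries):
        candidate = dict(doc, _id=make_id())
        try:
            coll.insert(candidate)
            return candidate["_id"]
        except DuplicateKey:
            continue
    raise RuntimeError("could not find a free id")
```

This costs one write in the common case and avoids the race between checking and inserting, at the price of an occasional retry.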
Another benefit of using a sequential id is that you are guaranteed
uniqueness within the entire database instead of just within a
collection (unlike fully random ids; Mongo's current id generator
gives you this as well). This can eliminate certain types
of bugs and adds flexibility in building features.
As an example, let's say I had both blog posts and wiki page
collections. I write a bit of generic code that allows me to add
comments to either the posts or page collections using their id. I
could, if I wanted, use the same collection for both posts and pages
because the id of the post or page would be unique. Now if I needed to
do some sort of migration on the data, I would only have to deal with
one collection.
I think maybe one thing I forgot to mention about my use case though
is that I built all this into the data access library itself. I
probably wouldn't recommend somebody roll their own without putting it
into its own library. After it is built, using a globally unique id
generator is dead simple (something like db.get_id). It does have to
be built first though. :)
Sunny Hirai
On Dec 20, 4:18 am, Mathias Stearn <math...@10gen.com> wrote:
> @Valentin: That algorithm has a race condition. It is better to try to
> insert then call getlasterror (or use safe mode) and try a new id if it
> failed. Also, if possible, better to add a random offset rather than -1
> since collisions will be less likely. See http://github.com/RedBeard0531/Mongurl/blob/master/mongurl.py#L37 for an
> example.
>
> @Sunny: You could base the prefix on a number guaranteed to be unique such
> as the IP address (in most cases for servers). If you are reversing the ID
> anyway, why not just go fully random rather than sequential and minimize the
> need for synchronization between your servers?
>
> On Sun, Dec 20, 2009 at 3:16 AM, Valentin Golev <v.go...@gmail.com> wrote:
> > For my application (something like calendar of events), I generate URLs
> > using algorithm:
>
> > 1. Generate url /year/month/day/hour-minute-title, like
> > /2010/01/01/00-00-New-Year-Celebration
> > 2. Query by this url; if anything found, add -1
> > (like /2010/01/01/00-00-New-Year-Celebration-1)
> > 3. Try incrementing this number until it becomes truly unique
>
> > Now I'm thinking about creating another collection of { url; dbref; } to be
> > able to give a unique url to any entity in my application.
> > These urls are much nicer, I think. You can try something like this.
>
> > - Valentin Golev
>
My intention was not to change a virtually unique field like the "_id"
generated by mongo.
I have an application in which I use (human readable) ids
for my entity tables, like twitter, with urls in many places.
How can I use another unique auto-increment field for reference besides
"_id"?
e.g.
category collection:
{ "_id" : ObjectId("4b29ec56641b4f75d67e029e"), "idCategory" : 1,
"value" : "PHP" }
{ "_id" : ObjectId("4b29ec56641b4f75d67e029e"), "idCategory" : 2,
"value" : "JAVA" }
{ "_id" : ObjectId("4b29ec56641b4f75d67e029e"), "idCategory" : 3,
"value" : "RUBY" }
{ "_id" : ObjectId("4b29ec56641b4f75d67e029e"), "idCategory" : 4,
"value" : "ASP" }
....
Here, how would I auto-increment the "idCategory" field?
On Dec 21, 5:41 am, Eliot Horowitz <eliothorow...@gmail.com> wrote:
> Not quite - _id can be any type, so you don't need to create a MongoId
>
> On Sun, Dec 20, 2009 at 4:55 PM, chx <chx1...@gmail.com> wrote:
> > While the thread mostly concentrated on proper _id creation it must be
> > noted that with the PHP driver once you have a $_id you need to call
> > $_id = new MongoId($_id); and $_id must be 24 hexadecimal characters
> > as stated on http://php.net/manual/en/mongoid.construct.php
> I have an application in which I use (human readable) ids
> for my entity tables, like twitter, with urls in many places.
If you want a numeric sequence to make it more user friendly you are doing it wrong. :-) Human readable categories (to take your example) would be "Book", "Blu-Ray", "Dining Tables"; not 1, 2, 3, ...
- ask
In an RDBMS, a numeric auto_increment serves various purposes:
uniqueness of each row/record, small and clean URLs, indexes, and sorting.
In MongoDB the purpose of the auto-generated field is quite
different: to keep each record unique across all
databases, collections, and servers.
Now consider the following database structure:
one collection called "student" with 1 million documents/records,
each having 20 fields (or keys) in flat form (i.e. all information about
a student is stored denormalized, as MongoDB suggests). Of those
20 fields, 10 hold numeric values that refer to
other entity collections like city, state, country, course1,
course2 etc., because those entities must be managed
separately and shown in list boxes in various forms.
In an RDBMS, since those ref. fields are numeric and can never be
longer than 3 digits (in our case), the values stay easily
readable (I can ask my friend, "hey, check the record with number 568"),
and the RDBMS can sort and search on the numeric values (as there will
be indices on them). In MongoDB those refs are 24
characters long, so my question is: would they affect search, sorting,
and disk space (3 digits vs. 24 characters x 10 fields x 1 million
records), etc.?
I know that in MongoDB the "_id" field is mainly designed to keep each
document unique, but I just want to know whether it would be an *effective*
solution to use those 24-character refs, or to manage numeric ids
on my own (like in an RDBMS)?
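A rough back-of-the-envelope for the disk-space part, assuming Python for the arithmetic. It relies on the point made earlier in the thread that object ids are 12 bytes (the 24 characters are only the hex display form), and it assumes the numeric refs would be stored as 32-bit BSON integers; field names and type bytes cost the same either way and are left out.

```python
OBJECT_ID_BYTES = 12  # BSON ObjectId value (24 hex characters when printed)
INT32_BYTES = 4       # BSON 32-bit integer (assumes the driver stores ints as int32)

ref_fields = 10
documents = 1_000_000


def value_bytes(per_value):
    """Total bytes spent on the reference values alone."""
    return per_value * ref_fields * documents


object_id_total = value_bytes(OBJECT_ID_BYTES)
int32_total = value_bytes(INT32_BYTES)
extra = object_id_total - int32_total

print(object_id_total, int32_total, extra)  # prints: 120000000 40000000 80000000
```

Under those assumptions the ObjectId refs cost about 80 MB more across the whole collection, before counting any indexes on those fields.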
On Dec 19, 6:05 pm, Eliot Horowitz <eliothorow...@gmail.com> wrote:
> You can add whatever _id field you want to the object you save.
> In terms of doing auto-increment, the reason we don't do it is that
> it's hard to do with sharding.
> If you want to do it, you'll need to figure out your own way to keep
> that consistent.
>
> On Sat, Dec 19, 2009 at 2:31 AM, jignesh patel
>
But I would like to ask what would be the best strategy of the following 3 in
terms of performance, storage, retrieval, and maintenance:
#1 Storing the object IDs that MongoDB generates for entities as references
#2 Storing manually generated numeric IDs of entities as references
(as you have mentioned here)
#3 Storing the entity string itself instead of a reference
Thanks
Anirudh Zala