Thanks for your responses.
On May 29, 2013, at 00:40, Martin Wawrusch wrote:
> Why not simply use base56 encoding of the object id?
I haven't tried base56 (is there a module or built-in function you'd recommend for that?), but a MongoDB ObjectId begins with a 4-byte timestamp, which will be very similar for ids created around the same time, followed by a 3-byte machine id and a 2-byte process id, which will be identical for long periods. Only the final 3-byte counter changes much from one id to the next. A base-anything encoding of such ids would therefore tend to look quite similar. I'd like my user-facing ids to look "more random" than that.
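To make the layout concrete, here is a minimal sketch that splits an ObjectId hex string into its fields. The helper name `describeObjectId` and the two sample ids are made up for illustration; they are not part of any driver API.

```js
// Split a 24-character ObjectId hex string into its fields.
// describeObjectId is a hypothetical helper, not a MongoDB or Mongoose API.
function describeObjectId(hex) {
  if (!/^[0-9a-f]{24}$/i.test(hex)) {
    throw new Error('expected 24 hex characters');
  }
  return {
    // 4 bytes: seconds since the Unix epoch
    timestamp: new Date(parseInt(hex.slice(0, 8), 16) * 1000),
    // 3 bytes: machine id
    machine: hex.slice(8, 14),
    // 2 bytes: process id
    pid: hex.slice(14, 18),
    // 3 bytes: counter
    counter: parseInt(hex.slice(18, 24), 16)
  };
}

// Two made-up ids, as if created one second apart by the same process.
var a = describeObjectId('51a5a0e0c1d2e3f4a5000001');
var b = describeObjectId('51a5a0e1c1d2e3f4a5000002');
console.log(a.timestamp.toISOString(), a.machine, a.pid, a.counter);
console.log(b.timestamp.toISOString(), b.machine, b.pid, b.counter);
```

Only the last digit of the timestamp and the counter differ between the two samples, so any straight re-encoding of the 12 bytes keeps most of the output in common.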
On May 29, 2013, at 00:48, Stuart P. Bentley wrote:
> ```js
> modelSchema.virtual('hashid').get(function () {
>   var oidhex = this._id.toHexString();
>   // Split the 24 hex digits into two 48-bit integers
>   return hashids.encrypt(parseInt(oidhex.slice(0, 12), 16), parseInt(oidhex.slice(12), 16));
> });
>
> modelSchema.virtual('hashid').set(function (hashid) {
>   var halves = hashids.decrypt(hashid);
>   var zeroes = '000000000000';
>   // Convert each half back to hex before zero-padding it to 12 digits
>   this._id = new ObjectID((zeroes + halves[0].toString(16)).slice(-12) + (zeroes + halves[1].toString(16)).slice(-12));
> });
> ```
Thanks, this is along the lines I was originally thinking. I just have to train myself to set and get the "hashid" field instead of the "id" field. I'll use this for now. Since I may need a hashid on multiple models, I made a function to add the virtuals which I can call when defining each model.
I was hoping for actual real-world experience though. How do I find a database record with a hashid? To find by objectid, I just do:
Thing.findOne({_id: req.params.thingid}, function(err, thing) {...});
It seems like even if finding on a virtual field works, it would be slow, since the index would be on the id, not the hashid. And as it turns out I can't get it to work; there's no error, it just doesn't return any results. So instead I've done:
Thing.findOne({_id: fromHashId(req.params.thingid)}, function(err, thing) {...});
where fromHashId does what your virtual('hashid').set() function does: it decodes the hashid back into an ObjectId.
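For reference, a rough sketch of what such helpers might look like as standalone functions. The names `hexToHalves`, `halvesToHex` and `toHashId` are hypothetical, and `codec` is assumed to expose `encrypt(n1, n2)` and `decrypt(str)` the way the `hashids` object in the quoted snippet does; the guard for a failed decode is also an assumption about how a bad hashid would show up.

```js
// Hypothetical helpers: convert between a 24-char ObjectId hex string and
// two integers of at most 48 bits each, which a JS number holds exactly.
function hexToHalves(oidhex) {
  return [parseInt(oidhex.slice(0, 12), 16), parseInt(oidhex.slice(12), 16)];
}

function halvesToHex(halves) {
  var zeroes = '000000000000';
  return halves.map(function (n) {
    // toString(16) matters: plain concatenation would append decimal digits
    return (zeroes + n.toString(16)).slice(-12);
  }).join('');
}

// `codec` is assumed to behave like the hashids object quoted above.
function toHashId(codec, oidhex) {
  var halves = hexToHalves(oidhex);
  return codec.encrypt(halves[0], halves[1]);
}

function fromHashId(codec, hashid) {
  var halves = codec.decrypt(hashid);
  // Assumption: anything other than exactly two numbers means a bad hashid
  if (!halves || halves.length !== 2) {
    return null;
  }
  return halvesToHex(halves);
}
```

Because the lookup decodes the hashid first and then queries on `_id`, it uses the existing `_id` index; no separate index on the hashid is needed.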
On May 29, 2013, at 00:37, George Snelling wrote:
> FWIW, we scratched our heads over the same problem, gave up, and wrote our own _id generator. It's a glorified timestamp with a big random seed after the milliseconds part, formatted to be read by humans and look reasonable in urls. Since the high-order part increases with time, it shards well. We found it much easier to simply check for a unique index violation error on insert and retry with a new key whenever that happens than to solve the problem you're trying to solve.
That's good to know, thanks. What are other people actually using for their short ids, regardless of backend storage system? Are you generating them yourself? How are you dealing with collisions? Has it been a problem?
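The generate-and-retry approach George describes could be sketched roughly as follows. This is not his actual generator: the id format (base36 milliseconds plus random base36 characters), the helper names, and the `insert(id, cb)` stand-in for the real insert call are all assumptions. The one fixed fact is that MongoDB reports a unique index violation with error code 11000.

```js
var nodeCrypto = require('crypto');

// Hypothetical id format: base36 milliseconds followed by random base36 chars.
function makeId(randomChars) {
  var bytes = nodeCrypto.randomBytes(randomChars);
  var suffix = '';
  for (var i = 0; i < randomChars; i++) {
    // Modulo introduces a slight bias, acceptable for a sketch
    suffix += (bytes[i] % 36).toString(36);
  }
  return Date.now().toString(36) + suffix;
}

// insert(id, cb) stands in for the real insert; cb receives an error or null.
function insertWithRetry(insert, attemptsLeft, cb) {
  var id = makeId(6);
  insert(id, function (err) {
    // 11000 is MongoDB's duplicate key error code: pick a new id and retry
    if (err && err.code === 11000 && attemptsLeft > 1) {
      return insertWithRetry(insert, attemptsLeft - 1, cb);
    }
    cb(err, err ? null : id);
  });
}
```

The cost of a collision is then one extra insert, paid only when a collision actually happens, rather than a lookup before every insert.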
I want the impossible! :) I want short ids that people posting urls to twitter will appreciate, but I don't want collisions or the overhead of verifying that there aren't any.
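To put a rough number on the tradeoff, here is a small sketch using the standard birthday approximation. It assumes ids are drawn uniformly at random, which is not true of reversible encodings of an ObjectId (those never collide); the function name and the sample parameters are made up.

```js
// Approximate probability that at least two of `count` uniformly random ids
// collide, for ids of `length` characters over `alphabetSize` symbols.
// Uses the birthday approximation p = 1 - exp(-n(n-1) / (2N)).
function collisionProbability(count, alphabetSize, length) {
  var space = Math.pow(alphabetSize, length);
  return -Math.expm1(-count * (count - 1) / (2 * space));
}

// Example: one million random ids over a 62-character alphabet
console.log(collisionProbability(1e6, 62, 6));
console.log(collisionProbability(1e6, 62, 8));
console.log(collisionProbability(1e6, 62, 10));
```

Each extra character shrinks the probability by a large factor, but for purely random ids it never reaches zero, which is why the retry-on-duplicate check keeps coming up.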