generate keys

847 views
Skip to first unread message

bsr

unread,
Dec 31, 2012, 10:48:42 AM12/31/12
to golan...@googlegroups.com
Hello,

It is rather a general question. I want to generate unique serial identifiers / keys (alpha numeric string, preferably 10 -12 char long). I saw some discussion about UUIDs and rsc has a simple program (http://play.golang.org/p/vmAiGTvKaK), which prints like

4c5eca36-197d-48ba-850e-17c15a1b509e
95707b61-d8de-47d7-a8a0-ec4e9ec851bb
c09e8328-d7c8-4677-b165-843c0951a35a
...

Also saw a good library for uuid here http://code.google.com/p/go-uuid/

But UUID may skip many of the usable combinations, and rather too long.
So I may persist the last value, and 'increment' by one to get the next in the sequence.
Not sure how it can be done with a string to be within the range of alpha numeric, and char length.
something like

0000000001
0000000002
....
100000000a
....
s42424242x
...
zzzzzzzzzz



Tamás Gulácsi

unread,
Dec 31, 2012, 11:21:40 AM12/31/12
to golan...@googlegroups.com
Uuid's value is uniqueness and cost (cheap, no central repi needed).
It is 16 bytes, can be encoded shorter iy you use bigger alphabet, not just hex (encoding/base64, encoding/ascii85 for example).

Kevin Malachowski

unread,
Dec 31, 2012, 11:21:58 AM12/31/12
to golan...@googlegroups.com
If you just wanted unique Ids for the current running instance, you could spin up a go routine that is in charge of keeping track of the "current" uuid and which communicates back the sequential uuids via a (slightly buffered) channel. Therefore any function that needs a uuid could just query this channel (or a method that accessed a package-level variable of the chan) If you needed persistent uuids you could always store the last-highest uuid somewhere and at the start of your program init the uuid go routine to start with that number+1

Now, onto your actual question: if you wanted uuids to have the characters in [0-9a-z] you could always store the current uuid as a big integer (see package big, I believe) and then generate the text string as a base 36 number, padding it with 0s when needed. I don't know if this conversion logic exists in go's standard packages, but a google search will quickly reveal the algorithm if you don't know it already.

bsr

unread,
Dec 31, 2012, 1:08:14 PM12/31/12
to golan...@googlegroups.com
Thanks Kevin, Tamas.

Since UUID is too long for my use case, I was trying to generate as http://play.golang.org/p/PXTwhW3LtZ (it wont run in playground, but locally)
this is based on the example in rand package.
How likely to have collision for 10 byte random generator.
Also, I noticed "==" in all the string. Why is that?

go run test.go 
viPjH0N3TLvqIg==
RpDe8CwKSjL8Aw==

if I set c = 8

iCdolgiZQIc=
4crUUh2ApRc=
8Fgj32SIHKM=
fsvppz53Y9I=

so I guess something to do with the length.

bsr

unread,
Dec 31, 2012, 1:36:57 PM12/31/12
to golan...@googlegroups.com
Well, about == , I saw the reason here http://stackoverflow.com/questions/6916805/why-base64-encoding-string-have-sign-in-the-last
Please comment on the chance of collision. thanks 

Dustin Sallings

unread,
Dec 31, 2012, 1:35:25 PM12/31/12
to golan...@googlegroups.com
bsr <bsr...@gmail.com> writes:

> Since UUID is too long for my use case, I was trying
> to generate as http://play.golang.org/p/PXTwhW3LtZ (it wont run in
> playground, but locally)
> this is based on the example in rand package.
> How likely to have collision for 10 byte random generator.

Sequential numbers will collide less frequently.

> Also, I noticed "==" in all the string. Why is that?

Because you base64 encoded it.

--
dustin

Patrick Mylund Nielsen

unread,
Dec 31, 2012, 1:41:47 PM12/31/12
to Dustin Sallings, golang-nuts



--
dustin

--



Kevin Malachowski

unread,
Dec 31, 2012, 5:59:07 PM12/31/12
to golan...@googlegroups.com
For a perfectly random bit distribution, for 10 8-bit values the chance if collision is 2^-80, or a very small number. In practice, though, it will be higher and depends on your random number generator.

Joey Geralnik

unread,
Dec 31, 2012, 6:56:05 PM12/31/12
to Kevin Malachowski, golan...@googlegroups.com
2^-80 is the chance that two specific hashes will collide. You also have to take into account the birthday paradox since you want to figure out what the chance is that any two hashes will collides.

Using the first approximation from wikipedia and assuming you have 1,000,000 hashes, the chance of collision is approx 4e-13 (2^-80 is 8e-25).

With 10 million hashes the chance of collision goes up to 4e-11, with 10 billion hashes the chance goes up to 4e-5.

By the time you get up to 1 trillion hashes the chance of collision goes all the way up to 34%, and by 3 trillion goes up to 98%.

So in short, as long as you have less than a few billion hashes and assuming that your random number generator works well (which may be a big assumption) you should be fine.

--Joey


On Tue, Jan 1, 2013 at 12:59 AM, Kevin Malachowski <nifta...@gmail.com> wrote:
For a perfectly random bit distribution, for 10 8-bit values the chance if collision is 2^-80, or a very small number. In practice, though, it will be higher and depends on your random number generator.

--



Joey Geralnik

unread,
Dec 31, 2012, 6:56:59 PM12/31/12
to Kevin Malachowski, golan...@googlegroups.com
Wikipedia link for those curious: http://en.wikipedia.org/wiki/Birthday_problem#Approximations
--Joey

Kevin Malachowski

unread,
Dec 31, 2012, 7:07:52 PM12/31/12
to golan...@googlegroups.com
Gah you're completely right. I'm disappointed I made that mistake: I'm usually the one who brings that point up in discussions among my friends. Good catch.

bsr

unread,
Dec 31, 2012, 8:17:15 PM12/31/12
to golan...@googlegroups.com
Joey,

thank you for very informative explanation.
My requirement is in few million range, and can live with very rare collision.
This simple implementation uses Reader, http://play.golang.org/p/PXTwhW3LtZ

From the docs:
Reader is a global, shared instance of a cryptographically strong pseudo-random generator. On Unix-like systems, Reader reads from /dev/urandom. On Windows systems, Reader uses the CryptGenRandom API.

I trust the go core dev, and assume this would suffice me :-)

thanks again.

Nate Finch

unread,
Jan 2, 2013, 2:18:41 PM1/2/13
to golan...@googlegroups.com
Do you need IDs to be able to be generated by two different programs / instances and still be unique? This is the main reason to use UUIDs - because you can make them to your heart's content in as many places as you want, and they'll always be unique. If you only need them to be unique to a single process, then they're kind of overkill - they are so huge to make the keyspace huge, to reduce the liklihood of random collisions.

However, if it's just a single program that is creating these IDs, there's no reason you can't use a single uint64 and just increment it whenever you make a new id.

Something like this:

func IdGenerator(lastId uint64) <-chan (uint64) {
next := make(chan uint64)
id := lastId + 1
go func() {
for {
next <- id
id++
}
}()
return next
}

then you can use it like this:

idChan := IdGenerator(lastId)
...
object.id = <- idChan

Channel synchronization will make sure no two objects get the same id.

This doesn't take care of storing the last-used ID, which you'll need if you want to maintain ids between program sessions (not sure if that's necessary for your project or not). 

Dmitry Chestnykh

unread,
Jan 2, 2013, 5:21:34 PM1/2/13
to golan...@googlegroups.com
Reply all
Reply to author
Forward
0 new messages