[ANN] clj-uuid: thread-safe, performant unique identifiers


danl...@gmail.com

Feb 16, 2015, 8:25:17 PM
to clo...@googlegroups.com
Hello Clojurians,

I've just been polishing my modest library, clj-uuid <http://danlentz.github.io/clj-uuid/>, and would like to invite everyone to have a look if such a thing might be of interest.

What is it?

clj-uuid is a Clojure library for generating and working with UUIDs (Universally Unique Identifiers) as described by RFC 4122. This library extends the standard Java UUID class to provide true v1 (time-based) and v3/v5 (namespace-based) identifier generation. It also provides a number of utilities for serializing and manipulating these UUIDs simply and efficiently.

Why is it useful?

The JVM's UUID class only provides constructors for random (v4) and non-namespaced pseudo-v3 UUIDs. Where appropriate, this library does use the internal JVM UUID implementation. The benefit of clj-uuid is that it provides an easy way to get v1 and true namespaced v3 and v5 UUIDs. v1 UUIDs are really useful because they can be generated faster than v4s, as they don't need to call a cryptographic random number generator. v5 UUIDs are necessary because many of the interesting things that you can do with UUIDs require namespaced identifiers.
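
To illustrate what the JVM offers out of the box, here is a minimal sketch that uses only java.util.UUID. The var names jvm-v4 and jvm-v3 are made up for this example. Note that nameUUIDFromBytes takes only the name bytes and has no namespace argument, which is why its result is only a pseudo-v3 UUID.

```clojure
(import '(java.util UUID))

;; Random (v4) UUID from the JVM's built-in static method.
(def jvm-v4 (UUID/randomUUID))

;; Name-based UUID from the JVM: MD5 over the given bytes only.
;; There is no way to pass a namespace UUID to this method.
(def jvm-v3 (UUID/nameUUIDFromBytes (.getBytes "example.com" "UTF-8")))

(.version ^UUID jvm-v4) ;; => 4
(.version ^UUID jvm-v3) ;; => 3
```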



Best,
Dan Lentz

David Sargeant

Feb 16, 2015, 10:30:54 PM
to clo...@googlegroups.com
Looks good.  Thanks for sharing.

David

Francis Avila

Feb 17, 2015, 1:06:11 AM
to clo...@googlegroups.com
This is nice to have, thank you! uuid v5 generation seems to be something I reimplement over and over, but which is never big enough for a library. I would like to stop doing that and just include your library in the future.

However, I think your v3/v5 implementations need much more control over canonicalization to guarantee consistent uuid generation on different machines. Right now you just turn a string into bytes with String.getBytes() and feed that to a hash algorithm. The bytes generated are going to depend on the platform charset, and there's no way to just feed plain bytes in to get around this issue because these functions demand a string.
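
A small sketch of the charset problem, assuming nothing beyond the JDK: the same string encoded with two different charsets yields different bytes and therefore different name-based UUIDs. The explicit charset names stand in for whatever a platform default might be, and sample-name and the other var names are made up for this example.

```clojure
(import '(java.util UUID))

;; "cafe" with an accented final e, written with an escape so the
;; example does not depend on the source file's encoding.
(def sample-name "caf\u00e9")

;; Explicit charsets stand in for two different platform defaults.
(def utf8-bytes   (.getBytes ^String sample-name "UTF-8"))
(def latin1-bytes (.getBytes ^String sample-name "ISO-8859-1"))

(count utf8-bytes)   ;; => 5
(count latin1-bytes) ;; => 4

;; Different bytes in, different UUID out.
(= (UUID/nameUUIDFromBytes utf8-bytes)
   (UUID/nameUUIDFromBytes latin1-bytes)) ;; => false
```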

I think you should add a UuidNameBytes protocol and implement a sensible default encoding for strings (say UTF-8), and some implementations for at least byte arrays and uuids. Then your v3/v5 functions can accept anything "byte-able" as a name argument. I've done something like this before and would be happy to submit a pull request if you are interested.
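
One possible shape for such a protocol is sketched below. It is an illustration only, not the library's code: the protocol name NameBytesSketch and the method name name-bytes are hypothetical, and encoding a UUID as its 16 big-endian bytes is an assumption about what a sensible implementation would do.

```clojure
(import '(java.nio ByteBuffer) '(java.util UUID))

;; Hypothetical protocol: anything "byte-able" can serve as a name.
(defprotocol NameBytesSketch
  (name-bytes [x] "Return the bytes to hash for a v3/v5 name."))

(extend-protocol NameBytesSketch
  String
  ;; Fixed encoding, independent of the platform default charset.
  (name-bytes [s] (.getBytes s "UTF-8"))

  UUID
  ;; 16 bytes: most significant long first, then least significant.
  (name-bytes [u]
    (let [bb (ByteBuffer/allocate 16)]
      (.putLong bb (.getMostSignificantBits u))
      (.putLong bb (.getLeastSignificantBits u))
      (.array bb))))

;; Byte arrays pass through untouched. The class of byte[] has no
;; simple symbol, so it is looked up by its JVM name.
(extend (Class/forName "[B")
  NameBytesSketch
  {:name-bytes (fn [b] b)})

(count (name-bytes "caf\u00e9"))       ;; => 5
(count (name-bytes (UUID/randomUUID))) ;; => 16
```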

Finally, this is just a style point, but you use :use a fair amount in your namespaces. :use is an anti-pattern in Clojure and avoided because it obscures where the functions come from. (I had a lot of trouble reading your digest namespace because of it!)

danl...@gmail.com

Feb 17, 2015, 3:48:04 AM
to clo...@googlegroups.com
Adding a UUIDNameBytes protocol is an excellent idea. That is definitely something I will look at doing.

Thanks!

Steven Deobald

Feb 17, 2015, 7:24:33 AM
to clo...@googlegroups.com
It's always a pleasure to see someone fill these gaps. :) Thanks Dan.

Steven Deobald -- nilenso.com

danl...@gmail.com

Feb 18, 2015, 9:37:16 AM
to clo...@googlegroups.com
These were really good suggestions. As you suggested, I implemented a UUIDNameBytes protocol that defaults to UTF-8 for strings and passes a byte-array local part through unchanged for use in v3/v5.
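
For readers unfamiliar with how the name bytes are consumed, here is a rough sketch of v5 generation following the RFC 4122 recipe: SHA-1 over the namespace bytes followed by the name bytes, keep the first 16 bytes, then set the version and variant bits. This is not clj-uuid's actual code; v5-sketch and uuid->bytes are hypothetical names, and the function takes plain bytes to keep the example short.

```clojure
(import '(java.nio ByteBuffer)
        '(java.security MessageDigest)
        '(java.util UUID))

(defn uuid->bytes
  "Hypothetical helper: the 16 big-endian bytes of a UUID."
  ^bytes [^UUID u]
  (let [bb (ByteBuffer/allocate 16)]
    (.putLong bb (.getMostSignificantBits u))
    (.putLong bb (.getLeastSignificantBits u))
    (.array bb)))

(defn v5-sketch
  "Hypothetical v5 generator: hash namespace bytes, then name bytes."
  [^UUID ns-uuid ^bytes name-bytes]
  (let [md     (MessageDigest/getInstance "SHA-1")
        _      (.update md (uuid->bytes ns-uuid))
        digest (.digest md name-bytes)
        bb     (ByteBuffer/wrap digest)
        msb    (.getLong bb)   ; digest bytes 0-7
        lsb    (.getLong bb)   ; digest bytes 8-15
        ;; Version nibble (top 4 bits of the low 16 bits of msb) := 5.
        msb    (bit-or (bit-and msb (bit-not 0xF000)) 0x5000)
        ;; Variant (top 2 bits of lsb) := binary 10.
        lsb    (bit-or (bit-and lsb 0x3FFFFFFFFFFFFFFF) Long/MIN_VALUE)]
    (UUID. msb lsb)))

;; DNS namespace UUID from RFC 4122, Appendix C.
(def dns-namespace (UUID/fromString "6ba7b810-9dad-11d1-80b4-00c04fd430c8"))

(def example (v5-sketch dns-namespace (.getBytes "python.org" "UTF-8")))
(.version ^UUID example) ;; => 5
```

Because the name is encoded explicitly as UTF-8 before hashing, the same namespace and name give the same UUID on any machine.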

Avi Avicenna

Feb 19, 2015, 9:49:06 AM
to clo...@googlegroups.com
Thank you for providing sequential UUIDs. Looking forward to a UUID library that provides squuids.

Yours,
Avicenna

danl...@gmail.com

Feb 22, 2015, 2:41:46 PM
to clo...@googlegroups.com
I did some work to reduce consing, and generating a v1 (time-based) UUID with clj-uuid now takes about 40% less time than invoking the JVM's java.util.UUID/randomUUID static method:

user> (criterium.core/bench (uuid/v1))

Evaluation count : 51250020 in 60 samples of 854167 calls.
Execution time mean : 1.130674 µs

user> (criterium.core/bench (java.util.UUID/randomUUID))

Evaluation count : 31868100 in 60 samples of 531135 calls.
Execution time mean : 1.920089 µs