There should be a UUID class (feature proposal)

Sebastian Skałacki

Feb 20, 2018, 9:30:06 PM
to Ruby on Rails: Core
Hi!

Currently, Rails represents UUIDs as regular strings.  Active Support defines
Digest::UUID, but it's a module with four functions, all of which return strings
containing UUIDs in a human-readable form.

I propose representing UUIDs as instances of a brand new UUID class added to
Active Support, or alternatively turning the existing Digest::UUID module into
a class for this purpose.

Why I consider it beneficial:
-----------------------------

1. String comparison (meaning #==) is not very suitable for UUID comparison.
For instance, the following three strings all represent the same UUID, but
aren't equal to each other: "123e4567-e89b-12d3-a456-426655440000",
"123E4567-E89B-12D3-A456-426655440000", "123e4567e89b12d3a456426655440000".

2. UUIDs should be immutable.  Strings returned by Ruby 2.5's SecureRandom#uuid
are not.

3. For two UUIDs u1 and u2, if u1 == u2, then u1.equal?(u2) should also hold
(they should be the same object).

4. Many databases do not support a UUID data type natively.  A common solution
is to use a binary type instead, as UUIDs are basically 16-byte numbers.
This saves some space (a human-friendly string is 36 characters long, or 32
without dashes).  As a consequence, it also makes indexes smaller and
better performing.  (Binary columns can be indexed, at least in MySQL.)  The
class could also provide means for conversion between human-readable and
binary forms (the latter represented in Ruby as ASCII-8BIT-encoded strings).

To be precise, conversion from human-readable to binary should be equivalent to
the following:

    uuid_str = "445d818d-54c6-48d2-bf9e-cb923671dcc4"
    uuid_bin = uuid_str.delete("-").scan(/../).map { |x| x.hex.chr }.join
    # => "D]\x81\x8DT\xC6H\xD2\xBF\x9E\xCB\x926q\xDC\xC4"
    uuid_str.size #=> 36
    uuid_bin.size #=> 16
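
The same conversion, and the reverse direction, could also be sketched with
Array#pack and String#unpack1.  The helper names uuid_to_bin and bin_to_uuid
are hypothetical and used only for illustration; they are not part of Rails
or Ruby.

```ruby
# Hypothetical helpers, not part of Rails or Ruby.

# Convert a human-readable UUID (with or without dashes) to 16 raw bytes.
def uuid_to_bin(uuid_str)
  hex = uuid_str.delete("-")
  raise ArgumentError, "not a UUID: #{uuid_str.inspect}" unless hex.match?(/\A\h{32}\z/)
  [hex].pack("H*") # pack returns a binary (ASCII-8BIT) string
end

# Convert 16 raw bytes back to the lowercase, dashed form.
def bin_to_uuid(uuid_bin)
  raise ArgumentError, "expected 16 bytes" unless uuid_bin.bytesize == 16
  hex = uuid_bin.unpack1("H*")
  [hex[0, 8], hex[8, 4], hex[12, 4], hex[16, 4], hex[20, 12]].join("-")
end

uuid_str = "445d818d-54c6-48d2-bf9e-cb923671dcc4"
uuid_bin = uuid_to_bin(uuid_str)
uuid_bin.bytesize                     #=> 16
uuid_bin.encoding == Encoding::BINARY #=> true
bin_to_uuid(uuid_bin)                 #=> "445d818d-54c6-48d2-bf9e-cb923671dcc4"
```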

What Rails uses UUIDs for:
--------------------------

Searching the repository for the word "UUID" gives 32 results.

Currently, UUIDs are used for three things:

* In Active Record, for PostgreSQL-specific UUID column type
* In Action Pack, for request identifiers
* In Active Job, for job identifiers

Hence, I think that implementing this feature will require only a few changes,
and upgrading will be rather simple for users (if it requires any action at all).

Other considerations:
---------------------

1. All members of the Ruby stdlib Digest module are already classes, but the
digests produced by their instances (or more conveniently, via class
methods) are strings.  Hence, I think that introducing a brand new UUID class
is a better idea than using Digest::UUID for that.

2. What should be printed by #inspect method?  Simply a human readable string
representation?  Or something like #<UUID 17d929bc-e2ea-4a22-a95f-240f84150a9c>?

3. This is a breaking change.  However, Rails 6 development has already started,
so it shouldn't be a problem, right?

4. As noted before, SecureRandom#uuid returns a regular string.  The proposed
change will introduce some incompatibility between SecureRandom#uuid and the
UUID generation methods provided by Rails: they will return values of different
types.

5. There are several gems which provide similar features.  That said, the ones
I have stumbled upon are either barely maintained or database-specific.
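
To make the discussion more concrete, here is a minimal sketch of what such a
class might look like.  Everything in it (the class name, PATTERN, to_bin, the
#inspect format) is an assumption made for illustration, not an existing Rails
API.  It covers normalized equality, immutability and binary conversion, and
it makes no attempt at the same-object guarantee from point 3 above.

```ruby
# A hypothetical UUID value class; the name and API are assumptions,
# not an existing Rails or Ruby class.
class UUID
  # Lenient on purpose: accepts any case, with or without dashes.
  PATTERN = /\A\h{8}-?\h{4}-?\h{4}-?\h{4}-?\h{12}\z/

  def initialize(string)
    raise ArgumentError, "invalid UUID: #{string.inspect}" unless PATTERN.match?(string)
    # Store one canonical form: 32 lowercase hex digits.
    @hex = string.delete("-").downcase.freeze
    freeze # instances are immutable
  end

  # Canonical human-readable form (lowercase, with dashes).
  def to_s
    [@hex[0, 8], @hex[8, 4], @hex[12, 4], @hex[16, 4], @hex[20, 12]].join("-")
  end

  # 16-byte binary form, e.g. for a binary database column.
  def to_bin
    [@hex].pack("H*")
  end

  def ==(other)
    other.is_a?(UUID) && to_s == other.to_s
  end
  alias eql? ==

  # Equal UUIDs must hash alike so they work as Hash keys.
  def hash
    [self.class, @hex].hash
  end

  def inspect
    "#<UUID #{to_s}>"
  end
end

a = UUID.new("123e4567-e89b-12d3-a456-426655440000")
b = UUID.new("123E4567-E89B-12D3-A456-426655440000")
c = UUID.new("123e4567e89b12d3a456426655440000")
a == b && b == c #=> true
a.frozen?        #=> true
a.inspect        #=> "#<UUID 123e4567-e89b-12d3-a456-426655440000>"
```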

Sebastian Skałacki

Feb 20, 2018, 9:34:30 PM
to Ruby on Rails: Core
Oh, I forgot to mention that I am willing to implement this feature after current discussion.

Benjamin Fleischer

Mar 6, 2018, 11:37:28 AM
to Ruby on Rails: Core


On Tuesday, February 20, 2018 at 8:34:30 PM UTC-6, Sebastian Skałacki wrote:
Oh, I forgot to mention that I am willing to implement this feature after current discussion.

Any reason you can't just store and/or compare UUIDs all upcased?

A valid uuid matches /\A[0-9a-f]{8}-[0-9a-f]{4}-[1-5][0-9a-f]{3}-[89ab][0-9a-f]{3}-[0-9a-f]{12}\z/i
but if you want to compare two, for whatever reason, only the case should matter, no?
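
A rough sketch of that check, using the pattern quoted above and a
case-insensitive comparison.  The constant UUID_PATTERN and the method
same_uuid? are hypothetical names chosen for illustration.

```ruby
# The pattern quoted above; the constant and method names are hypothetical.
UUID_PATTERN = /\A[0-9a-f]{8}-[0-9a-f]{4}-[1-5][0-9a-f]{3}-[89ab][0-9a-f]{3}-[0-9a-f]{12}\z/i

# True when both strings match the pattern and differ at most in letter case.
def same_uuid?(a, b)
  UUID_PATTERN.match?(a) && UUID_PATTERN.match?(b) && a.casecmp?(b)
end

same_uuid?("445d818d-54c6-48d2-bf9e-cb923671dcc4",
           "445D818D-54C6-48D2-BF9E-CB923671DCC4") #=> true

# The dashless form does not match this pattern at all.
UUID_PATTERN.match?("445d818d54c648d2bf9ecb923671dcc4") #=> false
```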

Sebastian Skałacki

Mar 8, 2018, 3:05:05 PM
to Ruby on Rails: Core
One reason is space.  A string representation of a UUID is 36 characters long (with dashes), while the number it represents is 16 bytes long.  Obviously, it's very unlikely that one would run out of disk space on a database server because of that.  However, UUID columns are often indexed, and they can even serve as a table's primary key.  And for indexes, size matters.  The bigger the portion of an index you can keep in memory, the better it performs.

Another reason is database-agnosticism.  One can use the uuid type in db/schema.rb, but then this file won't work with databases other than PostgreSQL.  I see no good reason for that.