Proposal - UUID data type

Kless

unread,

Jul 11, 2008, 6:26:06 AM7/11/08

to

It would be very usefull a data type to store UUIDs, as PostgreSQL 8.3
has added [1], where UUIDs are stored as 16 binary bytes.

See: pgsql/src/backend/utils/adt/uuid.c:45:uuid_out

[1] http://www.postgresql.org/docs/8.3/static/datatype-uuid.html

Kless

unread,

Jul 13, 2008, 3:52:52 PM7/13/08

to

This article [1] speaks about why they added UUIDs into her web
application, so it's a real case where it's working. In addition, it
would avoid the collision of keys so if it were used in any tables as
'users', two companies could merge its databases without pain.

[1] http://www.justatheory.com/bricolage/why_uuids.html

Jerry Stuckle

unread,

Jul 13, 2008, 4:37:37 PM7/13/08

to

Rather, I would think you should propose the UUID data type be added to
the SQL standard. There are enough variations from the standards now.

And as it is, UUID's can be stored as a 32 byte string. So it isn't as
if there is not an alternative.

--
==================
Remove the "x" from my email address
Jerry Stuckle
JDS Computer Training Corp.
jstu...@attglobal.net
==================

Kless

unread,

Jul 13, 2008, 5:21:56 PM7/13/08

to

On Jul 13, 9:37 pm, Jerry Stuckle <jstuck...@attglobal.net> wrote:
> Rather, I would think you should propose the UUID data type be added to
> the SQL standard. There are enough variations from the standards now.

Those variations must be managed by the language and not by the RDBMS,
where its *main work* is storing data using the specific data types.
A lot of languages already have modules to working with UUID [1].

> And as it is, UUID's can be stored as a 32 byte string. So it isn't as
> if there is not an alternative.

If they're stored in ASCII form (32 hex digits), would indeed be very
inefficient. So it's necessary that the RDBMS have a specific data
type to handle the UUIDs.

In PostgreSQL they're stored as 16 binary bytes [2], and the core
database does not include any function for generating UUIDs

[1] http://en.wikipedia.org/wiki/Uuid#Implementations
[2] http://anoncvs.postgresql.org/cvsweb.cgi/pgsql/src/backend/utils/adt/uuid.c

Jerry Stuckle

unread,

Jul 13, 2008, 5:29:49 PM7/13/08

to

Kless wrote:
> On Jul 13, 9:37 pm, Jerry Stuckle <jstuck...@attglobal.net> wrote:
>> Rather, I would think you should propose the UUID data type be added to
>> the SQL standard. There are enough variations from the standards now.
> Those variations must be managed by the language and not by the RDBMS,
> where its *main work* is storing data using the specific data types.
> A lot of languages already have modules to working with UUID [1].
>

Yes, they must be managed by the language. Which is why it should be
part of the standard. That way, changing databases does not require
changing code.

>> And as it is, UUID's can be stored as a 32 byte string. So it isn't as
>> if there is not an alternative.
> If they're stored in ASCII form (32 hex digits), would indeed be very
> inefficient. So it's necessary that the RDBMS have a specific data
> type to handle the UUIDs.
>
> In PostgreSQL they're stored as 16 binary bytes [2], and the core
> database does not include any function for generating UUIDs
>
>
> [1] http://en.wikipedia.org/wiki/Uuid#Implementations
> [2] http://anoncvs.postgresql.org/cvsweb.cgi/pgsql/src/backend/utils/adt/uuid.c
>

Yep, which in the grand scheme of things, probably makes zero
difference. The difference between 16 and 32 bytes in any single row is
minuscule.

I am not in favor of adding more database-specific types to ANY database
- and I think PostGres doing it was a mistake.

If there is a demand for it, then it should be added to the SQL
standard. That is the correct way to propose a change. That's why
there are standards.

Kless

unread,

Jul 14, 2008, 5:32:43 AM7/14/08

to

Jerry Stuckle, you should read any comment from 'pgsql.hackers' news:

http://groups.google.com/group/pgsql.hackers/browse_thread/thread/b4101b6eb3b910d6

Jerry Stuckle

unread,

Jul 14, 2008, 7:51:25 AM7/14/08

to

Kless wrote:
> Jerry Stuckle, you should read any comment from 'pgsql.hackers' news:
>
> http://groups.google.com/group/pgsql.hackers/browse_thread/thread/b4101b6eb3b910d6
>

I really don't care what's on pgsql.hackers. I DO care about
interoperability between databases.

As I said - you should learn how to propose changes to a language which
has ANSI standards. Trying to get individual databases to implement
non-standard items is not the way to go. If there's such a demand, then
it's something which should go into the ANSI standard.

Kless

unread,

Jul 14, 2008, 8:43:42 AM7/14/08

to

Well, I think that that answer of Martijn van Oosterhout is
interesting so I'm going to add it here:

----------------

> On Jul 14, 12:51 pm, Jerry Stuckle <jstuck...@attglobal.net> wrote:
> I am not in favor of adding more database-specific types to ANY
> database
> - and I think PostGres doing it was a mistake.

So you think that adding full text indexing, gist/gin indexes, text,
geometric types should have waited until the SQL standard specified
them? With that kind of thinking we'd still be in the database stone
age.

One of postgresql's greatest strengths is user-defined types, lets use
it.

> If there is a demand for it, then it should be added to the SQL
> standard. That is the correct way to propose a change. That's why
> there are standards.

You are ofcourse free to propose it to them, but the question is if
they'd listen...

Have a nice day,
--
Martijn van Oosterhout <klep...@svana.org> http://svana.org/kleptog/
----------------

On Jul 14, 12:51 pm, Jerry Stuckle <jstuck...@attglobal.net> wrote:
> Kless wrote:
> > Jerry Stuckle, you should read any comment from 'pgsql.hackers' news:
>

> >http://groups.google.com/group/pgsql.hackers/browse_thread/thread/b41...

Jerry Stuckle

unread,

Jul 14, 2008, 12:18:22 PM7/14/08

to

Quite frankly, I don't care that PostGres has user-defined types. They
restrict you to a single database, when others might be better for other
reasons.

And yes, I think other things should have been proposed to the SQL
standards committee. It doesn't take that long to get a good proposal
into the standards. No, it isn't immediate. But if there is a case to
be made for it, then the committee will act.

Then all databases get the feature, eventually.

As I said. Do it the right way. Submit your proposal. If you have a
case, it will be added to the SQL standard. If not, then it's not that
important.

Kless

unread,

Jul 15, 2008, 8:08:02 AM7/15/08

to

Finally I want to leave this answer here of Mark Mielke:

-----------------------------

Jerry Stuckle <jstuck...@attglobal.net> wrote:
> Yes, they must be managed by the language. Which is why it should be
> part of the standard. That way, changing databases does not require
> changing code.

You are correct that putting widely used features into a standard that
is implemented by everyone is good.

This does not extend to the conclusion that one should never put in a
feature until it is standard. Look at any successful software product
and see how it usually leads the standard rather than follows it.
People
only tend to make standards once they realize things are getting out
of
control, which is long after the products are in use.

>> In PostgreSQL they're stored as 16 binary bytes [2], and the core
>> database does not include any function for generating UUIDs

> Yep, which in the grand scheme of things, probably makes zero

> difference. The difference between 16 and 32 bytes in any single row
> is minuscule.

This is incorrect. UUID at 16 bytes is already "long" in terms of
being
used as a primary index. In an 8K page, one can only fit 512 UUIDs
(forgetting the requirement for headers) - if it was stored as 32
bytes
- or 36 bytes, or 40 bytes (with punctuation), it would be at less
than
256 UUIDs per page. For a join table joining one set of UUID to
another
set, that's < 256 vs < 128. Doubling the size of an index row roughly
doubles the time to look up the value.

> I am not in favor of adding more database-specific types to ANY
> database
> - and I think PostGres doing it was a mistake.

As somebody who wrote his own module to do UUID for PostgreSQL when I
needed it in PostgreSQL 8.0, I don't agree. Just as you think defining
it in a standard is better than each vendor doing it their own way, I
think doing it in one product is better than each user of the product
doing it their own way.

> If there is a demand for it, then it should be added to the SQL
> standard. That is the correct way to propose a change. That's why
> there are standards.

Provide a real example of any similar product doing this. Exactly
which
enhancement to a standard was defined without even a prototype
existing
used in an existing product that purports to implement the standard?

I'm sure one or two examples must exist, but I cannot think of any.
Every enhancement I can think of that eventually made it into a
standard, was first implemented within a popular product, and then
demanded as a standard to be applied to all other products.

Cheers,
mark

--
Mark Mielke <m...@mielke.cc>
-----------------------------

Jerry Stuckle

unread,

Jul 15, 2008, 8:31:13 AM7/15/08

to

Kless wrote:
> Finally I want to leave this answer here of Mark Mielke:
>
> -----------------------------
> Jerry Stuckle <jstuck...@attglobal.net> wrote:
>> Yes, they must be managed by the language. Which is why it should be
>> part of the standard. That way, changing databases does not require
>> changing code.
>
> You are correct that putting widely used features into a standard that
> is implemented by everyone is good.
>
> This does not extend to the conclusion that one should never put in a
> feature until it is standard. Look at any successful software product
> and see how it usually leads the standard rather than follows it.
> People
> only tend to make standards once they realize things are getting out
> of
> control, which is long after the products are in use.
>

Non-standard features just force people to stick with that one product.
In the long run, the only people who benefit are the product developers.

>>> In PostgreSQL they're stored as 16 binary bytes [2], and the core
>>> database does not include any function for generating UUIDs
>
>> Yep, which in the grand scheme of things, probably makes zero
>> difference. The difference between 16 and 32 bytes in any single row
>> is minuscule.
>
> This is incorrect. UUID at 16 bytes is already "long" in terms of
> being
> used as a primary index. In an 8K page, one can only fit 512 UUIDs
> (forgetting the requirement for headers) - if it was stored as 32
> bytes
> - or 36 bytes, or 40 bytes (with punctuation), it would be at less
> than
> 256 UUIDs per page. For a join table joining one set of UUID to
> another
> set, that's < 256 vs < 128. Doubling the size of an index row roughly
> doubles the time to look up the value.
>

Incorrect. Doubling the size of the index has very little effect on how
long it takes to look up a value. Intelligent databases use a binary
search so doubling the size only means one additional comparison need be
done. And heavily used indexes are generally cached in memory anyway.

>> I am not in favor of adding more database-specific types to ANY
>> database
>> - and I think PostGres doing it was a mistake.
>
> As somebody who wrote his own module to do UUID for PostgreSQL when I
> needed it in PostgreSQL 8.0, I don't agree. Just as you think defining
> it in a standard is better than each vendor doing it their own way, I
> think doing it in one product is better than each user of the product
> doing it their own way.
>

Fine. Whatever you want for your code. But don't expect the rest of
the world to jump because you want it.

>> If there is a demand for it, then it should be added to the SQL
>> standard. That is the correct way to propose a change. That's why
>> there are standards.
>
> Provide a real example of any similar product doing this. Exactly
> which
> enhancement to a standard was defined without even a prototype
> existing
> used in an existing product that purports to implement the standard?
>
> I'm sure one or two examples must exist, but I cannot think of any.
> Every enhancement I can think of that eventually made it into a
> standard, was first implemented within a popular product, and then
> demanded as a standard to be applied to all other products.
>
> Cheers,
> mark
>
> --
> Mark Mielke <m...@mielke.cc>
> -----------------------------
>

Most features added to the SQL standard, for instance. Like explicit
JOINs, recursive SQL and a bunch more. Also changes to the C++ standard
such as exceptions were at least in the process of being evaluated and
approved before they were in any product.

There's a reason for having a process to propose features to a product.
And it does not require the proposed change to be in any product.