Using a UUID as primary key

2,414 views
Skip to first unread message

Wichert Akkerman

unread,
May 6, 2013, 5:22:14 PM5/6/13
to sqlal...@googlegroups.com
I was looking at using a UUID as primary key for a table. Using the backend-agnostic GUID type from http://docs.sqlalchemy.org/en/rel_0_8/core/types.html#backend-agnostic-guid-type I get strange behaviour though. I whipped up a simple test case:


class Data(BaseObject):
    __tablename__ = 'data'
    uuid = Column(GUID(), primary_key=True)
                
            
metadata.create_all()
session = sessionmaker(autocommit=False)()

uuid = 'ac4daeff-3af3-4b83-b8c7-3aa34e34151c'
obj = Data(uuid=uuid)
session.add(obj)
assert session.query(Data).get(uuid) is obj
assert session.query(Data).filter(Data.uuid == uuid).first()  is obj

Both asserts fail when I do this. The problem appears to be conversion from to a uuid.UUID instance in GUID.process_result_value: this always creates a new UUID instance, which probably causes a miss when SQLAlchemy tries to find an instance in its identify map.

Is this a bug in the GUID example in the documentation, or is this something that can be improved in the identity map?

Wichert.

Wichert Akkerman

unread,
May 6, 2013, 5:24:19 PM5/6/13
to sqlal...@googlegroups.com
For reference I have attached a complete test case including a copy of the GUID code from the documentation.

tst.py

Michael Bayer

unread,
May 6, 2013, 6:52:10 PM5/6/13
to sqlal...@googlegroups.com
well what's happening here is fairly simple, the mapper and ORM don't know anything about the conversion from string to UUID.  So when you pass it Data(uuid="some string"), it persists it, passing it off to the Core which converts the UUID, but because you've supplied the primary key, it then assumes the "key" for this object is "some string", not the UUID object, and places it into the identity map with that string identity.   Upon loading the first row from the database, that row comes back with UUID("some string") instead, so now you have two different identities in the identity map.

I don't actually use UUIDs for primary keys, I spent a year with one project that did and it was definitely a mistake, so these days I tend to have the UUID as a supplemental identifier for use in web services and such, but not as the internal primary key, which is why that recipe doesn't have any note about this - but also I don't use the string coercion you see here; if I'm dealing with a UUID I'd normally make sure it's a UUID the moment it enters my application (like if it were in a web form, the form library would coerce it).

Anyway, the immediate solution to this test would be to make sure the value is coerced at the ORM level, most simply by using a @validates decorator.




On May 6, 2013, at 5:24 PM, Wichert Akkerman <wic...@wiggy.net> wrote:

For reference I have attached a complete test case including a copy of the GUID code from the documentation.

--
You received this message because you are subscribed to the Google Groups "sqlalchemy" group.
To unsubscribe from this group and stop receiving emails from it, send an email to sqlalchemy+...@googlegroups.com.
To post to this group, send email to sqlal...@googlegroups.com.
Visit this group at http://groups.google.com/group/sqlalchemy?hl=en.
For more options, visit https://groups.google.com/groups/opt_out.
<tst.py>

Lycovian

unread,
May 7, 2013, 1:07:00 AM5/7/13
to sqlal...@googlegroups.com
I was curious if you might elaborate on the issues with using a GUID as the primary key?  

I use them extensively in my professional database projects where I have multiple remote geographic sites that might have key collisions.  Actually I haven't used a bigint serial PK in years and haven't really had a problem, at least at the DB level.  I'm about to implement a large project using SQLAlchemy (really awesome work BTW) and had planned on using the GUID PK's (stored as varchar) in all of the tables as a partitioning key.  I tend to store the he GUID as a string though (not as PG's UUID native data type), other than the obvious storage wastage are there other issues with SA I should be aware of?

Cheers.

mfw

Michael Bayer

unread,
May 7, 2013, 11:10:56 AM5/7/13
to sqlal...@googlegroups.com
There's no issue with SQLAlchemy, the issues are the space concerns, as well as less efficient indexing and overall degraded performance when you have lots of foreign keys that also need to all be GUIDs - so the space concerns go well beyond just the primary key column itself.   This was using Postgresql.
Reply all
Reply to author
Forward
0 new messages