How to ensure that a property has unique value across all entities of the same kind?

1,624 views
Skip to first unread message

Edoardo Marcora

unread,
Apr 13, 2008, 2:37:27 AM4/13/08
to Google App Engine
How does one replicate the UNIQUE index of SQL databases in the Google
datastore?

I was thinking of writing a validator to pass to the property
constructor, but - if I am reading the docs right - validation is run
when an object is initialized or a property is assigned a new value,
before "put" is executed. Therefore, one could "put" an entity after
the unique validator has run on that property, eventhough another
entity has been saved with the same property value in the datastore by
somebody else.

How does one ensure uniqueness of a property value??? And how about
ensuring uniqueness of two property values together (composite unique
index)?

It would be great to at least have a unique=True/False argument to
pass to the property constructor (like in Django) to ensure
uniqueness. As a corollary, for properties where unique=True one
should be able to fetch an entity by doing
Model.get_by_uniquepropertyname and also
Model.get_or_insert_by_uniquepropertyname... like Rails does with
ActiveRecord (e.g., Book.find_or_create_by_isbn).

Just my $0.02

Dado

Banesiu.Sever

unread,
Apr 13, 2008, 3:39:44 AM4/13/08
to Google App Engine


On Apr 13, 9:37 am, Dado <edoardo.marc...@gmail.com> wrote:
> How does one replicate the UNIQUE index of SQL databases in the Google
> datastore?
>
> I was thinking of writing a validator to pass to the property
> constructor, but - if I am reading the docs right - validation is run
> when an object is initialized or a property is assigned a new value,
> before "put" is executed. Therefore, one could "put" an entity after
> the unique validator has run on that property, eventhough another
> entity has been saved with the same property value in the datastore by
> somebody else.
>
> How does one ensure uniqueness of a property value???

You can ensure the uniqueness of a property value by putting the
entity in datastore, save the key, check if the unique constraint is
meet after the put and if not delete the entity.

def put(self):
key = super(...).put()
if YourClass.gql(...).count() > 1:
YourClass.get(key).delete()
raise YourException

ryan

unread,
Apr 13, 2008, 2:44:30 PM4/13/08
to Google App Engine
Another alternative is to specify the primary key directly, using the
key_name keyword argument to your Model subclass's constructor.

http://code.google.com/appengine/docs/datastore/keysandentitygroups.html
http://code.google.com/appengine/docs/datastore/modelclass.html#Model

You can use Model.get_or_insert() to atomically get an entity with a
given key_name, or create it if it doesn't yet exist.

http://code.google.com/appengine/docs/datastore/modelclass.html#Model_get_or_insert

Edoardo Marcora

unread,
Apr 14, 2008, 2:25:43 AM4/14/08
to Google App Engine
Thanx to both of you for your suggestions.

Banesiu - I can certainly do that, but it feels like a hack... but
perhaps this is the way to go in distributed data-store land ;)

Ryan - that works for the primary key, but not for standard attributes
(since the key_name is immutable and "unique" attributes need not to
be immutable). I just thought that having a "unique" parameter to the
property constructor makes more sense... since the google data store
api appears to follow that of Django, which provides such an option...
but of course Django models are based on SQL dbs.

Dado

On Apr 13, 11:44 am, ryan <ryanb+appeng...@google.com> wrote:
> Another alternative is to specify the primary key directly, using the
> key_name keyword argument to your Model subclass's constructor.
>
> http://code.google.com/appengine/docs/datastore/keysandentitygroups.htmlhttp://code.google.com/appengine/docs/datastore/modelclass.html#Model
>
> You can use Model.get_or_insert() to atomically get an entity with a
> given key_name, or create it if it doesn't yet exist.
>
> http://code.google.com/appengine/docs/datastore/modelclass.html#Model...

Jared

unread,
Apr 15, 2008, 5:49:21 PM4/15/08
to Google App Engine
Can you provide a more clear example of this method?

On Apr 13, 3:39 am, "Banesiu.Sever" <Banesiu.Se...@gmail.com> wrote:
> On Apr 13, 9:37 am, Dado <edoardo.marc...@gmail.com> wrote:
>
> > How does one replicate theUNIQUEindex of SQL databases in the Google
> > datastore?
>
> > I was thinking of writing a validator to pass to the property
> > constructor, but - if I am reading the docs right - validation is run
> > when an object is initialized or a property is assigned a new value,
> > before "put" is executed. Therefore, one could "put" an entity after
> > theuniquevalidator has run on that property, eventhough another
> > entity has been saved with the same property value in the datastore by
> > somebody else.
>
> > How does one ensure uniqueness of a property value???
>
> You can ensure the uniqueness of a property value by putting the
> entity in datastore, save the key, check if theuniqueconstraint is

Banesiu.Sever

unread,
Apr 15, 2008, 7:00:42 PM4/15/08
to Google App Engine
On Apr 16, 12:49 am, Jared <greenwaldja...@gmail.com> wrote:
> Can you provide a more clear example of this method?

Jared, the idea is to define an overloaded method put() in your Model
class that does the job: save the key returned by the super(...).put()
call so you can later delete the entity if it doesn't meet the
uniqueness constraint (you can check that using a gql query).
But be careful, while this method ensures that your datastore doesn't
become corrupt, it doesn't solve all your problems. In worst case
scenario while trying to insert two different entities both put()
calls may fail (if each uniqueness check happens in the same time,
both entities will be deleted and none will end-up in datastore).

Edoardo Marcora

unread,
Apr 17, 2008, 12:34:04 PM4/17/08
to Google App Engine
Please star my issue # 178 at http://code.google.com/p/googleappengine/issues/detail?id=178
requesting a "unique" argument for the property constructor. I don't
know how this could solve the need for composite unique constraints
though, but it's a start. As a corollary, db.Model should allow for
dynamic find and create methods based on the unique key. So for
example, if you have a book model with a unique isbn key, one ought to
be able to do Book.get_or_insert_by_isbn(isbn, *other_required_attrs)
similar to what is already possible with the key_name (get_or_insert
method).

Edoardo Marcora

unread,
Apr 24, 2008, 7:48:36 PM4/24/08
to Google App Engine
> You can ensure the uniqueness of a property value by putting the
> entity in datastore, save the key, check if theuniqueconstraint is
> meet after the put and if not delete the entity.
>
> def put(self):
>     key = super(...).put()
>     if YourClass.gql(...).count() > 1:
>         YourClass.get(key).delete()
>         raise YourException

I actually get a "Can't query inside a transaction" error when I do
this. My code is actually like this (for an article model with a
unique identifier called 'pmid'):

def put(self):
key = super(Article, self).put()
query = Article.gql("WHERE pmid = :pmid", pmid=self.pmid)
if query.count() == 0:
return key
else:
Article.get(key).delete()
query.bind()
return query.get().key()

The error is raised by query.count().

Any way around this????

Dado

Jeff Hinrichs

unread,
Apr 24, 2008, 9:35:35 PM4/24/08
to Google App Engine
This may not answer your question, but have you decided that
Model.get_or_insert()
http://code.google.com/appengine/docs/datastore/modelclass.html#Model_get_or_insert
won't give you the desired result?

Regards,

Jeff

Edoardo Marcora

unread,
Apr 24, 2008, 9:43:16 PM4/24/08
to Google App Engine
Let me explain with an example:

For example, say you have a User model (not the GAE's one) with an
email property that you want to make sure is unique across your user
base (but that can also change from time to time, so that it can't be
embedded in the key_name to ensure uniqueness, since the key_name is
immutable after creation). How would you do ensure that? It would nice
to have a Django-like unique attribute added to property constructors
to ensure uniqueness (e.g., email = db.EmailProperty(unique = True,
required= True). Since email in the aforementioned examples becomes a
unique identifier for user instances, one should also be given dynamic
static methods for unique properties such as these,
User.get_by_email('someb...@somewhere.com') and
User.get_or_insert_by_email('someb...@somewhere.com', **kwattrs).

A problem still remains open though, I would one go about ensuring
uniqueness across multiple properties, like in sql composite unique
indices?

Overall, the mechanisms that we are given to ensure data integrity in
the datastore are still a bit lacking in my opinion.

See also my issue http://code.google.com/p/googleappengine/issues/detail?id=178
> Model.get_or_insert()http://code.google.com/appengine/docs/datastore/modelclass.html#Model...

Jared Greenwald

unread,
Apr 24, 2008, 11:52:56 PM4/24/08
to google-a...@googlegroups.com
Don't you need to fetch before you can use count?

I'm implementing an admin interface for my app (only a few models, so
not a huge deal). Part of the app has blog-like news feed, where the
page for each individual news item has the slug of the new item title
as the url, meaning the slug must be unique. There's probably a
better way to do this, so if anyone wants to critique my code feel
free, but it seems to work fine. Here's the code I use for
adding/editing news items...

def post(self):
content = self.request.get('content')
id = self.request.get('id')
title = self.request.get('title')
slug = slugify(title)
news = db.Query(News)
news.filter('slug =', slug)
foo = news.fetch(limit=1)
if foo:
redir = "/admin/news/error"
else:
if id:
news = News.get_by_id(int(id))
news.content = content
news.updated = datetime.datetime.now()
else:
news = News(author = users.get_current_user(),
content = content, slug = slug,
title = title)
news.put()
redir = "/admin/news/"
self.redirect(redir)

Edoardo Marcora

unread,
Apr 25, 2008, 12:01:37 AM4/25/08
to Google App Engine
What happens if, in the time between fetch and put, some other user
put a page with the same slug? The integrity of your data is
conpromised! This kind of updates to the datastore must be wrapped in
transactions. It is such a common scenario, that I think GAE ought to
provide a unique=True/False option to the property constructor.

Dado

On Apr 24, 8:52 pm, "Jared Greenwald" <greenwaldja...@gmail.com>
wrote:
> On Thu, Apr 24, 2008 at 9:43 PM, Dado <edoardo.marc...@gmail.com> wrote:
>
> >  Let me explain with an example:
>
> >  For example, say you have a User model (not the GAE's one) with an
> >  email property that you want to make sure is unique across your user
> >  base (but that can also change from time to time, so that it can't be
> >  embedded in the key_name to ensure uniqueness, since the key_name is
> >  immutable after creation). How would you do ensure that? It would nice
> >  to have a Django-like unique attribute added to property constructors
> >  to ensure uniqueness (e.g., email = db.EmailProperty(unique = True,
> >  required= True). Since email in the aforementioned examples becomes a
> >  unique identifier for user instances, one should also be given dynamic
> >  static methods for unique properties such as these,
> >  User.get_by_email('someb...@somewhere.com') and
> >  User.get_or_insert_by_email('someb...@somewhere.com', **kwattrs).
>
> >  A problem still remains open though, I would one go about ensuring
> >  uniqueness across multiple properties, like in sql composite unique
> >  indices?
>
> >  Overall, the mechanisms that we are given to ensure data integrity in
> >  the datastore are still a bit lacking in my opinion.
>
> >  See also my issuehttp://code.google.com/p/googleappengine/issues/detail?id=178

Jeff Hinrichs

unread,
Apr 25, 2008, 9:00:20 AM4/25/08
to Google App Engine


On Apr 24, 8:43 pm, Dado <edoardo.marc...@gmail.com> wrote:
> Let me explain with an example:
>
> For example, say you have a User model (not the GAE's one) with an
> email property that you want to make sure is unique across your user
> base (but that can also change from time to time, so that it can't be
> embedded in the key_name to ensure uniqueness, since the key_name is
> immutable after creation). How would you do ensure that? It would nice
> to have a Django-like unique attribute added to property constructors
> to ensure uniqueness (e.g., email = db.EmailProperty(unique = True,
> required= True). Since email in the aforementioned examples becomes a
> unique identifier for user instances, one should also be given dynamic
> static methods for unique properties such as these,
> User.get_by_email('someb...@somewhere.com') and
> User.get_or_insert_by_email('someb...@somewhere.com', **kwattrs).
>
> A problem still remains open though, I would one go about ensuring
> uniqueness across multiple properties, like in sql composite unique
> indices?
If these attributes, that should be unique, and only change
infrequently, why don't you create unique restraints on them, and then
when the user changes them, drop and reinsert the records with the
modified attributes?


Regards,
Jeff

> Overall, the mechanisms that we are given to ensure data integrity in
> the datastore are still a bit lacking in my opinion.
>
> See also my issuehttp://code.google.com/p/googleappengine/issues/detail?id=178

Edoardo Marcora

unread,
Apr 25, 2008, 11:34:16 AM4/25/08
to Google App Engine
How do you create a robust unique restraint? That's the question!

Dado

Jens Scheffler

unread,
Apr 26, 2008, 4:56:55 PM4/26/08
to Google App Engine
fwiw: I put two samples on how uniqueness can be enforced
programatically on

http://blog.appenginefan.com/2008/04/saving-user-specific-data.html
(for users objects as primary key)
http://blog.appenginefan.com/2008/04/unique-data-part-2.html (for
general key/value pairs with the key as a string)

I'm not a python expert, so ping me if there is an easier way of doing
this or if you find a bug :-)
These samples use the primary key of the models, so one weakness is
definitely that the element that I enforce uniqueness on becomes
immutable. If anyone has a better way, please link samples to this
thread (or email them to me directly).

Cheers,

Jens

Edoardo Marcora

unread,
Apr 26, 2008, 10:46:40 PM4/26/08
to Google App Engine
> These samples use the primary key of the models, so one weakness is
> definitely that the element that I enforce uniqueness on becomes
> immutable. If anyone has a better way, please link samples to this
> thread (or email them to me directly).

You said it. I too am ensuring uniqueness by embedding the unique
property value in the key_name... but that feels like a hack, and it
does not work for unique properties that are not immutable (and even
things that are said to be immutable may become mutable later on, see
the ISSN and ISBN numbers... that's why we have always been told to
not use these id schemes as primary keys).

I would hope some Google Datastore guy could post on this thread and
give us some best practice heads up. I wish they would implemente
unique=True/False on property constructor or at least provide a unique
constrain on indexes.

Dado
Reply all
Reply to author
Forward
0 new messages