Controlling user access to data with queries

851 views
Skip to first unread message

brendan

unread,
Feb 26, 2010, 7:20:39 AM2/26/10
to objectify-appengine
I've been thinking about how I'm going to control user access to my
data.

Problem
=======
I've got 3 main types of Entity objects. I'll call them A, B & C.
For each A, there are many B's. For each B, there are many C's.

Each user may have access rights to many A's, and each A may have many
users that can access it.
If a user has access to an A, then it has access rights to the related
B's and C's.

Queries
============
- List A's that a user has access to
- List B's that a user has access to (without filtering to a single B)
- List C's that a user has access to (without filtering to a single A
or B)
- List users that have access to an A

Proposed Entity Model
=====================

@Entity
public class A {
@Id Long id;

}

@Entity
public class B {
@Id Long id;
Key<A> a;
}

@Entity
public class C {
@Id Long id;
Key<B> b;
Key<C> a; // denormalized data
}

@Entity
public class UserAccount {
@Id Long id;
com.google.appengine.api.users.User user;
}

@Entity
public class UserAccess {
@Id Long id;
Key<A> a;
Key<UserAccount> userAccount;
}


List all A's that a user has access to
----------------------------------------

- query(UserAccess.class).filter("userAccount", userAccount).fetch(),
then for each item, get(A.class, item.a)
- add @Parent on UserAccess::a, then
query(UserAccess.class).filter("userAccount",
userAccount).fetchKeys(), then for each key, get(A.class,
key.getParent())
- Would instead putting @Parent on UserAccess::userAccount improve
performance, as all UserAccess entities for a user would be stored in
the same entity group.
- Perhaps eliminate UserAccess entity all together, and add
'Key<UserAccount>[] users' to A. Then I could go
query(A.class).filter("users", userAccount); There probably wont be
more that 5000 users per A.

List all B's that a user has access to
----------------------------------------

- query(UserAccess.class).filter("userAccount",
userAccount).fetchKeys(), then query(B.class).filter("a in", keys).
This will fail if a user has access to more than 30 A's (which is not
a normal use case, but could happen).
Instead could just go query(B.class).fetch(), and then check each B
and see if b.a is in the list keys. Doesn't sound scalable.

* List all C's that a user has access to
----------------------------------------
- pretty much the same queries as B

* List users that have access to an A
----------------------------------------
- query(UserAccess.class).filter("a", a).fetch()
- Could add "Key<A>[] as" to the UserAccount entity, meaning I don't
have to run so many queries about user access. Also, could put @Cache
on UserAccount as i'll probably be accessing it quite a bit.

Your comments please?

brendan

unread,
Feb 26, 2010, 11:31:51 AM2/26/10
to objectify-appengine
If the user does not have access to an A, then the B's related to an A
should not show up when a user performs a query.

In SQL I would do something like

SELECT ...
FROM B, A, USER_ACCESS
WHERE B.A = A.ID AND A.ID = USER_ACCESS.A AND USER_ACCESS.ID = ?

How do I perform that query on AppEngine with Objectify?

Scott Hernandez

unread,
Feb 26, 2010, 12:20:22 PM2/26/10
to objectify...@googlegroups.com
You could store ACLs (grant/revoke lists) on each object you want to
protect. You will have to make sure that you check that the user can
access that object (and then only get the linked/child data if so).

There is no support for joins in the datastore, but you can do this
stuff in your own code with multiple queries. There are some tricks
with parents (ancestors) but that may cause other problems if you ever
want to change parentage or with high volume writes in an entity
group.

class SecureData {
@Id Long id;
List<Key<UserAccount>> allowedUser = new ArrayList<Key<UserAccount>>();
String secureData;
}

class MoreSecureData {
@Id Long id;
Key<SecureData> secureRoot;
String moreSecureData;
}

//Querying
UserAccount me = ...;
Key<Account> meKey = dao.fact().getKey(me);
//returns SecureData instances you have access to
Iterable<SecureData> it =
dao.ofy().query(SecureData.class).filter("allowedUser =", meKey);


//checking MoreSecureData
Long id = someIdiWant;
//get the actual data, then check if the user has access
MoreSecureData msd = dao.ofy().get(MoreSecureData.class, id)
SecureData sd = dao.ofy().get(msd.getSecureRoot);
if(sd.getAllowedUsers().contains(meKey)) return msd;

You can optimize this a bit, but you get the idea.

Jeff Schnitzer

unread,
Feb 26, 2010, 12:25:41 PM2/26/10
to objectify...@googlegroups.com
This becomes even more practical if you create a Role concept such
that User has one or more Roles and you grant permission to Roles.
Hopefully you would cause less rewriting of data when security
permissions change.

I'm writing up a separate email that has some advice for doing it
strictly with queries, not precalculating anything.

Jeff

Jeff Schnitzer

unread,
Feb 26, 2010, 12:52:14 PM2/26/10
to objectify...@googlegroups.com
In SQL-land you would do this with a series of joins (two for listing
Bs, three for listing Cs), but AppEngine doesn't offer joins. This is
one of those things that RDBMSes make easy and AppEngine makes hard -
with some justification, since the scalability of joins (especially
multiple joins) is questionable. Still, that's not much consolation.

One helpful way to think about the problem is that it's not true that
AppEngine doesn't have joins. It just doesn't do joins for you. You
can do as many joins as you want, you just have to write them
yourself. How would an RDBMS implement that query? Just do the same.

You basically spelled it out:

* You start with a user id
* Get all A's that a user has access to
* For each A:
* Get all B's that are related to the A.
* For each B:
* Get all C's that are related to B.

Yeah, this might perform terribly depending on how many As and Bs and
Cs there are. The only improvement on your code I would make is to
put UserAccess in A's entity group:

public class UserAccess {
@Id Long id;

@Parent Key<A> a;
Key<UserAccount> userAccount;
}

Now your initial translation of userAccount -> Key<A> can be a
fetchKeys() query which will be more efficient than fetching the
UserAccess object.

You might consider caching the results, either in memcache or in the
database itself. This isn't all that different than Scott's
suggestion of ACLs, although I don't think you would want to store the
ACL in the A,B,C objects themselves since loading/saving them now
becomes very expensive. Create a separate index entity that holds the
permission:

class BAccess {
@Id Long id;
@Parent Key<B> b;
Key<UserAccount> userAccount;
}
class CAccess {
@Id Long id;
@Parent Key<C> c;
Key<UserAccount> userAccount;
}

Now, whenever a user is granted access to a new A (or access is
revoked from an A), fire off a batch job that creates or deletes all
the relevant Access entities.

This is the tradeoff of appengine - you can have fast queries by
precalculating, but this makes writes expensive. You can have
(relatively) cheap writes by leaving off indexes, but they your
queries are slow (or impossible). You might find a middle ground with
caching. But there's no magic bullet.

Jeff

brendan

unread,
Feb 27, 2010, 2:37:45 AM2/27/10
to objectify-appengine
Thank you. I'm just implementing some of your suggestions now and
seeing how it goes.

I'm wondering about the class BAccess that you suggested.

class BAccess {
        @Id Long id;
        @Parent Key<B> b;
        Key<UserAccount> userAccount;
}

I notice it's in the same same entity group as B. This could be good,
because then:
- I'll be able to save all the BAccess entities for a B in a single
transaction
- I'll be able to to query BAccess by just fetching keys and then
getting the parent

You said you don't think I would want to store the ACL in the A,B,C
objects themselves since loading/saving them would becomes very
expensive.

If the BAccess entities are stored in the same entity group, any
change to B will require and the entire entity group is rewritten for
each change. Is this an expense I need to worry about?

Brendan

Jeff Schnitzer

unread,
Feb 27, 2010, 1:54:09 PM2/27/10
to objectify...@googlegroups.com
On Fri, Feb 26, 2010 at 11:37 PM, brendan <bre...@doherty.net.nz> wrote:
>
> If the BAccess entities are stored in the same entity group, any
> change to B will require and the entire entity group is rewritten for
> each change.   Is this an expense I need to worry about?

This sounds like a conceptual misunderstanding - entity groups are
entities that live on the same server in the cluster and can be
transacted upon. They are still separate entities and can be written
independently.

Jeff

Scott Hernandez

unread,
Feb 27, 2010, 2:18:10 PM2/27/10
to objectify...@googlegroups.com
If you have "roles/groups" of users that are common to multiple
objects then it might be better to store them separately, as Jeff
suggested. If each object that has ACLs on it does not share those
with other objects then there is little benefit to storing the ACLs in
a separate object, as Jeff has suggested. It depends on your use
cases.

On Fri, Feb 26, 2010 at 11:37 PM, brendan <bre...@doherty.net.nz> wrote:

> Thank you.  I'm just implementing some of your suggestions now and
> seeing how it goes.

[snip]


> You said you don't think I would want to store the ACL in the A,B,C
> objects themselves since loading/saving them would becomes very
> expensive.

This depends on how many entries each ACL might contain. If it is
small, and unique to each A/B/C, then keeping the data there might
make more sense than attaching it to a separate object, which you will
have to get in addition datastore request. This all depends on your
usage patterns and query needs.

> If the BAccess entities are stored in the same entity group, any
> change to B will require and the entire entity group is rewritten for
> each change.   Is this an expense I need to worry about?

No, the whole entity group does not have to be re-written when any
member object changes. But as you have read, you can only have
transactions within an entity group (entities that all share the same
root ancestor -- the topmost parent).

brendan

unread,
Feb 28, 2010, 5:11:12 AM2/28/10
to objectify-appengine
Yep, conceptual understanding.

I went back and watched the videos on datastores and confirmed it was
my mistake.

After listening to Google I/O 2008 App Engine Datastore Under the
Covers, only the root entity and then BAccess entity modified is
written when a change to a BAccess occurs.

On Feb 28, 7:54 am, Jeff Schnitzer <j...@infohazard.org> wrote:

Reply all
Reply to author
Forward
0 new messages