Achieving fine-grained access control with TinkerPop


bz57

Jul 6, 2018, 12:53:41 AM
to Gremlin-users
Hi -

How is it possible to limit visibility of Vertex or Edge data in queries depending on user roles?

For example: let's say we have vertices with properties as shown below: 
  • Cars {American, German, Japanese}.
  • Planes {American, European}.
What is the most scalable way of restricting a user so s/he sees only cars and planes made by Americans in the results? Ideally, I don't want to retrieve all the data and filter the response before the API returns it, as this can be a performance hit, especially with large data sets. Does TinkerPop offer edge-, vertex- or property-level security?

Any ideas or pointers will be appreciated!

Thanks in advance!

HadoopMarc

Jul 6, 2018, 5:47:59 AM
to Gremlin-users

The easiest feature to limit visibility inside the query is:


An example with SubgraphStrategy that goes to property level:



Without having tested this, I presume that you cannot use this feature for RBAC (that is, by exposing the resulting TraversalSource gsub), because the TinkerPop APIs are wide open. E.g.

gsub.V().getGraph().get().traversal().V()


would give you complete access again.
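
To make the concern concrete, here is a minimal sketch in plain Python. It is a toy model and not the TinkerPop API: the `Graph` and `RestrictedSource` classes and the `get_graph` method are hypothetical stand-ins for the graph, the filtered TraversalSource gsub, and the getGraph call above.

```python
# Toy model only: plain Python, not the TinkerPop API.
from dataclasses import dataclass, field


@dataclass
class Graph:
    """Unrestricted store: a list of vertex dicts."""
    vertices: list = field(default_factory=list)

    def V(self):
        return list(self.vertices)


class RestrictedSource:
    """Hypothetical stand-in for a traversal source with a vertex filter attached."""

    def __init__(self, graph, vertex_filter):
        self._graph = graph
        self._vertex_filter = vertex_filter

    def V(self):
        # The filter lives inside the source, so callers never see rejected vertices.
        return [v for v in self._graph.V() if self._vertex_filter(v)]

    def get_graph(self):
        # The escape hatch: handing back the underlying graph bypasses the filter.
        return self._graph


graph = Graph([
    {"label": "car", "country": "American"},
    {"label": "car", "country": "German"},
    {"label": "car", "country": "Japanese"},
    {"label": "plane", "country": "American"},
    {"label": "plane", "country": "European"},
])

gsub = RestrictedSource(graph, lambda v: v["country"] == "American")

restricted = gsub.V()           # only the American car and plane
leaked = gsub.get_graph().V()   # every vertex, filter bypassed
print(len(restricted), len(leaked))
```

In this model the restriction only holds if callers cannot reach `get_graph`, which is the same reason the filtered source alone does not amount to access control.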

Cheers, Marc




Stephen Mallette

Jul 6, 2018, 8:24:11 AM
to Gremlin-users
HadoopMarc has the right ideas. I'd add to those by saying that it's not clear from your post how your system is being built, but you basically need to protect your "g" from unauthorized usage. If you can do that, then you probably have the basis for a fairly secure system with Partition/SubgraphStrategy. In Gremlin Server that would mean sandboxing the script environment to prevent users from calling unauthorized methods that would get them the underlying graph - see "Protecting Script Execution" in the Gremlin Server documentation.


I think that this is a great example of why script execution really needs to be made a secondary function to bytecode-based processing. With bytecode there is no way to execute unexpected functions (unless you are forced to use lambdas). Hopefully we can move closer and closer to that goal of being done with scripts for good.
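
The difference between scripts and bytecode can be sketched with a toy interpreter in plain Python. This is not TinkerPop's bytecode format or API: the `run` function, the `ALLOWED_STEPS` set and the tuple encoding are all assumptions made for illustration. The point is only that a request expressed as a list of named steps can be checked against a fixed vocabulary, whereas an arbitrary script can call anything.

```python
# Toy model only: this is not TinkerPop's bytecode format or API.
ALLOWED_STEPS = {"V", "has", "values"}  # hypothetical allowlist of step names


def run(steps, vertices):
    """Interpret a list of (step_name, *args) tuples against a list of vertex dicts."""
    result = []
    for name, *args in steps:
        if name not in ALLOWED_STEPS:
            # Anything outside the fixed vocabulary is rejected before it runs.
            raise PermissionError(f"step not allowed: {name}")
        if name == "V":
            result = list(vertices)
        elif name == "has":
            key, value = args
            result = [v for v in result if v.get(key) == value]
        elif name == "values":
            (key,) = args
            result = [v[key] for v in result if key in v]
    return result


vertices = [
    {"label": "car", "country": "American"},
    {"label": "car", "country": "German"},
    {"label": "plane", "country": "American"},
]

labels = run([("V",), ("has", "country", "American"), ("values", "label")], vertices)
print(labels)
```

A request such as `run([("getGraph",)], vertices)` raises `PermissionError` here, because `getGraph` is not in the allowlist.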

You did ask for the most "scalable" approach also - as I alluded to in a separate thread a few moments ago, I think you'll find that there aren't many general purpose schema patterns in graphs (that seems to be my experience anyway). What may work really well in one domain may not work well in a different domain. For your simple example, Partition/SubgraphStrategy would probably work pretty well assuming you had appropriate indices defined for those countries - appropriate being indices that could quickly return results with low selectivity. Depending on your traversal patterns you might also question whether or not you need to push the "country" property to edges and denormalize a bit so as to avoid traversing to the adjacent vertex to detect if the country is accessible or not. Or maybe based on your query types, you should dump the "country" property altogether in favor of a "country" vertex with unidirectional edges (if your graph supports that). Or perhaps your graph natively supports some form of Graph Element-LAC?
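
The edge denormalization idea can be illustrated with a small plain-Python model. The data, the "sells" edge label and both helper functions are hypothetical, and this is not how any particular graph database stores data. It only shows that copying "country" onto the edge lets the filter run without touching the adjacent vertex.

```python
# Toy model only: plain dicts, not a real graph database or the TinkerPop API.
vertices = {
    1: {"label": "dealer"},
    2: {"label": "car", "country": "American"},
    3: {"label": "car", "country": "German"},
}
# Each edge carries a copy of the target vertex's "country" (the denormalized part).
edges = [
    {"out": 1, "in": 2, "label": "sells", "country": "American"},
    {"out": 1, "in": 3, "label": "sells", "country": "German"},
]


def visible_by_vertex(start, allowed):
    """Filter on the adjacent vertex: one vertex lookup per outgoing edge."""
    lookups, found = 0, []
    for e in edges:
        if e["out"] == start:
            lookups += 1
            if vertices[e["in"]].get("country") in allowed:
                found.append(e["in"])
    return found, lookups


def visible_by_edge(start, allowed):
    """Filter on the copy stored on the edge: no vertex lookup needed."""
    return [e["in"] for e in edges if e["out"] == start and e["country"] in allowed]


by_vertex, lookups = visible_by_vertex(1, {"American"})
by_edge = visible_by_edge(1, {"American"})
print(by_vertex, by_edge, lookups)
```

Both functions return the same vertex ids. The trade-off is that the copy on the edge has to be kept in sync whenever the vertex property changes.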

I don't mean to create lots of questions with no answers, but I'm just pointing out that there are a lot of options and typically no general purpose silver bullet for every case of auditing and authorization. After chasing these two general purpose features for my near decade at TinkerPop and having no surefire solution for every case, my current thinking is that: (1) TinkerPop should not try to provide those abstractions beyond Partition/SubgraphStrategy. Those are tools for delivering that kind of functionality; they have held up pretty well over time and sit at the right layer of abstraction for TinkerPop, allowing adaptation to a wide range of usage. (2) Perhaps we will now see graphs evolve a bit to better support these functions, especially as graphs reach into enterprise-class business. Vendors are bound to run into clients with these requirements, which will pressure them to produce these kinds of features.



--
You received this message because you are subscribed to the Google Groups "Gremlin-users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to gremlin-user...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/gremlin-users/d84711df-4e35-4b30-a1e3-5445ca34165f%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

bz57

Jul 6, 2018, 11:32:09 PM
to Gremlin-users
HadoopMarc & Stephen - thank you for your replies. I will look into SubgraphStrategy. 

Do you know of any other graph database that can offer RBAC? There was cell-level security in Apache Accumulo but not sure if that product is getting much adoption.


Stephen Mallette

Jul 9, 2018, 8:32:31 AM
to Gremlin-users
I think OrientDB has some form of record-level security. Neo4j has property-level security, I think. I think DSE Graph has some form of RLAC, though I'm not sure what versions it's enabled on. Not sure about other options off the top of my head.
