Apply a static has() step to all visited nodes/edges for a given Gremlin traversal query


Sandip Paul

Sep 4, 2020, 6:31:47 AM
to Gremlin-users
We have been discussing this use case for a few days on Stack Overflow.

Stephen pointed out a few suggestions that should work, but to make them work I need a deeper understanding of how a given query is executed end to end.

Summary of the suggested approach:
1/ Develop a custom TraversalStrategy, i.e. a CustomUserPermissionStrategy similar to SubgraphStrategy or PartitionStrategy, which would take the user's permissions on construction and then automatically inject the necessary has() steps after out() / in() sorts of steps.
Then use that strategy to instantiate g in empty-sample.groovy, something like this:
globals << [g: graph.buildTransaction().readOnly().start().traversal().withStrategies(CustomUserPermissionStrategy.build())]
  
- Not sure whether this will work.

Next, we need access to the user principal from the HttpRequest. That means the same details need to be made available to CustomUserPermissionStrategy, so that we can invoke the entitlement engine's REST API for that principal and apply the user's permissions as a filter such as or(has('permission', 'team1'), has('permission', 'team2')) on all visited nodes.
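To make the shape of that filter concrete, here is a minimal sketch that turns a list of permissions (as the entitlement engine might return them) into the or(has(...), ...) text shown above. It is only an illustration: PermissionFilterText and its methods are hypothetical names, not part of TinkerPop, and a real strategy would more likely build the filter with the traversal API than as text. Text is used here only so the sketch runs with the JDK alone.

```java
import java.util.List;
import java.util.stream.Collectors;

/** Hypothetical helper (not a TinkerPop class): renders a permission filter as Gremlin text. */
final class PermissionFilterText {

    private PermissionFilterText() {}

    /** Escapes backslashes and single quotes so the value is safe inside a single-quoted string. */
    static String quote(String value) {
        return "'" + value.replace("\\", "\\\\").replace("'", "\\'") + "'";
    }

    /** Builds or(has(key, p1), has(key, p2), ...) for the given permissions. */
    static String orOfHas(String key, List<String> permissions) {
        // Fail closed: an empty list must not silently turn into "no filter at all".
        if (permissions.isEmpty()) {
            throw new IllegalArgumentException("user has no permissions");
        }
        return permissions.stream()
                .map(p -> "has(" + quote(key) + ", " + quote(p) + ")")
                .collect(Collectors.joining(", ", "or(", ")"));
    }

    public static void main(String[] args) {
        // prints: or(has('permission', 'team1'), has('permission', 'team2'))
        System.out.println(orOfHas("permission", List.of("team1", "team2")));
    }
}
```

Rejecting an empty permission list is a design choice in this sketch: a user without entitlements should see nothing rather than everything.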

2/ To achieve this, it looks like we need to write our own HttpGremlinEndpointHandler and HttpChannelizer.

Questions:
a) What is the best way to learn about the internal workflow of the query execution pipeline?
b) How do I plug these into the JanusGraph Gremlin Server config?
c) Lastly, are we on the right path to achieve this?

Any help would be highly appreciated. We have been struggling with this for quite some time.

Thanks
Sandip 

I would appreciate some help or a sample code snippet.

Stephen Mallette

Sep 8, 2020, 7:20:50 AM
to gremli...@googlegroups.com
a) What is the best way to learn about the internal workflow of the query execution pipeline?

Unfortunately, you have to look at the code of Gremlin Server itself. Here are some pointers:

A Channelizer 


creates the execution pipeline. You would want to build your own. Since you are doing REST only, you would want to build your implementation from this:


From there you would likely replace the HttpGremlinEndpointHandler with your own:


The HttpGremlinEndpointHandler is what processes a Gremlin request:


and by the time the request gets to this point it will have already been authenticated. You have access to the FullHttpRequest here:


and therefore have complete flexibility in deciding how to apply your strategy. In your case you want to apply your CustomUserPermissionStrategy, so you care about the "bindings" passed to the GremlinScriptEngine. The bindings get created here:


and you can see that you would need to detect the TraversalSource requested and provide a new version of that TraversalSource to the bindings with your strategy added:
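As a rough stand-alone model of that rebinding step (not the Gremlin Server code itself), the sketch below looks up the shared traversal source by name and places a per-request, restricted copy into the bindings while leaving the shared one untouched. Every type here is a hypothetical stand-in: Source plays the role of a TraversalSource, restrictedTo plays the role of adding your strategy, and the plain Map plays the role of the script bindings.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Stand-alone model of per-request rebinding; none of these types are Gremlin Server classes. */
final class RequestBindingsModel {

    private RequestBindingsModel() {}

    /** Stand-in for a TraversalSource: immutable, so "adding" a filter yields a new object. */
    static final class Source {
        final String name;
        final List<String> permissions; // empty list = the unrestricted, server-wide source

        Source(String name, List<String> permissions) {
            this.name = name;
            this.permissions = List.copyOf(permissions);
        }

        /** Models "same source, plus the user's permission filter". */
        Source restrictedTo(List<String> userPermissions) {
            return new Source(name, userPermissions);
        }
    }

    /** Builds the bindings for one request: the requested name maps to a restricted copy. */
    static Map<String, Object> bindingsFor(Map<String, Source> serverSources,
                                           String requestedName,
                                           List<String> userPermissions) {
        Source shared = serverSources.get(requestedName);
        if (shared == null) {
            throw new IllegalArgumentException("unknown traversal source: " + requestedName);
        }
        if (userPermissions.isEmpty()) {
            throw new IllegalArgumentException("user has no permissions");
        }
        Map<String, Object> bindings = new HashMap<>();
        bindings.put(requestedName, shared.restrictedTo(userPermissions));
        return Collections.unmodifiableMap(bindings);
    }

    public static void main(String[] args) {
        Map<String, Source> server = Map.of("g", new Source("g", List.of()));
        Map<String, Object> bindings = bindingsFor(server, "g", List.of("team1", "team2"));
        Source scoped = (Source) bindings.get("g");
        System.out.println(scoped.permissions);          // the per-request copy carries the filter
        System.out.println(server.get("g").permissions); // the shared source is unchanged
    }
}
```

The point of the model is that the server-wide source is never mutated, so one user's permissions cannot leak into another user's request.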


One thing that has always bothered me about these instructions (as I've given them before on multiple occasions that I can't seem to find now) is that we do the authentication and then just throw the "user" away:


It probably should be added to the Netty pipeline context so that it can be referenced later as needed. If that were implemented, I think it would simplify this process a bit further, as there would be no need to pick through a request message to find the user again later in the pipeline. I've created an issue:


I'd be happy to review a pull request if anyone feels like picking that up.
 
b) How do I plug these into the JanusGraph Gremlin Server config?

You only need to worry about one configuration setting: the one for your Channelizer implementation. Just make sure the jar file containing your implementation is on Gremlin Server's path and then reference it here:
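For illustration, the setting in question is the `channelizer` key of the Gremlin Server YAML file. The fragment below is a sketch only: the class name com.example.CustomHttpChannelizer is a hypothetical placeholder for your own implementation, and the file location in the comment is an assumption about a typical JanusGraph distribution.

```yaml
# gremlin-server.yaml (often under conf/gremlin-server/ in a JanusGraph distribution; yours may differ)
# com.example.CustomHttpChannelizer is a hypothetical name; use the fully qualified name of your class.
channelizer: com.example.CustomHttpChannelizer
```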

 
c) Lastly, are we on the right path to achieve this?

Others have done this in their own way but the basic formula is the same and is what I've described here.  



Stephen Mallette

Sep 8, 2020, 7:26:48 AM
to gremli...@googlegroups.com
As an aside - note that there is currently a pull request open for discussion on adding "authorization" to Gremlin Server:


This thread seems to have some relation to that concept in my mind.


Sandip Paul

Sep 9, 2020, 1:37:11 PM
to Gremlin-users
Thanks Stephen. This is really helpful.

Sandip Paul

Sep 10, 2020, 12:03:13 AM
to Gremlin-users
It seems to work with a custom HttpGremlinEndpointHandler (HttpChanelizerCustom.class and HttpGremlinEndpointHandlerCustom.class): the handler looks up the already available TraversalSource (this.graphManager.getTraversalSource("g")) and adds a SubgraphStrategy with has() predicates for the user's permissions, so that they apply to every vertex visited by the requested Gremlin query.

Thanks again for the detailed explanation.

Would the approach be the same for WsAndHttpChannelizer? Could you please share some pointers?

HadoopMarc

Sep 10, 2020, 2:14:46 AM
to Gremlin-users
Hi Sandip and Stephen,

I would say that the existing SubgraphStrategy is already able to filter vertices on their properties using a has() step. The docs have an explicit example.

Code examples for the DSL approach can be found on:


with explanations on:


This work was scoped towards simplifying code reviews regarding the security of gremlin queries applied in trusted http endpoints.

Note that security for script-based requests from end users remains very fragile (so: unacceptable) with a DSL or SubgraphStrategy approach, because the approach has to screen queries like g.getGraph().traversal().V() to prevent full access to the graph. These backdoors are not present for bytecode requests to Gremlin Server because g.getGraph() is not available for the AnonymousTraversalSource.

Work to serve different TraversalSource instances (with different SubgraphStrategy configurations) from the same Gremlin Server and to authorize users is proposed here:

As you are going to dive into this, I would certainly be interested in any insights you will develop!

Best wishes,    Marc
