has() vs filter() and <graph-query> vs <vertex-query> confusion


Adolfo Rodriguez

Dec 26, 2013, 3:31:59 PM
to gremli...@googlegroups.com
Hi, I know this issue has been discussed before, but I got confused and would like to open this thread to clarify it.

In my current understanding:
* we have both:
  * indexed steps (which benefit from an indexing backend when applied to an externally indexed key, e.g. has()), and
  * non-indexed steps (discouraged for performance reasons, e.g. filter())
* on the other hand, in a GremlinPipeline, we have both:
  * a graph query -> with has() as a method
  * a vertex query -> with both has() and filter() as methods

All the examples of indexing with a backend appear to use a graph query. On the other hand, the recommendation is to use has() over filter() to take advantage of indexing. This is claimed here:

has: [...] Titan will actually try to use indices where it sees the opportunity to do so 

If, in a vertex query, I browse the implementations of has() and filter(), I do not see a big difference in the code. Both do their work in processNextStart():
* has() evaluates the predicate -> this.predicate.evaluate(element.getProperty(this.key), this.value)
* filter() calls compute -> this.filterFunction.compute(s)
The rest of the code in the vertex query is agnostic to whether a has() or a filter() was used, so neither of them looks really indexed.

Actually, I do not see any attempt at index-based search in the vertex query code. So:

* Are you referring exclusively to graph queries when making claims about indexing? (From what I see in the code, this looks like the only possibility.)

* If so, when encouraging has() over filter(), are you perhaps comparing the has() in a graph query with the filter() in a vertex query (i.e. disregarding the has() in a vertex query)? The discussion would then be about which query type to use rather than which step to use.

* If so, does this mean that the expected performance of has() and filter() in a vertex query is about the same (considering that currently neither of them is really indexed in the code)?

Either I am missing something, or the documentation seems to confuse has() in an index query with has() in a graph query. I am also missing any mention of whether vertex queries benefit from indexing (something I previously assumed) or not.

Thanks

Adolfo Rodriguez

Dec 26, 2013, 3:51:39 PM
to gremli...@googlegroups.com
As an amendment, here is a note on what I understand by graph query and vertex query:

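new GremlinPipeline()
    // graph query: has() is called on g.query()
    .start(g.query().has(blah)..........vertices())
    // vertex query (as I understand it): has() and filter() as pipeline steps
    .has(blah)
    .filter(new PipeFunction<Vertex, Boolean>() {
        public Boolean compute(Vertex source) {
            .....
            return true;
        }
    })
    .toList();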

Regards

Stephen Mallette

Dec 30, 2013, 7:33:43 AM
to gremli...@googlegroups.com
> If in a vertex query, I browse the implementation of has() and filter(), I do not see big difference in code. They both call to processNextStart():
> * has() executes the predicate -> this.predicate.evaluate(element.getProperty(this.key), this.value)
> * filter() executes a call to compute -> this.filterFunction.compute(s)
> Rest of code in the vertex query is agnostic if either a has() or a filter() was in place so none of them looks really indexed.

Pipes simply uses what Blueprints exposes as an interface. It is up to the underlying graph database to handle the optimization of a Predicate, so looking at the code in Pipes isn't going to show you how or when indexes are utilized. If you want to see more detail on how Titan handles vertex/graph queries, you probably want to start looking here:



Stephen
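
A toy sketch may help illustrate the point about where optimization can happen. It is not Titan, Blueprints, or Pipes code: ToyBackend, createIndex, and elementsScanned are hypothetical names invented for this example. The idea it tries to show is that a declarative has(key, value) gives the backend a key and a value it can inspect and answer from an index, whereas filter() receives an opaque function that can only be applied to each element in turn.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

public class HasVsFilterSketch {

    /** Hypothetical stand-in for a graph backend; vertices are plain property maps. */
    public static class ToyBackend {
        private final List<Map<String, String>> vertices = new ArrayList<>();
        // property key -> property value -> vertices carrying that value
        private final Map<String, Map<String, List<Map<String, String>>>> index = new HashMap<>();
        /** Counts how many vertices were touched by full scans. */
        public int elementsScanned = 0;

        /** Declares an index on a key; call before adding vertices. */
        public void createIndex(String key) {
            index.put(key, new HashMap<>());
        }

        public void addVertex(Map<String, String> vertex) {
            vertices.add(vertex);
            for (Map.Entry<String, Map<String, List<Map<String, String>>>> e : index.entrySet()) {
                String value = vertex.get(e.getKey());
                if (value != null) {
                    e.getValue().computeIfAbsent(value, k -> new ArrayList<>()).add(vertex);
                }
            }
        }

        /** Declarative predicate: the backend sees key and value, so it may use an index. */
        public List<Map<String, String>> has(String key, String value) {
            Map<String, List<Map<String, String>>> byValue = index.get(key);
            if (byValue != null) {
                return new ArrayList<>(byValue.getOrDefault(value, Collections.emptyList()));
            }
            // No index on this key: fall back to scanning every vertex.
            return filter(v -> value.equals(v.get(key)));
        }

        /** Opaque function: the backend cannot look inside it, so it scans every vertex. */
        public List<Map<String, String>> filter(Predicate<Map<String, String>> fn) {
            List<Map<String, String>> out = new ArrayList<>();
            for (Map<String, String> v : vertices) {
                elementsScanned++;
                if (fn.test(v)) {
                    out.add(v);
                }
            }
            return out;
        }
    }

    private static Map<String, String> vertex(String name, String age) {
        Map<String, String> v = new HashMap<>();
        v.put("name", name);
        v.put("age", age);
        return v;
    }

    /** Builds a three-vertex backend with an index on "name" only. */
    public static ToyBackend sample() {
        ToyBackend g = new ToyBackend();
        g.createIndex("name");
        g.addVertex(vertex("marko", "29"));
        g.addVertex(vertex("vadas", "27"));
        g.addVertex(vertex("josh", "32"));
        return g;
    }

    public static void main(String[] args) {
        ToyBackend g = sample();
        System.out.println("has(name): " + g.has("name", "marko").size()
                + " match, scanned so far: " + g.elementsScanned);
        System.out.println("filter(name): " + g.filter(v -> "marko".equals(v.get("name"))).size()
                + " match, scanned so far: " + g.elementsScanned);
    }
}
```

In this sketch both calls return the same vertices; only the amount of work differs, and only when the backend has an index on the key being queried. Whether a real backend applies such an optimization to a given has() call depends on that backend, which is why the Pipes source alone does not show it.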

