Finding a specific edge between two vertices: is there a more efficient way?


Denis Papathanasiou

Oct 15, 2015, 7:52:22 PM
to Gremlin-users

I'm using Titan v. 0.5.4 with gremlin-scala and I have this function
for finding a specific edge between two vertices (both of which are
confirmed to exist before this call):


def getEdgeBetween(parent: Vertex, child: Vertex, edgeProperty: String): Edge =
  parent.outE(edgeProperty).inV.filter{ v: Vertex => child == v }.inE.next


It works, but for large data sets, it can be slow, despite the fact that
I have created edge labels for the specific edge properties I search for.

Is there a way to improve its performance?

Daniel Kuppitz

Oct 15, 2015, 9:24:28 PM
to gremli...@googlegroups.com
Use Titan's query API:

parent.query().direction(OUT).labels(edgeProperty).adjacent(child).edges()

Cheers,
Daniel



Denis Papathanasiou

Oct 18, 2015, 1:06:39 PM
to Gremlin-users
Thank you for the suggestion.

The adjacent() method only exists within TitanVertex objects, and I want
to keep this within the generic blueprints framework, so I resorted to
doing it this way instead:

  child.query().direction(IN).labels(edgeProperty).vertices.asScala
    .filter{ v: Vertex => parent == v }.iterator.next
    .getEdges(OUT, edgeProperty).iterator.next

Even doing it this way registered a huge gain in performance, though,
so thank you again for the tip.
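
A note on the snippet above: as far as I can tell, its last step, getEdges(OUT, edgeProperty).iterator.next, returns the first outgoing edge of parent with that label, which is the edge to child only when parent has a single such edge. A minimal Blueprints-only sketch that checks each edge's in-vertex instead might look like the following. The name findEdgeBetween and the Option return type are illustrative assumptions, not something from this thread, and its performance on Titan has not been measured here:

```scala
import com.tinkerpop.blueprints.{Direction, Edge, Vertex}
import scala.collection.JavaConverters._

// Hypothetical helper (not from the thread): uses only the generic
// Blueprints 2.x VertexQuery API, so it does not depend on TitanVertex.
def findEdgeBetween(parent: Vertex, child: Vertex, label: String): Option[Edge] =
  parent.query()
    .direction(Direction.OUT)   // only edges leaving the parent
    .labels(label)              // only edges with the requested label
    .edges().asScala            // java.lang.Iterable[Edge] as a Scala Iterable
    .find(_.getVertex(Direction.IN) == child) // first edge that ends at the child
```

Returning an Option avoids the exception that .next throws when no matching edge exists; callers can use .get if they are sure the edge is there.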

Daniel Kuppitz

Oct 18, 2015, 2:42:17 PM
to gremli...@googlegroups.com
Aha. Kinda pointless. You just added more code and decreased the readability. For what? You were asking for ways to improve the performance; that's for sure not the right way.

Are you an application developer or a framework developer? If the latter, then I understand why you only want to use Blueprints features. But if you are an application developer (which, I think, is more likely), why did you decide to use Titan if you don't want to use its specific features which make it really fast?

Also, just to mention it, you wouldn't have to deal with this problem in TP3.

Cheers,
Daniel


Denis Papathanasiou

Oct 18, 2015, 6:36:18 PM
to Gremlin-users
I agree with you about the readability, but this version *is* very much
improved in terms of performance.

WRT why not be more Titan-specific, the thinking here is that if we ever
switch the underlying graph db to, say, Neo4j, then we wouldn't have to
make any code changes.

Having said that, though, there aren't many places that reference Vertex
and Edge objects now, so it would be worth examining the impact of moving
towards a Titan-specific implementation.

BTW, what did you mean when you said that none of this is an issue with
TinkerPop 3?

Daniel Kuppitz

Oct 18, 2015, 8:30:23 PM
to gremli...@googlegroups.com
The TP3 query to solve the problem is:

g.V(parent).outE("label").filter(inV().is(child))

Titan's query optimizer is able to optimize the query under the hood (it basically does what I suggested in my first answer). That means the query would be the same for any TinkerPop3 enabled GraphDB, just pure Gremlin, but Titan would perform much better than most (all?) others.

Cheers,
Daniel
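
For readers calling this from Scala, the TP3 traversal above could be wrapped roughly as follows. This is only a sketch, assuming the Apache TinkerPop 3 Java API is on the classpath; the function name edgeBetween is hypothetical, and g is assumed to be a GraphTraversalSource for the graph that holds both vertices:

```scala
import java.util.Optional
import org.apache.tinkerpop.gremlin.process.traversal.dsl.graph.{GraphTraversalSource, __}
import org.apache.tinkerpop.gremlin.structure.{Edge, Vertex}

// Hypothetical wrapper around the traversal quoted above.
def edgeBetween(g: GraphTraversalSource, parent: Vertex, child: Vertex, label: String): Optional[Edge] =
  g.V(parent)                    // start at the parent vertex
    .outE(label)                 // follow only outgoing edges with this label
    .filter(__.inV().is(child))  // keep edges whose in-vertex is the child
    .tryNext()                   // empty Optional if there is no such edge
```

Because the traversal is plain Gremlin, the same function should work against any TinkerPop 3 enabled graph; how well it is optimized is up to the provider.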

