Absence of a path / failure to traverse (3.0)

93 views
Skip to first unread message

Matt Frantz

unread,
Feb 3, 2015, 3:11:28 PM2/3/15
to gremli...@googlegroups.com
I have a scenario where I want to perform a sequence of traversals, each of which begins where the previous leaves off, but any of which can fail to find a path.

A =?=> B =?=> C

From "A", I may be able to reach "B", and from there, to reach "C".

If I get all the way to "C", I want the value of the traversal at that point.

If, however, I fail to reach "B" or "C", then I want to do something else.  For simplicity, let's say I can use the value of "B" if we fail to find "C", and the value of "A" if we fail to find "B".

As a SQL reference, this is a bit like the COALESCE logic of some dialects.  It also resembles an LEFT OUTER JOIN, in that I want to do something with "A" even if I can't find "B" or "C".

I thought about using choose, in the following way, where the possibly complex logic of finding "B" and "C" is simplified to an 'out().has()' traversal:

g.V().as('a').choose(
  __.out().has('b').hasNext(),
  __.out().has('b').as('b').choose(
    __.out().has('c').hasNext(),
    __.out().has('c').as('c'),
    __)
  __).select()

I feel like this may redundantly execute traversals, or else rely on optimization to avoid redundant execution.

Is there a better way?

Daniel Kuppitz

unread,
Feb 3, 2015, 4:03:09 PM2/3/15
to gremli...@googlegroups.com
Hi Matt,

the best solution that comes to my mind is to store level in a separate collection and then ultimately look for the last non-empty collection.

Sample using the modern graph.

gremlin> g.V(1).store("a").out("created").store("b").in("created").store("c").out("livesIn").store("d").cap("a","b","c","d")
==>[a:[v[1]], b:[v[3]], c:[v[1], v[4], v[6]], d:[]]

Your traversal:

g.V().as('a').out().has('b').store('b').out().has('c').store('c').cap('a','b','c')

Cheers,
Daniel


--
You received this message because you are subscribed to the Google Groups "Gremlin-users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to gremlin-user...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/gremlin-users/9fb086bd-a930-492d-8683-9704398c8905%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Matt Frantz

unread,
Feb 3, 2015, 5:07:56 PM2/3/15
to gremli...@googlegroups.com
Daniel,
Thank you for the suggestion.  My concern with using "store" is that it has global scope, and I hope to be able to scale this technique for OLAP.  (I initially hoped I could use "sack" because it would have local scope.)

Extending your example, here's how we could maintain the state for each partial sequence that we find.

g.V().as("a").store("a")
  .out("created").as("b").select("a", "b").store("ab").back("b")
  .in("created").as("c").select("a", "b", "c").store("abc").back("c")
  .out("livesIn").as("d").select("a", "b", "c", "d").store("abcd")
  .cap("a", "ab", "abc", "abcd")

This produces a map with four keys, each representing the partial path length.  We would then be left to interpret the results from longest to shortest, pruning out any shorter paths that eventually were lengthened.

This appears to be functional, but from a performance standpoint, I'm not sure it will scale.  Particularly, the "cap" step is an unnecessary barrier.  What we'd like to do is act on completed paths immediately, and when we understand that a path cannot be completed, act on the partial path.

Thanks for sharing your thoughts.

Daniel Kuppitz

unread,
Feb 3, 2015, 6:45:02 PM2/3/15
to gremli...@googlegroups.com
Yea, the last one can become super memory intensive very quickly. In this case better take your initial approach. Note that the choose-condition only checks whether the sub-traversal has a next element or not, which should always be pretty fast.


Cheers,
Daniel


Matt Frantz

unread,
Feb 3, 2015, 9:31:48 PM2/3/15
to gremli...@googlegroups.com
I'm refocusing on the fact that I need to express a traversal tree (not to be confused with the "tree" step, which materializes a tree data structure), with decisions at various points (a, b, c).

Consider a hypothetical step called "first" in the branch family.  It is used like so:

g.V().
  first(out('a'), out('not a')).
  first(out('b'), out('not b')).
  first(out('c'), out('not c'))

The "first" step would two traversals.  If the first traversal produces anything, then it is passed along.  If not, then the second traversal executes.

For syntactic sugar, "first" could accept more than two traversals.

If you don't want to pass along the output of the second traversal, then I assume you could just terminate with "iterate()", correct?

This is analogous to the shortcut logic in some programming languages, e.g. JavaScript and Python, where you can assign a value based on the first truthy value.

JavaScript: var foo = bar || defaultFoo;
Python: foo = bar or default_foo

It also mirrors the COALESCE feature of some SQL implementations.

Daniel Kuppitz

unread,
Feb 4, 2015, 11:50:41 AM2/4/15
to gremli...@googlegroups.com

Daniel Kuppitz

unread,
Feb 5, 2015, 5:27:10 PM2/5/15
to gremli...@googlegroups.com
There you go: Coalesce Step

Cheers.
Daniel

Matt Frantz

unread,
Feb 5, 2015, 7:03:15 PM2/5/15
to gremli...@googlegroups.com
Great!

You received this message because you are subscribed to a topic in the Google Groups "Gremlin-users" group.
To unsubscribe from this topic, visit https://groups.google.com/d/topic/gremlin-users/EFusukzQfvY/unsubscribe.
To unsubscribe from this group and all its topics, send an email to gremlin-user...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/gremlin-users/CA%2Bf9seV32kD7WpagPdJ5Pi0zsJwVHJ-1sSjQPntSj5-pUNuVrQ%40mail.gmail.com.
Reply all
Reply to author
Forward
0 new messages