Graph traversal and subgraphs

192 views
Skip to first unread message

Roman

unread,
Jul 14, 2016, 3:16:53 AM7/14/16
to ArangoDB
Hi, I have graph (see attachment for example) with edges which have property type. I want to traverse graph, but use only specific edges (for example type == "cdp"). I tried following query:

for d in vDevice filter d.hostname == "lab54unl85AC172"
FOR v,e,p IN 1..10000 any d GRAPH 'linkGraph' OPTIONS { 'uniqueVertices': 'global', 'uniqueEdges': 'global'} 
filter e.type == "cdp"
return {v:v.hostname, e:e.type, pe: p.edges[*].type, pv:p.vertices[*].hostname}

or with filter on path

filter p.edges[-1].type == "cdp"

But with no correct results. It seems that filters are only applied on results and traverse itself is not affected by these filters. Same situation applies if I tried to filter on vertices.

Is there a way how to run traverse on subgraph (define subset of edges and vertices and then run traverse on them).

Thanks

Roman




graph_example.png

Wilfried Gösgens

unread,
Jul 14, 2016, 4:23:15 AM7/14/16
to ArangoDB
Hi Roman,
FILTERs are currently only pushed into the traversal engine when you filter simple conditions on the path; Filtering on v & e are not intended to abort traversals.
With ArangoDB 3.0 we introduced ANY and ALL in that context:

  FILTER p.edges[*].type ALL == "cdp"

which will do what you want. However, performance wise it will gain the ability to abort traversals in future ArangoDB releases.

In general you can use db._explain() to inspect the generated plan; If the FOR statement has a copy of the FILTER statements (or parts of it) they're executed inside of the traversal instead of on the result set of the traversal.


Cheers,
Willi

Roman

unread,
Jul 14, 2016, 5:09:05 AM7/14/16
to ArangoDB
Hi Wilfried, thank you for your help. But I still can't get result I want. When I run following query without filters:

for d in vDevice filter d.hostname == "lab54unl85AC172"
FOR v,e,p IN 1..10000 any d GRAPH 'linkGraph' OPTIONS { 'uniqueVertices': 'global', 'uniqueEdges': 'global'} 
return {v:v.hostname, e:e.type, pe: p.edges[*].type, pv:p.vertices[*].hostname}

[
  {
    "v": "lab54unl85AC174",
    "e": "cdp",
    "pe": [
      "cdp"
    ],
    "pv": [
      "lab54unl85AC172",
      "lab54unl85AC174"
    ]
  },
  {
    "v": "lab54unl85EXR13",
    "e": "arp",
    "pe": [
      "cdp",
      "arp"
    ],
    "pv": [
      "lab54unl85AC172",
      "lab54unl85AC174",
      "lab54unl85EXR13"
    ]
  },
  {
    "v": "lab54unl85AC173",
    "e": "arp",
    "pe": [
      "cdp",
      "arp",
      "arp"
    ],
    "pv": [
      "lab54unl85AC172",
      "lab54unl85AC174",
      "lab54unl85EXR13",
      "lab54unl85AC173"
    ]
  },
  {
    "v": "lab54unl85AC171",
    "e": "cdp",
    "pe": [
      "cdp",
      "arp",
      "arp",
      "cdp"
    ],
    "pv": [
      "lab54unl85AC172",
      "lab54unl85AC174",
      "lab54unl85EXR13",
      "lab54unl85AC173",
      "lab54unl85AC171"
    ]
  },
  {
    "v": "lab54unl85EXR14",
    "e": "cef",
    "pe": [
      "cdp",
      "arp",
      "cef"
    ],
    "pv": [
      "lab54unl85AC172",
      "lab54unl85AC174",
      "lab54unl85EXR13",
      "lab54unl85EXR14"
    ]
  },
  {
    "v": "lab54unl85H179",
    "e": "cef",
    "pe": [
      "cdp",
      "arp",
      "cef"
    ],
    "pv": [
      "lab54unl85AC172",
      "lab54unl85AC174",
      "lab54unl85EXR13",
      "lab54unl85H179"
    ]
  },
  {
    "v": "lab54unl85AC175",
    "e": "arp",
    "pe": [
      "cdp",
      "arp",
      "arp"
    ],
    "pv": [
      "lab54unl85AC172",
      "lab54unl85AC174",
      "lab54unl85EXR13",
      "lab54unl85AC175"
    ]
  }
]

I can see that traverse visited all vertices using some edges (with different types). Now I apply filter you suggest:

for d in vDevice filter d.hostname == "lab54unl85AC172"
FOR v,e,p IN 1..10000 any d GRAPH 'linkGraph' OPTIONS { 'uniqueVertices': 'global', 'uniqueEdges': 'global'} 
FILTER p.edges[*].type ALL == "cdp" 
return {v:v.hostname, e:e.type, pe: p.edges[*].type, pv:p.vertices[*].hostname}

[
  {
    "v": "lab54unl85AC174",
    "e": "cdp",
    "pe": [
      "cdp"
    ],
    "pv": [
      "lab54unl85AC172",
      "lab54unl85AC174"
    ]
  }
]

So the filter took results a filtered out all path where all types are not cdp. But this is not correct result. If you look on my graph example picture, all vertices (except one lab54unl85H179) are reachable via edges with type cdp. If I tried to change uniqueness  from global to path OPTIONS { 'uniqueVertices': 'path', 'uniqueEdges': 'path'} then I got correct result but with duplicit vertices.

If I run explain:

Execution plan:
 Id   NodeType                  Est.   Comment
  1   SingletonNode                1   * ROOT
  2   EnumerateCollectionNode   1487     - FOR d IN vDevice   /* full collection scan */
  3   CalculationNode           1487       - LET #6 = (d.`hostname` == "lab54unl85AC172")   /* simple expression */   /* collections used: d : vDevice */
  4   FilterNode                1487       - FILTER #6
  5   TraversalNode                1       - FOR v  /* vertex */, p  /* paths */ IN 1..10000  /* min..maxPathDepth */ ANY d /* startnode */  GRAPH 'linkGraph'
  6   CalculationNode              1       - LET #8 = (p.`edges`[*].`type` all == "cdp")   /* simple expression */
  7   FilterNode                   1       - FILTER #8
  8   CalculationNode              1       - LET #10 = v.`hostname`   /* attribute expression */
  9   ReturnNode                   1       - RETURN #10

What you mean by "If the FOR statement has a copy of the FILTER statements"? 

Roman

unread,
Jul 14, 2016, 5:15:33 AM7/14/16
to ArangoDB
If I looked to documentation


there should be something like this 

Traversals on graphs:
 Id   Depth   Vertex collections   Edge collections   Filter conditions
  2   1..3    circles              edges              `Path`.`edges`[0] -> v.`label` == "right_foo"

But in my explain it's not:

Traversals on graphs:
 Id   Depth      Vertex collections   Edge collections   Filter conditions
  5   1..10000   vDevice              eLink              

Wilfried Gösgens

unread,
Jul 14, 2016, 5:39:46 AM7/14/16
to ArangoDB
Hi,
Can you gist or pastebin a dump of your example graph so that I can have a closer look at it myselves?

Cheers,
Willi

Roman

unread,
Jul 14, 2016, 7:17:29 AM7/14/16
to ArangoDB
Sure, I will give you dump of database, I just don't want to put link here publicly, where can I send it to you?

But I think problem is with filter propagation into traversal. When I use simple filter:

filter p.edges[1].type == 'cdp'

then I can see in explain that it's send to traversal

Traversals on graphs:
 Id   Depth      Vertex collections   Edge collections   Filter conditions
  5   1..10000   vDevice              eLink              `Path`.`edges`[1] -> v.`type` == "cdp"

But when I use your suggested filter:

FILTER p.edges[*].type ALL == "cdp" 

Traversals on graphs:
 Id   Depth      Vertex collections   Edge collections   Filter conditions
  5   1..10000   vDevice              eLink              

Filter conditions is empty.

Wilfried Gösgens

unread,
Jul 14, 2016, 8:50:25 AM7/14/16
to ArangoDB
Hi,
On Thursday, July 14, 2016 at 1:17:29 PM UTC+2, Roman wrote:
Sure, I will give you dump of database, I just don't want to put link here publicly, where can I send it to you?


You can send it to hackers at arangodb dot com ;-)
 

But I think problem is with filter propagation into traversal. When I use simple filter:

filter p.edges[1].type == 'cdp'

then I can see in explain that it's send to traversal

Traversals on graphs:
 Id   Depth      Vertex collections   Edge collections   Filter conditions
  5   1..10000   vDevice              eLink              `Path`.`edges`[1] -> v.`type` == "cdp"

But when I use your suggested filter:

FILTER p.edges[*].type ALL == "cdp" 

Traversals on graphs:
 Id   Depth      Vertex collections   Edge collections   Filter conditions
  5   1..10000   vDevice              eLink              

Filter conditions is empty.

 Yes, this is not yet possible as I was trying to point out with 'which will do what you want. However, performance wise it will gain the ability to abort traversals in future ArangoDB releases.'

Roman

unread,
Jul 14, 2016, 9:06:43 AM7/14/16
to ArangoDB
I send you dump. Hmm my bad english :), when I read Arango 3.0 I thought it's already there. Any idea when you will add it? It's quite important feature for us. Currently there's no other way how to do it?

Wilfried Gösgens

unread,
Jul 14, 2016, 10:37:00 AM7/14/16
to ArangoDB
Hi,
no problem - yes, got it.
Its currently being worked on. We hope to have it available as part of the next release - but you never know.

Currently the traverser can only stop on direct path FILTER expressions.

Not a native speaker as well, no problem ;-)

Cheers,
Willi

Roman

unread,
Sep 21, 2016, 10:29:53 AM9/21/16
to ArangoDB
Hi Willi, today I tried 3.1alpha2 and it seems that you didn't add ability to traverse on subgraphs which would allow what we discussed before?

Thanks

Roman
Reply all
Reply to author
Forward
0 new messages