Filtering on node properties within path - Cypher


Rob Goldsmith

Jul 23, 2012, 11:20:03 AM
to ne...@googlegroups.com
Hi

I have a Cypher query which looks like this:

START n=node(some_node_id) 
MATCH p = n-[*1..10]-m
RETURN p

The db consists primarily of user nodes and company nodes with one main relationship type, memberOf.
Most users are members of 5 or fewer companies, but some users are members of hundreds of companies. I'd like to find a way of excluding these users from the search results, i.e. have a WHERE clause that filters the intermediate nodes based on the number of relationships those nodes have. Is this possible with Cypher? If not, is there any other way I could attempt it?

As a work-around, I could change the relationship type to multiple_memberOf or similar for these users and then filter the Cypher query on the relationship type, but this wouldn't be ideal.



Andres Taylor

Jul 26, 2012, 10:00:27 AM
to ne...@googlegroups.com
Hi Rob,

What do n and m mean? Are you really looking for all paths that start from n and go at most 10 steps out like that? In most graphs, a query like that will never finish running, or will kill the JVM. With no direction and no relationship type declared, all relationships will be followed. Is that what you want?

Andrés
--
The best way to ask for Cypher help: http://console.neo4j.org/usage.html 

Michael Hunger

Jul 26, 2012, 5:00:57 PM
to ne...@googlegroups.com
Rob,

With Cypher you could do a two-step process.

In a first (long-running) query you'd mark the users that have more than x outgoing relationships:

start n=node:user("name:*")
match n-[:memberOf]-company
with n, count(*) as cnt
where cnt > 10
set n.poweruser = true
return n

And in your second query (the real one) you can filter those out:

start n=node:user("name:*")
where not (n.poweruser)
with n
match n-[:memberOf]-company
... do your business logic ...
return n, company

You can of course combine the two queries, but then it would probably be slower every time.
Another idea is to add the poweruser flag to the index as a key:value pair and use a combined Lucene index query to filter those out.

Otherwise, with a Java traversal you could use a relationship expander that checks the relationship counts and doesn't follow those big nodes.
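
To illustrate the expander idea outside Neo4j, here is a minimal Python sketch. It is not the Neo4j traversal API: it runs a breadth-first search over a plain adjacency dict and refuses to expand through any intermediate node whose relationship count exceeds a threshold. The graph layout, the function name shortest_path_avoiding_hubs, and the max_degree and max_depth parameters are all assumptions made for this example.

```python
from collections import deque


def shortest_path_avoiding_hubs(graph, start, end, max_degree=10, max_depth=6):
    """Return one shortest path from start to end as a list of nodes, or None.

    graph is a hypothetical in-memory stand-in for the database:
    a dict mapping each node to the set of its neighbours (undirected).
    Intermediate nodes with more than max_degree relationships are skipped;
    the two endpoints are always allowed.
    """
    if start == end:
        return [start]
    queue = deque([[start]])
    visited = {start}
    while queue:
        path = queue.popleft()
        if len(path) - 1 >= max_depth:
            continue  # this path already uses max_depth relationships
        for neighbour in graph.get(path[-1], ()):
            if neighbour in visited:
                continue
            if neighbour == end:
                return path + [neighbour]
            if len(graph.get(neighbour, ())) > max_degree:
                continue  # "big" node: do not follow it
            visited.add(neighbour)
            queue.append(path + [neighbour])
    return None


# Toy data (invented): "hub" is a member of every company.
graph = {
    "u1": {"c1"},
    "u2": {"c1", "c2"},
    "u3": {"c3"},
    "hub": {"c1", "c2", "c3", "c4"},
    "c1": {"u1", "u2", "hub"},
    "c2": {"u2", "hub"},
    "c3": {"u3", "hub"},
    "c4": {"hub"},
}

via_company = shortest_path_avoiding_hubs(graph, "u1", "u2", max_degree=3)
only_via_hub = shortest_path_avoiding_hubs(graph, "u1", "u3", max_degree=3)
```

With this toy data, u1 and u2 are still connected through c1, while u1 and u3 are only connected through the hub and so come back as not connected. Note that the threshold applies to every intermediate node, companies included, so it would need to be chosen with the largest legitimate company in mind.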

HTH

Michael

Rob Goldsmith

Jul 27, 2012, 2:47:31 AM
to ne...@googlegroups.com
Thanks Michael and Andres.

Andres - the idea of the query is just to see whether two nodes (user/user, user/company, company/company) are connected to each other. I was originally using the allShortestPaths algorithm, but kept getting paths that included these "superusers". So I was looking for a way to exclude these superusers from the Cypher search. You're right that 10 steps is too far, but even if I limited the depth to, say, 6 steps, I'd still like to exclude the superusers.

Michael - thanks for the ideas. I like the idea of setting properties on the powerusers. But can I then filter the paths to exclude these users? In my example, n and m are known nodes, I am looking for a path between them - from which I would like to exclude any powerusers e.g.

path = start_node-[*1..6]-end_node WHERE (not node.poweruser for node in path)

Rob Goldsmith

Jul 28, 2012, 2:06:34 AM
to ne...@googlegroups.com
Agh... I'm really sorry guys, I've just read my original post and I missed a pretty vital part of the Cypher query, which should have read:

START n=node(some_node_id), m=node(some_other_node_id)

as both n and m are known at the beginning. 

Apologies for wasting your time on this one.

Michael Hunger

Jul 28, 2012, 8:47:24 AM
to ne...@googlegroups.com
Where none(n in nodes(path) : n.degree > 100)

Sent from mobile device
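
The none(...) predicate above keeps a path only if no node on it matches the condition. As a rough illustration of that filter-after-matching idea, here is a small Python sketch over an in-memory graph, not Cypher itself. The graph, the powerusers set, and the helper names simple_paths and none_flagged are hypothetical, and two simplifications are assumed: paths never repeat a node, and only intermediate nodes are checked, so the known start and end nodes are always allowed.

```python
def simple_paths(graph, start, end, max_hops):
    """Yield every path from start to end that repeats no node and
    uses between 1 and max_hops relationships."""
    stack = [[start]]
    while stack:
        path = stack.pop()
        if len(path) - 1 >= max_hops:
            continue  # cannot extend this path any further
        for neighbour in sorted(graph.get(path[-1], ())):
            if neighbour in path:
                continue
            if neighbour == end:
                yield path + [neighbour]
            else:
                stack.append(path + [neighbour])


def none_flagged(path, flagged):
    """True when no intermediate node of path is in the flagged set."""
    return not any(node in flagged for node in path[1:-1])


# Toy data (invented): "hub" stands in for a user marked as a poweruser.
graph = {
    "u1": {"c1"},
    "u2": {"c1", "c2"},
    "u3": {"c3"},
    "hub": {"c1", "c2", "c3", "c4"},
    "c1": {"u1", "u2", "hub"},
    "c2": {"u2", "hub"},
    "c3": {"u3", "hub"},
    "c4": {"hub"},
}
powerusers = {"hub"}

all_paths = sorted(simple_paths(graph, "u1", "u2", 6))
kept = [p for p in all_paths if none_flagged(p, powerusers)]
```

Here all_paths contains both the direct route through c1 and the detour through hub, and kept retains only the former. Unlike the expander approach, this enumerates every candidate path before filtering, so it does nothing to reduce the cost of walking through the big nodes.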