Cypher: How to get top 1 record for each X?

Paul Lam

Nov 14, 2012, 10:10:29 AM
to ne...@googlegroups.com
This came up while looking at using Cypher as a recommender. If person 1 likes A and person 2 likes both A and B, then you would recommend B to person 1. I can do the first part in Cypher, listing the items most often liked together, but how do I use that resulting table to get the list of user names I should recommend each item to?

Using this graph as an example -- http://console.neo4j.org/r/2096v6

This Cypher query returns the products most often liked together.

START   food = node:node_auto_index(type = "food"), user = node:node_auto_index(type = "user")
MATCH   food<-[:IS_A]-x<-[:LIKE]-person-[:IS_A]->user
WITH    person, x
MATCH   person-[:LIKE]->y-[:IS_A]->cc
WHERE   NOT x = y
WITH    x, y, COUNT(*) AS cnt
RETURN  x.name, y.name, cnt
ORDER BY cnt DESC
LIMIT 10; 

How do I use WITH to return, for each X, only the top Y? Then I could do:
MATCH person-[:LIKE]->x
WHERE NOT(person-[:LIKE]->y)
RETURN x, y, person.name

One expected result using the linked example is:

X, Y, names
apple, orange, Bob

etc. for each X and its corresponding top Y.

Peter Neubauer

Nov 15, 2012, 1:51:58 PM
to Neo4j User

Hi Paul,
you can fold a lot of this into comma-separated MATCH paths and multiple WHERE filters. This is what I have now:

START   food = node:node_auto_index(type = "food"), user = node:node_auto_index(type = "user") 
MATCH   food<-[:IS_A]-x<-[:LIKE]-person-[:IS_A]->user, person-[:LIKE]->y-[:IS_A]->cc, person-[:LIKE]->x 
WHERE   person-[:LIKE]->y-[:IS_A]->cc AND NOT x = y AND NOT(person-[:LIKE]->y) 
RETURN  x.name, y.name, person.name, count(*) as cnt 
ORDER BY cnt DESC 
LIMIT 10;



Not sure that is what you are expecting?

/peter

Paul Lam

Nov 16, 2012, 5:40:16 AM
to ne...@googlegroups.com
Thanks for cleaning that up. What I need is the next step, though. Running this on the web console:

START   food = node:node_auto_index(type = "food"), user = node:node_auto_index(type = "user") 
MATCH   food<-[:IS_A]-x<-[:LIKE]-person-[:IS_A]->user,
       person-[:LIKE]->y-[:IS_A]->food
WHERE   NOT x = y
RETURN  x.name, y.name, count(*) as cnt 
ORDER BY cnt DESC 
LIMIT 10;

This shows that people who like apple most often also like orange as a second item. I want to get all the people who like apple but not orange, which I can do like this:

START   apple = node:node_auto_index(name = "apple"), orange = node:node_auto_index(name = "orange")
MATCH   person-[:LIKE]->apple
WHERE   NOT(person-[:LIKE]->orange)
RETURN  person.name

My question, though, is how to derive a generalised version of the query above automatically from the first query, so that for each food X it returns the best Y (e.g. for apple it's orange only, for bread it's apple, and so on) along with the list of people who like X but do not yet like Y.
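
One possible way to chain the two steps, offered as an untested sketch rather than a confirmed answer: sort the (X, Y) pairs by count, then keep the first Y per X with head(collect(y)). It assumes a Cypher version in which WITH accepts ORDER BY and in which head() and collect() are available, and it relies on collect() preserving the sorted order, which is common behaviour but not something this thread confirms. The identifiers top_y and other are made up for illustration.

```cypher
// Sketch only: assumes WITH ... ORDER BY, head() and collect() are supported,
// and that collect() keeps the rows in the order produced by ORDER BY.
START   food = node:node_auto_index(type = "food"), user = node:node_auto_index(type = "user")
MATCH   food<-[:IS_A]-x<-[:LIKE]-person-[:IS_A]->user,
        person-[:LIKE]->y-[:IS_A]->food
WHERE   NOT x = y
// Count how often each (x, y) pair is liked together, highest count first.
WITH    x, y, count(*) AS cnt
ORDER BY cnt DESC
// Group by x; the first collected y is the one with the highest count.
WITH    x, head(collect(y)) AS top_y
// "other" (hypothetical name): anyone who likes x but not yet its top y.
MATCH   other-[:LIKE]->x
WHERE   NOT(other-[:LIKE]->top_y)
RETURN  x.name, top_y.name, collect(other.name) AS names;
```

If two Y items tie for the highest count, head() keeps whichever happens to sort first, and any X for which everyone already likes the top Y drops out of the result.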