Cypher: How to get top 1 record for each X?

Paul Lam

Nov 14, 2012, 10:10:29 AM

to ne...@googlegroups.com

This came about from looking at using Cypher as a recommender. If person 1 likes A, and person 2 likes A and B, then you would recommend B to person 1. I can do the first part in Cypher, listing the items most often liked together, but how do I use that resulting table to get the list of users I should make each recommendation to?
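To make the recommendation rule concrete outside of Cypher, here is a minimal Python sketch. The `likes` dictionary and the `recommend` function are hypothetical names made up for illustration; the data mirrors only the two-person example in the sentence above, not the linked console graph.

```python
# Hypothetical data: who likes what, taken from the two-person example above.
likes = {
    "person 1": {"A"},
    "person 2": {"A", "B"},
}


def recommend(likes, target):
    """Return items liked by anyone who shares a like with `target`,
    excluding items `target` already likes."""
    own = likes[target]
    recs = set()
    for other, items in likes.items():
        # Only consider other people with at least one like in common.
        if other != target and own & items:
            recs |= items - own
    return recs


print(recommend(likes, "person 1"))  # {'B'}
```

Person 2 gets nothing back here, because person 1 has no likes that person 2 lacks.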

Using this graph as an example -- http://console.neo4j.org/r/2096v6

This Cypher query returns the pairs of products most often liked together.

START food = node:node_auto_index(type = "food"), user = node:node_auto_index(type = "user")

MATCH food<-[:IS_A]-x<-[:LIKE]-person-[:IS_A]->user

WITH person, x

MATCH person-[:LIKE]->y-[:IS_A]->cc

WHERE NOT x = y

WITH x, y, COUNT(*) AS cnt

RETURN x.name, y.name, cnt

ORDER BY cnt DESC

LIMIT 10;

How do I use WITH to return, for each X, only the top Y? Then I could do:

MATCH person-[:LIKE]->x

WHERE NOT(person-[:LIKE]->y)

RETURN x, y, person.name

One expected result using the linked example is:

X, Y, names

apple, orange, Bob

and so on for each X and its corresponding top Y.

Peter Neubauer

Nov 15, 2012, 1:51:58 PM

to Neo4j User

Hi Paul,

You can fold a lot of this into comma-separated MATCH paths and multiple WHERE filters. This is what I have now:

START food = node:node_auto_index(type = "food"), user = node:node_auto_index(type = "user")

MATCH food<-[:IS_A]-x<-[:LIKE]-person-[:IS_A]->user, person-[:LIKE]->y-[:IS_A]->cc, person-[:LIKE]->x

WHERE person-[:LIKE]->y-[:IS_A]->cc AND NOT x = y AND NOT(person-[:LIKE]->y)

RETURN x.name, y.name, person.name, count(*) as cnt

ORDER BY cnt DESC

LIMIT 10;

http://console.neo4j.org/r/fakfci

Not sure that is what you are expecting?

/peter

Cheers,

/peter neubauer

G: neubauer.peter
S: peter.neubauer
P: +46 704 106975
L: http://www.linkedin.com/in/neubauer
T: @peterneubauer

Neo4j 1.8 GA - http://www.dzone.com/links/neo4j_18_release_fluent_graph_literacy.html

--

Paul Lam

Nov 16, 2012, 5:40:16 AM

to ne...@googlegroups.com

Thanks for cleaning that up. What I need is the next step, though. Running this on the web console:

START food = node:node_auto_index(type = "food"), user = node:node_auto_index(type = "user")

MATCH food<-[:IS_A]-x<-[:LIKE]-person-[:IS_A]->user,

person-[:LIKE]->y-[:IS_A]->food

WHERE NOT x = y

RETURN x.name, y.name, count(*) as cnt

ORDER BY cnt DESC

LIMIT 10;

We get that people who like apple most often also like orange as a second item. I want to get all the people who like apple but not orange, which I can do like this:

START apple = node:node_auto_index(name = "apple"), orange = node:node_auto_index(name = "orange")

MATCH person-[:LIKE]->apple

WHERE NOT(person-[:LIKE]->orange)

RETURN person.name

My question, though, is how to derive a generalised version of the above automatically from the first query, so that for each food X it returns the single best Y (e.g. for apple it's orange only, for bread it's apple, and so on) together with the list of people who like X but do not yet like Y.
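To pin down the expected result independently of Cypher syntax, here is a rough Python sketch of the same three steps: count co-likes, keep the top Y per X, then list the people who like X but not Y. The data, the names `top_co_liked` and `recommendations`, and the alphabetical tie-break are all assumptions made for illustration; the data is not the graph from the linked console.

```python
from collections import Counter
from itertools import permutations

# Hypothetical data, NOT the graph from the linked console.
likes = {
    "Alice": {"apple", "orange"},
    "Bob": {"apple"},
    "Carol": {"apple", "orange"},
    "Dave": {"apple", "bread"},
}


def top_co_liked(likes):
    """For each item X, return the item Y most often liked together with X."""
    pair_counts = Counter()
    for items in likes.values():
        # One count per ordered pair (x, y), x != y, for each person liking both.
        pair_counts.update(permutations(sorted(items), 2))
    best = {}
    for (x, y), cnt in pair_counts.items():
        # Highest count wins; ties are broken alphabetically on y (an assumption).
        key = (-cnt, y)
        if x not in best or key < best[x][0]:
            best[x] = (key, y)
    return {x: y for x, (_, y) in best.items()}


def recommendations(likes):
    """Return (X, top Y, people who like X but not Y) rows, sorted by X."""
    rows = []
    for x, y in sorted(top_co_liked(likes).items()):
        people = sorted(p for p, items in likes.items()
                        if x in items and y not in items)
        rows.append((x, y, people))
    return rows


for row in recommendations(likes):
    print(row)  # first row: ('apple', 'orange', ['Bob', 'Dave'])
```

The key step is the "top 1 per X" reduction in `top_co_liked`; a Cypher version would need an equivalent per-group reduction between the counting step and the final MATCH.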
