Reduce runtime for building many relationships


Maaz Mohamedy

Jul 13, 2018, 1:35:39 AM
to Neo4j

I have loaded the MovieLens dataset into my graph. It has three node labels: USER, MOVIE and GENRE. I am trying to build a recommendation engine like the one in this tutorial (https://neo4j.com/graphgist/movie-recommendations-with-k-nearest-neighbors-and-cosine-similarity#_initial_data_model). I copied the code from Query 3 in the article and adjusted it for my graph:

MATCH (p1:USER)-[x:Has_rated]->(m:MOVIE)<-[y:Has_rated]-(p2:USER)
WITH SUM(x.rating * y.rating) AS xyDotProduct,
SQRT(REDUCE(xDot = 0.0, a IN COLLECT(x.rating) | xDot + a^2)) AS xLength,
SQRT(REDUCE(yDot = 0.0, b IN COLLECT(y.rating) | yDot + b^2)) AS yLength,
p1, p2
MERGE (p1)-[s:SIMILARITY]-(p2)
SET s.similarity = xyDotProduct / (xLength * yLength)

The code above builds a relationship called SIMILARITY (with a similarity score stored in its similarity property) between every pair of users who have rated at least one movie in common.
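To make clear what the query computes for one pair of users, here is a minimal Python sketch of the same cosine similarity, restricted to the movies both users rated (which is what the MATCH pattern does). The function name and the sample ratings are made up for illustration and are not part of the tutorial.

```python
import math
from typing import Dict


def cosine_similarity(ratings_a: Dict[str, float], ratings_b: Dict[str, float]) -> float:
    """Cosine similarity over the movies that both users have rated."""
    # Only co-rated movies take part, mirroring the shared (m:MOVIE) in the pattern.
    shared = ratings_a.keys() & ratings_b.keys()
    if not shared:
        raise ValueError("the two users have no co-rated movies")

    # Equivalent of SUM(x.rating * y.rating) in the query.
    dot = sum(ratings_a[m] * ratings_b[m] for m in shared)
    # Equivalent of the two SQRT(REDUCE(...)) expressions.
    length_a = math.sqrt(sum(ratings_a[m] ** 2 for m in shared))
    length_b = math.sqrt(sum(ratings_b[m] ** 2 for m in shared))

    # Assumes ratings are non-zero (as in MovieLens), so the lengths are positive.
    return dot / (length_a * length_b)


# Hypothetical ratings, keyed by movie title.
user_1 = {"Toy Story": 3.0, "Heat": 4.0, "Casino": 5.0}
user_2 = {"Toy Story": 4.0, "Heat": 3.0}
score = cosine_similarity(user_1, user_2)
```

Here "Casino" is ignored because only one of the two users rated it, so the score depends on the two shared movies alone.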


However, when I execute the query (I am using the Python driver), the program just keeps running. When I open the Neo4j web interface, I see that the server keeps losing and re-establishing its connection. Eventually I am logged out of my account.


A colleague of mine ran the exact same code on his machine, and the query finished in a few minutes. I also set dbms.memory.heap.max_size=3G in neo4j/conf. What should I do? Why is it taking so long to run?
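One thing that might be worth trying is to stop building every SIMILARITY relationship in a single transaction and instead run the query above once per batch of users. This is only a sketch under assumptions, not a confirmed fix. The `$ids` parameter, the `id(p1) < id(p2)` filter (which visits each pair once instead of twice), the batch size and the `run_query` callable are all hypothetical additions that are not in the tutorial.

```python
from typing import Any, Callable, Dict, Iterator, List, Sequence

# Hypothetical per-batch variant of the original query.
# $ids is an assumed parameter holding the internal node ids of one batch of users.
BATCH_QUERY = """
MATCH (p1:USER)-[x:Has_rated]->(m:MOVIE)<-[y:Has_rated]-(p2:USER)
WHERE id(p1) IN $ids AND id(p1) < id(p2)
WITH SUM(x.rating * y.rating) AS xyDotProduct,
     SQRT(REDUCE(xDot = 0.0, a IN COLLECT(x.rating) | xDot + a^2)) AS xLength,
     SQRT(REDUCE(yDot = 0.0, b IN COLLECT(y.rating) | yDot + b^2)) AS yLength,
     p1, p2
MERGE (p1)-[s:SIMILARITY]-(p2)
SET s.similarity = xyDotProduct / (xLength * yLength)
"""


def batches(ids: Sequence[int], size: int) -> Iterator[List[int]]:
    """Yield consecutive chunks of at most `size` ids."""
    if size <= 0:
        raise ValueError("size must be positive")
    for start in range(0, len(ids), size):
        yield list(ids[start:start + size])


def run_in_batches(
    run_query: Callable[[str, Dict[str, Any]], Any],
    user_ids: Sequence[int],
    size: int = 100,
) -> int:
    """Call run_query(BATCH_QUERY, {"ids": chunk}) per chunk; return the batch count."""
    count = 0
    for chunk in batches(user_ids, size):
        run_query(BATCH_QUERY, {"ids": chunk})
        count += 1
    return count
```

With the Python driver, `run_query` could be a small wrapper around `session.run(query, params)`, and `user_ids` could come from a query such as `MATCH (u:USER) RETURN id(u)`. Whether this helps depends on why the server is struggling, which the symptoms above do not settle.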

Maaz Mohamedy

Jul 13, 2018, 3:48:08 AM
to Neo4j
[Attachment: Screen Shot 2018-07-13 at 12.45.41 AM.png]