Temporary storage in Cypher query.


Kevin Burton

Apr 5, 2013, 10:21:01 AM
to ne...@googlegroups.com
I have a query:

start line=node:SKUIndex("VariantID:168846")
match line<-[:ORDERED]-(lineorder)-[:ORDERED]->(withline)
return withline, count(withline)
order by count(withline) desc
limit 10;


This works fine, but what I would really like to do is divide count(withline) by the number of ORDERED relationships on the start node. Any suggestions on how this would be done? It seems that the 'match' essentially wipes out the knowledge of the starting node.

There are two kinds of nodes: a product node and an order node. The order node has 1 to n ORDERED relationships, one for each product that is ordered. So the idea is to start with a product node and traverse each of the ORDERED relationships, collecting the products that were ordered with the product in the start node. I would like to "normalize" the results by dividing out the number of times that the product itself was ordered (which would be indicated by the count of ORDERED relationships on the start node).

I see no way other than storing the count of the ORDERED relationships in a temporary variable before the match, but I don't know how to do that, or even if it is the right approach. Suggestions?
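To make the goal concrete, here is a minimal Python sketch (not Cypher) of the calculation described above, run on a made-up data set. The `orders` dictionary, the product names, and the function name `normalized_co_orders` are all hypothetical, and the sketch assumes each product appears at most once per order, so the number of orders containing the start product equals its number of ORDERED relationships.

```python
from collections import Counter

# Hypothetical sample data: each order maps to the products it contains.
orders = {
    "order1": ["A", "B", "C"],
    "order2": ["A", "B"],
    "order3": ["A", "C"],
    "order4": ["B", "D"],
}


def normalized_co_orders(orders, start):
    """Return {product: co-order count / number of orders containing start}."""
    # Orders that contain the start product (its ORDERED relationships).
    start_orders = [items for items in orders.values() if start in items]
    if not start_orders:
        return {}
    # Count every other product that appears in those orders.
    co_counts = Counter(
        p for items in start_orders for p in items if p != start
    )
    denom = len(start_orders)  # the "temporary" value computed once
    return {p: n / denom for p, n in co_counts.items()}


ratios = normalized_co_orders(orders, "A")
for product, ratio in sorted(ratios.items(), key=lambda kv: kv[1], reverse=True):
    print(product, round(ratio, 3))
```

Here product A appears in three orders, and B and C each share two of them, so both get a ratio of 2/3.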

Thank you.

Wes Freeman

Apr 5, 2013, 10:37:38 AM
to ne...@googlegroups.com
This might work. I think I got the division order right (or was it supposed to be lineOrderedCount/withlineCount?). This is untested code:


start line=node:SKUIndex("VariantID:168846")
match line<-[:ORDERED]-(lineorder)-[:ORDERED]->(withline)
with withline, count(withline) as withlineCount, length(line<-[:ORDERED]-()) as lineOrderedCount
return withline, withlineCount/lineOrderedCount as ratio
order by ratio desc
limit 10;



Michael Hunger

Apr 5, 2013, 10:38:38 AM
to ne...@googlegroups.com
Could you draw a picture?

MATCH doesn't remove anything; WITH does.

You can do something like length(line<-[:ORDERED]-())

A path expression returns a collection of paths,
and length gives you the size of a collection.

Or you do two separate match/with parts and compute the first number in the first part and the second number in the second part and then divide them in return.

start line ...
match line-[:ORDERED]...
with line, count(*) as cnt1
match line-[:ORDERED]...
with line, cnt1, count(*) as cnt2
return line, cnt1/cnt2



Kevin Burton

Apr 5, 2013, 12:20:43 PM
to ne...@googlegroups.com

I have attached a very crude attempt at a picture. The start node would be the circle on the left. It has 1 to n ORDERED relationships to an order, which is represented as the rectangular box. Each order in turn contains 1 to n ORDERED relationships. The query I started this question with counts the products (circles) on the right and returns each product along with its count. But this could be skewed if the start node has a lot of relationships. Dividing by the number of relationships would "normalize" the count so I could compare the counts. I basically want to store the number of relationships (which is the number of times the product was ordered) and use it as the denominator of the division. I am not sure of the exact syntax of the query.

Kevin Burton

Apr 5, 2013, 12:51:16 PM
to ne...@googlegroups.com
This query

start line=node:SKUIndex("VariantID:168846")
match line<-[:ORDERED]-(lineorder)-[:ORDERED]->(withline)
with withline, count(withline) as withlineCount, length(line<-[:ORDERED]-()) as lineOrderedCount
return withline, (withlineCount*1.0)/(lineOrderedCount*1.0) as ratio
order by ratio desc
limit 10;

takes a very long time to execute. Is it the floating point ratio? The 'with'? When I execute

start line=node:SKUIndex("VariantID:168846")
match line<-[:ORDERED]-(lineorder)-[:ORDERED]->(withline)
return withline, count(withline)
order by count(withline) desc
limit 10;

it returns results in less than a second.

Wes Freeman

Apr 5, 2013, 12:56:35 PM
to ne...@googlegroups.com
It's because it must calculate the number of relationships for each result record.

Try Michael's way:

start line=node:SKUIndex("VariantID:168846")
match line<-[:ORDERED]-()
with line, count(*) as lineOrderedCount
match line<-[:ORDERED]-(lineorder)-[:ORDERED]->(withline)
with withline, count(withline) as withlineCount, lineOrderedCount
return withline, (withlineCount*1.0)/(lineOrderedCount*1.0) as ratio
order by ratio desc
limit 10;
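
The difference between the two queries can be sketched in Python (this is an illustration only, not how Neo4j executes anything). The `OrderCounter` class and the sample `rows` are hypothetical stand-ins: the counter plays the role of the relationship count and records how many times it is evaluated.

```python
class OrderCounter:
    """Hypothetical stand-in for the relationship count; tracks evaluations."""

    def __init__(self, value):
        self.value = value
        self.calls = 0

    def __call__(self):
        self.calls += 1
        return self.value


# Hypothetical result rows: product -> co-order count.
rows = {"B": 2, "C": 2, "D": 1}

# Like the slow query: the denominator is evaluated for every result row.
per_row = OrderCounter(3)
slow = {p: n / per_row() for p, n in rows.items()}

# Like the two-step query: evaluate once, then carry the value forward.
once = OrderCounter(3)
denom = once()
fast = {p: n / denom for p, n in rows.items()}

print(per_row.calls, once.calls)  # prints "3 1"
```

Both versions produce the same ratios; only the number of evaluations differs. Python's `/` already does floating-point division, whereas dividing two integers in Cypher truncates, which is why the queries in this thread multiply by 1.0 first.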

Wes