calculating clustering coefficient with pure Cypher query


nikogamulin

Jun 8, 2012, 8:44:53 AM
to Neo4j
Hi,

I would like to calculate the clustering coefficient of a node with a
pure Cypher query.

The clustering coefficient is defined as the probability that two
randomly selected neighbors of the observed node are connected to each
other.

For example, given the network
A--B
A--C
A--D
A--E
B--F
C--D

the clustering coefficient of A is 1/6, because A has 4 neighbors (B,
C, D, E) and only one pair of them is connected (C--D).

The number of possible pairs among n nodes is n!/(2!(n-2)!), which is 6
for n = 4, and therefore the clustering coefficient is 1/6.
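
To double-check the arithmetic outside the database, here is a minimal Python sketch that computes the coefficient for the example network. The names `EDGES`, `neighbours` and `clustering_coefficient` are made up for this illustration and are not part of Neo4j or Cypher.

```python
from fractions import Fraction
from itertools import combinations

# The example network from this post, as undirected edges.
EDGES = [("A", "B"), ("A", "C"), ("A", "D"), ("A", "E"), ("B", "F"), ("C", "D")]


def neighbours(node, edges):
    """Return the set of nodes directly connected to `node`."""
    result = set()
    for u, v in edges:
        if u == node:
            result.add(v)
        elif v == node:
            result.add(u)
    return result


def clustering_coefficient(node, edges):
    """Connected neighbour pairs divided by all possible neighbour pairs."""
    nbrs = sorted(neighbours(node, edges))
    possible = len(nbrs) * (len(nbrs) - 1) // 2  # n! / (2! (n-2)!)
    if possible == 0:
        return Fraction(0)  # fewer than two neighbours: no pairs to connect
    undirected = {frozenset(edge) for edge in edges}
    connected = sum(
        1 for pair in combinations(nbrs, 2) if frozenset(pair) in undirected
    )
    return Fraction(connected, possible)


print(clustering_coefficient("A", EDGES))  # prints 1/6
```

Each unordered pair of neighbors is counted once here, so the single C--D edge contributes 1 to the numerator.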

To count neighbors, I created the query
start a = node(4) match a<-->b return count(b);

and to count connected neighbors
start a = node(4) match (a)--(n1)--()--(a) return count(n1);

When I tried to combine the matches
start a = node(4) match (a)--(n1)--()--(a), (a)--(b) return count(n1),
(b);
the query returned 4, 4 and I don't know why.

Does anyone know how to calculate the clustering coefficient with a
single query or create a custom cypher function that would be able to
do that?

I would also be grateful if anyone could explain why the above query
returned 4, 4 and not 2, 4.

Michael Hunger

Jun 8, 2012, 9:05:22 AM
to ne...@googlegroups.com
In Cypher, each node identifier that is referred to in the match can only hold disjoint nodes.

So if you match me--friend--friendoffriend,
friendoffriend will never contain me or a friend.

The uniqueness is node-global, so to speak: each node occurs only once in the matched subgraph.
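
To illustrate why the combined match gives 4, 4, here is a small Python simulation of the disjointness rule described above, applied to the example network. It is only a sketch under the assumption that all identifiers in one match must bind to different nodes; it is not how Neo4j executes the query, and `adjacent` and `combined_rows` are hypothetical helper names.

```python
# The example network from the question, as undirected edges.
EDGES = [("A", "B"), ("A", "C"), ("A", "D"), ("A", "E"), ("B", "F"), ("C", "D")]
UNDIRECTED = {frozenset(edge) for edge in EDGES}
NODES = sorted({node for edge in EDGES for node in edge})


def adjacent(u, v):
    """True if u and v are connected, ignoring direction."""
    return frozenset((u, v)) in UNDIRECTED


def combined_rows(a):
    """Enumerate rows for the pattern (a)--(n1)--(x)--(a), (a)--(b).

    Assumption: a, n1, x and b must all be different nodes.
    """
    rows = []
    for n1 in NODES:
        for x in NODES:
            for b in NODES:
                if len({a, n1, x, b}) < 4:
                    continue  # some identifier would reuse a node
                if (adjacent(a, n1) and adjacent(n1, x)
                        and adjacent(x, a) and adjacent(a, b)):
                    rows.append((n1, x, b))
    return rows


rows = combined_rows("A")
for row in rows:
    print(row)
print(len(rows))  # 4
```

The triangle part matches twice (n1 = C and n1 = D), and each of those matches is paired with the two remaining neighbors B and E, which gives 2 x 2 = 4 rows. Both count(n1) and count(b) count rows, so both return 4.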

I think you can achieve your goal with two query segments connected by WITH:

start a = node(4) match (a)--(b)
with a, count(b) as neighbours
match (a)--(n1)--()--(a)
return count(n1) as connected_neighbours, neighbours

see: http://docs.neo4j.org/chunked/snapshot/query-with.html

HTH

Michael