create new relationship in a single cypher query based on some condition

abhi

Oct 31, 2013, 9:54:02 AM

to ne...@googlegroups.com

I have around 3000 relationship types between d -> c, such as PAYS, WirePAYS, WireBTPAYS, Apr_2012PAYS etc.
I want to create relationships such as PAYS_I, WirePAYS_I etc. wherever d and c share a common parent, i.e. pn1 and pn2 are the same node as per the MATCH and WHERE clauses.

If I run one query per relationship type, like the one below, each run takes a long time. Is there another way that avoids the timeout error?

START d = node:node_auto_index('companyName:*')
MATCH pn1<-[:SUB_OF]-d-[rel:WirePAYS]->c-[:SUB_OF]->pn2
WHERE pn1 = pn2
CREATE d-[rel:WirePAYS_I { totalCount : rel.totalCount, totalAmount : rel.totalAmount }]->c

Is there an alternative way to pass all the relationship names in the MATCH clause and CREATE the new relationships with the _I suffix in a single Cypher query?

When I tried the query below, I got a timeout error.

start d=node:node_auto_index('companyName:*')
match pn1<-[:SUB_OF]-d-[rel]->c-[:SUB_OF]->pn2
where pn1=pn2 return id(d), id(c), rel.totalAmount, rel.totalCount

abhi

Oct 31, 2013, 10:21:43 AM

to ne...@googlegroups.com

I tried the query below, but with no success.

START d = node:node_auto_index('companyName:*')

MATCH pn1<-[:SUB_OF]-d-[rel:WirePAYS| WireBTPAYS| Apr_2012PAYS| Trade BIRDec_2011PAYS| WireBTApr_2012PAYS]->c-[:SUB_OF]->pn2

Michael Hunger

Oct 31, 2013, 10:28:15 AM

to ne...@googlegroups.com

How many companies do you have?

Try this:

START d = node:node_auto_index('companyName:*')

MATCH pn<-[:SUB_OF]-d-[rel:WirePAYS]->c-[:SUB_OF]->pn

CREATE d-[:WirePAYS_I { totalCount : rel.totalCount, totalAmount : rel.totalAmount }]->c

Michael

abhi

Oct 31, 2013, 10:36:18 AM

to ne...@googlegroups.com

Hi Michael, thanks.

Here are the details:

pn = parentNode, of which there are more than 5 million.

d and c = debitNode and creditNode, of which there are around 15-20 million.

The edges between d and c come in around 3000 types.

So I want to create new edges based on the earlier query. In the MATCH clause I want to pass all existing relationship names except three (SUB_OF, BELONGS_TO and LOCATED_IN), and then create new relationships such as WirePAYS_I.

With your suggestion I would still have to write the MATCH and CREATE lines for each new _I relationship individually.

Thanks. :)

Michael Hunger

Oct 31, 2013, 10:41:51 AM

to ne...@googlegroups.com

If you have that many nodes, you have to batch through the results.

Something like this: start with SKIP 0 and increase it in steps of 10k until you're done.

START d = node:node_auto_index('companyName:*')

WITH d

SKIP 10000 LIMIT 10000

MATCH pn<-[:SUB_OF]-d-[rel:WirePAYS]->c-[:SUB_OF]->pn

CREATE d-[:WirePAYS_I { totalCount : rel.totalCount, totalAmount : rel.totalAmount }]->c

Another option for the match is the following (but I doubt that it is faster):

START d = node:node_auto_index('companyName:*')

WITH d

SKIP 10000 LIMIT 10000

MATCH pn<-[:SUB_OF]-d-[rel:WirePAYS]->c

WHERE (c-[:SUB_OF]->pn)

CREATE d-[:WirePAYS_I { totalCount : rel.totalCount, totalAmount : rel.totalAmount }]->c
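
The SKIP/LIMIT stepping described above could be scripted instead of edited by hand. The following is only a minimal sketch: the helper name `batched_queries` and its `total` argument (the number of index hits to page through) are assumptions made for illustration, and the code only builds query strings; it does not connect to Neo4j. The new relationship is left unnamed in CREATE so that it does not clash with the `rel` identifier bound in MATCH.

```python
def batched_queries(rel_type, total, step=10000):
    """Yield one Cypher statement per SKIP/LIMIT window.

    Hypothetical helper: rel_type is an existing relationship type,
    total is the assumed number of index hits, step is the window size.
    """
    template = (
        "START d = node:node_auto_index('companyName:*')\n"
        "WITH d\n"
        "SKIP {skip} LIMIT {limit}\n"
        "MATCH pn<-[:SUB_OF]-d-[rel:{rel_type}]->c-[:SUB_OF]->pn\n"
        # Doubled braces are literal braces in str.format templates.
        "CREATE d-[:{rel_type}_I {{ totalCount : rel.totalCount, "
        "totalAmount : rel.totalAmount }}]->c"
    )
    # SKIP starts at 0 and grows by `step` until `total` is covered.
    for skip in range(0, total, step):
        yield template.format(skip=skip, limit=step, rel_type=rel_type)


if __name__ == "__main__":
    for query in batched_queries("WirePAYS", total=25000):
        print(query)
        print()
```

Each generated statement would then be sent to the server as its own request, so that no single transaction has to cover all companies at once.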

abhi

Oct 31, 2013, 10:53:38 AM

to ne...@googlegroups.com

But Michael,

if I use this query, I have to run it many times. See:

MATCH pn<-[:SUB_OF]-d-[rel:WirePAYS]->c-[:SUB_OF]->pn

CREATE d-[rel:WirePAYS_I { totalCount : rel.totalCount, totalAmount : rel.totalAmount }]->c

For another relationship type such as WireBTPAYS I have to change the above query and run it again. For 3000 relationship types between d and c I have to run this query 3000 times.

Can I do it by combining your batch code with the query below?

// your batch code goes here

MATCH pn<-[:SUB_OF]-d-[rel:WirePAYS | WireBTPAYS]->c-[:SUB_OF]->pn

CREATE d-[rel:WirePAYS_I { totalCount : rel.totalCount, totalAmount : rel.totalAmount }]->c

CREATE d-[rel:WireBTPAYS_I { totalCount : rel.totalCount, totalAmount : rel.totalAmount }]->c

//creation of other relationships out of 3000 rels.

Thanks. :)

abhi

Oct 31, 2013, 3:01:55 PM

to ne...@googlegroups.com

I tried to find all rels between d -> c.

START d = node:node_auto_index('companyName:*')

WITH d

SKIP 10000 LIMIT 10000

MATCH pn<-[:SUB_OF]-d-[rel]->c-[:SUB_OF]->pn
return distinct type(rel)

total relationship type in neo4j dashboard = 3882
total nodes = 18809856
total edges = 2456711234
total properties = 22567849911

This again returns a timeout error, even with the SKIP and LIMIT clauses. I am unable to create the new relationships in a single shot with the query from my last post.

Thanks
Abhi

abhi

Nov 1, 2013, 3:33:56 PM

to ne...@googlegroups.com

Any suggestions? Multiple relationship types separated by pipes are not possible in a CREATE clause.

Cheers. :)

Michael Hunger

Nov 1, 2013, 8:09:54 PM

to ne...@googlegroups.com

Can you run this for me?

I still have no good understanding of your data model and what kind of amounts you're touching with this query.

And there is always the option to reduce the skip/limit window to 1000.

START d = node:node_auto_index('companyName:*')

WITH d

SKIP 10000 LIMIT 10000

MATCH pn<-[:SUB_OF]-d-[rel]->c-[:SUB_OF]->pn

return count(*), count(distinct pn), count(distinct d), count(distinct rel), count(distinct c)

Abhi Garg

Nov 2, 2013, 5:33:20 AM

to ne...@googlegroups.com

Hi Michael,

I ran the query as follows with skip/limit value as 1000.

START d = node:node_auto_index('companyName:*')
WITH d

SKIP 0 LIMIT 1000

MATCH pn<-[:SUB_OF]-d-[rel]->c-[:SUB_OF]->pn
return count(*), count(distinct pn), count(distinct d), count(distinct rel), count(distinct c)

=> 14121 146 425 14121 480

START d = node:node_auto_index('companyName:*')
WITH d

SKIP 1000 LIMIT 1000

MATCH pn<-[:SUB_OF]-d-[rel]->c-[:SUB_OF]->pn
return count(*), count(distinct pn), count(distinct d), count(distinct rel), count(distinct c)

=> 12540 153 330 12540 476

START d = node:node_auto_index('companyName:*')
WITH d

SKIP 2000 LIMIT 1000

MATCH pn<-[:SUB_OF]-d-[rel]->c-[:SUB_OF]->pn
return count(*), count(distinct pn), count(distinct d), count(distinct rel), count(distinct c)

=> 13815 123 315 13815 421

START d = node:node_auto_index('companyName:*')
WITH d

SKIP 3000 LIMIT 1000

MATCH pn<-[:SUB_OF]-d-[rel]->c-[:SUB_OF]->pn
return count(*), count(distinct pn), count(distinct d), count(distinct rel), count(distinct c)

=> 13679 132 306 13679 404

START d = node:node_auto_index('companyName:*')
WITH d

SKIP 4000 LIMIT 1000

MATCH pn<-[:SUB_OF]-d-[rel]->c-[:SUB_OF]->pn
return count(*), count(distinct pn), count(distinct d), count(distinct rel), count(distinct c)

=> 21741 131 368 21741 535

And yes, to create the new edge names I would have to build a query for each relationship type, like below, after first collecting all the relationship names. Is that a feasible solution?

MATCH pn<-[:SUB_OF]-d-[rel:WirePAYS]->c-[:SUB_OF]->pn
CREATE d-[rel:WirePAYS_I { totalCount : rel.totalCount, totalAmount : rel.totalAmount }]->c

MATCH pn<-[:SUB_OF]-d-[rel:BTPAYS]->c-[:SUB_OF]->pn
CREATE d-[rel:BTPAYS_I { totalCount : rel.totalCount, totalAmount : rel.totalAmount }]->c

MATCH pn<-[:SUB_OF]-d-[rel:WireBTPAYS]->c-[:SUB_OF]->pn
CREATE d-[rel:WireBTPAYS_I { totalCount : rel.totalCount, totalAmount : rel.totalAmount }]->c
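
Since the relationship type in a CREATE clause has to be written out literally, one query per type seems hard to avoid, but the statements do not have to be written by hand. The sketch below generates them from a list of collected type names. It is only an illustration under stated assumptions: the names `EXCLUDED`, `build_query` and `build_all` are hypothetical, the type list is assumed to have been collected beforehand (for example with `return distinct type(rel)`), and the code only produces query text without talking to Neo4j. Type names are wrapped in backticks so that names containing spaces are accepted, and the created relationship is left unnamed so that it does not clash with `rel` from the MATCH.

```python
# Structural relationship types that should not get an _I counterpart
# (the three named earlier in the thread).
EXCLUDED = {"SUB_OF", "BELONGS_TO", "LOCATED_IN"}

TEMPLATE = (
    "START d = node:node_auto_index('companyName:*')\n"
    "WITH d\n"
    "SKIP {skip} LIMIT {limit}\n"
    "MATCH pn<-[:SUB_OF]-d-[rel:`{old}`]->c-[:SUB_OF]->pn\n"
    # Doubled braces are literal braces in str.format templates.
    "CREATE d-[:`{new}` {{ totalCount : rel.totalCount, "
    "totalAmount : rel.totalAmount }}]->c"
)


def build_query(rel_type, skip=0, limit=1000):
    """Return the Cypher text for one relationship type and one window."""
    if "`" in rel_type:
        # Keep the sketch simple: refuse names that would break the quoting.
        raise ValueError("backtick in relationship type: %r" % rel_type)
    return TEMPLATE.format(skip=skip, limit=limit,
                           old=rel_type, new=rel_type + "_I")


def build_all(rel_types, skip=0, limit=1000):
    """Build one query per type, skipping the excluded structural types."""
    return [build_query(t, skip, limit)
            for t in rel_types if t not in EXCLUDED]


if __name__ == "__main__":
    collected = ["WirePAYS", "BTPAYS", "WireBTPAYS", "SUB_OF"]
    for query in build_all(collected):
        print(query)
        print()
```

Running each generated statement for every SKIP window keeps the individual transactions small, at the cost of many round trips.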