Create new relationships in a single Cypher query based on some condition


abhi

Oct 31, 2013, 9:54:02 AM
to ne...@googlegroups.com

I have around 3000 relationship types between d -> c, such as PAYS, WirePAYS, WireBTPAYS, Apr_2012PAYS, etc.
I want to create matching relationships such as PAYS_I, WirePAYS_I, etc. wherever d and c have a common parent, i.e. pn1 and pn2 in the MATCH and WHERE clauses below are the same node.

If I run it as below, one relationship type at a time, each query takes a long time. Is there another way that avoids the timeout error?

START d = node:node_auto_index('companyName:*')
  MATCH pn1<-[:SUB_OF]-d-[rel:WirePAYS]->c-[:SUB_OF]->pn2
  WHERE pn1 = pn2
  CREATE d-[:WirePAYS_I { totalCount : rel.totalCount, totalAmount : rel.totalAmount }]->c


Is there an alternative way to pass all the relationship names in the MATCH clause and CREATE the new relationships with the _I suffix in a single Cypher query?


When I tried the query below, I got a timeout error.

start d=node:node_auto_index('companyName:*')
match pn1<-[:SUB_OF]-d-[rel]->c-[:SUB_OF]->pn2
where pn1=pn2 return id(d), id(c), rel.totalAmount, rel.totalCount

abhi

Oct 31, 2013, 10:21:43 AM
to ne...@googlegroups.com
I tried the following, but with no success.
 
START d = node:node_auto_index('companyName:*')

MATCH pn1<-[:SUB_OF]-d-[rel:WirePAYS| WireBTPAYS| Apr_2012PAYS| Trade BIRDec_2011PAYS| WireBTApr_2012PAYS]->c-[:SUB_OF]->pn2

Michael Hunger

Oct 31, 2013, 10:28:15 AM
to ne...@googlegroups.com
How many companies do you have?

Try this:

START d = node:node_auto_index('companyName:*')
  MATCH pn<-[:SUB_OF]-d-[rel:WirePAYS]->c-[:SUB_OF]->pn
  CREATE d-[:WirePAYS_I { totalCount : rel.totalCount, totalAmount : rel.totalAmount }]->c

Michael



abhi

Oct 31, 2013, 10:36:18 AM
to ne...@googlegroups.com
Hi Michael, thanks.

Some details:
 pn = parent node; there are more than 5 million of them.
 d and c = debit node and credit node; there are around 15-20 million of them.
 The edges between d and c have around 3000 types.

So I want to create new edges based on the earlier query. In the MATCH clause I want to pass all existing relationship names except three (SUB_OF, BELONGS_TO and LOCATED_IN), and then create new relationships such as WirePAYS_I.

With your suggestion I still have to write the MATCH and CREATE lines separately for each new _I relationship.

Thanks. :)
 

Michael Hunger

Oct 31, 2013, 10:41:51 AM
to ne...@googlegroups.com
If you have that many nodes, you have to batch through the results.

Something like this: start with SKIP 0 and increase it in steps of 10k until you're done.

START d = node:node_auto_index('companyName:*')
WITH d
SKIP 10000 LIMIT 10000
MATCH pn<-[:SUB_OF]-d-[rel:WirePAYS]->c-[:SUB_OF]->pn
CREATE d-[:WirePAYS_I { totalCount : rel.totalCount, totalAmount : rel.totalAmount }]->c

Another option for the MATCH (though I doubt that it is faster):

START d = node:node_auto_index('companyName:*')
WITH d
SKIP 10000 LIMIT 10000
MATCH pn<-[:SUB_OF]-d-[rel:WirePAYS]->c
WHERE (c-[:SUB_OF]->pn)
CREATE d-[:WirePAYS_I { totalCount : rel.totalCount, totalAmount : rel.totalAmount }]->c
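
The windowing described above (increase SKIP from 0 in steps of 10k) can be generated rather than edited by hand. The following is only a minimal Python sketch: `batch_queries` and `total_start_nodes` are hypothetical names, not part of Neo4j, and the code only builds the query strings, leaving execution to whatever client is in use. It assumes the index lookup returns nodes in the same order on every run, because SKIP/LIMIT without ORDER BY does not guarantee that. The CREATE pattern leaves out the `rel` identifier, since `rel` is already bound by the MATCH.

```python
# Template for one batch window; doubled braces are literal braces for str.format.
BATCH_TEMPLATE = (
    "START d = node:node_auto_index('companyName:*')\n"
    "WITH d\n"
    "SKIP {skip} LIMIT {limit}\n"
    "MATCH pn<-[:SUB_OF]-d-[rel:WirePAYS]->c-[:SUB_OF]->pn\n"
    "CREATE d-[:WirePAYS_I {{ totalCount : rel.totalCount, "
    "totalAmount : rel.totalAmount }}]->c"
)


def batch_queries(total_start_nodes, limit=10000):
    """Yield one query per SKIP/LIMIT window until all start nodes are covered.

    total_start_nodes is an assumed input: the number of nodes the index
    lookup returns, which has to be obtained separately.
    """
    for skip in range(0, total_start_nodes, limit):
        yield BATCH_TEMPLATE.format(skip=skip, limit=limit)


if __name__ == "__main__":
    # Example: 25,000 start nodes are covered by three windows.
    for query in batch_queries(25000):
        print(query)
        print("----")
```

Each generated query would run in its own transaction, so a smaller `limit` trades more round trips for less work per transaction.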

abhi

Oct 31, 2013, 10:53:38 AM
to ne...@googlegroups.com
But Michael,

if I use this query, I have to run it many times. Take:

MATCH pn<-[:SUB_OF]-d-[rel:WirePAYS]->c-[:SUB_OF]->pn
CREATE d-[:WirePAYS_I { totalCount : rel.totalCount, totalAmount : rel.totalAmount }]->c

For another relationship type such as WireBTPAYS, I have to change the query above and run it again. For 3000 relationship types between d and c, I have to run it 3000 times.

Can I combine your batching code with something like the following?
 
// your batch code goes here
MATCH pn<-[:SUB_OF]-d-[rel:WirePAYS | WireBTPAYS]->c-[:SUB_OF]->pn 
CREATE d-[rel:WirePAYS_I { totalCount : rel.totalCount, totalAmount : rel.totalAmount }]->c
CREATE d-[rel:WireBTPAYS_I { totalCount : rel.totalCount, totalAmount : rel.totalAmount }]->c
//creation of other relationships out of 3000 rels.
 
 
Thanks. :)

abhi

Oct 31, 2013, 3:01:55 PM
to ne...@googlegroups.com
I tried to find all relationship types between d -> c.

START d = node:node_auto_index('companyName:*')
WITH d
SKIP 10000 LIMIT 10000
MATCH pn<-[:SUB_OF]-d-[rel]->c-[:SUB_OF]->pn
return distinct type(rel)

Total relationship types in the Neo4j dashboard = 3882
Total nodes = 18809856
Total edges = 2456711234
Total properties = 22567849911


This returns a timeout error again, even with the SKIP and LIMIT clauses. I am unable to create the new relationships in a single shot with the query from my last post.

Thanks
Abhi




abhi

Nov 1, 2013, 3:33:56 PM
to ne...@googlegroups.com
Any suggestions? Multiple pipe-separated relationship types are not possible in a CREATE clause.

Cheers. :)


Michael Hunger

Nov 1, 2013, 8:09:54 PM
to ne...@googlegroups.com
Can you run this for me?

I still have no good understanding of your data model and what kind of amounts you're touching with this query.

And there is always the option to reduce the skip/limit window to 1000.

START d = node:node_auto_index('companyName:*')
WITH d
SKIP 10000 LIMIT 10000
MATCH pn<-[:SUB_OF]-d-[rel]->c-[:SUB_OF]->pn
return count(*), count(distinct pn), count(distinct d), count(distinct rel), count(distinct c)



Abhi Garg

Nov 2, 2013, 5:33:20 AM
to ne...@googlegroups.com
Hi Michael,

I ran the query as follows, with a LIMIT of 1000 and SKIP increased in steps of 1000.

START d = node:node_auto_index('companyName:*')
WITH d
SKIP 0 LIMIT 1000
MATCH pn<-[:SUB_OF]-d-[rel]->c-[:SUB_OF]->pn
return count(*), count(distinct pn), count(distinct d), count(distinct rel), count(distinct c)

=> 14121   146   425   14121   480


START d = node:node_auto_index('companyName:*')
WITH d
SKIP 1000 LIMIT 1000
MATCH pn<-[:SUB_OF]-d-[rel]->c-[:SUB_OF]->pn
return count(*), count(distinct pn), count(distinct d), count(distinct rel), count(distinct c)

=> 12540   153   330   12540   476

START d = node:node_auto_index('companyName:*')
WITH d
SKIP 2000 LIMIT 1000
MATCH pn<-[:SUB_OF]-d-[rel]->c-[:SUB_OF]->pn
return count(*), count(distinct pn), count(distinct d), count(distinct rel), count(distinct c)

=> 13815   123   315   13815   421

START d = node:node_auto_index('companyName:*')
WITH d
SKIP 3000 LIMIT 1000
MATCH pn<-[:SUB_OF]-d-[rel]->c-[:SUB_OF]->pn
return count(*), count(distinct pn), count(distinct d), count(distinct rel), count(distinct c)

=> 13679   132   306   13679   404

START d = node:node_auto_index('companyName:*')
WITH d
SKIP 4000 LIMIT 1000
MATCH pn<-[:SUB_OF]-d-[rel]->c-[:SUB_OF]->pn
return count(*), count(distinct pn), count(distinct d), count(distinct rel), count(distinct c)

=> 21741   131   368   21741   535
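
The five result rows above can be totalled to estimate how much work one window does. This is a small Python sketch over the posted numbers only; `summarize` is a hypothetical helper name, and summing the distinct d counts across windows assumes the SKIP/LIMIT windows do not overlap.

```python
# Rows as posted above:
# (skip, count(*), distinct pn, distinct d, distinct rel, distinct c)
WINDOWS = [
    (0, 14121, 146, 425, 14121, 480),
    (1000, 12540, 153, 330, 12540, 476),
    (2000, 13815, 123, 315, 13815, 421),
    (3000, 13679, 132, 306, 13679, 404),
    (4000, 21741, 131, 368, 21741, 535),
]


def summarize(windows, limit=1000):
    """Aggregate the per-window counts into a few totals and ratios."""
    total_rows = sum(w[1] for w in windows)
    matched_d = sum(w[3] for w in windows)  # assumes windows do not overlap
    start_nodes = limit * len(windows)
    return {
        "total_rows": total_rows,
        "matched_d": matched_d,
        "share_of_start_nodes_matched": matched_d / start_nodes,
        "rows_per_matched_d": total_rows / matched_d,
        # True when every row is a different relationship, i.e. a CREATE
        # per row would write exactly one new relationship per matched one.
        "rows_equal_distinct_rels": all(w[1] == w[4] for w in windows),
    }


if __name__ == "__main__":
    for key, value in summarize(WINDOWS).items():
        print(key, value)
```

In every window count(*) equals count(distinct rel), so a CREATE per row would write between roughly 12,500 and 21,700 relationships per 1000 start nodes.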

And yes, to create the new edge names I have to build a query for each relationship type as below, after first collecting all the relationship names. Is that a feasible solution?

MATCH pn<-[:SUB_OF]-d-[rel:WirePAYS]->c-[:SUB_OF]->pn 
CREATE d-[rel:WirePAYS_I { totalCount : rel.totalCount, totalAmount : rel.totalAmount }]->c 
MATCH pn<-[:SUB_OF]-d-[rel:BTPAYS]->c-[:SUB_OF]->pn 
CREATE d-[rel:BTPAYS_I { totalCount : rel.totalCount, totalAmount : rel.totalAmount }]->c 
MATCH pn<-[:SUB_OF]-d-[rel:WireBTPAYS]->c-[:SUB_OF]->pn 
CREATE d-[rel:WireBTPAYS_I { totalCount : rel.totalCount, totalAmount : rel.totalAmount }]->c 
...
...


Cheers. :) 
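
Building one query per relationship type, as proposed above, can be scripted once the type names have been collected. The following is a minimal Python sketch under stated assumptions, not a tested migration: `queries_for`, `quote_type` and `total_start_nodes` are hypothetical names, the list of types is assumed to come from a separate `return distinct type(rel)` query, and types containing characters such as spaces are assumed to need backtick quoting. Each MATCH/CREATE pair is emitted as its own query, combined with the SKIP/LIMIT window from earlier in the thread, and the CREATE pattern omits the `rel` identifier because MATCH has already bound it.

```python
import re

# Relationship types the poster wants to leave untouched.
EXCLUDED = {"SUB_OF", "BELONGS_TO", "LOCATED_IN"}

# Doubled braces are literal braces for str.format.
QUERY_TEMPLATE = (
    "START d = node:node_auto_index('companyName:*')\n"
    "WITH d\n"
    "SKIP {skip} LIMIT {limit}\n"
    "MATCH pn<-[:SUB_OF]-d-[rel:{old}]->c-[:SUB_OF]->pn\n"
    "CREATE d-[:{new} {{ totalCount : rel.totalCount, "
    "totalAmount : rel.totalAmount }}]->c"
)


def quote_type(name):
    """Backtick-quote a type name unless it is a plain identifier."""
    if "`" in name:
        raise ValueError("backticks in type names are not handled: %r" % name)
    if re.fullmatch(r"[A-Za-z_][A-Za-z0-9_]*", name):
        return name
    return "`" + name + "`"


def queries_for(rel_types, total_start_nodes, limit=1000):
    """Yield one query per (relationship type, SKIP/LIMIT window)."""
    for rel_type in rel_types:
        # Assumption: types already ending in _I were created by an earlier
        # run and are skipped, together with the three excluded types.
        if rel_type in EXCLUDED or rel_type.endswith("_I"):
            continue
        for skip in range(0, total_start_nodes, limit):
            yield QUERY_TEMPLATE.format(
                skip=skip,
                limit=limit,
                old=quote_type(rel_type),
                new=quote_type(rel_type + "_I"),
            )


if __name__ == "__main__":
    for query in queries_for(["WirePAYS", "BTPAYS", "WireBTPAYS"], 2000):
        print(query)
        print("----")
```

Because CREATE is not idempotent, running the same window twice would create duplicate _I relationships, so the script that executes these queries should record which windows have completed.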



