Creating many relationships with one query


Gleb Chermennov

Nov 18, 2013, 5:49:50 AM
to ne...@googlegroups.com
I'm trying to create multiple relationships at once, using the CREATE UNIQUE clause (I was advised to stick with Cypher queries).
Is that scenario possible at all? I couldn't find an answer on Google or in the docs.
I'm importing a sql database, so creating relationships one by one is not an option for me.
My initial attempt was this query:
MATCH left, right WHERE (ID(right) IN [1, 2, 3] AND ID(left) IN [4, 5, 6]) WITH left, right ORDER BY 1 SKIP 0 LIMIT 1 RETURN left, right
UNION
MATCH left, right WHERE (ID(right) IN [1, 2, 3] AND ID(left) IN [4, 5, 6]) WITH left, right ORDER BY 1 SKIP 3 LIMIT 1 RETURN left, right 
UNION
MATCH left, right WHERE (ID(right) IN [1, 2, 3] AND ID(left) IN [4, 5, 6]) WITH left, right ORDER BY 1 SKIP 6 LIMIT 1 RETURN left, right 
CREATE UNIQUE left-[rel:FRIEND]->right RETURN rel;
The idea is that I assemble a dataset of the nodes I want to connect, then create relationships between them.
This query doesn't work because a query can't contain multiple result (RETURN) statements.
Then I tried another query:
MATCH left, right WHERE (ID(right) IN [1, 2, 3] AND ID(left) IN [4, 5, 6]) WITH left, right LIMIT 1 UNION
MATCH left, right WHERE (ID(right) IN [1, 2, 3] AND ID(left) IN [4, 5, 6]) WITH left, right SKIP 4 LIMIT 1 UNION
MATCH left, right WHERE (ID(right) IN [1, 2, 3] AND ID(left) IN [4, 5, 6]) WITH left, right SKIP 8 LIMIT 1 
CREATE UNIQUE left-[rel:FRIEND]->right RETURN rel;
but that doesn't work either (query analyser throws an error).
My final approach was this (same as number 2 but without unions):
MATCH left, right WHERE (ID(right) IN [1, 2, 3] AND ID(left) IN [4, 5, 6]) WITH left, right LIMIT 1
MATCH left, right WHERE (ID(right) IN [1, 2, 3] AND ID(left) IN [4, 5, 6]) WITH left, right SKIP 4 LIMIT 1
MATCH left, right WHERE (ID(right) IN [1, 2, 3] AND ID(left) IN [4, 5, 6]) WITH left, right SKIP 8 LIMIT 1 
CREATE UNIQUE left-[rel:FRIEND]->right RETURN rel;
but that just returns 0 rows, so it doesn't work either.
Any suggestions? There's something I'm doing wrong here, obviously.

Nigel Small

Nov 18, 2013, 5:54:59 AM
to Neo4J
You can include multiple CREATE UNIQUE statements within a single query, something like this:

MERGE (a:Person { name:'Alice' })
MERGE (b:Person { name:'Bob' })
CREATE UNIQUE (a)-[:KNOWS]->(b)
CREATE UNIQUE (b)-[:KNOWS]->(a)

Nige


--
You received this message because you are subscribed to the Google Groups "Neo4j" group.
To unsubscribe from this group and stop receiving emails from it, send an email to neo4j+un...@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.

Gleb Chermennov

Nov 18, 2013, 6:11:53 AM
to ne...@googlegroups.com
Can this query be applied to more than 2 nodes?

On Monday, November 18, 2013, 14:54:59 UTC+4, Nigel Small wrote:

Nigel Small

Nov 18, 2013, 6:33:34 AM
to Neo4J
Yes, I just gave a small example. You can have as many as you like.

Gleb Chermennov

Nov 18, 2013, 6:38:48 AM
to ne...@googlegroups.com
But if I can only do exact comparison (i.e. on node properties), that doesn't sound very good.

On Monday, November 18, 2013, 15:33:34 UTC+4, Nigel Small wrote:

Nigel Small

Nov 18, 2013, 6:42:38 AM
to Neo4J
Then I'm not sure I understand your use case. I thought you wanted to be able to execute multiple CREATE UNIQUE clauses in one statement.

Gleb Chermennov

Nov 18, 2013, 7:54:43 AM
to ne...@googlegroups.com
Yes, I do.

On Monday, November 18, 2013, 15:42:38 UTC+4, Nigel Small wrote:

Gleb Chermennov

Nov 18, 2013, 8:15:20 AM
to ne...@googlegroups.com
Oh, my mistake. I didn't say it explicitly, but I need to create multiple relationships at once, and not necessarily between the same nodes (i.e. not a two-way relation between nodes a and b, but relations between b and c, b and d, e and f, etc.).

On Monday, November 18, 2013, 15:42:38 UTC+4, Nigel Small wrote:

Gleb Chermennov

Nov 18, 2013, 8:25:06 AM
to ne...@googlegroups.com
I experimented a bit more and came up with this version of your query:
MERGE (a:User { Id:1 })
MERGE (d:User { Id:2 })
MERGE (b:User { Id:100 })
MERGE (c:User { Id:101 })
CREATE UNIQUE (a)-[:FRIEND]->(b), (c)-[:FRIEND]->(d)

It can, I guess, be extended to support a fairly large number of nodes/relationships.
Thanks for your help, appreciate it.
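
Extending that pattern by hand gets tedious, so the statement could be generated from a list of ID pairs. The following is a minimal Python sketch and not something posted in this thread: `chunk` and `build_friend_statement` are hypothetical helper names, the `User`, `Id` and `FRIEND` names are taken from the query above, and the `{id0}`-style placeholders assume the Neo4j 2.x parameter syntax. Sending the statement to the server is left out.

```python
def chunk(items, size):
    """Split a list into consecutive slices of at most `size` items."""
    return [items[i:i + size] for i in range(0, len(items), size)]


def build_friend_statement(pairs):
    """Build one Cypher statement plus parameters for (from_id, to_id) pairs.

    Each distinct ID gets one MERGE line; all relationships go into a
    single CREATE UNIQUE clause, mirroring the hand-written query above.
    """
    var_for_id = {}      # node ID -> Cypher identifier (n0, n1, ...)
    params = {}          # parameter name -> node ID
    merge_lines = []
    patterns = []
    for from_id, to_id in pairs:
        for node_id in (from_id, to_id):
            if node_id not in var_for_id:
                index = len(var_for_id)
                var_for_id[node_id] = "n%d" % index
                params["id%d" % index] = node_id
                merge_lines.append(
                    "MERGE (n%d:User { Id: {id%d} })" % (index, index))
        patterns.append("(%s)-[:FRIEND]->(%s)"
                        % (var_for_id[from_id], var_for_id[to_id]))
    query = "\n".join(merge_lines + ["CREATE UNIQUE " + ", ".join(patterns)])
    return query, params


# Hypothetical input: the two relationships from the query above.
pairs = [(1, 100), (101, 2)]
for batch in chunk(pairs, 15):   # 15 pairs per statement is an arbitrary choice
    query, params = build_friend_statement(batch)
    print(query)
    print(params)
```

Keeping each generated statement small and sending several of them is a design choice of the sketch; the right batch size would have to be found by measuring.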

On Monday, November 18, 2013, 17:15:20 UTC+4, Gleb Chermennov wrote:

Gustavo Tandeciarz

Feb 26, 2014, 7:22:42 AM
to ne...@googlegroups.com
I just wanted to quickly chime in here, as there seems to be a maximum limit to creating relationships in bulk (I don't know what it is, but it seems to be based on memory allocation).
Graph DB node counts:
Contact nodes: 42k
ContactMembership nodes: 52k
ContactMembershipType nodes: 6k

When I try the following query the system errors out with "Unknown Error":

MATCH (s:ContactMembership), 
(contact:Contact {ContactId : s.ContactId}) , 
(contactmembershiptype:ContactMembershipType {ContactMembershipTypeId : s.ContactMembershipTypeId})

MERGE (contact)-[:CONTACT_CONTACTMEMBERSHIPTYPE {ContactId : s.ContactId, ContactMembershipTypeId : s.ContactMembershipTypeId}]->(contactmembershiptype)

However, the same query works when I do Contacts and Addresses.
Contacts: 42k
Addresses: 50k
ContactAddresses: 50k

MATCH (s:ContactAddress), 
(contact:Contact {ContactId : s.ContactId}) , 
(address:Address {AddressId : s.AddressId})

MERGE (contact)-[:CONTACT_ADDRESS {ContactId : s.ContactId, AddressId : s.AddressId}]->(address)

I'm not sure if this is a memory allocation issue or not.  I posted a question on StackOverflow about this.
And one about memory not being allocated anymore after multiple service restarts:

Michael Hunger

Feb 26, 2014, 8:33:09 AM
to ne...@googlegroups.com
I think creating huge statements like that torments the parser.

Do small enough ones, i.e. at most 15 (in extreme cases 50) element
clauses per statement.

Otherwise split it up in smaller statements each of which creates a
smaller subgraph that you then connect.

Gustavo Tandeciarz

Feb 26, 2014, 11:15:42 AM
to ne...@googlegroups.com
Yeah, it's very clear that the parser does not like that.  I will run that statement and the processor will stay locked at 30-40% even after the console returns the incredibly useful, "Unknown Error"



Gustavo Tandeciarz

Feb 26, 2014, 11:22:35 AM
to ne...@googlegroups.com
Ah, weird!  It completed the process even though the console returned "Unknown Error", which I'm assuming is just a timeout.

The really odd thing is that the exact same query running against Contact -> ContactAddress -> Address works fine and takes about 12 seconds to complete for 50k+ nodes/relationships. But for ContactMembership (50k nodes/relationships) it times out (takes much longer).

This completes after a timeout (so more than 30 seconds I guess):

MATCH (contactmembership:ContactMembership)
MATCH (contact:Contact {ContactId : contactmembership.ContactId})
MATCH (contactmembershiptype:ContactMembershipType {ContactMembershipTypeId : contactmembership.ContactMembershipTypeId})
MERGE (contact)-[r:CONTACT_CONTACTMEMBERSHIPTYPE]->(contactmembershiptype)

This completes in 12 seconds (roughly the same number of relationships being created):

MATCH (ca:ContactAddress)
MATCH (c:Contact {ContactId:ca.ContactId})
MATCH (a:Address {AddressId:ca.AddressId})
MERGE p = (c)-[r:CONTACT_ADDRESS {ContactId:ca.ContactId,AddressId:ca.AddressId}]->(a)

The difference is that the ContactMembership node contains other properties like ContactMembershipStatusId and ContactMembershipSubStatusId whereas the ContactAddress nodes only contain ContactId and AddressId.

ContactMembership is schema indexed on all properties. ContactAddress is not.  For comparison, I've tried ContactMembership with schema indexing and without - with no difference in the result.

Michael Hunger

Feb 26, 2014, 12:05:53 PM
to ne...@googlegroups.com
Yep, browser timeout after 30s.

I'd still go with smaller statements that are driven from a script and
use parameters.

The problem is, I think, that you create a cross-product between all
those different entities you look up.

So it might be sensible to limit that per clause. Also note that there
is an issue in Neo4j 2.0 which makes tx sizes of > 1k that do updates
and lookups at the same time perform not so well.
Might be better in 2.1-M01

Could you try how long it takes if you can limit your tx size to around 1k ?

Gustavo Tandeciarz

Feb 26, 2014, 12:13:07 PM
to ne...@googlegroups.com
I was looking to see what the syntax would be to try exactly that.  I get an error with the following statement:


MATCH (contactmembership:ContactMembership)
MATCH (contact:Contact {ContactId : contactmembership.ContactId})
MATCH (contactmembershiptype:ContactMembershipType {ContactMembershipTypeId : contactmembership.ContactMembershipTypeId})
MERGE (contact)-[r:CONTACT_CONTACTMEMBERSHIPTYPE]->(contactmembershiptype)
SKIP 100 LIMIT 100

How would I batch this out?  Once I have the cypher query working, I can try to find a way to do this from the c# application automatically, rather than having to crunch the numbers and do it manually.

When I run the count for each entity, I get 50981 across the board. So I think it's matching 1:1:1 since there are 50981 ContactMembership nodes altogether.

If I'm going to do this by passing in parameters, I'm going to need to modify the C# code to account for that and, I guess, put everything in a FOREACH loop, with the list of ContactMembership items passed in? The issue I ran into doing it this way before is that I had to write it out as foreach (x in y | Merge c:Contact {id = x.id} , Merge cmt:ContactMembershipType {id=x.cmtId} and then a final merge to create the relationship between c and cmt. The foreach method was actually much slower than the match method above (most likely because of the 3 merge statements within it).

Gustavo Tandeciarz

Feb 26, 2014, 12:22:17 PM
to ne...@googlegroups.com
It looks like I might need to use WITH before the merge statement where I put my limit / skip logic

MATCH (contactmembership:ContactMembership)
MATCH (contact:Contact {ContactId : contactmembership.ContactId})
MATCH (contactmembershiptype:ContactMembershipType {ContactMembershipTypeId : contactmembership.ContactMembershipTypeId})
WITH contactmembership, contact, contactmembershiptype SKIP 100 LIMIT 100 
MERGE (contact)-[r:CONTACT_CONTACTMEMBERSHIPTYPE]->(contactmembershiptype)
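
To drive this from a script, the SKIP/LIMIT windows could be computed up front. Below is a minimal Python sketch under stated assumptions: `batch_windows` is a hypothetical helper, the query text is the one above with the two numbers replaced by placeholders, and 50981 is the row count mentioned earlier in the thread. Cypher does not guarantee a stable row order between separate queries unless an ORDER BY is added, so treat this as an illustration of the windowing arithmetic rather than a safe import procedure.

```python
def batch_windows(total, size):
    """Return (skip, limit) pairs that cover `total` rows in steps of `size`."""
    return [(skip, min(size, total - skip)) for skip in range(0, total, size)]


# Query text from the post above, with SKIP/LIMIT left as placeholders.
QUERY_TEMPLATE = (
    "MATCH (contactmembership:ContactMembership)\n"
    "MATCH (contact:Contact {ContactId : contactmembership.ContactId})\n"
    "MATCH (contactmembershiptype:ContactMembershipType "
    "{ContactMembershipTypeId : contactmembership.ContactMembershipTypeId})\n"
    "WITH contactmembership, contact, contactmembershiptype "
    "SKIP %d LIMIT %d\n"
    "MERGE (contact)-[r:CONTACT_CONTACTMEMBERSHIPTYPE]->(contactmembershiptype)"
)

windows = batch_windows(50981, 1000)
print(len(windows))                  # number of statements to send
print(QUERY_TEMPLATE % windows[1])   # second batch: SKIP 1000 LIMIT 1000
```

Because the relationship is written with MERGE, re-running a window should not create duplicates, which makes retrying a failed batch less risky.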

BTW, doing some math:
100 takes 561ms
500 takes 2003ms
1000 takes 3690ms
5000 takes 17092ms

So I get a better ratio of time/node the higher the batch (obviously limited by the timeout).
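
The per-row cost behind that observation can be worked out from the four timings above. A small Python sketch of the arithmetic (the variable names are illustrative):

```python
# Batch size -> elapsed milliseconds, as reported above.
timings_ms = {100: 561, 500: 2003, 1000: 3690, 5000: 17092}

# Milliseconds per row for each batch size.
per_item_ms = {size: float(elapsed) / size for size, elapsed in timings_ms.items()}

for size in sorted(per_item_ms):
    print("%5d rows: %.2f ms per row" % (size, per_item_ms[size]))
```

The cost falls from about 5.6 ms per row at a batch of 100 to about 3.4 ms per row at a batch of 5000, so most of the gain is already there at 1000.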

Gustavo Tandeciarz

Feb 26, 2014, 3:54:26 PM
to ne...@googlegroups.com
FYI, I got this working now :)
I'm batching my Cypher queries on the C# side and using a FOREACH with a RANGE loop to manage the relationships. Average time is 78ms per batch of 500 relationships :)

Michael Hunger

Feb 27, 2014, 3:53:11 AM
to ne...@googlegroups.com
Ok, cool.

What I'm really wondering about is that the operation you're doing really looks a lot like a relational database join on foreign keys? Is that because this was your raw data and you now recreate the rels?
Why not create the rels when you insert the data into Neo4j in the first place?

ContactMembership sounds like a join table name? What's its usage? Perhaps you can also share a drawn model of your graph? (or perhaps even create a graph-gist -> gist.neo4j.org)
Also could you represent ContactMembershipType as labels on the Contact nodes directly?

Michael

Gustavo Tandeciarz

Feb 27, 2014, 7:28:55 AM
to ne...@googlegroups.com
First, sorry for the long email...I have a lot of explaining to do so you have a good idea as to the project goals:

Your hunches are correct.  Our data is currently housed in a SQL database.  My goal with Neo4j is to create a READ database for read-only queries, pulling those away from our transactional database.  Then, build an ad-hoc reporting engine (data mining) on top of that.  For that reason, I want to try doing this by maintaining the pseudo-normalized data structure, and have the joins be the relationships.   
ContactMembership is one of the more basic examples, and while I could use the ContactMembership matrix to apply labels to the Contact nodes, it can get very tricky very quickly.  Right now, I'm trying to build an automated way of importing the data in bulk (and I think I'm at a good place with that so I can move to step 2 - update specific properties on nodes when data is updated in the transactional database). 

Step 2 is where things can start to get tricky. If a contact's membership changes, I would need to remove the label from the contact (which, I think, can be done without deleting the contact node altogether?). I think in the case of ContactMembership, it makes more sense to have that data be a label on the Contact entity. It gets a little tricky when contacts have multiple memberships, and those memberships have statuses and substatuses (like Active or Inactive, etc.). They are still part of that membership role, but may not be currently active in that role. Once I get the data updating using generic methods, I will start thinking more heavily about the model and what I want to store, where and how.

I'm going to build out graph-gist for you (had no idea this existed!).

As for creating the relationships when inserting the data in the first place, I would LOVE to do that, but there are a few caveats to that:
In the example of ContactMembership (which is just a join table), not every contact is assigned a membership role, so I may miss some contact imports into the system, which could lead to problems later.
I would also have to create a query that passes in the contacts as a list param, the ContactMembership as another list param and any other related sub-tables like ContactMembershipType and ContactMembershipStatus. Then process all of these by looping through the ContactMembership table? I guess that would really only be 3 calls to the SQL database and then a lot of looping in the application. For performance, do you think that would be faster than first doing a bulk insert of all related tables and sub-tables in the ContactMembership database? Right now, this is how I'm parsing things out (I'm going to draw this graph in ASCII first, then figure out how to make a graph gist and send that later if that's alright):

ContactMembershipTable          -[import]->  ContactTable            -[import]->  ContactOccupationTable
- ContactId                                  - ContactOccupationId                - OccupationName
- ContactMembershipTypeId                    - ContactPrefixId                    - RenderingOrder
- ContactMembershipStatusId
- ContactMembershipSubStatusId

Above is a basic (first round) example of what my process does right now, described below.
It takes the ContactMembershipTable and:
1. Finds all the related sub-tables and their children 
2. Imports all subtables
3. Relates the children to the parent tables
4. Imports the ContactMembership nodes (This is MUCH faster than passing in the ContactMembership table as a param and doing a foreach loop with multiple merges)
5. Iterates through each property that has a related node type and creates those relations to the primary table (in this case, Contact).
    - So, skip ContactId (because we don't want to relate to ourselves...that's dumb)
    - Relate Contact nodes by Id to the appropriate ContactMembershipType (and so on)

During the import, if it sees that the child table (POCO object) being imported has the property "Name", it assigns that as a property to the Relationship, so that I can query 

     (n)-[r:CONTACTMEMBERSHIPSTATUS]->()
     where r.Name = 'Active'

Now, if I were to add the ContactMembershipType as a label to the contact, I could query (n:Player)-[r:CONTACTMEMBERSHIPSTATUS {Name:'Active'}]->() to return all active players.  I have to be very careful doing this though, because there is a potential for label overlap.  I need to be sure that there are no ContactMembershipTypes that have the same name as another primary node type.  This is the primary reason why I haven't added the labeling logic based on xType table imports. 

This entire process takes a little under 2 minutes (118 seconds) for ContactMembership and ContactAddress together, give or take, and is 100% automated now. All I do is pass in the matrix (i.e. ContactMembership or ContactAddress) and the PrimaryRelationNodeType (i.e. Contact).

192320   Unique Nodes
1709926  Properties
7        RelationshipTypeIds
123909   Relationships


Right now, I'm building this as a C# console app, so that later I can pass in arguments to run targeted imports.  

c:\ToNeo4j.exe -importMatrix ContactMembership -importChildren 
or 
c:\ToNeo4j.exe -relateOnly -source ContactPlayer -target Contact -sourceId ContactPlayerId -targetId ContactId

After all is said and done, I might build it as a windows form later so that I can have a GUI, listing all the POCO objects that can be imported and add flags to the import/relation queries.  Then, after THAT is done, add a config option and release it as open source so others can migrate their own data to Neo4j (or make mine better).  I'm primarily a web-front end developer so my c# code may not be as awesome as it could be.

If you can recommend a good method for importing the data and relating the nodes at the same time (using C#, because I have never developed in Java), I'm all ears. I don't know what I don't know...but am more than happy to learn!

Here is the Console output:
(after reading through the output, I realized that I could also track relationships I've already created (not just POCO's inserted) so as to not do duplicate relations, shaving off more time)

This gets executed by running:

Program.cs     
    Migrator.processContactMembership();
    Migrator.processContactAddresses();

Migrator.cs
    public static class Migrator
    {
        private static INeo4jRepository neoRepo;
        private static List<Type> alreadyInsertedTypes;
        public static void initialize()
        {
            neoRepo = ObjectFactory.GetNamedInstance<INeo4jRepository>("NEO_RAIDER");
        }
        public static void processContactAddresses()
        {
           alreadyInsertedTypes = neoRepo.processRelationshipMatrix(new ContactAddress(), new Contact(), alreadyInsertedTypes);
        }
        public static void processContactMembership()
        {
           alreadyInsertedTypes = neoRepo.processRelationshipMatrix(new ContactMembership(), new Contact(), alreadyInsertedTypes);
        }
        public static void resetInsertTypesList()
        {
            alreadyInsertedTypes = null;
        }
    }


/* Console Output */


Process Starting


Starting to insert Contact items
[+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
+++++++++++++++++++++++]
________________________________________________________________________________

- Total time for 42149 Contact items: 19303.7806ms
- Average time per batch (100): 45.3104265402844ms



Starting to insert ContactPrefixType items
[+]
________________________________________________________________________________

- Total time for 5 ContactPrefixType items: 29.0203ms
- Average time per batch (100): 29ms



Starting to relate Contact to ContactPrefixTypeId
________________________________________________________________________________

- Total time taken to relate Contact to ContactPrefixTypeId: 5163.7479ms



Starting to insert ContactOccupationCategory items
[+]
________________________________________________________________________________

- Total time for 29 ContactOccupationCategory items: 39.0269ms
- Average time per batch (100): 39ms



Starting to relate Contact to ContactOccupationCategoryId
________________________________________________________________________________

- Total time taken to relate Contact to ContactOccupationCategoryId: 6081.6246ms




Starting to insert ContactMembershipType items
[+]
________________________________________________________________________________

- Total time for 19 ContactMembershipType items: 30.9198ms
- Average time per batch (100): 30ms



Starting to insert ContactMembershipStatus items
[+]
________________________________________________________________________________

- Total time for 6 ContactMembershipStatus items: 21.6644ms
- Average time per batch (100): 21ms


//ContactMembershipSubStatusId relates to ContactMembershipStatus table so no need to import it twice
Already inserted data for ContactMembershipStatus.



Starting to insert ContactMembership items
[+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
+++++++++++++++++++++++]
________________________________________________________________________________

- Total time for 50191 ContactMembership items: 13146.4121ms
- Average time per batch (100): 25.7111553784861ms



Processing (50191) relations for Contact --> ContactMembershipType
[+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
++++++++++++++++++++++]
________________________________________________________________________________

- Total time: 8171.7013
- Average time per batch (500): 80.8613891089109ms



Processing (50191) relations for Contact --> ContactMembershipStatus
[+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
++++++++++++++++++++++]
________________________________________________________________________________

- Total time: 7630.0961
- Average time per batch (500): 75.5192732673267ms



Processing (50191) relations for Contact --> ContactMembershipSubStatus
[+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
++++++++++++++++++++++]
________________________________________________________________________________

- Total time: 7440.6537
- Average time per batch (500): 73.6403237623762ms


//Starting ContactAddress Matrix Import
Already inserted data for Contact.



Already inserted data for ContactPrefixType.



Starting to relate Contact to ContactPrefixTypeId
________________________________________________________________________________

- Total time taken to relate Contact to ContactPrefixTypeId: 1314.7656ms



Already inserted data for ContactOccupationCategory.



Starting to relate Contact to ContactOccupationCategoryId
________________________________________________________________________________

- Total time taken to relate Contact to ContactOccupationCategoryId: 3706.0297ms




Starting to insert Address items
[+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
++++++++++++++++++++++++]
________________________________________________________________________________

- Total time for 50243 Address items: 15247.0439ms
- Average time per batch (100): 29.8588469184891ms



Starting to insert Country items
[+]
________________________________________________________________________________

- Total time for 14 Country items: 20.4331ms
- Average time per batch (100): 20ms



Starting to relate Address to CountryId
________________________________________________________________________________

- Total time taken to relate Address to CountryId: 3989.2875ms



Starting to insert ContactAddress items
[+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
++++++++++++++++++]
________________________________________________________________________________

- Total time for 49664 ContactAddress items: 11474.8396ms
- Average time per batch (100): 22.6197183098592ms



Processing (49664) relations for Contact --> Address
[+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
+++++++++++++++++++++]
________________________________________________________________________________

- Total time: 7388.1813
- Average time per batch (500): 73.8421ms


Process Complete
________________________________________________________________________________

Total time 118642.5606ms



Gustavo Tandeciarz

Feb 27, 2014, 9:35:47 PM
to ne...@googlegroups.com
Quick question:
You mentioned setting the relationship when I insert the data. 
How would I do that using cypher and params? 1 param as a list for left nodes and 1 param as a list for the right? How would I write that out in cypher so that it loops through them correctly?
Btw, thanks for all the feedback. It's helping a lot.

Gustavo Tandeciarz

Feb 28, 2014, 7:21:19 AM
to ne...@googlegroups.com
Here's a gist... Kind of models what I'm trying to do, but the random number generator makes things weird.

Michael Hunger

Feb 28, 2014, 9:42:42 AM
to ne...@googlegroups.com
You would create the rels in the query while creating the nodes.

Michael

Gustavo Tandeciarz

Feb 28, 2014, 10:17:45 AM
to ne...@googlegroups.com
Sorry, that part I understand... but what would the query look like using parameters for both source and target nodes?
Because I'm passing in a list of source nodes and a list of target nodes, do I need to iterate through each source node, then through each target node, and then do the merge?
i.e. 

FOREACH (x in [listTarget] |
  FOREACH (n in [listSource] |
  MERGE (s:SourceLabel {sourceId : x.sourceId} )-[r:RELATIONTYPE]->(t:TargetLabel)
  SET s = n, t = x,  r.sourceId = s.sourceId , r.targetId = t.targetId
))

This wouldn't be something I could batch because s.targetId might reference an Id that hasn't been inserted yet via a previous batch unless I batch in the first foreach loop list only?  This is why I'm asking :P

Do you know of any performance issues if, let's say, listSource is 100,000 items long?

Or maybe a different question:  Can you match on a param passed in as a list?

MATCH [sourceList], [targetList] ... etc  (I'm not even sure how this would end up getting written out)
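
One way to avoid the nested loops might be to pair the IDs on the client and pass a single list of maps, so the query visits each source/target pair once instead of every combination of the two lists. The Python sketch below is an illustration under stated assumptions, not an answer given in this thread: `chunk` and `to_pair_rows` are hypothetical helpers, the labels and property names are the placeholders from the pseudo-query above, and the Cypher string (a FOREACH over a `{rows}` parameter) is untested against any particular Neo4j version.

```python
def chunk(items, size):
    """Split a list into consecutive slices of at most `size` items."""
    return [items[i:i + size] for i in range(0, len(items), size)]


def to_pair_rows(join_rows):
    """Turn join-table rows into {sourceId, targetId} maps, dropping duplicates."""
    seen = set()
    rows = []
    for source_id, target_id in join_rows:
        if (source_id, target_id) not in seen:
            seen.add((source_id, target_id))
            rows.append({"sourceId": source_id, "targetId": target_id})
    return rows


# Assumed query shape: one parameter holding a list of maps.
# MERGE on both nodes means a batch does not depend on an earlier
# batch having already created the node it points at.
PAIR_QUERY = (
    "FOREACH (row IN {rows} |\n"
    "  MERGE (s:SourceLabel {sourceId : row.sourceId})\n"
    "  MERGE (t:TargetLabel {targetId : row.targetId})\n"
    "  MERGE (s)-[:RELATIONTYPE]->(t)\n"
    ")"
)

# Hypothetical join-table rows: (sourceId, targetId).
join_rows = [(1, 10), (1, 11), (2, 10), (1, 10)]
for batch in chunk(to_pair_rows(join_rows), 500):
    params = {"rows": batch}
    print(PAIR_QUERY)
    print(params)
```

The sketch only merges on the ID properties; copying the remaining node properties would need an additional SET inside the loop.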


