Running a Cypher Query to add a property to all existing relationships


Henry Hwangbo

Jul 12, 2013, 8:06:31 AM
to ne...@googlegroups.com
Hi All -

I've been attempting to run a Cypher query against an existing graph to add a new property to all relationships, but I'm not having any success:
START link = rel(*) 
WHERE NOT(HAS(link.group_list)) 
SET link.group_list = ['1']

Some graph numbers:
1.2M nodes
7.5M properties
3.7M relationships

Here's the error from the console.log:

Used these sizing suggestions from a prior post: https://groups.google.com/forum/#!topic/neo4j/Pz-zVegxwNk

I'm wondering if I should just run it directly in neo4j-shell instead of through the REST interface.  I had it running as a job, but the HTTP client timed out, although the query was still running on the Neo4j server when I went to bed.

Any suggestions would be appreciated on how to best get this property added.

Thanks so much,
Henry

Michael Hunger

Jul 12, 2013, 8:21:53 AM
to ne...@googlegroups.com
I think you're running out of memory because of the transaction size. Try paging your data:

START link = rel(*) 
WHERE NOT(HAS(link.group_list)) 
    WITH link
    LIMIT 100000
SET link.group_list = ['1']

Run this query 37 times from a script (3.7M relationships / 100,000 per batch = 37 batches).
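
A minimal sketch of such a script in Python. It assumes a hypothetical `execute` callable that runs one Cypher statement in its own transaction and returns the number of properties set (for example a thin wrapper around neo4j-shell or the REST endpoint; neither is shown here). Rather than hard-coding 37 runs, it loops until a batch updates nothing:

```python
from typing import Callable

# The batched query from this thread; only the LIMIT is templated.
BATCH_QUERY = (
    "START link = rel(*) "
    "WHERE NOT(HAS(link.group_list)) "
    "WITH link LIMIT {batch_size} "
    "SET link.group_list = ['1']"
)


def run_in_batches(
    execute: Callable[[str], int],
    batch_size: int = 100000,
    max_runs: int = 1000,
) -> int:
    """Run the batched query until no relationship is left to update.

    `execute` is a hypothetical helper (not a Neo4j API): it must run the
    statement in its own transaction and return the "Properties set" count.
    `max_runs` is a safety cap so a misbehaving helper cannot loop forever.
    """
    query = BATCH_QUERY.format(batch_size=batch_size)
    total = 0
    for _ in range(max_runs):
        updated = execute(query)
        if updated == 0:
            break  # every relationship already has group_list
        total += updated
    return total
```

Because the WHERE clause skips relationships that already have the property, each run picks up where the previous one stopped, so the script can be interrupted and restarted safely.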

What does group_list mean and why is it an array?

Michael



--
You received this message because you are subscribed to the Google Groups "Neo4j" group.
To unsubscribe from this group and stop receiving emails from it, send an email to neo4j+un...@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.
 
 

Henry Hwangbo

Jul 12, 2013, 8:40:09 AM
to ne...@googlegroups.com

Ah thanks - I'll give it a shot.

I'm making the graph multi-tenant and using group_list to let reads only pull relationships for a specific user group.  Relationships can be shared across groups.  Does that make sense?

Henry Hwangbo

Jul 12, 2013, 8:41:45 AM
to ne...@googlegroups.com
One more thing - it looks like my Neo4j instance now won't restart due to an out-of-memory error.  Not sure how that happened, but it might be related to the settings I added to neo4j.properties from the other thread:




Michael Hunger

Jul 12, 2013, 8:48:11 AM
to ne...@googlegroups.com
Make sure that your total heap is twice as large as the configured MMIO, so that you have a comfortable 4-8 GB for the database on top of the fixed memory-mapped file space.

On Windows, the memory-mapped file space is kept within the heap.

Michael

henry74

Jul 12, 2013, 8:52:13 AM
to ne...@googlegroups.com
I'll continue this on the other thread to keep these topics separate.



Henry Hwangbo

Jul 15, 2013, 10:27:57 PM
to ne...@googlegroups.com
So Michael - this helped me quickly add the group_list property to all relationships, but now I'm wondering whether that property was indexed... I'm guessing it wasn't.  If not, how can I quickly get an index added for all the group_list properties on the relationships?



Michael Hunger

Jul 16, 2013, 1:57:40 AM
to ne...@googlegroups.com
You can auto-index them, but you have to re-set the property values for them to be added to the index.

But arrays are indexed with each value individually, so you can only look up a relationship by a single array value, and you have to use an index query with AND to simulate a full-array lookup.

Michael


henry74

Jul 16, 2013, 9:28:07 AM
to ne...@googlegroups.com

Hi Michael,

When you get a chance, can you post a query example of what you explained?

If I want to use the IN syntax within a WHERE statement to check if a value exists within the array property, will an index help speed up that scenario?

Thanks,
Henry


Michael Hunger

Jul 16, 2013, 9:49:45 AM
to ne...@googlegroups.com
If you add the groups property of your relationships to the relationship auto-index, and you have a relationship with groups = [1,2,3],

then you can do

start r=relationship:relationship_auto_index(groups="1")
return r

and it returns your relationship.

If you want to match the full array, you'll have to do:

start r=relationship:relationship_auto_index("groups:1 AND groups:2 AND groups:3")
return r
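
A small sketch of how the AND query string could be assembled from a list of values. This is Python with hypothetical helper names, not part of any Neo4j driver. It assumes the values are simple tokens (numbers or plain words), so no Lucene escaping is attempted. The index name and property key are passed in and must match whatever is actually auto-indexed:

```python
from typing import Iterable


def all_values_query(key: str, values: Iterable[object]) -> str:
    """Build a Lucene-style query requiring every value, e.g. 'k:1 AND k:2'."""
    terms = [f"{key}:{value}" for value in values]
    if not terms:
        raise ValueError("at least one value is required")
    return " AND ".join(terms)


def start_by_all_values(index_name: str, key: str, values: Iterable[object]) -> str:
    """Wrap the index query in a legacy Cypher START clause."""
    lucene = all_values_query(key, values)
    return f'START r=relationship:{index_name}("{lucene}") RETURN r'
```

Note that such a query only requires each listed value to be present; a relationship whose array holds additional values would match as well.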


Michael

henry74

Jul 16, 2013, 12:48:00 PM
to ne...@googlegroups.com
Ah thanks.  In my case, I'm starting with a node, but only pulling in other nodes if they are connected by a relationship that has the group_id in its group_list array property.  I don't start with the relationship index.  Does the index still help in this situation?

Henry Hwangbo

Jul 16, 2013, 12:53:30 PM
to ne...@googlegroups.com, hen...@gmail.com
Tangential note - I added group_list to the relationship_auto_index and it increased query time by about 10x, which I assume is due to the added overhead of indexing the property.  Here are a few results from neo4j-shell (I could use some help on the last one, which ended in an error):

neo4j-sh (0)$ START link = rel(*) WHERE NOT(HAS(link.group_list)) WITH link LIMIT 10000 SET link.group_list = [1];
+-------------------+
| No data returned. |
+-------------------+
Properties set: 10000

44029 ms
neo4j-sh (0)$ START link = rel(*) WHERE NOT(HAS(link.group_list)) WITH link LIMIT 10000 SET link.group_list = [1];
+-------------------+
| No data returned. |
+-------------------+
Properties set: 10000

20400 ms
neo4j-sh (0)$ START link = rel(*) WHERE NOT(HAS(link.group_list)) WITH link LIMIT 10000 SET link.group_list = [1];
+-------------------+
| No data returned. |
+-------------------+
Properties set: 10000

15483 ms
neo4j-sh (0)$ START link = rel(*) WHERE NOT(HAS(link.group_list)) WITH link LIMIT 10000 SET link.group_list = [1];
+-------------------+
| No data returned. |
+-------------------+
Properties set: 10000

neo4j-sh (0)$ START link = rel(*) WHERE NOT(HAS(link.group_list)) WITH link LIMIT 50000 SET link.group_list = [1];
SystemException: TM has encountered some problem, please perform neccesary action (tx recovery/restart)
neo4j-sh (0)$    

Henry Hwangbo

Jul 16, 2013, 1:06:59 PM
to ne...@googlegroups.com, hen...@gmail.com
The error before the server stopped responding from console.log:

Peter Neubauer

Jul 16, 2013, 1:28:08 PM
to Neo4j User, hen...@gmail.com
Henry,
are you still running this through the REST API from the web UI, or via the bin/neo4j-shell utility?

/peter


Cheers,

/peter neubauer

G:  neubauer.peter
S:  peter.neubauer
P:  +46 704 106975
L:   http://www.linkedin.com/in/neubauer
T:   @peterneubauer

Kids in Malmö this summer?        - http://www.kidscraft.se
Neo4j questions? Use GraphGist. - http://gist.neo4j.org

Michael Hunger

Jul 16, 2013, 1:34:04 PM
to ne...@googlegroups.com
No it doesn't help there, you'll have to use IN or a direct comparison. Might be slow though.

What about different relationship types per tenant?
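
A hedged sketch of the IN-based filter discussed here, written as a Python helper that returns a legacy Cypher query plus its parameters. The helper name and the parameter names `node_id` and `group_id` are illustrative. It assumes IN accepts the array property on its right-hand side, as suggested above, and that `group_id` has the same type as the stored values (the thread sets both `['1']` and `[1]`, which would not match each other):

```python
from typing import Any, Dict, Tuple

# Legacy (1.9-era) Cypher; {node_id} and {group_id} are query parameters.
# No index is involved: every relationship of the start node is checked.
NEIGHBOURS_IN_GROUP = (
    "START n=node({node_id}) "
    "MATCH n-[r]-m "
    "WHERE {group_id} IN r.group_list "
    "RETURN m"
)


def neighbours_in_group(node_id: int, group_id: Any) -> Tuple[str, Dict[str, Any]]:
    """Return the query text and the parameter map to send with it."""
    return NEIGHBOURS_IN_GROUP, {"node_id": node_id, "group_id": group_id}
```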


Henry Hwangbo

Jul 16, 2013, 1:44:23 PM
to ne...@googlegroups.com, hen...@gmail.com
bin/neo4j-shell after sshing into the server running neo4j

Henry Hwangbo

Jul 16, 2013, 1:44:58 PM
to ne...@googlegroups.com, hen...@gmail.com
Here's the console.log when attempting to restart the server after this error.  Going to have to go back to a backup of the data folder.
