Memory Configuration


Chris G

Aug 26, 2014, 5:33:22 PM
to ne...@googlegroups.com
Group, I'm trying to wrap my head around the memory configuration for Neo4j.

I've got ~4 million parts that I have loaded and indexed via cypher and have these indexes:

Indexes
  ON :GraphPart(mfr_id)  ONLINE                             
  ON :GraphPart(part_id) ONLINE (for uniqueness constraint) 

Constraints
  ON (graphpart:GraphPart) ASSERT graphpart.part_id IS UNIQUE


Now I want to import my vendors via this cypher:

USING PERIODIC COMMIT 1
LOAD CSV WITH HEADERS FROM "file://localhost/home/deployer/tblMfr.csv" AS csvLine
    FIELDTERMINATOR '\t'
CREATE (vendor:GraphVendor { vendor_code_id: toInt(csvLine.Mfr_Code_ID), vendor_id: toInt(csvLine.Mfr_ID), vendor_name: csvLine.Mfr_Name, vendor_abbreviation: csvLine.Mfr_Abbr, vendor_status: csvLine.Mfr_Status })
WITH vendor
MATCH p = (GraphPart {mfr_id: vendor.vendor_id})
FOREACH (n IN nodes(p) | MERGE (n)-[r:MANUFACTURED_BY]->(vendor))


I have configured the conf files:

neo4j.properties:
neostore.nodestore.db.mapped_memory=50M
neostore.relationshipstore.db.mapped_memory=500M
neostore.propertystore.db.mapped_memory=100M
neostore.propertystore.db.strings.mapped_memory=130M
neostore.propertystore.db.arrays.mapped_memory=0M

neo4j-wrapper.conf:

wrapper.java.initmemory=4096
wrapper.java.maxmemory=12288


Even with a 12G heap and PERIODIC COMMIT 1, messages.log looks like this:
2014-08-26 21:14:08.936+0000 WARN  [o.n.k.EmbeddedGraphDatabase]: GC Monitor: Application threads blocked for an additional 719ms [total block time: 16.227s]
2014-08-26 21:14:10.874+0000 WARN  [o.n.k.EmbeddedGraphDatabase]: GC Monitor: Application threads blocked for an additional 1630ms [total block time: 17.857s]
2014-08-26 21:14:12.377+0000 WARN  [o.n.k.EmbeddedGraphDatabase]: GC Monitor: Application threads blocked for an additional 673ms [total block time: 18.53s]
2014-08-26 21:14:13.715+0000 WARN  [o.n.k.EmbeddedGraphDatabase]: GC Monitor: Application threads blocked for an additional 719ms [total block time: 19.249s]
2014-08-26 21:14:15.424+0000 WARN  [o.n.k.EmbeddedGraphDatabase]: GC Monitor: Application threads blocked for an additional 1400ms [total block time: 20.649s]
2014-08-26 21:14:16.924+0000 WARN  [o.n.k.EmbeddedGraphDatabase]: GC Monitor: Application threads blocked for an additional 754ms [total block time: 21.403s]
2014-08-26 21:14:18.146+0000 WARN  [o.n.k.EmbeddedGraphDatabase]: GC Monitor: Application threads blocked for an additional 908ms [total block time: 22.311s]
2014-08-26 21:14:19.881+0000 WARN  [o.n.k.EmbeddedGraphDatabase]: GC Monitor: Application threads blocked for an additional 1207ms [total block time: 23.518s]
2014-08-26 21:14:21.551+0000 WARN  [o.n.k.EmbeddedGraphDatabase]: GC Monitor: Application threads blocked for an additional 1033ms [total block time: 24.551s]
2014-08-26 21:14:22.801+0000 WARN  [o.n.k.EmbeddedGraphDatabase]: GC Monitor: Application threads blocked for an additional 827ms [total block time: 25.378s]
2014-08-26 21:14:49.154+0000 WARN  [o.n.k.EmbeddedGraphDatabase]: GC Monitor: Application threads blocked for an additional 26040ms [total block time: 51.418s]
2014-08-26 21:14:49.524+0000 WARN  [o.n.k.EmbeddedGraphDatabase]: GC Monitor: Application threads blocked for an additional 270ms [total block time: 51.688s]
2014-08-26 21:15:24.662+0000 WARN  [o.n.k.EmbeddedGraphDatabase]: GC Monitor: Application threads blocked for an additional 32772ms [total block time: 84.46s]
2014-08-26 21:15:51.122+0000 WARN  [o.n.k.EmbeddedGraphDatabase]: GC Monitor: Application threads blocked for an additional 26039ms [total block time: 110.499s]
2014-08-26 21:16:24.233+0000 WARN  [o.n.k.EmbeddedGraphDatabase]: GC Monitor: Application threads blocked for an additional 32902ms [total block time: 143.401s]
2014-08-26 21:16:50.232+0000 WARN  [o.n.k.EmbeddedGraphDatabase]: GC Monitor: Application threads blocked for an additional 25898ms [total block time: 169.299s]
2014-08-26 21:17:20.085+0000 WARN  [o.n.k.EmbeddedGraphDatabase]: GC Monitor: Application threads blocked for an additional 29753ms [total block time: 199.052s]
2014-08-26 21:17:46.225+0000 WARN  [o.n.k.EmbeddedGraphDatabase]: GC Monitor: Application threads blocked for an additional 26040ms [total block time: 225.092s]
2014-08-26 21:21:04.960+0000 WARN  [o.n.k.EmbeddedGraphDatabase]: GC Monitor: Application threads blocked for an additional 29433ms [total block time: 254.525s]
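
For anyone reading a log like this, a small script can make the pattern easier to see. This is only a sketch in Python: the regular expression is written against the "GC Monitor" lines exactly as pasted above, and the function name and the 10-second threshold for a "long" pause are my own choices, not anything Neo4j defines.

```python
import re

# Matches the pause length in the "GC Monitor" warnings pasted above.
GC_LINE = re.compile(r"blocked for an additional (\d+)ms")

def summarize_gc(lines, long_pause_ms=10_000):
    """Return (pause count, summed pause ms, longest pause ms, long-pause count)."""
    pauses = []
    for line in lines:
        match = GC_LINE.search(line)
        if match:
            pauses.append(int(match.group(1)))
    long_pauses = [p for p in pauses if p >= long_pause_ms]
    return len(pauses), sum(pauses), max(pauses, default=0), len(long_pauses)

# Three lines copied from the log above.
sample = [
    "2014-08-26 21:14:22.801+0000 WARN  [o.n.k.EmbeddedGraphDatabase]: GC Monitor: Application threads blocked for an additional 827ms [total block time: 25.378s]",
    "2014-08-26 21:14:49.154+0000 WARN  [o.n.k.EmbeddedGraphDatabase]: GC Monitor: Application threads blocked for an additional 26040ms [total block time: 51.418s]",
    "2014-08-26 21:14:49.524+0000 WARN  [o.n.k.EmbeddedGraphDatabase]: GC Monitor: Application threads blocked for an additional 270ms [total block time: 51.688s]",
]

count, total_ms, longest_ms, long_count = summarize_gc(sample)
print(count, total_ms, longest_ms, long_count)
```

Pauses that jump from under two seconds to 25-30 seconds each, as they do partway through this log, usually mean the collector is running full collections back to back because the heap is nearly full.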


Could anyone suggest what I can try next, or some alternative memory settings?

I'm trying to get proof of concept up and running so I can present this to my bosses.

I hope I am missing something simple. If not, I think it's time for Neo4j to invest in some canonical documentation on how to configure Neo4j memory usage. There are sparse mentions in the user guide, but most of what I find related to performance comes from blog posts, Stack Overflow questions, and mailing list posts (most of which Michael Hunger is answering). I also hope that once I get past these initial memory settings the rest of Neo4j will just work.

Thanks for reading,

Chris





Chris G

Aug 26, 2014, 5:35:03 PM
to ne...@googlegroups.com
I want to add that I am using 2.1.3 Enterprise, running on Ubuntu 12.04 in a virtual machine with 8 CPUs and 16GB of memory.

Chris Roberts

Aug 26, 2014, 9:31:00 PM
to ne...@googlegroups.com

I'm going to use the Talend connector for my initial load, then the REST API to keep my graph in sync.

--
You received this message because you are subscribed to a topic in the Google Groups "Neo4j" group.
To unsubscribe from this topic, visit https://groups.google.com/d/topic/neo4j/rOr8tL1r-R8/unsubscribe.
To unsubscribe from this group and all its topics, send an email to neo4j+un...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

david fauth

Aug 27, 2014, 3:31:54 PM
to ne...@googlegroups.com
Chris,

You want to set these to the same value. 
wrapper.java.initmemory=8192
wrapper.java.maxmemory=8192

I would also change the neo4j.properties settings to something like:

neo4j.properties:
neostore.nodestore.db.mapped_memory=500M
neostore.relationshipstore.db.mapped_memory=1G
neostore.propertystore.db.mapped_memory=2G
neostore.propertystore.db.strings.mapped_memory=250M
neostore.propertystore.db.arrays.mapped_memory=0M

That should help.
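
As a quick sanity check on settings like these, it can help to add up the heap and the mapped memory and compare the total with the RAM in the machine. The sketch below does that arithmetic in Python. It assumes, as is usual on Linux, that the memory-mapped store files live outside the Java heap, so the two numbers add together; the helper name to_mb is made up for the example, and the 16GB figure is the VM size mentioned earlier in the thread.

```python
def to_mb(value):
    """Convert a size such as '500M' or '1G' to megabytes."""
    units = {"K": 1 / 1024, "M": 1, "G": 1024}
    return int(float(value[:-1]) * units[value[-1].upper()])

# Values copied from the suggestion above.
mapped_memory = {
    "neostore.nodestore.db.mapped_memory": "500M",
    "neostore.relationshipstore.db.mapped_memory": "1G",
    "neostore.propertystore.db.mapped_memory": "2G",
    "neostore.propertystore.db.strings.mapped_memory": "250M",
    "neostore.propertystore.db.arrays.mapped_memory": "0M",
}
heap_mb = 8192           # wrapper.java.maxmemory
machine_mb = 16 * 1024   # the 16GB VM described earlier in the thread

mapped_mb = sum(to_mb(v) for v in mapped_memory.values())
left_for_os_mb = machine_mb - heap_mb - mapped_mb
print(mapped_mb, heap_mb + mapped_mb, left_for_os_mb)
```

With these numbers the heap plus mapped memory comes to roughly 11.7GB, which leaves a little over 4GB for the operating system and everything else on the VM.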

Chris Roberts

Aug 27, 2014, 4:33:41 PM
to ne...@googlegroups.com
It seems to help for a short time, BUT still no new nodes appear in my graph and no new relationships are created. It gets stuck in endless GC after about a minute of seeing all CPUs working, and the Neo4j browser also loses its connection with the server.

2014-08-27 20:26:45.528+0000 WARN  [o.n.k.EmbeddedGraphDatabase]: GC Monitor: Application threads blocked for an additional 721ms [total block time: 8.226s]
2014-08-27 20:26:46.599+0000 WARN  [o.n.k.EmbeddedGraphDatabase]: GC Monitor: Application threads blocked for an additional 648ms [total block time: 8.874s]
2014-08-27 20:26:47.888+0000 WARN  [o.n.k.EmbeddedGraphDatabase]: GC Monitor: Application threads blocked for an additional 541ms [total block time: 9.415s]
2014-08-27 20:26:49.003+0000 WARN  [o.n.k.EmbeddedGraphDatabase]: GC Monitor: Application threads blocked for an additional 691ms [total block time: 10.106s]
2014-08-27 20:26:50.121+0000 WARN  [o.n.k.EmbeddedGraphDatabase]: GC Monitor: Application threads blocked for an additional 694ms [total block time: 10.8s]
2014-08-27 20:26:51.832+0000 WARN  [o.n.k.EmbeddedGraphDatabase]: GC Monitor: Application threads blocked for an additional 1181ms [total block time: 11.981s]
2014-08-27 20:26:53.409+0000 WARN  [o.n.k.EmbeddedGraphDatabase]: GC Monitor: Application threads blocked for an additional 614ms [total block time: 12.595s]
2014-08-27 20:26:54.456+0000 WARN  [o.n.k.EmbeddedGraphDatabase]: GC Monitor: Application threads blocked for an additional 625ms [total block time: 13.22s]
2014-08-27 20:26:55.609+0000 WARN  [o.n.k.EmbeddedGraphDatabase]: GC Monitor: Application threads blocked for an additional 730ms [total block time: 13.95s]
2014-08-27 20:26:56.700+0000 WARN  [o.n.k.EmbeddedGraphDatabase]: GC Monitor: Application threads blocked for an additional 669ms [total block time: 14.619s]
2014-08-27 20:26:57.933+0000 WARN  [o.n.k.EmbeddedGraphDatabase]: GC Monitor: Application threads blocked for an additional 594ms [total block time: 15.213s]
2014-08-27 20:26:59.013+0000 WARN  [o.n.k.EmbeddedGraphDatabase]: GC Monitor: Application threads blocked for an additional 594ms [total block time: 15.807s]
2014-08-27 20:27:00.197+0000 WARN  [o.n.k.EmbeddedGraphDatabase]: GC Monitor: Application threads blocked for an additional 762ms [total block time: 16.569s]
2014-08-27 20:27:01.300+0000 WARN  [o.n.k.EmbeddedGraphDatabase]: GC Monitor: Application threads blocked for an additional 681ms [total block time: 17.25s]
2014-08-27 20:27:02.508+0000 WARN  [o.n.k.EmbeddedGraphDatabase]: GC Monitor: Application threads blocked for an additional 569ms [total block time: 17.819s]
2014-08-27 20:27:03.555+0000 WARN  [o.n.k.EmbeddedGraphDatabase]: GC Monitor: Application threads blocked for an additional 624ms [total block time: 18.443s]
2014-08-27 20:27:04.721+0000 WARN  [o.n.k.EmbeddedGraphDatabase]: GC Monitor: Application threads blocked for an additional 742ms [total block time: 19.185s]
2014-08-27 20:27:35.901+0000 WARN  [o.n.k.EmbeddedGraphDatabase]: GC Monitor: Application threads blocked for an additional 30755ms [total block time: 49.94s]
2014-08-27 20:28:01.632+0000 WARN  [o.n.k.EmbeddedGraphDatabase]: GC Monitor: Application threads blocked for an additional 25202ms [total block time: 75.142s]
2014-08-27 20:28:33.544+0000 WARN  [o.n.k.EmbeddedGraphDatabase]: GC Monitor: Application threads blocked for an additional 31396ms [total block time: 106.538s]
2014-08-27 20:28:34.366+0000 WARN  [o.n.k.EmbeddedGraphDatabase]: GC Monitor: Application threads blocked for an additional 722ms [total block time: 107.26s]
2014-08-27 20:29:07.678+0000 WARN  [o.n.k.EmbeddedGraphDatabase]: GC Monitor: Application threads blocked for an additional 32900ms [total block time: 140.16s]
2014-08-27 20:29:33.301+0000 WARN  [o.n.k.EmbeddedGraphDatabase]: GC Monitor: Application threads blocked for an additional 25522ms [total block time: 165.682s]

Chris





--
CR

Michael Hunger

Aug 27, 2014, 5:15:02 PM
to ne...@googlegroups.com
Chris,

Your Cypher query seems to be wrong:

1. Split it up into node creation and relationship creation.
2. Use bigger transaction sizes.
3. You forgot the colon before GraphPart, so the pattern doesn't use an index for that one.
4. You don't have to use the path and FOREACH; a simple MATCH is good enough.

USING PERIODIC COMMIT 10000

LOAD CSV WITH HEADERS FROM "file://localhost/home/deployer/tblMfr.csv" AS csvLine
    FIELDTERMINATOR '\t'
CREATE (vendor:GraphVendor { vendor_code_id: toInt(csvLine.Mfr_Code_ID), vendor_id: toInt(csvLine.Mfr_ID), vendor_name: csvLine.Mfr_Name, vendor_abbreviation: csvLine.Mfr_Abbr, vendor_status: csvLine.Mfr_Status });

create index on :GraphVendor(vendor_id);

USING PERIODIC COMMIT 10000

LOAD CSV WITH HEADERS FROM "file://localhost/home/deployer/tblMfr.csv" AS csvLine
    FIELDTERMINATOR '\t'
WITH toInt(csvLine.Mfr_ID) as vendor_id
MATCH (vendor:GraphVendor { vendor_id: vendor_id})
MATCH (part:GraphPart {mfr_id: vendor_id})
MERGE (part)-[:MANUFACTURED_BY]->(vendor);
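
To illustrate point 3, here is a toy model in Python of why the label matters. It is not how Neo4j stores or indexes anything; the node dictionaries, the index dictionary and both function names are invented for the example. The only idea it borrows from the thread is that an index such as ON :GraphPart(mfr_id) is scoped to a label, so a pattern without the label cannot use it.

```python
from collections import defaultdict

# Toy graph: 10,000 parts spread over 100 manufacturers, plus 100 vendors.
nodes = [{"label": "GraphPart", "props": {"mfr_id": i % 100}} for i in range(10_000)]
nodes += [{"label": "GraphVendor", "props": {"vendor_id": i}} for i in range(100)]

# Stand-in for the index ON :GraphPart(mfr_id): it only covers GraphPart nodes.
index = defaultdict(list)
for node_id, node in enumerate(nodes):
    if node["label"] == "GraphPart":
        index[node["props"]["mfr_id"]].append(node_id)

def match_without_label(mfr_id):
    """Pattern like (GraphPart {mfr_id: ...}): every node has to be examined."""
    hits, examined = [], 0
    for node_id, node in enumerate(nodes):
        examined += 1
        if node["props"].get("mfr_id") == mfr_id:
            hits.append(node_id)
    return hits, examined

def match_with_label(mfr_id):
    """Pattern like (:GraphPart {mfr_id: ...}): the index returns the hits directly."""
    hits = index.get(mfr_id, [])
    return hits, len(hits)

scan_hits, scan_examined = match_without_label(7)
index_hits, index_examined = match_with_label(7)
print(scan_examined, index_examined, scan_hits == index_hits)
```

Both lookups find the same parts, but the label-less one touches every node in the toy graph for each vendor row, which adds up quickly when it is repeated for every line of the CSV.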



Chris Roberts

Aug 28, 2014, 10:31:05 AM
to ne...@googlegroups.com
My revised script:


USING PERIODIC COMMIT 10000
LOAD CSV WITH HEADERS FROM "file://localhost/home/deployer/tblMfr.csv" AS csvLine
    FIELDTERMINATOR '\t'
CREATE (vendor:GraphVendor { vendor_code_id: toInt(csvLine.Mfr_Code_ID), vendor_id: toInt(csvLine.Mfr_ID), vendor_name: csvLine.Mfr_Name, vendor_abbreviation: csvLine.Mfr_Abbr, vendor_status: csvLine.Mfr_Status })
WITH toInt(csvLine.Mfr_ID) as vendor_id
MATCH (vendor:GraphVendor { vendor_id: vendor_id})
MATCH (part:GraphPart {mfr_id: vendor_id})
MERGE (part)-[:MANUFACTURED_BY]->(vendor);

:schema
Indexes
  ON :GraphPart(mfr_id)      ONLINE                             
  ON :GraphPart(part_id)     ONLINE (for uniqueness constraint) 
  ON :GraphVendor(vendor_id) ONLINE (for uniqueness constraint) 

Constraints
  ON (graphpart:GraphPart) ASSERT graphpart.part_id IS UNIQUE
  ON (graphvendor:GraphVendor) ASSERT graphvendor.vendor_id IS UNIQUE


The import still spends significant time in GC:
2014-08-28 14:18:30.559+0000 WARN [o.n.k.EmbeddedGraphDatabase]: GC Monitor: Application threads blocked for an additional 27270ms [total block time: 111.254s]
2014-08-28 14:18:58.105+0000 WARN [o.n.k.EmbeddedGraphDatabase]: GC Monitor: Application threads blocked for an additional 25724ms [total block time: 136.978s]
2014-08-28 14:19:19.571+0000 WARN [o.n.k.EmbeddedGraphDatabase]: GC Monitor: Application threads blocked for an additional 19982ms [total block time: 156.96s]
2014-08-28 14:19:47.826+0000 WARN [o.n.k.EmbeddedGraphDatabase]: GC Monitor: Application threads blocked for an additional 26533ms [total block time: 183.493s]
2014-08-28 14:19:48.088+0000 WARN [o.n.k.EmbeddedGraphDatabase]: GC Monitor: Application threads blocked for an additional 162ms [total block time: 183.655s]
2014-08-28 14:20:16.149+0000 WARN [o.n.k.EmbeddedGraphDatabase]: GC Monitor: Application threads blocked for an additional 26880ms [total block time: 210.535s]
2014-08-28 14:20:37.432+0000 WARN [o.n.k.EmbeddedGraphDatabase]: GC Monitor: Application threads blocked for an additional 20102ms [total block time: 230.637s]
2014-08-28 14:21:06.477+0000 WARN [o.n.k.EmbeddedGraphDatabase]: GC Monitor: Application threads blocked for an additional 27755ms [total block time: 258.392s]
2014-08-28 14:21:06.907+0000 WARN [o.n.k.EmbeddedGraphDatabase]: GC Monitor: Application threads blocked for an additional 330ms [total block time: 258.722s]
2014-08-28 14:21:35.483+0000 WARN [o.n.k.EmbeddedGraphDatabase]: GC Monitor: Application threads blocked for an additional 27721ms [total block time: 286.443s]
2014-08-28 14:21:57.764+0000 WARN [o.n.k.EmbeddedGraphDatabase]: GC Monitor: Application threads blocked for an additional 21430ms [total block time: 307.873s]
2014-08-28 14:22:27.172+0000 WARN [o.n.k.EmbeddedGraphDatabase]: GC Monitor: Application threads blocked for an additional 28122ms [total block time: 335.995s]
2014-08-28 14:22:27.613+0000 WARN [o.n.k.EmbeddedGraphDatabase]: GC Monitor: Application threads blocked for an additional 340ms [total block time: 336.335s]
2014-08-28 14:22:56.549+0000 WARN [o.n.k.EmbeddedGraphDatabase]: GC Monitor: Application threads blocked for an additional 28191ms [total block time: 364.526s]
2014-08-28 14:23:18.865+0000 WARN [o.n.k.EmbeddedGraphDatabase]: GC Monitor: Application threads blocked for an additional 21591ms [total block time: 386.117s]
2014-08-28 14:23:44.941+0000 WARN [o.n.k.EmbeddedGraphDatabase]: GC Monitor: Application threads blocked for an additional 25330ms [total block time: 411.447s]
2014-08-28 14:23:45.415+0000 WARN [o.n.k.EmbeddedGraphDatabase]: GC Monitor: Application threads blocked for an additional 374ms [total block time: 411.821s]
2014-08-28 14:24:13.505+0000 WARN [o.n.k.EmbeddedGraphDatabase]: GC Monitor: Application threads blocked for an additional 27449ms [total block time: 439.27s]
2014-08-28 14:24:33.630+0000 WARN [o.n.k.EmbeddedGraphDatabase]: GC Monitor: Application threads blocked for an additional 19595ms [total block time: 458.865s]
2014-08-28 14:25:00.748+0000 WARN [o.n.k.EmbeddedGraphDatabase]: GC Monitor: Application threads blocked for an additional 26693ms [total block time: 485.558s]
2014-08-28 14:25:01.247+0000 WARN [o.n.k.EmbeddedGraphDatabase]: GC Monitor: Application threads blocked for an additional 398ms [total block time: 485.956s]

I've got 8GB of memory set for the JVM; should I increase this to 12? Also, would it help if I turned on GC logging and posted those logs?





--
CR

Chris Roberts

Aug 28, 2014, 2:18:43 PM
to ne...@googlegroups.com
It finished. 32GB memory in the VM and 28GB of JVM heap.

In total the import reported: Created 4595170 relationships, returned 0 rows in 390028 ms

Chris


On Wed, Aug 27, 2014 at 5:14 PM, Michael Hunger <michael...@neotechnology.com> wrote:




--
CR

Michael Hunger

Aug 28, 2014, 4:27:20 PM
to ne...@googlegroups.com
You didn't like my suggestion to split it up?

I should probably have explained that Cypher will pull all input data through the first stage (CREATE), allocating a lot of memory, and only then continue to the next part (MATCH).

This is because of the "match my own creates" issue, which otherwise could lead to an infinite loop (matching on data that you just created, creating more data to match on, etc.).

So my split-up suggestion would have avoided that.
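
The difference described above can be pictured with a toy pipeline in Python. This is only an analogy, not Neo4j's execution engine: read_rows stands in for LOAD CSV, and combined and split are hypothetical names for the two ways of structuring the import. The point is the peak number of rows held in memory at once.

```python
def read_rows(n):
    """Stand-in for LOAD CSV: yields one row at a time."""
    for i in range(n):
        yield {"Mfr_ID": i}

def combined(rows):
    """CREATE then MATCH in one statement: the first stage is drained completely."""
    created = [row["Mfr_ID"] for row in rows]  # every row is held at once
    peak = len(created)
    matched = sum(1 for _ in created)          # only now does the second stage run
    return matched, peak

def split(rows):
    """MATCH-only statement of a split import: rows stream through one by one."""
    matched, peak = 0, 0
    for _ in rows:
        peak = max(peak, 1)                    # only the current row is held
        matched += 1
    return matched, peak

print(combined(read_rows(5000)))
print(split(read_rows(5000)))
```

In the toy model both versions process the same rows, but the combined one has to buffer all of them before the second stage starts, which is the kind of memory growth that would explain the long GC pauses.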

Feel free to try it out and report if it behaves better.

Cheers,

Michael

Chris Roberts

Aug 29, 2014, 8:02:02 AM
to ne...@googlegroups.com
Splitting it up worked well. I still had to give my VM 32 GB of memory and a 28 GB heap, but the files I was importing were more than 50MB each, the largest being 165MB with about 5 million rows, which took maybe 4 minutes to import. I just didn't expect to need so much memory. There is a table in the docs that led me to believe 8 GB should be fine, but that table must be geared towards Cypher queries that work against the existing graph. It looks like LOAD CSV can require more memory than the amount needed to query against a given number of primitives that are already in the graph.

Today/this weekend I'm planning some even larger imports with CSVs of ~300MB, possibly larger. Exciting! I guess the strategy when I can't add more memory to the VM is to split the files into smaller CSVs?
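
If splitting the files does turn out to be necessary, something like the following Python sketch would do it. The function names and the ".partN" naming scheme are made up for the example. It repeats the header line in every chunk so that LOAD CSV WITH HEADERS still works on each piece, and it splits on physical lines, so it assumes no quoted field contains a line break.

```python
import itertools

def split_lines(lines, rows_per_chunk):
    """Yield lists of lines: the header followed by at most rows_per_chunk data rows."""
    lines = iter(lines)
    header = next(lines, None)
    if header is None:
        return
    while True:
        chunk = list(itertools.islice(lines, rows_per_chunk))
        if not chunk:
            return
        yield [header] + chunk

def split_file(path, rows_per_chunk):
    """Write path.part0, path.part1, ... and return the names of the files created."""
    names = []
    with open(path, newline="") as source:
        for i, chunk in enumerate(split_lines(source, rows_per_chunk)):
            name = f"{path}.part{i}"
            with open(name, "w", newline="") as target:
                target.writelines(chunk)
            names.append(name)
    return names

# Small in-memory demonstration with tab-separated rows like the ones in this thread.
demo = ["Mfr_ID\tMfr_Name\n", "1\tA\n", "2\tB\n", "3\tC\n"]
chunks = list(split_lines(demo, 2))
print(len(chunks))
```

Each resulting file can then be loaded with its own LOAD CSV statement, at the cost of running the import once per chunk.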

Chris

Michael Hunger

Aug 29, 2014, 9:24:59 AM
to ne...@googlegroups.com
Something is wrong then; it should work with little memory.

Can you share the full, split-up import script that you used?

Michael

Sent from mobile device