Neo4j logging GB of errors when Neo4j cannot get enough memory

rfc...@gmail.com

Nov 13, 2017, 3:08:57 PM
to Neo4j
Hi,

One of our Linux test servers has 192 GB RAM and the disk partitions listed below (captured some time earlier, when / was not yet full). We ran Neo4j and Elasticsearch on this server. We set both the heap initial_size and the heap max size to 40GB.

 

Filesystem      Size  Used Avail Use% Mounted on
/dev/sda1       546G  5.1G  513G   1% /
tmpfs            95G   16K   95G   1% /dev/shm
/dev/sdb1       6.5T  2.1T  4.1T  34% /data

 

We set the Neo4j home of our Neo4j Community Edition 3.0.3 installation to the root partition, under /opt/…

 

The problem we had was that neo4j.log grew until it filled the 546GB root partition. After we truncated neo4j.log, Neo4j could not start until we manually removed transaction log files.

 

I have the following questions. Can anyone help?

1)       Is there a way to set a maximum file size or file rotation for neo4j.log (as there is for debug.log) from neo4j.conf in the Enterprise version?

2)       In more than one incident Neo4j could not restart successfully from the transaction log files, and it would not restart until I manually removed transaction log files such as neostore.transaction.db.348. Can the Enterprise version restart without human intervention if it cannot recover from the transaction log files?

3)       Is there a way to slow down the reporting of 'Cannot allocate memory' (errno=12) errors? In our case, the errors were dumped to neo4j.log almost as if in an infinite loop.


The error messages are as follows (neo4j.log has many millions of these lines):

# There is insufficient memory for the Java Runtime Environment to continue.

# Native memory allocation (mmap) failed to map 41943040000 bytes for committing reserved memory.

# An error report file with more information is saved as:

# /tmp/hs_err_pid74296.log

OpenJDK 64-Bit Server VM warning: INFO: os::commit_memory(0x00007f6b2c000000, 41943040000, 0) failed; error='Cannot allocate memory' (errno=12)

#

# There is insufficient memory for the Java Runtime Environment to continue.

# Native memory allocation (mmap) failed to map 41943040000 bytes for committing reserved memory.

# An error report file with more information is saved as:

# /tmp/hs_err_pid75375.log

OpenJDK 64-Bit Server VM warning: INFO: os::commit_memory(0x00007f3fb4000000, 41943040000, 0) failed; error='Cannot allocate memory' (errno=12)

#

# There is insufficient memory for the Java Runtime Environment to continue.

# Native memory allocation (mmap) failed to map 41943040000 bytes for committing reserved memory.

# An error report file with more information is saved as:

# /tmp/hs_err_pid76380.log

OpenJDK 64-Bit Server VM warning: INFO: os::commit_memory(0x00007f8d14000000, 41943040000, 0) failed; error='Cannot allocate memory' (errno=12)
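
A log of this shape can be summarised before it is truncated. The following is a minimal Python sketch, not a Neo4j tool: the function name `summarize` and the sample list are hypothetical, and the two regular expressions are written only against the lines quoted above. It counts the distinct `hs_err_pid` numbers and the failed `os::commit_memory` requests. Each block in the excerpt carries a different pid, which would suggest that every block is a separate JVM launch failing at startup rather than one process logging in a loop.

```python
import re
from collections import Counter

# Patterns written against the log lines quoted above (assumed stable format).
HS_ERR = re.compile(r"hs_err_pid(\d+)\.log")
COMMIT_FAIL = re.compile(
    r"os::commit_memory\(0x[0-9a-f]+, (\d+), \d+\) failed; "
    r"error='([^']+)' \(errno=(\d+)\)"
)

def summarize(lines):
    """Return the set of failed-JVM pids and a count per (bytes, errno)."""
    pids = set()
    failures = Counter()
    for line in lines:
        match = HS_ERR.search(line)
        if match:
            pids.add(int(match.group(1)))
        match = COMMIT_FAIL.search(line)
        if match:
            size, _message, errno = match.groups()
            failures[(int(size), int(errno))] += 1
    return pids, failures

# Two of the blocks from the excerpt above, reduced to the relevant lines.
sample = [
    "# /tmp/hs_err_pid74296.log",
    "OpenJDK 64-Bit Server VM warning: INFO: os::commit_memory(0x00007f6b2c000000, 41943040000, 0) failed; error='Cannot allocate memory' (errno=12)",
    "# /tmp/hs_err_pid75375.log",
    "OpenJDK 64-Bit Server VM warning: INFO: os::commit_memory(0x00007f3fb4000000, 41943040000, 0) failed; error='Cannot allocate memory' (errno=12)",
]

pids, failures = summarize(sample)
print(sorted(pids))                  # [74296, 75375]
print(failures[(41943040000, 12)])   # 2
print(41943040000 // (1024 * 1024))  # 40000 (MiB), i.e. the ~40GB heap request
```

For a real file, any iterable of lines works, so an open file handle can be passed to `summarize` directly. The requested size in every failure, 41943040000 bytes, is exactly 40000 MiB, which matches the configured heap size.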

 

The repeated 'Cannot allocate memory' errors above were followed by the repeated startup failures below:

 

2017-10-20 13:31:49.509+0000 INFO  Starting...

2017-10-20 13:31:52.125+0000 INFO  Bolt enabled on localhost:7687.

2017-10-21 10:23:41.773+0000 ERROR Failed to start Neo4j: Starting Neo4j failed: Component 'org.neo4j.server.database.LifecycleManagingDatabase@5ab0b0fa' was successfully initialized, but failed to start. Please see attached cause exception. Starting Neo4j failed: Component 'org.neo4j.server.database.LifecycleManagingDatabase@5ab0b0fa' was successfully initialized, but failed to start. Please see attached cause exception.

org.neo4j.server.ServerStartupException: Starting Neo4j failed: Component 'org.neo4j.server.database.LifecycleManagingDatabase@5ab0b0fa' was successfully initialized, but failed to start. Please see attached cause exception.

                at org.neo4j.server.exception.ServerStartupErrors.translateToServerStartupError(ServerStartupErrors.java:68)

                at org.neo4j.server.AbstractNeoServer.start(AbstractNeoServer.java:217)

                at org.neo4j.server.ServerBootstrapper.start(ServerBootstrapper.java:87)

                at org.neo4j.server.ServerBootstrapper.start(ServerBootstrapper.java:66)

                at org.neo4j.server.CommunityEntryPoint.main(CommunityEntryPoint.java:28)

Caused by: org.neo4j.kernel.lifecycle.LifecycleException: Component 'org.neo4j.server.database.LifecycleManagingDatabase@5ab0b0fa' was successfully initialized, but failed to start. Please see attached cause exception.

                at org.neo4j.kernel.lifecycle.LifeSupport$LifecycleInstance.start(LifeSupport.java:444)

                at org.neo4j.kernel.lifecycle.LifeSupport.start(LifeSupport.java:107)

                at org.neo4j.server.AbstractNeoServer.start(AbstractNeoServer.java:189)

                ... 3 more

Caused by: java.lang.RuntimeException: Error starting org.neo4j.kernel.impl.factory.CommunityFacadeFactory, /data/ddna/neo4j/data/databases/graph.db

                at org.neo4j.kernel.impl.factory.GraphDatabaseFacadeFactory.newFacade(GraphDatabaseFacadeFactory.java:144)

                at org.neo4j.kernel.impl.factory.CommunityFacadeFactory.newFacade(CommunityFacadeFactory.java:40)

                at org.neo4j.kernel.impl.factory.GraphDatabaseFacadeFactory.newFacade(GraphDatabaseFacadeFactory.java:108)

                at org.neo4j.server.CommunityNeoServer.lambda$static$31(CommunityNeoServer.java:55)

                at org.neo4j.server.CommunityNeoServer$$Lambda$43/1644443712.newGraphDatabase(Unknown Source)

                at org.neo4j.server.database.LifecycleManagingDatabase.start(LifecycleManagingDatabase.java:89)

                at org.neo4j.kernel.lifecycle.LifeSupport$LifecycleInstance.start(LifeSupport.java:434)

                ... 5 more

Caused by: org.neo4j.kernel.lifecycle.LifecycleException: Component 'org.neo4j.kernel.impl.storageengine.impl.recordstorage.RecordStorageEngine@38ef24c2' was successfully initialized, but failed to start. Please see attached cause exception.

                at org.neo4j.kernel.lifecycle.LifeSupport$LifecycleInstance.start(LifeSupport.java:444)

                at org.neo4j.kernel.lifecycle.LifeSupport.start(LifeSupport.java:107)

                at org.neo4j.kernel.NeoStoreDataSource.start(NeoStoreDataSource.java:503)

                at org.neo4j.kernel.lifecycle.LifeSupport$LifecycleInstance.start(LifeSupport.java:434)

                at org.neo4j.kernel.lifecycle.LifeSupport.start(LifeSupport.java:107)

                at org.neo4j.kernel.impl.transaction.state.DataSourceManager.start(DataSourceManager.java:99)

                at org.neo4j.kernel.lifecycle.LifeSupport$LifecycleInstance.start(LifeSupport.java:434)

                at org.neo4j.kernel.lifecycle.LifeSupport.start(LifeSupport.java:107)

                at org.neo4j.kernel.impl.factory.GraphDatabaseFacadeFactory.newFacade(GraphDatabaseFacadeFactory.java:140)

                ... 11 more

Caused by: java.lang.OutOfMemoryError: Java heap space

                at org.neo4j.kernel.impl.store.record.DynamicRecord.clone(DynamicRecord.java:189)

                at org.neo4j.kernel.impl.store.PropertyStore.ensureHeavy(PropertyStore.java:186)

                at org.neo4j.kernel.impl.store.PropertyStore.getArrayFor(PropertyStore.java:348)

                at org.neo4j.kernel.impl.store.PropertyType$10.getValue(PropertyType.java:230)

                at org.neo4j.kernel.impl.transaction.state.NeoStoreIndexStoreView.nodeAsUpdates(NeoStoreIndexStoreView.java:141)

                at org.neo4j.kernel.impl.api.index.IndexingService$1.visited(IndexingService.java:578)

                at org.neo4j.collection.primitive.hopscotch.AbstractLongHopScotchCollection.visitKeys(AbstractLongHopScotchCollection.java:48)

                at org.neo4j.kernel.impl.api.index.IndexingService.readRecoveredUpdatesFromStore(IndexingService.java:573)

                at org.neo4j.kernel.impl.api.index.IndexingService.applyRecoveredUpdates(IndexingService.java:562)

                at org.neo4j.kernel.impl.api.index.IndexingService.start(IndexingService.java:237)

                at org.neo4j.kernel.impl.storageengine.impl.recordstorage.RecordStorageEngine.start(RecordStorageEngine.java:421)

                at org.neo4j.kernel.lifecycle.LifeSupport$LifecycleInstance.start(LifeSupport.java:434)

                ... 19 more

2017-10-21 10:24:43.742+0000 INFO  Starting...

2017-10-21 10:24:45.515+0000 INFO  Bolt enabled on localhost:7687.

2017-10-22 14:53:20.892+0000 ERROR Failed to start Neo4j: Starting Neo4j failed: Component 'org.neo4j.server.database.LifecycleManagingDatabase@bf3d841' was successfully initialized, but failed to start. Please see attached cause exception. Starting Neo4j failed: Component 'org.neo4j.server.database.LifecycleManagingDatabase@bf3d841' was successfully initialized, but failed to start. Please see attached cause exception.

org.neo4j.server.ServerStartupException: Starting Neo4j failed: Component 'org.neo4j.server.database.LifecycleManagingDatabase@bf3d841' was successfully initialized, but failed to start. Please see attached cause exception.

                at org.neo4j.server.exception.ServerStartupErrors.translateToServerStartupError(ServerStartupErrors.java:68)

                at org.neo4j.server.AbstractNeoServer.start(AbstractNeoServer.java:217)

                at org.neo4j.server.ServerBootstrapper.start(ServerBootstrapper.java:87)

                at org.neo4j.server.ServerBootstrapper.start(ServerBootstrapper.java:66)

                at org.neo4j.server.CommunityEntryPoint.main(CommunityEntryPoint.java:28)

Caused by: org.neo4j.kernel.lifecycle.LifecycleException: Component 'org.neo4j.server.database.LifecycleManagingDatabase@bf3d841' was successfully initialized, but failed to start. Please see attached cause exception.

                at org.neo4j.kernel.lifecycle.LifeSupport$LifecycleInstance.start(LifeSupport.java:444)

                at org.neo4j.kernel.lifecycle.LifeSupport.start(LifeSupport.java:107)

                at org.neo4j.server.AbstractNeoServer.start(AbstractNeoServer.java:189)

                ... 3 more

Caused by: java.lang.RuntimeException: Error starting org.neo4j.kernel.impl.factory.CommunityFacadeFactory, /data/ddna/neo4j/data/databases/graph.db

                at org.neo4j.kernel.impl.factory.GraphDatabaseFacadeFactory.newFacade(GraphDatabaseFacadeFactory.java:144)

                at org.neo4j.kernel.impl.factory.CommunityFacadeFactory.newFacade(CommunityFacadeFactory.java:40)

                at org.neo4j.kernel.impl.factory.GraphDatabaseFacadeFactory.newFacade(GraphDatabaseFacadeFactory.java:108)

                at org.neo4j.server.CommunityNeoServer.lambda$static$31(CommunityNeoServer.java:55)

                at org.neo4j.server.CommunityNeoServer$$Lambda$43/1644443712.newGraphDatabase(Unknown Source)

                at org.neo4j.server.database.LifecycleManagingDatabase.start(LifecycleManagingDatabase.java:89)

                at org.neo4j.kernel.lifecycle.LifeSupport$LifecycleInstance.start(LifeSupport.java:434)

                ... 5 more

Caused by: org.neo4j.kernel.lifecycle.LifecycleException: Component 'org.neo4j.kernel.impl.storageengine.impl.recordstorage.RecordStorageEngine@7beacd29' was successfully initialized, but failed to start. Please see attached cause exception.

                at org.neo4j.kernel.lifecycle.LifeSupport$LifecycleInstance.start(LifeSupport.java:444)

                at org.neo4j.kernel.lifecycle.LifeSupport.start(LifeSupport.java:107)

                at org.neo4j.kernel.NeoStoreDataSource.start(NeoStoreDataSource.java:503)

                at org.neo4j.kernel.lifecycle.LifeSupport$LifecycleInstance.start(LifeSupport.java:434)

                at org.neo4j.kernel.lifecycle.LifeSupport.start(LifeSupport.java:107)

                at org.neo4j.kernel.impl.transaction.state.DataSourceManager.start(DataSourceManager.java:99)

                at org.neo4j.kernel.lifecycle.LifeSupport$LifecycleInstance.start(LifeSupport.java:434)

                at org.neo4j.kernel.lifecycle.LifeSupport.start(LifeSupport.java:107)

                at org.neo4j.kernel.impl.factory.GraphDatabaseFacadeFactory.newFacade(GraphDatabaseFacadeFactory.java:140)

                ... 11 more

Caused by: java.lang.OutOfMemoryError: Java heap space: failed reallocation of scalar replaced objects

                at org.neo4j.kernel.api.index.NodePropertyUpdate.add(NodePropertyUpdate.java:222)

                at org.neo4j.kernel.impl.transaction.state.NeoStoreIndexStoreView.nodeAsUpdates(NeoStoreIndexStoreView.java:142)

                at org.neo4j.kernel.impl.api.index.IndexingService$1.visited(IndexingService.java:578)

                at org.neo4j.collection.primitive.hopscotch.AbstractLongHopScotchCollection.visitKeys(AbstractLongHopScotchCollection.java:48)

                at org.neo4j.kernel.impl.api.index.IndexingService.readRecoveredUpdatesFromStore(IndexingService.java:573)

                at org.neo4j.kernel.impl.api.index.IndexingService.applyRecoveredUpdates(IndexingService.java:562)

                at org.neo4j.kernel.impl.api.index.IndexingService.start(IndexingService.java:237)

                at org.neo4j.kernel.impl.storageengine.impl.recordstorage.RecordStorageEngine.start(RecordStorageEngine.java:421)

                at org.neo4j.kernel.lifecycle.LifeSupport$LifecycleInstance.start(LifeSupport.java:434)

                ... 19 more

2017-10-22 14:53:51.391+0000 INFO  Starting...

2017-10-22 14:53:53.672+0000 INFO  Bolt enabled on localhost:7687.

2017-10-24 00:17:35.682+0000 ERROR Failed to start Neo4j: Starting Neo4j failed: Component 'org.neo4j.server.database.LifecycleManagingDatabase@3dd46893' was successfully initialized, but failed to start. Please see attached cause exception. Starting Neo4j failed: Component 'org.neo4j.server.database.LifecycleManagingDatabase@3dd46893' was successfully initialized, but failed to start. Please see attached cause exception.

org.neo4j.server.ServerStartupException: Starting Neo4j failed: Component 'org.neo4j.server.database.LifecycleManagingDatabase@3dd46893' was successfully initialized, but failed to start. Please see attached cause exception.

 


Thanks.

Michael Hunger

Nov 14, 2017, 12:08:20 AM
to ne...@googlegroups.com, Mikhaylo Demianenko
Hi,

afaik it was a bug in an older version that it used too much memory during recovery. (Please upgrade to 3.0.0)
How big were the transaction logs? I.e. how much data had you inserted?

What is your configuration for the page-cache?
Neo4j assumes it runs alone on the machine and, when the page-cache is not configured, grabs a percentage of (RAM - heap) for it.
Please configure the page-cache explicitly in the config.

For these memory sizes you shouldn't run Community but Enterprise.

The error is from the operating system trying to allocate memory for us, there is not much else we can do about it.

You should also configure tx-log retention e.g. to 10 G

HTH

Michael


--
You received this message because you are subscribed to the Google Groups "Neo4j" group.
To unsubscribe from this group and stop receiving emails from it, send an email to neo4j+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Ray Cheng

Nov 14, 2017, 3:57:28 PM
to ne...@googlegroups.com
> afaik it was a bug in an older version that it used too much memory during recovery. (Please upgrade to 3.0.0)
We are using 3.0.3, but a Community version.

> How big were the transaction logs? I.e. how much data had you inserted?
I don't recall, but transaction logs were limited to 10G in our neo4j.conf. We have inserted terabytes of data into Neo4j.

> What is your configuration for the page-cache?
> Neo4j assumes it runs alone on the machine and, when the page-cache is not configured, grabs a percentage of (RAM - heap) for it.
> Please configure the page-cache explicitly in the config.
This page-cache setting may be a major cause. We are using the default, i.e. (192GB - 40GB) / 2 = 76GB.
I will set the page-cache explicitly to 10GB.
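
To make the arithmetic concrete, here is a rough Python sketch of the memory budget, using only the figures quoted in this thread. The function names are hypothetical, the "half of RAM minus heap" rule is the default as described above, and the Elasticsearch footprint is not stated anywhere in the thread, so it is left out.

```python
# Rough memory budget for the server in this thread, in GB.
# Assumptions: default page cache = (RAM - heap) / 2, as described above;
# other JVM native memory and Elasticsearch are not counted.

RAM_GB = 192
HEAP_GB = 40

def default_page_cache_gb(ram_gb, heap_gb):
    """Page cache Neo4j would take by default: half of RAM minus heap."""
    return (ram_gb - heap_gb) / 2

def neo4j_footprint_gb(heap_gb, page_cache_gb):
    """Heap plus page cache, the two largest Neo4j memory consumers."""
    return heap_gb + page_cache_gb

default_cache = default_page_cache_gb(RAM_GB, HEAP_GB)
print(default_cache)                               # 76.0
print(neo4j_footprint_gb(HEAP_GB, default_cache))  # 116.0
print(neo4j_footprint_gb(HEAP_GB, 10))             # 50
```

With the default, Neo4j alone may claim about 116GB of the 192GB before Elasticsearch takes its share; with an explicit 10GB page-cache the figure drops to about 50GB.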

> For these memory sizes you shouldn't run Community but Enterprise.
Will do so.

> The error is from the operating system trying to allocate memory for us, there is not much else we can do about it.
I see

> You should also configure tx-log retention e.g. to 10 G
We had already set dbms.tx_log.rotation.retention_policy=10G size before this incident.

Thanks Michael.
Ray

--
You received this message because you are subscribed to a topic in the Google Groups "Neo4j" group.
To unsubscribe from this topic, visit https://groups.google.com/d/topic/neo4j/WJyjZnSHN9w/unsubscribe.
To unsubscribe from this group and all its topics, send an email to neo4j+unsubscribe@googlegroups.com.