Disable any Caching in Neo4j

Curtis Mosters

Nov 19, 2014, 8:36:39 AM
to ne...@googlegroups.com
For my benchmark I need a clear cache, because otherwise Neo4j keeps caching and skews my results.

So I tried:

cache_type=none
cache.memory_ratio=0.0

in neo4j.properties. It had no real impact; Neo4j was still caching.

I also found something from Michael about jconsole: http://stackoverflow.com/questions/26189351/neo4j-server-clear-the-cache-in-ram
But I really have no idea what he means. =( The JMX call I found is this one: http://stackoverflow.com/questions/12621963/clear-ehcache-of-remote-server
But I am also not sure whether that is the right thing to clear.

Here I found information about caching: http://neo4j.com/docs/stable/configuration-caches.html
But nothing there worked or seemed useful to test.

Then I went through all the settings here: http://neo4j.com/docs/stable/kernel-configuration.html
But apart from the two lines above, nothing looked promising. I even tried query_cache_size=0, but that gave me a strange error message, so I don't know what that setting does.

Then I thought that setting the neo_mapping stuff to 0 might help. But then I found http://grokbase.com/t/gg/neo4j/1312y592r4/caching-the-whole-graph, which says that setting it to 0 is like setting the limit to infinity.

Oh man, so nothing worked to disable the cache. Why is it so hard to provide a setting that disables the cache, or clears it when the server shuts down?

Really need this =/

Thank you

Jacob Hansson

Nov 19, 2014, 12:35:38 PM
to ne...@googlegroups.com
Curtis,

can you clarify what you mean by caching, and how you are determining that things are getting cached?

If you are talking about caching of the actual data, note that there are several layers of caching - the OS will cache files in its page cache, for instance. If you want to work around that in your testing you need to ask the operating system to flush its caches.

/j

Curtis Mosters

Nov 19, 2014, 1:25:16 PM
to ne...@googlegroups.com
Hi Jacob, well yeah here is an example:

START n=node:titles("title:solar") RETURN count(*)

If I run this right after the import, it takes 180 sec. Running it a second time takes 3 sec, and a third time just 1.5 sec; after that there is no further improvement. Also, after changing some settings, I never get back to the 180 sec. =/

And in case you think it is just that one word: no, it's not. I also tested 9 others, and those were slow on the first run as well, at roughly 60 sec. Searching for them again took 1-2 sec.

Even after restarting the Neo4j server it is still that fast (1-3 sec). I need the uncached timings for my comparison; it would not be correct to continue with cached results.

Maybe now it's way clearer? If not just let me know. Thank you.
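
The pattern described above (a slow first run followed by much faster repeats) can be captured with a small timing loop. This is only a minimal sketch: `run_query` is a hypothetical callable standing in for whatever sends the Cypher query to the server, and it is not part of any Neo4j API.

```python
import time


def time_runs(run_query, repeats):
    """Time `repeats` consecutive calls of run_query; return seconds per run."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        run_query()  # hypothetical: sends the query and waits for the result
        timings.append(time.perf_counter() - start)
    return timings


def cold_warm_ratio(timings):
    """Ratio of the first (cold) run to the fastest later (warm) run."""
    if len(timings) < 2:
        raise ValueError("need at least one cold and one warm run")
    return timings[0] / min(timings[1:])


# Timings reported above for the "solar" query, in seconds.
print(cold_warm_ratio([180.0, 3.0, 1.5]))  # 120.0
```

A ratio this large suggests the first run is dominated by disk reads rather than by query execution.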

Jacob Hansson

Nov 19, 2014, 3:37:57 PM
to ne...@googlegroups.com
Hey mate,

yeah, this does clarify it. It is very likely that what is happening is that the OS is caching the index files in RAM, so the second time you run the database (even after a restart), it does not have to hit disk. You could verify that this is the case by evicting the OS page cache between your benchmarks. What OS are you using?

On mac, you should be able to run 'purge' to clear the caches, see: http://www.cnet.com/news/purge-the-os-x-disk-cache-to-analyze-memory-usage/

For windows there does not seem to be a vendor-provided mechanism to do this, see here for alternatives: http://stackoverflow.com/questions/7405868/how-to-invalidate-the-file-system-cache

/jake

Michael Hunger

Nov 19, 2014, 4:55:23 PM
to ne...@googlegroups.com
But for realistic use cases you will have exactly that setup: hot data in your OS's file-system cache and in the database caches.

That is the state you want to reach for real benchmarks, as it represents real-world usage, not the cold caches after a computer and database start.

The cold cache numbers can be completely ignored imho as they only measure the speed of the disk and the loading mechanism of the FS and database to get data loaded.

Same goes for JVM JIT and other optimizations that happen behind the scenes (by OS, JVM, DB)
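
A warm-cache measurement along these lines could be sketched as follows: run the query a few times untimed so that the OS page cache, the database caches, and the JIT settle, then time the remaining runs. `run_query` is a hypothetical callable, and the warm-up and run counts are arbitrary assumptions, not recommended values.

```python
import statistics
import time


def benchmark_warm(run_query, warmup=3, runs=10):
    """Discard `warmup` untimed runs, then return the median of `runs` timed runs."""
    for _ in range(warmup):
        run_query()  # results ignored: these runs only heat the caches

    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        run_query()
        timings.append(time.perf_counter() - start)
    return statistics.median(timings)
```

The median is used here rather than the mean so that a single slow outlier does not dominate the reported number.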

Michael

Curtis Mosters

Nov 20, 2014, 3:29:04 AM
to ne...@googlegroups.com
Yeah, but the issue is that I have 10 comparisons, each using, let's say, the same 10 words to search for.

So after the first search Neo4j already knows where they are and is faster. That's the issue I have. The real world is different; there you would change those words. But I don't want to keep thinking up new words to search for.

So clearing the cache would be much better. Is there really no way to do it?

I also tried killall -9 java, but that did not work.
I also followed the "On linux, see: http://linux-mm.org/Drop_Caches" advice, but nothing happens?

Michael Hunger

Nov 20, 2014, 5:54:24 AM
to ne...@googlegroups.com
As this is operating system caches we're talking about, you can try "sync" on unix.

Or reboot.

It might also work to run Neo4j in a container (docker, vagrant) and restart / resume that container.

Curtis Mosters

Nov 20, 2014, 8:37:23 AM
to ne...@googlegroups.com
Yeah, rebooting works fine. With it I get the uncached time again.

But rebooting takes so much time, and I have to reconnect via shell again and again.

I also tried sync, but when I type it nothing happens. I also searched for how to use it, but it seems that no one has a good solution there.

I have never tried Docker or Vagrant, but that sounds like a good solution.

And btw, I know that this test might not be the best. But since Neo4j does not clear the cache, I have to do it this way; otherwise I cannot compare it to my other database. That one clears its cache on shutdown, which is how it should be, I think.

Thanks Michael

Curtis Mosters

Nov 20, 2014, 8:56:19 AM
to ne...@googlegroups.com
I have no idea what this line does in detail, but with it I get the same speed as after a restart:

echo 3 | sudo tee /proc/sys/vm/drop_caches


Just want to share it with you.
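
For reference, a hedged note on what that line does: writing 3 to /proc/sys/vm/drop_caches asks the Linux kernel to drop its clean page-cache pages plus dentries and inodes, which is why the next query has to read from disk again. Only clean pages are dropped, so running sync first writes dirty pages out; sync on its own flushes data to disk but evicts nothing and prints nothing. Below is a minimal Python sketch of the same step, meant to be called between benchmark runs. It needs root, and the function name and the `path` parameter are assumptions made for illustration.

```python
import os

DROP_CACHES = "/proc/sys/vm/drop_caches"  # Linux only; writing requires root


def drop_page_cache(path=DROP_CACHES):
    """Flush dirty pages, then ask the kernel to drop its clean caches."""
    os.sync()  # write dirty pages to disk so they become droppable
    with open(path, "w") as f:
        f.write("3\n")  # 1 = page cache, 2 = dentries and inodes, 3 = both
```

Called as root before each timed query, `drop_page_cache()` should give cold-cache timings without a reboot.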