Hi Neo4j devs,
My application constantly performs Lucene index lookups, then loops over the result nodes and collects their IDs:
    Set<Long> userIds = new HashSet<>();
    // Reads in embedded Neo4j 2.x must run inside a transaction;
    // the ResourceIterator must be closed, so use try-with-resources.
    try (Transaction tx = graphDb.beginTx();
         ResourceIterator<Node> nodes = graphDb.findNodes(
                 label, "name" + attr, search)) {
        while (nodes.hasNext()) {
            userIds.add(nodes.next().getId());
        }
        tx.success();
    }
Environment. Linux box, 15GB RAM, 2GB JVM heap. The Neo4j store files total 29GB on disk; the Lucene indexes total about 6GB (5.8GB). Using Neo4j 2.2 embedded; cache_type is set to none.
Symptom 1. When the Neo4j page cache size (dbms.pagecache.memory) is set low enough (<= 8.5GB) -- leaving enough free RAM for the OS to cache the Lucene indexes -- query latency looks good.
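For reference, this is what I believe the "good" configuration amounts to (assuming both settings live in conf/neo4j.properties; property names are the 2.2 ones):

```properties
# Neo4j 2.2: fixed-size page cache for the store files
dbms.pagecache.memory=8500m
# Disable the (deprecated) object cache
cache_type=none
```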
Symptom 2. However, when it is set only slightly larger -- to 9.5GB or 10GB -- the following starts to happen during queries: constant high IO wait; the OS constantly reads in tens of MBs; a steady stream of 3k+ major page faults (maj_flt) for the Java process. It behaves as if the Lucene index pages could not evict the Neo4j store pages -- in other words, as if the store pages were being LRU-cached independently of the OS page cache. The CPU mostly sits waiting for IO to bring in pages (I'd guess mostly Lucene pages) before it can do any work (~1% usr usage every ~10 seconds).
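For concreteness, this is roughly how I sample the maj_flt numbers -- a minimal sketch assuming a Linux /proc filesystem, where field 12 of /proc/<pid>/stat is the cumulative major-fault count (see proc(5)):

```shell
#!/bin/sh
# Sample the cumulative major-fault count of a process from /proc.
# Here we sample the current shell ($$) for demonstration; in practice,
# point pid at the Java/Neo4j process instead.
# (Field counting assumes the comm field in /proc/<pid>/stat has no spaces.)
pid=$$
majflt=$(awk '{print $12}' "/proc/$pid/stat")
echo "maj_flt=$majflt"
```

Diffing successive samples in a loop gives the per-interval fault rate behind the 3k+ figure above.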
This is very surprising to me. Even in a memory-constrained case like this, I'd expect the Lucene index pages to compete with, and eventually win over, the Neo4j store pages (brought into memory by a full warmup at startup) in the OS page cache. The high IO would then occur only initially and taper off to nothing, since 5.8GB of indexes should fit comfortably in 15GB of RAM.
Could someone explain why the above would be happening?
Zongheng