Hi Vineeth,
I tried your example and did run into the OOM issue with the 2G heap. This is actually normal and expected. Let me explain.
After running './scripts/post-element-list.sh person3 dat/persons.xml', the application is unable to create the new group person4 due to OOM, but the other four groups (default, person1, person2, person3) are fine.
You can run the following command:
find data | grep \.seg$ | xargs ls -lrt | awk '{ print $9 " " $5}' | sort
You will see results like the following:
data/default/connections-store/index/segs/0.seg 8388608
data/default/connections-store/store-ext/segs/0.seg 134217728
data/default/connections-store/store/segs/0.seg 33554432
data/default/element-store/segs/0.seg 33554432
data/person1/connections-store/index/segs/0.seg 8388608
data/person1/connections-store/store-ext/segs/0.seg 134217728
data/person1/connections-store/store/segs/0.seg 33554432
data/person1/connections-store/store/segs/1.seg 33554432
data/person1/connections-store/store/segs/2.seg 33554432
data/person1/connections-store/store/segs/3.seg 33554432
data/person1/connections-store/store/segs/4.seg 33554432
data/person1/element-store/segs/0.seg 33554432
data/person1/element-store/segs/1.seg 33554432
data/person2/connections-store/index/segs/0.seg 8388608
data/person2/connections-store/store-ext/segs/0.seg 134217728
data/person2/connections-store/store/segs/0.seg 33554432
data/person2/connections-store/store/segs/1.seg 33554432
data/person2/connections-store/store/segs/2.seg 33554432
data/person2/connections-store/store/segs/3.seg 33554432
data/person2/connections-store/store/segs/4.seg 33554432
data/person2/element-store/segs/0.seg 33554432
data/person2/element-store/segs/1.seg 33554432
data/person3/connections-store/index/segs/0.seg 8388608
data/person3/connections-store/store-ext/segs/0.seg 134217728
data/person3/connections-store/store/segs/0.seg 33554432
data/person3/connections-store/store/segs/1.seg 33554432
data/person3/connections-store/store/segs/2.seg 33554432
data/person3/connections-store/store/segs/3.seg 33554432
data/person3/connections-store/store/segs/4.seg 33554432
data/person3/element-store/segs/0.seg 33554432
data/person3/element-store/segs/1.seg 33554432
Since MemorySegmentFactory is used, all .seg files are loaded into JVM memory. The application used roughly 1.4 GB for these segments.
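If you want to verify the total yourself, a one-liner along these lines (just a sketch, assuming the same data layout; awk sums column 5, the file size printed by ls -l) adds up the .seg sizes. For the listing above it comes out to about 1.34 GB:
find data | grep \.seg$ | xargs ls -l | awk '{ sum += $5 } END { printf "%.2f GB\n", sum/1e9 }'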
Similarly, for the indexes.dat files:
find data | grep indexes.dat | xargs ls -lrt | awk '{print $9" "$5}'
data/default/connections-store/store/indexes.dat 2098176
data/default/connections-store/index/indexes.dat 8389632
data/default/connections-store/store-ext/indexes.dat 8389632
data/person2/connections-store/store-ext/indexes.dat 8389632
data/person2/connections-store/index/indexes.dat 8389632
data/person2/connections-store/store/indexes.dat 2098176
data/person3/element-store/indexes.dat 8001024
data/person3/connections-store/store/indexes.dat 2098176
data/person3/connections-store/index/indexes.dat 8389632
data/person3/connections-store/store-ext/indexes.dat 8389632
data/person1/element-store/indexes.dat 8001024
data/person1/connections-store/store/indexes.dat 2098176
data/person1/connections-store/index/indexes.dat 8389632
data/person1/connections-store/store-ext/indexes.dat 8389632
data/default/element-store/indexes.dat 8001024
data/person2/element-store/indexes.dat 8001024
These files used roughly 110 MB of JVM memory.
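The same kind of one-liner can be used to check this number (again just a sketch); for the listing above the sum is about 108 MB:
find data | grep indexes.dat | xargs ls -l | awk '{ sum += $5 } END { printf "%.1f MB\n", sum/1e6 }'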
At this point, the above files from the four groups (default, person1, person2, person3) have consumed roughly 1.5 GB of memory. There is also some additional overhead for loading elements from all four groups, so the memory left is not enough to create the next group (person4). Now compare this with the actual disk usage:
du -sh data
696M data
There are actually only 696 MB of data on disk. This means that a lot of the space allocated for segments in the JVM is not actually used, because segments are fixed-size and allocated at their full size as soon as they are requested.
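If your du comes from GNU coreutils, you can see the same effect on disk (a quick check; the --apparent-size flag is not available on every platform): the apparent size should be close to the ~1.45 GB of nominal .seg and indexes.dat sizes, while only 696 MB of blocks are actually in use, presumably because the segment files are pre-allocated at their fixed size but only partially filled.
du -sh --apparent-size data
du -sh data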
OK, now we restart the application and see the failure caused by OOM.
find data | grep \.seg$ | xargs ls -lrt | awk '{ print $9 " " $5}' | sort
data/default/connections-store/index/segs/0.seg 8388608
data/default/connections-store/store-ext/segs/0.seg 134217728
data/default/connections-store/store/segs/0.seg 33554432
data/default/element-store/segs/0.seg 33554432
data/person1/connections-store/index/segs/0.seg 8388608
data/person1/connections-store/index/segs/1.seg 8388608
data/person1/connections-store/store-ext/segs/0.seg 134217728
data/person1/connections-store/store-ext/segs/1.seg 134217728
data/person1/connections-store/store/segs/0.seg 33554432
data/person1/connections-store/store/segs/1.seg 33554432
data/person1/connections-store/store/segs/2.seg 33554432
data/person1/connections-store/store/segs/3.seg 33554432
data/person1/connections-store/store/segs/4.seg 33554432
data/person1/element-store/segs/0.seg 33554432
data/person1/element-store/segs/1.seg 33554432
data/person1/element-store/segs/2.seg 33554432
data/person2/connections-store/index/segs/0.seg 8388608
data/person2/connections-store/store-ext/segs/0.seg 134217728
data/person2/connections-store/store/segs/0.seg 33554432
data/person2/connections-store/store/segs/1.seg 33554432
data/person2/connections-store/store/segs/2.seg 33554432
data/person2/connections-store/store/segs/3.seg 33554432
data/person2/connections-store/store/segs/4.seg 33554432
data/person2/element-store/segs/0.seg 33554432
data/person2/element-store/segs/1.seg 33554432
data/person3/connections-store/index/segs/0.seg 8388608
data/person3/connections-store/index/segs/1.seg 8388608
data/person3/connections-store/store-ext/segs/0.seg 134217728
data/person3/connections-store/store-ext/segs/1.seg 134217728
data/person3/connections-store/store/segs/0.seg 33554432
data/person3/connections-store/store/segs/1.seg 33554432
data/person3/connections-store/store/segs/2.seg 33554432
data/person3/connections-store/store/segs/3.seg 33554432
data/person3/connections-store/store/segs/4.seg 33554432
data/person3/element-store/segs/0.seg 33554432
data/person3/element-store/segs/1.seg 33554432
data/person3/element-store/segs/2.seg 33554432
The total memory used by these .seg files is approximately 1.7 GB. With the indexes.dat files on top of that, you are looking at 1.8 GB plus other overhead, so you are now out of memory.
Now there are more .seg files than before the restart. Why? For the time being, the indexes are always opened for read-write; read-only mode is not supported. So new segments are allocated in the JVM to accept new/update traffic. Since you do not have any new/update traffic at this point, these segments are wasted even though they still occupy memory. This is especially wasteful for small indexes.
So an easy solution is to increase the JVM heap size. In production with millions of items, it may be good to follow a simple formula: heapGB = max(dataSizeGB + 2, dataSizeGB * 1.4), where dataSizeGB is the sum of the .seg file sizes and the indexes.dat file sizes.
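For the data above, dataSizeGB is roughly 1.8 (about 1.7 GB of .seg files plus about 0.1 GB of indexes.dat), so the formula gives max(1.8 + 2, 1.8 * 1.4) = max(3.8, 2.52), i.e. about a 4 GB heap. A one-liner in the same style as above can compute this (just a sketch of the arithmetic, assuming the same layout):
find data | grep -E '\.seg$|indexes\.dat$' | xargs ls -l | awk '{ sum += $5 } END { gb = sum/1e9; heap = (gb + 2 > gb * 1.4) ? gb + 2 : gb * 1.4; printf "dataSizeGB=%.2f, suggested heap=%.2f GB\n", gb, heap }'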
Hope this makes sense.
Thanks.
Jingwei