OOM for a few MB of data

Vineeth Mohan

Oct 24, 2012, 1:48:32 PM
to cleo-ty...@googlegroups.com
Hello,

I created 3 indexes and indexed a few records in each.
The parameters are as follows:

  • default - empty
  • company - input XML size = 884K, connections-store = 24 MB, element-store = 1.1 MB
  • congress - input XML size = 212K, connections-store = 9.2 MB, element-store = 196K

The Cleo instance is fine during and after indexing, but after a restart, running a search hits a heap OOM issue.

LOG - https://gist.github.com/3947596

Memory setting - -Xms1g -Xmx1g

Can you tell me the reason for this?

Is there something I have missed?

Thanks

                 Vineeth

Jingwei

Oct 24, 2012, 4:25:32 PM
to cleo-ty...@googlegroups.com
Hi Vineeth,

Could you try a larger JVM heap size, say 2g?

Could you do "du -sh /home/vineeth/git/cleo-primer/data/" to see how much data you have there?

The configurations for your three typeaheads have a lot of headroom for 3 million elements, even though you do not have that many elements.

Thanks.

Jingwei

Vineeth Mohan

Oct 24, 2012, 9:56:47 PM
to cleo-ty...@googlegroups.com
vineeth@vineeth-XPS-L501X:~/git/cleo-primer$ du -sh /home/vineeth/git/cleo-primer/data/
35M    /home/vineeth/git/cleo-primer/data/

vineeth@vineeth-XPS-L501X:~/git/cleo-primer$ du -sch data/*/*
17M    data/company/connections-store
1.1M    data/company/element-store
4.4M    data/congress/connections-store
204K    data/congress/element-store
72K    data/default/connections-store
28K    data/default/element-store
23M    total

Hello Jingwei,

Cleo is working fine with 2g of memory, but I am trying to understand why it broke in the first place with just a few MB of data.

Thanks
           Vineeth

vineethmohan

Oct 25, 2012, 3:15:14 AM
to cleo-ty...@googlegroups.com
Hi,

I am still facing the issue with 2 GB of RAM. This time I kept loading data to see where it would break.
The parameters are as follows:

LOG - https://gist.github.com/3951110

algotree@DELTA:~/cleo/cleo-primer$ du -sch data/*/*
22M    data/company/connections-store
1.4M    data/company/element-store
72K    data/default/connections-store
24K    data/default/element-store
192M    data/person/connections-store
41M    data/person/element-store
228K    data/region/connections-store
28K    data/region/element-store
255M    total

Kindly help me out here. I am trying to build a production-quality autocomplete product on top of Cleo.

Thanks
           Vineeth

Vineeth Mohan

Oct 25, 2012, 5:36:36 AM
to cleo-ty...@googlegroups.com
A few more observations:

  • When indexing a huge amount of data, Cleo does not usually hit a heap OOM
  • When the same index is restarted, Cleo hits an OOM
  • Cleo does not necessarily hit an OOM every time the index is restarted with the same data

Thanks

             Vineeth

Vineeth Mohan

Oct 25, 2012, 5:42:13 AM
to cleo-ty...@googlegroups.com
Here is the heap memory graph on restart of Cleo with the following parameters:


algotree@DELTA:~/cleo/cleo-primer$ du -sch data/*/*
72K    data/default/connections-store
24K    data/default/element-store
252M    data/double/connections-store
81M    data/double/element-store
252M    data/person/connections-store
81M    data/person/element-store
665M    total

RAM - 2 GB

Observation - memory shoots up to 1.6 GB and then hits a heap OOM.
IMAGE - http://twitpic.com/b77c7e/full
LOG - https://gist.github.com/3951675

Thanks
            Vineeth

Vineeth Mohan

Oct 25, 2012, 11:29:04 AM
to cleo-ty...@googlegroups.com
Hi,

I have created a testable version of cleo-federated here:

CODE - https://github.com/Vineeth-Mohan/cleo-primer

I have written minimal instructions to get started.
Feel free to use the code if required.

Thanks
          Vineeth

Jingwei Wu

Oct 25, 2012, 11:42:46 AM
to cleo-ty...@googlegroups.com
Hi Vineeth,

Could you post target/logs/cleo.log upon restart?

Thanks.

Jingwei

Vineeth Mohan

Oct 25, 2012, 11:46:02 AM
to cleo-ty...@googlegroups.com
Please find the logs attached.

Thanks
          Vineeth
cleo.log

Vineeth Mohan

Oct 25, 2012, 12:47:49 PM
to cleo-ty...@googlegroups.com, Jingwei Wu
Hello Jingwei,

To reproduce the heap OOM, please follow these steps:

git clone 'https://github.com/Vineeth-Mohan/cleo-primer.git'
MAVEN_OPTS="-Xms2g -Xmx2g " mvn clean install jetty:run

./scripts/post-element-list.sh person1 dat/persons.xml
./scripts/post-element-list.sh person2 dat/persons.xml
./scripts/post-element-list.sh person3 dat/persons.xml
./scripts/post-element-list.sh person4 dat/persons.xml

Restart Cleo.

Then run a curl request to bootstrap the GenericTypeahead instance:

curl 'http://localhost:8080/cleo-primer/rest/elements/search?query=s'

Thanks
            Vineeth

Jingwei Wu

Oct 25, 2012, 5:46:49 PM
to cleo-ty...@googlegroups.com
Hi Vineeth,

I tried your example and did run into the OOM issue with the 2G heap. This is actually normal and expected. Let me explain.

After running './scripts/post-element-list.sh person3 dat/persons.xml', the application is unable to create the new group person4 due to OOM. But the other four groups (default, person1, person2, person3) are fine.

You can run the following command:

find data | grep \.seg$ | xargs ls -lrt  | awk '{ print $9 " " $5}' | sort

You will see results like the following:

data/default/connections-store/index/segs/0.seg 8388608
data/default/connections-store/store-ext/segs/0.seg 134217728
data/default/connections-store/store/segs/0.seg 33554432
data/default/element-store/segs/0.seg 33554432
data/person1/connections-store/index/segs/0.seg 8388608
data/person1/connections-store/store-ext/segs/0.seg 134217728
data/person1/connections-store/store/segs/0.seg 33554432
data/person1/connections-store/store/segs/1.seg 33554432
data/person1/connections-store/store/segs/2.seg 33554432
data/person1/connections-store/store/segs/3.seg 33554432
data/person1/connections-store/store/segs/4.seg 33554432
data/person1/element-store/segs/0.seg 33554432
data/person1/element-store/segs/1.seg 33554432
data/person2/connections-store/index/segs/0.seg 8388608
data/person2/connections-store/store-ext/segs/0.seg 134217728
data/person2/connections-store/store/segs/0.seg 33554432
data/person2/connections-store/store/segs/1.seg 33554432
data/person2/connections-store/store/segs/2.seg 33554432
data/person2/connections-store/store/segs/3.seg 33554432
data/person2/connections-store/store/segs/4.seg 33554432
data/person2/element-store/segs/0.seg 33554432
data/person2/element-store/segs/1.seg 33554432
data/person3/connections-store/index/segs/0.seg 8388608
data/person3/connections-store/store-ext/segs/0.seg 134217728
data/person3/connections-store/store/segs/0.seg 33554432
data/person3/connections-store/store/segs/1.seg 33554432
data/person3/connections-store/store/segs/2.seg 33554432
data/person3/connections-store/store/segs/3.seg 33554432
data/person3/connections-store/store/segs/4.seg 33554432
data/person3/element-store/segs/0.seg 33554432
data/person3/element-store/segs/1.seg 33554432

Since MemorySegmentFactory is used to load all .seg files into JVM memory, the application used roughly 1.4 GB for these segments.

find data | grep indexes.dat   | xargs ls -lrt  | awk '{print $9" "$5}'

data/default/connections-store/store/indexes.dat 2098176
data/default/connections-store/index/indexes.dat 8389632
data/default/connections-store/store-ext/indexes.dat 8389632
data/person2/connections-store/store-ext/indexes.dat 8389632
data/person2/connections-store/index/indexes.dat 8389632
data/person2/connections-store/store/indexes.dat 2098176
data/person3/element-store/indexes.dat 8001024
data/person3/connections-store/store/indexes.dat 2098176
data/person3/connections-store/index/indexes.dat 8389632
data/person3/connections-store/store-ext/indexes.dat 8389632
data/person1/element-store/indexes.dat 8001024
data/person1/connections-store/store/indexes.dat 2098176
data/person1/connections-store/index/indexes.dat 8389632
data/person1/connections-store/store-ext/indexes.dat 8389632
data/default/element-store/indexes.dat 8001024
data/person2/element-store/indexes.dat 8001024

These files used roughly 110 MB of JVM memory.

At this point, the above files from the four groups (default, person1, person2, person3) consumed roughly 1.5 GB of memory. There is also some overhead for loading elements from all four groups. The memory left is not enough to create the new group person4.
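
Incidentally, if you want to total those heap-resident files yourself rather than eyeballing the listings, a minimal Java sketch like the following would do it (the data/ path and the DataSize class name are just assumptions for illustration, not part of Cleo):

import java.io.IOException;
import java.nio.file.*;
import java.util.stream.Stream;

public class DataSize {
    public static void main(String[] args) throws IOException {
        // Assumes the primer's default data directory; pass a different path as args[0].
        Path root = Paths.get(args.length > 0 ? args[0] : "data");
        try (Stream<Path> files = Files.walk(root)) {
            long total = files
                .filter(Files::isRegularFile)
                // Only .seg and indexes.dat files are loaded fully into the JVM heap.
                .filter(p -> p.toString().endsWith(".seg")
                          || p.getFileName().toString().equals("indexes.dat"))
                .mapToLong(p -> p.toFile().length())
                .sum();
            System.out.printf("%,d bytes (~%.2f GB) in .seg and indexes.dat files%n",
                    total, total / (1024.0 * 1024 * 1024));
        }
    }
}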

du -sh data

696M data

Yet du shows only 696 MB of actual data. This means that a lot of the segment space allocated in the JVM is not used, because segments are fixed-size and allocated upon request.
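
To make that concrete, here is an illustrative sketch of the idea (this is not Cleo's actual MemorySegmentFactory API): a memory-backed segment costs its full capacity on the heap no matter how little is written into it.

// Illustrative only: not Cleo's real segment class, just the general idea
// that a fixed-size, memory-backed segment pins its full capacity up front.
public class FixedSizeSegment {
    private final byte[] buffer; // full capacity allocated immediately
    private int position = 0;    // how much has actually been written

    public FixedSizeSegment(int capacityBytes) {
        this.buffer = new byte[capacityBytes]; // heap cost = capacity, not usage
    }

    public void append(byte[] data) {
        System.arraycopy(data, 0, buffer, position, data.length);
        position += data.length;
    }

    public static void main(String[] args) {
        // A 32 MB segment holding only 1 KB of data still pins 32 MB of heap.
        FixedSizeSegment seg = new FixedSizeSegment(32 * 1024 * 1024);
        seg.append(new byte[1024]);
        System.out.println("pinned=" + seg.buffer.length + " bytes, used=" + seg.position);
    }
}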

OK, now we restart the application and see the failure caused by OOM.

find data | grep \.seg$ | xargs ls -lrt  | awk '{ print $9 " " $5}' | sort

data/default/connections-store/index/segs/0.seg 8388608
data/default/connections-store/store-ext/segs/0.seg 134217728
data/default/connections-store/store/segs/0.seg 33554432
data/default/element-store/segs/0.seg 33554432
data/person1/connections-store/index/segs/0.seg 8388608
data/person1/connections-store/index/segs/1.seg 8388608
data/person1/connections-store/store-ext/segs/0.seg 134217728
data/person1/connections-store/store-ext/segs/1.seg 134217728
data/person1/connections-store/store/segs/0.seg 33554432
data/person1/connections-store/store/segs/1.seg 33554432
data/person1/connections-store/store/segs/2.seg 33554432
data/person1/connections-store/store/segs/3.seg 33554432
data/person1/connections-store/store/segs/4.seg 33554432
data/person1/element-store/segs/0.seg 33554432
data/person1/element-store/segs/1.seg 33554432
data/person1/element-store/segs/2.seg 33554432
data/person2/connections-store/index/segs/0.seg 8388608
data/person2/connections-store/store-ext/segs/0.seg 134217728
data/person2/connections-store/store/segs/0.seg 33554432
data/person2/connections-store/store/segs/1.seg 33554432
data/person2/connections-store/store/segs/2.seg 33554432
data/person2/connections-store/store/segs/3.seg 33554432
data/person2/connections-store/store/segs/4.seg 33554432
data/person2/element-store/segs/0.seg 33554432
data/person2/element-store/segs/1.seg 33554432
data/person3/connections-store/index/segs/0.seg 8388608
data/person3/connections-store/index/segs/1.seg 8388608
data/person3/connections-store/store-ext/segs/0.seg 134217728
data/person3/connections-store/store-ext/segs/1.seg 134217728
data/person3/connections-store/store/segs/0.seg 33554432
data/person3/connections-store/store/segs/1.seg 33554432
data/person3/connections-store/store/segs/2.seg 33554432
data/person3/connections-store/store/segs/3.seg 33554432
data/person3/connections-store/store/segs/4.seg 33554432
data/person3/element-store/segs/0.seg 33554432
data/person3/element-store/segs/1.seg 33554432
data/person3/element-store/segs/2.seg 33554432

The total memory used by these .seg files is approximately 1.7 GB. With those indexes.dat files, you are looking at 1.8 GB plus other overhead. So you are out of memory now.

Now there are more .seg files than before the restart. Why? For the time being, the indexes are always opened for read-write; read-only is not supported. So some new segments are allocated in the JVM for new/update traffic. Since you do not have any new/update traffic at this point, these segments are wasted even though they still occupy memory. This is especially wasteful for small indexes.

So an easy solution is to increase the JVM heap size. In production with millions of items, it may be good to follow a simple formula: JVM = max(dataSizeGB + 2GB, dataSizeGB * 1.4), where dataSizeGB is the sum of the .seg file sizes and the indexes.dat file sizes.
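
As a worked example of that formula (a sketch; the class and method names are just for illustration):

// Sizing formula from above: heap = max(dataSizeGB + 2, dataSizeGB * 1.4),
// where dataSizeGB is the sum of .seg and indexes.dat file sizes in GB.
public class HeapSizing {
    static double recommendedHeapGb(double dataSizeGb) {
        return Math.max(dataSizeGb + 2.0, dataSizeGb * 1.4);
    }

    public static void main(String[] args) {
        // e.g. ~1.8 GB of segment/index files -> max(3.8, 2.52) = 3.8 GB heap
        System.out.println(recommendedHeapGb(1.8) + " GB");
    }
}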

Hope this makes sense.

Thanks.

Jingwei

Vineeth Mohan

Oct 26, 2012, 6:32:22 AM
to cleo-ty...@googlegroups.com
Hello Jing,
Moving to ScannerTypeAhead solved all my problems.
I would like to know: what am I missing by moving to ScannerTypeAhead?
Is there any trade-off in achieving this stability for small indexes?

Thanks
           Vineeth