java.nio.BufferUnderflowException


tpoljak

Jun 6, 2011, 2:05:23 PM
to Krati
Hi,
I'm using Krati as a persistent store for a map from (string) phrases to
their occurrence counts. Now that there are over 1,500,000,000 entries,
I'm getting exceptions (below) when trying to read the counters for
string keys. For some lookups I get:

Caused by: java.nio.BufferUnderflowException
at java.nio.Buffer.nextGetIndex(Buffer.java:497)
at java.nio.HeapByteBuffer.getInt(HeapByteBuffer.java:355)
at krati.store.DefaultDataStoreHandler.extractByKey(DefaultDataStoreHandler.java:117)
at krati.store.DynamicDataStore.get(DynamicDataStore.java:389)
at krati.store.DynamicDataStore.get(DynamicDataStore.java:33)

and for others I get:

Caused by: java.lang.IllegalArgumentException
at java.nio.Buffer.position(Buffer.java:235)
at krati.store.DefaultDataStoreHandler.extractByKey(DefaultDataStoreHandler.java:130)
at krati.store.DynamicDataStore.get(DynamicDataStore.java:389)
at krati.store.DynamicDataStore.get(DynamicDataStore.java:33)
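
For reference, both exceptions above are ordinary java.nio bounds checks rather than anything Krati-specific. The sketch below reproduces them with the standard library only. It does not use or imitate Krati's code: the class and method names are hypothetical, and the idea that extractByKey hits these checks because it reads inconsistent length or offset fields is an assumption based on the stack traces.

```java
import java.nio.BufferUnderflowException;
import java.nio.ByteBuffer;

// Hypothetical demo class; not part of Krati.
final class BufferBoundsDemo {

    /** Tries to read a 4-byte int from a heap buffer holding only {@code remaining} bytes. */
    static String readIntWith(int remaining) {
        ByteBuffer buf = ByteBuffer.wrap(new byte[remaining]);
        try {
            buf.getInt(); // needs 4 readable bytes
            return "ok";
        } catch (BufferUnderflowException e) {
            return "BufferUnderflowException";
        }
    }

    /** Tries to move the position of a buffer of the given capacity to {@code newPosition}. */
    static String seekTo(int capacity, int newPosition) {
        ByteBuffer buf = ByteBuffer.wrap(new byte[capacity]);
        try {
            buf.position(newPosition); // must satisfy 0 <= newPosition <= limit
            return "ok";
        } catch (IllegalArgumentException e) {
            return "IllegalArgumentException";
        }
    }

    public static void main(String[] args) {
        System.out.println(readIntWith(2)); // too few bytes left for an int
        System.out.println(seekTo(8, 16));  // position beyond the limit
    }
}
```

If a stored record's length or offset field is wrong, a reader that trusts it can hit either check, which would fit a damaged index or segment.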

I'm using krati-0.3.5.jar with this store definition:

_store = new DynamicDataStore(storeDir, new ChannelSegmentFactory());

because the key/value data doesn't fit into the available memory (there
is ~4.5 GB of data on disk and we can dedicate only 3 GB of RAM to the
Krati process). With other DataStore implementations I get an
OutOfMemoryError.

In this case, where the keys take more space than the values (the keys
are strings and the values are only counters) and there is not enough
RAM to hold all the keys, which data store would you recommend?

Also, regarding the exceptions above: why do they occur now (are the
index or segments corrupted, or something else?) and how can I fix the
issue? Can I fix it by changing some of the DynamicDataStore parameters?

Thanks,
Tomislav

Jingwei

Jun 8, 2011, 6:35:45 PM
to Krati
Hi Tomislav,

There is a bug in Krati that prevents DynamicDataStore and
DynamicDataSet from handling more than 1 << 27 (about 134 million)
keys. You have more than 1,500 million keys, which is probably what
caused the java.nio.BufferUnderflowException. I am currently working on
this issue and trying to scale the store to handle up to
Integer.MAX_VALUE keys.

In your case, if the keys are short, you can use StaticDataStore with a
fixed capacity such as 100,000,000:

_store = new StaticDataStore(storeDir, 100000000, new ChannelSegmentFactory());

Thanks.

Jingwei

Tomislav Poljak

Jun 9, 2011, 8:17:46 AM
to kr...@googlegroups.com
Hi,
thanks for the response. More inline.

2011/6/9 Jingwei <jingw...@gmail.com>:


> Hi Tomislav,
>
> There is a bug in Krati that prevents DynamicDataStore and
> DynamicDataSet from handling more than 1 << 27 (about 134 million)
> keys. You have more than 1,500 million keys, which is probably what
> caused the java.nio.BufferUnderflowException. I am currently working on
> this issue and trying to scale the store to handle up to
> Integer.MAX_VALUE keys.

OK. I understand.


>
> In your case, if the keys are short, you can use StaticDataStore with a
> fixed capacity such as 100,000,000:
>
> _store = new StaticDataStore(storeDir, 100000000, new ChannelSegmentFactory());

I tried your suggestion and got:

Caused by: java.io.IOException: Capacity expected: 67108864 not 100000000
at krati.store.StaticDataStore.<init>(StaticDataStore.java:197)
at krati.store.StaticDataStore.<init>(StaticDataStore.java:59)

Does this mean that the keys are too long?

Tomislav

Jingwei

Jun 13, 2011, 4:28:58 PM
to Krati
Hi Tomislav,

In Krati, StaticDataStore, DynamicDataStore and IndexedDataStore are
not interchangeable at present. This means that once you have chosen a
store type, you will have to keep using that type of store.

The store you have created via DynamicDataStore is corrupted due to
the bug mentioned in my previous reply. If you switch to
StaticDataStore, you will need to recreate the store from scratch.

Over the last weekend, I was trying to fix the bug related to
DynamicDataStore capacity. In the meantime, I was thinking about your
problem, i.e. only 3 GB of RAM for 1.5 billion unique keys. It is an
interesting problem. Krati at its present stage does not have a good
solution. The main reason is that Krati needs to allocate a large
in-memory long array for hashing. To reduce hash conflicts, the long
array needs to be large enough to hold the total number of keys
eventually added to the store.

In your case, both StaticDataStore and DynamicDataStore can only have
a long array of capacity 300,000,000 (2.4 GB in memory). At this scale,
each array slot is shared by 5 of the 1.5 billion keys on average, so
almost every lookup runs into hash conflicts. This won't give good
performance. However, it may still work. If you would like to give it a
try, you will need to get the latest snapshot from
git://github.com/jingwei/krati.git
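
The figures above can be checked with a little arithmetic. The sketch below is only that, a back-of-the-envelope calculation: it assumes 8 bytes per slot of a long array and ignores JVM object overhead, and the class and method names are hypothetical rather than anything from Krati.

```java
// Hypothetical helper for sizing an in-memory long[] hash array; not part of Krati.
final class HashArraySizing {

    /** Bytes needed for a long[] with the given number of slots (array header ignored). */
    static long arrayBytes(long slots) {
        return slots * Long.BYTES;
    }

    /** Average number of keys that share one slot. */
    static double keysPerSlot(long keys, long slots) {
        return (double) keys / slots;
    }

    public static void main(String[] args) {
        long keys = 1_500_000_000L;
        long slots = 300_000_000L;

        // 300,000,000 slots * 8 bytes = 2,400,000,000 bytes (2.4 GB)
        System.out.println(arrayBytes(slots));
        // 1,500,000,000 keys / 300,000,000 slots = 5.0 keys per slot
        System.out.println(keysPerSlot(keys, slots));
        // One slot per key would need 12,000,000,000 bytes (12 GB)
        System.out.println(arrayBytes(keys));
    }
}
```

With one slot per key the array alone would need about 12 GB, four times the 3 GB available to the process.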

A reasonable solution is for Krati to allocate the long array on disk
rather than in memory. This will require some code changes. I plan to
dig into it this week.

Thanks.

-jingwei


Tomislav Poljak

Jun 15, 2011, 7:15:57 AM
to kr...@googlegroups.com
Hi,

2011/6/13 Jingwei <jingw...@gmail.com>:


> Hi Tomislav,
>
> In Krati, StaticDataStore, DynamicDataStore and IndexedDataStore are
> not interchangeable at present. This means that once you have chosen a
> store type, you will have to keep using that type of store.
>
> The store you have created via DynamicDataStore is corrupted due to
> the bug mentioned in my previous reply. If you switch to
> StaticDataStore, you will need to recreate the store from scratch.

Yes, I figured as much. So I started rebuilding the store from
scratch, but the rebuild was much slower than with DynamicDataStore
(rebuilding involves getting the count object for each key, incrementing
the counter and putting it back into the Krati store), which is a
problem for 'live'/interactive use.
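
To make the rebuild step concrete, here is a minimal sketch of the get/increment/put cycle described above. The ByteStore interface and the in-memory MapStore are hypothetical stand-ins so that the example runs without Krati; they are not Krati's API, and the 4-byte big-endian counter encoding is an assumption.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of a phrase counter on top of a byte[] key/value store.
final class PhraseCounter {

    /** Hypothetical minimal store interface; NOT Krati's actual API. */
    interface ByteStore {
        byte[] get(byte[] key);
        void put(byte[] key, byte[] value);
    }

    /** In-memory stand-in for a persistent store, used only to make the sketch runnable. */
    static final class MapStore implements ByteStore {
        // ByteBuffer compares by content, so it works as a map key where byte[] would not.
        private final Map<ByteBuffer, byte[]> map = new HashMap<>();

        public byte[] get(byte[] key) {
            return map.get(ByteBuffer.wrap(key));
        }

        public void put(byte[] key, byte[] value) {
            map.put(ByteBuffer.wrap(key.clone()), value);
        }
    }

    /** Reads the current count for a phrase, adds one, writes it back, and returns the new count. */
    static int increment(ByteStore store, String phrase) {
        byte[] key = phrase.getBytes(StandardCharsets.UTF_8);
        byte[] old = store.get(key);
        int count = (old == null) ? 0 : ByteBuffer.wrap(old).getInt();
        count++;
        store.put(key, ByteBuffer.allocate(Integer.BYTES).putInt(count).array());
        return count;
    }
}
```

Each increment costs one read and one write against the store, so rebuild throughput is bounded by how fast the store can serve that pair of operations.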

Tomislav
