Possible bug in HBaseWD


Ionut

Sep 12, 2011, 12:10:04 PM
to HBaseWD - Distribute Sequential HBase Writes
Hi all HBaseWD users,

I used HBaseWD to distribute keys over a range of
partitions.
I chose RowKeyDistributorByHashPrefix to create the key buckets
and set maxBuckets to 32.
I then built a histogram of the key distribution and saw that the keys
for buckets 0 and 16 are all counted in bucket 16, while bucket 0 has 0
keys. Could you confirm whether this is a bug, or am I using HBaseWD
incorrectly?
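
For reference, the kind of histogram check I mean can be sketched as below. This is only an illustration: the class BucketHistogram and the method bucketOf are hypothetical stand-ins for a one-byte prefix hash, not HBaseWD's actual code, and the 4-byte integer keys are made-up test data.

```java
import java.nio.ByteBuffer;

public class BucketHistogram {
    // Hypothetical stand-in for a one-byte prefix hash; NOT HBaseWD's implementation.
    static int bucketOf(byte[] key, int maxBuckets) {
        int sum = 0;
        for (byte b : key) {
            sum += b & 0xff; // widen each byte to an int in 0..255
        }
        return sum % maxBuckets;
    }

    // Count how many keys land in each bucket.
    static int[] histogram(byte[][] keys, int maxBuckets) {
        int[] counts = new int[maxBuckets];
        for (byte[] key : keys) {
            counts[bucketOf(key, maxBuckets)]++;
        }
        return counts;
    }

    public static void main(String[] args) {
        int maxBuckets = 32;
        // Made-up test data: the integers 0..999 encoded as 4-byte keys.
        byte[][] keys = new byte[1000][];
        for (int i = 0; i < keys.length; i++) {
            keys[i] = ByteBuffer.allocate(4).putInt(i).array();
        }
        int[] counts = histogram(keys, maxBuckets);
        for (int bucket = 0; bucket < maxBuckets; bucket++) {
            System.out.println(bucket + "\t" + counts[bucket]);
        }
    }
}
```

With a correct hash, every bucket index from 0 to maxBuckets - 1 should be reachable, so an empty bucket 0 next to an overfull bucket 16 stands out immediately in the printed counts.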

Regards,
Ionut

Alex Baranau

Sep 13, 2011, 5:36:48 AM
to HBaseWD - Distribute Sequential HBase Writes
Hi Ionut,

I guess you are using OneByteSimpleHash. I confirmed the problem with
a unit test, then fixed and committed it: https://github.com/sematext/HBaseWD/issues/7.

Thank you for reporting the issue!
Alex.

Ionut

Sep 20, 2011, 6:57:47 AM
to HBaseWD - Distribute Sequential HBase Writes
Hi!

I think the problem is not solved yet.
I had an original key like
[0,0,0,0,1,80,-111,1,-35,-28,93,120,-123,-40,-63,-113,5,56,126,-87,-90,-44,-125,107,-105,53,-43,-48]
with mod=48.
The bucket for this key comes out negative.
I think the mask is wrong:
(b & 0xff) = b
I think (b & 0x7f) is correct, because that way you
only get positive values.
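
To make the mask question concrete, here is a minimal sketch of how Java handles a signed byte under each mask and under the % operator. The class SignedByteMaskDemo and its methods are hypothetical names used for illustration only, not HBaseWD's code. It uses the byte -111 from the key above with mod=48.

```java
public class SignedByteMaskDemo {
    // Widen the byte to an int, keeping all 8 bits: result is in 0..255.
    static int maskFf(byte b) {
        return b & 0xff;
    }

    // Keep only the low 7 bits: result is in 0..127.
    static int mask7f(byte b) {
        return b & 0x7f;
    }

    public static void main(String[] args) {
        byte b = -111;
        int mod = 48;

        System.out.println(maskFf(b));             // 145
        System.out.println((byte) maskFf(b));      // -111: casting back to byte restores the sign
        System.out.println(mask7f(b));             // 17
        System.out.println(b % mod);               // -15: % keeps the sign of the dividend
        System.out.println(Math.floorMod(b, mod)); // 33: always in 0..mod-1 for positive mod
    }
}
```

So (b & 0xff) only equals b again if the result is narrowed back to a byte; as an int it is non-negative. The 0x7f mask does guarantee a non-negative value even after a cast to byte, at the cost of dropping the top bit. Another option, assuming Java 8 or later, is to leave the hash alone and use Math.floorMod for the final bucket index.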

Also, I ran the performance test I had talked about, and it seems
that your solution is 2 times faster than using Arrays.hashCode.

Regards,
Ionut

Alex Baranau

Oct 4, 2011, 2:52:58 AM
to HBaseWD - Distribute Sequential HBase Writes
Hi!

Sorry for the delay. Fixed this one and committed.

Alex.