performance comparison with HBase and a possible bug

Mikhail

Feb 13, 2008, 8:36:26 AM
to Hypertable User, apon...@gmail.com
Hi Everyone,

First of all, I'd like to thank Hypertable developers for releasing
such a promising high-performance product as open source.

I am currently evaluating Hypertable (and HBase) for use in my
project, and I've been trying to compare the performance of Hypertable
and HBase by implementing the random reads / random writes benchmark
from the Bigtable paper. Here are the performance numbers I got in
local mode on my 2.2 GHz Centrino Duo laptop under Ubuntu Feisty:
Random writes: 300000 1000-byte rows in 19.278 seconds, 15561.6 rows
per second
Random reads: 300000 1000-byte rows in 250.801 seconds, 1196.2 rows
per second

But I found out that on the order of 1,500 rows were routinely missing
from the table during the reading phase. This was also confirmed by the
dump command:
$ echo "select * from PerfEval;" | bin/hypertable --batch | wc -l
298332
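(Assuming one output line per row, that is 1,668 lines short of the
expected 300,000, in line with the roughly 1,500 missing rows.)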

I'll post the code I used for this evaluation to the files section of
this Google group.

Could this be a bug in Hypertable or in the way I'm using the API?
Also, what is the clean way to shut down a local instance of
Hypertable? If I just use bin/kill-servers.sh (which, as far as I can
see, just sends a SIGKILL and does not look like a good way to shut
down a database), I do not see any tables the next time I start the
servers with bin/start-all-servers.sh local.

Thanks,
Mikhail Bautin
Department of Computer Science
Stony Brook University, Stony Brook, NY

Doug Judd

Feb 13, 2008, 8:56:52 AM
to hyperta...@googlegroups.com
Hi Mikhail,

Could you include everything that I need to reproduce the problem in the file that you upload?  Include any source code for your test, the random data generator, and any scripts that you used.  I'll take a look at this today.

As far as clean shutdown goes, there currently isn't one.  This is why we released the software as "alpha".  We're currently working on recovery and as soon as it's in place, we'll move from "alpha" to "beta".

- Doug

Doug Judd

Feb 15, 2008, 2:52:55 AM
to hyperta...@googlegroups.com, apon...@gmail.com
Hi Mikhail,

I've successfully recreated the problem that you report here.  I'm actively working on it now, and as soon as I get to the bottom of it, I'll check in a fix and push out a new release.  Two things I want to clarify: 1) you're using the 'local' broker, not the 'hadoop' broker, correct?  In other words, you're starting the servers with something like:

./bin/start-all-servers.sh local

And 2) I'm creating your table with:

create table PerfEval (
  Field
);

Is this the same create table statement that you're using?

Thank you very much for reporting this error.  Your help is much appreciated.

- Doug

Doug Judd

Feb 16, 2008, 7:49:20 PM
to hyperta...@googlegroups.com, apon...@gmail.com
Hi Mikhail,

The problem that you're seeing here has to do with an error in the perf_eval.cc code that you wrote.  The code sets the timestamp to 28376234L with the following line:

  uint64_t timestamp = 28376234L;  // A common arbitrary timestamp for all rows.

and then calls the TableMutator::set method with this value without ever changing it:

        mutator_ptr->set(timestamp, key, value.c_str(), value.length());

This is a problem: the system relies heavily on timestamps to handle concurrent updates during compaction, and it expects supplied timestamps to be in increasing order with no collisions.  The best way to fix this is to supply the value 0 for the timestamp, like the following:

        mutator_ptr->set(0, key, value.c_str(), value.length());

The system will autogenerate a non-colliding timestamp for each inserted value.  This should prevent updates from getting dropped.  I've added a new TableMutator::set method that does not take a timestamp argument and autogenerates one instead.  I've also put a big comment at the top of the original set method indicating that its use is discouraged.  This will show up in the next release.
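
For context, here is a minimal sketch of what the write loop looks like with autogenerated timestamps (the headers and the KeySpec setup below are just illustrative, not the exact code from perf_eval.cc):

#include <string>

#include "Hypertable/Lib/Client.h"  // header path may differ in your checkout

// Sketch of a write loop that lets the system pick timestamps.  The
// KeySpec population is elided because it depends on how the random
// row keys are generated.
void write_rows(Hypertable::TableMutatorPtr &mutator_ptr, int num_rows) {
  std::string value(1000, 'x');  // 1000-byte payload per row
  Hypertable::KeySpec key;
  for (int i = 0; i < num_rows; ++i) {
    // ... fill in key (row key and the "Field" column) ...

    // Passing 0 tells the system to autogenerate a non-colliding,
    // monotonically increasing timestamp for this insert.
    mutator_ptr->set(0, key, value.c_str(), value.length());
  }
  mutator_ptr->flush();  // push out any buffered updates
}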

One other thing: the Bigtable paper makes no mention of the use of compression during the test.  I'm guessing that compression was disabled.  To disable compression in Hypertable, you can do the following:

1. Add the following property to conf/hypertable.cfg:
Hypertable.RangeServer.CommitLog.Compressor=none

2. Add a COMPRESSOR="none" option to your CREATE TABLE statement:
create table COMPRESSOR="none" PerfEval (
  Field
);

Please let me know if you see any more problems.

- Doug

Mikhail

Mar 12, 2008, 11:53:41 PM
to Hypertable User
Hi Doug,

Thank you for your answer, and sorry for taking so much time to respond.
I re-ran the test with the corrections you suggested, and everything
is working correctly.

$ ./perf_eval write
Evaluating random writes performance
Random writes: 300000 1000-byte rows in 21.391 seconds, 14024.9 rows
per second
$ ./perf_eval read
Evaluating random reads performance
Random reads: 300000 1000-byte rows in 250.601 seconds, 1197.1 rows
per second

After disabling compression as you described, I got the following
results:
$ ./perf_eval write
Evaluating random writes performance
Random writes: 300000 1000-byte rows in 25.668 seconds, 11687.5 rows
per second
$ ./perf_eval read
Evaluating random reads performance
Random reads: 300000 1000-byte rows in 248.167 seconds, 1208.9 rows
per second

When I removed the compression settings (supposedly turning
compression back on) and re-ran the write test, I got:
$ ./perf_eval write
Evaluating random writes performance
Random writes: 300000 1000-byte rows in 17.947 seconds, 16715.5 rows
per second

I guess the random write test runs faster with compression than
without it because the bottleneck is the log writing speed, and
compression reduces the amount of data written.
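
As a rough sanity check on that guess: 300,000 rows x 1,000 bytes is
about 300 MB of raw values, so the uncompressed run wrote roughly
11.7 MB/s while the compressed run achieved roughly 16.7 MB/s of
logical throughput, which is consistent with the physical log write
being the limiting factor once the highly compressible values shrink
on their way to disk.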

Also, is there currently a Java API for Hypertable, or are there
plans to create one? I could not find anything relevant in src/java
in the source tree. It would be good to have a Java API that is
compatible with HBase's, so that HBase could easily be replaced with
Hypertable and vice versa.

Thanks,
Mikhail Bautin

Doug Judd

Mar 13, 2008, 12:09:32 AM
to hyperta...@googlegroups.com
Hi Mikhail,

Thanks for posting these numbers.  The Bigtable paper includes the following statement:

"We wrote a single string under each row key.  Each string was generated randomly and was therefore uncompressible."

I believe the value generated in your code was a string of all the same character, which would be highly compressible.  I think your disabled compression numbers are a little more apples-to-apples with the published Bigtable numbers.
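
If you re-run the test, here is a quick sketch of one way to generate an incompressible value (just an illustration, not code from perf_eval.cc):

#include <cstdlib>
#include <string>

// Fill the value with random bytes so the compressor cannot shrink it,
// matching the Bigtable paper's setup.  rand() is good enough for
// benchmark data; call srand() once at startup to vary it across runs.
std::string random_value(size_t len = 1000) {
  std::string value(len, '\0');
  for (size_t i = 0; i < len; ++i)
    value[i] = static_cast<char>(rand() & 0xFF);
  return value;
}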

Also, we're definitely planning to add a Java API.  It will be available in the 1.0 release.

BTW, would you mind if we included your code in the source tree under examples?  You'd just have to agree to release it under GPL 2.0.

And if you have any interest in pitching in on this project further, let me know.

- Doug

Luke

Mar 13, 2008, 12:19:22 AM
to Hypertable User
Thanks for the note, Mikhail,

I suspect these numbers are from an older release. The current release
(0.9.0.4) or git master should see much higher random read performance
(up to a 3-4x speedup).

As Doug pointed out, it seems you're using the same character for
values, while the Bigtable paper used random values. No wonder it
compresses so well. You might want to revise the test a bit to match
the benchmark settings of the Bigtable paper. Also, HBase has
compression turned off by default (it actually uses Hadoop's MapFile
compression settings).

Yes, we're planning to release a Thrift interface with multiple
language bindings (including Java), similar to what HBase has now.

__Luke


esvee

Apr 9, 2008, 1:39:07 AM
to Hypertable User
Hi Doug,

Are the code and data published? A link would be appreciated.

Thanks!
