|Maximizing throughput||Bryan Keller||1/10/13 8:31 PM|
I am attempting to configure HBase to maximize throughput, and have noticed some bottlenecks. In particular, with my configuration, write performance is well below theoretical throughput. I have a test program that inserts many rows into a test table. Network I/O is less than 20% of max, and disk I/O is even lower, maybe around 5% of max on all boxes in the cluster. CPU is well below 50% of max on all boxes. I do not see any I/O waits or anything in particular that raises concerns. I am using iostat and iftop to measure throughput. To determine the theoretical max, I used dd and iperf. I have spent quite a bit of time optimizing the HBase config parameters, tuning GC, etc., and am familiar with the HBase book online and such.
|RE: Maximizing throughput||Anoop Sam John||1/10/13 8:40 PM|
|Re: Maximizing throughput||anil gupta||1/10/13 8:42 PM|
Is flushCommits true or false?
Thanks & Regards,
|Re: Maximizing throughput||anil gupta||1/10/13 8:47 PM|
Sorry, I meant to ask about "setAutoFlush". Is setAutoFlush true or false?
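For context on why this question matters: with auto-flush on, each put() goes to the server in its own round trip, while with auto-flush off, puts accumulate in a client-side write buffer and are shipped in bulk when the buffer fills. A toy sketch of that trade-off (plain Python with illustrative names, not the actual HBase API):

```python
class ToyClient:
    """Illustrative stand-in for a client-side write buffer (not HBase)."""

    def __init__(self, auto_flush=True, buffer_size=1000):
        self.auto_flush = auto_flush
        self.buffer_size = buffer_size
        self.buffer = []
        self.round_trips = 0  # RPCs that would hit the server

    def put(self, row):
        self.buffer.append(row)
        if self.auto_flush or len(self.buffer) >= self.buffer_size:
            self.flush()

    def flush(self):
        if self.buffer:
            self.round_trips += 1  # one RPC carries the whole buffer
            self.buffer.clear()

flushing = ToyClient(auto_flush=True)
buffered = ToyClient(auto_flush=False, buffer_size=1000)
for i in range(10_000):
    flushing.put(i)
    buffered.put(i)
buffered.flush()  # drain any remainder

# flushing makes 10000 round trips; buffered makes only 10
```

Batching puts in the application (as Bryan does below) gets much of the same benefit, which is why the default auto-flush setting can still perform well with batched inserts.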
|Re: Maximizing throughput||Asaf Mesika||1/11/13 5:03 AM|
I've done similar work a couple of months ago. Start by sharing more details on your program, HBase setup, and the way you measure network and disk bottlenecks.
Also, have you isolated network and disk on all nodes and between all nodes (each pair of nodes)? Test them separately and give us those numbers.
Next, do a copyFromLocal to HDFS from the master on a file that is at least the size of your machine's memory (to make sure you write to disk and not to Linux memory). Tell us the copy throughput.
Sent from my iPhone
|Re: Maximizing throughput||Bryan Keller||1/11/13 9:37 AM|
Thanks for the responses. I'm running HBase 0.92.1 (Cloudera CDH4).
The program is very simple, it inserts batches of rows into a table via multiple threads. I've tried running it with different parameters (column count, threads, batch size, etc.), but throughput didn't improve. I've pasted the code here: http://pastebin.com/gPXfdkPy
I have auto flush on (the default), as I am inserting rows in batches and so don't need the internal HTable write buffer.
I've posted my config as well: http://pastebin.com/LVG9h6Z4
The regionservers have 12 cores (24 with HT), 128 GB RAM, and 6 SCSI drives; max throughput is 90-100 MB/sec on a drive. I've also tested this on an EC2 High I/O instance type with 2 SSDs, 64 GB RAM, and 8 cores (16 with HT). Both the EC2 and my colo cluster have the same issue of seemingly underutilizing resources.
I measure disk usage using iostat and measured the theoretical max using hdparm and dd. I use iftop to monitor network bandwidth usage, and used iperf to test the theoretical max. For CPU usage I use top and iostat.
The maximum write performance I'm getting is usually around 20 MB/sec on a drive (this is my colo cluster) on each of the 2 data nodes. That's about 20% of the max, and it is only sporadic, not a consistent 20 MB/sec per drive. Network usage also seems to top out around 20% (200 Mbit/sec) to each node. CPU usage on each node is around 10%. The problem is more pronounced on EC2, which has much higher theoretical limits for storage and network I/O.
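A quick unit check on those percentages (assuming 1 Gbit/s NICs and decimal units):

```python
net_mbit = 200                  # observed per-node network usage, Mbit/s
link_mbit = 1000                # assumed 1 Gbit/s NIC
net_fraction = net_mbit / link_mbit   # 0.2 -> the ~20% network figure
net_MBs = net_mbit / 8                # 25.0 MB/s of ingest per node

drive_obs, drive_max = 20, 100        # MB/s, observed vs. dd/hdparm max per drive
disk_fraction = drive_obs / drive_max # 0.2 -> the ~20% disk figure

print(net_fraction, net_MBs, disk_fraction)
```

Roughly 25 MB/s arriving over the network against roughly 20 MB/s reaching a drive is at least consistent with the network, not the disks, being the gate.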
Copying a 133 GB file to HDFS gives similar performance to HBase (sporadic disk usage topping out at 20%, low CPU, 30-40% network I/O), so it seems this is more of an HDFS issue than an HBase issue.
|Re: Maximizing throughput||Bryan Keller||1/15/13 9:28 AM|
I'll follow up on this in case it is useful to anyone. It seems I was network I/O limited. The switch I was using was in managed mode, which decreased throughput to 1 Gbit/sec within the switch, not just on the wire. So with replication set to 2, throughput was about half of the theoretical max on a given box (client -> switch -> datanode 1 -> switch -> datanode 2). It was an eye opener that I was network I/O limited. I will probably move to a 10 Gbit/sec switch and/or use bonded NICs.
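The halving is easy to reconstruct (assuming the 1 Gbit/s in-switch limit described here, decimal units): with replication set to 2, every ingested byte crosses the switch twice, once from the client to datanode 1 and once from datanode 1 to datanode 2, so the two crossings share the same 1 Gbit/s budget.

```python
line_rate = 1000 / 8    # 1 Gbit/s switch limit, in MB/s (= 125.0)
crossings = 2           # client -> DN1, then DN1 -> DN2 (replication = 2)
effective = line_rate / crossings
print(effective)        # 62.5 MB/s per box, about half the theoretical max
```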
|Re: Maximizing throughput||Andrew Purtell||1/15/13 9:48 AM|
Thanks Bryan, really appreciate you letting us know the outcome. I'm sure
it will be useful to others.
Problems worthy of attack prove their worth by hitting back. - Piet Hein
(via Tom White)
|Re: Maximizing throughput||anil gupta||1/15/13 12:04 PM|
Nice that you figured out the bottleneck.
However, with your current hardware configuration, disk I/O might become your bottleneck in the future, since you have only 6 disks and 12 cores. Try to bring the cores-to-disks ratio (no. of cores / no. of disks) closer to 1 for greater throughput and better utilization of hardware resources.
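In numbers, the ratio being recommended (a trivial calculation, using the node specs quoted in this thread):

```python
def core_disk_ratio(cores: int, disks: int) -> float:
    return cores / disks

colo = core_disk_ratio(12, 6)              # Bryan's colo nodes: 2.0 (disk-light)
anil = round(core_disk_ratio(12, 11), 2)   # the 12-core/11-disk nodes: 1.09

print(colo, anil)
```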
I have a 7 node (2 admin and 5 worker) HBase cluster, with each node having 12 cores, 11 disks, and 48 GB RAM. After tuning the GC and other parameters, I have easily achieved a write load of 2200 requests per second per node (consistently, over a 5 hour load test loading 200 million rows). I have yet to test the upper limit of the write load, and I haven't done a read load test yet. I used the PerformanceEvaluation utility of HBase for this.
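Those figures are self-consistent (assuming the 200 million rows were spread across the 5 worker nodes):

```python
reqs_per_sec_per_node = 2200
seconds = 5 * 3600          # 5 hour load test
worker_nodes = 5
total_rows = reqs_per_sec_per_node * seconds * worker_nodes
print(total_rows)           # 198,000,000 -> roughly the 200 million quoted
```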