OpenTSDB/HBase write performance

Steve Bischoff

Jun 20, 2017, 6:28:54 PM

to OpenTSDB

I am running a three-node cluster with OpenTSDB, and we are pushing around 15K points per second through a TSD running on the HBase master. I have heard that a single TSD can easily handle that, but we are running into some issues.

After about 50 to 70 million points written (an hour or so, fairly stable), I start seeing this in the HBase regionserver logs:

2017-06-20 21:04:44,319 WARN [B.defaultRpcServer.handler=4,queue=1,port=16020] regionserver.MultiVersionConcurrencyControl: STUCK: MultiVersionConcurrencyControl{readPoint=18008, writePoint=18038}

The master server's CPU then starts spiking and its memory usage grows quickly. HBase shows a spike of 25K points written per second. After a couple of minutes of this, the regionserver memstores overload and writes drop to zero, with this in the HBase regionserver log:

2017-06-20 22:03:37,655 WARN [B.defaultRpcServer.handler=22,queue=1,port=16020] regionserver.MemStoreFlusher: Memstore is above high water mark and block 65020ms

I pre-split the regions to about 15 on each regionserver and the writes are fairly evenly distributed.
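
As a rough sanity check on the figures above, here is a minimal Python sketch. It assumes a steady 15K points per second and, as a guess not stated in the post, that each of the three nodes runs a regionserver. The function names are illustrative only.

```python
# Back-of-the-envelope check of the write rates described in the post.
# Assumptions: the ingest rate is steady, and all three nodes run a
# regionserver (the post only says "three node cluster").

POINTS_PER_SECOND = 15_000   # from the post
REGIONSERVERS = 3            # assumption: one regionserver per node
REGIONS_PER_SERVER = 15      # from the post ("about 15")


def points_written(seconds: int, rate: int = POINTS_PER_SECOND) -> int:
    """Total points written after `seconds` at a steady `rate`."""
    return seconds * rate


def minutes_to_reach(points: int, rate: int = POINTS_PER_SECOND) -> float:
    """Minutes needed to write `points` at a steady `rate`."""
    return points / rate / 60


def per_region_rate(rate: int = POINTS_PER_SECOND,
                    servers: int = REGIONSERVERS,
                    regions_per_server: int = REGIONS_PER_SERVER) -> float:
    """Average points per second per region, if writes are spread evenly."""
    return rate / (servers * regions_per_server)


if __name__ == "__main__":
    print("points in one hour:", points_written(3600))
    print("minutes to 50M points:", round(minutes_to_reach(50_000_000), 1))
    print("minutes to 70M points:", round(minutes_to_reach(70_000_000), 1))
    print("points/s per region:", round(per_region_rate(), 1))
```

Under these assumptions, 50 to 70 million points corresponds to roughly 55 to 78 minutes of writes, which matches the "hour or so" above, and an even spread works out to a few hundred points per second per region.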

Am I at the limits of my cluster, or is something configured incorrectly?

Any help would be appreciated.

Avind

Jun 22, 2017, 1:36:09 AM

to OpenTSDB

There is a lengthy discussion of the same issue here:

http://apache-hbase.679495.n3.nabble.com/Hbase-regionserver-MultiVersionConcurrencyControl-Warning-td4081591.html

It could be due to a compaction being triggered while writes are ongoing on the same region.

Steve Bischoff

Jun 22, 2017, 9:55:27 AM

to OpenTSDB

Is there work being done to minimize this? It seems like the cluster could handle quite a few more writes if it weren't for this compaction load.

I checked out the link, and it looks like there are some options for disabling compaction. I will try that.
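
As a sketch of what such an option can look like on the OpenTSDB side: TSD row compactions are controlled by the `tsd.storage.enable_compaction` property in opentsdb.conf. Whether this is the exact option meant in the linked thread is an assumption, and it does not change HBase's own compactions.

```properties
# opentsdb.conf (sketch): turn off TSD row compactions.
# Data points then stay as individual cells, which costs more storage space.
tsd.storage.enable_compaction = false
```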

Thanks for the help

Avind

Jun 23, 2017, 1:37:48 AM

to OpenTSDB

We did try append mode (available since 2.2), which is meant to avoid the compaction, but it performed worse, so we switched back. (We did not investigate in detail at the time, for example by bumping the hardware for the boxes.)
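
For reference, append mode is switched on per TSD in opentsdb.conf. A minimal sketch, assuming OpenTSDB 2.2 or later:

```properties
# opentsdb.conf (sketch): write data points with HBase appends
# instead of separate cells that are compacted later.
tsd.storage.enable_appends = true
```

The OpenTSDB configuration notes say that appends avoid the hourly compaction but can use more resources on HBase, which is consistent with the worse performance described here.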

It looks like there is nothing specific on the horizon on that front: http://opentsdb.net/docs/build/html/new.html#planned

Ultimately, we avoided the problem by not using HTTP for writes and instead writing to HBase over RPC, leveraging the TSDB snapshot in our Spark programs (something similar to what was attempted in the discussion linked earlier).

ManOLamancha

Jul 6, 2017, 5:54:21 PM

to OpenTSDB

There is work going on around a coprocessor for HBase that performs appends without the read-modify-write workload of the current implementation. It's saving tons of space at compaction time and CPU at write time, and we hope to have it out by Q3.
