metasync vs tsd.core.meta.enable_realtime_ts=true

Izak Marais

Oct 20, 2016, 6:36:55 AM

to OpenTSDB

Hi All,

We are load testing a distributed OpenTSDB setup after moving from a single node. Setting tsd.core.meta.enable_realtime_ts=true is a problem: it limits write performance to 17k dps on a beefy 4-node cluster. With it disabled we handle 3-4 times as much.

This is aggravated by our inability to pre-split the tsdb-meta table; the tsdb-meta key-space is an undocumented black box. So the write load is not as evenly distributed as it is for the pre-split tsdb table.

We really want to use the Grafana template variable features, which require metadata.

Does anyone have experience with this?
Suggestions for presplitting tsdb-meta?
Is it feasible to run the command-line tsdb uid metasync every hour instead of using tsd.core.meta.enable_realtime_ts=true? Other posts seem to indicate that this can also have problems.

Thanks for any tips.

Izak

Jonathan Creasy

Oct 20, 2016, 11:16:06 AM

to Izak Marais, OpenTSDB

1) Yes, lots of us have experience with this. Something is wrong with your node if your write performance maxes out at 17k dps; I have a VM here on my MacBook doing nearly twice that for testing.

2) Pre-splitting your table is recommended: http://opentsdb.net/docs/build/html/user_guide/writing.html#pre-split-hbase-regions (some examples of that: https://github.com/OpenTSDB/opentsdb/issues/538)

3) Yes, I would use metasync. I believe some folks run a separate node for this purpose.
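
A periodic metasync run like the one discussed here could be scripted roughly as follows. This is only a sketch: the `tsdb uid metasync` command comes from the thread, while the `TSDB_BIN` location, the one-hour interval and the helper names are assumptions made for illustration.

```python
import subprocess
import time

TSDB_BIN = "tsdb"        # assumption: the tsdb wrapper script is on PATH
INTERVAL_SECONDS = 3600  # assumption: one run per hour, as asked in the thread


def metasync_command(tsdb_bin=TSDB_BIN):
    """Return the argv list for a single metasync run."""
    return [tsdb_bin, "uid", "metasync"]


def run_metasync(runner=subprocess.run):
    """Run metasync once; return True when the process exits with status 0."""
    result = runner(metasync_command(), check=False)
    return result.returncode == 0


def run_forever(interval=INTERVAL_SECONDS):
    """Run metasync repeatedly, sleeping between runs."""
    while True:
        ok = run_metasync()
        print("metasync", "succeeded" if ok else "failed")
        time.sleep(interval)
```

Calling `run_forever()` on a dedicated node would start the loop; a cron entry that invokes the same command once an hour is an equally reasonable choice.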

Izak Marais

Oct 21, 2016, 3:19:07 AM

to Jonathan Creasy, Izak Marais, OpenTSDB

Hi Jonathan,

Thank you very much for the reply!

1) Is that performance in your test VM with enable_realtime_ts=true set? The problem is that setting it to true causes a high write load on tsdb-meta. With this table unsplit, the HBase regionserver node hosting it runs out of heap. With enable_realtime_ts=false we got a 10x write performance increase yesterday.

2) Yes, we did use those pre-split guides and scripts for the tsdb table, but they don't explain how to apply them to the tsdb-meta table. The tsdb-meta key-space is undocumented.

3) OK, thanks, that gives me hope that metasync will solve our problems!

Regards

Izak

ManOLamancha

Dec 19, 2016, 10:42:07 PM

to OpenTSDB, jona...@ghostlab.net, izakm...@yahoo.com

On Friday, October 21, 2016 at 12:19:07 AM UTC-7, Izak Marais wrote:

1) Is that performance in your test VM with enable_realtime_ts=true set? The problem is that setting it to true causes a high write load on tsdb-meta. With this table unsplit, the HBase regionserver node hosting it runs out of heap. With enable_realtime_ts=false we got a 10x write performance increase yesterday.

Yeah, it's crappy. One thing you can try is to enable "tsd.core.meta.enable_tsuid_tracking" and then disable "tsd.core.meta.enable_tsuid_incrementing". That only writes puts instead of atomic increments and should help the write rate a bit. But it will require a manual meta-sync periodically to generate the meta data objects. Our HBase engineers came up with a way to improve those atomic increments a ton (along with appends), and when that makes it to open source it will help meta a ton.
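
For reference, the combination of settings described above might be written out like this. It is a sketch of one reading of the advice, not a tested configuration: the two tsuid properties are named in the reply, while turning `tsd.core.meta.enable_realtime_ts` off is an assumption inferred from the need for a manual meta-sync. The helper name `render_properties` is made up for the example.

```python
# Property names are taken from the thread; the values reflect one reading of the advice.
META_SETTINGS = {
    # Assumption: realtime TSMeta creation is switched off, since the reply
    # says a periodic manual meta-sync is needed to generate the objects.
    "tsd.core.meta.enable_realtime_ts": False,
    # Per the reply: track TSUIDs with plain puts...
    "tsd.core.meta.enable_tsuid_tracking": True,
    # ...instead of atomic increments.
    "tsd.core.meta.enable_tsuid_incrementing": False,
}


def render_properties(settings):
    """Render the settings as key = value lines, sorted by key."""
    lines = []
    for key in sorted(settings):
        value = settings[key]
        if isinstance(value, bool):
            value = "true" if value else "false"
        lines.append(f"{key} = {value}")
    return "\n".join(lines)


print(render_properties(META_SETTINGS))
```

The printed lines are in the `key = value` form used by a properties file such as opentsdb.conf.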

2) Yes, we did use those pre-split guides and scripts for the tsdb table, but they don't explain how to apply them to the tsdb-meta table. The tsdb-meta key-space is undocumented.

I'll try and doc that soon. In the meantime, the meta table row key is just the TSUID. That means it starts with the metric UID, followed by the first tag key UID and its paired tag value UID, followed by the other tag pairs. So you can split on the metric values. If you have random metric assignments enabled, split on the full 3 bytes. If you have the default incrementing UIDs enabled, then split up to your max UID.
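
The splitting advice above can be sketched as a small key generator. This is a minimal illustration under stated assumptions: the 3-byte metric UID width comes from the reply, while the region count, the example max UID of 1000 and the function names are hypothetical. The escaped output is meant for pasting into a SPLITS list in the HBase shell, which is assumed to accept hex escapes in double-quoted strings.

```python
UID_WIDTH = 3  # bytes in a metric UID ("the full 3 bytes" in the reply above)


def split_points(num_regions, max_uid=None, width=UID_WIDTH):
    """Return up to num_regions - 1 split keys, each a big-endian metric UID.

    max_uid=None spreads the keys over the whole UID space (random UIDs);
    otherwise they are spread over 0..max_uid (incrementing UIDs).
    """
    upper = (1 << (8 * width)) if max_uid is None else max_uid + 1
    keys = []
    for i in range(1, num_regions):
        uid = i * upper // num_regions
        key = uid.to_bytes(width, "big")
        # Skip an all-zero key and duplicates, which appear when max_uid is small.
        if uid > 0 and key not in keys:
            keys.append(key)
    return keys


def hbase_shell_literal(key):
    """Format a key with hex escapes inside a double-quoted string."""
    return '"' + "".join(f"\\x{b:02X}" for b in key) + '"'


# Random metric UIDs: four regions over the full 3-byte space.
for key in split_points(4):
    print(hbase_shell_literal(key))

# Incrementing UIDs: four regions up to a hypothetical max UID of 1000.
for key in split_points(4, max_uid=1000):
    print(hbase_shell_literal(key))
```

Because every tsdb-meta row key begins with the metric UID, a split key consisting of only those leading bytes is enough to route all time series of a metric to one region.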

john....@skyscanner.net

Dec 21, 2016, 9:17:15 AM

to OpenTSDB, jona...@ghostlab.net, izakm...@yahoo.com

Are the atomic update/append performance improvements made to the asynchbase library or to HBase itself?

ManOLamancha

Jan 7, 2017, 7:41:57 PM

to OpenTSDB, jona...@ghostlab.net, izakm...@yahoo.com

On Wednesday, December 21, 2016 at 6:17:15 AM UTC-8, john....@skyscanner.net wrote:

Are the atomic update/append performance improvements made to the asynchbase library or to HBase itself?

The improvements are a co-processor our engineers wrote that simply maintains the list of edits to HBase columns and plays them back at query time. During HBase's compactions the edits are squashed into the final result. We tested it just before the holidays and it was really impressive, helping us to store almost 10x the data in block cache compared to straight TSD writes without TSD compaction. And it didn't have the same CPU overhead as normal HBase appends. The trade-off is greater storage utilization until HBase compactions have completed.
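
The idea described above (store edits, replay them on read, squash them on compaction) can be illustrated with a toy model. This is purely conceptual and is not the co-processor itself, whose code is not shown in this thread; the class and method names are invented for the example.

```python
class EditLogCell:
    """Toy model of one counter column that stores edits instead of a running total."""

    def __init__(self):
        self.edits = []  # pending increments, appended without reading the old value

    def increment(self, delta=1):
        """Write path: append only, no read-modify-write."""
        self.edits.append(delta)

    def read(self):
        """Query path: replay every stored edit to get the current value."""
        return sum(self.edits)

    def compact(self):
        """Compaction: squash all edits into a single entry holding the result."""
        self.edits = [sum(self.edits)]


cell = EditLogCell()
cell.increment()
cell.increment(5)
print(cell.read(), len(cell.edits))  # value and number of stored edits before compaction
cell.compact()
print(cell.read(), len(cell.edits))  # same value, one stored entry after compaction
```

The model also shows the trade-off mentioned in the reply: until `compact()` runs, every edit occupies its own slot in storage.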

john....@skyscanner.net

Jan 17, 2017, 6:25:14 AM

to OpenTSDB, jona...@ghostlab.net, izakm...@yahoo.com

This is really encouraging. Do you have any idea of a timeline for it being open sourced?

ManOLamancha

Jan 22, 2017, 5:23:14 PM

to OpenTSDB, jona...@ghostlab.net, izakm...@yahoo.com

On Tuesday, January 17, 2017 at 3:25:14 AM UTC-8, john....@skyscanner.net wrote:

This is really encouraging. Do you have any idea of a timeline for it being open sourced?

Not quite yet. The engineers who created it are on a huge crunch right now that's due at the end of the quarter, so I hope to poke them into OSSing it after that.
