metasync vs tsd.core.meta.enable_realtime_ts=true


Izak Marais

Oct 20, 2016, 6:36:55 AM
to OpenTSDB
Hi All, 

We are load testing a distributed OpenTSDB setup after moving from a single node. Setting tsd.core.meta.enable_realtime_ts=true is a problem: it limits write performance to 17k dps on a beefy 4-node cluster. With it disabled, we handle 3-4 times as much.

This is aggravated by our inability to pre-split the tsdb-meta table; the tsdb-meta key space is an undocumented black box. As a result, the write load is not as evenly distributed as it is for the pre-split tsdb table.

We really want to use the Grafana template variable features, which require metadata.

  1. Does anyone have experience with this? 
  2. Suggestions for presplitting tsdb-meta? 
  3. Is it feasible to run the command-line tsdb uid metasync every hour instead of using tsd.core.meta.enable_realtime_ts=true? Other posts seem to indicate that this can also have problems.

Thanks for any tips.

Izak

Jonathan Creasy

Oct 20, 2016, 11:16:06 AM
to Izak Marais, OpenTSDB
1) Yes, lots of us have experience with this. Something is wrong with your node if your write performance maxes out at 17k dps; I have a VM here on my MacBook doing nearly twice that for testing.
3) Yes, I would use metasync. I believe some folks run a separate node for this purpose.
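
For anyone who wants to try the periodic approach from question 3, here is a minimal Python sketch of a scheduler around the `tsdb uid metasync` command mentioned above. It assumes the `tsdb` wrapper script is on the PATH and needs no extra flags in your setup; the function names `metasync_command`, `run_metasync`, and `run_every` are made up for illustration, and a plain cron entry would do the same job.

```python
import subprocess
import time


def metasync_command(tsdb_bin="tsdb"):
    """Return the argv for one metasync pass (assumes no extra flags are needed)."""
    return [tsdb_bin, "uid", "metasync"]


def run_metasync(runner=subprocess.run, tsdb_bin="tsdb"):
    """Run one metasync pass and return its exit code."""
    result = runner(metasync_command(tsdb_bin), check=False)
    return result.returncode


def run_every(interval_seconds, max_runs=None, runner=subprocess.run,
              sleep=time.sleep, tsdb_bin="tsdb"):
    """Run metasync repeatedly, sleeping between passes.

    max_runs=None loops forever; a number stops after that many passes.
    Returns the number of passes that exited with a non-zero code.
    """
    runs = 0
    failures = 0
    while max_runs is None or runs < max_runs:
        if run_metasync(runner=runner, tsdb_bin=tsdb_bin) != 0:
            failures += 1
        runs += 1
        # Only sleep if another pass is still to come.
        if max_runs is None or runs < max_runs:
            sleep(interval_seconds)
    return failures
```

Calling `run_every(3600)` would give the hourly schedule from the question. The `runner` and `sleep` parameters exist only so the loop can be exercised without a real TSD.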

Izak Marais

Oct 21, 2016, 3:19:07 AM
to Jonathan Creasy, Izak Marais, OpenTSDB
Hi Jonathan,

Thank you very much for the reply!

1) Is that performance in your test VM with enable_realtime_ts=true set? The problem is that setting it to true causes a high write load on tsdb-meta. With this table unsplit, the HBase regionserver node hosting it runs out of heap. With enable_realtime_ts=false we got a 10x write performance increase yesterday.
2) Yes, we did use those pre-split guides and scripts for the tsdb table, but they don't explain how to apply them to the tsdb-meta table. The tsdb-meta key space is undocumented.
3) OK, thanks, that gives me hope that metasync will solve our problems!

Regards
Izak

ManOLamancha

Dec 19, 2016, 10:42:07 PM
to OpenTSDB, jona...@ghostlab.net, izakm...@yahoo.com
On Friday, October 21, 2016 at 12:19:07 AM UTC-7, Izak Marais wrote:
1) Is that performance in your test VM with enable_realtime_ts=true set? The problem is that setting it to true causes a high write load on tsdb-meta. With this table unsplit, the HBase regionserver node hosting it runs out of heap. With enable_realtime_ts=false we got a 10x write performance increase yesterday.

Yeah, it's crappy. One thing you can try is to enable "tsd.core.meta.enable_tsuid_tracking" and then disable "tsd.core.meta.enable_tsuid_incrementing". That only writes puts instead of atomic increments and should help the write rate a bit, but it will require a periodic manual metasync to generate the metadata objects. Our HBase engineers came up with a way to improve those atomic increments a ton (along with appends), and when that makes it to open source it will help meta a lot.
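
As a rough illustration of that suggestion, the Python sketch below rewrites an OpenTSDB config text so that the two properties named above are set. The helper `apply_overrides` and the sample config contents are hypothetical; only the two property names come from the reply, and the `key = value` layout is assumed to match your opentsdb.conf.

```python
def apply_overrides(conf_text, overrides):
    """Return conf_text with each key in overrides set to its new value.

    Existing 'key = value' lines are rewritten in place; keys that are
    not present yet are appended at the end. Comments and blank lines
    are preserved.
    """
    remaining = dict(overrides)
    out = []
    for line in conf_text.splitlines():
        stripped = line.strip()
        if stripped and not stripped.startswith("#") and "=" in stripped:
            key = stripped.split("=", 1)[0].strip()
            if key in remaining:
                line = f"{key} = {remaining.pop(key)}"
        out.append(line)
    out.extend(f"{key} = {value}" for key, value in remaining.items())
    return "\n".join(out) + "\n"


# The two settings named in the reply: track TSUIDs with plain puts
# and turn the atomic-increment counters off.
PUT_ONLY_TRACKING = {
    "tsd.core.meta.enable_tsuid_tracking": "true",
    "tsd.core.meta.enable_tsuid_incrementing": "false",
}

# Hypothetical starting config, for illustration only.
example = (
    "tsd.network.port = 4242\n"
    "tsd.core.meta.enable_tsuid_incrementing = true\n"
)
print(apply_overrides(example, PUT_ONLY_TRACKING), end="")
```

Remember that with this setup the meta objects only appear after a metasync pass, so it has to be paired with a periodic run.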

2) Yes, we did use those pre-split guides and scripts for the tsdb table, but they don't explain how to apply them to the tsdb-meta table. The tsdb-meta key space is undocumented.

I'll try to document that soon. In the meantime, the meta table row key is just the TSUID. That means it starts with the metric UID, followed by the first tag key UID and its paired tag value UID, followed by the other tag pairs. So you can split on the metric UIDs. If you have random metric assignments enabled, split on the full 3 bytes. If you have the default incrementing UIDs enabled, then split up to your max UID.
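
To make the split rule concrete, here is a small Python sketch that computes split keys from the metric UID prefix. It assumes the 3-byte metric UID width mentioned above; `metric_split_keys` and `hbase_shell_literal` are illustrative names, and the region count and max UID are placeholders you would replace with your own values.

```python
def metric_split_keys(num_regions, max_uid=None, uid_width=3):
    """Return up to num_regions - 1 split keys, each a metric-UID prefix.

    max_uid=None spreads the splits over the whole uid_width-byte space
    (random metric UIDs); otherwise over 1..max_uid (incrementing UIDs).
    """
    if num_regions < 2:
        return []
    upper = (1 << (8 * uid_width)) if max_uid is None else max_uid + 1
    keys = []
    for i in range(1, num_regions):
        boundary = upper * i // num_regions
        key = boundary.to_bytes(uid_width, "big")
        # Skip a zero boundary and duplicates when max_uid is small.
        if boundary > 0 and (not keys or key != keys[-1]):
            keys.append(key)
    return keys


def hbase_shell_literal(key):
    """Format a key as a hex-escaped, double-quoted HBase shell string."""
    return '"' + "".join(f"\\x{b:02X}" for b in key) + '"'


# Random metric UIDs: split the full 3-byte space into 4 regions.
random_uid_splits = metric_split_keys(4)
print("SPLITS => [" + ", ".join(hbase_shell_literal(k) for k in random_uid_splits) + "]")

# Incrementing UIDs with a (hypothetical) current max UID of 1000.
incrementing_splits = metric_split_keys(4, max_uid=1000)
print("SPLITS => [" + ", ".join(hbase_shell_literal(k) for k in incrementing_splits) + "]")
```

The printed list is meant to be pasted into the `SPLITS` option of an HBase shell `create` statement for tsdb-meta. With incrementing UIDs, metrics created after the split all land in the last region, so leave some headroom in the max UID you pick.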

john....@skyscanner.net

Dec 21, 2016, 9:17:15 AM
to OpenTSDB, jona...@ghostlab.net, izakm...@yahoo.com
Are the atomic update / append performance improvements to the asynchbase library or to HBase itself?

ManOLamancha

Jan 7, 2017, 7:41:57 PM
to OpenTSDB, jona...@ghostlab.net, izakm...@yahoo.com
On Wednesday, December 21, 2016 at 6:17:15 AM UTC-8, john....@skyscanner.net wrote:
Are the atomic update / append performance improvements to the asynchbase library or to HBase itself?

The improvement is a coprocessor our engineers wrote that simply maintains the list of edits to HBase columns and plays them back at query time. During HBase's compactions the edits are squashed into the final result. We tested it just before the holidays and it was really impressive, helping us store almost 10x the data in block cache compared to straight TSD writes without TSD compaction. And it didn't have the same CPU overhead as normal HBase appends. The trade-off is greater storage utilization until HBase compactions have completed.
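
The idea can be pictured with a toy in-memory model in Python. This is not the coprocessor itself, which is not public; `EditLogCell` is a made-up class that only mirrors the three behaviours described: record edits on write, replay them on read, and squash them on compaction.

```python
class EditLogCell:
    """Toy model of one column whose appends are stored as separate edits."""

    def __init__(self):
        self.compacted = b""  # result of the last compaction
        self.edits = []       # edits written since then, in arrival order

    def append(self, data):
        # Write path: record the edit only; no read-modify-write.
        self.edits.append(data)

    def read(self):
        # Query path: replay outstanding edits on top of the compacted value.
        return self.compacted + b"".join(self.edits)

    def compact(self):
        # Compaction: squash the edits into the final value and drop them.
        self.compacted = self.read()
        self.edits.clear()

    def stored_entries(self):
        # Number of stored entries; stays higher until compaction runs.
        return len(self.edits) + (1 if self.compacted else 0)
```

In this model the extra `stored_entries` before `compact()` stand in for the greater storage utilization mentioned above, while reads return the same value before and after compaction.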

john....@skyscanner.net

Jan 17, 2017, 6:25:14 AM
to OpenTSDB, jona...@ghostlab.net, izakm...@yahoo.com
This is really encouraging. Do you have any idea of a timeline for it being open sourced?

ManOLamancha

Jan 22, 2017, 5:23:14 PM
to OpenTSDB, jona...@ghostlab.net, izakm...@yahoo.com
On Tuesday, January 17, 2017 at 3:25:14 AM UTC-8, john....@skyscanner.net wrote:
This is really encouraging. Do you have any idea of a timeline for it being open sourced?

Not quite yet. The engineers who created it are on a huge crunch right now that's due at the end of the quarter, so I hope to poke them into OSSing it after that.