InfluxDB's take on how they compare to OpenTSDB


Thibault Godouet

Apr 24, 2018, 3:41:53 AM
to OpenTSDB
Hi,


We all know benchmarks should be taken with a pinch of salt. That said, a benchmark saying that Influx is that much better than OpenTSDB from a performance point of view seemed a little suspicious to me.
I'm not saying there is any ill intent there: OpenTSDB+HBase is relatively complex, and someone who hasn't spent enough time playing with it can easily miss important points. I may add a comment on that page to point out how the test could be made a bit more relevant.

Reading the details, I saw a few things that could explain this:
- no mention of pre-splitting the HBase tables.  On newly created tables, that would give us a single region: am I right to think that HBase would be unlikely to use all the resources available on the server in that scenario, which would result in poor performance, and could explain the difference in performance they report?
- they disabled TSDB compaction (due to performance issues), and they don't mention using compression on the HBase tables: would that explain a good chunk if not all of the disk usage difference?
- also, no mention of the TSDB metadata settings from what I can remember, even though they have a significant impact on performance. As there is a trade-off between performance and features, it would make sense to state which settings they chose here.
- the test is on a single node, but I suspect most people run OpenTSDB on multiple nodes, and often quite a few of them. Testing that scenario too would be sensible.

Anything else I missed?

Regards,
Thibault. 

Mathias Herberts

Apr 25, 2018, 4:35:06 PM
to OpenTSDB
Why even bother looking at those benchmarks? InfluxData hired an intern in the past who did a bunch of those pseudo-benchmarks. Their marketing/PR division has been exploiting those documents for some time now, but they are not fooling anyone who has seriously played with OpenTSDB.

Thibault Godouet

Apr 25, 2018, 5:51:19 PM
to OpenTSDB
Hi Mathias,

"they are not fooling anyone who has seriously played with OpenTSDB"... but what about everyone who hasn't?
If you have a choice between two systems, you see a benchmark that tells you system A is better than system B, and nothing contradicts it, then I think that even if you are not sure how much you can trust the benchmark, you'd tend to go for system A, wouldn't you?
A constructive comment on that page, if InfluxData are fair and accept it, could, I hope, help people out there make a more informed choice.

Thibault.

Mathias Herberts

Apr 26, 2018, 7:39:24 AM
to OpenTSDB
OpenTSDB, like Warp 10, is seen as a complex beast solely because it uses HBase, which lots of people think is very difficult to manage.

Entering the benchmark game will shift the debate to the complexity plane, where the HBase dependency will be pointed to as a weakness.

ManOLamancha

May 22, 2018, 2:39:46 PM
to OpenTSDB
On Tuesday, April 24, 2018 at 12:41:53 AM UTC-7, Thibault Godouet wrote:

Ooo, thanks for the link. I had a good talk with the Influx folks about their comparison, and I'll need to see if they addressed those points in this post. With a stand-alone instance of HBase and OpenTSDB, Influx did perform a bit better, but it was 100k vs. 80k writes per second in my tests, without any HBase tuning.
 
We all know benchmarks should be taken with a pinch of salt. That said, a benchmark saying that Influx is that much better than OpenTSDB from a performance point of view seemed a little suspicious to me.

Definitely salty, but they do have a good point. Influx is targeted at single-machine installs, and in that case it definitely does outperform TSD and HBase in write throughput and storage size. Warp 10 is probably even better at write throughput than TSD :) And it certainly is much, much easier to install than a full Hadoop cluster. But TSD and Warp 10 are designed for redundancy and scalability by using those features of HBase, so we trade off single-machine performance for availability and distribution. Influx had tons of problems with their early clustering, so Paul said they were redesigning it. We then need to look at their new clustering solution and bench that.

One problem with the old paper is that they tried clustering HBase and HDFS with 3-way replication and compared that to a single Influx install. A cluster-to-cluster comparison would be more worth looking at. We also need benchmarks on queries.
 
I'm not saying there is any ill intent there: OpenTSDB+HBase is relatively complex, and someone who hasn't spent enough time playing with it can easily miss important points. I may add a comment on that page to point out how the test could be made a bit more relevant.

Reading the details, I saw a few things that could explain this:
- no mention of pre-splitting the HBase tables.  On newly created tables, that would give us a single region: am I right to think that HBase would be unlikely to use all the resources available on the server in that scenario, which would result in poor performance, and could explain the difference in performance they report?

Right, if you have 500 region servers but haven't split the table, writes will only hit a single server.
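For example, you can pre-split at table creation time with something like this in the HBase shell (a minimal sketch; the split points and column-family options here are placeholders and should be derived from your actual UID width or salt-byte layout):

  # Create the tsdb data table pre-split into four regions so writes
  # fan out across region servers from the start. The split points are
  # binary row-key prefixes and are purely illustrative.
  create 'tsdb',
    {NAME => 't', VERSIONS => 1, COMPRESSION => 'SNAPPY', BLOOMFILTER => 'ROW'},
    {SPLITS => ["\x00\x00\x40", "\x00\x00\x80", "\x00\x00\xC0"]}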
 
- they disabled TSDB compaction (due to performance issues), and they don't mention using compression on the HBase tables: would that explain a good chunk if not all of the disk usage difference?

Yup, we always recommend using compression on HBase. With Snappy the difference was much smaller, though Influx still saved more space.
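For an existing install that's roughly the following (assuming the stock 'tsdb' table, its 't' column family, and Snappy support available on the region servers):

  # HBase shell: switch the data column family to Snappy, then force a
  # major compaction so existing store files are rewritten with the codec
  disable 'tsdb'
  alter 'tsdb', {NAME => 't', COMPRESSION => 'SNAPPY'}
  enable 'tsdb'
  major_compact 'tsdb'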
 
- also, no mention of the TSDB metadata settings from what I can remember, even though they have a significant impact on performance. As there is a trade-off between performance and features, it would make sense to state which settings they chose here.

Hmm, the UID conversion we have in TSD makes a marked difference here. Mathias's Warp 10 uses hashed values and a much better meta system.
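For reference, these are the knobs in question; a minimal opentsdb.conf sketch (the meta settings default to false, and enabling them costs extra UID/meta writes per data point, so a benchmark really should state which way they were set):

  # opentsdb.conf
  tsd.core.meta.enable_realtime_ts = false
  tsd.core.meta.enable_realtime_uid = false
  tsd.core.meta.enable_tsuid_tracking = false
  # the Influx benchmark also turned off row compaction:
  tsd.storage.enable_compaction = false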
 
- the test is on a single node, but I suspect most people run OpenTSDB on multiple nodes, and often quite a few of them. Testing that scenario too would be sensible.

Absolutely. TSD folks play with development on a single node but rarely run production that way. Everyone has clusters of HBase. 