Difference in InfluxDB vs KairosDB

Raghavendra Nandagopal

Jan 20, 2017, 1:47:11 PM
to KairosDB
Hi Team,
  We are currently using InfluxDB and Grafana in our monitoring platform.  Since the open-source version of InfluxDB doesn't have any scaling functionality, we are planning to replace it with another TSDB.  We are looking at options such as OpenTSDB and KairosDB.  Do you have any performance numbers for KairosDB in terms of writes/queries, and any information on Grafana's compatibility with KairosDB (functions/alerting)?  Any info will be much appreciated.

Thanks,
Raghavendra Nandagopal

Brian Hawkins

Jan 20, 2017, 10:07:07 PM
to KairosDB
Part of the decision is the backend: Cassandra vs HBase.  I've set up both, and in my opinion Cassandra is easier to set up and manage.  With Cassandra you can start with a 3 node cluster and then expand as your usage increases.  I've read articles recommending that an HBase cluster start with 6 to 10 nodes before you expand it further.

Performance numbers are hard to give.  Here is a paper from about 2 years ago comparing the two: http://www.koziolek.de/docs/Goldschmidt2014-IEEE-CLOUD-preprint.pdf

I'm in the middle of converting the Kairos code over to use CQL (it was using Thrift) and I'm testing performance.  I have 4 identical computers (4-core i5, 16 GB RAM, SSDs); one is a Kairos node and the other 3 are Cassandra nodes with a replication factor of 1.  With the CQL code I can push 1 million metrics/sec through Kairos.  Granted, this test doesn't account for protocol parsing, since the metrics are generated within the Kairos service.
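
A write test that does include protocol parsing would send data points through the REST API instead of generating them inside the service. Here is a minimal sketch of that path, assuming a Kairos node listening on localhost:8080; the helper names, metric name, and tags are made up for illustration and are not part of the test described above.

```python
import json
import time
import urllib.request


def build_payload(metric, values, tags, start_ms=None):
    """Build the JSON body for POST /api/v1/datapoints.

    Each data point is a [timestamp_in_ms, value] pair.
    """
    if not tags:
        # KairosDB expects at least one tag on every metric.
        raise ValueError("at least one tag is required")
    if start_ms is None:
        start_ms = int(time.time() * 1000)
    # Space the points 1 ms apart so each one gets its own timestamp.
    datapoints = [[start_ms + i, v] for i, v in enumerate(values)]
    return [{"name": metric, "datapoints": datapoints, "tags": tags}]


def push(payload, base_url="http://localhost:8080"):
    """Send the payload to a Kairos node and return the HTTP status code."""
    # base_url is an assumption; point it at your own node.
    req = urllib.request.Request(
        base_url + "/api/v1/datapoints",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return resp.status
```

Calling `push(build_payload("test.cpu.load", [0.5, 0.7], {"host": "node1"}))` in a loop and timing it would give a write rate that includes JSON parsing, which will likely come in lower than the in-process figure.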

The changeover to CQL will make Kairos queries faster; I'll have real numbers in the next few weeks.  I have tried to see how many queries/sec I can run through the same setup, with each query hitting only a dozen data points, which matches a use case I'm looking at for work.  With queries that small I can run about 500 queries/sec through the single Kairos node.  Your mileage will vary depending on the number of data points each query hits.
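
To try a similar small-query measurement yourself, something like the sketch below could work. It assumes a Kairos node on localhost:8080 and uses the REST query endpoint; the helper names, metric name, and tag values are hypothetical, and the timing loop is sequential, so it measures a single client rather than what the node can serve under concurrent load.

```python
import json
import time
import urllib.request


def build_query(metric, tags, minutes=5, limit=12):
    """Build the JSON body for POST /api/v1/datapoints/query.

    limit caps the number of data points read for the metric,
    here a dozen to mimic a small query.
    """
    return {
        "start_relative": {"value": minutes, "unit": "minutes"},
        "metrics": [{"name": metric, "tags": tags, "limit": limit}],
    }


def run_query(query, base_url="http://localhost:8080"):
    """Send one query to a Kairos node and return the decoded JSON response."""
    # base_url is an assumption; point it at your own node.
    req = urllib.request.Request(
        base_url + "/api/v1/datapoints/query",
        data=json.dumps(query).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


def queries_per_second(run, count):
    """Call run() count times back to back and return the observed rate."""
    start = time.perf_counter()
    for _ in range(count):
        run()
    elapsed = time.perf_counter() - start
    return count / elapsed if elapsed > 0 else float("inf")
```

For example, `queries_per_second(lambda: run_query(build_query("test.cpu.load", {"host": ["node1"]})), 1000)` reports the rate one client sees; raising `limit` shows how the rate falls as each query touches more data points.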

You will get a smaller data footprint with OpenTSDB, as it has a background compaction process that goes back over your inserted data and compacts the columns (both a good and a bad thing).

Does that help?

Brian