We are trying to migrate our current time series data source to a more modern system. KairosDB is an obvious candidate since it is based on Cassandra (which we already use). However, we are not yet sure that we want to use Kairos, as we suspect it may not fit our use cases performance-wise.
We have already developed a prototype to clarify some of our concerns. The features provided by KairosDB satisfy all our needs, but we have issues with read performance in one of our two main use cases.
We would like fast reads in both of the following cases:
A. Reading all values of a small number of time series.
We found that case A is well supported natively by Kairos. The read response time for all time steps (~10K) of a single time series is around 70 ms.
B. Reading only a few values of all time series.
Case B, however, is not as satisfactory. The read response time for a single value from each of N time series seems to grow linearly with N and is around 2 s for N = 1000.
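For illustration, the two access patterns could be expressed as KairosDB query bodies roughly as follows. This is a minimal sketch rather than our prototype code: it assumes each time series is stored as its own metric, the function names and metric names are hypothetical, and the resulting dictionaries would be sent as JSON to the `/api/v1/datapoints/query` REST endpoint.

```python
import json


def case_a_query(metric, start_ms, end_ms):
    """Case A: every data point of a single series in a time range."""
    return {
        "start_absolute": start_ms,
        "end_absolute": end_ms,
        "metrics": [{"name": metric}],
    }


def case_b_query(metrics, timestamp_ms, step_ms):
    """Case B: a window of one time step, requested for many series at once."""
    return {
        "start_absolute": timestamp_ms,
        "end_absolute": timestamp_ms + step_ms,
        # One entry per series (assumption: one metric per time series).
        "metrics": [{"name": name} for name in metrics],
    }


if __name__ == "__main__":
    # Hypothetical series names, for illustration only.
    names = ["series_%d" % i for i in range(1000)]
    body = case_b_query(names, 1_400_000_000_000, 60_000)
    print(len(body["metrics"]), "metrics,", len(json.dumps(body)), "bytes of JSON")
```

In this shape, case A touches one metric over a long range, while case B touches N metrics over a very short range, which matches the linear growth with N we observe.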
(All our tests were performed on a single Cassandra node. Unfortunately, I can't provide its specs. I have included the actual response times, but I believe they matter less than the ratio between them.)
Given these results, some of us are starting to believe that KairosDB does not suit our needs. What do you think we should do to improve performance in case B?
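To make that ratio concrete, here is the per-value cost implied by the figures above. It is only arithmetic on the approximate numbers already quoted (70 ms for ~10K values, 2 s for 1000 values); the variable names are illustrative.

```python
# Approximate measurements quoted above.
case_a_ms = 70.0        # one series, all time steps
case_a_values = 10_000
case_b_ms = 2000.0      # one value from each of N series
case_b_values = 1000

cost_a = case_a_ms / case_a_values   # ms per value read in case A
cost_b = case_b_ms / case_b_values   # ms per value read in case B
ratio = cost_b / cost_a

print(f"case A: {cost_a:.3f} ms/value")
print(f"case B: {cost_b:.3f} ms/value")
print(f"case B is about {ratio:.0f}x more expensive per value")
```

Per value returned, case B costs roughly 286 times more than case A on our test node.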
We thought about storing the data in two ways:
This solution would probably greatly improve the response time in case B, but it would also obviously degrade write performance. Additionally, I feel that doing so would move the solution away from its original philosophy.
Please feel free to ask for any additional information!