Hi,
We want to use Apache Cassandra to store big data gathered from real-time sensors. We have developed an IoT platform capable of handling 1 million events per second, and we want to persist those events in Cassandra.
Our table looks like this:
Table: Sensor_data_by_date

Column | Type | Key
Realm | text | Partition key (K)
Bucket | int | Partition key (K)
dateTimeReceived | timestamp | Clustering column
sensor_id | text |
Message_id | text |
Sensor_name | text |

The query we are interested in is:
Give me all sensor data for "realm-a" for a dateTime range, say "5th May" to "12th May", ordered by "dateTimeReceived".
Solution :
Since our platform can handle up to 1 million events per second, even using DATE + HOUR as part of the partition key would still exceed the maximum partition size recommended for Cassandra. So we decided to use a bucket together with the realm as the partition key.
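To make the sizing concern concrete, here is a rough back-of-the-envelope sketch. It assumes the worst case in which all 1 million events per second land in a single realm, and it uses a target of about 100,000 rows per partition, which is only a commonly quoted rule of thumb and not a hard Cassandra limit. The class and method names are made up for illustration.

```java
final class PartitionEstimate {
    // Rows written to one partition during a time window, assuming events
    // are spread evenly across `buckets` sub-partitions.
    static long rowsPerPartition(long eventsPerSecond, long windowSeconds, int buckets) {
        return eventsPerSecond * windowSeconds / buckets;
    }

    // Smallest bucket count that keeps each partition at or below targetRows.
    static long bucketsNeeded(long eventsPerSecond, long windowSeconds, long targetRows) {
        long total = eventsPerSecond * windowSeconds;
        return (total + targetRows - 1) / targetRows; // ceiling division
    }

    public static void main(String[] args) {
        // One hour of events in a single (realm, date+hour) partition.
        System.out.println(rowsPerPartition(1_000_000L, 3600L, 1));        // 3600000000
        // Buckets per hour needed for roughly 100k rows each (assumed target).
        System.out.println(bucketsNeeded(1_000_000L, 3600L, 100_000L));    // 36000
    }
}
```

With these assumed numbers, an hourly partition would hold 3.6 billion rows, which is why a finer-grained bucket is part of the partition key.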
Problem :
Now, when we query a wide date range like the one mentioned above (5th May to 12th May), we have multiple buckets to look up. We also need to support ordering.
With this design, I need to use an IN clause for the buckets, for example:
……….. where realm=realm-a and bucket in (1,2,3,4) and dateTimeReceived>… and dateTimeReceived <… order by dateTimeReceived
This fails with a complaint that the IN clause and ORDER BY can't be used together with pagination.
I need pagination as well.
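One possible workaround, sketched here only under assumptions and not validated against our cluster: run one query per bucket (each of which is already sorted by dateTimeReceived within its partition) and merge the sorted streams on the client. The sketch below shows just the merge step in plain Java, with no driver calls. The name `BucketMerge` is hypothetical, and in practice each source would presumably be the row iterator of a single-bucket query, with the driver paging each query independently.

```java
import java.util.AbstractMap;
import java.util.Comparator;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.PriorityQueue;

final class BucketMerge {
    // Merges per-bucket iterators, each already sorted by `order`,
    // into a single iterator sorted by the same order (k-way merge).
    static <T> Iterator<T> mergeSorted(List<Iterator<T>> sources, Comparator<? super T> order) {
        // Each heap entry pairs the current head row of a source with that source.
        PriorityQueue<Map.Entry<T, Iterator<T>>> heap =
            new PriorityQueue<>((a, b) -> order.compare(a.getKey(), b.getKey()));
        for (Iterator<T> it : sources) {
            if (it.hasNext()) {
                heap.add(new AbstractMap.SimpleEntry<>(it.next(), it));
            }
        }
        return new Iterator<T>() {
            @Override
            public boolean hasNext() {
                return !heap.isEmpty();
            }

            @Override
            public T next() {
                // Take the smallest head, then refill from the same source.
                Map.Entry<T, Iterator<T>> head = heap.remove();
                Iterator<T> src = head.getValue();
                if (src.hasNext()) {
                    heap.add(new AbstractMap.SimpleEntry<>(src.next(), src));
                }
                return head.getKey();
            }
        };
    }
}
```

Because the merge only pulls one row at a time from each source, it does not need all buckets in memory at once. Whether this performs acceptably with many buckets is exactly what I am unsure about.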
Could you please help me work out how to achieve this?
Help will be much appreciated.
Regards