Let me describe the architecture of my system before getting to the heart of the problem.
I have a stream of data coming from Kafka, and my company uses a distributed cache (Hazelcast, to be precise) that makes the data available to the web services we expose. We also want to persist the cached data to Cassandra so that it is durable. I have two candidate solutions for getting the data into Hazelcast. I would like your suggestions (perhaps there is another way of doing it) and your view on which solution is best and why.
1/ Use a Kafka-Hazelcast connector to send data directly from Kafka to Hazelcast, and then persist the data to Cassandra using write-behind and MapStores ==> there are two main drawbacks with this solution: first, we have to serialize/deserialize every time we store data to Cassandra (significant CPU usage), and second, we put all the data into the cache even when users do not need it (we have lots of evictions happening).
2/ Use a Kafka-Cassandra connector to write data directly to Cassandra, and then find a means (how complex do you think this part could be?) to notify Hazelcast to update/evict the data if it is already in the cache ==> the advantages of this solution are that we get rid of the serialization/deserialization needed by the MapStores, and that we only refresh data that was queried before and whose key is already in the cache.
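To make the first option concrete, here is a minimal sketch of what I mean by write-behind. It is plain Java and does not use the Hazelcast API: `WriteBehindBuffer`, its methods, and the in-memory list standing in for Cassandra are all hypothetical names chosen for illustration. Every write lands in the cache immediately and is only later flushed to the store in a batch, with repeated writes to the same key coalesced.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical illustration of write-behind: cache first, store later in batches. */
final class WriteBehindBuffer {
    private final Map<String, String> cache = new HashMap<>();
    // Pending writes; a later write to the same key overwrites the earlier one.
    private final Map<String, String> dirty = new LinkedHashMap<>();
    // Stand-in for Cassandra: each flush appends one batch.
    private final List<Map<String, String>> storedBatches = new ArrayList<>();

    /** Every incoming record goes into the cache, whether or not anyone reads it. */
    synchronized void put(String key, String value) {
        cache.put(key, value);
        dirty.put(key, value);
    }

    /** Persists all pending writes as one batch and returns how many entries were written. */
    synchronized int flush() {
        if (dirty.isEmpty()) {
            return 0;
        }
        Map<String, String> batch = new LinkedHashMap<>(dirty);
        dirty.clear();
        storedBatches.add(batch);
        return batch.size();
    }

    /** Returns the most recently persisted batch, or an empty map if nothing was flushed yet. */
    synchronized Map<String, String> lastBatch() {
        return storedBatches.isEmpty() ? Map.of() : storedBatches.get(storedBatches.size() - 1);
    }
}
```

In the real setup the flush step is where each entry has to be deserialized from the cache format and re-serialized for Cassandra, which is the CPU cost I mentioned above.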
Which of the two solutions do you prefer, and why? In your view, what is the best means of notifying Hazelcast in the second solution?
Thank you in advance for your suggestions and answers. I hope I was concise and clear!
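For the second solution, the simplest notification I can think of is a small consumer that, for each record already written to Cassandra, replaces the cached value only if the key is currently in the cache. Below is a minimal sketch under that assumption. `CacheRefresher` and its method names are hypothetical. The code is written against `java.util.concurrent.ConcurrentMap`, which Hazelcast's `IMap` extends, and a `ConcurrentHashMap` stands in for the distributed map here.

```java
import java.util.concurrent.ConcurrentMap;

/** Hypothetical illustration: refresh a cache entry only if its key is already cached. */
final class CacheRefresher {
    private final ConcurrentMap<String, String> cache;

    CacheRefresher(ConcurrentMap<String, String> cache) {
        this.cache = cache;
    }

    /** Simulates a read-through load triggered by a user query. */
    void loadOnQuery(String key, String valueFromStore) {
        cache.put(key, valueFromStore);
    }

    /**
     * Called for each record after it has been written to the store.
     * Returns true if an existing cache entry was refreshed, false if the key was not cached.
     */
    boolean onRecordWritten(String key, String newValue) {
        // ConcurrentMap.replace(key, value) only writes when the key is already mapped,
        // so keys nobody has queried never enter the cache.
        return cache.replace(key, newValue) != null;
    }

    String get(String key) {
        return cache.get(key);
    }
}
```

An alternative under the same assumption would be to call `remove(key)` instead of `replace`, so that the next query reloads the value from Cassandra. I am not sure which of the two behaves better under concurrent writes to the same key.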