Redis Pub Sub a Mechanism to Mitigate Lost Writes?

Kevin Johnson

Jun 7, 2017, 11:29:34 AM

to Redis DB

Hello,

I was wondering if it's possible to utilize Redis publish/subscribe as a means to recover from lost writes caused by a network partition or similar events.

Is it possible for a replica to publish a message to a queue once it receives a replicated write? The client could then poll the queue to ensure all expected nodes have received the write.

Just a thought...

Thanks!

-Kevin

AlexanderB

Jun 7, 2017, 1:23:01 PM

to Redis DB

Hey Kevin, I'm not sure whether you mean a Redis replica (another Redis host configured as a slave of another instance) or something at the layer of your application. If you're thinking about the former, a Redis slave, check out the WAIT command. It can be very tricky to get exactly the distributed system semantics you want, as there are still a handful of rare edge cases where it won't work perfectly, but it would give you something very close to what you want. In general, though, a Redis slave won't actually get out of sync from just a lost write. The communication between master and slave is over TCP, which already handles retransmission of writes that don't get acknowledged.
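
For readers who want to see what the WAIT suggestion looks like in practice, here is a minimal sketch in Python. It assumes a redis-py style client object that exposes set() and execute_command(); the function name write_and_wait and its parameters are made up for illustration. WAIT reports how many replicas acknowledged the writes sent on the current connection, so this only gives a best-effort check, not a strong consistency guarantee.

```python
def write_and_wait(client, key, value, num_replicas=1, timeout_ms=100):
    """Write a key, then ask Redis how many replicas acknowledged it.

    `client` is assumed to be a redis-py style client. WAIT only covers
    writes issued on the same connection; a single-threaded redis-py
    client normally reuses one pooled connection, but treat that as an
    assumption rather than a guarantee.
    """
    client.set(key, value)
    # WAIT <numreplicas> <timeout-ms> returns the number of replicas that
    # acknowledged the preceding writes before the timeout expired.
    acked = client.execute_command("WAIT", num_replicas, timeout_ms)
    return acked >= num_replicas
```

With redis-py, client would typically be created with something like redis.Redis(host="localhost", port=6379). A False result means the timeout expired before enough replicas acknowledged; the write is not rolled back on the master, so the caller still has to decide whether to retry or report the problem.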

If you were thinking more in terms of something at the application level, then Redis can be a great way to implement generic queues. That being said, you wouldn't actually want to use the pub-sub mechanism to do this.

Pub-sub is fire and forget. It doesn't guarantee that any clients are currently subscribed, and a publish will return successfully even if a network partition has caused whatever process was subscribed to miss the messages.
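
To make the fire-and-forget point concrete, here is a small sketch, again assuming a redis-py style client; the helper name publish_and_count is hypothetical. PUBLISH returns the number of subscribers the message was handed to, and a return value of 0 is still a successful call. The message is simply gone.

```python
def publish_and_count(client, channel, message):
    """Publish a message and report whether anyone was subscribed.

    PUBLISH does not raise or fail when nobody is listening; it just
    returns 0. A non-zero count only means the message was handed to
    that many subscriber connections, not that they processed it.
    """
    receivers = client.publish(channel, message)
    return receivers, receivers > 0
```

Because nothing is stored, a subscriber that reconnects after a partition has no way to ask for what it missed, which is why pub-sub is a poor fit for tracking acknowledgements.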

Instead, you'd probably want to use either a Redis list or a Redis sorted set to store the queue. There are lots of good examples of people posting designs for queue systems they've built with Redis, and I bet you could find something that's a good fit for whatever your use case is with a bit of Google searching for "redis queues examples", or by looking at tools built for doing generic queuing, like Celery.
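
As one illustration of the list-based approach, here is a rough sketch of the common "reliable queue" pattern: items are pushed onto a pending list, moved atomically to a processing list when a worker picks them up, and removed only once the worker acknowledges them. The key names and function names are invented for this example, and the lrem(name, count, value) argument order assumes redis-py 3.0 or later.

```python
PENDING = "jobs:pending"        # hypothetical key names
PROCESSING = "jobs:processing"

def enqueue(client, payload):
    """Add a job to the head of the pending list."""
    return client.lpush(PENDING, payload)

def reserve(client):
    """Atomically move the oldest job to the processing list and return it.

    Returns None when the pending list is empty. RPOPLPUSH keeps the job
    in Redis while it is being worked on, so a crashed worker does not
    lose it.
    """
    return client.rpoplpush(PENDING, PROCESSING)

def ack(client, payload):
    """Remove one copy of a finished job from the processing list."""
    return client.lrem(PROCESSING, 1, payload)
```

Anything left in the processing list for too long can be pushed back onto the pending list by a separate cleanup task. That part is not shown here, and how long "too long" is depends on the application.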

Kevin Johnson

Jun 8, 2017, 11:00:26 AM

to Redis DB

Thanks, Alexander.

I was using the term "replica" to mean a slave node. I was thinking the client could use a queue to store write events, and these events could be purged once all slave nodes have acked each replication event they handled successfully. If for some reason a slave node didn't receive the write, the client could repeat the write operation.
