I am trying to understand how a distributed storage system built on Raft filters duplicate requests even after a client session has expired.
I have gone through chapter 6.3 of the Raft dissertation, which describes how LogCabin (a distributed storage system built on Raft) filters duplicate requests by maintaining client sessions. When the cluster hears no requests from a client for some period, for example an hour, it expires that client's session.
Let's say a client (Id = 1) issues a command in its current active session (sessionId = 123) to increment the count of a product by 3, as below:
{ ProductName = iPod, Count = INC(3) }
The leader received the client request, replicated it to a majority of the followers, applied it to the state machine, and cached the result, so that if the client issues the same request again the leader can simply return the cached result, since it is a duplicate.
The same client (Id = 1) then went inactive for an hour, so the leader expired its session.
The client issues the same (duplicate) request again in a new session.
How will the cluster still filter the duplicate request when the client joins with a new session and tries to execute the same request that was issued in the previous session?
Chapter 6.3 of the Raft dissertation describes the following as one solution:
The second issue is how to deal with a client that continues to operate after its session was expired. We expect this to be an exceptional situation; there is always some risk of it, however, since there is generally no way to know when clients have exited. One option would be to allocate a new session for a client any time there is no record of it, but this would risk duplicate execution of commands that were executed before the client’s previous session was expired. To provide stricter guarantees, servers need to distinguish a new client from a client whose session was expired. When a client first starts up, it can register itself with the cluster using the RegisterClient RPC. This allocates the new client’s session and returns the client its identifier, which the client includes with all subsequent commands. If a state machine encounters a command with no record of the session, it does not process the command and instead returns an error to the client. LogCabin currently crashes the client in this case (most clients probably wouldn’t handle session expiration errors gracefully and correctly, but systems must typically already handle clients crashing).
I am finding it difficult to understand how this filters duplicate requests issued by a client in a new session after its previous session has expired. I would also like to know how this kind of issue is handled in other distributed systems.
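To make the quoted mechanism concrete, here is a minimal Python sketch of how I read it. It is not LogCabin's code. The class and method names, the single-process toy state machine, and the per-command sequence number `seq` used as the cache key are all assumptions made for illustration.

```python
class SessionExpiredError(Exception):
    """Raised when a command names a session the state machine has no record of."""


class CounterStateMachine:
    """Toy state machine with per-session duplicate filtering (hypothetical names)."""

    def __init__(self):
        self.counts = {}          # product name -> current count
        self.sessions = {}        # session id -> {sequence number: cached result}
        self.next_session_id = 1

    def register_client(self):
        # Stands in for the RegisterClient RPC: allocate a session, return its id.
        session_id = self.next_session_id
        self.next_session_id += 1
        self.sessions[session_id] = {}
        return session_id

    def expire_session(self, session_id):
        # Stands in for expiry after inactivity; the cached results are discarded.
        self.sessions.pop(session_id, None)

    def apply_increment(self, session_id, seq, product, amount):
        # No record of the session: do not process the command, return an error.
        if session_id not in self.sessions:
            raise SessionExpiredError(session_id)
        cache = self.sessions[session_id]
        # Same session and sequence number: a retry, so return the cached result.
        if seq in cache:
            return cache[seq]
        new_count = self.counts.get(product, 0) + amount
        self.counts[product] = new_count
        cache[seq] = new_count
        return new_count


sm = CounterStateMachine()
sid = sm.register_client()
first = sm.apply_increment(sid, 1, "iPod", 3)   # applied once
retry = sm.apply_increment(sid, 1, "iPod", 3)   # duplicate, served from the cache
sm.expire_session(sid)
try:
    sm.apply_increment(sid, 1, "iPod", 3)       # old session id after expiry
    rejected = False
except SessionExpiredError:
    rejected = True
```

In this sketch a retry under the old session id is rejected rather than re-executed. If the client then registers a new session and resubmits the command, the sketch applies it a second time, which is exactly the case I am asking about.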
--
You received this message because you are subscribed to the Google Groups "raft-dev" group.
To unsubscribe from this group and stop receiving emails from it, send an email to raft-dev+u...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/raft-dev/85d47983-e38e-4017-a1ff-29450144f4een%40googlegroups.com.
Oren Eini, CEO / Hibernating Rhinos LTD
Thanks for the response, Ayende Rahien.
What if the client connects back with a new session after its previous session expired and issues the same duplicate command in the new session? In that case we are still processing the duplicate request and not filtering it at all.
Is there a way to handle this scenario?
Thanks,
On Monday, October 12, 2020 at 12:08:32 PM UTC+5:30 Ayende Rahien wrote:
Why would the client do this?
It comes back to the same question I asked.
You may store the client sessions in the cluster and remove a session after, say, an hour of client inactivity.
The client will later initiate a connection again and send the same duplicate command to increment X. In this case, how can we still filter that duplicate request?