request queue size full exceptions at larger throughputs in the same thread


Krishna

Jan 3, 2020, 3:00:03 PM1/3/20
to lettuce-redis-client-users
In some use cases our application issues a massive number (~10K) of reads/writes per thread, and this causes "request queue size > 10K" exceptions. No issues are observed if the same throughput is spread across multiple threads.

If we understood this statement correctly — "Lettuce does not await completion of earlier commands before dispatching a new command, as Lettuce uses netty pipelining" — it applies in the context of the per-connection request queue size when a single thread executes many commands.

My understanding is:
Within a specific connection, when you execute a command using AbstractRedisAsyncCommands, it is dispatched and written via ClusterDistributionChannelWriter -> DefaultEndpoint.channelWriteAndFlush, where QUEUE_SIZE is tracked but the requests are not actually stored in any stack. Netty then takes over the command and hands it to its event-loop executor, which performs the encoding. Netty's AbstractChannelHandlerContext$WriteAndFlushTask uses the CommandHandler reference, which again tracks the request queue size using an ArrayDeque (is this the protocol stack?). When Netty receives the response, the objects are decoded and the commands are popped from that stack.
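
For reference, the 10K limit described above corresponds to the request queue size configurable via ClientOptions. A minimal sketch (the class name and the `redis://localhost` URL are placeholders):

```java
import io.lettuce.core.ClientOptions;
import io.lettuce.core.RedisClient;

public class QueueSizeConfig {
    public static void main(String[] args) {
        // Bound the number of in-flight commands per connection. Once the
        // limit is reached, new commands are rejected with an exception
        // instead of being buffered without bound.
        ClientOptions options = ClientOptions.builder()
                .requestQueueSize(10_000) // the 10K limit from the post above
                .build();

        RedisClient client = RedisClient.create("redis://localhost");
        client.setOptions(options);
        client.shutdown();
    }
}
```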

Questions:
1. Why is the request queue size tracked in both io.lettuce.core.protocol.DefaultEndpoint and io.lettuce.core.protocol.CommandHandler?
2. When a request-queue-full exception occurs, it could be due to slow encoding/decoding, too few Netty I/O or CPU-bound threads, or Redis itself responding slowly. What happens when Redis does not respond? How do we remove from the stack all entries that did not arrive on time?
3. If we did not care about the write response or the order, how could we make writes faster?
4. When you say Lettuce/Netty pipelining, are you really referring to the Netty channel queueing commands one by one?
5. How is Lettuce/Netty pipelining or batching different from using multi-key SET (MSET) commands for performance? And how can we bypass the disconnect buffer altogether for performance?

Mark Paluch

Jan 4, 2020, 9:44:03 AM1/4/20
to lettuce-redis-client-users
Your understanding is correct. If your queue size hits 10,000 concurrently active commands, then you're likely touching on an overload scenario. Keeping the queue regularly just a bit above empty is a good balance between saturation and efficiency, but thousands of active commands sounds as if the load is higher than your capacity.

To your questions:
Lettuce has a graceful disconnected mode: Lettuce reconnects dropped connections and replays buffered commands. Therefore, we require a protocol stack that is active for connected connections. Any commands issued while disconnected are buffered in the disconnected queue. Also, commands collected during manual batching (autoFlush disabled) are buffered in the endpoint.
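
The buffering-while-disconnected behavior described here is configurable. A sketch, assuming Lettuce's DisconnectedBehavior enum (class name and URL are placeholders):

```java
import io.lettuce.core.ClientOptions;
import io.lettuce.core.RedisClient;

public class DisconnectedConfig {
    public static void main(String[] args) {
        // DEFAULT/ACCEPT_COMMANDS buffers commands issued while disconnected
        // and replays them after a successful reconnect. REJECT_COMMANDS
        // fails them immediately instead of buffering.
        ClientOptions options = ClientOptions.builder()
                .disconnectedBehavior(ClientOptions.DisconnectedBehavior.REJECT_COMMANDS)
                .build();

        RedisClient client = RedisClient.create("redis://localhost");
        client.setOptions(options);
        client.shutdown();
    }
}
```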

What happens when Redis does not respond, how do we remove from stack all entries that did not arrive on time

You can't. While there is a reset() method on the connection, it's not a safe operation: reset() can cause more harm than good, especially when commands are active and awaiting completion. Probably the only good way is closing the connection.

What if we did not care about the write response or the order, how can we make writes faster

You can submit commands without an Output. In that case, commands aren't added to the protocol stack; they are just written in a fire-and-forget fashion. Note that you must somehow consume the command response, so disabling Redis replies is probably what you're looking for.
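
One way to disable replies is the server-side CLIENT REPLY OFF command, dispatched as a custom command. A sketch against a live server at a placeholder URL; after this, no response will ever arrive on the connection, so it should be dedicated to fire-and-forget writes:

```java
import io.lettuce.core.RedisClient;
import io.lettuce.core.api.StatefulRedisConnection;
import io.lettuce.core.api.async.RedisAsyncCommands;
import io.lettuce.core.codec.StringCodec;
import io.lettuce.core.output.StatusOutput;
import io.lettuce.core.protocol.CommandArgs;
import io.lettuce.core.protocol.CommandType;

public class FireAndForget {
    public static void main(String[] args) {
        RedisClient client = RedisClient.create("redis://localhost");
        StatefulRedisConnection<String, String> connection = client.connect();
        RedisAsyncCommands<String, String> async = connection.async();

        // CLIENT REPLY OFF tells Redis to stop sending replies on this
        // connection; keep it dedicated to writes whose results you ignore.
        async.dispatch(CommandType.CLIENT,
                new StatusOutput<>(StringCodec.UTF8),
                new CommandArgs<>(StringCodec.UTF8).add("REPLY").add("OFF"));

        // Subsequent writes on this connection are effectively fire-and-forget.
        async.set("key", "value");
    }
}
```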

When you say Lettuce/Netty pipelining

Each command is written and flushed by default. With setAutoFlushCommands(false), you can collect multiple commands and flush them later via flushCommands(). This operation is only safe when a single process uses the connection exclusively and, when using Cluster connections, when all node connections are established.

You can also call dispatch(…) with a Collection of commands to flush only once.
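
The manual-flush pattern looks roughly like this (a sketch against a live server at a placeholder URL, assuming exclusive use of the connection):

```java
import io.lettuce.core.RedisClient;
import io.lettuce.core.RedisFuture;
import io.lettuce.core.api.StatefulRedisConnection;
import io.lettuce.core.api.async.RedisAsyncCommands;
import java.util.ArrayList;
import java.util.List;

public class ManualFlush {
    public static void main(String[] args) {
        RedisClient client = RedisClient.create("redis://localhost");
        StatefulRedisConnection<String, String> connection = client.connect();
        RedisAsyncCommands<String, String> async = connection.async();

        connection.setAutoFlushCommands(false); // collect instead of flushing each write
        List<RedisFuture<String>> futures = new ArrayList<>();
        for (int i = 0; i < 100; i++) {
            futures.add(async.set("key:" + i, "value:" + i));
        }
        connection.flushCommands();             // a single flush for the whole batch
        connection.setAutoFlushCommands(true);
    }
}
```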

How is Lettuce/Netty pipelining or batching different from using Multi SET

MSET is atomic while issuing multiple SET commands is not. With Redis Cluster, commands must be separated by slot, so issuing multiple commands there can be a requirement; you can still use pipelining to help with performance when working with the same node.
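
To make the difference concrete, a sketch (server URL is a placeholder; both variants reach the server pipelined, but only the first is atomic):

```java
import io.lettuce.core.RedisClient;
import io.lettuce.core.api.async.RedisAsyncCommands;
import java.util.Map;

public class MsetVsSet {
    public static void main(String[] args) {
        RedisClient client = RedisClient.create("redis://localhost");
        RedisAsyncCommands<String, String> async = client.connect().async();

        // Atomic: a single MSET command, one reply, all keys set together.
        async.mset(Map.of("k1", "v1", "k2", "v2"));

        // Not atomic: separate SET commands. They are still pipelined on the
        // wire, but another client may observe k1 set while k2 is not yet.
        async.set("k1", "v1");
        async.set("k2", "v2");
    }
}
```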

Hope this helps.

Cheers, 
Mark

Krishna

Jan 6, 2020, 6:43:54 PM1/6/20
to lettuce-redis-client-users
Thanks for the quick response.

To disregard commands on connection failure, we will choose the at-most-once option and figure out how to configure it.
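
If I read the Lettuce docs right, at-most-once semantics come from disabling auto-reconnect, so commands fail on disconnect instead of being buffered and replayed. A sketch (class name and URL are placeholders):

```java
import io.lettuce.core.ClientOptions;
import io.lettuce.core.RedisClient;

public class AtMostOnce {
    public static void main(String[] args) {
        // With auto-reconnect disabled, queued commands are failed on
        // disconnect rather than replayed: at-most-once delivery.
        ClientOptions options = ClientOptions.builder()
                .autoReconnect(false)
                .build();

        RedisClient client = RedisClient.create("redis://localhost");
        client.setOptions(options);
        client.shutdown();
    }
}
```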


You can't. While there is a reset() method on the connection, it's not a safe operation as reset() can cause more harm than good especially when commands are active and waiting for completion. Probably the only good way is closing the connection.
Curious, why would we not remove entries from queue after a Redis command timeout and still wait for Redis to actually respond?


You can submit commands without an Output. In that case, commands aren't added to the protocol stack, they are just written in a fire+forget fashion. Note that you must consume somehow the command response, so disabling redis replies is probably what you're looking for.
I couldn't find in the source how to submit commands without an Output.

In our case, we batch writes to the same hash slot in a cluster for efficiency. Given this scenario, can we assume:
1. Are these the same: MSET (atomic) vs. dispatch(Collection<RedisCommand>) (atomic)?
2. Can we further optimize by choosing between batch execution (does not need a dedicated connection) and Redis pipelining/flushing (needs a dedicated connection), which roughly do the same thing?
3. Which Redis server metric should we monitor to determine how much we can group and how many groups we can batch/pipeline?
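
Batching into one hash slot is usually done with hash tags; Lettuce exposes the slot computation via SlotHash. A sketch (keys are illustrative):

```java
import io.lettuce.core.cluster.SlotHash;

public class SameSlotKeys {
    public static void main(String[] args) {
        // Keys sharing the same {hash tag} map to the same cluster slot,
        // so they can be grouped into a single MSET or pipelined batch
        // targeting one node.
        int slotA = SlotHash.getSlot("{user:42}:name");
        int slotB = SlotHash.getSlot("{user:42}:email");
        System.out.println(slotA == slotB); // true: only "user:42" is hashed
    }
}
```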

Mark Paluch

Jan 7, 2020, 8:09:03 AM1/7/20
to lettuce-redis-client-users
Curious: why can't we remove entries from the queue after a Redis command timeout instead of still waiting for Redis to actually respond?

Removing commands from the queue would mix up the protocol state. Timeouts are set on a per-command basis and are mostly a frontend issue (frontend as in the invocation or Future). Calling RedisFuture.get(…) with a timeout may time out the call to get(…) but not necessarily the command itself. Global timeouts affect the command itself and are not limited to the command invocation.
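
The two timeout flavors can be seen side by side in a sketch (server URL is a placeholder; TimeoutOptions is the global, command-completing timeout):

```java
import io.lettuce.core.ClientOptions;
import io.lettuce.core.RedisClient;
import io.lettuce.core.RedisFuture;
import io.lettuce.core.TimeoutOptions;
import java.time.Duration;
import java.util.concurrent.TimeUnit;

public class Timeouts {
    public static void main(String[] args) throws Exception {
        // Global timeout: the command itself completes exceptionally after
        // one second, whether or not anyone is waiting on its Future.
        ClientOptions options = ClientOptions.builder()
                .timeoutOptions(TimeoutOptions.enabled(Duration.ofSeconds(1)))
                .build();

        RedisClient client = RedisClient.create("redis://localhost");
        client.setOptions(options);

        // Frontend timeout: only this get(...) call gives up after 500 ms;
        // the command stays in the queue and may still complete later.
        RedisFuture<String> future = client.connect().async().get("key");
        String value = future.get(500, TimeUnit.MILLISECONDS);
    }
}
```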

Let's assume we have two commands in the queue: BLPOP and SET. Also assume the command timeout is one second but Redis responds after 1001 ms. If we removed the queued BLPOP on timeout, its late response would be matched to the SET command. We cannot do that; otherwise commands would receive erroneous responses.
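
The ordering constraint can be modeled with a plain FIFO queue, the way CommandHandler's ArrayDeque works (a toy model, not Lettuce code):

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Toy model of the per-connection protocol stack: commands are enqueued in
// send order, and each incoming response completes the head of the queue.
public class ProtocolStackModel {
    public static void main(String[] args) {
        Deque<String> inFlight = new ArrayDeque<>();
        inFlight.addLast("BLPOP");
        inFlight.addLast("SET");

        // Response #1 arrives (late, at 1001 ms) - it belongs to BLPOP.
        System.out.println(inFlight.pollFirst()); // BLPOP

        // Had BLPOP been removed on timeout, pollFirst() would have returned
        // SET here, and the BLPOP payload would erroneously complete SET.
        System.out.println(inFlight.pollFirst()); // SET
    }
}
```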

Cheers, 
Mark

y2k...@gmail.com

Jan 9, 2020, 12:43:10 AM1/9/20
to lettuce-redis-client-users
>>Let's assume we have two commands in the queue: BLPOP and SET. Also, assume a command has a timeout of a second but Redis responds after 1001ms (1.001seconds). If we would remove the queued command, then the response would be added to the SET command. We cannot do that otherwise commands would receive erroneous responses.

I may be missing something: if the above is true, does it mean Lettuce processes responses sequentially? How can it achieve good performance?
Let's say I have two commands, GET A and GET B. GET A returns a 200 MB payload and needs 5 seconds just to be received and processed by the I/O thread. Does that mean my GET B command will also be blocked and only become available to the caller after 5 seconds (assuming we have plenty of available threads in the I/O thread pool)? If I have set my client timeout to 2 seconds, will GET B throw a timeout exception even though the server actually responded within 2 seconds?

thanks

Krishna

Jan 9, 2020, 9:50:38 PM1/9/20
to lettuce-redis-client-users
y2k, in general, any expensive Redis operation will slow down everyone else as well.

In such a case, I think both commands will be issued to the Redis transport immediately, each returning you a separate Future if you are using async commands. However, if they are on the same connection, Lettuce/Netty will expect the responses in the same order, since Redis processes commands on a single thread internally while still doing concurrent asynchronous I/O. The processing/decoding, though, happens on separate Netty computation threads, I believe, giving you parallelism on the client side.
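
One practical mitigation (my own assumption, not stated in this thread) is to give bulky commands their own connection so they don't head-of-line block latency-sensitive ones on the shared connection. A sketch against a placeholder server:

```java
import io.lettuce.core.RedisClient;
import io.lettuce.core.api.StatefulRedisConnection;

public class DedicatedConnections {
    public static void main(String[] args) {
        RedisClient client = RedisClient.create("redis://localhost");

        // Responses on one connection are decoded in order, so a huge GET
        // delays everything queued behind it on that same connection.
        StatefulRedisConnection<String, String> bulk = client.connect();
        StatefulRedisConnection<String, String> latencySensitive = client.connect();

        bulk.async().get("huge:payload");      // e.g. a 200 MB value
        latencySensitive.async().get("small"); // unaffected by the bulk read
    }
}
```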

Krishna

Jan 18, 2020, 5:11:59 PM1/18/20
to lettuce-redis-client-users
@Mark, I couldn't find any documentation on submitting commands without an Output.

Mark Paluch

Jan 19, 2020, 6:01:57 AM1/19/20
to lettuce-redis-client-users