Redis Intrinsic latency - Question


Kashyap Mhaisekar

May 22, 2015, 2:46:14 PM
to redi...@googlegroups.com
Hi,
We are building a high-concurrency system, and Redis plays an important role in the mix. I am presently facing an issue as the concurrency increases: just 150 parallel threads concurrently call a Lua script that essentially does the following -
1. Increment a counter
2. If the counter is equal to a number, publish the message.

What we see is that the more threads there are, the worse the response from Redis gets.

I changed the slowlog threshold to 2 ms. When the threads are running I always see performance degradation, and the slow log shows an occasional Lua script taking 2047 microseconds (2.047 ms), indicating that the script itself is probably fine, but the latency report shows the following -

127.0.0.1:11411> latency doctor
Dave, I have observed latency spikes in this Redis instance. You don't mind talking about it, do you Dave?

1. command: 1 latency spikes (average 10ms, mean deviation 0ms, period 76516.00 sec). Worst all time event 10ms.
------------------------
I have a few advices for you:

- Check your Slow Log to understand what are the commands you are running which are too slow to execute. Please check http://redis.io/commands/slowlog for more information.
- Deleting, expiring or evicting (because of maxmemory policy) large objects is a blocking operation. If you have very large objects that are often deleted, expired, or evicted, try to fragment those objects into multiple smaller objects.
----------------------

Tried the redis intrinsic latency and it keeps increasing -
--------------------------
-bash-4.1$ redis-cli --intrinsic-latency 100
Max latency so far: 3 microseconds.
Max latency so far: 4 microseconds.
Max latency so far: 25 microseconds.
Max latency so far: 319 microseconds.
Max latency so far: 330 microseconds.
Max latency so far: 331 microseconds.
Max latency so far: 671 microseconds.
Max latency so far: 5218 microseconds.
Max latency so far: 12186 microseconds.
Max latency so far: 14017 microseconds.
Max latency so far: 17065 microseconds.
Max latency so far: 18986 microseconds.
Max latency so far: 23000 microseconds.
Max latency so far: 31014 microseconds.

33393611 total runs (avg latency: 2.9946 microseconds / 29945.85 nanoseconds per run).
Worst run took 10357x longer than the average latency.
-bash-4.1$
--------------------------

Question: Is the intrinsic latency shown above normal? What can be done to reduce it? Could this be what is causing the Redis response times to go up?

Regards,
Kashyap

The Baldguy

May 23, 2015, 4:13:54 PM
to redi...@googlegroups.com


On Friday, May 22, 2015 at 1:46:14 PM UTC-5, Kashyap Mhaisekar wrote:
Hi,
We are building a high-concurrency system, and Redis plays an important role in the mix. I am presently facing an issue as the concurrency increases: just 150 parallel threads concurrently call a Lua script that essentially does the following -
1. Increment a counter
2. If the counter is equal to a number, publish the message.

What we see is that the more threads there are, the worse the response from Redis gets.

This is expected. Redis is single threaded, and the more you queue up to it the more those items will have to wait.
 

I changed the slowlog threshold to 2 ms. When the threads are running I always see performance degradation, and the slow log shows an occasional Lua script taking 2047 microseconds (2.047 ms), indicating that the script itself is probably fine, but the latency report shows the following -

You can see this effect more clearly using Commissar's latency tool ( https://github.com/therealbill/commissar/tree/master/latency ), which allows you to set a concurrency level and see the effect of nothing more than the Redis PING command at various levels of concurrency. Testing this way eliminates any data access or scripting from the equation, showing the direct effect of concurrency alone.

It could serve as a useful baseline indicating what is *possible* under ideal circumstances. Take the difference between that latency and yours and you'll have a rough idea of the latency overhead of the commands you're running.

Another way to look at the performance is to look at command stats, but with Lua scripting that gets a bit delicate. 


Question: Is the intrinsic latency shown above normal? What can be done to reduce it? Could this be what is causing the Redis response times to go up?


Intrinsic latency is a test of the system you run it on, measuring how fast that system can perform an operation. However, it cannot test concurrency effects directly. As concurrency increases and Redis has to do the switches necessary to handle it, latency will increase, because Redis spends more time switching between connections. So, given a higher level of concurrency, your latency will increase. Intrinsic latency will also increase as your CPU has more work to do, whether that work is Redis or not. The intrinsic latency test doesn't even open a connection to a server.

The problem isn't intrinsic latency, it is high concurrency. One option to reduce this latency is to use RMux ( https://github.com/forcedotcom/rmux ) to reduce your concurrency level to Redis. Another is to do the test client-side rather than server-side. It is really simple to do and would help cut down the time to execute the command. High concurrency is a case where you may be better off scaling by using the CPU of the clients individually rather than backing them up behind a single-threaded server. Still, that is only a short-term measure, as your concurrency level is the root of the latency. A third option is a faster CPU with good core isolation, but that would be more of a last-resort measure unless your CPU is really low-powered.

Consider 150 clients issuing a command which takes up to 100 microseconds. The last one to enter the queue will have to wait up to 100*149 microseconds for its command to be executed. The server-side command processing time will still be 100 microseconds and thus won't show up in slowlog. You could, however, measure it directly if you a) have microsecond timing resolution on the clients, and b) time *just* the command execution on the client side. Knowing exactly how long a given command takes is easier when not using Lua to chain multiple commands together, and would give you better predictability by providing actual data to fit into a model of what you can expect performance-wise.
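That arithmetic can be sketched directly. This is a rough model only; the 100-microsecond per-command service time is the illustrative figure above, not a measured value:

```python
# Rough model of queueing delay behind a single-threaded server.
# The 100-microsecond per-command service time is the illustrative
# figure from the text, not a measurement.

SERVICE_US = 100  # assumed server-side execution time per command

def worst_case_wait_us(clients: int, service_us: int = SERVICE_US) -> int:
    """Worst-case queueing delay for the last client in line: it waits
    for every client ahead of it to be served first."""
    return service_us * (clients - 1)

# With 150 concurrent clients the last one may wait ~14.9 ms even
# though each individual command still takes only 100 us server-side.
for n in (1, 10, 150):
    print(n, worst_case_wait_us(n))
```

The point is that the per-command service time never shows up in slowlog, while the queueing delay grows linearly with concurrency.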

On a single-threaded server we have to use a different mental model of what "high concurrency" means. Sure, for a multi-threaded (or multi-process) server such as Nginx or Apache, 150 concurrent clients is not "high concurrency". But for Redis that isn't quite the case.

I hope that helps, and if it raises more questions feel free to ask.


Cheers,
Bill

Josiah Carlson

May 24, 2015, 12:15:51 AM
to redi...@googlegroups.com
Everything that Bill said is correct.

I would add one more thing into the mix: Redis is not the only thing running on your machine.

Even if you are running your own bare-metal server, your kernel is scheduling Redis plus periodically running several (if not a dozen) background daemons on one or more CPU cores, based on things that are basically out of your control. Unless you explicitly switch to a real-time kernel and make heavy patches to Redis (and even if you do those things), Redis *will* have periodic spikes in latency that are no fault of Redis. You will find this with *every* piece of network-connected software that you run.

That said, if your Redis configuration is such that Redis is constantly forking to rewrite the AOF, write a new RDB snapshot, or connect a new slave, you can fix your configuration and bring down your average latency substantially.

 - Josiah


--
You received this message because you are subscribed to the Google Groups "Redis DB" group.
To unsubscribe from this group and stop receiving emails from it, send an email to redis-db+u...@googlegroups.com.
To post to this group, send email to redi...@googlegroups.com.
Visit this group at http://groups.google.com/group/redis-db.
For more options, visit https://groups.google.com/d/optout.

Kashyap Mhaisekar

May 24, 2015, 12:43:12 AM
to redi...@googlegroups.com

Thanks guys,
Then I wonder: what does the default limit of 10k connections in Redis mean? And I also wonder how Redis is fit for web-scale caching if concurrency is a problem?

Noted that I can use a slave for all reads, and I can change the AOF settings and check. Other than that, when the Redis documentation uses terms like "high performance", how should one understand what high performance is?

Thank you,
Kashyap

Greg Andrews

May 24, 2015, 1:56:17 AM
to redi...@googlegroups.com
I've been reading this list for several years now, and "concurrency" is very rarely a problem in the Redis server process.  In most cases the problem is in the client or in other processes running on the Redis machine and taking resources away from Redis.

There are discussions of benchmark tests and the factors that can limit performance in http://redis.io/topics/benchmarks, and in some old threads in this mailing list (searchable in Google Groups).

The first questions I ask when someone is seeing unexpectedly low performance are:
  1. Do you have a high proportion of connections per second or connections per command (as shown in the output of INFO)?
  2. Do you have a high proportion of threads within your client application compared to connections to Redis?
  3. Does the Redis server process consume 100% of a cpu core during the time you see low performance?
  4. What is the round-trip network latency between the client and the Redis server?  If it's above 1-2 ms, are you pipelining commands to compensate?
  5. Is your client application writing or reading large strings (as values or key names) often?
  6. Is the disk on the Redis server machine very busy during the times you see low performance?
  7. Is Redis saving snapshot data to disk (for backup or slave syncing) during the time you see low performance?
95% of the performance complaints I see in this mailing list are caused by #1 or #6. About 4% are caused by #3 or #4.
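On item 4 in particular, the payoff of pipelining is easy to estimate. A back-of-the-envelope sketch, where the 0.4 ms round-trip time is an assumed figure for illustration:

```python
# Estimate time spent on network round trips with and without
# pipelining. The 0.4 ms RTT is an assumption for illustration only.

RTT_MS = 0.4  # assumed client-to-server round-trip time

def total_rtt_ms(commands: int, pipelined: bool) -> float:
    """Round-trip overhead: one RTT per command when unpipelined,
    a single RTT for the whole batch when pipelined."""
    return RTT_MS if pipelined else RTT_MS * commands

print(total_rtt_ms(100, pipelined=False))  # 100 separate round trips
print(total_rtt_ms(100, pipelined=True))   # one batched round trip
```

With 100 commands, unpipelined traffic spends roughly 40 ms just on round trips, versus a single 0.4 ms round trip when the batch is pipelined, which is why network latency above 1-2 ms matters so much.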

In your particular use case, you're benchmarking a Lua script that performs four actions: (1) increments a counter, (2) tests it, (3) publishes a message, (4) returns a result code to the client.
You haven't said what commands are involved in "publish a message".  Is that a single PUBLISH command?  Something else?

What performance does your benchmark show when you only do an increment of the counter?  When you do a PUBLISH?

What throughput do you see when the number of threads in your client application is equal to the number of connections to Redis, so there's no bottleneck within your app?

  -Greg

Josiah Carlson

May 24, 2015, 2:29:45 AM
to redi...@googlegroups.com
Related to what Greg said; Redis on an AWS EC2 c1.medium will happily sustain 40-50k simple operations per second from properly configured clients. I used to use an AWS EC2 t1.micro as the destination for stats, and that little box saw 5-7k HINCRBY calls every second for the 9 months we ran it (4-5 years ago now).

Long story short: you keep looking to Redis as the source for problems in your setup, when you only just plugged Redis into your setup. Don't get me wrong, Redis has its limitations, its edge cases, and even use-cases where Redis might be ill-suited. But all of the information you have been giving us is telling me that Redis is *not* your problem. Greg listed 7 big issues, and I'd wager $20 that if/when you do eventually figure out why you're having as many problems as you're having, it will be due to one of the 7 he mentioned, or just generally "bad configuration".

 - Josiah

Bill Anderson

May 25, 2015, 2:01:27 AM
to redi...@googlegroups.com




On May 23, 2015, at 23:43, Kashyap Mhaisekar <kash...@gmail.com> wrote:

Thanks guys,
Then I wonder: what does the default limit of 10k connections in Redis mean?

It means it will stop accepting new connections when this limit is hit.

And I also wonder how Redis is fit for web-scale caching if concurrency is a problem?



Define "web scale". There is no metric behind this term. Don't use vague, meaningless terms when discussing performance. Use specific scenarios and data. Vague phrases are open to interpretation.

For example, you aren't doing caching, so what does Redis' caching performance mean for you? Nothing. From what little you've described, you have placed client-side logic into the server and are using it as a conditional communication bus, not a cache.

Noted that I can use a slave for all reads and I can change AOF and check. Other than that, when Redis documentation uses terms like High Performance, how should one understand what is high performance?


See above. Is a Ferrari a high performance car? It depends on how you measure it, and if you're measuring what you need. This is no different for Redis. When approaching Redis performance, the proper mindset is "how can I avoid slowing Redis down?" If you measure to your specific metrics required and it is "slow" you've already gone down the wrong path. 

There is a tremendous difference between, say, 5k open connections and 5k *active* connections. While even an open connection incurs a minor performance penalty it is nothing compared to an active one. 

If you want more specific help for your specific scenario, you will need to post actual code so it can be examined. What you have described is too generic and too vague to test, so you cannot get more detailed help than you've already been provided with.

Vague questions get vague answers, detailed questions get detailed answers. 


Cheers,
Bill

Kashyap Mhaisekar

May 27, 2015, 12:21:27 PM
to redi...@googlegroups.com
Greg/Josiah.
Thanks for the 7 commandments. Answering them indicates that Redis itself is yet to reach its max limits, and I don't seem to be anywhere near them. I have keyed in the answers anyway.

There are around 100 threads presently which try to access Redis from a pool of 1000 connections, and hence I guess there are at most 100 active connections.


Do you have a high proportion of connections per second or connections per command (as shown in the output of INFO)?
[Kashyap]: Stats show 1 instantaneous op per sec. Connections per second could be at most 100, and one connection is used for executing 1 Lua script, which does 10 fetches and one INCR; if the INCR value matches a number, it then publishes and deletes a value. The max time I saw for the Lua script in the slow logs was also around 5 ms.

Do you have a high proportion of threads within your client application compared to connections to Redis?
[Kashyap]: Threads in the application: close to 80. somaxconn on the Redis server is at 128 and tcp-backlog at 1024.

Does the Redis server process consume 100% of a cpu core during the time you see low performance?
[Kashyap]: No. Never exceeds 5%

What is the round-trip network latency between the client and the Redis server?  If it's above 1-2 ms, are you pipelining commands to compensate?
[Kashyap]: I am using Lua script to compensate instead of pipelining. Network latency is around 0.3-0.5 msec.

Is your client application writing or reading large strings (as values or key names) often?
[Kashyap]: Keys are around 32 chars while values are around 22 KB each. Each Lua script sends 10 keys and expects 10 values back, which makes each call around 220 KB (only reads). However, there is an INCR that is invoked each time, and when it reaches a specified limit the INCR value is deleted.

Is the disk on the Redis server machine very busy during the times you see low performance?
[Kashyap]: The NMON report does not seem to indicate this. Stats are 10 IO/sec, and while running the app there are deletes that happen.

Is Redis saving snapshot data to disk (for backup or slave syncing) during the time you see low performance?
[Kashyap]: Don't think so. BGSAVE is commented out. There is one slave connected, and the Redis logs have the following. Is there a save happening every 5 seconds? -
[7825] 27 May 06:32:33.750 - DB 6: 101005 keys (0 volatile) in 131072 slots HT.
[7825] 27 May 06:32:33.750 - 0 clients connected (1 slaves), 43763776 bytes in use
[7825] 27 May 06:32:38.757 - DB 6: 101005 keys (0 volatile) in 131072 slots HT.
[7825] 27 May 06:32:38.757 - 0 clients connected (1 slaves), 43763776 bytes in use
[7825] 27 May 06:32:43.767 - DB 6: 101005 keys (0 volatile) in 131072 slots HT.
[7825] 27 May 06:32:43.767 - 0 clients connected (1 slaves), 43763776 bytes in use
[7825] 27 May 06:32:48.775 - DB 6: 101005 keys (0 volatile) in 131072 slots HT.
[7825] 27 May 06:32:48.775 - 0 clients connected (1 slaves), 43800624 bytes in use

Thank you!
Regards,
Kashyap

Kashyap Mhaisekar

May 27, 2015, 12:34:12 PM
to redi...@googlegroups.com
Bill,
I get your point on web scale. Probably the wrong word to use. The eventual plan for us is to use Redis to support 1-5 million keys, which is a big deal for us.

Redis is being used for caching (all 1-5 million keys cached), but we are also using it for other purposes, since it has features like INCR and PUBLISH. We found it easier to use for incrementing and then publishing values out than traditional means.

I have 10K keys presently in the format below. (This 10K will be scaled to 1 million; this is just a PoC.)
ITM:1="9ab0d88431732957af18d4a469a0d4c3" (32 char string)

and another 10K keys as 
9ab0d88431732957af18d4a469a0d4c3=<<22KB String>>.

With this said, what I am trying to achieve is to iterate through the items, get the values, and do some operations on each value. Each time an item is iterated, we increment a counter in Redis by 1, and when the incremented value reaches a certain number (say 10000) we publish a message saying the job is completed.

"local luaresults={} "
+ "local fn=redis.call "
+ "local ti=table.insert "
+ "for idx=ARGV[1],ARGV[2] do "
+ "local itm=fn('GET',KEYS[idx]) "
+ "local itmval=fn('GET',itm) "
+ "ti(luaresults,itm..'~'..itmval) "
+ "end "
+ "return luaresults";

The second script, which increments, is -
String pubtojvm = "local itmcnt=redis.call('INCRBY',ARGV[1],ARGV[2]) "
+ "if itmcnt==tonumber(ARGV[4]) then "
// + "local appresults=redis.call('GET',ARGV[3]) "
+ "redis.call('PUBLISH','RESULTSCHANNEL'..ARGV[1],'qualitems:'..ARGV[3]) "
+ "redis.call('del',ARGV[1]) "
+ "end "
+ "return itmcnt";

The above two scripts are called by 100 threads in parallel from a pool currently consisting of 5K connections.

Thanks for the comments.

Regards,
Kashyap

Bill E. Anderson

May 27, 2015, 1:52:12 PM
to redi...@googlegroups.com

> On May 27, 2015, at 11:34 AM, Kashyap Mhaisekar <kash...@gmail.com> wrote:
>
> Bill,
> I get your point on web scale. Probably the wrong word to use. The eventual plan for us is to use Redis to support 1-5 million keys, which is a big deal for us.

And not a big deal for Redis. But the key thing to think of here in terms of performance is not how many keys you’re storing in Redis, but what operations you’re doing. Of course, I’m assuming you are not using the KEYS command. ;)


> Redis is being used for caching (all 1-5 million keys cached), but we are also using it for other purposes, since it has features like INCR and PUBLISH. We found it easier to use for incrementing and then publishing values out than traditional means.
>
> I have 10K keys presently in the format below. (This 10K will be scaled to 1 million; this is just a PoC.)
> ITM:1="9ab0d88431732957af18d4a469a0d4c3" (32 char string)
>
> and another 10K keys as
> 9ab0d88431732957af18d4a469a0d4c3=<<22KB String>>.

Just a note on resource usage as this scales up in key count: if you can map these hashes to integers you’ll see a dramatic drop in memory consumption. Automatically mapping these back and forth in a Redis Hash would be a good use of Lua here.
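A minimal sketch of that mapping idea (Python, with a plain dict standing in for the Redis Hash; all names here are illustrative, not an existing API):

```python
# Map long 32-char hash strings to small integer ids, so other keys can
# reference the compact id instead of repeating the full string.
# The dicts stand in for Redis Hashes (HSET/HGET) in this sketch.

id_by_hash = {}   # would be a Redis Hash: hash string -> integer id
hash_by_id = {}   # reverse mapping, for lookups by id
next_id = 0

def intern_hash(h: str) -> int:
    """Return the integer id for h, allocating a new one if needed."""
    global next_id
    if h not in id_by_hash:
        id_by_hash[h] = next_id
        hash_by_id[next_id] = h
        next_id += 1
    return id_by_hash[h]

a = intern_hash("9ab0d88431732957af18d4a469a0d4c3")
b = intern_hash("9ab0d88431732957af18d4a469a0d4c3")  # same id returned
```

In Redis proper, the two dicts would be Hashes and `intern_hash` a small Lua script using HGET/HSET plus an INCR for the id counter, so the allocation is atomic.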


> With this said, what I am trying to achieve is to iterate through the items, get the values, and do some operations on each value. Each time an item is iterated, we increment a counter in Redis by 1, and when the incremented value reaches a certain number (say 10000) we publish a message saying the job is completed.
>
> "local luaresults={} "
> + "local fn=redis.call "
> + "local ti=table.insert "
> + "for idx=ARGV[1],ARGV[2] do "
> + "local itm=fn('GET',KEYS[idx]) "
> + "local itmval=fn('GET',itm) "
> + "ti(luaresults,itm..'~'..itmval) "
> + "end "
> + "return luaresults";
>
> Second script that increments is -
> String pubtojvm = "local itmcnt=redis.call('INCRBY',ARGV[1],ARGV[2]) "
> + "if itmcnt==tonumber(ARGV[4]) then "
> // + "local appresults=redis.call('GET',ARGV[3]) "
> + "redis.call('PUBLISH','RESULTSCHANNEL'..ARGV[1],'qualitems:'..ARGV[3]) "
> + "redis.call('del',ARGV[1]) "
> + "end "
> + "return itmcnt";
>
> Above two scripts are called by 100 threads in parallel from a pool consisting of 5K connections currently.

I think you’d be much better served by doing this on the client rather than in Lua. I’ve seen a lot of this type of stuff done in Lua. Then when performance is discovered to be sub-par it gets handled on the client and the performance issue disappears. Using Lua here is not really helping you. In particular, look at your second script. You’re doing data type conversion on the server (tonumber…). Because it all operates within a “single Redis command” (the script execution), the server is unable to process other requests at the same time. With concurrent programming such as you’re fundamentally doing, you want those server-blocking commands to be as fast and as small in scope as possible.

By doing the bulk of that script on the client side you achieve these. For example on the client you would:

1. Call INCR (returns new val) *
2. Do any necessary data type conversions
3. Do test logic
4. If conditions met, call PUBLISH *
5. If conditions met, call DEL *

Note that in between these steps other connections can do their thing. Say the lua script took 25 microseconds to execute as a whole. That is 25us of latency added to the chain. However, if you can interleave commands by moving it client side, your interval latency drops to the time to execute an individual Redis command on the server as indicated in the above list by the ‘*’ at the end. For simple math purposes, say each one took 5us to execute. Now your maximum interval latency is 5us - 1/5th of the original. Plus you’ll know exactly what each operation requires meaning you’ll be able to predict performance profiles and changes, thus allowing you to have a good idea of what resources you’ll need as each component scales - and you can model it as well. That may sound large and it is. But it is also fairly representative of what I’ve seen when people convert Lua scripts like this to client-side testing instead.
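The five steps above can be sketched as client-side code (Python here, with a tiny in-memory stub standing in for the Redis connection; the stub and all names are purely illustrative, not the poster's actual client):

```python
# Client-side version of the counting/publish logic: the test and the
# conditional PUBLISH/DEL happen on the client, so the server only ever
# runs short single commands (INCRBY, PUBLISH, DEL), and between those
# commands it is free to serve other connections.

class FakeRedis:
    """In-memory stand-in for a Redis connection (illustrative only)."""
    def __init__(self):
        self.data = {}
        self.published = []

    def incrby(self, key, amount):
        self.data[key] = self.data.get(key, 0) + amount
        return self.data[key]

    def publish(self, channel, message):
        self.published.append((channel, message))

    def delete(self, key):
        self.data.pop(key, None)

def record_progress(r, counter_key, amount, channel, payload, limit):
    count = r.incrby(counter_key, amount)  # short server-side command
    if count == limit:                     # test logic runs on the client
        r.publish(channel, payload)        # only issued when needed
        r.delete(counter_key)
    return count

r = FakeRedis()
for _ in range(10):
    record_progress(r, "jobcounter", 1, "RESULTSCHANNEL", "job done", 10)
```

With a real client library the three stub methods map to the INCRBY, PUBLISH, and DEL commands, and the INCRBY/PUBLISH/DEL trio can be pipelined as noted below to keep round trips to a minimum.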

As far as network latency you can pipeline the above calls if desired to keep that to a minimum. Unless you have a condition where multiple processes are trying to work on the same keys you wouldn’t even need MULTI/EXEC.

What you’ve described is “very parallel” - so don’t serialize it on the server. ;)

Cheers,
Bill

Kashyap Mhaisekar

May 27, 2015, 2:19:15 PM
to redi...@googlegroups.com
Wow! Thanks a lot for all the inputs, Bill. I was going down the Lua script path because I wanted to get as much work done in one connection as possible. But the implication for the other requests seems to be greater if I go down this path.

On a side note, I was looking into http://redis.io/topics/admin and it looks like we have the TCP backlog at 512 and somaxconn at 128. I plan to increase these to higher numbers and also move from Lua to the client side as you suggested.

Thanks a lot!
Kashyap


Cheers,
Bill
