Hi all,

We're struggling with an issue where connections from our application to Redis are regularly timing out. The Ruby Redis gem responds with: Redis::TimeoutError: Connection timed out. The errors come from both our Rails application and our Resque workers.

We're running Redis 2.4.17 with a 6.5 GB DB. In the redis.conf for our Redis master, timeout is set to 0.

When we initially started having these issues, we shut off all persistence on the master and ran with persistence on the slave only. This prevented timeout issues for a few weeks, but we've started seeing timeouts again.

We have 120 Resque workers connected to the Redis DB, plus a varying number of threads from Passenger running our Rails app. We're seeing up to 800 clients connected to Redis, but the timeouts seem to happen regardless of the number of clients; we've seen them happen with only a few hundred connections.

This issue - https://github.com/mperham/sidekiq/issues/517 - indicated that the problem might be due to swapping. We have plenty of free RAM and load is rarely above 1.

We largely see the timeouts occurring on an hourly basis, which leads us to suspect some kind of scheduled job. We are running scheduled tasks via Resque Scheduler, but in reviewing these tasks we haven't identified anything running deletes against the DB (which we've seen produce Redis timeouts when doing DB maintenance in the past).

Based on this issue - http://code.google.com/p/redis/issues/detail?id=500 - we thought that perhaps it was contention between concurrent connections, so we increased the open file limit for the user running Redis on our Redis master from 1024 to 2048. This didn't have any effect.

Our Redis info is below.

Our application is growing quickly, and the timeouts are causing concern from our operations team about whether Redis can stand up to our growth. Additionally, we're increasing the number of servers and Resque workers we have running to handle growing load on our site, so we're concerned that the issue is only going to worsen in severity.

Any suggestions of where to look for the root cause of this issue?

Thanks!
Rob Shedd

$ ./redis-cli info
redis_version:2.4.17
redis_git_sha1:00000000
redis_git_dirty:0
arch_bits:64
multiplexing_api:epoll
gcc_version:4.1.2
process_id:4456
run_id:efe123235fabcd130f8558ac71f03e9398031172
uptime_in_seconds:2159271
uptime_in_days:24
lru_clock:1779123
used_cpu_sys:201364.34
used_cpu_user:107124.49
used_cpu_sys_children:5.86
used_cpu_user_children:28.78
connected_clients:516
connected_slaves:1
client_longest_output_list:0
client_biggest_input_buf:0
blocked_clients:0
used_memory:7021963088
used_memory_human:6.54G
used_memory_rss:7210254336
used_memory_peak:7057749736
used_memory_peak_human:6.57G
mem_fragmentation_ratio:1.03
mem_allocator:jemalloc-3.0.0
loading:0
aof_enabled:0
changes_since_last_save:1981202607
bgsave_in_progress:0
last_save_time:1357809314
bgrewriteaof_in_progress:0
total_connections_received:52464424
total_commands_processed:20682744678
expired_keys:1124399
evicted_keys:0
keyspace_hits:354418470
keyspace_misses:199629652
pubsub_channels:0
pubsub_patterns:0
latest_fork_usec:273343
vm_enabled:0
role:master
slave0:10.97.18.18,6379,online
db0:keys=10492919,expires=3392
db1:keys=222,expires=0
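One check we can run around the top of the hour is to poll the fork/snapshot-related fields from INFO and see whether a background save or an expensive fork lines up with the timeouts; a rough sketch (the field names are just the ones from the INFO output above):

# Poll fork/snapshot-related INFO fields every 10 seconds around the top
# of the hour and see whether anything lines up with the timeouts.
while true; do
  date
  ./redis-cli info | grep -E 'latest_fork_usec|bgsave_in_progress|changes_since_last_save|connected_clients'
  sleep 10
done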
I already checked the kernel log (/var/log/messages) with no luck.

Pretty sure it's not swap:

top - 18:40:21 up 132 days, 1:11, 1 user, load average: 0.14, 0.17, 0.17
Tasks: 117 total, 1 running, 116 sleeping, 0 stopped, 0 zombie
Cpu(s): 0.8%us, 1.3%sy, 0.0%ni, 96.0%id, 0.0%wa, 0.3%hi, 1.5%si, 0.0%st
Mem: 32959864k total, 15864288k used, 17095576k free, 3805836k buffers
Swap: 2096472k total, 8032k used, 2088440k free, 4655836k cached

PID  USER    PR NI VIRT  RES  SHR S %CPU %MEM TIME+   SWAP COMMAND
4456 betdash 15 0  6934m 6.7g 820 S 21.0 21.4 5219:04 38m  redis-server

The server Redis is running on has 8 cores and 32 GB RAM, so there's plenty of power. It is a VMware instance.

We're on CentOS 5.5. This is sysctl.conf, which is fairly standard across our stack:

# puppet sysctl module
#
# Kernel sysctl configuration file for Red Hat Linux
#
# For binary values, 0 is disabled, 1 is enabled. See sysctl(8) and
# sysctl.conf(5) for more details.
net.ipv4.ip_forward = 0
net.ipv4.conf.all.send_redirects = 0
net.ipv4.conf.default.send_redirects = 0
net.ipv4.tcp_max_syn_backlog = 1280
net.ipv4.icmp_echo_ignore_broadcasts = 1
net.ipv4.conf.all.accept_source_route = 0
net.ipv4.conf.all.accept_redirects = 0
net.ipv4.conf.all.secure_redirects = 0
net.ipv4.conf.all.log_martians = 1
net.ipv4.conf.default.accept_source_route = 0
net.ipv4.conf.default.accept_redirects = 0
net.ipv4.conf.default.secure_redirects = 0
net.ipv4.icmp_echo_ignore_broadcasts = 1
net.ipv4.icmp_ignore_bogus_error_responses = 1
net.ipv4.tcp_syncookies = 1
net.ipv4.conf.all.rp_filter = 1
net.ipv4.conf.default.rp_filter = 1
net.ipv4.tcp_timestamps = 0
# ensure core dumps can never be made by setuid programs
fs.suid_dumpable = 0
# BetDash Redis setting to resolve memory allocation error
vm.overcommit_memory = 1

We'll look into limiting the number of concurrent connections within our architecture. Is there an upper limit that we should be wary of here?

Thanks for the thoughts.
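In case it's useful, this is a rough sketch of the kind of check we can run to see how close redis-server is to its descriptor limit (4456 is the redis-server PID from the top output above; the shell's ulimit may differ from what the running process actually inherited):

# Count descriptors currently open by redis-server (PID 4456 from top above)
ls /proc/4456/fd | wc -l
# Open-file limit for the current shell; run this as the betdash user
ulimit -n
# Clients currently connected, as reported by Redis itself
./redis-cli info | grep connected_clients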
We're not using iptables on the Redis machine, so we've discounted this.
If there was kernel resource starvation, I would expect to see this reflected in the logs. If such starvation was happening at such a regular and well-defined interval (i.e. on the hour) it should in theory be possible to determine the trigger and remediate accordingly. In the absence of anything in the logs, it is difficult to confirm or deny such starvation.
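For completeness, the kind of check for an hourly trigger would look roughly like this (a sketch; the last command assumes the sysstat package is installed and collecting):

# Look for anything scheduled on the hour, and for activity spikes at those times
ls /etc/cron.hourly/
crontab -l
tail -50 /var/log/cron
# Load-average history in ~10-minute buckets, if sysstat is collecting
sar -q | head -30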
Redis::CannotConnectError: Timed out connecting to Redis on redis.betdash.com:6379
  from: [PROJECT_ROOT]/vendor/bundle/ruby/1.9.1/gems/redis-3.0.2/lib/redis/client.rb, line 266

Redis::TimeoutError: Connection timed out
  from: [PROJECT_ROOT]/vendor/bundle/ruby/1.9.1/gems/redis-3.0.2/lib/redis/client.rb, line 204
[server]$ ./redis-cli --latency
min: 0, max: 3, avg: 0.04 (852 samples)
[server]$ ./redis-cli --latency
min: 0, max: 2, avg: 0.02 (322 samples)
[server]$ ./redis-cli --latency
min: 0, max: 28193, avg: 157.51 (179 samples) <<<<<<<<<<< while timeouts ongoing
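When the latency spikes like that, it's probably worth grabbing the slow log and the fork cost in the same window; a rough sketch (SLOWLOG has been available since 2.2.12, so 2.4.17 should support it):

# While --latency is showing a spike, capture the slowest recent commands
# and the cost of the most recent fork in the same window.
./redis-cli slowlog get 10
./redis-cli info | grep -E 'latest_fork_usec|bgsave_in_progress|aof_enabled'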
Hello Robert,
did you ever find a solution to this? We are facing exactly the same problem and have been struggling with it for many weeks now. As soon as there is some heavy I/O, we get Redis connection timeouts.
We would really appreciate any hints on how to solve this.
Thanks in advance
Frank
Hello Salvatore and Robert,

thanks a lot for your answers and for leading us in the right direction!
I have tried different settings in the configuration file for the snapshotting, and then we decided to try different setups: we first had a cluster of 4 Redis instances running behind twemproxy (all on the same machine with 8 cores). We then moved to ElastiCache, and we are currently running a single-process server on an m1.large instance (version 2.6.13). All of them showed the same behaviour.
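For reference, the snapshotting settings we've been toggling look roughly like this on a self-managed instance (a sketch; on ElastiCache the CONFIG command is not exposed, so the equivalent change has to go through a parameter group):

# Inspect and change RDB snapshotting at runtime (lost on restart unless
# also changed in redis.conf).
redis-cli config get save                             # current "save <seconds> <changes>" rules
redis-cli config set save ""                          # disable RDB snapshots
redis-cli config set save "900 1 300 10 60 10000"     # restore the defaults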
Hi Josiah,

I have checked all possible ways but still can't find the root cause. My Redis log is shown below, and it was last updated a long time ago.

tail -100f redis_6379.log
[10865] 30 Apr 02:41:36.725 # Server started, Redis version 2.6.14
[10865] 30 Apr 02:41:36.725 # WARNING overcommit_memory is set to 0! Background save may fail under low memory condition. To fix this issue add 'vm.overcommit_memory = 1' to /etc/sysctl.conf and then reboot or run the command 'sysctl vm.overcommit_memory=1' for this to take effect.
[10865] 01 May 02:04:49.737 # User requested shutdown...
[10865] 01 May 02:04:49.970 # Redis is now ready to exit, bye bye...
[2545] 01 May 02:04:53.013 # Server started, Redis version 2.6.14
[2545] 01 May 02:04:53.014 # WARNING overcommit_memory is set to 0! Background save may fail under low memory condition. To fix this issue add 'vm.overcommit_memory = 1' to /etc/sysctl.conf and then reboot or run the command 'sysctl vm.overcommit_memory=1' for this to take effect.
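The WARNING lines in that log spell out their own fix, which is worth applying regardless:

# Allow memory overcommit so background saves can fork successfully,
# as suggested by the WARNING above.
sysctl vm.overcommit_memory=1                          # takes effect immediately
echo 'vm.overcommit_memory = 1' >> /etc/sysctl.conf    # persists across reboots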
[root@xxxxx log]$ date
Fri Jul 11 06:46:30 UTC 2014
slow log
======
redis-cli
redis 127.0.0.1:6379> ping
(error) ERR operation not permitted
redis 127.0.0.1:6379> slowlog get 2
(error) ERR operation not permitted
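For what it's worth, on Redis 2.6 that "(error) ERR operation not permitted" response usually means requirepass is set in redis.conf and the client hasn't authenticated; a quick sketch ("yourpassword" is just a placeholder for whatever requirepass is set to):

# Authenticate before running commands; "yourpassword" is a placeholder.
redis-cli -a yourpassword ping
redis-cli -a yourpassword slowlog get 2
# Or interactively:
redis-cli
redis 127.0.0.1:6379> auth yourpassword
redis 127.0.0.1:6379> slowlog get 2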