Frequent switching of Redis master in a Redis-Sentinel Environment


Mehedi Hasan

Jul 20, 2019, 11:40:58 AM
to Redis DB
Hi Good People,
We have been using a Redis Sentinel cluster for almost 2 years. We have 14 clients (a Sentinel runs on each client), and the Sentinel configuration is as follows:


bind 0.0.0.0
port 26379
sentinel myid ***************************************
sentinel monitor redis-cluster IP Port 11
sentinel down-after-milliseconds redis-cluster 10000
sentinel failover-timeout redis-cluster 10000
sentinel auth-pass redis-cluster *******************************************
logfile "/path/to/redis-sentinel.log"

# Generated by CONFIG REWRITE
daemonize yes
pidfile "/var/run/redis/sentinel.pid"
dir "/"
sentinel config-epoch redis-cluster 635
sentinel leader-epoch redis-cluster 635


Redis has AOF enabled (appendfsync everysec). It is used as a key-value database, and we also maintain a queue through it. For the last month we have been facing a master switch almost every day. Initially we saw the error below:

12646:M 04 Jul 01:21:24.082 * Asynchronous AOF fsync is taking too long (disk is busy?). Writing the AOF buffer without waiting for fsync to complete, this may slow down Redis.
12646:M 04 Jul 01:21:31.100 * Asynchronous AOF fsync is taking too long (disk is busy?). Writing the AOF buffer without waiting for fsync to complete, this may slow down Redis.
12646:M 04 Jul 01:21:40.046 * Asynchronous AOF fsync is taking too long (disk is busy?). Writing the AOF buffer without waiting for fsync to complete, this may slow down Redis.
12646:M 04 Jul 01:21:48.095 * Asynchronous AOF fsync is taking too long (disk is busy?). Writing the AOF buffer without waiting for fsync to complete, this may slow down Redis.
12646:M 04 Jul 01:21:57.039 * Asynchronous AOF fsync is taking too long (disk is busy?). Writing the AOF buffer without waiting for fsync to complete, this may slow down Redis.
12646:M 04 Jul 01:22:07.094 * Asynchronous AOF fsync is taking too long (disk is busy?). Writing the AOF buffer without waiting for fsync to complete, this may slow down Redis.
12646:M 04 Jul 01:22:52.005 * Asynchronous AOF fsync is taking too long (disk is busy?). Writing the AOF buffer without waiting for fsync to complete, this may slow down Redis.
12646:M 04 Jul 01:24:42.066 * Asynchronous AOF fsync is taking too long (disk is busy?). Writing the AOF buffer without waiting for fsync to complete, this may slow down Redis.
12646:M 04 Jul 01:24:57.047 * Asynchronous AOF fsync is taking too long (disk is busy?). Writing the AOF buffer without waiting for fsync to complete, this may slow down Redis.
12646:M 04 Jul 01:25:39.076 * Asynchronous AOF fsync is taking too long (disk is busy?). Writing the AOF buffer without waiting for fsync to complete, this may slow down Redis.
12646:M 04 Jul 01:25:57.022 * Asynchronous AOF fsync is taking too long (disk is busy?). Writing the AOF buffer without waiting for fsync to complete, this may slow down Redis.
 
Our Redis servers are VMs in our private data center and were sharing disk with other servers. After observing those errors, we migrated both master and slave to a separate LUN with no other VM mounted on it, but we still saw the same errors. We checked network, memory, disk, and everything else. The disk is HP 3PAR with SSDs; a disk performance graph is attached.
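
For reference, disk latency and utilization of the kind shown in the attachment can be sampled with standard tooling (a sketch; column names vary by sysstat version):

iostat -x 1   # extended per-device stats every second; watch await/w_await and %util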

Current configuration of each server is:
CPU: 12 vCPU
Memory: 72GB
Disk: 300GB (SSD)
Redis Max_memory: 20GB
Redis over-commit: False
THP: Enabled
vNIC capacity: 10G

We have not enabled over-commit, as the virtual machine has sufficient memory. The latest logs before the last switch look like this:

Master Log:
6308:C 19 Jul 22:22:00.040 * RDB: 1153 MB of memory used by copy-on-write
28654:M 19 Jul 22:22:00.857 * Background saving terminated with success
28654:M 19 Jul 22:23:01.055 * 10000 changes in 60 seconds. Saving...
28654:M 19 Jul 22:23:01.740 * Background saving started by pid 6760
28654:M 19 Jul 22:23:38.081 * Asynchronous AOF fsync is taking too long (disk is busy?). Writing the AOF buffer without waiting for fsync to complete, this may slow down Redis.
6760:C 19 Jul 22:25:59.717 * DB saved on disk
6760:C 19 Jul 22:26:00.205 * RDB: 1159 MB of memory used by copy-on-write
28654:M 19 Jul 22:26:01.017 * Background saving terminated with success
28654:M 19 Jul 22:27:02.016 * 10000 changes in 60 seconds. Saving...
28654:M 19 Jul 22:27:02.679 * Background saving started by pid 7270
28654:S 19 Jul 22:27:12.875 # Connection with slave IP:6379 lost.
28654:S 19 Jul 22:27:12.998 * Before turning into a slave, using my master parameters to synthesize a cached master: I may be able to synchronize with the new master with just a partial transfer.
28654:S 19 Jul 22:27:12.998 * SLAVE OF IP:6379 enabled (user request from 'id=75176616 addr=IP:23816 fd=626 name=sentinel-5c7721e9-cmd age=10 idle=0 flags=x db=0 sub=0 psub=0 multi=3 qbuf=0 qbuf-free=32768 obl=36 oll=0 omem=0 events=r cmd=exec')
28654:S 19 Jul 22:27:13.000 # CONFIG REWRITE executed with success.
28654:S 19 Jul 22:27:13.482 * Connecting to MASTER IP:6379
28654:S 19 Jul 22:27:13.482 * MASTER <-> SLAVE sync started
28654:S 19 Jul 22:27:13.482 * Non blocking connect for SYNC fired the event.
28654:S 19 Jul 22:27:13.482 * Master replied to PING, replication can continue...
28654:S 19 Jul 22:27:13.483 * Trying a partial resynchronization (request ****************).
28654:S 19 Jul 22:27:38.429 * Full resync from master: ********************
28654:S 19 Jul 22:27:38.429 * Discarding previously cached master state.
7270:C 19 Jul 22:29:51.022 * DB saved on disk
7270:C 19 Jul 22:29:51.477 * RDB: 3642 MB of memory used by copy-on-write
28654:S 19 Jul 22:29:52.462 * Background saving terminated with success

Slave Log:
17194:C 19 Jul 22:23:49.183 * RDB: 1339 MB of memory used by copy-on-write
11795:S 19 Jul 22:23:49.942 * Background saving terminated with success
11795:S 19 Jul 22:24:50.093 * 10000 changes in 60 seconds. Saving...
11795:S 19 Jul 22:24:50.651 * Background saving started by pid 17634
11795:M 19 Jul 22:26:16.869 # Setting secondary replication ID to ce5d1d734a007a2b96183d48ede899c174e83f1b, valid up to offset: 162712239579. New replication ID is 15793e555c6080779d18e1c8f876bf0c1b2d5bfd
11795:M 19 Jul 22:26:16.869 # Connection with master lost.
11795:M 19 Jul 22:26:16.869 * Caching the disconnected master state.
11795:M 19 Jul 22:26:16.869 * Discarding previously cached master state.
11795:M 19 Jul 22:26:16.869 * MASTER MODE enabled (user request from 'id=1432442 addr=IP:38994 fd=28 name=sentinel-e1402af9-cmd age=69213 idle=0 flags=x db=0 sub=0 psub=0 multi=3 qbuf=0 qbuf-free=32768 obl=36 oll=0 omem=0 events=r cmd=exec')
11795:M 19 Jul 22:26:16.870 # CONFIG REWRITE executed with success.
11795:M 19 Jul 22:26:16.923 * Replication backlog freed after 3600 seconds without connected slaves.
11795:M 19 Jul 22:27:13.483 * Slave IP:6379 asks for synchronization
11795:M 19 Jul 22:27:13.483 * Partial resynchronization not accepted: Replication ID mismatch (Slave asked for 'ce5d1d734a007a2b96183d48ede899c174e83f1b', my replication IDs are '68b45810acaf15a79c8d4bd6ee98d2759f1bb74b' and '0000000000000000000000000000000000000000')
11795:M 19 Jul 22:27:13.483 * Can't attach the slave to the current BGSAVE. Waiting for next BGSAVE for SYNC
17634:C 19 Jul 22:27:36.683 * DB saved on disk
17634:C 19 Jul 22:27:37.116 * RDB: 1361 MB of memory used by copy-on-write
11795:M 19 Jul 22:27:37.840 * Background saving terminated with success
11795:M 19 Jul 22:27:37.840 * Starting BGSAVE for SYNC with target: disk
11795:M 19 Jul 22:27:38.428 * Background saving started by pid 18056
11795:M 19 Jul 22:28:11.012 * Asynchronous AOF fsync is taking too long (disk is busy?). Writing the AOF buffer without waiting for fsync to complete, this may slow down Redis.
11795:M 19 Jul 22:29:21.001 * Asynchronous AOF fsync is taking too long (disk is busy?). Writing the AOF buffer without waiting for fsync to complete, this may slow down Redis.
18056:C 19 Jul 22:30:27.528 * DB saved on disk
18056:C 19 Jul 22:30:27.958 * RDB: 1350 MB of memory used by copy-on-write
11795:M 19 Jul 22:30:28.729 * Background saving terminated with success
11795:M 19 Jul 22:31:29.034 * 10000 changes in 60 seconds. Saving...
11795:M 19 Jul 22:31:29.619 * Background saving started by pid 18583
18583:C 19 Jul 22:34:16.692 * DB saved on disk
18583:C 19 Jul 22:34:17.160 * RDB: 1283 MB of memory used by copy-on-write
11795:M 19 Jul 22:34:17.883 * Background saving terminated with success
11795:M 19 Jul 22:35:18.074 * 10000 changes in 60 seconds. Saving...
11795:M 19 Jul 22:35:18.657 * Background saving started by pid 19043
19043:C 19 Jul 22:38:04.105 * DB saved on disk
19043:C 19 Jul 22:38:04.546 * RDB: 1305 MB of memory used by copy-on-write
11795:M 19 Jul 22:38:05.276 * Background saving terminated with success
11795:M 19 Jul 22:39:06.056 * 10000 changes in 60 seconds. Saving...
11795:M 19 Jul 22:39:06.642 * Background saving started by pid 19501
19501:C 19 Jul 22:41:52.293 * DB saved on disk
19501:C 19 Jul 22:41:52.735 * RDB: 2424 MB of memory used by copy-on-write
11795:M 19 Jul 22:41:53.639 * Background saving terminated with success
11795:M 19 Jul 22:42:54.074 * 10000 changes in 60 seconds. Saving...
11795:M 19 Jul 22:42:54.656 * Background saving started by pid 19983
19983:C 19 Jul 22:45:37.980 * DB saved on disk


The Redis INFO output is also attached. CPU and memory utilization are below 50% during the incidents. Can anyone help me figure out what to do?


Disk Performance.png
redis_info.txt

Greg Andrews

Jul 20, 2019, 4:05:20 PM
to Redis DB
So you have Sentinels indicating the master stops responding to them, and the master's activity log around those times consistently indicates slow disk write performance.  You also seem to be saving to disk on all your Redis instances - master and slaves.

The first thing I would do is test whether this really is linked to disk writes:  Turn off the AOF on your master and see if this makes your Sentinels happier with the master's responsiveness.  You have AOF on your slaves, so you are not endangering your ability to recover data after a crash.  Then turn AOF back on and see if the problem re-appears.  Then turn it off again and ensure it disappears.
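
For what it's worth, that test needs no restart; AOF can be toggled at runtime (a sketch; <master-ip> and <password> are placeholders, and the change is not persisted to the config file unless you also run CONFIG REWRITE):

# Turn off the AOF on the master (takes effect immediately):
redis-cli -h <master-ip> -p 6379 -a <password> CONFIG SET appendonly no

# Turn it back on later; note this kicks off a background AOF rewrite,
# which is itself a burst of disk writes:
redis-cli -h <master-ip> -p 6379 -a <password> CONFIG SET appendonly yes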

If you reach this point and have established that the AOF write troubles are triggering the failovers from the Sentinels, you can dig deeper into your disk subsystem to learn how/why and find a fix or workaround.  If the failovers are not being triggered by disk, then you can start monitoring the master redis-server process more closely to learn what's causing the lack of responsiveness at certain times.

One such non-disk possibility: as the warning on the https://redis.io/commands/keys page indicates, certain commands can ruin the performance of a redis-server process.  If a client or a script is sending such a command to your master redis-server process, there's a distinct possibility the command makes the process slow to respond at just the time several Sentinels want it to answer, and triggers a failover.  The slow log should reveal this particular kind of trouble, so it doesn't seem likely in your case.  I'm just giving it as an example of non-disk-related performance troubles.
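
For reference, pulling the slow log is a one-liner (the entry count is arbitrary; the threshold is in microseconds):

# Ten most recent slow-log entries:
redis-cli -h <master-ip> -a <password> SLOWLOG GET 10

# The threshold above which commands are logged (default 10000 = 10 ms):
redis-cli -h <master-ip> -a <password> CONFIG GET slowlog-log-slower-than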

One other thing I noticed about your Redis info: the instance had been up for 242777 seconds (2.8 days) and in that time received 19,133,592 connections, but at the time of the INFO command only 33 clients were connected.  That means clients connect to and disconnect from that Redis server at an average rate of 79 per second, and the peak rate is very likely significantly higher.  See this post of mine from a year ago, which has a link to a Trivago blog post about their Redis scaling adventures - one of the early ones being the performance impact of their Redis clients constantly connecting/disconnecting rather than keeping connections open.  That is another possible source of poor responsiveness from the master to the Sentinels.
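
Those figures come straight out of INFO; a quick way to redo the arithmetic on a live instance (a sketch; field names as of Redis 4.x):

redis-cli -a <password> INFO server  | grep uptime_in_seconds
redis-cli -a <password> INFO stats   | grep total_connections_received
redis-cli -a <password> INFO clients | grep connected_clients
# 19133592 connections / 242777 seconds = ~79 new connections per second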

Greg Andrews

Jul 20, 2019, 4:15:57 PM
to Redis DB
Oh, I just noticed.  You have two different types of saving to disk enabled:  RDB, aka background snapshots, and AOF.  These are somewhat redundant.

The AOF write complaints in the Redis logfile occur as the RDB dump is being performed.  So your process is dumping the entire 17 GB database to a file, and apparently using up the available disk I/O bandwidth as it does so, producing complaints by the thread that's writing the small amount of AOF data.  Your disk graphs should show this spike in write activity, but you will likely have to zoom in to a 30-minute or 60-minute view rather than an 18-hour view, and you would prefer to see the peak or max numbers rather than the average.
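
One way to watch the two mechanisms collide from the server side (a sketch; field names from INFO persistence - aof_delayed_fsync counts exactly the "fsync is taking too long" events in your log):

redis-cli -a <password> INFO persistence | egrep 'rdb_bgsave_in_progress|rdb_last_bgsave_status|aof_last_write_status|aof_delayed_fsync'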

I would still turn off disk saving, but start with the background snapshots (the config file "save" lines) and leave the AOF going.  This is the most likely source of your trouble.
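
Concretely (a sketch; the empty string clears all "save" rules until restart, and CONFIG REWRITE makes it stick):

# Disable RDB background snapshots at runtime:
redis-cli -h <master-ip> -a <password> CONFIG SET save ""
# Persist the change into the config file so a restart doesn't bring them back:
redis-cli -h <master-ip> -a <password> CONFIG REWRITE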



Mehedi Hasan

Jul 20, 2019, 6:49:33 PM
to redi...@googlegroups.com
Hi Andrews,
Thanks for going into the details. Actually, the Redis documentation recommends not using AOF alone (https://redis.io/topics/persistence).

Also, I had assumed that RDB only writes the last 'N' changes over 'X' time, as set by the 'save' parameter - in this case, 10K changes every minute. But you are saying the whole 17 GB dataset is flushed to disk whenever a snapshot is triggered. So my understanding was wrong?


Anyway, I am taking 3 recommendations from you:

1. Disabling RDB snapshotting on the master
2. Keeping persistent (keep-alive) connections open to Redis
3. Checking for blocking commands



Greg Andrews

Jul 21, 2019, 2:38:59 AM
to Redis DB
Unfortunately, the documentation is a little obscure on this point.  The AOF performs incremental writes to its file.  Snapshots do not perform incremental writes.  They save the entire dataset (database).  As the page you linked says in the "Snapshotting" section (https://redis.io/topics/persistence#snapshotting):

How it works

Whenever Redis needs to dump the dataset to disk, this is what happens:

  • Redis forks. We now have a child and a parent process.

  • The child starts to write the dataset to a temporary RDB file.

  • When the child is done writing the new RDB file, it replaces the old one.

In every place the page talks about snapshots, it refers to the data being saved as "the dataset".  The plain meaning is "the whole dataset", and nothing in that Persistence page, nor the blog post explanation, nor the SAVE / BGSAVE command pages indicates something different from the plain meaning.  The only time incremental updates are discussed is in relation to the AOF file, which is a different file from the snapshot/RDB file.

The value of the 'save' parameter that appears (from your logfile entries) to be operative is 'save 60 10000'.  The meaning is: "If at least 10000 changes have been made in the last 60 seconds, save a snapshot of the database to the RDB file."
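
For reference, the stock redis.conf ships three such rules, each an independent trigger; your logs show the third one firing:

save 900 1      # snapshot if at least 1 change in the last 900 seconds
save 300 10     # snapshot if at least 10 changes in the last 300 seconds
save 60 10000   # snapshot if at least 10000 changes in the last 60 seconds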

Of the three action items you listed, please first try disabling the RDB/snapshot saves.  The available evidence indicates this is the most likely reason your master is not responding to the Sentinels quickly enough.

Mehedi Hasan

Jul 22, 2019, 11:38:26 AM
to Redis DB
Hi Again,
We will turn off RDB in production tonight. However, we got some more information after further investigation. We found that we have certain slow log entries; for example, even under normal conditions we are getting slow responses from the INFO command:

 8) 1) (integer) 41141
    2) (integer) 1563809001
    3) (integer) 1200
    4) 1) "INFO"
    5) "127.0.0.1:50924"
    6) ""
 9) 1) (integer) 41140
    2) (integer) 1563808999
    3) (integer) 1176
    4) 1) "INFO"
    5) "REMOTE_IP:37389"
    6) "sentinel-61ed1d3d-cmd"
10) 1) (integer) 41139
    2) (integer) 1563808996
    3) (integer) 1455
    4) 1) "INFO"
    5) "REMOTE_IP:64862"
    6) "sentinel-1ac515bd-cmd"

It seems that when Predis runs INFO, it gets a slow response. Further digging showed that we are constantly facing a huge number of key expirations. According to the Redis documentation, active expiry happens in a probabilistic manner, and at most around 200 expirations per second can happen. This can induce a significant amount of latency, but no forum makes it clear how to avoid it.

Source of the expiry: we are caching some information with a long TTL (60 days). There are around 15M keys of this type. This business feature launched 2.5 months ago, and we suspect those keys have started to expire, causing the latency.
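
For reference, the expiry rate can be estimated from the cumulative expired_keys counter in INFO stats (a sketch; the 10-second window is arbitrary):

redis-cli -a <password> INFO stats | grep expired_keys
sleep 10
redis-cli -a <password> INFO stats | grep expired_keys
# (second value - first value) / 10 = expirations per second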

Any help or suggestion in this regard?