Redis replication issue for a very large data set (180 GB)

Christine Dodrill

unread,

Apr 23, 2015, 3:53:26 PM4/23/15

to redi...@googlegroups.com

Hello,

We have a rather large (180 GB) data set in redis that is having trouble replicating. It is causing errors after about 30-40 minutes of working (depending on buffer sizes) and returns an error similar to:

[31824] 22 Apr 12:13:15.955 # I/O error reading bulk count from MASTER: Resource temporarily unavailable

This has been working fine until the last failover, when the slave started to fail syncing like this.

Does anyone have any ideas as to what could be causing this? We are on Redis 2.8.7.

Thanks,

--

Christine Dodrill

Site Reliability Engineer @ IMVU

Josiah Carlson

unread,

Apr 24, 2015, 1:43:02 AM4/24/15

to redi...@googlegroups.com

You are probably going to need to take down your servers for a few minutes to upgrade Redis versions to 2.8.19 at least, and maybe even the most recent 3.0.0 (you can run in non-cluster mode). Maybe even getting a bigger machine, or look into whether you can shard your data and either run Redis cluster or Twemproxy + sharded Redis, or just sharded Redis controlled/managed by your clients. You have options and a way forward.

Some questions:

How much main memory does the system have? Is it a VM, colo, self-hosted? How fast is the network? What is your buffer set at? How big is your most recent successful dump? Do you have free space? What does the system log on the master say? Can you show the INFO output on the master, a working, and non-working slave? What is the network bandwidth? How much data is Redis handling/second? What is the network utilization? What is the disk IO? Is Redis swapping (can find this out from INFO output)? Can you provide the first few lines of 'top' all the way to and including the line starting with 'Swap:'? Are you using AOF? RDB? Spinning disks? SSDs? RAID? Do you have any utilization/CPU graphs of around the time that this happened? What does it look like? Any other weird stuff going on?

- Josiah

--
You received this message because you are subscribed to the Google Groups "Redis DB" group.
To unsubscribe from this group and stop receiving emails from it, send an email to redis-db+u...@googlegroups.com.
To post to this group, send email to redi...@googlegroups.com.
Visit this group at http://groups.google.com/group/redis-db.
For more options, visit https://groups.google.com/d/optout.

Christine Dodrill

unread,

Apr 24, 2015, 6:37:24 PM4/24/15

to redi...@googlegroups.com

On Thursday, April 23, 2015 at 10:43:02 PM UTC-7, Josiah Carlson wrote:

> You are probably going to need to take down your servers for a few minutes to upgrade Redis versions to 2.8.19 at least, and maybe even the most recent 3.0.0 (you can run in non-cluster mode). Maybe even getting a bigger machine, or look into whether you can shard your data and either run Redis cluster or Twemproxy + sharded Redis, or just sharded Redis controlled/managed by your clients. You have options and a way forward.

> Some questions:

> How much main memory does the system have?

For the master:

(10:32:00) cdodrill@AF002171:~

$ cat /proc/meminfo | head

MemTotal: 528351876 kB

MemFree: 260988440 kB

For the slave:

(10:34:30) cdodrill@AF002170:~

$ cat /proc/meminfo | head

MemTotal: 528351876 kB

MemFree: 485304736 kB

> Is it a VM, colo, self-hosted?

They are both servers in our colo cage.

> How fast is the network?

We have one gigabit ethernet between machines in our production cluster.

> What is your buffer set at?

For the master:

$ redis-cli -h AF002171 config get client-output-buffer-limit

1) "client-output-buffer-limit"

2) "normal 0 0 0 slave 268435456 67108864 60 pubsub 33554432 8388608 60"

For the slave:

$ redis-cli -h AF002170 config get client-output-buffer-limit

1) "client-output-buffer-limit"

2) "normal 0 0 0 slave 365072220160 365072220160 60 pubsub 33554432 8388608 60"

> How big is your most recent successful dump?

(15:35:32) cdodrill@AF002171:/var/lib/redis

$ ls -lh

total 50G

-rw-rw---- 1 redis redis 34G 2015-04-24 15:26 dump.rdb

-rw-rw---- 1 redis redis 195M 2015-03-27 15:10 temp-1349.rdb

-rw-r----- 1 redis redis 0 2014-07-02 22:22 temp-1404364952.24370.rdb

-rw-rw---- 1 redis redis 241M 2014-07-17 12:31 temp-14247.rdb

-rw-rw---- 1 redis redis 11G 2015-04-24 15:35 temp-22838.rdb

-rw-rw---- 1 redis redis 185M 2014-07-02 21:09 temp-31301.rdb

> Do you have free space?

(10:39:16) cdodrill@AF002171:~

$ df -h /

Filesystem Size Used Avail Use% Mounted on

/dev/sda5 856G 52G 804G 7% /

(10:34:32) cdodrill@AF002170:~

$ df -h /

Filesystem Size Used Avail Use% Mounted on

/dev/sda5 856G 37G 819G 5% /

A significant amount on both hosts

> What does the system log on the master say?

[25352] 22 Apr 11:55:09.059 # Client addr=10.2.8.122:51808 fd=26 name= age=578 idle=578 flags=S db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 obl=16384 oll=3176 omem=77967424 events=r cmd=psync scheduled to be closed ASAP for overcoming of output buffer limits.

[25352] 22 Apr 11:55:09.060 # Connection with slave 10.2.8.122:6379 lost.

[12364] 22 Apr 12:13:01.285 * DB saved on disk

[12364] 22 Apr 12:13:06.637 * RDB: 15411 MB of memory used by copy-on-write

[25352] 22 Apr 12:13:16.028 * Background saving terminated with success

[25352] 22 Apr 12:13:17.569 * Slave asks for synchronization

[25352] 22 Apr 12:13:17.569 * Full resync requested by slave.

[25352] 22 Apr 12:13:17.569 * Starting BGSAVE for SYNC

[25352] 22 Apr 12:13:22.667 * Background saving started by pid 17275

[25352] 22 Apr 12:20:54.093 # Client addr=10.2.8.122:52598 fd=38 name= age=457 idle=457 flags=S db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 obl=16368 oll=3067 omem=74985592 events=r cmd=psync scheduled to be closed ASAP for overcoming of output buffer limits.

[25352] 22 Apr 12:20:54.093 # Connection with slave 10.2.8.122:6379 lost.

On the slave:

[31824] 22 Apr 11:45:31.324 # I/O error reading bulk count from MASTER: Resource temporarily unavailable

[31824] 22 Apr 11:45:31.512 * Connecting to MASTER AF002171:6379

[31824] 22 Apr 11:45:31.513 * MASTER <-> SLAVE sync started

[31824] 22 Apr 11:45:31.513 * Non blocking connect for SYNC fired the event.

[31824] 22 Apr 11:45:31.513 * Master replied to PING, replication can continue...

[31824] 22 Apr 11:45:31.514 * Partial resynchronization not possible (no cached master)

[31824] 22 Apr 11:45:31.514 * Full resync from master: 9db318a6a93e7e1137800d67405ce2a0cbc610fa:1411857424636

[31824] 22 Apr 12:13:15.955 # I/O error reading bulk count from MASTER: Resource temporarily unavailable

[31824] 22 Apr 12:13:16.887 * Connecting to MASTER AF002171:6379

[31824] 22 Apr 12:13:16.888 * MASTER <-> SLAVE sync started

[31824] 22 Apr 12:13:16.889 * Non blocking connect for SYNC fired the event.

[31824] 22 Apr 12:13:17.567 * Master replied to PING, replication can continue...

[31824] 22 Apr 12:13:17.568 * Partial resynchronization not possible (no cached master)

[31824] 22 Apr 12:13:17.569 * Full resync from master: 9db318a6a93e7e1137800d67405ce2a0cbc610fa:1412015033050

> Can you show the INFO output on the master, a working, and non-working slave?

On the master:

(10:56:09) root@AF002171:~

# redis-cli

127.0.0.1:6379> info

# Server

redis_version:2.8.7

redis_git_sha1:00000000

redis_git_dirty:0

redis_build_id:aeb6a1447e4d3a87

redis_mode:standalone

os:Linux 3.2.0-26-generic x86_64

arch_bits:64

multiplexing_api:epoll

gcc_version:4.4.3

process_id:25352

run_id:9db318a6a93e7e1137800d67405ce2a0cbc610fa

tcp_port:6379

uptime_in_seconds:2393850

uptime_in_days:27

hz:10

lru_clock:383481

config_file:/etc/redis/redis.conf

# Clients

connected_clients:37

client_longest_output_list:1

client_biggest_input_buf:0

blocked_clients:0

# Memory

used_memory:205034900680

used_memory_human:190.95G

used_memory_rss:220557266944

used_memory_peak:214827623272

used_memory_peak_human:200.07G

used_memory_lua:33792

mem_fragmentation_ratio:1.08

mem_allocator:jemalloc-3.5.1

# Persistence

loading:0

rdb_changes_since_last_save:4835284

rdb_bgsave_in_progress:1

rdb_last_save_time:1429896593

rdb_last_bgsave_status:ok

rdb_last_bgsave_time_sec:1606

rdb_current_bgsave_time_sec:1579

aof_enabled:0

aof_rewrite_in_progress:0

aof_rewrite_scheduled:0

aof_last_rewrite_time_sec:-1

aof_current_rewrite_time_sec:-1

aof_last_bgrewrite_status:ok

aof_last_write_status:ok

# Stats

total_connections_received:80929824

total_commands_processed:6354434567

instantaneous_ops_per_sec:3926

rejected_connections:0

sync_full:529

sync_partial_ok:0

sync_partial_err:0

expired_keys:46522303

evicted_keys:0

keyspace_hits:1864201300

keyspace_misses:115381101

pubsub_channels:0

pubsub_patterns:0

latest_fork_usec:5400114

# Replication

role:master

connected_slaves:0

master_repl_offset:1426657393650

repl_backlog_active:1

repl_backlog_size:1048576

repl_backlog_first_byte_offset:1426656345075

repl_backlog_histlen:1048576

# CPU

used_cpu_sys:103776.85

used_cpu_user:149847.14

used_cpu_sys_children:96818.01

used_cpu_user_children:2168143.75

# Keyspace

db0:keys=120137760,expires=430857,avg_ttl=1944976

db12:keys=119245,expires=119143,avg_ttl=342950357

We do not have a working slave. This is info from the slave that keeps failing to start replicating:

(10:57:19) cdodrill@AF002170:~

$ redis-cli

127.0.0.1:6379> info

# Server

redis_version:2.8.19

redis_git_sha1:00000000

redis_git_dirty:0

redis_build_id:20fd63bba41f903c

redis_mode:standalone

os:Linux 3.2.0-26-generic x86_64

arch_bits:64

multiplexing_api:epoll

gcc_version:4.4.3

process_id:9147

run_id:9c57482cd8bc884edbda5827ee49f6068677f8f5

tcp_port:6379

uptime_in_seconds:76066

uptime_in_days:0

hz:10

lru_clock:3834882

config_file:/etc/redis/redis.conf

# Clients

connected_clients:1

client_longest_output_list:0

client_biggest_input_buf:0

blocked_clients:0

# Memory

used_memory:1850256

used_memory_human:1.76M

used_memory_rss:3067904

used_memory_peak:1908000

used_memory_peak_human:1.82M

used_memory_lua:35840

mem_fragmentation_ratio:1.66

mem_allocator:jemalloc-3.6.0

# Persistence

loading:0

rdb_changes_since_last_save:0

rdb_bgsave_in_progress:0

rdb_last_save_time:1429822176

rdb_last_bgsave_status:ok

rdb_last_bgsave_time_sec:-1

rdb_current_bgsave_time_sec:-1

aof_enabled:0

aof_rewrite_in_progress:0

aof_rewrite_scheduled:0

aof_last_rewrite_time_sec:-1

aof_current_rewrite_time_sec:-1

aof_last_bgrewrite_status:ok

aof_last_write_status:ok

# Stats

total_connections_received:11695

total_commands_processed:81377

instantaneous_ops_per_sec:0

total_net_input_bytes:2806868

total_net_output_bytes:21339157

instantaneous_input_kbps:0.00

instantaneous_output_kbps:0.00

rejected_connections:0

sync_full:0

sync_partial_ok:0

sync_partial_err:0

expired_keys:0

evicted_keys:0

keyspace_hits:0

keyspace_misses:0

pubsub_channels:0

pubsub_patterns:0

latest_fork_usec:0

# Replication

role:slave

master_host:AF002171

master_port:6379

master_link_status:down

master_last_io_seconds_ago:-1

master_sync_in_progress:1

slave_repl_offset:1

master_sync_left_bytes:-1

master_sync_last_io_seconds_ago:3

master_link_down_since_seconds:1429898242

slave_priority:100

slave_read_only:1

connected_slaves:0

master_repl_offset:0

repl_backlog_active:0

repl_backlog_size:1048576

repl_backlog_first_byte_offset:0

repl_backlog_histlen:0

# CPU

used_cpu_sys:51.41

used_cpu_user:16.81

used_cpu_sys_children:0.00

used_cpu_user_children:0.00

> What is the network bandwidth?

During the replication attempts, this was the network bandwidth use: http://i.imgur.com/U5Kv5Pp.png

> How much data is Redis handling/second?

http://i.imgur.com/OCmlhv9.png

> What is the disk IO?

master: http://i.imgur.com/b1PeJ7B.png

slave: http://i.imgur.com/11oubaQ.png

> Is Redis swapping (can find this out from INFO output)?

It does not appear to be swapping on the main server

> Can you provide the first few lines of 'top' all the way to and including the line starting with 'Swap:'?

For the slave (AF002170):

top - 15:14:31 up 296 days, 22:30, 2 users, load average: 0.07, 0.08, 0.10

Tasks: 168 total, 1 running, 167 sleeping, 0 stopped, 0 zombie

Cpu(s): 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st

Mem: 528351876k total, 43048076k used, 485303800k free, 996k buffers

Swap: 39085052k total, 0k used, 39085052k free, 36359084k cached

For the master (AF002171):

top - 15:15:23 up 281 days, 1:24, 2 users, load average: 1.22, 1.28, 1.31

Tasks: 171 total, 3 running, 168 sleeping, 0 stopped, 0 zombie

Cpu(s): 5.9%us, 1.1%sy, 0.0%ni, 92.8%id, 0.1%wa, 0.0%hi, 0.0%si, 0.0%st

Mem: 528351876k total, 288765976k used, 239585900k free, 1316k buffers

Swap: 39085052k total, 0k used, 39085052k free, 55631636k cached

> Are you using AOF? RDB? Spinning disks? SSDs? RAID?

We are using RDB persistence. The machines each have a raid of 3 solid state drives.

> Do you have any utilization/CPU graphs of around the time that this happened? What does it look like?

Yep: http://i.imgur.com/VhbDBdp.png left is the broken slave and the right is the master.

> Any other weird stuff going on?

There shouldn’t be anything weird going on.

Josiah Carlson

unread,

Apr 25, 2015, 12:04:19 AM4/25/15

to redi...@googlegroups.com

Looking at your master log:

[25352] 22 Apr 12:20:54.093 # Client addr=10.2.8.122:52598 fd=38 name= age=457 idle=457 flags=S db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 obl=16368 oll=3067 omem=74985592 events=r cmd=psync scheduled to be closed ASAP for overcoming of output buffer limits.

... the slave is disconnected after 7 1/2 minutes and has a backlog of almost 75 megs, which is about 165k/second in writes. This trips up the 64 meg 60 second soft limit. According to your INFO output:

rdb_last_bgsave_time_sec:1606

rdb_current_bgsave_time_sec:1579

...It takes ~27 minutes to create a snapshot after a slave connects, which induces a 267 meg backlog total. This is the replication backlog just to create the snapshot. You add the 36 gig of the snapshot, and you are looking at 36 gigs in change of backlog in total. Now, a 1 gigabit connection can move 100 megs/second, so the snapshot + backlog gets passed over in ~6 minutes. Then the slave stops what it is doing, and loads the snapshot, which will take another 13-30 minutes, then catches up to the master 8-30 minutes later.

Your backlog config...

For the master:

$ redis-cli -h AF002171 config get client-output-buffer-limit

1) "client-output-buffer-limit"

2) "normal 0 0 0 slave 268435456 67108864 60 pubsub 33554432 8388608 60"

That's a 64 meg soft limit over 60 seconds, or a 256 meg hard limit. As mentioned earlier, that is hit 7 1/2 minutes in. On the master, set the slave limit to something like 64 gigs, just to have some headroom. You can run the following command directly:

CONFIG SET client-output-buffer-limit slave 96gb 64gb 60

And to save yourself some headache:

CONFIG SET repl-backlog-size 1gb

That will let your slave be disconnected due to network issues for up to about 90 minutes, and not require a full resync. Previously, you only got 6-7 seconds before a slave would require a full resync.

And also update the config file.

Do those things, and you should be set and have a working slave again.

Regards,

- Josiah

Sabai

unread,

Jul 10, 2018, 1:35:35 PM7/10/18

to Redis DB

This helps to solve my issue. I have 100Gb of data i my master Redis and slave keeps trying to do full sync, after updating the below in master slave can sync with master sucessfully.

login to master through redis-cli -h <masternode> -p <port> . **Below command its temp until host reboot, it is not permanent.**

"config get client-output-buffer-limit" mostly it will be ( "normal 0 0 0 slave 268435456 67108864 60 pubsub 33554432 8388608 60" ) now

# CONFIG SET client-output-buffer-limit "slave 10737418240 10737418240 60" <--- this is to update from 256 mb to 10gb as my physical memory is 512 gb

"config get repl-backlog-size" < mostly it will be 1mb >

# CONFIG SET repl-backlog-size 1gb <-- increases to 1gb

Юрий Соколов

unread,

Jul 10, 2018, 2:39:40 PM7/10/18

to Redis DB

Had the same issue recently with much smaller size: ~7GB of data. And also fixed by increasing client-output-buffer-limit and repl-backlog-size.

Other in-memory database with disk persistence through snapshot+log, sovles the issue by feeding replica from disk: both snapshot and log is feeded into replica from disk until replica is close enough to current master's state.

Reply all

Reply to author

Forward