Extremely low memory fragmentation ratio


Andrew Eisenberg

Nov 6, 2017, 1:35:36 AM
to Redis DB
Hi all,

We are seeing some odd memory issues with our Redis 4.0.2 instances. The master instance has a ratio of 0.12, whereas the slaves have reasonable ratios that hover just above 1. When we restart the master instance, the memory fragmentation ratio goes back to 1 until we hit our peak load times, when the ratio drops back below 0.2. The OS (Ubuntu) reports that the Redis instance is using 13GB of virtual memory but only 1.6GB of RAM. Once this happens, most of the data gets swapped out to disk and performance grinds almost to a halt.

Our keys tend to last for a day or two before being purged. Most values are hashes and zsets with roughly 100 entries each, and each entry is under 1KB.

We are not sure what is causing this. We have tried tweaking the OS overcommit ratio, and we also tried the new MEMORY PURGE command, but neither seemed to help. We are looking for other things to explore and suggestions to try. Any advice would be appreciated. Thanks.

Here is a dump of our memory stats:

127.0.0.1:8000> info memory
# Memory
used_memory:12955019496
used_memory_human:12.07G
used_memory_rss:1676115968
used_memory_rss_human:1.56G
used_memory_peak:12955019496
used_memory_peak_human:12.07G
used_memory_peak_perc:100.00%
used_memory_overhead:19789422
used_memory_startup:765600
used_memory_dataset:12935230074
used_memory_dataset_perc:99.85%
total_system_memory:33611145216
total_system_memory_human:31.30G
used_memory_lua:945152
used_memory_lua_human:923.00K
maxmemory:0
maxmemory_human:0B
maxmemory_policy:noeviction
mem_fragmentation_ratio:0.13
mem_allocator:jemalloc-4.0.3
active_defrag_running:0
lazyfree_pending_objects:0

127.0.0.1:8000> memory stats
 1) "peak.allocated"
 2) (integer) 12954706848
 3) "total.allocated"
 4) (integer) 12954623968
 5) "startup.allocated"
 6) (integer) 765600
 7) "replication.backlog"
 8) (integer) 1048576
 9) "clients.slaves"
10) (integer) 33716
11) "clients.normal"
12) (integer) 184494
13) "aof.buffer"
14) (integer) 0
15) "db.0"
16) 1) "overhead.hashtable.main"
    2) (integer) 17691184
    3) "overhead.hashtable.expires"
    4) (integer) 32440
17) "overhead.total"
18) (integer) 19756010
19) "keys.count"
20) (integer) 337422
21) "keys.bytes-per-key"
22) (integer) 38390
23) "dataset.bytes"
24) (integer) 12934867958
25) "dataset.percentage"
26) "99.853401184082031"
27) "peak.percentage"
28) "99.999359130859375"
29) "fragmentation"
30) "0.12932859361171722"
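
(If I understand the docs correctly, mem_fragmentation_ratio is just used_memory_rss divided by used_memory, which matches the numbers above: 1676115968 / 12955019496 ≈ 0.13. In other words, the resident set is only about an eighth of what Redis thinks it has allocated.)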

hva...@gmail.com

Nov 6, 2017, 1:29:38 PM
to Redis DB
It would be helpful to see all of the output from the INFO command.  It shows memory-affecting behavior like snapshots, replication buffers allocated for slaves, and other things.

What else is running on the server that causes swapping when Redis is using only a third of the available memory at peak?  (Put differently: how much memory do the other processes leave available for Redis to use?)

Andrew Eisenberg

Nov 6, 2017, 1:57:37 PM
to Redis DB
Thanks for your reply.

Yes, something I neglected to mention in my previous post: the machine has 32GB of RAM total, and during peak times there are memory crunches where available RAM is far less than the required memory, so swapping occurs (we are expecting more memory soon). But a couple of things to note here:

1. We were running a Redis slave on the same machine and its ratio stayed close to 1 (i.e. we had no problems with it).
2. The master's memory fragmentation ratio remained extremely low even after memory usage came down to reasonable levels.

It's understandable that the memory fragmentation ratio will go down when there is a RAM shortage, but can Redis recover from this once the OS has more resources to provide?

Here is the most recent output from the info command. We've had to restart our instance since my previous post, so the current values may not reflect the low memory fragmentation issue we had this weekend.

127.0.0.1:8000> info
# Server
redis_version:4.0.2
redis_git_sha1:00000000
redis_git_dirty:0
redis_build_id:4739a91f5597d6c6
redis_mode:standalone
os:Linux 4.4.0-98-generic x86_64
arch_bits:64
multiplexing_api:epoll
atomicvar_api:atomic-builtin
gcc_version:5.4.0
process_id:196114
run_id:6ef34050fc36dfee2bc8962112404d4d32fe6a69
tcp_port:8000
uptime_in_seconds:19325
uptime_in_days:0
hz:10
lru_clock:44716
executable:/usr/bin/redis-server
config_file:/etc/redis/8000.conf

# Clients
connected_clients:11
client_longest_output_list:0
client_biggest_input_buf:0
blocked_clients:0

# Memory
used_memory:149119504
used_memory_human:142.21M
used_memory_rss:240590848
used_memory_rss_human:229.45M
used_memory_peak:4164885016
used_memory_peak_human:3.88G
used_memory_peak_perc:3.58%
used_memory_overhead:2492448
used_memory_startup:765600
used_memory_dataset:146627056
used_memory_dataset_perc:98.84%
total_system_memory:33611145216
total_system_memory_human:31.30G
used_memory_lua:660480
used_memory_lua_human:645.00K

maxmemory:0
maxmemory_human:0B
maxmemory_policy:noeviction
mem_fragmentation_ratio:1.61

mem_allocator:jemalloc-4.0.3
active_defrag_running:0
lazyfree_pending_objects:0

# Persistence
loading:0
rdb_changes_since_last_save:1426856
rdb_bgsave_in_progress:0
rdb_last_save_time:1509974831
rdb_last_bgsave_status:ok
rdb_last_bgsave_time_sec:-1
rdb_current_bgsave_time_sec:-1
rdb_last_cow_size:5386240
aof_enabled:0
aof_rewrite_in_progress:0
aof_rewrite_scheduled:0
aof_last_rewrite_time_sec:-1
aof_current_rewrite_time_sec:-1
aof_last_bgrewrite_status:ok
aof_last_write_status:ok
aof_last_cow_size:0

# Stats
total_connections_received:448
total_commands_processed:29467766
instantaneous_ops_per_sec:778
total_net_input_bytes:5881507724
total_net_output_bytes:9862541824
instantaneous_input_kbps:12.03
instantaneous_output_kbps:50.54
rejected_connections:0
sync_full:4
sync_partial_ok:0
sync_partial_err:4
expired_keys:67
evicted_keys:0
keyspace_hits:16397274
keyspace_misses:1080002
pubsub_channels:2
pubsub_patterns:3
latest_fork_usec:102407
migrate_cached_sockets:0
slave_expires_tracked_keys:0
active_defrag_hits:0
active_defrag_misses:0
active_defrag_key_hits:0
active_defrag_key_misses:0

# Replication
role:master
connected_slaves:1
slave0:ip=192.168.200.31,port=8001,state=online,offset=173513116,lag=0
master_replid:fa4ae383a60b647c706f6f437539daacedbf1cbf
master_replid2:0000000000000000000000000000000000000000
master_repl_offset:173516238
second_repl_offset:-1
repl_backlog_active:1
repl_backlog_size:1048576
repl_backlog_first_byte_offset:172467663
repl_backlog_histlen:1048576

# CPU
used_cpu_sys:221.35
used_cpu_user:466.06
used_cpu_sys_children:16.73
used_cpu_user_children:61.25

# Cluster
cluster_enabled:0

# Keyspace
db0:keys=6853,expires=133,avg_ttl=160005944

Andrew Eisenberg

Nov 7, 2017, 9:31:29 AM
to Redis DB
So, here's an update.  I was able to reproduce this problem on a blank redis 4.0.2 instance.

Here's what I did:

  1. Create a new master instance of Redis, with the save option turned off
  2. Create a slave instance of Redis and set it to use diskless replication with the master (a rough sketch of the config is below, after this list). Default save options are used.
  3. Targeting the first Redis instance, I ran two pretty horrific Ruby scripts:
    1. create_junk.rb
      require "redis"
      # long client timeout: DEBUG POPULATE on 3M keys takes a while to return
      r = Redis.new(:timeout => 1000)

      BLOB_SIZE = 10000
      ITERATIONS = 5
      KEYS_AMOUNT = 3000000

      (1..ITERATIONS).each do |iter|
        puts "Creating iteration #{iter}"
        # DEBUG POPULATE <count> <prefix> <size> creates <count> string keys
        # named <prefix>:<n>, each holding a <size>-byte value
        r.debug('populate', KEYS_AMOUNT, 'dummy', BLOB_SIZE)

        puts "Deleting iteration #{iter}"
        (1..KEYS_AMOUNT).each do |i|
          r.del("dummy:#{i}")
        end
      end

      puts "Last iteration"
      r.debug('populate', KEYS_AMOUNT, 'dummy', BLOB_SIZE)
    2. more_junk.rb
      require "redis"
      r = Redis.new(:timeout => 1000)

      blob = r.get('dummy:1') # reuse one of the 10000-byte values

      # KEYS is blocking, but that's fine on a throwaway test instance;
      # shuffle the key list to help ensure there is no memory locality
      r.keys('*').shuffle.each do |k|
        r.del(k)
        r.set('again:' + k, blob)
      end
  4. I ran each script once overnight and came back in the morning
  5. Now the info command on the master shows the following (see below)
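
For reference, the non-default bits of the two configs look roughly like this (just a sketch; I'm omitting everything we left at the defaults):

    # master (port 6379): RDB saves disabled, diskless replication enabled
    save ""
    repl-diskless-sync yes

    # slave (port 8104): default save options, replicating from the master
    port 8104
    slaveof 127.0.0.1 6379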

Now that I have the redis instance in this state, I would like to be able to clear out the memory, and fix the fragmentation ratio. How can I do this?


127.0.0.1:6379> info


# Server
redis_version:4.0.2
redis_git_sha1:00000000
redis_git_dirty:0
redis_build_id:4739a91f5597d6c6
redis_mode:standalone
os:Linux 4.4.0-98-generic x86_64
arch_bits:64
multiplexing_api:epoll
atomicvar_api:atomic-builtin
gcc_version:5.4.0

process_id:28619
run_id:8ebcf313d1e1788ada1b3436f435007df743eb58
tcp_port:6379
uptime_in_seconds:64883
uptime_in_days:0
hz:10
lru_clock:115521
executable:/home/andrew.eisenberg/redis-server
config_file:

# Clients
connected_clients:1


client_longest_output_list:0
client_biggest_input_buf:0
blocked_clients:0

# Memory

used_memory:30941231512
used_memory_human:28.82G
used_memory_rss:4258734080
used_memory_rss_human:3.97G
used_memory_peak:31065311136
used_memory_peak_human:28.93G
used_memory_peak_perc:99.60%
used_memory_overhead:155435264
used_memory_startup:765488
used_memory_dataset:30785796248
used_memory_dataset_perc:99.50%
total_system_memory:33611145216
total_system_memory_human:31.30G
used_memory_lua:37888
used_memory_lua_human:37.00K


maxmemory:0
maxmemory_human:0B
maxmemory_policy:noeviction

mem_fragmentation_ratio:0.14


mem_allocator:jemalloc-4.0.3
active_defrag_running:0
lazyfree_pending_objects:0

# Persistence
loading:0

rdb_changes_since_last_save:0
rdb_bgsave_in_progress:0
rdb_last_save_time:1510044314
rdb_last_bgsave_status:ok
rdb_last_bgsave_time_sec:17956
rdb_current_bgsave_time_sec:-1
rdb_last_cow_size:2301952


aof_enabled:0
aof_rewrite_in_progress:0
aof_rewrite_scheduled:0
aof_last_rewrite_time_sec:-1
aof_current_rewrite_time_sec:-1
aof_last_bgrewrite_status:ok
aof_last_write_status:ok
aof_last_cow_size:0

# Stats

total_connections_received:213
total_commands_processed:19476731
instantaneous_ops_per_sec:1
total_net_input_bytes:24390903484
total_net_output_bytes:1003936041
instantaneous_input_kbps:0.05
instantaneous_output_kbps:0.01
rejected_connections:5
sync_full:3
sync_partial_ok:0
sync_partial_err:3
expired_keys:0
evicted_keys:0
keyspace_hits:2
keyspace_misses:0
pubsub_channels:0
pubsub_patterns:0
latest_fork_usec:374842


migrate_cached_sockets:0
slave_expires_tracked_keys:0
active_defrag_hits:0
active_defrag_misses:0
active_defrag_key_hits:0
active_defrag_key_misses:0

# Replication
role:master
connected_slaves:1

slave0:ip=127.0.0.1,port=8104,state=online,offset=15118951610,lag=0
master_replid:2bf77a532bdf4e8e87bd8822ca3d562ac7c5355c
master_replid2:0000000000000000000000000000000000000000
master_repl_offset:15118951610


second_repl_offset:-1
repl_backlog_active:1
repl_backlog_size:1048576

repl_backlog_first_byte_offset:15117903035
repl_backlog_histlen:1048576

# CPU
used_cpu_sys:463.07
used_cpu_user:174.64
used_cpu_sys_children:303.20
used_cpu_user_children:198.70



# Cluster
cluster_enabled:0

# Keyspace

db0:keys=3000007,expires=0,avg_ttl=0
127.0.0.1:6379>

Salvatore Sanfilippo

Nov 7, 2017, 9:40:02 AM
to redi...@googlegroups.com
Please could you send the output of "cat /proc/<PID>/smaps"? Thanks.

Also "MEMORY DOCTOR" output if possible.



--
Salvatore 'antirez' Sanfilippo
open source developer - Redis Labs https://redislabs.com

"If a system is to have conceptual integrity, someone must control the
concepts."
— Fred Brooks, "The Mythical Man-Month", 1975.

Andrew Eisenberg

Nov 7, 2017, 9:49:49 AM
to redi...@googlegroups.com
Sure thing. Thanks for your response.

127.0.0.1:6379> memory doctor
Hi Sam, I can't find any memory issue in your instance. I can only account for what occurs on this base.

Attaching smaps.txt as a separate file.

Please note that I have not actually deleted any keys since I got into this state. Nor have I done anything to explicitly try to release memory.



--
Andrew Eisenberg, PhD
Ganchrow Scientific
smaps.txt

Salvatore Sanfilippo

Nov 7, 2017, 10:20:14 AM
to redi...@googlegroups.com
Hello Andrew, thanks, so that's the problem:

7fb178800000-7fb8e9800000 rw-p 00000000 00:00 0
Size: 31211520 kB
Rss: 4156240 kB
Pss: 4156240 kB
Shared_Clean: 0 kB
Shared_Dirty: 0 kB
Private_Clean: 0 kB
Private_Dirty: 4156240 kB
Referenced: 886028 kB
Anonymous: 4156240 kB
AnonHugePages: 243712 kB
Shared_Hugetlb: 0 kB
Private_Hugetlb: 0 kB
Swap: 27010112 kB
SwapPss: 27010112 kB
KernelPageSize: 4 kB
MMUPageSize: 4 kB
Locked: 0 kB
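
(The numbers that matter here: the mapping is ~30 GB, only ~4 GB of it is resident (Rss), and ~27 GB of it is sitting in Swap.)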

Basically, once the instance was under memory pressure, the Linux
kernel had to find free memory pages somewhere, and it swapped the
Redis pages out to disk.
Normally this rarely happens with Redis: most boxes are configured
with little or no swap nowadays. In this case, however, the following
combination of things allowed pages to be swapped to disk:

1) Most data in the Redis instance is not accessed by any client at all.
2) There was enough swap on disk to store the Redis pages.

So Redis is telling the truth that the fragmentation ratio is < 1:
the memory actually resident in RAM is just ~10% of the data Redis is
holding.
As you try to access the memory pages, for instance by calling the
"SAVE" command to make a copy of the dataset on disk, all the memory
pages will be restored to RAM, if there is enough space.
Otherwise, if the kernel is out of free memory, SAVE will take a
while: as some memory pages are loaded back in, others are swapped
out to disk again.

It's a shame that MEMORY DOCTOR is not able to advise the user about
this condition. While rare, it's worth reporting.

If there is anything I can clarify, please ping me.
Cheers,
Salvatore

Andrew Eisenberg

Nov 7, 2017, 11:45:31 AM
to Redis DB
Thanks. That makes a lot of sense.

So, once we get into this state, is there any way out without restarting Redis? I was expecting that if memory pressure on the entire system lessens (e.g. if we turn off the slave), then the master would move all of its data back to RAM and stop using swap.

But what I am seeing is that when the slave shuts down, swap usage, RAM usage, and the fragmentation ratio on the master all stay roughly the same, and performance on the master is still poor.

hva...@gmail.com

Nov 7, 2017, 12:21:06 PM
to Redis DB
The Redis process doesn't know what the Linux kernel did with its memory pages (swapping them, i.e. copying them to disk in order to reuse those memory pages for other things). The activity of swapping memory to disk is designed to be transparent (invisible) to the processes. So there isn't a way for the Redis server process to 'move its data back to RAM'; it never knew the data was moved.

Salvatore suggested one way you can force the data back into ram:


As you try to access the memory pages, for instance by calling the "SAVE" command to make a copy of the dataset on disk, all the memory pages will be restored to memory, if there is enough space.

Another approach is to tell the kernel it has no swap (as root: "swapoff -a && swapon -a").  "Removing" swap forces the pages back into ram, if they're still claimed by a running process or open buffer.

Swapoff/swapon may be preferable to issuing a Redis "SAVE" command because it won't make Redis add disk write activity (to an RDB file) on top of the disk read activity (copying data to ram).  Be aware that neither approach will make the Redis server process more responsive until all the data has been copied back into ram.
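
If you want to watch the progress while that happens, one rough way (just a sketch; run it as a user that can read the file, and pass your redis-server PID) is to total the "Swap:" fields in /proc/<pid>/smaps, since that's where the swapped-out pages show up:

      # swapped.rb : total swapped-out memory for a PID, from /proc/<pid>/smaps
      pid = ARGV[0] || abort("usage: ruby swapped.rb <redis-pid>")
      swapped_kb = File.readlines("/proc/#{pid}/smaps")
                       .grep(/^Swap:/)   # one "Swap: N kB" line per mapping
                       .inject(0) { |sum, line| sum + line.split[1].to_i }
      puts "#{swapped_kb} kB swapped out"

When that number approaches zero, the pages should all be back in RAM.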

Salvatore Sanfilippo

Nov 7, 2017, 12:35:29 PM
to redi...@googlegroups.com
I agree that disabling swapping is better. Also note that, AFAIK,
disabling swap will simply fail if the kernel is not able to load all
the swapped pages back into memory, so it's the safer option.
SAVE in a memory-constrained environment, on the other hand, will take ages.

Andrew Eisenberg

Nov 7, 2017, 2:59:34 PM
to Redis DB
Thanks, both of you, for the analysis and suggestions.

I am currently trying the swapoff/on approach and it is taking a very long time (over an hour so far), but swap is slowly going down. Maybe it will work for us. Maybe it won't. And based on your suggestions, we are considering setting swappiness for the entire system or maybe just the redis instances to 0 or 1.
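
(For the system-wide option that would presumably just be "sysctl -w vm.swappiness=1"; doing it for only the redis processes would, as far as I can tell, mean running them in their own memory cgroup and setting memory.swappiness there. We haven't decided yet.)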

I'll let you know how this all turns out.

Salvatore Sanfilippo

Nov 7, 2017, 4:00:03 PM
to redi...@googlegroups.com
Thanks for the update Andrew,

Perhaps Redis itself should change its swappiness, if that's possible
to do without superuser capabilities (not sure). In general, swapping
Redis memory pages never makes sense; however, locking the memory
pages or doing other things that presumably require root would be
overkill.
Normally this problem never happens, because even accessing a small
subset of keys touches all the memory pages.

Cheers,
Salvatore
