Hi!
I use Redis in data processing: we have a file of about 90 GB that we
need to process every week. Before starting, I flush the whole DB so
it starts clean. I have written a script that processes the file and
increments counters in several sorted sets. After the file processing
is done (which takes a few hours), I get the data I want from Redis,
mostly by reading ranges from the sorted sets.
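For context, the processing loop looks roughly like this (a minimal sketch, not my actual script: the tab-separated file format, the `parse_line` helper, the `counts:` key prefix, and the batch size are all made-up placeholders; the real script batches `ZINCRBY` calls through a pipeline the same way):

```python
def parse_line(line):
    """Split one record into (sorted-set key, member, increment).

    The tab-separated format and the "counts:" key prefix are
    hypothetical stand-ins for whatever the real file contains.
    """
    category, member, count = line.rstrip("\n").split("\t")
    return "counts:" + category, member, int(count)


def process_file(path, host="localhost", batch=10_000):
    import redis  # third-party redis-py package

    r = redis.Redis(host=host)
    r.flushdb()  # start clean, as described above
    pipe = r.pipeline(transaction=False)
    with open(path) as f:
        for i, line in enumerate(f, 1):
            key, member, delta = parse_line(line)
            pipe.zincrby(key, delta, member)
            if i % batch == 0:  # send the increments in batches
                pipe.execute()
    pipe.execute()  # flush the final partial batch
```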
The total amount of memory used by Redis is around 14 GB. I have this
running on a machine that has 24 GB of memory and is currently used
only for this purpose.
However, while the keys are being added, the machine becomes really
slow and runs out of memory (and swap). At that point there are two
Redis processes, each using 14 GB, which doesn't fit in the 24 GB.
I assume this is because Redis forks itself to save in the background.
I currently have this as my save config:
save 900 1
save 300 10
save 60 10000
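For reference, my understanding is that each `save <seconds> <changes>` line triggers a background save (BGSAVE, which forks the server process) once at least that many keys have changed within that window, so annotated the config reads:

```
save 900 1      # BGSAVE if >= 1 key changed in 900 s
save 300 10     # BGSAVE if >= 10 keys changed in 300 s
save 60 10000   # BGSAVE if >= 10000 keys changed in 60 s
```

During my bulk load the last rule fires almost continuously, since the script changes far more than 10000 keys per minute.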
Since these saves happen in the background in a forked process, the
memory footprint doubles, which causes my issue. Does the SAVE command
have the same memory usage? From the docs it seems it runs in the same
process:
http://redis.io/commands/save
I wouldn't mind having to call SAVE manually and waiting for it to
finish, if that means less memory is needed (since there is no
forking). I could just call it from my processing script once every
30 minutes or so and wait for it to complete.
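If it helps, this is the combination I have in mind (a sketch, assuming SAVE really does run synchronously in the main process as the docs suggest; `save ""` removes all automatic snapshot rules):

```
# redis.conf: disable all automatic BGSAVE rules
save ""

# or change it at runtime without a restart:
redis-cli CONFIG SET save ""

# then, from the processing script every ~30 minutes,
# a blocking foreground save (no fork):
redis-cli SAVE
```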
Would that work? Any other pointers for keeping the process within my
current machine's limits?