Issue 525 in redis: Slow clients using pubsub cause memory to grow unbounded


re...@googlecode.com

Apr 17, 2011, 2:12:54 PM4/17/11
to redi...@googlegroups.com
Status: New
Owner: ----
Labels: Type-Defect Priority-Medium

New issue 525 by jwillp: Slow clients using pubsub cause memory to grow
unbounded
http://code.google.com/p/redis/issues/detail?id=525

What version of Redis are you using, and on what kind of operating system?

2.0.4 on Linux (Fedora 14)

What is the problem you are experiencing?

The server process memory usage grows without bounds. (May cause
unintentional denial of service.)

What steps will reproduce the problem?

A client that is SUBSCRIBEd to a high-volume channel will cause the server
to buffer all pending PUBLISHed data without limit when the client is too
slow to keep up, or when the client has hung (blocked). A single faulty
client can therefore kill a redis server. (This can also happen when a
blackholing iptables-type filter is raised between server and client, or
when an intermediate stateful firewall decides to be buggy.)
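
A minimal reproduction sketch in Python (assuming the redis-py client of
this era, where subscribe() is a method on the client object; the channel
name and payload size are arbitrary):

#!/usr/bin/python
# One connection subscribes and then never reads; a second connection
# floods the channel. The server queues every message for the stalled
# subscriber, so its memory grows without bound.
import time
import redis

slow = redis.Redis(host='localhost', port=6379, db=0)
slow.subscribe('flood')   # subscribe, then never read a single message

fast = redis.Redis(host='localhost', port=6379, db=0)
payload = 'x' * 4096
while True:
    for _ in range(1000):
        fast.publish('flood', payload)
    print(fast.info()['used_memory_human'])   # watch this climb
    time.sleep(1)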

Do you have an INFO output? Please paste it here.

redis> info
redis_version:2.0.4
redis_git_sha1:00000000
redis_git_dirty:0
arch_bits:64
multiplexing_api:epoll
process_id:31762
uptime_in_seconds:24418
uptime_in_days:0
connected_clients:4
connected_slaves:0
blocked_clients:0
used_memory:162028112
used_memory_human:154.52M
changes_since_last_save:74156
bgsave_in_progress:0
last_save_time:1303061753
bgrewriteaof_in_progress:0
total_connections_received:57
total_commands_processed:132388962
expired_keys:0
hash_max_zipmap_entries:64
hash_max_zipmap_value:512
pubsub_channels:0
pubsub_patterns:1
vm_enabled:0
role:master
db0:keys=292,expires=0


If it is a crash, can you please paste the stack trace that you can find in
the log file or on standard output? This is really useful for us!

Please provide any additional information below.


Could you add a configuration option to redis.conf for setting the maximum
buffer space to use for client subscriptions? This would be analogous to
the Spread toolkit's MaxSessionMessages (which defaults to 1000, though I
generally raise it to 5,000 or 10,000), or to ActiveMQ's
activemq.maximumPendingMessageLimit parameter for dealing with too-slow
consumers. The best strategy for the server is to drop the connection to a
slow client once it has backlogged the configured maximum buffer size. The
memory impact on the server is then limited to a fixed size per client
connection.

This should be a relatively small change to the pubsubPublishMessage
function in src/pubsub.c. Before calling addReply() and addReplyBulk(),
the client connection should be checked to see whether too much data is
buffered up. If so, drop the connection.

More generically, the fix could even go into the _addReplyObjectToList()
function in src/networking.c, applied only in the least-likely branch
where there is already a list of objects and we can't append to the last
item in the list. So perhaps in the unstable branch, line 127 would be a
good spot to check the list size (in bytes) before adding another object
to the tail of the list. That would have zero CPU cost for clients that
read fast enough that the static buffer never spills over into the linked
list, and it would be a generic place to implement the bounds checking.
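
To pin down the intended behavior, here is a toy model of the policy in
Python (all names here are made up; the real change would of course be a
few lines of C):

class SlowClientError(Exception):
    """Raised when a client's outbound queue exceeds the configured cap."""

class ClientReplyQueue:
    # Toy stand-in for the per-client reply list in networking.c.
    def __init__(self, max_pending=0):   # 0 = unlimited (feature off)
        self.max_pending = max_pending
        self.pending = []

    def enqueue(self, msg):
        # The proposed check: before appending another object to the
        # tail of the list, refuse if the bound has been reached.
        if self.max_pending and len(self.pending) >= self.max_pending:
            raise SlowClientError('slow client: drop the connection')
        self.pending.append(msg)

On overflow the server would log a warning and close the connection
rather than raise, but the check itself is the same.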

I'm not very familiar with the code for redis, though. So, you might have
a better way to solve this!

Thanks!

Will P

Apr 17, 2011, 3:24:28 PM4/17/11
to Redis DB
I ran into this when I started looking at how to implement a "multi-
master" kind of "cluster" of redis servers, for publish/subscribe
messaging only, with no single points of failure.

Because a redis server can only slave against one master, we can only
configure replication as a tree. It's really interesting that you can
publish to any server in the replication tree, and the message is
delivered perfectly "down" the tree to the leaves (replication slaves).
But this isn't fault-tolerant unless every client knows the up/down state
of the entire replication tree, so that it can pick a new root. (Yes,
Redis Cluster is still in the oven.)

So I thought, maybe I could stand up 3 or more independent (non-
slaving) redis servers. Each one would have a unique prefix for all
of its local publish channels. Any client PUBLISHing to that server
would always use that server's prefix on channel names, i.e.
"a:channame" for server A, and "b:channame" for server B. Then I
could run a client "repeater" process on each redis server and
subscribe to all the local channels on that server, and publish the
data out to the other two servers. On server A, it would PSUBSCRIBE
to "a:*" and for every message it received, it would PUBLISH the same
message to the corresponding b: and c: channels on the other two
servers. Likewise on server B, the "repeater" process PSUBSCRIBEs to
b:*, sending everything to server A and C. And finally, on server C
another repeater would run and copy everything seen from "c:*" to
servers A and B (changing the channel name prefixes respectively).

It isn't as elegant as what Redis Cluster promises, but for publish-
subscribe where there are no single points of failure for writing, I
think this just might work.

What surprised me was that my proof-of-concept "external pubsub
repeater" was only a handful of lines of python, including the shebang
line.

Here it is:

#!/usr/bin/python
import redis
r_src = redis.Redis(host='localhost', port=6379, db=0)
r_dst = redis.Redis(host='hostb', port=6379, db=0)
r_src.psubscribe('a:*')
for msg in r_src.listen():
    if msg.get('type') != 'pmessage':
        continue  # skip the psubscribe confirmation
    channel = msg['channel']
    r_dst.publish('b:' + channel[2:], msg['data'])  # swap 'a:' for 'b:'

This connects to redis on localhost and on "hostb", PSUBSCRIBEs to all
channels on localhost matching a:*, and PUBLISHes their data to the
corresponding b:* channel on "hostb". The code is shorter than my
description of it.

This is just toy code, to see if it would work. And it does! Except,
if you suspend it with control-Z, and any other redis client is
streaming messages to a channel matching a:*, then the redis server
balloons in memory usage without bound. My local test instance (redis
v2.0.4) is using over 19 GB of RAM now.

Until Redis Cluster is available, I'm going to try to get this to work
well enough that it can be used as a fault-tolerant publish-subscribe
message bus. If each redis server had a local key indicating what the
channel prefix should be, then it would be simple to set up a VIP to
point to all three redis servers for writing (publish). A writing
client would connect to the VIP and be connected randomly to any
living redis server, ask for the server prefix with "GET
pubsub_prefix", and then start sending its PUBLISH messages to
channels using that prefix.
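
The writer side of that flow might look like this (the VIP hostname and
the "pubsub_prefix" key are the hypothetical ones above):

import redis

# Connect through the VIP; we land on whichever server is alive.
r = redis.Redis(host='redis-vip', port=6379, db=0)
prefix = r.get('pubsub_prefix')   # e.g. 'a' if we landed on server A
if prefix is None:
    raise RuntimeError('server has no pubsub_prefix key set')
r.publish(prefix + ':channame', 'payload goes here')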

For clients that only SUBSCRIBE, using built-in replication and
subscribe-only slaves is one option, though it doesn't help when the
master dies. I guess a client could check the INFO output to see if
the replication status is okay. But that's not easy, because when you
are in SUBSCRIBE mode and the data stops arriving, you can't get an
INFO response from the same connection without issuing an
UNSUBSCRIBE. (At least in redis-py.)
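
One workaround is to keep a second, plain connection per subscriber just
for health checks, something like:

import redis

# INFO can't be issued on a connection that is in subscribe mode, so
# poll replication health over a separate connection.
health = redis.Redis(host='slavehost', port=6379, db=0)
info = health.info()
replication_ok = (info.get('role') == 'slave' and
                  info.get('master_link_status') == 'up')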

I can't think of any reason against letting the readers SUBSCRIBE on the
same nodes that the writers use, and letting everyone talk through a VIP
to reach a redis server, as long as both writers and readers check the
"pubsub_prefix" key after the initial connect to adjust their channel
names.

Obviously, if any repeater dies, then the game is over. I can think of a
few ways to handle this, even with the limitation that client subscribers
are essentially 'stuck' and cannot get or set keys or run any other
commands. Suppose there were a dedicated local "status" channel, named
something like a:STATUS, that the repeater pinged once every second (yes,
this sounds like Redis Cluster again) with "PUBLISH a:STATUS
server=a,current_time=<timestamp>,state=ok". Reader and writer clients of
that server could then detect that the repeater has died by including the
a:STATUS channel in their subscriptions. When the status updates stop
arriving for that server, the repeater is dead and the client can
reconnect to the VIP (or use a DNS rotor, whatever), missing only a few
messages instead of failing silently. (Dropping a handful of messages is
okay for my needs.)
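
The detection side might look like this (hypothetical names; this uses
the newer redis-py pubsub API, whose get_message(timeout=...) can notice
silence, which a blocking listen() loop cannot):

import time
import redis

HEARTBEAT_GRACE = 3.0   # seconds of silence on a:STATUS before giving up

r = redis.Redis(host='server-a', port=6379, db=0)
p = r.pubsub()
p.psubscribe('a:*')

last_heartbeat = time.time()
while True:
    msg = p.get_message(timeout=1.0)   # poll so silence is observable
    if msg and msg.get('type') == 'pmessage':
        if msg['channel'] == b'a:STATUS':
            last_heartbeat = time.time()   # repeater is alive
        # else: hand the ordinary message to the application
    if time.time() - last_heartbeat > HEARTBEAT_GRACE:
        break   # heartbeats stopped: assume the repeater died, reconnect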

I'm somewhat influenced by the Spread toolkit, in thinking about
solving this problem. But I don't think this repeater option is
really terribly elegant, and Redis Cluster is just around the
corner...

Has anyone tried this before? Other than this slow/hung-client issue
causing the server memory to grow, are there any other boogeymen or
risks?





Josiah Carlson

Apr 18, 2011, 6:13:29 PM4/18/11
to redi...@googlegroups.com
If you want a fault tolerant message bus, you may want to try zeromq.
It was designed to do exactly that, and that alone.

Limiting the size of the outgoing queue for these kinds of things
isn't a bad thing, but I wouldn't be surprised if this doesn't get
picked up very soon.

- Josiah


re...@googlecode.com

Sep 8, 2011, 3:10:20 PM9/8/11
to redi...@googlegroups.com

Comment #1 on issue 525 by dave.pet...@gmail.com: Slow clients using pubsub

I think this would be a very useful feature. I'd be willing to work on it,
if it's considered an acceptable addition.

re...@googlecode.com

Sep 8, 2011, 8:08:02 PM9/8/11
to redi...@googlegroups.com

Comment #2 on issue 525 by jwillp: Slow clients using pubsub cause memory

I'd definitely appreciate it!

re...@googlecode.com

Dec 21, 2011, 11:16:45 AM12/21/11
to redi...@googlegroups.com

Comment #3 on issue 525 by jwi...@gmail.com: Slow clients using pubsub

I've added a new feature that lets you specify a limit on the number of
pending objects in the client's "reply" linked list (pending outbound
messages). It adds a new server-level config parameter, "maxclientqueue",
with a default of 0 (disabled).
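
Assuming the patch follows the usual redis.conf syntax, enabling it would
look something like this (5000 is an arbitrary illustrative value):

maxclientqueue 5000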

When this setting is enabled and a client is too slow to read replies
fast enough to keep up with the server, the server drops the client's
connection once the maximum number of messages has built up on the server
side. It's not perfect, because ideally you would limit the client based
on the number of bytes queued up. But this setting is intended as a
safety measure (like a shear pin). Actually counting the enqueued bytes
and maintaining that value seems like overkill when we already have an
object count in the linked-list structure.

When a client exceeds this limit, the server also logs a warning message
to indicate that it has dropped a client due to this overflow. That
provides a positive signal if this setting is in use and set too low, or
if clients are frequently too slow to keep up. (There is also a
low-frequency cleanup pass, every ~30 seconds or so, that drops any
clients that have overflowed but haven't yet been dropped by another
write.)

My github branch for this patch is here:
https://github.com/willp/redis/tree/willp_issue525_memory

And the diff is attached as a patch as well.


Attachments:
issue525_memory_v1.patch 6.9 KB

re...@googlecode.com

Dec 21, 2011, 11:23:50 AM12/21/11
to redi...@googlegroups.com

Comment #4 on issue 525 by dave.pet...@gmail.com: Slow clients using pubsub

Thanks for doing this!

I think your solution is very reasonable, and probably the simplest
out of the alternatives.

