Replace Redis 4.0.6 publish/subscribe with a UDP connection?


frc138

Feb 3, 2018, 5:02:37 AM
to Redis DB

   I am in the process of replacing Redis 4.0.6 publish/subscribe with a UDP connection
for high-speed data transfer at 400 kilobytes per second. The reason: our C++11
RedHat Linux 7.0 Redis client plugin drops 20% of packets when the Redis 4.0.6
publish/subscribe code (networking.c) disconnects, and it suffers buffer overflow
when overwhelmed with messages consisting of key-value pairs containing digitized
analog measurements, even when we raise the redis.conf soft limit for the Redis output buffer.
    
    How do I best make this replacement, given what ghost commented on May 13, 2016: "UDP gives the fastest network communication. UDP is less reliable but much faster than TCP. It is best used for streaming non-critical data like sound, video, or multiplayer game data, as it can drop packets depending on network connectivity and other factors. UDP can be used for local IPC as well, but is slower than #1's Unix Socket or Windows Socket implementation because UDP sockets go through the network card while Unix and Windows Sockets do not."

frc138

Feb 3, 2018, 1:59:39 PM
to Redis DB
There's a lot to disentangle here - the Redis publish-subscribe model is at a higher level than the TCP/UDP question.

The key question with a change from TCP to UDP is how you handle lost packets. You will have to have your code identify missing packets and have some mechanism to tell the server to resend them - or - you will just have to live with the lossiness of UDP.  The reason UDP is faster than TCP is that UDP doesn't worry about the guaranteed delivery of packets. UDP is used for video and music delivery where it doesn't matter if you lose some packets, because you are trading off the guaranteed delivery for faster speed or keeping up with being current (like for a live TV show).

 Has anyone yet tried to replace Redis publish-subscribe with TCP? Going to UDP sounds like you will just encounter the lossiness all over again.

hva...@gmail.com

Feb 3, 2018, 5:34:00 PM
to Redis DB
Yes of course.  The transport protocol does not shield you from network trouble.  If traffic is delayed or lost in the network between Redis and your client, then the traffic is still delayed or lost.  The selection of TCP or UDP is merely a choice of which software you want to detect and repair the data delay/loss:
  • TCP: You ask the operating systems on the sending/receiving machines to handle the packet delay/loss so your application software (e.g. Redis and your client) doesn't have to deal with it.
  • UDP: You ask the operating systems to ignore packet delay/loss, and you write your application software to worry about delay/loss and take appropriate actions when it happens.
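To make the UDP option concrete, here is a minimal sketch of the application-level loss detection that choice implies. The sequence numbers are an assumption on my part: Redis messages carry none, so the sender would have to prepend one to every datagram.

```cpp
#include <cstdint>
#include <vector>

// Track the highest sequence number seen so far and report any gaps.
// Assumption: the sender stamps every UDP datagram with a monotonically
// increasing 32-bit sequence number (Redis does not do this for you).
class LossDetector {
public:
    // Returns the sequence numbers skipped since the previous datagram;
    // these are the candidates for a resend request to the server.
    std::vector<uint32_t> onReceive(uint32_t seq) {
        std::vector<uint32_t> missing;
        if (started_ && seq > last_ + 1) {
            for (uint32_t s = last_ + 1; s < seq; ++s)
                missing.push_back(s);
        }
        if (!started_ || seq > last_) { last_ = seq; started_ = true; }
        return missing;
    }

private:
    uint32_t last_ = 0;
    bool started_ = false;
};
```

Everything beyond this detection step - buffering the un-acked data on the sender and the resend protocol itself - is exactly the work TCP would otherwise do for you.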
In your follow-up post it sounds like you don't want to lose the measurements in your data stream, even when there is delay/loss in the network between your server (where Redis is) and your client.  This means something, somewhere, must buffer up the data that has been transmitted until the receiver has acknowledged its safe receipt.  With TCP, the operating system buffers a lot and the application (Redis) can buffer more.  With UDP, the sending application must buffer all of it.

Here's a thought experiment: Can your particular data stream handle occasional pieces being lost or received out of order, as some kinds of video streaming can?  If there's a seconds-long or minutes-long outage in the network between your sender and your receiver, is it okay to have a gap in the stream and get new data when the outage is over?  If so, then UDP with little or no buffering by the sender (and little or no acknowledgements by the receiver) can probably do well for you.  If not, then depending on the characteristics that your data is sensitive to (ordering of received data, occasional small losses), UDP may still be a good choice, with more buffering by the sender and acking by the receiver.

But the stricter your requirements are for preserving your data, intact and in order, the more your needs match up with the TCP protocol.  You can certainly implement your own replacement at the application layer (your code) in UDP, and this can give you more control, but you would also have to re-implement things like MSS/MTU, selective ACK, Nagle's algorithm, and so on, to match the good features that the operating system's TCP offers you.

I haven't seen a description of the problem that started you pursuing UDP as a transport.  Your first post mentions your client dropping packets in the same phrase as changing the server's config for more buffering, so it's not clear if you perceive a problem in the client, or in the server, or something in between.  It sounds like your client is getting disconnected, and by the time it re-connects, the incoming data stream has overflowed the Redis server's buffer for that client and about 20% of the data is lost.  If this is the case, have you already investigated the causes for the disconnections and decided you can't fix them?

frc138

Feb 4, 2018, 3:44:02 PM
to Redis DB
      Yesterday, I read Chapter 6 of Redis in Action, where Josiah Carlson writes that
"One of the drawbacks of the Redis publish and subscribe model is that the client
must be connected at all times to receive messages; disconnections can cause
the client to lose messages, and older versions of Redis could become unusable,
crash, or be killed if there was a slow subscriber."

       Then, Josiah Carlson states:
"Though push messaging can be useful, we run into problems when clients can’t stay 
connected all the time for one reason or another. To address this limitation, we’ll write 
two different pull messaging methods that can be used as a replacement for 
PUBLISH/SUBSCRIBE. We’ll first start with single-recipient messaging, since it shares 
much in common with our first-in, first-out queues. Later in this section, we’ll move to a 
method where we can have multiple recipients of a message. With multiple recipients, 
we can replace Redis PUBLISH and SUBSCRIBE when we need our messages to 
get to all recipients, even if they were disconnected"

        We are interested to know whether it would be more performant to replace
Redis PUBLISH and SUBSCRIBE with Josiah Carlson's Section 6.5.2
multiple-recipient publish/subscribe replacement, instead of harnessing the TCP or
UDP protocol to detect and repair the disconnection loss.

         If one chooses to use Josiah Carlson's Section 6.5.2
multiple-recipient publish/subscribe replacement, there is a race condition
in sending messages which can be handled with the use of a lock or mutex.
What is the nature of that race condition?

          Finally, may we use multithreading in conjunction with Josiah Carlson's Section 6.5.2
multiple-recipient publish/subscribe replacement to harness the processing power
of today's multicore CPUs?

hva...@gmail.com

Feb 6, 2018, 11:40:42 AM
to Redis DB
Speaking for myself, I haven't read Josiah's book.  I don't know what approaches he describes, so I can't comment on the details of how they work.

A multi-threaded client can overcome limitations of a single-threaded client, if those limitations are what's holding you back.  Multi-threading doesn't offer any ways to avoid disconnections (and, in fact, usually makes disconnect/reconnect activity more complex to manage).  What it does offer is the ability for a client to make multiple connections to the Redis server and download different keys/messages in each connection.  Using parallel connections in this fashion can get you faster data transfer performance through the network - especially when there's a high amount of latency between the client and server.  But if your bottleneck is something besides local cpu power or network latency, multi-threading will not offer you any real benefits.  It won't do you a lot of good to cast about for solutions, you'll be much better served by identifying the cause of the problem, the major symptoms produced by the cause, and choosing a solution that addresses the cause/symptoms.

frc138

Feb 7, 2018, 12:12:47 PM
to Redis DB
  What tcp_max_syn_backlog setting in redis.conf would prevent either of Josiah Carlson's single-recipient and multiple-recipient messaging methods from disconnecting under a load of 20,000 or more messages per second, where each message is 20 bytes long?

  Also, UDP is vulnerable to bursts in traffic generated by a software data recorder.

hva...@gmail.com

Feb 7, 2018, 11:52:09 PM
to Redis DB
"tcp_max_syn_backlog" is a Linux kernel parameter and not a redis.conf configuration parameter.  The parameter in redis.conf is "tcp-backlog".  There was a thread recently on the Redis tcp-backlog setting and the Linux kernel's somaxconn setting, how they interact with each other and what they do:  https://groups.google.com/d/topic/redis-db/ftmlTjEPv98/discussion   The second link in my contribution to that thread explains the role of tcp_max_syn_backlog together with the other parameters.

The summary is they control behavior when two computers are first making a TCP connection to each other.  Those two parameters have nothing whatsoever to do with maintaining a TCP connection or how the computers behave when breaking a TCP connection.  (they would come into play when a client is re-connecting AFTER being disconnected)

Are disconnections the problem that prompted you to start asking about UDP and buffering?  It kind of sounds like it.

frc138

Feb 10, 2018, 10:06:00 PM
to Redis DB
   Yes, disconnections are the problem that prompted me to ask about UDP and buffering with persistence.

   There are two cooperating C++11 threads. One thread enqueues messages originating from DECOM onto a
doubly linked list (std::list) under the protection of a mutex. The other thread dequeues messages from the
std::list and sends them to IADS for visual display, also under the protection of the mutex.
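A minimal sketch of that two-thread design, with illustrative message contents (the real DECOM/IADS message types and I/O are not shown here):

```cpp
#include <condition_variable>
#include <list>
#include <mutex>
#include <string>
#include <thread>
#include <vector>

// Two cooperating threads sharing one std::list under one mutex,
// as in the client described above.  A condition variable lets the
// consumer sleep instead of spinning while the list is empty.
struct Queue {
    std::mutex m;
    std::condition_variable cv;
    std::list<std::string> q;
    bool done = false;  // producer sets this after its last message
};

// Enqueue n illustrative messages (stands in for the DECOM receiver).
void producer(Queue& qu, int n) {
    for (int i = 0; i < n; ++i) {
        std::lock_guard<std::mutex> lk(qu.m);
        qu.q.push_back("msg" + std::to_string(i));
        qu.cv.notify_one();
    }
    std::lock_guard<std::mutex> lk(qu.m);
    qu.done = true;
    qu.cv.notify_one();
}

// Dequeue until the producer is finished (stands in for the IADS sender).
void consumer(Queue& qu, std::vector<std::string>& out) {
    std::unique_lock<std::mutex> lk(qu.m);
    for (;;) {
        qu.cv.wait(lk, [&] { return !qu.q.empty() || qu.done; });
        while (!qu.q.empty()) {
            out.push_back(qu.q.front());
            qu.q.pop_front();
        }
        if (qu.done) return;
    }
}
```

The single mutex couples the two threads exactly the way hvarzan questions later in the thread: if the sending side stalls while holding the lock, the receiving side stalls with it.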

   There is a whitelist which filters which of the 843 message types are to be graphed by IADS.

    Could we attain a throughput of 500,000 bytes per second with a 1 percent probability of message loss?

hva...@gmail.com

Feb 11, 2018, 2:03:20 AM
to Redis DB
The question asked is:

"Could we attain a throughput of 500,000 bytes per second with a 1 percent probability of message loss?"

You're asking one question, but there's another, more fundamental question that must be asked and answered first.

Your client application has one thread for receiving the stream of data from Redis through one non-compressed TCP connection.  The more fundamental question here is:  Can a single TCP non-compressed connection achieve 500KB/s (500,000 bytes/second)?

This question must be answered first because if a single TCP connection cannot achieve 500KB/s, then your client can't either - no matter how well it's written.

The main part of the answer comes from sources like this Wikipedia page:  Measuring network throughput

Near the end of the page, it says the TCP protocol "adds its own overhead", which isn't useful for the purposes of answering this question.  In my past experience, TCP connection streams have very small amounts of overhead.  You get reasonably close to the real bytes/second throughput if you use the easy calculation of bits/second divided by 8 (8 bits per byte).  Near the start of that page, the section titled "Theory: Short Summary" calculates the maximum rate of a single uncompressed TCP connection as 2.376 Megabits/second.  Dividing by 8 produces the figure 297 KB/second.  This is less than the 500 KB/second you are asking about.

So your client can be perfectly capable of accepting and processing data at 500KB/sec (or faster), but the single TCP connection it uses will prevent the data from flowing faster than 280-300KB/s.  Is this the problem you're seeing right now?  I don't know.  I'm not bringing it up as a diagnosis of your current trouble.  I'm bringing it up as trouble you will run into sooner or later.

So how can this transmit bottleneck be overcome?  There are two general approaches:
  1. Data compression - If you can get an average of 3:1 compression with this data, then the effect can be to triple the maximum throughput.  The compression incurs more cpu consumption on the transmitting end, though.  (technically more cpu on the receiving end too, but the increase on the receiving end is usually tiny)
  2. Use multiple TCP connections - If you open 3 connections to Redis and pull different data samples through each connection, you are tripling the maximum throughput (even without compression).
Redis doesn't support compressed connections, so the first approach represents additional work you would have to do.  Multiple TCP connections are supported by Redis, though an ordinary Pub/Sub channel would not behave the way you'd want.  (it would send duplicate messages through the subscribed connections instead of different messages through each connection)  But you could very likely use the alternative designs in the Redis In Action book.  Making your client use 4 threads instead of 2 threads is also additional work, but perhaps not very much since it's already multi-threaded.
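The second approach can be sketched as a round-robin split of the message stream across connections. This helper is hypothetical - Redis does not do this partitioning for you, so your client or the alternative queue design would have to implement it:

```cpp
#include <cstddef>
#include <vector>

// Round-robin assignment of message indices to nConns parallel
// connections.  With 3 connections each carries a different third of
// the stream, so no connection receives duplicate messages (unlike
// attaching 3 subscribers to one Pub/Sub channel).
std::vector<std::vector<std::size_t>> partition(std::size_t nMessages,
                                                std::size_t nConns) {
    std::vector<std::vector<std::size_t>> out(nConns);
    for (std::size_t i = 0; i < nMessages; ++i)
        out[i % nConns].push_back(i);
    return out;
}
```

The receiving side then needs a small re-ordering step if the overall message order matters, since the connections can drain at different speeds.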

Now to the question you asked:  Is the design of your client one that can handle 500KB/s?  I'm not a developer, so I can't offer a deep analysis, but here's the approach I would take to find out:
  • The general design you described sounds pretty good to me.  You're keeping the actions of receiving and queueing data separate from the actions of dequeueing and transmitting to IADS.  The question in my mind is the coupling between them via the common data structure and the mutex - is it fast enough?  Also, is the coupling so close that a slowdown in the dequeue/sending routine will hobble the receiving/enqueueing routine?
  • Are these two routines fast enough?  The approach I would take would be to test them separately:
    • Test the receiving thread pulling the data from Redis and calling a stub routine that pretends to manage the mutex and enqueue the data into the linked list, but just immediately returns.  You'll see if the unshackled receiving routine can keep up with the incoming flow from Redis, and how fast it goes.  Then test with the real mutex/enqueue routine, which always gets the mutex because the other thread isn't running.  Does the mutex/enqueue add overhead and slow the receiving routine down?
    • Test the dequeueing and sending routine with a pre-loaded data structure and a routine that only pretends to get the mutex.  This unshackled routine should process the input data as fast as possible and send it to IADS.  Is this fast enough to keep up with the flow from Redis?  Add the real mutex/dequeue, but without the receiving thread, so there's no additional delay getting the mutex.  Does it have enough overhead to slow the routine down?  How much?
    • Finally test both the receiving/enqueueing thread and the dequeueing/sending threads together and see what their combined performance is.
Your measurements will give you the info on whether the routines are fast enough by themselves, and whether the mutex locking slows them down too much.  I apologize if you've already done all this testing.  I couldn't be sure, and I thought it would be better to risk explaining steps you don't need to try instead of omitting steps you do need to try.

Other folks who are more experienced in software development can chime in and offer better suggestions.

To recap:  I think looking into the use of multiple TCP connections to Redis may be necessary to achieve your throughput goals.

frc138

Feb 11, 2018, 4:54:36 AM
to Redis DB
         Thank you for your detailed analysis of TCP throughput. In the Stack Overflow question asked
by user3243499, Jack Chan wrote an answer that says, "So usually, the throughput measured
by UDP should be higher than that from TCP, although the difference should be small (~5% to 10%)."
Why is this true?

          Also, a difficulty of measuring UDP throughput is that the sender buffer must be full.



frc138

Feb 11, 2018, 6:20:39 AM
to Redis DB
     Our prototype system is a hybrid TCP/UDP design, where the DECOM uplink uses UDP while
the IADS downlink uses TCP/IP.


frc138

Feb 11, 2018, 11:10:34 AM
to Redis DB
            I understand that the total throughput for our hybrid system is min(UDP uplink, TCP downlink). May I ask
what your exact recommendation is for using Redis 4.0.7 to replace Redis publish and subscribe so as to eliminate
dropped messages? Thank you.



hva...@gmail.com

Feb 11, 2018, 6:46:21 PM
to Redis DB
I'm not Jack Chan, so I'm not prepared to defend/explain his statements about the difference between TCP and UDP throughput measurements.

hva...@gmail.com

Feb 13, 2018, 1:05:07 PM
to Redis DB
Other folks are free to give their own suggestions in this thread.

My suggestion is a summary of items previously discussed:  I think the approach that will give you the network throughput you want and the ability to buffer data through client disconnects and re-connects is for your client to open and maintain multiple connections to the Redis server.  Rather than Pub/Sub, the Redis server and client use an alternative design for holding the data samples and indexing them to be transferred to the client in the order they were received by Redis.  I haven't read Josiah's book (Redis In Action), but I'm confident of his knowledge and experience in these matters, and one of the designs in his book will do the job.

frc138

Feb 28, 2018, 8:14:49 PM
to Redis DB
@hvarzan,  Does there exist an alternative approach, other than Redis or UDP, that will
give us the network throughput we want and the ability to buffer data through
client disconnects and re-connects?

Thank you.

hva...@gmail.com

Mar 5, 2018, 1:31:16 AM
to Redis DB
You replied to my summary, so you've seen it.  Because you asked, I'll expand on it.
  1. I have never created a solution similar to the one you're building, so I can't give you an answer with 100% assurance.
  2. You have observed from other sources that UDP can be faster than TCP at transmitting data through a network, but the difference is tiny.  In my opinion, UDP is not worth the overhead of the extra code you must write to detect missing data and re-transmit, so TCP is best.
  3. The links I posted earlier about TCP throughput calculations indicate that a single TCP connection cannot achieve the transfer rate you want.  You need at least 2 parallel TCP connections, each transferring different data from the other, to increase the transfer rate.  I suggest using 3 so a single dropped connection will not create a backlog while it is down, and also to allow for normal growth in the amount of data you're handling over time.
  4. On the sending side, the software can be Redis, or RabbitMQ, or Kafka, or similar kinds of queueing software.  Would RabbitMQ or Kafka be more suitable to your needs?  Perhaps.  I haven't used them, so I can't say for sure, but I think they're worth investigating.
  5. In Redis (since this is the Redis mailing list), a simple Pub/Sub queue will not work because when you attach 3 subscriber connections to the queue, they'll all get the same messages rather than different ones.
  6. I believe an alternative approach exists in Redis, but because I haven't built one myself, I can't describe it for you.  Does the book _Redis in Action_ have any examples that replace a Pub/Sub queue with an alternative approach?

hva...@gmail.com

Mar 5, 2018, 11:42:51 PM
to Redis DB
Today I had the opportunity to read through some of the Kafka documentation and I found some encouraging stuff.  Kafka has topics, which act similar to a PubSub channel.
  • Topics can be divided into partitions, and the messages spread around the partitions in round-robin fashion.  As an example, divide the topic into 3 partitions, so each partition gets 1/3rd of the messages.
  • A Kafka consumer thread/process handles a connection from a client and reads the messages from one of the partitions and delivers them to the client.
  • So 3 client connections each talk to a consumer thread, each reading messages from a partition.
This divides the messages up among the client connections the way I suggested in yesterday's post, transferring the data up to 3x faster than a single client connection.  If one of the connections breaks, the corresponding partition continues to receive new messages, and when the connection is made again, the client and consumer pick up where they left off.  Meanwhile the other clients/consumers can continue streaming messages.

I believe it's possible to configure RabbitMQ to have a similar effect, and as I said yesterday, I believe there's a way to do what you need with Redis.  I posted this about Kafka because I found some relevant info and wanted to pass it along.  Hope this helps.