Subscribe events for Redis for Hot Standby

Jonathan Hulme

May 11, 2022, 10:12:48 AM

to rtpengine

Hi,

I am not entirely sure how this is supposed to work, so please correct me if I am wrong.

I have a primary server using a floating IP. This works fine: calls come in and they get written to the Redis backend.

I have my standby server which subscribes and I see the psubscribe in redis monitor.

The Problem

The active server never seems to issue a PUBLISH message, rendering the psubscribe useless.

I can force the open dialogs to appear on the standby server if I restart it.

What I expect

I may be mistaken about the architecture here, but I am expecting a SET and a PUBLISH to be issued together to maintain state: SET for persistence and PUBLISH to update the standby servers in real time.

Is my understanding correct? If not, what is the psubscribe for?

What else might I be missing to get the standby server to be instantly ready for failover?

Regards Jonathan

Richard Fuchs

May 11, 2022, 11:02:19 AM

to rtpe...@googlegroups.com

We don't use explicit PUBLISH. Instead we use keyspace notifications, where the subscriber gets notified about any changes to any keys in the keyspace, meaning that the SET alone triggers a notification to the subscriber. Keyspace notifications must be enabled in the Redis config (they're usually disabled by default) which is a common oversight.
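As a rough illustration of the mechanism described above, here is a minimal Python sketch of how Redis names keyspace notification channels. The Redis directive involved is `notify-keyspace-events`; it is empty by default, which disables notifications, and a value such as `KEA` turns on the keyspace and keyevent channels for most event classes. Which event classes rtpengine actually needs is not stated in this thread, so treat that value as an assumption. The helper names below are hypothetical and only show the `__keyspace@<db>__:<key>` naming convention that a PSUBSCRIBE pattern would match.

```python
import re

# Redis publishes keyspace events on channels named "__keyspace@<db>__:<key>".
# The message payload is the event name, e.g. "set" or "expire".
_CHANNEL_RE = re.compile(r"__keyspace@(\d+)__:(.*)", re.DOTALL)


def keyspace_pattern(db: int) -> str:
    """Return a PSUBSCRIBE pattern matching every key in one database."""
    return f"__keyspace@{db}__:*"


def parse_keyspace_channel(channel: str):
    """Split a keyspace channel name into (db, key); None if it is not one."""
    match = _CHANNEL_RE.fullmatch(channel)
    if match is None:
        return None
    return int(match.group(1)), match.group(2)
```

With notifications enabled, a plain SET of a call ID in database 1 would be announced on the channel `__keyspace@1__:<call ID>`, so no explicit PUBLISH is needed on the writer's side.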

Cheers

Jonathan Hulme

May 11, 2022, 11:46:14 AM

to rtpengine

Thank you, that has got me one step further.

The problem I am seeing now is:

rtpengine[556738]: WARNING: [9jn7UU7wHza8WGocdQBlXg..]: [core] Failed to restore call ID '9jn7UU7wHza8WGocdQBlXg..' from Redis: could not retrieve JSON data from redis

Again watching MONITOR, I can see the SET from Server A and the subsequent GET of the same key from Server B:

1652283461.250255 [1 10.139.0.6:43928] "SET" "9jn7UU7wHza8WGocdQBlXg.." "{\"json\":...}"
1652283461.250627 [1 10.139.0.6:43928] "EXPIRE" "9jn7UU7wHza8WGocdQBlXg.." "86400"
1652283461.251425 [1 10.139.0.6:43930] "PING"
1652283461.251459 [1 10.139.0.9:53988] "PING"
1652283461.251983 [1 10.139.0.9:53988] "GET" "9jn7UU7wHza8WGocdQBlXg.."

I am using the 10.4 stable branch on both servers. Any suggestions?

Regards Jonathan

Richard Fuchs

May 13, 2022, 3:38:31 AM

to rtpe...@googlegroups.com

On 11/05/2022 11.46, [EXT] Jonathan Hulme wrote:

> Thank you, that has got me 1 step further.
>
> The problem I am seeing now is
>
> rtpengine[556738]: WARNING: [9jn7UU7wHza8WGocdQBlXg..]: [core] Failed to restore call ID '9jn7UU7wHza8WGocdQBlXg..' from Redis: could not retrieve JSON data from redis
>
> Again watching the MONITOR, I can see the SET from Server A and see that the variable is being set and read from Server B
>
> 1652283461.250255 [1 10.139.0.6:43928] "SET" "9jn7UU7wHza8WGocdQBlXg.." "{\"json\":...}"
> 1652283461.250627 [1 10.139.0.6:43928] "EXPIRE" "9jn7UU7wHza8WGocdQBlXg.." "86400"
> 1652283461.251425 [1 10.139.0.6:43930] "PING"
> 1652283461.251459 [1 10.139.0.9:53988] "PING"
> 1652283461.251983 [1 10.139.0.9:53988] "GET" "9jn7UU7wHza8WGocdQBlXg.."
>
> I am using the 10.4 stable branch on both servers. Any suggestions?

That doesn't really explain what's happening so you need to look further. The error message is logged when the GET command didn't return any data (or didn't return a string), but I can only guess as to why that is. Perhaps it's trying to read from the wrong DB? The most thorough way to inspect what's going on is with Wireshark: look at what's happening on the Redis port, look at the GET command and what response it returns, look at the SELECT command preceding it.
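To make the suggested inspection concrete, the following is a minimal Python sketch for reading a raw Redis byte stream such as the one Wireshark shows under "Follow TCP Stream". It assumes the connection speaks plain RESP2 and that the client-to-server bytes have been saved separately; the function names are hypothetical. It tracks which database each GET runs against (a new connection starts on DB 0 until a SELECT is sent) and classifies a GET reply as a string, nil, or an error.

```python
def parse_commands(stream: bytes):
    """Yield each command in a client-to-server RESP2 stream as a list of bytes."""
    pos = 0
    while pos < len(stream):
        if stream[pos:pos + 1] != b"*":
            raise ValueError("expected a RESP array at offset %d" % pos)
        end = stream.index(b"\r\n", pos)
        count = int(stream[pos + 1:end])
        pos = end + 2
        args = []
        for _ in range(count):
            if stream[pos:pos + 1] != b"$":
                raise ValueError("expected a bulk string at offset %d" % pos)
            end = stream.index(b"\r\n", pos)
            length = int(stream[pos + 1:end])
            start = end + 2
            args.append(stream[start:start + length])
            pos = start + length + 2  # skip the payload and its trailing CRLF
        yield args


def db_for_each_get(stream: bytes):
    """Return a (db, key) pair for every GET, based on the preceding SELECT."""
    db = 0  # Redis connections use database 0 until SELECT changes it
    result = []
    for args in parse_commands(stream):
        if not args:
            continue
        command = args[0].upper()
        if command == b"SELECT":
            db = int(args[1])
        elif command == b"GET":
            result.append((db, args[1].decode()))
    return result


def classify_get_reply(raw: bytes) -> str:
    """Classify the first reply in a server-to-client RESP2 byte string."""
    if raw.startswith(b"$-1\r\n"):
        return "nil"      # key does not exist in the selected database
    if raw.startswith(b"$"):
        return "string"   # the data a restore would need
    if raw.startswith(b"-"):
        return "error"
    return "other"
```

A "nil" reply to the GET, or a GET issued against a different database than the one the active instance writes to, would both be consistent with the logged error.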

Cheers

Jonathan Hulme

May 13, 2022, 4:38:00 AM

to rtpengine

I have checked this in Wireshark and it appears to be using the correct database.

I have attached the Wireshark capture (taken on the standby RTP server), the stream for that query, and a level 7 log (different call, same effect, also logged on the standby RTP server).

Please let me know if you can see what is going wrong, Thank you.

redis-stream.txt

redis.pcap

rtpengine-log.txt

Richard Fuchs

May 13, 2022, 6:26:54 AM

to rtpe...@googlegroups.com

On 13/05/2022 04.38, [EXT] Jonathan Hulme wrote:
> I have checked this on wireshark and it appears to be using the
> correct database.
>
> I have attached the wireshark (captured from the Standby RTP Server),
> the stream querying that and a level 7 log (different call, same
> effect, also logged on the Standby RTP Server)
>
> Please let me know if you can see what is going wrong, Thank you.

The initial GET does work. It then receives another notification for the
same call ID (probably the answer coming through, triggering an update
for the call), at which point a DEL is issued to the DB. The subsequent
GET then results in failure. So you have some interference there where
both active and standby rtpengine seem to be writing into the same DB?
Possibly not directly but via Redis replication? Double check config and
replication paths. Hint: The standby rtpengine first internally deletes
an existing (standby) call when an update for that call is received,
then immediately re-restores the updated version of that call from Redis.
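One way to check for this kind of interference is to see which hosts issue write commands for a given call ID. Below is a minimal Python sketch that parses MONITOR output in the format shown earlier in this thread and groups writers by database and key. The function name and the set of commands treated as writes are assumptions made for illustration, and the client-address parsing assumes IPv4 `host:port` values as in the sample.

```python
import re
from collections import defaultdict

# MONITOR line layout: <timestamp> [<db> <client addr>] "<CMD>" "<arg>" ...
LINE_RE = re.compile(r'^(\d+\.\d+) \[(\d+) ([^\]]+)\] (.*)$')
ARG_RE = re.compile(r'"((?:[^"\\]|\\.)*)"')
WRITE_COMMANDS = {"SET", "DEL", "EXPIRE"}  # assumed subset, for illustration

SAMPLE = [
    r'1652283461.250255 [1 10.139.0.6:43928] "SET" "9jn7UU7wHza8WGocdQBlXg.." "{\"json\":...}"',
    r'1652283461.250627 [1 10.139.0.6:43928] "EXPIRE" "9jn7UU7wHza8WGocdQBlXg.." "86400"',
    r'1652283461.251425 [1 10.139.0.6:43930] "PING"',
    r'1652283461.251459 [1 10.139.0.9:53988] "PING"',
    r'1652283461.251983 [1 10.139.0.9:53988] "GET" "9jn7UU7wHza8WGocdQBlXg.."',
]


def hosts_writing_each_key(lines):
    """Map (db, key) to the set of client hosts that issued write commands."""
    writers = defaultdict(set)
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match is None:
            continue  # not a MONITOR command line
        db, client, rest = int(match.group(2)), match.group(3), match.group(4)
        args = ARG_RE.findall(rest)
        if len(args) < 2:
            continue  # commands without a key, such as PING
        command, key = args[0].upper(), args[1]
        if command in WRITE_COMMANDS:
            host = client.rsplit(":", 1)[0]  # drop the ephemeral port
            writers[(db, key)].add(host)
    return dict(writers)
```

For the sample lines, only 10.139.0.6 appears as a writer. If the standby's address also showed up against the same database and key, for example with a DEL, that would match the interference described above.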

Cheers

Jonathan Hulme

May 17, 2022, 4:52:54 AM

to rtpengine

All working now; it was a problem with my configuration of interfaces.

Thanks for your help.
