Keyspace state after multiple Kamailio failovers


Dries De Gendt

Jun 10, 2026, 7:48:44 AM
to Sipwise rtpengine

Hi,

I have two machines running keepalived, with a floating IP (10.5.225.100) for Kamailio/rtpengine.

instance #1: 10.5.225.101
instance #2: 10.5.225.102

A Redis cluster with VIP 10.5.225.118 is also configured, hosted on the same two instances.

rtpengine is configured as follows:

Instance #1:
OPTIONS="--interface pub1/10.5.225.100 --interface pub2/10.5.225.102 -n 10.5.225.100:2223 --redis=10.5.225.118:6379/1 --subscribe-keyspace=2 --redis-num-threads=8 --active-switchover=true -m 10000 -M 20000 -L 4 --log-facility=local1 --table=0 --delete-delay=0 --timeout=60 --silent-timeout=600 --final-timeout=7200 --offer-timeout=60 --num-threads=12 --tos=184 --no-fallback --log-level=6 --listen-cli=10.5.225.101:9900 --listen-http=10.5.225.101:9901 --homer=192.168.83.15:9060 --homer-id=5678 --homer-protocol=udp"

Instance #2:
OPTIONS="--interface pub1/10.5.225.100 --interface pub2/10.5.225.101 -n 10.5.225.100:2223 --redis=10.5.225.118:6379/2 --subscribe-keyspace=1 --redis-num-threads=8 --active-switchover=true -m 10000 -M 20000 -L 4 --log-facility=local1 --table=0 --delete-delay=0 --timeout=60 --silent-timeout=600 --final-timeout=7200 --offer-timeout=60 --num-threads=12 --tos=184 --no-fallback --log-level=6 --listen-cli=10.5.225.102:9900 --listen-http=10.5.225.102:9901 --homer=192.168.83.15:9060 --homer-id=5678 --homer-protocol=udp"

When both Kamailio/rtpengine instances are active and a call is created on instance #2, list numsessions returns the following:

instance #1
Current sessions own: 0
Current sessions foreign: 1
Current sessions total: 1

instance #2
Current sessions own: 1
Current sessions foreign: 0
Current sessions total: 1
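
For anyone who wants to check these counters from a script rather than by eye, here is a minimal sketch. It assumes the output format is exactly the "Current sessions <kind>: <n>" lines shown above; the function name numsessions_field is made up for illustration and is not part of rtpengine. In practice the input would come from rtpengine-ctl ... list numsessions instead of the sample text.

```shell
#!/bin/sh
# Hypothetical helper: print one "Current sessions <kind>" counter from stdin.
# <kind> is one of: own, foreign, total.
numsessions_field() {
    awk -F': *' -v want="Current sessions $1" '$1 == want { print $2; exit }'
}

# Sample input copied from the instance #1 output above.
sample='Current sessions own: 0
Current sessions foreign: 1
Current sessions total: 1'

# Prints the foreign-session count of the sample (1).
printf '%s\n' "$sample" | numsessions_field foreign
```

The helper only matches the label exactly, so a line such as "Current transcoded media" is ignored.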

When I stop Kamailio on instance #2 to force a failover of the VIP, instance #1 takes ownership and the call is handled by its rtpengine instance.

instance #1:
Current sessions own: 1
Current sessions foreign: 0
Current sessions total: 1
Current transcoded media: 1

rtpengine keeps running on instance #2 and still reports the call as its own, but it eventually cleans the call up once --timeout=60 is reached.

instance #2 (before the timeout):
Current sessions own: 1
Current sessions foreign: 0
Current sessions total: 1
Current transcoded media: 1

instance #2 (after the timeout):
Current sessions own: 0
Current sessions foreign: 0
Current sessions total: 0
Current transcoded media: 0

I then start Kamailio again on instance #2, but rtpengine there is not aware of the call that is still ongoing on instance #1. If I were to fail over from instance #1 to instance #2 now, the call would drop because no RTP would be relayed.

instance #2:
Current sessions own: 0
Current sessions foreign: 0
Current sessions total: 0
Current transcoded media: 0
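
The situation above could be detected before a failover with a check along these lines. This is only a sketch, assuming that a healthy standby should report as many foreign sessions as the active instance reports own sessions; that rule is my reading of the outputs in this post, not documented rtpengine behaviour, and the counts can differ briefly while calls are being set up. The function names field and standby_in_sync are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: extract one "Current sessions <kind>" counter from a text blob.
field() {
    printf '%s\n' "$1" | awk -F': *' -v want="Current sessions $2" '$1 == want { print $2; exit }'
}

# standby_in_sync ACTIVE_OUTPUT STANDBY_OUTPUT
# Succeeds when the standby's "foreign" count equals the active's "own" count.
standby_in_sync() {
    own=$(field "$1" own)
    foreign=$(field "$2" foreign)
    [ -n "$own" ] && [ -n "$foreign" ] && [ "$own" -eq "$foreign" ]
}

# Values from this post: instance #1 is active, instance #2 has lost the call.
active='Current sessions own: 1
Current sessions foreign: 0
Current sessions total: 1'
standby='Current sessions own: 0
Current sessions foreign: 0
Current sessions total: 0'

if standby_in_sync "$active" "$standby"; then
    echo "standby in sync"
else
    echo "standby is missing calls"
fi
```

With the numbers from this post the check reports that the standby is missing calls, which matches the dropped-call risk described above.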

If I restart rtpengine on instance #2 before failing over instance #1, rtpengine first syncs with the keyspace, after which the failover succeeds.

Should I always restart rtpengine after starting Kamailio again to resolve this, or am I missing something here?

Thanks,
Dries

Dries De Gendt

Jun 16, 2026, 9:41:38 AM
to Sipwise rtpengine
Hi,

Instead of using --active-switchover, I now explicitly instruct rtpengine to take or release ownership of existing calls as soon as the VIP has moved, and this appears to work.

/usr/local/src/rtpengine/utils/rtpengine-ctl -ip x.x.x.x -port 9900 active
/usr/local/src/rtpengine/utils/rtpengine-ctl -ip x.x.x.x -port 9900 standby
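
One possible way to tie these two commands to the VIP move is a keepalived notify script. The sketch below is an assumption about how that could be wired up, not the exact setup used here: the function name rtpengine_on_state is hypothetical, the default CLI address is taken from instance #1's --listen-cli and must be adjusted per instance, and treating FAULT the same as BACKUP is my own choice. It relies on keepalived calling notify scripts with the arguments INSTANCE|GROUP, name, state and priority.

```shell
#!/bin/sh
# Hypothetical keepalived notify handler: maps the VRRP state to an rtpengine-ctl command.
# keepalived invokes notify scripts as: <script> INSTANCE|GROUP <name> <state> <priority>

# Path taken from the commands above; override via the environment if needed.
RTPENGINE_CTL=${RTPENGINE_CTL:-/usr/local/src/rtpengine/utils/rtpengine-ctl}
# Assumption: the local --listen-cli address (instance #1 shown; adjust per instance).
CLI_IP=${CLI_IP:-10.5.225.101}
CLI_PORT=${CLI_PORT:-9900}

rtpengine_on_state() {
    case "$1" in
        MASTER)       cmd=active ;;    # VIP arrived here: take ownership of calls
        BACKUP|FAULT) cmd=standby ;;   # VIP left (assumption: FAULT handled like BACKUP)
        *)            echo "ignoring state: $1" >&2; return 0 ;;
    esac
    "$RTPENGINE_CTL" -ip "$CLI_IP" -port "$CLI_PORT" "$cmd"
}

# Only act when called with keepalived's argument list (state is the third argument).
if [ "$#" -ge 3 ]; then
    rtpengine_on_state "$3"
fi
```

The script would be referenced from the vrrp_instance block via keepalived's notify option, so that both the MASTER and BACKUP transitions go through the same handler.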

Dries
On Wednesday, June 10, 2026 at 13:48:44 UTC+2, Dries De Gendt wrote: