Handling scheduled actions with HA/DR

Gavin Henry

Feb 15, 2019, 5:41:23 PM
to CGRateS
Hi all,

I think I've asked this before, buried in other threads, but when operating in active/passive mode using, say, keepalived or HAProxy, what do I do about SchedulerS?

I want to have two or more CGRateS/RALs systems that can take traffic, which HAProxy can control via, say, sticky sessions later. For now my use of RALs is stateless: an auth and a CDR rating over HTTPS.

I'm just not sure what SchedulerS will do with ActionPlans if two or more systems are running. I know CGRateS can take a lot of traffic and is fast, and ours lives in a Docker Swarm, so DR and HA are handled by that. We can put a health check in that, and if it goes down systemd brings it back, but still. I'd like maybe a normal VM as a final backup system. What should I be concerned about?

Dan, sorry. You have explained active/passive before, but I'm not sure whether multiple systems running jobs at the same time is an issue. Overthinking as usual.

Thanks,
Gavin.

Dan Christian Bogos

Feb 19, 2019, 12:07:10 PM
to cgr...@googlegroups.com
Hey Gavin,

Please see answers inline ...

On Fri, 2019-02-15 at 14:41 -0800, Gavin Henry wrote:
> Hi all,
>
> I think I've asked this before buried in other threads, but when
> operating in active/passive mode using say, keepalived or HAProxy,
> what do I do about SchedulerS?
You should normally always run the scheduler where RALs is running;
otherwise you risk overwriting the accounts from two different
places.
You can dynamically control starting/stopping of SchedulerS via the
API, i.e. here:
https://godoc.org/github.com/cgrates/cgrates/apier/v1#ApierV1.StartService
Example of use here:
https://github.com/cgrates/cgrates/blob/master/apier/v1/apier_it_test.go#L1704

So basically you can have your NMS manage the scheduler start/stop
remotely.
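
A minimal sketch of that remote start/stop in Python, not taken from the CGRateS sources. It assumes the JSON-RPC API is reachable over HTTP at a URL such as http://127.0.0.1:2080/jsonrpc, that the method takes a single object with a ServiceID field, and that the scheduler's service ID is "scheduler"; please check all three against the godoc page and the test linked above for your version. ApierV1.StopService is assumed here as the counterpart of ApierV1.StartService, and build_service_request / call_cgrates are illustrative names.

```python
import json
import urllib.request


def build_service_request(action, service_id="scheduler", request_id=1):
    """Build the JSON-RPC payload that starts or stops a CGRateS service.

    Assumptions: the ServiceID field name, the "scheduler" service ID and
    the ApierV1.StopService method name are not confirmed by this thread.
    """
    if action not in ("start", "stop"):
        raise ValueError("action must be 'start' or 'stop'")
    method = "ApierV1.StartService" if action == "start" else "ApierV1.StopService"
    return {
        "method": method,
        "params": [{"ServiceID": service_id}],
        "id": request_id,
    }


def call_cgrates(url, payload, timeout=5.0):
    """POST the payload to the (assumed) JSON-RPC endpoint and decode the reply."""
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

An NMS hook could then call call_cgrates("http://127.0.0.1:2080/jsonrpc", build_service_request("start")) on the node being promoted and the "stop" variant on the node being demoted.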
>
> I want to have two or more CGRateS/RALS systems that can take traffic
> that HAProxy can control via say sticky sessions later. For now my
> use of RALS is stateless; an auth and a CDR rating over https.
>
> I'm just not sure what the SchedulerS will do with ActionPlans if
> there are two systems or more running. I know CGRateS can take a lot
> of traffic and is fast and ours is living in a Docker Swarm, so DR
> and HA is handled by that. We can put a health check in that and if
> it goes down systemd brings it back, but still. I'd like maybe a
> normal VM as a final backup system. What should I be concerned about?
The most important thing is that you create/top up accounts in the same
place where you debit them (i.e. via real-time calls). That is why I
recommended that the scheduler always stays with RALs.
>
> Dan, sorry. You have explained active/passive before but I'm not sure
> if multiple systems running jobs at the same time is an issue. Over
> thinking as usual.
Only if you are unlucky and the same account is processed concurrently
(i.e. within the same ms) on two different nodes.
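
To make that risk concrete, here is a generic read-modify-write illustration in Python. It is not CGRateS code and says nothing about how CGRateS stores accounts internally; the balance values and function names are made up for the example. It only shows why a top-up on one node and a debit on another can lose an update when both nodes read the account before either writes it back.

```python
def apply_serialized(start_balance, delta_a, delta_b):
    """Node B reads the balance only after node A has written it back."""
    stored = start_balance + delta_a
    stored = stored + delta_b
    return stored


def apply_interleaved(start_balance, delta_a, delta_b):
    """Both nodes read the same starting balance before either writes."""
    read_a = start_balance
    read_b = start_balance
    stored = read_a + delta_a  # node A writes its result first
    stored = read_b + delta_b  # node B overwrites it, so A's change is lost
    return stored


# A top-up of 5 (e.g. from a scheduled action) and a debit of 3 (e.g. from a call)
print(apply_serialized(10, 5, -3))   # 12
print(apply_interleaved(10, 5, -3))  # 7
```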

DanB
>
> Thanks,
> Gavin.

Gavin Henry

Feb 19, 2019, 1:30:46 PM
to CGRateS
Hi Dan,

Thanks.

I'm thinking we set up an active/passive SchedulerS, but run the http_agent parts and everything else like a normal HAProxy design with X backend app instances. I'll do a sketch and add it here so you can see what I mean.

Gavin.

వెంకటేష్ ఎన్నల

Sep 11, 2019, 1:05:37 AM
to CGRateS
Hi Gavin,

We are currently in the same situation. How did you configure your RALs and Scheduler?

We have multiple RALs deployed to Kubernetes. Currently we don't have a scheduler, and we plan to add one.
So I was wondering what you ended up with.

Appreciate your response.

Thanks,
Venkat.

Gavin Henry

Sep 12, 2019, 6:31:31 AM
to CGRateS
Hi Venkat,

Just have one RALs with a scheduler, and if a health check fails on it, promote one of the other RALs to run the scheduler, either via the enabled: true setting in the JSON config or via the API. It will need a bit of a health check script, but that is not too difficult.
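
A rough sketch of what such a health check script might look like, in Python. Everything here is hypothetical: the SchedulerFailover class, the probe and promote callables, and the "three consecutive failures" threshold are illustrative choices, not something CGRateS provides. The probe would be whatever check you trust for the primary, and the promote step would be the API call or config change that starts the scheduler on the standby.

```python
class SchedulerFailover:
    """Promote a standby once the primary fails several checks in a row."""

    def __init__(self, probe_primary, promote_standby, max_failures=3):
        self.probe_primary = probe_primary      # callable returning True if healthy
        self.promote_standby = promote_standby  # callable that starts the standby scheduler
        self.max_failures = max_failures        # assumed threshold, tune as needed
        self.failures = 0
        self.promoted = False

    def tick(self):
        """Run one health check; return True if the standby was promoted now."""
        if self.promoted:
            return False  # already failed over, so never promote twice
        if self.probe_primary():
            self.failures = 0  # a healthy check resets the failure streak
            return False
        self.failures += 1
        if self.failures >= self.max_failures:
            self.promote_standby()
            self.promoted = True
            return True
        return False
```

Calling tick() from a timer or cron-style loop keeps the decision logic separate from the actual probe and promotion calls. Requiring several consecutive failures is a design choice to avoid promoting on a single dropped check, which would briefly leave two schedulers running.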

Thanks.

వెంకటేష్ ఎన్నల

Sep 12, 2019, 9:41:33 AM
to CGRateS
Thanks Gavin.