Questions about Dispatchers & Scheduler in HA environment.


vasilios.t...@voiceland.gr

Apr 23, 2021, 10:19:59 AM
to CGRateS
Dear Team,

I am using the latest CGRateS v0.10 and I am at the stage where I want to make our CGRateS service highly available and easy to maintain and upgrade without downtime.
I am developing an NMS which, among other things, will also manage the scheduler service across multiple availability zones.

So I have two questions: one regarding RALs + Scheduler, and one regarding the dispatcher
and how I can use it with a sessions instance that is connected to OpenSIPS.

1) Currently I have deployed the CGRateS RALs service in AWS in 3 availability zones (z1, z2, z3), all facing the same DB backend,
and I only send traffic from the sessions service to the RALs service in the first availability zone (z1).
If RALs in z1 fails, the traffic is switched to z2. (I am using HAProxy for this now.)

I want to ask whether it would be a problem for the scheduler service to run on a CGRateS instance in zone 2 while the sessions service sends traffic to RALs running in zone 1.
The question is this: can we decouple the RALs and Scheduler services into 2 different CGRateS instances, or do they need to be on the same CGRateS instance for some reason?

I am asking because I have read here --> https://groups.google.com/g/cgrates/c/eEQugJ7zHaA/m/8vFaS5S7BAAJ
that you suggest the scheduler always stays with RALs. Do you mean that they should share the same DB, or that they should be on the same instance?

2) My setup is this: OpenSIPS ---> CGRateS Sessions --> RALs zone 1
I want to use dispatchers for the connections from Sessions --> RALs
so that I can define multiple RALs and enable, for example, a quick failover in case RALs in zone 1 fails.

When I enabled the dispatcher in the CGRateS Sessions instance, OpenSIPS sent this request to CGRateS Sessions:

{ "method": "SessionSv1.AuthorizeEventWithDigest", "id": 726698612, "params": [ { "Tenant": "voiceland.dev", "ProcessStatQueues": true, "ProcessThresholds": true, "ReleaseResources": true, "AllocateResources": true, "AuthorizeResources": true, "GetSuppliers": true, "GetAttributes": true, "GetMaxUsage": true, "Event": { "UserAgent": "snomD712\/10.1.54.13", "OriginIP": "80.245.X.X", "Category": "call-out", "Destination": "+302107001397", "OriginCLD": "+302107001397", "OriginCLI": "+302107001398", "ID": "OPENSIPS_CHECK_USER_BALANCE_OUT", "CDRType": "CUSTOMER", "Source": "SIP-1-AWS", "OriginHost": "10.100.0.21", "Subject": "INTERNAL", "RequestType": "*prepaid", "Account": "+302107001398", "Tenant": "voiceland.dev", "OriginID": "3136313931303739323831323733-htdn1oxlr5ot", "SetupTime": "1619107928" } } ] }

and CGRateS responded to OpenSIPS with this:

T 2021/04/22 16:12:08.668924 127.0.0.1:2014 -> 127.0.0.1:42478 [AP] #6
{"id":726698612,"result":null,"error":"ATTRIBUTES_ERROR:MANDATORY_IE_MISSING: [ArgDispatcher]"}

As far as I can understand, a parameter is missing here (ArgDispatcher).
Could you please explain whether the dispatcher will work with OpenSIPS as is, or whether it needs extra code on the OpenSIPS side?
Do we need to pass extra opt arguments to CGRateS from OpenSIPS?

Thank you in advance for your support, and thanks again for this great project!

Vasilios Tzanoudakis


Alexandru Tripon

May 4, 2021, 8:10:20 AM
to cgr...@googlegroups.com
Hi Vasilios,

We recommend running RALs and SchedulerS on the same machine because otherwise the account in the DB could be overwritten: RALs takes the account for a debit while the Scheduler takes the same account for a top-up, and whichever one writes the account last overwrites the other's write (we cannot lock the account if they are not in the same process).
Regarding the dispatchers, it should work, but without the dispatcher authorization part (no attributes connection from the dispatcher subsystem).
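
The overwrite risk described above for RALs and SchedulerS is a lost update. The following Python sketch only illustrates the idea under stated assumptions: the `db` dict stands in for the shared database, and the key, the amounts, and the read/write helpers are hypothetical, not CGRateS code.

```python
# Illustrative model only: `db` stands in for the shared DataDB and the
# two "services" are hypothetical, not CGRateS code.
db = {"account:1001": 10.0}

def read(key):
    return db[key]

def write(key, value):
    db[key] = value

# Interleaving without a shared lock (two separate processes):
rals_view = read("account:1001")        # RALs reads 10.0 for a debit
sched_view = read("account:1001")       # Scheduler reads 10.0 for a top-up
write("account:1001", sched_view + 5)   # Scheduler writes 15.0
write("account:1001", rals_view - 2)    # RALs writes 8.0, the top-up is lost
unlocked_balance = db["account:1001"]

# Same operations serialized, as one process holding a lock could do:
db["account:1001"] = 10.0
write("account:1001", read("account:1001") + 5)   # top-up
write("account:1001", read("account:1001") - 2)   # debit
locked_balance = db["account:1001"]

print(unlocked_balance, locked_balance)
```

In the unlocked interleaving the final balance reflects only the debit; in the serialized case both the top-up and the debit are applied.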

Thanks,
Trial97

vasilios.t...@voiceland.gr

May 7, 2021, 1:16:55 PM
to CGRateS
Dear Alexandru,

thank you for your reply.

As I have described, I am sending this config to a master CGRateS server, which has this config --> https://pastebin.com/DteVEACz
and the config is replicated to --> SESSIONS_MASTER_Z1_REPLICATION_ADDRESS, which has this config --> https://pastebin.com/0mPKcvBq

I have disabled dispatcher authorization on SESSIONS_MASTER_Z1_REPLICATION_ADDRESS, so the dispatchers section is:

"dispatchers":{
        "enabled": true,
},

and when making a request like this from OpenSIPS:

{"method":"AttributeSv1.ProcessEvent","params":[{"AttributeIDs":null,"Context":"*sessions","ProcessRuns":null,"Tenant":"global","ID":"4e103f29-8292-473e-b524-74aa25ca7203","Time":null,"Event":{"Account":"LOCAL_DOMAINS","ID":"CHECK_LOCAL_DOMAINS"}}],"id":4}

I receive from SESSIONS_MASTER_Z1_REPLICATION_ADDRESS: 

T 2021/05/07 16:41:06.879802 127.0.0.1:2012 -> 127.0.0.1:47368 [AP] #40633
{"id":4,"result":null,"error":"DISPATCHER_ERROR:NOT_FOUND"}

Is this a sign that the dispatcher configuration is not loaded on the SESSIONS_MASTER_Z1_REPLICATION_ADDRESS server?


Here is my dispatcher configuration, which is sent to the CGRateS master server via APIer:
=====================
{
    "id": 1,
    "method": "APIerSv1.GetDispatcherProfileIDs",
    "params": [
        {
            "Tenant": "global"
        }
    ]
}

{
    "id": 1,
    "result": [
        "ALL",
        "DOMAINS"
    ],
    "error": null
}
=====================

{
    "id": 1,
    "method": "APIerSv1.SetDispatcherProfile",
    "params": [
        {
            "Tenant": "global",
            "ID": "DOMAINS",
            "Subsystems": [
                "*any"
            ],
            "FilterIDs": ["*string:~*req.EventName:Internal"],
            "ActivationInterval": null,
            "Strategy": "*weight",
            "StrategyParams": {},
            "Weight": 10,
            "Cache": "*none",
            "Hosts": [
                {
                    "ID": "SELF",
                    "Weight": 25
                }
            ]
        }
    ]
}
==========
{
    "id": 1,
    "method": "APIerSv1.SetDispatcherProfile",
    "params": [
        {
            "Tenant": "global",
            "ID": "ALL",
            "Subsystems": [
                "*any"
            ],
            "FilterIDs": [],
            "ActivationInterval": null,
            "Strategy": "*weight",
            "StrategyParams": {},
            "Weight": 10,
            "Cache": "*none",
            "Hosts": [
                {
                    "ID": "ALL",
                    "Weight": 25
                }
            ]
        }
    ]
}
============
{
    "id": 1,
    "method": "APIerSv1.GetDispatcherHostIDs",
    "params": [
        {
            "Tenant": "global"
        }
    ]
}

{
    "id": 1,
    "result": [
        "SELF",
        "ALL"
    ],
    "error": null
}
===================
{
    "id": 1,
    "method": "APIerSv1.SetDispatcherHost",
    "params": [
        {
            "Tenant": "global",
            "ID": "SELF",
            "Conns": [
                {
                    "Address": "*internal",
                    "Transport": null,
                    "Synchronous": false,
                    "TLS": false
                }
            ]
        }
    ]
}
=====================
{
    "id": 1,
    "method": "APIerSv1.SetDispatcherHost",
    "params": [
        {
            "Tenant": "global",
            "ID": "ALL",
            "Conns": [
                {
                    "Address": "127.0.0.1:2012",
                    "Transport": "*json",
                    "Synchronous": false,
                    "TLS": false
                }
            ]
        }
    ]
}
==============

I am pretty sure there are things I haven't understood well regarding the dispatcher.

For example, why is it that when I send this request to the master server:

===========================
{
    "id": 1,
    "method": "APIerSv1.SetDispatcherHost",
    "params": [
        {
            "Tenant": "global",
            "ID": "ALL",
            "Conns": [
                {
                    "Address": "127.0.0.1:2012",
                    "Transport": "*json",
                    "Synchronous": false,
                    "TLS": false
                }
            ]
        }
    ]
}
===========================
master server responds:
==============================
{
    "id": 1,
    "result": null,
    "error": "SERVER_ERROR: DISPATCHER_ERROR:NOT_FOUND"
}
==============================
and the full ngrep traffic from the master to SESSIONS_MASTER_Z1_REPLICATION_ADDRESS is here --> https://pastebin.com/bEUfcXwx

Sorry for the long message, but I had to show you the full picture.

Thank you in advance for your support.

Vasilios Tzanoudakis

Alexandru Tripon

May 10, 2021, 8:43:58 AM
to cgr...@googlegroups.com
Hi Vasilios,

When the dispatcher subsystem is enabled, the administration APIs (e.g. APIerSv1, APIerSv2, ReplicatorSv1) are disabled on that engine, and the only way to update the cache of the dispatcher engine is to use the `CacheSv1` APIs with an `*internal` host.
Also, to load the data into the DataDB of the dispatcher engine, we recommend using `cgr-loader`.
Regarding the automatic cache reload, it looks like we do not populate the Tenant for that API, and this causes the `NOT_FOUND` on the dispatcher engine.
Please open an issue on GitHub regarding the missing tenant for the automatic cache reload.
Also, the `ALL` DispatcherHost doesn't look right: it points back to the same dispatcher engine, so the same API would be called on the same engine in an infinite loop.
Please try to load the data into the DB using `cgr-loader` and retest.
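
The loop described above (a DispatcherHost whose network address is the dispatcher engine's own listener) can be caught before loading the data. The Python sketch below is only an illustration: the `ID`, `Conns`, and `Address` fields follow the SetDispatcherHost payloads shown earlier in this thread, while `find_self_loops` and the `own_listen` list are hypothetical helpers, not part of CGRateS.

```python
def find_self_loops(hosts, own_listen):
    """Return IDs of hosts whose network address is one of our own listeners.

    `hosts` uses the shape of the SetDispatcherHost params shown in this
    thread; `own_listen` is a hypothetical list of "ip:port" strings that
    the dispatcher engine itself listens on.
    """
    looping = []
    for host in hosts:
        for conn in host.get("Conns", []):
            addr = conn.get("Address", "")
            # "*internal" is an in-process connection, not a network hop.
            if addr != "*internal" and addr in own_listen:
                looping.append(host["ID"])
                break
    return looping

hosts = [
    {"ID": "SELF", "Conns": [{"Address": "*internal"}]},
    {"ID": "ALL", "Conns": [{"Address": "127.0.0.1:2012", "Transport": "*json"}]},
]
loops = find_self_loops(hosts, own_listen=["127.0.0.1:2012"])
print(loops)
```

With the two hosts from this thread and the engine listening on 127.0.0.1:2012, only `ALL` is flagged; `SELF` is not, because `*internal` never leaves the process.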

Thanks,
Trial97

--
You received this message because you are subscribed to the Google Groups "CGRateS" group.
To unsubscribe from this group and stop receiving emails from it, send an email to cgrates+u...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/cgrates/c3e0aa8d-b398-49e3-b7a8-098457c41a29n%40googlegroups.com.

Alexandru Tripon

May 12, 2021, 9:51:58 AM
to cgr...@googlegroups.com
Hi Vasilios,

Thanks for opening the issue.
We do not plan to enable the APIer interface when DispatcherS is active on v0.10, as v0.10 needs to stay backward compatible and only receives bug fixes.
We plan to enable them when we rewrite them in the new 1.0 branch.
The cache error from the loaders should be fixed on the latest v0.10 branch.
Yes, when the transport is over the network (e.g. `*json`), the dispatcher host should point to a different engine (not back at the same engine where the dispatcher is enabled).
But if you want the request to be sent to the sessions subsystem (or any other subsystem) that is enabled on the same engine as the dispatcher, the DispatcherHost from the matched profile needs to have the address `*internal`.

Thanks,
Trial97

Vasilios Tzanoudakis

May 13, 2021, 1:17:23 PM
to cgr...@googlegroups.com
Dear Alexandru,

Thank you very much for your reply; it helped me a lot in understanding more about dispatchers.

I am now running 

/go/bin/cgr-engine --version
CGR...@v0.10.3~dev-20210512064421-a8ecf36bec3d

with your latest commit but I am getting the DISPATCHER_ERROR:NOT_FOUND when running :

/go/bin/cgr-loader -config_path /etc/cgrates/
2021/05/12 16:40:39 Could not reload cache: DISPATCHER_ERROR:NOT_FOUND

ngrep on 2012
==============================================================
{"method":"CacheSv1.ReloadCache","params":[{"APIKey":"","RouteID":"","Tenant":"","DestinationIDs":[],"ReverseDestinationIDs":[],"RatingPlanIDs":[],"RatingProfileIDs":[],"ActionIDs":[],"ActionPlanIDs":[],"AccountActionPlanIDs":[],"ActionTriggerIDs":[],"SharedGroupIDs":[],"ResourceProfileIDs":[],"ResourceIDs":null,"StatsQueueIDs":null,"StatsQueueProfileIDs":[],"ThresholdIDs":null,"ThresholdProfileIDs":[],"FilterIDs":[],"SupplierProfileIDs":[],"AttributeProfileIDs":[],"ChargerProfileIDs":[],"DispatcherProfileIDs":["voiceland.dev:ALL","voiceland.dev:RALS"],"DispatcherHostIDs":["voiceland.dev:SELF","voiceland.dev:RALS","voiceland.dev:RALS2"],"DispatcherRoutesIDs":null,"FlushAll":false}],"id":0}

##
T 2021/05/13 16:07:01.683501 127.0.0.1:2012 -> 127.0.0.1:48464 [AP] #43536
{"id":0,"result":null,"error":"DISPATCHER_ERROR:NOT_FOUND"}
==============================================================

I managed to get CacheSv1.ReloadCache to work by sending it via Postman with "Tenant":"" changed to "Tenant":"voiceland.dev".
Why is the Tenant parameter not populated?
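
The manual workaround above (filling in the empty Tenant before resending the request) can be sketched in Python as follows. This is only an illustration: `with_tenant` is a hypothetical helper, the captured request is shortened to a few of the fields from the ngrep output, and sending the patched request to the engine is left out.

```python
import json

def with_tenant(request_json, tenant):
    """Return a copy of a JSON-RPC request with empty Tenant fields filled in."""
    request = json.loads(request_json)
    for params in request.get("params", []):
        if isinstance(params, dict) and not params.get("Tenant"):
            params["Tenant"] = tenant
    return json.dumps(request)

# Shortened copy of the request captured with ngrep in this thread.
captured = json.dumps({
    "method": "CacheSv1.ReloadCache",
    "params": [{
        "APIKey": "", "RouteID": "", "Tenant": "",
        "DispatcherProfileIDs": ["voiceland.dev:ALL", "voiceland.dev:RALS"],
        "DispatcherHostIDs": ["voiceland.dev:SELF", "voiceland.dev:RALS",
                              "voiceland.dev:RALS2"],
        "FlushAll": False,
    }],
    "id": 0,
})
patched = json.loads(with_tenant(captured, "voiceland.dev"))
print(patched["params"][0]["Tenant"])
```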

Here is my config

cat DispatcherHosts.csv

cat DispatcherProfiles.csv
---------------------------------
#Tenant,ID,Subsystems,FilterIDs,ActivationInterval,Strategy,StrategyParameters,ConnID,ConnFilterIDs,ConnWeight,ConnBlocker,ConnParameters,Weight
voiceland.dev,ALL,*sessions,,,*weight,,SELF,,20,false,,10
voiceland.dev,RALS,*rals,,,*weight,,RALS,,20,false,,10
voiceland.dev,RALS,,,,,,RALS2,,10,,,

I want only RALs requests to be sent to the external boxes RALS and RALS2, and all the other queries to be sent to *internal.
So, for example, I have configured the sessions section to have *localhost only on rals_conns; all the other data is cached locally and should be accessed via *internal.

"sessions": {
        "enabled": true,
        "listen_bijson": "127.0.0.1:2014",
        "chargers_conns": ["*internal"],
        "rals_conns": ["*localhost"],
        "cdrs_conns": ["*internal"],
        "resources_conns": ["*internal"],
        "thresholds_conns": ["*internal"],
        "stats_conns": ["*internal"],
        "suppliers_conns": ["*internal"],
        "attributes_conns": ["*internal"],
        "debit_interval": "5s",
},

complete cgrates.json --> here

The problem is that, for example, this method cannot be found when the dispatcher is enabled (because APIer is disabled). This is what OpenSIPS sends to CGRateS when trying to make a call:

ngrep 2014 (opensips side)
=============================================================
{ "method": "SessionSv1.AuthorizeEventWithDigest", "id": 1367673414, "params": [ { "Tenant": "voiceland.dev", "ProcessStatQueues": true, "ProcessThresholds": true, "ReleaseResources": true, "AllocateResources": true, "AuthorizeResources": true, "GetSuppliers": true, "GetAttributes": true, "GetMaxUsage": true, "Event": { "UserAgent": "snomD712\/10.1.54.13", "OriginIP": "80.245.164.225", "Category": "call-out", "Destination": "+302107001397", "OriginCLD": "+302107001397", "OriginCLI": "+302107001398", "ID": "OPENSIPS_CHECK_USER_BALANCE_OUT", "CDRType": "CUSTOMER", "Source": "SIP-1-AWS", "OriginHost": "10.100.0.21", "Subject": "INTERNAL", "RequestType": "*prepaid", "Account": "+302107001398", "Tenant": "voiceland.dev", "OriginID": "313632303932353935353537363639-bl2knv6apee6", "SetupTime": "1620925956" } } ] }
#
T 2021/05/13 17:12:36.896354 127.0.0.1:2014 -> 127.0.0.1:56408 [AP] #24
{"id":1367673414,"result":null,"error":"SUPPLIERS_ERROR:SERVER_ERROR: rpc: can't find method Responder.GetCostOnRatingPlans"}
=============================================================

ngrep 2012
==============================================================
T 2021/05/13 16:23:21.235902 127.0.0.1:48816 -> 127.0.0.1:2012 [AP] #43939
{"method":"Responder.GetCostOnRatingPlans","params":[{"Account":"9b83b82a-878c-457d-8879-eece6a492994","Subject":"9b83b82a-878c-457d-8879-eece6a492994","Destination":"+302107001397","Tenant":"voiceland.dev","SetupTime":"2021-05-13T16:23:21Z","Usage":10800000000000,"RatingPlanIDs":["TP_RP_SPL_VOICELAND_INTERNAL_DST_ID_RT_GR_LANDLINE_2020_02_IN"]}],"id":1}

##
T 2021/05/13 16:23:21.235972 127.0.0.1:2012 -> 127.0.0.1:48816 [AP] #43941
{"id":1,"result":null,"error":"rpc: can't find method Responder.GetCostOnRatingPlans"}
==============================================================
When I disable the dispatcher service, this query works fine.

A second request from OpenSIPS like this one works fine:
{ "method": "SessionSv1.AuthorizeEventWithDigest", "id": 1367673410, "params": [ { "GetAttributes": true, "Tenant": "voiceland.dev", "Event": { "ID": "GET_USER_PROFILE", "Account": "+302107001398", "Tenant": "voiceland.dev" } } ] }

As you may have understood, I am a little confused; I have been testing for 2 days now but can't find the ideal configuration for the setup I want.
Is it a bug that "Responder.GetCostOnRatingPlans" does not work on the sessions service when the dispatcher is enabled?

What should I configure in DispatcherProfiles and cgrates.json so that all requests go to the local engine (or another external box) and only *rals goes to the RALS and RALS2 connections?
Should I set more subsystems to make this setup work?

Also, how can we verify the data currently on the dispatcher host? I ask this because APIer is disabled.

If APIer for the dispatcher will be available in version 1.0, this is great news. We already have plans
to migrate to v1.0 after you release it. It is very important that you also provide some migration guidelines from v0.10 --> v1.0 for all of us who are stuck on v0.10.

A final question: let's assume that we have the failover scenario with the dispatcher working.
Can CGRateS send out an event if the RALS host is not reachable and the request is sent to the RALS2 host?
I want CGRateS to send an event (for example using actions and an HTTP POST) to my NMS service
whenever a remote host connection for the selected subsystem is not accessible, so that my NMS can enable the scheduler service on the second RALS2 host, etc.
Is that possible somehow? Can CGRateS stick to the second host even though the RALS host may be up again after a while?

thank you in advance for your support.





Vasilios Tzanoudakis
Technology Director & Co-Founder
VOICELAND S.A.
p:+302122228000
f:+302122228001
a:81, Ifigenias Str., Nea Ionia, 14231, GR
w:www.voiceland.gr  e: vasilios.t...@voiceland.gr
  

Alexandru Tripon

May 17, 2021, 9:36:02 AM
to cgr...@googlegroups.com
Hi Vasilios,

Regarding ReloadCache, I'm sorry I missed some of the calls; I reviewed this again and fixed the missing Tenant.
All Responder functions should work with DispatcherS.
Please open an issue regarding the "Responder.GetCostOnRatingPlans" API.
The RALS dispatcher profile should also have the *responder subsystem (for the Responder.Debit API).
If you want a DispatcherProfile to match all subsystems, you can use the `*any` subsystem; there is no need for multiple profiles.
But be careful that its weight is smaller than that of the profile that matches the *rals subsystem.
e.g.
#Tenant,ID,Subsystems,FilterIDs,ActivationInterval,Strategy,StrategyParameters,ConnID,ConnFilterIDs,ConnWeight,ConnBlocker,ConnParameters,Weight
voiceland.dev,ALL,*any,,,*weight,,SELF,,20,false,,10
voiceland.dev,RALS,*rals;*responder,,,*weight,,RALS,,20,false,,20
voiceland.dev,RALS,,,,,,RALS2,,10,,,
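
To make the weight advice concrete, here is a simplified Python model of how the profiles in the CSV above would be chosen. It rests on assumptions that are not confirmed by the CGRateS source: that the highest-weight profile whose subsystems include the request's subsystem (or `*any`) wins, and that with the `*weight` strategy its hosts are tried in descending ConnWeight order. Filters, activation intervals, and blockers are not modeled, and `load_profiles` and `pick_hosts` are hypothetical helpers.

```python
import csv
import io

PROFILES_CSV = """\
#Tenant,ID,Subsystems,FilterIDs,ActivationInterval,Strategy,StrategyParameters,ConnID,ConnFilterIDs,ConnWeight,ConnBlocker,ConnParameters,Weight
voiceland.dev,ALL,*any,,,*weight,,SELF,,20,false,,10
voiceland.dev,RALS,*rals;*responder,,,*weight,,RALS,,20,false,,20
voiceland.dev,RALS,,,,,,RALS2,,10,,,
"""

def load_profiles(text):
    """Group CSV rows by profile ID; continuation rows only add hosts."""
    profiles = {}
    for row in csv.reader(io.StringIO(text)):
        if not row or row[0].startswith("#"):
            continue
        prof = profiles.setdefault(
            row[1], {"subsystems": [], "weight": 0.0, "hosts": []})
        if row[2]:
            prof["subsystems"] = row[2].split(";")
        if row[12]:
            prof["weight"] = float(row[12])
        prof["hosts"].append((row[7], float(row[9])))
    return profiles

def pick_hosts(profiles, subsystem):
    """Assumed rule: highest-weight matching profile, hosts by weight."""
    matching = [p for p in profiles.values()
                if subsystem in p["subsystems"] or "*any" in p["subsystems"]]
    best = max(matching, key=lambda p: p["weight"])
    return [h for h, _ in sorted(best["hosts"], key=lambda hw: -hw[1])]

profiles = load_profiles(PROFILES_CSV)
print(pick_hosts(profiles, "*rals"))
print(pick_hosts(profiles, "*sessions"))
```

Under these assumptions, `*rals` requests resolve to RALS and then RALS2, while `*sessions` requests fall through to the `ALL` profile and its SELF host. If `ALL` had the larger weight, it would also capture the `*rals` traffic.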

To check the data, you can start a new engine that points to the same DB but has the dispatchers disabled, and use APIerSv1 on that engine.
But this will not work if the db_type is `*internal`.
Regardless of the db_type, you can check the items that you have in the cache using the CacheSv1 APIs (https://pkg.go.dev/github.com/cgrates/cgr...@v0.9.1-rc3.0.20210511185711-91eda67c4ac9/apier/v1#CacheSv1).

CGRateS cannot send such an event. If the first host is not reachable, the API call is sent to the next one.
We cannot implement a notify action for when a host is down, due to performance issues.
We recommend using an NMS (https://en.wikipedia.org/wiki/Network_monitoring) to monitor whether the host is down.
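
As a rough illustration of the external monitoring suggested above, the Python sketch below probes hosts with a plain TCP connect. It is a minimal sketch, not a full NMS check: the host IDs, addresses, and ports are hypothetical placeholders, and an open port only shows that something is listening, not that the RALs service behind it is healthy.

```python
import socket

def is_reachable(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

def first_reachable(hosts):
    """Return the ID of the first reachable host in priority order, or None."""
    for host_id, (host, port) in hosts:
        if is_reachable(host, port):
            return host_id
    return None

if __name__ == "__main__":
    # Hypothetical addresses; replace with the real RALS and RALS2 listeners.
    hosts = [("RALS", ("127.0.0.1", 2012)), ("RALS2", ("127.0.0.1", 3012))]
    print(first_reachable(hosts))
```

An NMS could run such a probe periodically and, when the result changes from RALS to RALS2, enable the scheduler service next to the host that is now taking the traffic.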

Thanks,
Trial97

