LoadBalancingInterceptor NodeOnlineChecker


Sathish Ganesan

Dec 18, 2016, 7:51:36 AM
to membrane-monitor
Hi,

We are currently using proxy version 4.1.0.

We are facing an issue, described below:
1. Register the nodes in LoadBalancingInterceptor.
2. Requests are routed correctly.
3. When a registered node (say node#1) goes down intermittently, requests are redirected to the other nodes (failover).
4. When node#1 comes back up, requests are not routed to it again.
5. We need to restart the proxy to make it route requests to that node again.

While browsing through the Java code, I noticed a NodeOnlineChecker field newly introduced in the latest version (4.3.0).
Would this solve our issue?

If not, could you let us know what we are missing in our code for the LoadBalancingInterceptor?

Our code snippet is below:

balancer.setClusters(new ArrayList<Cluster>(clusterMap.values()));
loadBalancingInterceptor.setClustersFromSpring(Arrays.asList(balancer));

We were hoping that loadBalancingInterceptor.setNodeOnlineChecker(noc) would keep retrying the node that went down and bring it back into rotation once it is up again, without needing a restart.

Let us know your thoughts on this.

Thanks,
Sathish G

Till Born

Mar 20, 2017, 9:05:03 AM
to membrane-monitor
Hello Sathish Ganesan,
Sorry for the late response. I also think the NodeOnlineChecker is the solution to your problem: it takes nodes down when they are not reachable, and it also puts them back up, after a specified delay, once they are reachable and answer with status code 200 (this is currently not generalized to put them back up on any answer). The NodeOnlineChecker is available from Membrane Service Proxy v4.2 onwards. We recommend the newest version, as it is deemed stable.
You can put the NodeOnlineChecker into your load balancer as a child element. By default, the NodeOnlineChecker only takes down nodes that have answered 10 times with a 5xx status code, and it does not try to bring them back up. Use the following configuration as a basis:
<balancer>
   [...]
   <nodeOnlineChecker nodeCounterLimit5XX="10" retryTimeInSeconds="300"/>
   [...]
</balancer>
nodeCounterLimit5XX specifies how many times a node has to fail with a 5xx status code before it is taken down.
retryTimeInSeconds specifies the minimum time before a node that was taken down is checked again to see whether it is online. The check is then done on the next request to the service proxy.
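To make the interplay of the two settings easier to picture, here is a small, self-contained Java sketch of the behaviour described above. It is only a model: the class NodeHealthSketch and its methods onResponse, onRequest and isUp are hypothetical names invented for illustration and are not part of Membrane. It also assumes that 5xx answers are counted cumulatively until the node is brought back up, and that a retry time of 0 means "never retry"; neither detail is confirmed in this thread.

```java
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;
import java.util.function.ToIntFunction;

// Hypothetical model of the behaviour described in this thread.
// It is NOT Membrane's NodeOnlineChecker and uses none of its classes.
final class NodeHealthSketch {
    private final int limit5xx;      // plays the role of nodeCounterLimit5XX
    private final long retryMillis;  // plays the role of retryTimeInSeconds (assumed: 0 = never retry)
    private final Map<String, Integer> failures = new HashMap<>();
    private final Map<String, Long> lastCheck = new HashMap<>(); // nodes that are currently down

    NodeHealthSketch(int limit5xx, int retryTimeInSeconds) {
        this.limit5xx = limit5xx;
        this.retryMillis = retryTimeInSeconds * 1000L;
    }

    // Record a backend response; take the node down once the 5xx limit is reached.
    void onResponse(String node, int status, long nowMillis) {
        if (status < 500 || status > 599) {
            return; // only 5xx answers count as failures
        }
        int count = failures.merge(node, 1, Integer::sum);
        if (count >= limit5xx) {
            lastCheck.putIfAbsent(node, nowMillis);
        }
    }

    // Called on each incoming request: probe down nodes whose retry time has elapsed.
    void onRequest(long nowMillis, ToIntFunction<String> probe) {
        if (retryMillis <= 0) {
            return; // no retry configured, nodes stay down
        }
        Iterator<Map.Entry<String, Long>> it = lastCheck.entrySet().iterator();
        while (it.hasNext()) {
            Map.Entry<String, Long> entry = it.next();
            if (nowMillis - entry.getValue() < retryMillis) {
                continue; // minimum wait not yet over
            }
            if (probe.applyAsInt(entry.getKey()) == 200) {
                failures.remove(entry.getKey()); // node is healthy again, reset its counter
                it.remove();
            } else {
                entry.setValue(nowMillis); // still failing, wait another interval
            }
        }
    }

    boolean isUp(String node) {
        return !lastCheck.containsKey(node);
    }
}
```

In this model the probe only runs when a request arrives after the wait has elapsed, which mirrors the "checked on the next request" behaviour. A proxy that receives no traffic would therefore never bring a node back, however long it waits.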
I hope this helps.
Wish you all the best,
Till