Using Eureka to add and remove nodes from the Driver

Steven

Nov 7, 2016, 12:09:37 PM
to DataStax Java Driver for Apache Cassandra User Mailing List
Hello, all,

At my company, we use Eureka to discover services, including Cassandra nodes. When a Cassandra node undergoes maintenance, is restarted, or is replaced, we remove it from Eureka to prevent services from connecting to it. Notably, when replacing one node with another, the script that installs the Cassandra service on the new node may restart the service a few times before it is fully configured and ready to accept requests. Only at that point is the new node added to Eureka. We would like services to connect to a new node only once it is listed in Eureka, to avoid spurious errors. However, the DataStax Java Driver performs its own discovery by subscribing to host up and down events from the coordinator node, which means that services may connect to new nodes too soon.

We would like to create something like a HostFilterPolicy that filters out nodes that are not currently in Eureka. The existing HostFilterPolicy is not quite suitable, because it assumes that the filter is static; that is, the criteria for including or excluding a node from the set to query are fixed at the time the HostFilterPolicy is created. We would like the filter to change upon each update to the set of available hosts in Eureka.

So, upon an update to Eureka, we would like to ask the Cluster to refresh all connections to affected nodes. We think that by calling PoolingOptions.refreshConnectedHosts(), the Cluster will ask the LoadBalancingPolicy to re-evaluate the distance to each host. The Eureka-based filter would assign HostDistance.IGNORED to each host that is not in Eureka.

Has anyone else attempted to work around the Driver's built-in discovery like this? What do you think of this approach?

Thanks

Kevin Gallardo

Nov 10, 2016, 7:09:43 AM
to java-dri...@lists.datastax.com
Hi, 

That's an interesting use case. Implementing a custom LoadBalancingPolicy with this approach sounds like the right way to go. I'd be interested to see the final solution if it gets published somewhere at some point.

Cheers.

--
You received this message because you are subscribed to the Google Groups "DataStax Java Driver for Apache Cassandra User Mailing List" group.
To unsubscribe from this group and stop receiving emails from it, send an email to java-driver-user+unsubscribe@lists.datastax.com.



--
Kévin Gallardo.
Software Engineer in Drivers and Tools Team at DataStax.

Alexandre Dutra

Nov 14, 2016, 7:15:35 AM
to java-dri...@lists.datastax.com
Hi,

A predicate doesn't have to be "static", even if most of them are (in that they always return the same result for a given input).

Why don't you implement your own predicate and inject your custom Eureka logic in there? HostFilterPolicy never caches the result returned by the predicate for a given host, so this should work in theory.
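
A minimal sketch of such a non-static predicate, under stated assumptions: `EurekaAllowList` is a hypothetical name, and the code that listens to Eureka and calls `update(...)` is not shown. To stay dependency-free it uses `java.util.function.Predicate` over `InetAddress`; with the driver, the same check would presumably sit inside the `Predicate<Host>` handed to HostFilterPolicy and test `host.getAddress()`.

```java
import java.net.InetAddress;
import java.util.Collection;
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;
import java.util.function.Predicate;

/** Hypothetical holder for the addresses currently registered in Eureka. */
final class EurekaAllowList implements Predicate<InetAddress> {
    // Immutable snapshot, replaced as a whole on every Eureka update.
    private volatile Set<InetAddress> current = Collections.emptySet();

    /** Called by whatever code listens to Eureka (assumption: not shown here). */
    void update(Collection<InetAddress> registered) {
        current = Collections.unmodifiableSet(new HashSet<>(registered));
    }

    /** Evaluated on every call, so the answer follows the latest snapshot. */
    @Override
    public boolean test(InetAddress address) {
        return current.contains(address);
    }
}
```

Because the snapshot is swapped rather than mutated in place, callers on other threads see either the old set or the new one, never a half-updated set.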

Hope that helps,

Alexandre

--
Alexandre Dutra
Driver & Tools Engineer @ DataStax

Minh Do

Nov 14, 2016, 12:06:16 PM
to java-dri...@lists.datastax.com
Hi,

We implemented a custom LoadBalancingPolicy (LBP), as Kevin suggested. This LBP wraps another regular LBP and applies its own Eureka filter.

We override the method

public Iterator<Host> newQueryPlan(final String loggedKeyspace, final Statement statement);

to call the child LBP for its list of valid hosts (coming from Cassandra gossip and the peers table), and then apply the Eureka filter to that iterator before returning the hosts to the caller.
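
The filtering step described above could be sketched roughly as follows. This is an illustration, not Minh's actual code: `FilteredPlan` is a hypothetical name, and it is generic so that it compiles without the driver. In a real policy the element type would be `Host` and the wrapper would be returned from `newQueryPlan`.

```java
import java.util.Iterator;
import java.util.NoSuchElementException;
import java.util.function.Predicate;

/** Lazily skips elements of a child query plan that the filter rejects. */
final class FilteredPlan<T> implements Iterator<T> {
    private final Iterator<T> childPlan;
    private final Predicate<? super T> filter;
    private T next;        // next accepted element, if one has been found
    private boolean ready; // true while 'next' holds an element not yet returned

    FilteredPlan(Iterator<T> childPlan, Predicate<? super T> filter) {
        this.childPlan = childPlan;
        this.filter = filter;
    }

    @Override
    public boolean hasNext() {
        // Advance the child plan until an accepted element is found or it runs out.
        while (!ready && childPlan.hasNext()) {
            T candidate = childPlan.next();
            if (filter.test(candidate)) {
                next = candidate;
                ready = true;
            }
        }
        return ready;
    }

    @Override
    public T next() {
        if (!hasNext()) {
            throw new NoSuchElementException();
        }
        ready = false;
        T result = next;
        next = null;
        return result;
    }
}
```

The filter is consulted as each element is pulled, so a host removed from Eureka while a plan is being consumed is skipped from that point on.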

Hope this helps.
Minh



Kevin Gallardo

Nov 14, 2016, 2:43:55 PM
to java-dri...@lists.datastax.com
Indeed, as Minh Do mentions, the current HostFilterPolicy does not use the predicate when building the query plan; it only uses the predicate at init time and for onUp/Down/... events. So using a "dynamic" predicate alone may be enough to catch false-positive onUp events, as in Steven's example, but I think it remains limited to that scenario.


Steven

Nov 15, 2016, 11:45:06 AM
to DataStax Java Driver for Apache Cassandra User Mailing List
Thanks for your replies, everyone.

I described the existing HostFilterPolicy as assuming that the filter is "static", because it only applies the predicate within HostFilterPolicy.init(), distance(), onUp(), etc., but not within newQueryPlan(). However, I'm wondering now whether it's necessary to create a new LoadBalancingPolicy to get the effect I'm looking for. Upon a change to the set of hosts discovered by Eureka, is it enough to call PoolingOptions.refreshConnectedHosts() to trigger a call to distance() for each host? I had convinced myself otherwise, but maybe I was wrong.

Alexandre Dutra

Nov 17, 2016, 10:57:09 AM
to DataStax Java Driver for Apache Cassandra User Mailing List
Hi,

The method you are referring to was introduced by JAVA-309 for that exact purpose, so calling refreshConnectedHosts() will indeed update the connection pools to all hosts, re-evaluating their distances according to the load balancing policy. This is exactly what you are looking for.

Note that if only some hosts have changed, you can also call PoolingOptions.refreshConnectedHost(Host), and only affected pools will be updated accordingly.
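
To decide which hosts are "affected" and worth a targeted refresh, one option is to diff two consecutive Eureka snapshots. The helper below is a sketch only: `SnapshotDiff` is a hypothetical name, and mapping the resulting addresses back to driver `Host` objects (for example by scanning `Metadata.getAllHosts()`) and calling `refreshConnectedHost(Host)` on each is left out.

```java
import java.util.HashSet;
import java.util.Set;

/** Hypothetical helper: finds entries that appeared in or vanished from a snapshot. */
final class SnapshotDiff {
    private SnapshotDiff() {}

    /** Returns the elements present in exactly one of the two snapshots. */
    static <T> Set<T> changed(Set<T> before, Set<T> after) {
        Set<T> result = new HashSet<>(before);
        result.addAll(after);               // union of both snapshots
        Set<T> unchanged = new HashSet<>(before);
        unchanged.retainAll(after);         // intersection: present in both
        result.removeAll(unchanged);        // what remains was added or removed
        return result;
    }
}
```

If the diff is empty, no refresh is needed; if most of the cluster changed, a single call to refreshConnectedHosts() is probably simpler than refreshing hosts one by one.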

Thanks,

Alexandre 