Add Rack/AZ awareness to TokenAware Policy (implemented and looking for opinions)

Mario Lazaro

May 6, 2015, 2:00:30 PM
to java-dri...@lists.datastax.com

Hi everyone!

First of all, we are using AWS, so for us Rack = AZ (Availability Zone).

What if we add Rack awareness to the TokenAware policy (the client knows which AZ it is running in and tries to use the replica(s) within the same AZ first)?

Let's assume the following scenario in AWS:
- Cassandra with 25 nodes, using 3 AZs (1a, 1b and 1c).
- Read-heavy workload and RF=3.
- No fat partitions, even data.
- Clients doing the same amount of requests to Cassandra from three different AZs (1a, 1b and 1c).

What happens if we use a modified TokenAware policy that tries local-AZ replicas first and then the rest of the replicas? Think of this policy as an AZ/Rack Aware + Token Aware policy.
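To make the idea concrete, here is a minimal sketch of the ordering logic (not our actual patch; the class and method names are made up for illustration, and only Host.getRack() comes from the driver API). Given the replicas that a token-aware policy obtains for the statement's routing key, the ones whose rack matches the configured local AZ go first, then the rest; the child policy still decides what to do with hosts that are down or ignored:

import com.datastax.driver.core.Host;

import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Illustrative only: order the replicas of a partition so that the ones in the
// local AZ (rack) come first. A TokenAware-style policy would try these hosts
// in order before falling back to its child policy's query plan.
public final class AzFirstOrdering {

    private final String localRack; // e.g. "1b" when the client runs in AZ 1b

    public AzFirstOrdering(String localRack) {
        this.localRack = localRack;
    }

    public List<Host> orderReplicas(Set<Host> replicas) {
        List<Host> local = new ArrayList<Host>();
        List<Host> remote = new ArrayList<Host>();
        for (Host replica : replicas) {
            if (localRack.equals(replica.getRack())) {
                local.add(replica);   // replica in our AZ: try it first
            } else {
                remote.add(replica);  // replica in another AZ: try it later
            }
        }
        local.addAll(remote);
        return local;
    }
}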

Can you see any disadvantages to this approach? We just implemented it and it seems to work fine for us, but maybe I am missing something, and that is why I want to hear the experts' opinion.

If the community thinks this is useful, we can contribute! :) Always improving C*!!!

Regards,

Mario

Christopher Batey

May 6, 2015, 2:25:18 PM
to java-dri...@lists.datastax.com
Did you try the latency-aware policy + the token-aware policy? If so, did you see much difference in performance?


Mario Lazaro

May 6, 2015, 2:48:51 PM
to java-dri...@lists.datastax.com
No, I did not try the LatencyAware policy because of what I read here: https://datastax-oss.atlassian.net/browse/JAVA-501, and also because it looks a bit tricky to tune.
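Just to show what I mean by "tricky to tune": wiring it up together with token awareness would look roughly like the snippet below (a sketch only; the contact point, DC name and threshold values are placeholders I would still have to figure out):

import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.policies.DCAwareRoundRobinPolicy;
import com.datastax.driver.core.policies.LatencyAwarePolicy;
import com.datastax.driver.core.policies.TokenAwarePolicy;

import java.util.concurrent.TimeUnit;

public class LatencyAwareExample {
    public static void main(String[] args) {
        // Latency-aware child policy on top of DC-aware round robin; the numbers
        // below are placeholders, not recommendations.
        LatencyAwarePolicy latencyAware = LatencyAwarePolicy
                .builder(new DCAwareRoundRobinPolicy("my-local-dc"))
                .withExclusionThreshold(2.0)            // exclude hosts slower than 2x the fastest
                .withScale(100, TimeUnit.MILLISECONDS)  // how quickly old latency samples decay
                .build();

        // Token awareness wraps the latency-aware policy.
        Cluster cluster = Cluster.builder()
                .addContactPoint("127.0.0.1")
                .withLoadBalancingPolicy(new TokenAwarePolicy(latencyAware))
                .build();
    }
}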

Regards,
Mario

Mario Lazaro

May 6, 2015, 2:58:39 PM
to java-dri...@lists.datastax.com
Also, in a normal scenario where everything works fine and you have even data, no fat partitions, etc., all nodes should be really fast (almost the same speed), and with this approach we could avoid one extra network hop.

Olivier Michallat

May 11, 2015, 6:24:09 AM
to java-dri...@lists.datastax.com
Hi Mario,

Yes, availability zone awareness sounds like a very good idea when running in EC2.

In fact, there is a longstanding ticket for this: JAVA-200. There was a pull request with a specific policy implementation that can be wrapped in the token-aware one. How does that compare with your implementation?

--

Olivier Michallat

Driver & tools engineer, DataStax

Mario Lazaro

May 11, 2015, 8:29:19 AM
to java-dri...@lists.datastax.com
My implementation is a modified TokenAwarePolicy that tries local replicas first (it takes the localAz as an argument). Super simple and fast. You need to use it in combination with DCAware or any other main policy.

For example, if you use the modified TokenAware policy with DCAware, it will use the localDC specified in the DCAware policy (through childPolicy.distance(host)), so you just need to specify the localRack/localAz:

            TokenAwarePolicy loadBalancer = new TokenAwarePolicy(new DCAwareRoundRobinPolicy("us-eastus-test", 50), "1b");

If you make it the default one and your C* cluster only spans one Rack/AZ, it is okay: we can make it shuffle all the replicas to spread the load.
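The shuffle itself would be trivial, something along these lines (illustrative only; the helper name is made up, and in the real policy this would happen while building the query plan):

import com.datastax.driver.core.Host;

import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Illustrative only: shuffle within each group so that even a cluster that spans a
// single Rack/AZ spreads requests across all replicas instead of always hitting
// the first one in token order.
public final class ReplicaShuffling {

    public static List<Host> shuffleWithinGroups(List<Host> localAzReplicas, List<Host> otherReplicas) {
        List<Host> plan = new ArrayList<Host>(localAzReplicas);
        Collections.shuffle(plan);   // spread load across the same-AZ replicas
        List<Host> rest = new ArrayList<Host>(otherReplicas);
        Collections.shuffle(rest);   // and across the remaining replicas
        plan.addAll(rest);
        return plan;
    }
}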

Regards,
Mario

--
Mario Lazaro  |  Software Engineer, Big Data
GumGum  |  Ads that stick
310-985-3792  |  ma...@gumgum.com

Mario Lazaro

May 13, 2015, 10:32:12 AM
to java-dri...@lists.datastax.com
I ran some benchmarks with interesting results.

Cassandra setup:

- 18 r3.2xlarge Cassandra 2.0.12 nodes (6 nodes in 1a, 6 in 1b and 6 in 1c) in EU-WEST-1 (Datacenter1), and 20 nodes in US-EAST-1 (all 20 in AZ 1b). We use eu-west-1 (Datacenter1) as the localDC for the benchmarks.
- Using the latest version of the cassandra-stress tool with the DataStax Java Driver 2.1.15. This instance is an r3.xlarge located in EU-WEST-1 1a. We ran 50k write/read requests with different numbers of threads until we saturated the client.
- Local Rack/AZ is 1a.
- 3 replicas in EU-WEST-1 and 3 in US-EAST-1.
- We query a simple schema very similar to one we have in prod:

CREATE TABLE stresscql.tests (
  vis_id text,
  field1_id text,
  field2_id int,
  field3 text,
  field4_id text,
  field5_id text,
  field6 timestamp,
  field7 timestamp,
  field8 boolean,
  PRIMARY KEY (vis_id, field1_id, field2_id)
) WITH CLUSTERING ORDER BY (field1_id ASC, field2_id ASC);


- Y axis: mean latency in ms
- X axis: number of threads

Chart 1: 50K reads, consistency level LOCAL_QUORUM, using three different policies: Rack+Token+DCAware (new implementation), Token+DCAware (with shuffle replicas true) and Latency+Token+DCAware.

Chart 2: 50K reads, consistency level LOCAL_ONE, using three different policies: Rack+Token+DCAware (new implementation), Token+DCAware (with shuffle replicas true) and Latency+TokenAware.

Chart 3: 50K writes, consistency level LOCAL_QUORUM, using three different policies: Rack+Token+DCAware (new implementation), Token+DCAware (with shuffle replicas true) and Latency+TokenAware.

Chart 4: 50K writes, consistency level LOCAL_ONE, using three different policies: Rack+Token+DCAware (new implementation), Token+DCAware (with shuffle replicas true) and Latency+TokenAware.

As you can see, the new Rack + Token + DCAware implementation yields better results than the other two. It is true that LatencyAware is running with default parameters, so maybe if we tune it, it can give better results...

Let me know if you have any questions. I would love to contribute!

Regards,

Mario

Mario Lazaro

May 26, 2015, 2:45:28 PM
to java-dri...@lists.datastax.com
Hi Cassandra Datastax group!

Any thoughts? Anything we are missing?

Thanks community!