Scylla 3 Nodes - Load test Errors


Alon Eldi

<alon.eldi@seadata.co.il>
Jan 27, 2016, 4:31:46 AM
to scylladb-users@googlegroups.com
Hi,

I have started running the load tests on a 3-node Scylla cluster with CL=1 and RL=3.

I used 10 loaders.

I got the following error in the YCSB output when using 8000-500 threads:

Caused by: com.datastax.driver.core.exceptions.ReadTimeoutException: Cassandra timeout during read query at consistency ONE (3 responses were required but only 2 replica responded)

In the Scylla logs:

Jan 27 08:12:17 alon-scylla-03.localdomain scylla_run[37294]: WARNING: exceptional future ignored of type 'exceptions::mutation_write_timeout_exception': Operation timed out - received only 0 responses.
Jan 27 08:12:17 alon-scylla-03.localdomain scylla_run[37294]: WARNING: exceptional future ignored of type 'exceptions::mutation_write_timeout_exception': Operation timed out - received only 0 responses.
Jan 27 08:12:17 alon-scylla-03.localdomain scylla_run[37294]: WARNING: exceptional future ignored of type 'exceptions::mutation_write_timeout_exception': Operation timed out - received only 0 responses.
Jan 27 08:12:17 alon-scylla-03.localdomain scylla_run[37294]: WARNING: exceptional future ignored of type 'exceptions::mutation_write_timeout_exception': Operation timed out - received only 0 responses.
Jan 27 08:

No core dumps.

I did not get these errors when I used a single-node Scylla machine.

As Tzach recommended, I will use CL=3 and check whether I get the same error.

Alon








Tzach Livyatan

<tzach@scylladb.com>
Jan 27, 2016, 6:20:45 AM
to ScyllaDB users
--
You received this message because you are subscribed to the Google Groups "ScyllaDB users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to scylladb-user...@googlegroups.com.
To post to this group, send email to scyllad...@googlegroups.com.
Visit this group at https://groups.google.com/group/scylladb-users.
To view this discussion on the web visit https://groups.google.com/d/msgid/scylladb-users/56A88E80.1010309%40seadata.co.il.
For more options, visit https://groups.google.com/d/optout.

Alon Eldi

<alon.eldi@seadata.co.il>
Jan 27, 2016, 11:13:51 AM
to scylladb-users@googlegroups.com
Hi,

The following test uses 3 nodes (Rackspace BareMetal IO Class) and 10 YCSB loaders (Rackspace BareMetal Compute Class).

The load test covered 2 consistency levels, QUORUM and THREE, with replication level 3.

With the cluster configuration, for some reason no data appears in the monitoring dashboard (Amazon scylla-monitoring - ami-c7fcbbad).

Even after changing --collectd-poll-period to 5000, no data appears in the dashboard.

I double-checked the monitoring with 1 node and got information in the Tessera and Riemann dashboards.

To be sure that it is not AWS, I created a monitoring server in Rackspace. In this configuration the dashboard displays only the load information.

So the results below are from the YCSB counters and the load shown in the monitoring dashboard.


In both cases the CPU reached 100%:

Quorum CL



Load ~75:



THREE CL - Load ~85





Summary:

Consistency Level   AVG OPS/sec   MAX OPS/sec
THREE               266508        281238
QUORUM              331472        353985


For 1 node, the average reached 470K and the maximum was 480K.
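
To put the cluster numbers next to the single-node figure, here is a minimal sketch that expresses each average as a fraction of the 470K single-node average. The numbers are the ones reported above; the function name `fraction_of_single_node` is only illustrative.

```python
# Throughput figures reported above (ops/sec).
SINGLE_NODE_AVG = 470_000

cluster_avg = {
    "THREE": 266_508,
    "QUORUM": 331_472,
}


def fraction_of_single_node(ops_per_sec: float, baseline: float = SINGLE_NODE_AVG) -> float:
    """Return cluster throughput as a fraction of the single-node average."""
    return ops_per_sec / baseline


for level, avg in cluster_avg.items():
    print(f"{level}: {fraction_of_single_node(avg):.0%} of the single-node average")
```

This compares averages only; it says nothing about latency or about how the load was spread across the loaders.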

I did not succeed in testing consistency level ONE, which should give results close to 1 node, due to this error:


Caused by: com.datastax.driver.core.exceptions.ReadTimeoutException: Cassandra timeout during read query at consistency ONE (3 responses were required but only 2 replica responded)

The error message looks strange to me: for consistency level ONE only 1 response should be required, yet the message says that 3 responses were required.
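
For reference, the number of replica responses each consistency level normally requires can be sketched as follows. This is a minimal illustration of the usual Cassandra-style rule (QUORUM needs floor(RF/2) + 1 replicas), not code taken from Scylla or the Java driver; the function name `required_responses` is hypothetical.

```python
def required_responses(consistency_level: str, replication_factor: int) -> int:
    """Return how many replica responses a request must collect."""
    level = consistency_level.upper()
    if level == "ONE":
        return 1
    if level == "TWO":
        return 2
    if level == "THREE":
        return 3
    if level == "QUORUM":
        # Majority of the replicas.
        return replication_factor // 2 + 1
    if level == "ALL":
        return replication_factor
    raise ValueError(f"unsupported consistency level: {consistency_level}")


for level in ("ONE", "QUORUM", "THREE"):
    print(level, required_responses(level, replication_factor=3))
```

With replication level 3, ONE needs a single response, QUORUM needs 2 and THREE needs 3, so "3 responses were required" does not match a read at consistency ONE.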

Note: it would be good to have the Scylla load information available through OS commands such as iostat, sar, etc.
 
Regards

Alon


 




Tzach Livyatan

<tzach@scylladb.com>
Jan 27, 2016, 1:39:15 PM
to ScyllaDB users
Thanks for reporting, Alon.

Can you please open issues for:
- The monitoring problem you identified. If I understand correctly, it works fine for one node but stops working for 3 nodes,
  and moving the monitoring system from EC2 to Rackspace did not help. Is that correct?
- The load issue with CL=1. Please include both the agent errors and the log (journal).

Regards
Tzach 


Alon Eldi

<alon.eldi@seadata.co.il>
Feb 2, 2016, 12:56:14 PM
to scylladb-users@googlegroups.com
Hi,

The following are the test results for Cassandra version 2.2.4: a cluster of 3 nodes, replication level 3, QUORUM consistency level.

The test uses the same infrastructure as last week's test (27-1-2016) on Scylla DB.

3 Cassandra nodes - Rackspace BareMetal Compute Class

10 YCSB loaders - Rackspace BareMetal Compute Class

YCSB counters:



AVG 56424.99965
MAX 76432.27

Using OpsCenter:

The following are the read requests:



And the write requests:




Looking at the green line (Cluster Total), the average of read + write is about 60K ops/sec.

Scylla's ops/sec for this configuration averaged 330K.

These figures show that Scylla is ~5.5 times faster than Cassandra.
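
The ratio can be checked with a short sketch using only the figures quoted in this thread. The helper name `speedup` is illustrative, and both Cassandra baselines (the approximate OpsCenter cluster total and the YCSB counter average) are shown because they give slightly different ratios.

```python
# Figures reported in this thread (ops/sec).
SCYLLA_AVG = 330_000               # Scylla, QUORUM, 3 nodes (rounded average)
CASSANDRA_OPSCENTER_AVG = 60_000   # Cassandra, OpsCenter cluster total (approximate)
CASSANDRA_YCSB_AVG = 56424.99965   # Cassandra, YCSB counters average


def speedup(fast_ops: float, slow_ops: float) -> float:
    """Return how many times higher fast_ops is than slow_ops."""
    return fast_ops / slow_ops


print(f"vs OpsCenter total: {speedup(SCYLLA_AVG, CASSANDRA_OPSCENTER_AVG):.2f}x")
print(f"vs YCSB average:    {speedup(SCYLLA_AVG, CASSANDRA_YCSB_AVG):.2f}x")
```

The ~5.5 figure corresponds to the OpsCenter baseline; using the YCSB average gives a somewhat higher ratio.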

Regards

Alon

The following are the OpsCenter overall statistics:


Shlomi Livne

<shlomi@scylladb.com>
Feb 2, 2016, 1:17:26 PM
to ScyllaDB users

Thanks Alon, a few questions/comments:

On Feb 2, 2016 7:56 PM, "Alon Eldi" <alon...@seadata.co.il> wrote:
>
> Hi ,
>
> Following are the test results  Cassandra version 2.2.4 - Cluster of 3 nodes - replication level 3 ; QUOROM consistency level .
>
> The test is using the same infrastructure as last week test (27-1-2016) on Scylla DB .
>
> 3 Cassandra nodes  - Rackspace BareMetal Compute Class

I guess this is IO class, not compute.


>
> 10 YCSB loaders - Rackspace BareMetal Compute Class
>
>  YCSB counters :
>
>
>

> AVG
> 56424.99965
> MAX
> 76432.27
>
> Using OPS center :
>
> Following are the read requests :
>
>
>

> And Write Requests :


>
>
>
>
> Looking at the green line (Cluster Total) the Average of read + write is about AVG 60K Ops / Sec .. 
>
> Scylla OPS/sec for this configuration was Avg - 330K ;
>
> This figures shows that Scylla is ~5.5 faster then Cassandra .

Aside from throughput, is there a latency number as well?

How long was the test run?

Did any of the Cassandra nodes have a stop-the-world pause?

>
> Regards
>
> Alon
>
> Following is ops center overall statistics :
>
>


aloneldi@gmail.com

<aloneldi@gmail.com>
Feb 7, 2016, 4:44:06 AM
to ScyllaDB users
Hi Shlomi,

Regarding your comments:
"I guess this is io class not compute" - Correct, the Cassandra nodes were IO Class.


"Aside from throughput is there a latency number as well" - The latency number is given at the end of the run in the summary report, and is displayed for each YCSB client. I will add it for the next runs.


"How long was the test run ?" - About 45 minutes.


"Did any of the cassandra nodes had a stop the world pause ?" - There were YCSB timeout errors, which I did not get in Scylla.
