Recommended storage size per node


achalil@gmail.com

<achalil@gmail.com>
Jan 21, 2016, 3:27:10 PM
to ScyllaDB users
Hi,

The current recommended maximum disk size per node for Cassandra is 5 TB. Since the pre-3.0 Cassandra storage format is not an optimized one, 5 TB is not much, especially when we consider that we store the data in many projections. What is the recommended size for ScyllaDB? Does it have any advantage over Cassandra on this point?

Tzach Livyatan

<tzach@scylladb.com>
Jan 23, 2016, 4:59:56 AM
to ScyllaDB users
Hi,
Scylla's on-disk format is compatible with Cassandra 2.1.x and takes the same disk space.
Because Scylla uses CPU and I/O efficiently, each node will be able to support a higher data volume.
We are currently testing 10 TB per node, and tests at higher data volumes are planned. We will publish guidelines once this testing is finished (soon).

I agree with your comment about higher data density.
We are looking into more efficient disk representations, but not before Scylla GA[1].


Regards
Tzach
ScyllaDB Product Manager    

 

--
You received this message because you are subscribed to the Google Groups "ScyllaDB users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to scylladb-user...@googlegroups.com.
To post to this group, send email to scyllad...@googlegroups.com.
Visit this group at https://groups.google.com/group/scylladb-users.
To view this discussion on the web visit https://groups.google.com/d/msgid/scylladb-users/de15b900-1d9a-4d4b-bc7d-1eebc551319e%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Teodor Pripoae

<teodor.pripoae@gmail.com>
Jan 23, 2016, 10:33:13 PM
to ScyllaDB users, achalil@gmail.com
Hi,

What is the recommended minimum node size? If I set up a cluster with 3 nodes and a replication factor of 3 as a starting point, what is the minimum number of CPUs and amount of RAM I should provision? Currently I don't have a lot of data (only a few GB, but I expect it to increase quickly).

However, I don't want to spend a lot of money until I need to scale, so: is 3 nodes with 2 CPUs and 7.5 GB of RAM each a good starting point for small data, 5-30 GB?

Thanks,
Teodor
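
A rough way to reason about the storage side of this question: with 3 nodes and a replication factor of 3, every node ends up holding a full copy of the data. The Python sketch below works through that arithmetic. It assumes the 5-30 GB figure is the size before replication and that data spreads evenly across nodes; the function name is hypothetical and not part of any Scylla tooling.

```python
def per_node_data_gb(dataset_gb: float, replication_factor: int, nodes: int) -> float:
    """Average data stored on each node, assuming replicas spread evenly.

    Hypothetical helper for illustration only; it ignores compaction,
    snapshot, and commit log overhead.
    """
    if nodes < replication_factor:
        raise ValueError("need at least as many nodes as the replication factor")
    return dataset_gb * replication_factor / nodes


# The cluster from the question: 3 nodes, replication factor 3, 5-30 GB of data.
for dataset_gb in (5, 30):
    print(dataset_gb, "GB of data ->", per_node_data_gb(dataset_gb, 3, 3), "GB per node")
```

Under these assumptions, adding nodes without raising the replication factor is what lowers the per-node footprint.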

Tzach Livyatan

<tzach@scylladb.com>
Jan 24, 2016, 3:17:44 AM
to ScyllaDB users, achalil@gmail.com
The required CPU depends on your usage pattern.
How many CQL requests (reads/writes) per second are you expecting?

 


Avi Kivity

<avi@scylladb.com>
Jan 24, 2016, 4:09:12 AM
to scylladb-users@googlegroups.com, achalil@gmail.com
Look at our administration guide (http://www.scylladb.com/doc/admin/) for recommendations.

When discussing CPU, it is important to distinguish between the number of sockets, cores, and hardware threads (lcores) to avoid confusion. A two-socket machine with 4 cores per socket and hyperthreading enabled has 8 cores and 16 threads in total (denoted 2s8c16t); for such a machine our admin guide recommends:

 4 GiB minimal testing environment
 8 GiB minimal production environment
 32 GiB + recommended production environment
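
The socket/core/thread arithmetic and the memory tiers above can be sketched in a few lines of Python. This is only an illustration: the class and function names are hypothetical, and reading the three memory figures as lower thresholds is an assumption, not something the admin guide is quoted as saying.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CpuTopology:
    sockets: int
    cores_per_socket: int
    threads_per_core: int  # 2 with hyperthreading enabled, 1 without

    @property
    def total_cores(self) -> int:
        return self.sockets * self.cores_per_socket

    @property
    def total_threads(self) -> int:
        # Hardware threads, called lcores in the post above.
        return self.total_cores * self.threads_per_core


def memory_tier(ram_gib: float) -> str:
    """Map a node's RAM to the tiers listed above (assumed to be minimums)."""
    if ram_gib >= 32:
        return "recommended production"
    if ram_gib >= 8:
        return "minimal production"
    if ram_gib >= 4:
        return "minimal testing"
    return "below the minimal testing size"


# The example machine: two sockets, 4 cores each, hyperthreading enabled.
machine = CpuTopology(sockets=2, cores_per_socket=4, threads_per_core=2)
print(machine.total_cores, "cores,", machine.total_threads, "threads")
print(memory_tier(7.5))
```

With this reading, a 7.5 GB node like the ones asked about earlier in the thread falls in the testing tier rather than the production one.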

Teodor Pripoae

<teodor.pripoae@gmail.com>
Jan 27, 2016, 10:30:48 PM
to ScyllaDB users, achalil@gmail.com
> The required CPU depends on your usage pattern.
> How many CQL requests (reads/writes) per second are you expecting?

I currently have a few thousand writes per second and a few hundred reads at peak. My workload is write-heavy, but the writes run in the background; only read requests need a decent latency SLA. Think of something like a Twitter feed, where each user's feed is precomputed in advance whenever somebody posts something. Right now I'm running Cassandra on 3 nodes, each with 2 cores / 7.5 GB RAM, in Compute Engine, and I was thinking I could save some money with Scylla, since I'm not at 100% usage; most of the time the system is idle.

@Avi Kivity

Thank you very much. I know that for Cassandra the minimum is 4-8 cores and 8 GB RAM for production use, and 2 cores and 4 GB RAM for development/testing. I read an article on your blog saying that Scylla delivers more performance than Cassandra on similar resources, so I was thinking I could also save on hardware costs per node, not just on the number of nodes.


Right now
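
To turn the request rates above into per-node load, it helps to remember that a write is sent to every replica, while a read contacts as many replicas as the consistency level requires. The Python sketch below is a back-of-the-envelope estimate only. The figures 3000 and 300 are placeholders for "a few thousand" and "a few hundred", QUORUM reads are an assumption (the post does not state a consistency level), and the helper names are hypothetical.

```python
def replica_ops_per_node(client_ops_per_sec: float, replicas_touched: int, nodes: int) -> float:
    """Average replica-level operations each node serves per second.

    Hypothetical helper: assumes requests are spread evenly across nodes.
    """
    return client_ops_per_sec * replicas_touched / nodes


def quorum(replication_factor: int) -> int:
    """Number of replicas that make up a quorum."""
    return replication_factor // 2 + 1


NODES = 3
RF = 3
PEAK_WRITES = 3000  # placeholder for "a few thousand" writes per second
PEAK_READS = 300    # placeholder for "a few hundred" reads per second

# Every write goes to all RF replicas, whatever the consistency level.
writes_per_node = replica_ops_per_node(PEAK_WRITES, RF, NODES)
# Assumption: reads use QUORUM, so each read contacts 2 of the 3 replicas.
reads_per_node = replica_ops_per_node(PEAK_READS, quorum(RF), NODES)

print("replica writes per node per second:", writes_per_node)
print("replica reads per node per second:", reads_per_node)
```

With 3 nodes and a replication factor of 3, each node ends up applying every write, so in this setup the write rate per node is not reduced by having three nodes.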

CPC

<achalil@gmail.com>
Jan 28, 2016, 6:11:44 AM
to scylladb-users@googlegroups.com

Is there any recommendation about disk sizing, such as a maximum number of TB per lcore or per node?

Dor Laor

<dor@scylladb.com>
Jan 29, 2016, 9:18:23 PM
to ScyllaDB users
A general rule of thumb is 1 TB of SSD per 4 cores, or 10 TB per node.
In real life, results vary depending on the usage pattern, ops, the speed of the disks, and
even your data model.
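
The rule of thumb above can be written as a small Python function. This is one possible reading, not an official formula: it treats 1 TB per 4 cores as the scaling rule and 10 TB as a per-node cap, and it does not settle whether "cores" means physical cores or lcores, since the reply does not say. The function and parameter names are hypothetical.

```python
def max_storage_tb(cores: int, tb_per_4_cores: float = 1.0, node_cap_tb: float = 10.0) -> float:
    """Rule-of-thumb ceiling on SSD capacity for one node.

    Assumed reading: 1 TB per 4 cores, capped at 10 TB per node.
    """
    if cores <= 0:
        raise ValueError("cores must be positive")
    return min(cores / 4 * tb_per_4_cores, node_cap_tb)


for cores in (2, 8, 16, 48):
    print(cores, "cores ->", max_storage_tb(cores), "TB")
```

As the reply notes, real results depend on the usage pattern, disk speed, and data model, so a figure from a function like this is a starting point for testing rather than a limit.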
