Hi
On Wed, Mar 2, 2016 at 1:39 PM, Robbie Barton <
thesuns...@gmail.com> wrote:
> Need advice on how to virtualize ScyllaDB.
>
> I have a cluster of 2-6 1U boxes with 4-8 SSD drives in each.
>
> In Cassandra, I run Xen on each box and dedicate each SSD to its own
> virtual machine.
> By associating Cassandra Nodes to physical disks, I isolate disk failures to
> a single node which
> can be drained and removed from the cluster.
>
> This works OK and I've tried Docker and other solutions to minimize the
> impact of virtualization
> but haven't had as much luck as I'd like. I hope some day I can get rid of
> Xen
> and run one node per drive on the raw hardware.
>
> I'm also testing ScyllaDB and am very impressed with the performance and
> hope to use
> it as a drop-in replacement after GA.
We are glad to hear that!
> Since the Scylla developers came from virtualization backgrounds, maybe they
> can do better.
>
> In other words, is it possible to create /var/lib/cassandra[1-n] filesystems
> on one machine and
> run one ScyllaDB/Cassandra node per filesystem without using Xen to
> virtualize each node?
It should be totally possible to do that.
Here are the steps, assuming one instance per disk (X instances in total)
and N processors in the system. I am also assuming that you already have
X IP addresses available, one per instance.

1) Create one directory per instance, e.g. /var/lib/scylla-<i> for i = 1..X.
2) For each disk /dev/sd<d> in the system:
     mkfs.xfs /dev/sd<d>      (we currently only support XFS)
     mount -o noatime /dev/sd<d> /var/lib/scylla-<i>
3) For each instance, generate a scylla.yaml file (you can just copy the
   /etc/ one somewhere) and change it so that:
     - each instance has its own IP
     - the commitlog directory is /var/lib/scylla-<i>/commitlog
     - the data directory is /var/lib/scylla-<i>/data
4) Start each scylla instance, adding the options:
     --smp <N/X> --cpuset <this instance's CPUs, disjoint from every other instance's>
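To make the arithmetic in step 4 concrete, here is a minimal Python sketch that
splits N processors evenly across X instances and prints the options for each
one. The function name plan_instances and the contiguous CPU numbering are my
own assumptions for illustration; on a real box you may prefer to group CPUs by
NUMA node or hyperthread sibling instead.

```python
def plan_instances(num_cpus, num_instances, base_dir="/var/lib/scylla-"):
    """Split CPUs 0..num_cpus-1 into disjoint contiguous ranges, one per instance.

    Hypothetical helper: it only computes strings, it does not start anything.
    """
    if num_instances < 1 or num_cpus < num_instances:
        raise ValueError("need at least one CPU per instance")
    # This is the <N/X> from step 4; any leftover CPUs are simply left unused.
    smp = num_cpus // num_instances
    plans = []
    for i in range(num_instances):
        first = i * smp
        last = first + smp - 1
        cpuset = str(first) if smp == 1 else f"{first}-{last}"
        root = f"{base_dir}{i + 1}"  # matches /var/lib/scylla-<i>, counted from 1
        plans.append({
            "smp": smp,
            "cpuset": cpuset,
            "commitlog": f"{root}/commitlog",
            "data": f"{root}/data",
            "options": f"--smp {smp} --cpuset {cpuset}",
        })
    return plans


if __name__ == "__main__":
    # Example: 16 processors shared by 4 instances (one per disk).
    for plan in plan_instances(num_cpus=16, num_instances=4):
        print(plan["options"], plan["data"])
```

With 16 processors and 4 instances, each instance gets 4 CPUs, and no CPU
appears in more than one range.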
You can probably still just use Docker for most of that. As long as
each container has sensible --smp and --cpuset options, and a local
scylla.yaml that points to the correct locations, it should all work.
Getting the --smp and --cpuset options right is of the utmost importance:
scylla polls for all of its I/O and inter-shard requests, so having an
instance share a processor with another instance can be really
detrimental to its performance.
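Since an overlap is easy to introduce by hand, a quick sanity check before
starting the instances may help. The sketch below is only an illustration: the
helper names are hypothetical, and it assumes the cpusets are written as
comma-separated CPU numbers and ranges such as "0-3,8".

```python
from itertools import combinations


def parse_cpuset(spec):
    """Turn a string such as "0-3,8" into the set of CPU numbers it names."""
    cpus = set()
    for part in spec.split(","):
        if "-" in part:
            low, high = part.split("-")
            cpus.update(range(int(low), int(high) + 1))
        else:
            cpus.add(int(part))
    return cpus


def overlapping_cpus(cpusets):
    """Return {(i, j): shared CPUs} for every pair of instances sharing a CPU."""
    parsed = [parse_cpuset(spec) for spec in cpusets]
    clashes = {}
    for (i, a), (j, b) in combinations(enumerate(parsed), 2):
        shared = a & b
        if shared:
            clashes[(i, j)] = shared
    return clashes


if __name__ == "__main__":
    # Instances 0 and 1 both claim CPU 3, so this layout should be rejected.
    print(overlapping_cpus(["0-3", "3-5", "6"]))
```

An empty result means no two instances share a processor.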
I run clusters on a single machine all the time myself (mainly for
testing), and if you really want to run it this way, I see no reason
why it wouldn't work in production as well.