Sizing guidelines


Matthew Green

Jun 10, 2019, 10:49:59 PM
to velociraptor-discuss
Hi team,

I'm wondering if there are any sizing guidelines for a Velociraptor install?

At this point I'm looking at only 500ish machines, but I'm interested in any sizing information, including for large deployments.

My use cases will primarily be collection of artefacts for live response and detection/scoping for IOCs.

Matt

Mike Cohen

Jun 11, 2019, 3:11:42 AM
to velocirapt...@googlegroups.com
Hi Matthew

This is a good question!


You can get an idea of the server load by looking at the workshop slides
https://docs.velociraptor.velocidex.com/presentations/crikeycon2019/
although those are a bit old, so current performance should be better
than what they show. For example, there is a graph that shows 2k clients
idling with about 40% load and 200 MB server resident size.


Typically we use AWS t2.large instances (we used to use bigger ones but
it is not really necessary). You could probably also use the next size
down (instances are typically not loaded at all). CPU is not really a
limitation - if your CPU is slower it would take a bit longer to render
the GUI, but it's not a huge deal.


The bigger issue is memory. When transferring a lot of files to the
server, we need to buffer each POST before we can process it. So if we
run a hunt that e.g. uploads system hives or $MFT, then we are buffering
a lot of data at the same time. Velociraptor uses concurrency control to
ensure it is not handling too many concurrent clients (by default 8
clients can upload at once). Typically server memory size might increase
to several GB under heavy upload load (when idle it is about 200 MB). I
think if you are running less than 8 GB RAM you should lower the client
concurrency setting in the config file.
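
To illustrate why a concurrency cap bounds memory use, here is a minimal Python sketch. It is not Velociraptor's implementation; the function names and the per-POST buffer size are hypothetical, chosen only to show the idea that at most `max_concurrent` uploads are buffered at any moment.

```python
import threading
import time


def simulate_uploads(num_clients, max_concurrent, hold_seconds=0.01):
    """Run num_clients fake uploads and return the peak number in flight."""
    slots = threading.BoundedSemaphore(max_concurrent)
    lock = threading.Lock()
    in_flight = 0
    peak = 0

    def handle_upload():
        nonlocal in_flight, peak
        with slots:  # block until one of the upload slots is free
            with lock:
                in_flight += 1
                peak = max(peak, in_flight)
            time.sleep(hold_seconds)  # stand-in for buffering one POST
            with lock:
                in_flight -= 1

    threads = [threading.Thread(target=handle_upload) for _ in range(num_clients)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return peak


def worst_case_buffer_mb(max_concurrent, post_size_mb):
    """Upper bound on buffered upload data, assuming a fixed POST size."""
    return max_concurrent * post_size_mb


peak = simulate_uploads(num_clients=50, max_concurrent=8)
print("peak concurrent uploads:", peak)
# post_size_mb=100 is a made-up figure for illustration only.
print("worst-case buffer (MB):", worst_case_buffer_mb(8, 100))
```

Under these assumptions, halving the concurrency limit halves the worst-case buffered data, which is the reasoning behind lowering the setting on servers with less RAM.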


The main thing you need to provision is disk storage if you are going to
collect a lot of data. We don't use a database, so we just collect files
and write them to disk. Typically we provision the OS on SSD and
then a 500 GB spinning disk which we format to ext4 and mount normally.
We then just configure Velociraptor to write there. This approach has
pros and cons: the obvious disadvantage is that there are no indexes,
but the advantage is that you can just archive the files (or delete
them) at any time, because Velociraptor has no internal indexes to keep
in sync.


Velociraptor does not manage your disk space - it is up to you depending
on your retention policy and how much data you are collecting. You
literally manage your storage using the "find" and "tar" or "gzip"
utilities!
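
As a rough sketch of this kind of manual retention management, the following Python function does what a `find` plus `gzip` one-liner would do: it compresses files older than a given age. The function name, the `.csv` suffix filter, and the age threshold are assumptions for illustration. Point it only at data the server is no longer writing to.

```python
import gzip
import os
import shutil
import time


def gzip_old_files(root, max_age_days, suffix=".csv", now=None):
    """Gzip files under root older than max_age_days; return the new paths."""
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400
    compressed = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if not name.endswith(suffix):
                continue  # skips already-compressed .csv.gz files too
            path = os.path.join(dirpath, name)
            if os.path.getmtime(path) >= cutoff:
                continue  # modified recently, leave it alone
            target = path + ".gz"
            with open(path, "rb") as src, gzip.open(target, "wb") as dst:
                shutil.copyfileobj(src, dst)
            os.remove(path)  # drop the original once the copy is written
            compressed.append(target)
    return sorted(compressed)
```

A call such as `gzip_old_files("/path/to/datastore", max_age_days=30)` (the path is a placeholder) would compress everything untouched for a month. Deleting instead of compressing is the same loop with `os.remove` alone.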


To give you an idea, a 2k-5k deployment generates about 2-3 GB of process
execution logs per day. You can gzip them if you want (Velociraptor will
automatically read gzip files) and they typically compress at least 10:1
because they are CSV text files. There is even a server artifact that
does that for you automatically as well, so having it compressed is a
couple of clicks in the GUI. In reality 500 GB of storage can last for
several years if you want to go back and look at historical data.
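
The "several years" estimate follows from the figures above. A back-of-the-envelope sketch in Python, assuming the disk holds only compressed process execution logs and the 10:1 ratio holds (the function name is just for illustration):

```python
def days_of_retention(disk_gb, raw_gb_per_day, compression_ratio=10.0):
    """Days until the disk fills, given raw daily volume and compression."""
    stored_gb_per_day = raw_gb_per_day / compression_ratio
    return disk_gb / stored_gb_per_day


for raw in (2.0, 3.0):
    days = days_of_retention(500, raw)
    print(f"{raw:.0f} GB/day raw: ~{days:.0f} days (~{days / 365:.1f} years)")
```

This works out to roughly 2500 days (about 6.8 years) at 2 GB/day and roughly 1667 days (about 4.6 years) at 3 GB/day. Uploaded files and other hunt results would shorten that, so treat it as an upper bound.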


Velociraptor is typically very light on server resources. The things
that are expensive on the server are not what you would guess. For
example, running a YARA scan on all files on all endpoints is actually
extremely light on the server (but heavier on the client unless you
restrict the client's ops/sec). This is because the server does not do
much more than copy a CSV file from the endpoint to the disk!


OTOH downloading a large file from the endpoint is a bit heavier on
the server (mainly memory-wise) but it is very low impact on the client
(mostly bandwidth use). Running many dashboards and GUI users is also
more expensive because the server then has to do some analysis (but
that happens much more rarely).


Running a hunt mostly depends on how long it takes on the clients. For
example, if you want to do a small glob (e.g. check for presence of files
or registry keys) you could expect to get results from all currently
connected machines in about 10-20 sec. Collecting long running artifacts
might show more variation - for example a large YARA scan may take minutes
(or hours) on the endpoints, so completion times will be more spread out
across endpoints. When you launch a hunt, *all* currently up machines
will start the hunt at the same time immediately (unlike other tools
like GRR, Velociraptor does not poll the server so we don't need to wait
for client polls).


Hope this gives you an idea :-)

Thanks

Mike