Moloch Elasticsearch Sizing and Index Retention

jonathonm...@gmail.com

Aug 25, 2015, 11:40:19 AM
to Moloch Full Packet Capture
We would like to keep around 180 days of metadata searchable through ES. Our POC currently consumes around 190GB of ES index per day, and total bandwidth from the capture boxes is less than 1Gb/s. Using the ES formula, 1/4 * # of GB interfaces * number of days = 45 ES nodes, which sounds like an impossible task. Given our budget constraints, we would like to purchase two servers, each with 128GB of RAM and 18TB of storage, and run 3 ES nodes per server. Our calculation shows this should cover the total size of the index files for 180 days. Is this a bad idea? Will search performance suffer? Any guidance will be appreciated.
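
The arithmetic behind these numbers can be checked with a short sketch. The function names are made up for illustration and are not part of Moloch or Elasticsearch. It assumes a single capture interface (which is what makes the quoted formula come out at 45), a constant 190GB/day, 1TB = 1000GB, and no allowance for replicas or filesystem overhead.

```python
import math

def es_nodes_by_formula(interfaces: float, days: int) -> int:
    """Node count from the formula quoted in the post: 1/4 * interfaces * days."""
    return math.ceil(0.25 * interfaces * days)

def index_storage_tb(gb_per_day: float, days: int) -> float:
    """Total index size in TB, assuming the daily volume stays constant."""
    return gb_per_day * days / 1000

RETENTION_DAYS = 180
nodes = es_nodes_by_formula(interfaces=1, days=RETENTION_DAYS)           # 45
needed_tb = index_storage_tb(gb_per_day=190, days=RETENTION_DAYS)         # 34.2
available_tb = 2 * 18                                                     # two servers, 18TB each

print(f"formula suggests {nodes} ES nodes")
print(f"index needs ~{needed_tb:.1f}TB, proposed hardware has {available_tb}TB")
```

Under those assumptions the two servers hold the index with under 2TB to spare, so any growth in daily volume would eat the margin quickly.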

Andy

Aug 25, 2015, 2:48:03 PM
to Moloch Full Packet Capture
As long as you don't run complex queries or SPI views across a large number of days you might be OK, but you probably won't be happy and will probably experience slowness. I would start with 5 machines (a third of the recommended 15), which would allow you to have less disk per machine. Since it's always possible to add more machines, you can just wait and see what happens. My main concern is memory: you'll want to make sure you have enough to cache several days' worth of data.

jonathonm...@gmail.com

Aug 25, 2015, 3:05:15 PM
to Moloch Full Packet Capture
Thank you for the info. Should we buy around 36TB of drives and divide them among 5 servers right now, or just buy 1/3 of the total and then purchase more drives with new servers as needed? My concern is that if performance is fine with 5 servers, we may not have to add more ES nodes but will run out of disk space. In that case we will just have to purchase more drives for each of the servers. Just want to see your opinion. As for your memory concern, would memory still be a problem running 3 ES nodes at 30GB each on a 128GB server?
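
To make the two purchasing options concrete, here is a rough sketch of how many days of index each would hold. The helper name is hypothetical; the 190GB/day figure comes from the first post, and the calculation assumes that rate stays constant, 1TB = 1000GB, and that every byte of disk is usable for index data (no replicas, no free-space headroom).

```python
DAILY_INDEX_GB = 190   # POC figure from the first post
TOTAL_DISK_TB = 36     # full purchase discussed in this post
SERVERS = 5

def retention_days(disk_tb: float, gb_per_day: float) -> float:
    """Days of index that fit on disk_tb at a constant daily volume."""
    return disk_tb * 1000 / gb_per_day

per_server_tb = TOTAL_DISK_TB / SERVERS                           # 7.2TB per server
full_days = retention_days(TOTAL_DISK_TB, DAILY_INDEX_GB)         # roughly 189 days
third_days = retention_days(TOTAL_DISK_TB / 3, DAILY_INDEX_GB)    # roughly 63 days

print(f"{per_server_tb:.1f}TB per server")
print(f"full purchase: ~{full_days:.0f} days, one third: ~{third_days:.0f} days")
```

On these assumptions, buying a third of the disk up front leaves about two months before more drives are needed.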

Andy

Aug 25, 2015, 4:22:36 PM
to Moloch Full Packet Capture
The problem with all of these scaling questions is that the answer really is "it depends": until you see the type of data, the true amount of data, and how operators use Moloch, you don't really know. If you have long-term budget/planning issues and it's hard to get machines, then order more now rather than later; otherwise you can order as you see the need. More disk will allow you to keep more days in the future and handle Moloch using more disk per day in the future. More memory will make things faster and allow longer SPI view queries.

Yes, running 2 or 3 nodes on 128GB machines is still the way to go.

jonathonm...@gmail.com

Aug 25, 2015, 4:33:13 PM
to Moloch Full Packet Capture
Thank you. One last question: do you see any issue with running up to 5 nodes per server with 192GB of RAM? I'm trying to fit my budget by purchasing fewer servers, but with more memory to run more ES nodes.

Andy

Aug 25, 2015, 8:28:10 PM
to Moloch Full Packet Capture
It is hard to say because you might start hitting CPU limits.
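
On the memory side, a rough comparison of the two layouts discussed in this thread might look like the sketch below. It only looks at how much RAM is left outside the ES heaps for the operating system to cache index files. The 30GB heap is the figure given for the 3-node layout; applying the same heap to the 5-node layout is an assumption, and the function name is made up for illustration. CPU is not modeled because the thread gives no core counts.

```python
def ram_outside_heap_gb(ram_gb: int, nodes: int, heap_gb_per_node: int) -> int:
    """RAM left for the OS file cache after all ES heaps are allocated."""
    return ram_gb - nodes * heap_gb_per_node

# 3 nodes at 30GB each on a 128GB server (from the thread)
three_on_128 = ram_outside_heap_gb(128, nodes=3, heap_gb_per_node=30)   # 38
# 5 nodes on a 192GB server, assuming the same 30GB heap per node
five_on_192 = ram_outside_heap_gb(192, nodes=5, heap_gb_per_node=30)    # 42

for label, left, nodes in [("3 x 30GB on 128GB", three_on_128, 3),
                           ("5 x 30GB on 192GB", five_on_192, 5)]:
    print(f"{label}: {left}GB outside heap, {left / nodes:.1f}GB per node")
```

Under that heap assumption the larger server has slightly more memory outside the heaps in total, but less per node.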

jonathonm...@gmail.com

Aug 26, 2015, 8:11:34 AM
to Moloch Full Packet Capture
Thank you.