Reducing the number of shards in Elasticsearch


Robert H

Aug 28, 2018, 12:15:23 PM
to Wazuh mailing list
Hi All,
In my lab environment I have 2 Wazuh managers and a 5-node ELK cluster (with the Wazuh app), 2 of whose nodes are data nodes.  As I'm learning Elastic via Wazuh, I've been following the Wazuh documentation for the ELK setup.  During installation and configuration I just went with the default shard settings.  But now I find a very large number of shards on my cluster, and I have been reading that shards should be several GB in size for best search performance and that too many shards introduce too much overhead.  So I'm trying to learn how to reduce the number of shards in the Wazuh/ELK setup, apparently with the Logstash template.  Could anyone advise how to change the number of shards to 1 primary and 1 replica, instead of 5 primaries and 1 replica?

My current cluster health shows almost a thousand shards for less than 40 days of indices/data.  

# curl -X GET "localhost:9200/_cluster/health?pretty"
{
  "cluster_name" : "ELK-cluster",
  "status" : "green",
  "timed_out" : false,
  "number_of_nodes" : 5,
  "number_of_data_nodes" : 2,
  "active_primary_shards" : 426,
  "active_shards" : 852,
  "relocating_shards" : 0,
  "initializing_shards" : 0,
  "unassigned_shards" : 0,
  "delayed_unassigned_shards" : 0,
  "number_of_pending_tasks" : 0,
  "number_of_in_flight_fetch" : 0,
  "task_max_waiting_in_queue_millis" : 0,
  "active_shards_percent_as_number" : 100.0
}
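
As a rough way to see where those totals come from, the sketch below sums primaries and total shards (primaries plus replica copies) from the _cat/indices output. The function names are illustrative, the index names in the example are sample data, and localhost:9200 is assumed from the command above.

```shell
# Sum primaries and total shards from "index pri rep" lines on stdin,
# where total = pri * (1 + rep) for each index.
count_shards() {
    awk '{ pri += $2; total += $2 * (1 + $3) }
         END { printf "primaries=%d total=%d\n", pri, total }'
}

# Fetch index name, primary count and replica count for every index, then summarize.
# Not called here; run it on a host that can reach the cluster.
summarize_cluster_shards() {
    curl -s "localhost:9200/_cat/indices?h=index,pri,rep" | count_shards
}

# Sample input: one daily index with the default 5 primaries and 1 replica
# (10 shards) and one with 1 primary and 1 replica (2 shards).
printf 'wazuh-alerts-3.x-2018.08.27 5 1\nwazuh-alerts-3.x-2018.08.28 1 1\n' | count_shards
# prints: primaries=6 total=12
```

Calling summarize_cluster_shards on a node should reproduce the active_primary_shards and active_shards figures from the health output, as long as all indices are open and fully allocated.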

///////////////

If the number of shards is changed using the Logstash templates, do I modify it where indicated below?  Also, how would the monitoring index be changed, as I don't see it included in the template below?

# cat /etc/logstash/conf.d/01-wazuh.conf
# Wazuh - Logstash configuration file
## Remote Wazuh Manager - Filebeat input
input {
    beats {
        port => 5000
        codec => "json_lines"
#       ssl => true
#       ssl_certificate => "/etc/logstash/logstash.crt"
#       ssl_key => "/etc/logstash/logstash.key"
    }
}
filter {
    if [data][srcip] {
        mutate {
            add_field => [ "@src_ip", "%{[data][srcip]}" ]
        }
    }
    if [data][aws][sourceIPAddress] {
        mutate {
            add_field => [ "@src_ip", "%{[data][aws][sourceIPAddress]}" ]
        }
    }
}
filter {
    geoip {
        source => "@src_ip"
        target => "GeoLocation"
        fields => ["city_name", "country_name", "region_name", "location"]
    }
    date {
        match => ["timestamp", "ISO8601"]
        target => "@timestamp"
    }
    mutate {
        remove_field => [ "timestamp", "beat", "input_type", "tags", "count", "@version", "log", "offset", "type", "@src_ip", "host"]
    }
}
output {
    elasticsearch {
        hosts => ["localhost:9200"]
        index => "wazuh-alerts-3.x-%{+YYYY.MM.dd}"
        "number_of_shards" : 1 <-- here?
       "number_of_replicas" : 1 <-- here?
        document_type => "wazuh"
    }
}

Thanks,
Robert

jesus.g...@wazuh.com

Aug 29, 2018, 3:21:40 AM
to Wazuh mailing list
Hi Robert,

What I did to achieve this was to modify the Elasticsearch template. That way you can set the proper settings for each index. Keep in mind that the number of shards of already created indices can't be changed, only the number of replicas.



Now modify the settings section.

Before:

{
  "order": 0,
  "template": "wazuh-alerts-3.x-*",
  "settings": {
    "index.refresh_interval": "5s"
  },
  ...
}


After:

{
  "order": 0,
  "template": "wazuh-alerts-3.x-*",
  "settings": {
    "index.refresh_interval": "5s",
    "number_of_shards": 1,
    "number_of_replicas": 1
  },
  ...
}

Stop Logstash:

systemctl stop logstash

Update your current template:

curl -XPUT 'http://localhost:9200/_template/wazuh' -H 'Content-Type: application/json' -d @template.json

Restart Logstash:

systemctl restart logstash
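
A minimal sketch of checks that could bracket the steps above: a pre-flight test that the edited template.json really contains both settings, and two read-only queries to confirm what Elasticsearch stored. The function names are made up for illustration; the template name wazuh, the file template.json and localhost:9200 are taken from the commands above.

```shell
# Pre-flight: succeed only if the file mentions both settings.
template_has_shard_settings() {
    grep -q '"number_of_shards"' "$1" && grep -q '"number_of_replicas"' "$1"
}

# Show only the settings section of the stored "wazuh" template.
show_template_settings() {
    curl -s "localhost:9200/_template/wazuh?pretty&filter_path=*.settings"
}

# List primary and replica counts per alerts index, sorted by name.
# Indices created after the template update should show pri = 1.
show_alert_index_shards() {
    curl -s "localhost:9200/_cat/indices/wazuh-alerts-3.x-*?v&h=index,pri,rep&s=index"
}
```

For example, run template_has_shard_settings template.json && echo ok before the PUT, and the two show_ functions once the next daily index has been created.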


Already created indices (including today's index) won't be affected. For those you can only modify the number of replicas, as follows:

curl -X PUT "localhost:9200/<index>/_settings" -H 'Content-Type: application/json' -d'
{
    "index" : {
        "number_of_replicas" : 1
    }
}
'
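
To apply the same replica change to many existing indices at once, the update-settings endpoint also accepts a wildcard index pattern. A hedged sketch, with illustrative function names and the wazuh-alerts-3.x-* pattern taken from the template above:

```shell
# Build the JSON body for a replica-count change.
replica_settings_body() {
    printf '{"index":{"number_of_replicas":%d}}' "$1"
}

# Set the replica count on every index matching the pattern.
# Usage: set_replicas 'wazuh-alerts-3.x-*' 1
set_replicas() {
    curl -s -X PUT "localhost:9200/$1/_settings" \
         -H 'Content-Type: application/json' \
         -d "$(replica_settings_body "$2")"
}
```

This call leaves the number of primary shards untouched; on existing indices only the replica count changes.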


That's all, I hope it helps.


Regards,
Jesús

Robert H

Aug 29, 2018, 11:48:57 AM
to Wazuh mailing list
Thanks Jesus,
I have 5 Elasticsearch nodes: 3 are master-eligible, and 2 of those master-eligible nodes have Logstash on them.  I have applied the change below to both of those Logstash nodes.  Then I have 2 Elasticsearch data-only nodes.  Do I also need to download and modify the template and load it on the other 3 Elasticsearch nodes that do not have Logstash on them (i.e. the master without Logstash and the 2 data nodes)?

Thanks,
Robert

jesus.g...@wazuh.com

Aug 29, 2018, 12:01:20 PM
to Wazuh mailing list
Hello again Robert,

You don't need to change the Logstash configuration.

Just by changing the template file and loading it into Elasticsearch you can change the shard and replica settings, so you don't need Logstash for this.

I hope that makes it clearer.

Regards,
Jesús

Robert H

Aug 29, 2018, 12:49:51 PM
to Wazuh mailing list
Thanks Jesus,
I understand that this is an Elasticsearch template, not a Logstash one.  But my question is: if I have 5 nodes in the cluster running Elasticsearch, do I need to apply this template to all 5 of them, or will running the curl command, curl -XPUT 'http://localhost:9200/_template/wazuh' -H 'Content-Type: application/json' -d @template.json, on one Elasticsearch node update the entire cluster of 5?

Regards,
Robert

Robert H

Aug 29, 2018, 1:31:46 PM
to Wazuh mailing list
Sorry Jesus,
It sounds like you're saying that if I run the curl command on any Elasticsearch node, the template will be inserted into the whole cluster.  Is that correct?

Regards,
Robert

jesus.g...@wazuh.com

Aug 30, 2018, 2:48:55 AM
to Wazuh mailing list
Hello again Robert,

Since Elasticsearch works as a cluster, it's enough to execute the curl command against one node. The important thing is to shut down every Logstash instance first. And remember that already existing indices won't be affected.

Regards,
Jesús

Robert H

Sep 4, 2018, 12:00:27 PM
to Wazuh mailing list
Hi Jesus,
Thanks for the clarification.  I realized after posting the question and looking more closely at the template download and load commands that that is how it works.  As for the wazuh-monitoring index, I'm thinking of just trying the same method for that index, or is there a different method?  I notice the release notes for 3.6 say:

  • Added new options to config.yml to change shards and replicas settings for wazuh-monitoring indices.

Are these new options only for setting before Kibana starts up for the first time?  And in my case of a running Kibana, I would use the same curl command but use the wazuh-monitoring index name instead.  Is that right?

Regards,
Robert

jesus.g...@wazuh.com

Sep 5, 2018, 3:01:45 AM
to Wazuh mailing list
Hi Robert,

Replicas can be modified on the fly, which means you can change the number of replicas for already created indices. On the other hand, the number of shards can be changed for future indices only. If you modify those settings, today's index will be affected by the replicas change, and future indices will be affected by both the shards and replicas changes.

I hope it helps.

Regards,
Jesús

Robert H

Sep 5, 2018, 12:30:04 PM
to Wazuh mailing list
Hi Jesus,
Yes, I'm familiar with that information.  I'm wondering how to change the primary shard number going forward for the wazuh-monitoring-3.x-* index?  I don't see a reference for it in the template.json file.  I only see a reference for the wazuh-alerts-3.x-* index (below).


{
  "order": 0,
  "template": "wazuh-alerts-3.x-*",
  "settings": {
    "index.refresh_interval": "5s",
    "number_of_shards": 1,
    "number_of_replicas": 1
  },

Regards,
Robert

jesus.g...@wazuh.com

Sep 5, 2018, 12:34:02 PM
to Wazuh mailing list
Hi Robert,

The wazuh-monitoring indices are created by the Wazuh app itself, not by an external command; we force the template and a few other things under the hood.
The shard and replica settings for those indices are set in the config.yml file located at /usr/share/kibana/plugins/wazuh/config.yml.

Once you are done, restart Kibana:

systemctl restart kibana
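
A sketch for checking the result, under two assumptions: the shard and replica options in config.yml contain the words shards and replicas (their exact names are not given in this thread), and the monitoring indices match wazuh-monitoring-3.x-*. The function names are illustrative.

```shell
# Print the lines (with line numbers) of a config file that mention
# shards or replicas, including commented-out defaults.
show_shard_options() {
    grep -n -i -E 'shards|replicas' "$1"
}

# List primary and replica counts for the monitoring indices, sorted by name.
# Not called here; run it on a host that can reach the cluster.
show_monitoring_index_shards() {
    curl -s "localhost:9200/_cat/indices/wazuh-monitoring-3.x-*?v&h=index,pri,rep&s=index"
}
```

For example, show_shard_options /usr/share/kibana/plugins/wazuh/config.yml shows what is currently set, and show_monitoring_index_shards confirms the counts once a new monitoring index has been created.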

I hope it helps.

Regards,
Jesús

Robert H

Sep 7, 2018, 11:41:59 AM
to Wazuh mailing list
Thank you Jesus,
I have been able to set both the wazuh-alerts and wazuh-monitoring number of primary shards to 1 for current and future indices.  

Regards,
Robert

jesus.g...@wazuh.com

Sep 7, 2018, 12:03:25 PM
to Wazuh mailing list
Happy to help, you are welcome Robert!

Regards,
Jesús