Understanding indexer nodes and roles

Nico

Sep 7, 2023, 4:59:52 AM
to Wazuh | Mailing List
Hey,

I'm having trouble understanding the indexer cluster. I have three nodes: indexer-m, indexer-d1, and indexer-d2.
The current roles are:
indexer-m = cluster_manager,data,ingest,remote_cluster_client
indexer-d1 = cluster_manager,data,ingest,remote_cluster_client
indexer-d2 = data,ingest,remote_cluster_client

When I stop d2, the cluster turns red and is no longer reachable, and I can't understand why. I have 2 manager nodes and 3 data nodes, so if one of them goes down I still have 2 nodes up with data. What am I doing wrong with the configuration? A yellow cluster status would be okay, but not red.
I tested some configurations from OpenSearch, and only node.data: true/false works. When I try to use node.master: true, my cluster will not start.
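For reference, the node.master symptom is likely version-related: OpenSearch 2.x removed the legacy node.master boolean in favor of a single node.roles list. A minimal opensearch.yml sketch of the equivalent setting (the role list here is an assumption mirroring the roles listed above):

```yaml
# Sketch: OpenSearch 2.x replaced the node.master/node.data booleans with node.roles.
# A combined manager + data node:
node.roles: [ cluster_manager, data, ingest, remote_cluster_client ]

# A data-only node would drop cluster_manager:
# node.roles: [ data, ingest, remote_cluster_client ]
```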

Regards 
Nico

Ifeanyi Onyia Odike

Sep 7, 2023, 8:38:02 AM
to Wazuh | Mailing List
Hi @Nico

Thank you for using Wazuh!


"When i stop d2 the cluster turns to red and its not reachable"

This could be because indexer-d2 is not configured as a cluster_manager like indexer-m and indexer-d1. In a small three-node cluster, giving every node the cluster_manager role keeps the cluster able to elect a manager when any single node goes down.

Additionally, you mentioned that you have 2 manager nodes and 3 data nodes. It's important to note that having an odd number of master-eligible nodes in a cluster is recommended to avoid split-brain scenarios. Consider adding another manager node to your configuration.
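The split-brain arithmetic behind that recommendation can be sketched in one line: a cluster needs a strict majority, floor(n/2) + 1, of its master-eligible nodes to elect a manager.

```shell
# Quorum of master-eligible nodes: floor(n / 2) + 1.
quorum() { echo $(( $1 / 2 + 1 )); }

quorum 2   # 2 -> both managers are required; losing either halts elections
quorum 3   # 2 -> the cluster tolerates the loss of one manager
quorum 5   # 3 -> tolerates the loss of two
```

With only two master-eligible nodes, the quorum equals the node count, so the setup gains no fault tolerance over a single manager; that is why an odd count of at least three is the usual minimum.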

A red cluster status means at least one primary shard is unassigned; if every index has a replica, stopping a single data node should only turn the cluster yellow while replicas are promoted to primaries. Losing quorum among the master-eligible nodes, on the other hand, can make the cluster unreachable, since it prevents electing a new manager. If the cluster remains red after indexer-d2 is stopped, other configuration issues might need to be addressed.

To further investigate the issue, reviewing the cluster's logs and configuration files would be helpful. Please provide more details about your setup, such as the version of Wazuh and your cluster logs, so we can assist you better.
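Concretely, a few queries against the indexer's REST API usually show why the state is red. A sketch, assuming the default port 9200 and placeholder admin credentials (adjust host and auth for your deployment):

```shell
# Overall health: "red" means at least one primary shard is unassigned.
curl -sk -u admin:admin "https://localhost:9200/_cluster/health?pretty"

# List shards with their state and, for unassigned ones, the reason:
curl -sk -u admin:admin "https://localhost:9200/_cat/shards?v&h=index,shard,prirep,state,unassigned.reason"

# Ask the cluster to explain the first unassigned shard's allocation:
curl -sk -u admin:admin "https://localhost:9200/_cluster/allocation/explain?pretty"
```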

Regards,

Nico

Sep 8, 2023, 1:36:23 AM
to Wazuh | Mailing List
Hey,

My config, identical on all nodes, is:

cluster.initial_master_nodes:
  - "indexer-m"
  - "indexer-d1"
  - "indexer-d2"
cluster.name: "opensearch"
discovery.seed_hosts:
  - "172.16.2.137"
  - "172.16.2.135"
  - "172.16.2.136"

ip           heap.percent ram.percent cpu load_1m load_5m load_15m node.role node.roles                                        cluster_manager name
172.16.2.137           41          95   2    0.20    0.22     0.19 dimr      cluster_manager,data,ingest,remote_cluster_client -               indexer-m
172.16.2.136           35          91   1    0.07    0.24     0.38 dimr      cluster_manager,data,ingest,remote_cluster_client -               indexer-d2
172.16.2.135           54          96   1    0.35    0.33     0.38 dimr      cluster_manager,data,ingest,remote_cluster_client *               indexer-d1


Logs after stopping d2:

indexer-m:

[2023-09-07T15:34:18,251][INFO ][o.o.c.s.ClusterApplierService] [indexer-m] removed {{indexer-d2}{64jy12WdR0eV2_fBYU5wqA}{56AxYY-jRgi1r-PUyaApRw}{172.16.2.136}{172.16.2.136:9300}{dimr}{shard_indexing_pressure_enabled=true}}, term: 68, version: 15939, reason: ApplyCommitRequest{term=68, version=15939, sourceNode={indexer-d1}{gQQ4mVt5TAOT6IwxyAf5Zw}{k9_YNutaQemKYUazhJ_7Og}{172.16.2.135}{172.16.2.135:9300}{dimr}{shard_indexing_pressure_enabled=true}}
[2023-09-07T15:34:18,265][INFO ][o.o.a.c.ADClusterEventListener] [indexer-m] Cluster node changed, node removed: true, node added: false
[2023-09-07T15:34:18,265][INFO ][o.o.a.c.HashRing         ] [indexer-m] Node removed: [64jy12WdR0eV2_fBYU5wqA]
[2023-09-07T15:34:18,266][INFO ][o.o.a.c.HashRing         ] [indexer-m] Remove data node from AD version hash ring: 64jy12WdR0eV2_fBYU5wqA
[2023-09-07T15:34:18,266][INFO ][o.o.a.c.ADClusterEventListener] [indexer-m] Hash ring build result: true
[2023-09-07T15:34:18,267][INFO ][o.o.a.u.d.DestinationMigrationCoordinator] [indexer-m] Detected cluster change event for destination migration
[2023-09-07T15:34:18,285][INFO ][o.o.i.s.IndexShard       ] [indexer-m] [.kibana_2][0] primary-replica resync completed with 0 operations
[2023-09-07T15:34:18,287][INFO ][o.o.i.s.IndexShard       ] [indexer-m] [.opensearch-observability][0] detected new primary with primary term [36], global checkpoint [-1], max_seq_no [-1]
[2023-09-07T15:34:18,288][INFO ][o.o.i.s.IndexShard       ] [indexer-m] [.opendistro_security][0] detected new primary with primary term [52], global checkpoint [147], max_seq_no [147]
[2023-09-07T15:34:18,311][INFO ][o.o.a.u.d.DestinationMigrationCoordinator] [indexer-m] Detected cluster change event for destination migration
[2023-09-07T15:34:39,133][INFO ][o.o.a.u.d.DestinationMigrationCoordinator] [indexer-m] Detected cluster change event for destination migration
[2023-09-07T15:34:39,487][INFO ][o.o.a.u.d.DestinationMigrationCoordinator] [indexer-m] Detected cluster change event for destination migration
[2023-09-07T15:35:18,209][INFO ][o.o.a.u.d.DestinationMigrationCoordinator] [indexer-m] Detected cluster change event for destination migration
[2023-09-07T15:35:18,340][INFO ][o.o.a.u.d.DestinationMigrationCoordinator] [indexer-m] Detected cluster change event for destination migration
[2023-09-07T15:35:18,406][INFO ][o.o.i.r.RecoverySourceHandler] [indexer-m] [.kibana_2][0][recover to indexer-d1] finalizing recovery took [6.4ms]

indexer-d1:

[2023-09-07T15:34:18,164][INFO ][o.o.c.c.FollowersChecker ] [indexer-d1] FollowerChecker{discoveryNode={indexer-d2}{64jy12WdR0eV2_fBYU5wqA}{56AxYY-jRgi1r-PUyaApRw}{172.16.2.136}{172.16.2.136:9300}{dimr}{shard_indexing_pressure_enabled=true}, failureCountSinceLastSuccess=0, [cluster.fault_detection.follower_check.retry_count]=3} disconnected
[2023-09-07T15:34:18,165][INFO ][o.o.c.c.FollowersChecker ] [indexer-d1] FollowerChecker{discoveryNode={indexer-d2}{64jy12WdR0eV2_fBYU5wqA}{56AxYY-jRgi1r-PUyaApRw}{172.16.2.136}{172.16.2.136:9300}{dimr}{shard_indexing_pressure_enabled=true}, failureCountSinceLastSuccess=0, [cluster.fault_detection.follower_check.retry_count]=3} marking node as faulty
[2023-09-07T15:34:18,188][INFO ][o.o.c.r.a.AllocationService] [indexer-d1] updating number_of_replicas to [1] for indices [.opensearch-observability, .opendistro_security]
[2023-09-07T15:34:18,206][INFO ][o.o.c.s.MasterService    ] [indexer-d1] node-left[{indexer-d2}{64jy12WdR0eV2_fBYU5wqA}{56AxYY-jRgi1r-PUyaApRw}{172.16.2.136}{172.16.2.136:9300}{dimr}{shard_indexing_pressure_enabled=true} reason: disconnected], term: 68, version: 15939, delta: removed {{indexer-d2}{64jy12WdR0eV2_fBYU5wqA}{56AxYY-jRgi1r-PUyaApRw}{172.16.2.136}{172.16.2.136:9300}{dimr}{shard_indexing_pressure_enabled=true}}
[2023-09-07T15:34:18,270][INFO ][o.o.c.s.ClusterApplierService] [indexer-d1] removed {{indexer-d2}{64jy12WdR0eV2_fBYU5wqA}{56AxYY-jRgi1r-PUyaApRw}{172.16.2.136}{172.16.2.136:9300}{dimr}{shard_indexing_pressure_enabled=true}}, term: 68, version: 15939, reason: Publication{term=68, version=15939}
[2023-09-07T15:34:18,285][INFO ][o.o.a.c.ADClusterEventListener] [indexer-d1] Cluster node changed, node removed: true, node added: false
[2023-09-07T15:34:18,285][INFO ][o.o.a.c.HashRing         ] [indexer-d1] Node removed: [64jy12WdR0eV2_fBYU5wqA]
[2023-09-07T15:34:18,286][INFO ][o.o.a.c.HashRing         ] [indexer-d1] Remove data node from AD version hash ring: 64jy12WdR0eV2_fBYU5wqA
[2023-09-07T15:34:18,286][INFO ][o.o.a.c.ADClusterEventListener] [indexer-d1] Hash ring build result: true
[2023-09-07T15:34:18,286][INFO ][o.o.a.u.d.DestinationMigrationCoordinator] [indexer-d1] Detected cluster change event for destination migration
[2023-09-07T15:34:18,287][INFO ][o.o.c.r.DelayedAllocationService] [indexer-d1] scheduling reroute for delayed shards in [59.8s] (148 delayed shards)
[2023-09-07T15:34:18,300][INFO ][o.o.i.s.IndexShard       ] [indexer-d1] [.opensearch-observability][0] primary-replica resync completed with 0 operations
[2023-09-07T15:34:18,300][INFO ][o.o.i.s.IndexShard       ] [indexer-d1] [.opendistro_security][0] primary-replica resync completed with 0 operations
[2023-09-07T15:34:18,316][INFO ][o.o.a.u.d.DestinationMigrationCoordinator] [indexer-d1] Detected cluster change event for destination migration
[2023-09-07T15:34:37,817][INFO ][o.o.j.s.JobSweeper       ] [indexer-d1] Running full sweep
[2023-09-07T15:34:39,085][WARN ][o.o.c.r.a.AllocationService] [indexer-d1] [.opendistro_security][0] marking unavailable shards as stale: [HBavrEhcSFSvhiIkFtHjlw]
[2023-09-07T15:34:39,144][INFO ][o.o.a.u.d.DestinationMigrationCoordinator] [indexer-d1] Detected cluster change event for destination migration
[2023-09-07T15:34:39,439][WARN ][o.o.c.r.a.AllocationService] [indexer-d1] [.opensearch-observability][0] marking unavailable shards as stale: [aAHzDp9ITwCDzrXl8hbnAA]
[2023-09-07T15:34:39,497][INFO ][o.o.a.u.d.DestinationMigrationCoordinator] [indexer-d1] Detected cluster change event for destination migration
[2023-09-07T15:34:49,056][WARN ][o.o.s.a.BackendRegistry  ] [indexer-d1] No 'Authorization' header, send 401 and 'WWW-Authenticate Basic'
[2023-09-07T15:34:49,073][WARN ][o.o.s.a.BackendRegistry  ] [indexer-d1] No 'Authorization' header, send 401 and 'WWW-Authenticate Basic'
[2023-09-07T15:35:18,217][INFO ][o.o.p.PluginsService     ] [indexer-d1] PluginService:onIndexModule index:[.kibana_2/kVu9440FTv-oM7IseJuWqg]
[2023-09-07T15:35:18,237][INFO ][o.o.a.u.d.DestinationMigrationCoordinator] [indexer-d1] Detected cluster change event for destination migration
[2023-09-07T15:35:18,294][WARN ][o.o.c.r.a.AllocationService] [indexer-d1] [.kibana_2][0] marking unavailable shards as stale: [635P-g0jT6qAFocpGDQZhQ]
[2023-09-07T15:35:18,350][INFO ][o.o.a.u.d.DestinationMigrationCoordinator] [indexer-d1] Detected cluster change event for destination migration
[2023-09-07T15:35:18,442][INFO ][o.o.a.u.d.DestinationMigrationCoordinator] [indexer-d1] Detected cluster change event for destination migration

indexer-d2:

[2023-09-07T15:33:16,679][INFO ][o.o.j.s.JobSweeper       ] [indexer-d2] Running full sweep
[2023-09-07T15:34:18,157][INFO ][o.o.n.Node               ] [indexer-d2] stopping ...
[2023-09-07T15:34:18,158][INFO ][o.o.s.a.r.AuditMessageRouter] [indexer-d2] Closing AuditMessageRouter
[2023-09-07T15:34:18,159][INFO ][o.o.s.a.s.SinkProvider   ] [indexer-d2] Closing DebugSink
[2023-09-07T15:34:18,161][INFO ][o.o.c.c.Coordinator      ] [indexer-d2] cluster-manager node [{indexer-d1}{gQQ4mVt5TAOT6IwxyAf5Zw}{k9_YNutaQemKYUazhJ_7Og}{172.16.2.135}{172.16.2.135:9300}{dimr}{shard_indexing_pressure_enabled=true}] failed, restarting discovery
org.opensearch.transport.NodeDisconnectedException: [indexer-d1][172.16.2.135:9300][disconnected] disconnected
[2023-09-07T15:34:18,547][INFO ][o.o.n.Node               ] [indexer-d2] stopped
[2023-09-07T15:34:18,547][INFO ][o.o.n.Node               ] [indexer-d2] closing ...
[2023-09-07T15:34:18,552][INFO ][o.o.s.a.i.AuditLogImpl   ] [indexer-d2] Closing AuditLogImpl
[2023-09-07T15:34:18,557][INFO ][o.o.n.Node               ] [indexer-d2] closed

Regards
Nico

Ifeanyi Onyia Odike

Sep 12, 2023, 4:26:39 AM
to Wazuh | Mailing List
Hi Nico,

Can you make indexer-d2 a cluster manager as well?

indexer-m = cluster_manager,data,ingest,remote_cluster_client
indexer-d1 = cluster_manager,data,ingest,remote_cluster_client
indexer-d2 = cluster_manager,data,ingest,remote_cluster_client

Nico

Sep 12, 2023, 9:46:20 AM
to Wazuh | Mailing List
Hi,

d2 is a cluster manager; I changed that before I tested again.

Ifeanyi Onyia Odike

Sep 12, 2023, 2:21:22 PM
to Wazuh | Mailing List
Hi Nico,

I'm not sure what the issue is, but I will bring it to my team and give you a response tomorrow.

Regards,

Ifeanyi Onyia Odike

Sep 13, 2023, 10:00:00 AM
to Wazuh | Mailing List
Hi Nico,

Please send the configuration for:
/etc/wazuh-indexer/opensearch.yml

Regards.

Ifeanyi Onyia Odike

Sep 13, 2023, 10:04:53 AM
to Wazuh | Mailing List
Here is a detailed cluster configuration guide for the indexer:

https://opensearch.org/docs/latest/tuning-your-cluster/index/

Once your config is shared, I can compare it with the baseline from the reference above.
