All shard Error!


Süleyman Pamuk

Aug 28, 2023, 11:39:59 AM
to Wazuh | Mailing List
Hi everyone,
When I restart the wazuh-dashboard service I get this error. Can someone help me?
[Attachment: MobaXterm_XRMwByeGqL.png]

Leonardo Quiceno

Aug 28, 2023, 6:31:12 PM
to Wazuh | Mailing List
Hi Süleyman, sorry for the delay.

I have been investigating and found that other users have encountered these same messages; the solution seems to be as follows:

---------------------------------------------------------------------------------------------------------------------------
For some reason there wasn't a .kibana alias.

Broken install:
command: curl -u admin:<admin_password> -ks -XGET "https://localhost:9200/_cat/aliases?v"
output:  alias index filter routing.index routing.search is_write_index

Working install:
command: curl -u admin:<admin_pass> -ks -XGET "https://localhost:9200/_cat/aliases?v"
output:  alias index filter routing.index routing.search is_write_index
         .kibana .kibana_1 - - - -


Solution was:

  1. Stop the service:
systemctl stop wazuh-dashboard

  2. Manually create the alias (as per the Elasticsearch docs):
curl -u admin:<adminpass> -ks -XPOST "https://localhost:9200/_aliases?pretty" -H 'Content-Type: application/json' -d' { "actions": [ { "add": { "index": ".kibana_1", "alias": ".kibana" } } ] } '

  3. Start the service again:
systemctl start wazuh-dashboard
---------------------------------------------------------------------------------------------------------------------------
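After recreating the alias, it is worth confirming that .kibana now resolves to .kibana_1 before restarting the dashboard. A minimal sketch (the password placeholder is yours to fill in; the awk check is demonstrated on the sample row a working install returns):

```shell
# Query the alias (fill in your admin password):
#   curl -u admin:<admin_password> -ks "https://localhost:9200/_cat/aliases/.kibana?v"
# A healthy install prints a row mapping .kibana to .kibana_1. The same
# check can be scripted; here it runs on a sample row for illustration:
printf '.kibana .kibana_1 - - - -\n' \
  | awk '$1 == ".kibana" && $2 == ".kibana_1" { print "alias OK" }'
# -> alias OK
```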


You can check the conversation thread and the issue that provided a solution for this in the following link: https://github.com/wazuh/wazuh/issues/17773

On the other hand, with the information you shared, it's difficult to determine the specific cause of this error message. If the solution I provided doesn't apply to your issue, could you please share the configuration of the dashboard? It should be in /usr/share/wazuh-dashboard/data/wazuh/config/wazuh.yml. Please do not send passwords. Thanks!

Regards
Leo.

Süleyman

Aug 29, 2023, 3:46:54 AM
to Wazuh | Mailing List
Hi Leonardo

Thank you for your answer. The steps you suggested did not solve the problem for me. I am attaching the .yml file you requested. [Attachment: MobaXterm_vdcK8cNNmh.png]


/usr/share/wazuh-dashboard/data/wazuh/config/wazuh.yml
---
#
# Wazuh dashboard - App configuration file
# Copyright (C) 2015-2022 Wazuh, Inc.
#
# This program is free software; you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation; either version 2 of the License, or
# (at your option) any later version.
#
# Find more information about this on the LICENSE file.
#
# ======================== Wazuh dashboard configuration file ========================
#
# Please check the documentation for more information on configuration options:
# https://documentation.wazuh.com/current/installation-guide/index.html
#
# Also, you can check our repository:
# https://github.com/wazuh/wazuh-kibana-app
#
# ------------------------------- Disable roles -------------------------------
#
# Defines which Elasticsearch roles disable Wazuh
# disabled_roles:
#      - wazuh_disabled
#
# ------------------------------- Index patterns -------------------------------
#
# Default index pattern to use.
#pattern: wazuh-alerts-*
#
# ----------------------------------- Checks -----------------------------------
#
# Defines which checks must to be consider by the healthcheck
# step once the Wazuh dashboard starts. Values must to be true or false.
#checks.pattern : true
#checks.template: true
#checks.fields  : true
#checks.api     : true
#checks.setup   : true
#checks.metaFields: true
#checks.timeFilter: true
#checks.maxBuckets: true
#
# --------------------------------- Extensions ---------------------------------
#
# Defines which extensions should be activated when you add a new API entry.
# You can change them after Wazuh dashboard starts.
# Values must to be true or false.
#extensions.pci       : true
#extensions.gdpr      : true
#extensions.hipaa     : true
#extensions.nist      : true
#extensions.tsc       : true
#extensions.audit     : true
#extensions.oscap     : false
#extensions.ciscat    : false
#extensions.aws       : false
#extensions.gcp       : false
#extensions.virustotal: false
#extensions.osquery   : false
#extensions.docker    : false
#
# ---------------------------------- Timeout ----------------------------------
#
# Defines maximum timeout to be used on the Wazuh dashboard requests.
# It will be ignored if it is bellow 1500.
# It means milliseconds before we consider a request as failed.
# Default: 20000
#timeout: 20000
#
# -------------------------------- API selector --------------------------------
#
# Defines if the user is allowed to change the selected
# API directly from the Wazuh dashboard top menu.
# Default: true
#api.selector: true
#
# --------------------------- Index pattern selector ---------------------------
#
# Defines if the user is allowed to change the selected
# index pattern directly from the Wazuh dashboard top menu.
# Default: true
#ip.selector: true
#
# List of index patterns to be ignored
#ip.ignore: []
#
# ------------------------------ wazuh-monitoring ------------------------------
#
# Custom setting to enable/disable wazuh-monitoring indices.
# Values: true, false, worker
# If worker is given as value, the app will show the Agents status
# visualization but won't insert data on wazuh-monitoring indices.
# Default: true
#wazuh.monitoring.enabled: true
#
# Custom setting to set the frequency for wazuh-monitoring indices cron task.
# Default: 900 (s)
#wazuh.monitoring.frequency: 900
#
# Configure wazuh-monitoring-* indices shards and replicas.
#wazuh.monitoring.shards: 1
#wazuh.monitoring.replicas: 0
#
# Configure wazuh-monitoring-* indices custom creation interval.
# Values: h (hourly), d (daily), w (weekly), m (monthly)
# Default: w
#wazuh.monitoring.creation: w
#
# Default index pattern to use for Wazuh monitoring
#wazuh.monitoring.pattern: wazuh-monitoring-*
#
# --------------------------------- wazuh-cron ----------------------------------
#
# Customize the index prefix of predefined jobs
# This change is not retroactive, if you change it new indexes will be created
# cron.prefix: wazuh
#
# --------------------------------- wazuh-sample-alerts -------------------------
#
# Customize the index name prefix of sample alerts
# This change is not retroactive, if you change it new indexes will be created
# It should match with a valid index template to avoid unknown fields on
# dashboards
#alerts.sample.prefix: wazuh-alerts-4.x-
#
# ------------------------------ wazuh-statistics -------------------------------
#
# Custom setting to enable/disable statistics tasks.
#cron.statistics.status: true
#
# Enter the ID of the APIs you want to save data from, leave this empty to run
# the task on all configured APIs
#cron.statistics.apis: []
#
# Define the frequency of task execution using cron schedule expressions
#cron.statistics.interval: 0 */5 * * * *
#
# Define the name of the index in which the documents are to be saved.
#cron.statistics.index.name: statistics
#
# Define the interval in which the index will be created
#cron.statistics.index.creation: w
#
# Configure statistics indices shards and replicas.
#cron.statistics.shards: 1
#cron.statistics.replicas: 0
#
# ------------------------------ wazuh-logo-customization -------------------------------
#
#Define the name of the app logo saved in the path /plugins/wazuh/assets/
#customization.logo.app: ''
#
#Define the name of the sidebar logo saved in the path /plugins/wazuh/assets/
#customization.logo.sidebar: ''
#
#Define the name of the health-check logo saved in the path /plugins/wazuh/assets/
#customization.logo.healthcheck: ''
#
#Define the name of the reports logo (.png) saved in the path /plugins/wazuh/assets/
#customization.logo.reports: ''
#
# ---------------------------- Hide manager alerts ------------------------------
# Hide the alerts of the manager in all dashboards and discover
#hideManagerAlerts: false
#
# ------------------------------- App logging level -----------------------------
# Set the logging level for the Wazuh dashboard log files.
# Default value: info
# Allowed values: info, debug
#logs.level: info
#
# -------------------------------- Enrollment DNS -------------------------------
# Set the variable WAZUH_REGISTRATION_SERVER in agents deployment.
# Default value: ''
#enrollment.dns: ''
#
# Wazuh registration password
# Default value: ''
#enrollment.password: ''
#-------------------------------- API entries -----------------------------------
#The following configuration is the default structure to define an API entry.
#
#hosts:
#  - <id>:
      # URL
      # API url
      # url: http(s)://<url>

      # Port
      # API port
      # port: <port>

      # Username
      # API user's username
      # username: <username>

      # Password
      # API user's password
      # password: <password>

      # Run as
      # Define how the app user gets his/her app permissions.
      # Values:
      #   - true: use his/her authentication context. Require Wazuh API user allows run_as.
      #   - false or not defined: get same permissions of Wazuh API user.
      # run_as: <true|false>
hosts:
  - default:
     url: https://localhost
     port: 55000
     username: wazuh-wui
     password: wazuh-wui
     run_as: false

#hideManagerAlerts: true


On Tuesday, August 29, 2023 at 01:31:12 UTC+3, Leonardo Quiceno wrote:

Leonardo Quiceno

Aug 29, 2023, 11:39:36 AM
to Wazuh | Mailing List
Hi Süleyman,

OK, thanks for sharing the information (I don't see anything strange in it). Since the solution I proposed didn't work for you, we need to identify what is happening in your specific case. I will ask you for some information as I need it; the idea is to rule out variables.

Please run the following command and share with me the information that it gives you.

Süleyman

Aug 31, 2023, 3:16:23 AM
to Wazuh | Mailing List
Hi Leonardo

I got this error.

root@wazuh-sw:~# curl -k -u user:Password https://localhost:9200/_cluster/health?pretty
{
  "error" : {
    "root_cause" : [
      {
        "type" : "cluster_manager_not_discovered_exception",
        "reason" : null
      }
    ],
    "type" : "cluster_manager_not_discovered_exception",
    "reason" : null
  },
  "status" : 503
}

On Tuesday, August 29, 2023 at 18:39:36 UTC+3, Leonardo Quiceno wrote:

Leonardo Quiceno

Aug 31, 2023, 1:13:20 PM
to Wazuh | Mailing List
Hi Süleyman,

So, we will need to review the entire log file to get some idea of what might be happening. Please execute the following command:


grep -i -E "error|warn|shards" /var/log/wazuh-indexer/wazuh-cluster.log

And share the output with me.

Süleyman

Sep 4, 2023, 12:00:27 PM
to Wazuh | Mailing List
Hi Leonardo
The outputs you requested are attached.

On Thursday, August 31, 2023 at 20:13:20 UTC+3, Leonardo Quiceno wrote:
[Attachment: MobaXterm_Wazuh_20230904_185740.rtf]

Leonardo Quiceno

Sep 5, 2023, 2:02:54 PM
to Wazuh | Mailing List
Hi Süleyman,

Thank you for the information. I will analyze it with my team, and I will get back to you with a response as soon as possible!!

Leonardo Quiceno

Sep 12, 2023, 5:49:35 PM
to Wazuh | Mailing List
Hi Süleyman,

Sorry for the delay. Analyzing the file you sent, it seems there is an issue with the communication between the Wazuh indexer nodes. The problem might lie in the configuration of the nodes, or there could be an issue with the network connecting them. It appears that the nodes cannot reach each other on port 9300, and this should work in both directions. In other words, node-1 should be able to reach node-2 on port 9300, and vice versa. You can confirm this with a simple curl command. From node-1, run "curl node-2:9300"; if it succeeds, then try "curl node-1:9300" from node-2. (Replace node-1 and node-2 with the names you've given to your nodes.)

I'm not sure how many nodes your cluster has, so testing this between a few nodes as a sample should be sufficient. If you see that they can communicate with each other without any issues, then we need to start checking the configuration of each component (Indexer, Server, and Dashboard) to ensure they are correct.
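As a sketch of that connectivity test (check_port is a hypothetical helper, not part of Wazuh; on a single-node setup both ends are localhost): note that an "Empty reply from server" from curl against 9300 does not by itself mean the port is broken, since 9300 is the binary transport port rather than HTTP, so any reply at all indicates a listener.

```shell
# check_port: report whether a TCP port accepts connections.
# Bash-only sketch using /dev/tcp; hypothetical helper for illustration.
check_port() {  # usage: check_port <host> <port>
  if (exec 3<>"/dev/tcp/$1/$2") 2>/dev/null; then
    echo "open"
  else
    echo "closed"
  fi
}

check_port localhost 9300   # transport port between indexer nodes
check_port localhost 9200   # REST port used by curl and the dashboard
```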


I will remain attentive to your feedback.

Regards,
Leo

Süleyman

Sep 13, 2023, 4:53:43 AM
to Wazuh | Mailing List
Hello Leo

Thank you for your help. There is only one server in the structure. When I run "curl 127.0.0.1:9300" I get "curl: (52) Empty reply from server".
On Wednesday, September 13, 2023 at 00:49:35 UTC+3, Leonardo Quiceno wrote:

Leonardo Quiceno

Sep 13, 2023, 1:13:24 PM
to Wazuh | Mailing List
Hi Süleyman,

Okay, let's review a bit what structure you have in place to address the issue correctly. According to the information you've sent, you have a cluster with multiple indexer nodes, one server, and one dashboard. Are these central components on different machines? You can check this in the config.yml file. The idea is to verify your setup in this file and let me know. 
Perhaps we have been approaching the problem incorrectly. You can refer to this information: 

Süleyman

Sep 14, 2023, 4:52:20 AM
to Wazuh | Mailing List
Hello Leo

In my server structure, all components are on a single virtual server: an indexer, a dashboard, a manager, and a Filebeat component. Before I reported this error, it had been working healthily for about a year. There is still no problem with data processing and the other components. The only thing missing is the visual interface, because the dashboard service gives the "all shards error". When I check the config.yml file, I see one node.
On Wednesday, September 13, 2023 at 20:13:24 UTC+3, Leonardo Quiceno wrote:

Wazuh | Mailing List

Sep 18, 2023, 6:18:07 PM
to Wazuh | Mailing List
Hi Süleyman,

If you tell me that it worked without any issues for a year, we need to identify what changed before you started experiencing this issue. Please tell me if you updated the Wazuh version, or if you remember the last thing you did before encountering this error. The idea is to eliminate as many variables as possible.

It would also be very helpful if you could answer the following questions:
  • What version of Wazuh are you using?
  • Do you remember what type of Wazuh installation you performed? (Quickstart or Installation of each component separately)
  • How many agents do you have connected to the Wazuh manager?
  • Do you know what version of Wazuh the agents are running?
Sorry for asking so many questions, but based on what we've seen, I still don't have a clear understanding of what's happening in your particular case.

Regards,
Leo

Süleyman

Oct 2, 2023, 3:41:23 AM
to Wazuh | Mailing List
Hello Leo

Sorry for the late reply.

Actually, it just sort of happened. While everything was normal, I noticed that logs were no longer appearing in the interface. After that, I did an update. Then I read a topic that said increasing the number of shards could be a solution, so I increased the number of shards from 3 to 10 and was able to see the logs in the interface for about 4-5 days. When I encountered the same error again, I restarted my server and could not access the interface anymore. When I checked my services, I saw the all shards error.

My server runs Ubuntu 22.04.
I am using Wazuh version 4.5.2.
I did a separate (step-by-step) installation.
I have around 150 devices in total.
My agents mostly run version 4.3.


Thank you for your help.

On Tuesday, September 19, 2023 at 01:18:07 UTC+3, Wazuh | Mailing List wrote:

Leonardo Quiceno

Oct 2, 2023, 6:59:51 PM
to Wazuh | Mailing List
Hi Süleyman,

Since you performed a separate installation of each component, could you try uninstalling the dashboard and then reinstalling it? This might help resolve the issue you're facing. To uninstall only the dashboard component, you can follow the instructions in the following documentation: Uninstall the Wazuh Dashboard.

To reinstall it, you can follow the steps outlined in this link: Installing the Wazuh Dashboard - Step by Step.

Let me know if you find this information helpful.

Leonardo Quiceno

Oct 6, 2023, 2:58:33 PM
to Wazuh | Mailing List
Hi again Süleyman,

I was analyzing your case a bit more, and based on the information about the shard increase you made and the fact that it worked fine for a few days afterwards, the issue you are experiencing may be that the cluster has reached the maximum number of open shards allowed. This can happen when the cluster is under heavy load or when too many indices have been created.
If this is what is happening, you could solve it by optimizing shard allocation, for example by merging smaller indices into larger ones.

Here's how you can address this issue:
Run the following commands from Dev Tools in the OpenSearch menu.

1) Investigate the number of shards per index: You can use the following command to check the number of shards for each index:

GET _cat/shards?v

This will give you a list of all the shards in your cluster, including the index they belong to, their state, and other information. If you see indices with a large number of shards, this might be a sign that you're over-sharding.
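If that output is long, the per-index shard counts can be tallied with a small pipeline. A sketch: the sample lines below are illustrative, not from your cluster, so feed it the real _cat/shards output from curl instead.

```shell
# Tally shards per index: column 1 of _cat/shards output is the index name.
count_shards() { awk '{ n[$1]++ } END { for (i in n) print n[i], i }' | sort -rn; }

# Against a live cluster (fill in your credentials):
#   curl -k -u <user>:<password> "https://localhost:9200/_cat/shards" | count_shards
# Demonstration on sample output:
printf '%s\n' \
  'wazuh-alerts-4.x-2023.09.01 0 p STARTED' \
  'wazuh-alerts-4.x-2023.09.01 0 r UNASSIGNED' \
  'wazuh-alerts-4.x-2023.09.02 0 p STARTED' | count_shards
# -> 2 wazuh-alerts-4.x-2023.09.01
#    1 wazuh-alerts-4.x-2023.09.02
```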


2) Reduce the number of shards: If you have indices with a large number of shards, consider reducing them. You can do this by changing the number of primary shards when creating a new index:

PUT /my_new_index
{
  "settings" : {
    "index" : {
      "number_of_shards" : 1,
      "number_of_replicas" : 1
    }
  }
}

This creates my_new_index with one primary shard and one replica. As for the recommended values, it depends on your use case and the resources available in your cluster:

     a) index.number_of_shards: The optimal number of primary shards depends on the amount of data in your index. As a general rule, aim for shard sizes between 10GB and 50GB. For example, if you expect your index to grow to 200GB, setting index.number_of_shards to 4 would be a good start.

     b) index.number_of_replicas: The number of replica shards should be determined based on the required search throughput and data resiliency. Each replica provides a copy of the primary shard for failover and increases your capacity to serve search requests. A common practice is to set index.number_of_replicas to at least 1 for production environments.
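The "merging smaller indices into larger ones" step can be done with the _reindex API. A hedged sketch: the index names are illustrative placeholders, and the request body is validated locally before anything touches the cluster.

```shell
# Illustrative _reindex request: copy all September daily indices into one
# monthly index (index names are placeholders; adjust to your own indices).
body='{
  "source": { "index": "wazuh-alerts-4.x-2023.09.*" },
  "dest":   { "index": "wazuh-alerts-4.x-2023.09" }
}'

# Sanity-check the JSON before sending it:
echo "$body" | python3 -m json.tool > /dev/null && echo "body OK"
# -> body OK

# Then run it against the cluster (fill in your credentials):
#   curl -k -u admin:PASSWORD -XPOST "https://localhost:9200/_reindex?pretty" \
#     -H 'Content-Type: application/json' -d "$body"
# After verifying the document counts match, delete the small source
# indices to free their shards.
```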


3) Delete unused indices: If there are indices that you no longer need, you can delete them to free up shards. Here's how you can delete an index:

DELETE /my_unused_index


4) Increase the shard limit: As a last resort, you can increase the maximum limit of open shards. However, be aware that having too many shards in a cluster can lead to poor performance and stability issues. If you decide to increase the limit, you can do so by updating the cluster.max_shards_per_node setting:

PUT _cluster/settings
{
  "persistent": {
    "cluster.max_shards_per_node": 2000
  }
}


This command will increase the maximum limit of open shards per node to 2000. Adjust the number according to your cluster's capacity.
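To see how close the cluster is to the limit, _cluster/health reports the active shard count and the number of data nodes; the effective ceiling is cluster.max_shards_per_node (default 1000) multiplied by the number of data nodes. A sketch, demonstrated on a sample single-node health document (the commented curl fetches the real one):

```shell
# Live numbers (fill in your credentials):
#   curl -k -u admin:PASSWORD "https://localhost:9200/_cluster/health?pretty"
# The comparison itself, shown on a sample health document:
health='{"number_of_data_nodes": 1, "active_shards": 950}'
echo "$health" | python3 -c '
import json, sys
h = json.load(sys.stdin)
limit = 1000 * h["number_of_data_nodes"]   # assumes the default per-node limit
print("shards %d / limit %d" % (h["active_shards"], limit))
'
# -> shards 950 / limit 1000
```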

Remember, it's important to monitor your shard usage and plan your indices and shard allocation according to your data and query requirements to avoid such issues.

Additionally, you can consider scaling up the cluster by adding more nodes or upgrading the hardware to handle the increased workload.


You can also use these commands if you do not have access to GUI.

curl -X GET "https://<indexer-IP>:9200/_cat/shards?v" -u <username>:<password> -k

Or with localhost:
curl -k -u <user>:<password> -XGET "https://localhost:9200/_cat/shards?v"

Please let me know if there's anything else I can help you with.

Süleyman

Oct 11, 2023, 10:07:14 AM
to Wazuh | Mailing List
Hello Leonardo

I've decided to cleanly reinstall Wazuh again. Thank you for your help.

On Friday, October 6, 2023 at 21:58:33 UTC+3, Leonardo Quiceno wrote: