ossec-csyslogd not running...

Robert H

Feb 7, 2018, 5:24:51 PM
to Wazuh mailing list
I have a Wazuh 2.0 manager with about 15 agents on it, sending syslog to itself to forward to a SIEM. I noticed an Out of Memory error in the SIEM syslog connector events, and the Wazuh manager's memory usage went from 5-7 GB down to 550 MB. I also noticed the Wazuh manager events had stopped flowing to the SIEM via syslog.

Then I saw that the syslogd was not running.

ossec-csyslogd not running...

The OS has 16GB of RAM which should be plenty.  Are there memory or performance options in the configuration somewhere to adjust so this csyslogd does not stop?

After I restart the manager everything works again.

Regards,
Robert

Santiago Bassett

Feb 7, 2018, 6:21:29 PM
to Robert H, Wazuh mailing list
Hi Robert,

We don't know what could be causing this behavior. Would it be possible for you to upgrade the manager to the latest version and see if the problem persists? (Agents do not need to be upgraded, as the managers are backwards compatible.)

Best regards,

Santiago.


--
You received this message because you are subscribed to the Google Groups "Wazuh mailing list" group.
To unsubscribe from this group and stop receiving emails from it, send an email to wazuh+unsubscribe@googlegroups.com.
To post to this group, send email to wa...@googlegroups.com.
Visit this group at https://groups.google.com/group/wazuh.
To view this discussion on the web visit https://groups.google.com/d/msgid/wazuh/29ad9798-317c-4278-a86a-ac9050fb5733%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Robert H

Feb 7, 2018, 7:16:11 PM
to Wazuh mailing list
Hi Santiago,
Thanks for the quick reply. We have to stay on 2.0 for a while. We are testing the 3.x manager/agent now, but we are not ready to move to it yet. As a workaround I wrote a short script that checks whether ossec-csyslogd is running and restarts the manager if it is not.
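A watchdog of the kind described above could look roughly like the following. This is only a sketch, not Robert's actual script: the install path `/var/ossec/bin/ossec-control` is an assumption (the thread never states the prefix), and the function names are made up for illustration. It relies on the status output containing a line such as "ossec-csyslogd not running...", as quoted earlier in the thread.

```python
import subprocess

# Assumed install prefix; the thread does not state it, so adjust as needed.
OSSEC_CONTROL = "/var/ossec/bin/ossec-control"
WATCHED = ("ossec-csyslogd",)


def stopped_daemons(status_text, watched=WATCHED):
    """Return the watched daemons that the status output reports as not running."""
    stopped = []
    for line in status_text.splitlines():
        line = line.strip()
        for name in watched:
            # Status lines look like "ossec-csyslogd not running..."
            if line.startswith(name + " not running"):
                stopped.append(name)
    return stopped


def check_and_restart():
    """Query the manager status and restart the manager if a watched daemon is down."""
    result = subprocess.run([OSSEC_CONTROL, "status"],
                            capture_output=True, text=True)
    down = stopped_daemons(result.stdout + result.stderr)
    if down:
        subprocess.run([OSSEC_CONTROL, "restart"], check=False)
    return down
```

Calling `check_and_restart()` from a cron job every few minutes would approximate the workaround; restarting the whole manager is heavy-handed, so it is a stopgap rather than a fix.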

Regards,
Robert

Robert H

Feb 8, 2018, 7:02:40 PM
to Wazuh mailing list
Hi Santiago,
I wanted to ask another question on this. As mentioned, we are sending syslog out of the manager to a SIEM. With only about 10 active agents on the manager, over the last couple of days I have seen the RAM usage climb steadily. The server has 16GB of RAM, and I noticed that usage had gone up to over 15GB. I sorted top by memory percentage, and ossec-csyslogd was the clear leader, using 88% of the RAM. I restarted the manager and the RAM usage went back down to about 1.5GB. It's slowly increasing by about 500MB an hour.

Is there a way to start the manager or process and limit the memory usage to xx% of the system memory? Also, this is with Wazuh manager 2.0. Do you know if this behavior still exists in the 2.1 or 3.1 versions?
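For a rough sense of how long the server can run between restarts, the figures above can be plugged into a simple linear estimate. This is a back-of-the-envelope sketch that assumes the growth stays constant at about 0.5 GB per hour, which the thread does not confirm; the function name is illustrative.

```python
def hours_until_full(total_gb, used_gb, growth_gb_per_hour):
    """Hours until used memory reaches total memory at a constant growth rate."""
    if growth_gb_per_hour <= 0:
        raise ValueError("growth rate must be positive")
    return max(total_gb - used_gb, 0) / growth_gb_per_hour


# 16 GB total, about 1.5 GB used after a restart, growing about 0.5 GB per hour
print(hours_until_full(16, 1.5, 0.5))  # 29.0
```

Under that assumption, memory would be exhausted a little over a day after a restart, so any periodic restart used as a workaround would need to run more often than that.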

Thanks,
Robert

Santiago Bassett

Feb 14, 2018, 5:21:06 AM
to Robert H, Wazuh mailing list
Hi Robert, 

Apologies for the late response.

This behavior makes me think that ossec-csyslogd may be suffering from some kind of memory leak. I will ask our development team to take a look at this and get back to you. In the meantime, as a quick and dirty workaround, my advice would be to restart that process (via a cron task) every couple of hours.

If this is actually a bug, I am convinced we will find it and fix it.

Best regards


rafael...@wazuh.com

Feb 21, 2018, 3:00:05 PM
to Wazuh mailing list
Hi Robert,

We are trying to reproduce the problem.

Could you please provide your ossec.conf file from your manager?
Have you tried the 3.2 version of the manager? If yes, do you encounter the same issue?

Best regards.

Robert H

Feb 21, 2018, 7:48:34 PM
to Wazuh mailing list
Hi guys,
Yes, the problem is still occurring. I also found OSSEC blog posts going back to 2011/2012 that describe a memory leak in csyslogd. We currently cannot upgrade the manager to the 3.x series. We are just now starting to load up thousands of clients/agents on the planned setup. The manager is 2.0 and the agents are 2.1.

I do have a cronjob scheduled to restart the manager once a day.

I've also seen recently a second daemon is having issues and stops running.  ossec-remoted not running...

So I have a simple script checking every 5 minutes to see if either remoted or csyslogd are not running and if so restart the manager.

The second problem, remoted stopping, seems to correlate with this error. When remoted stops, the ossec.log has this error. The agents involved are all Windows systems.

It seems to be different agents each time:
2018/02/21 16:30:19 ossec-remoted(1403): ERROR: Incorrectly formatted message from agent '139' (host '10.x.x.x')
2018/02/21 16:42:05 ossec-remoted(1403): ERROR: Incorrectly formatted message from agent '032' (host '10
2018/02/21 16:42:05 ossec-remoted(1243): ERROR: Couldn't receive message from peer.
2018/02/21 16:42:09 ossec-remoted(1403): ERROR: Incorrectly formatted message from agent '085' (host '10
2018/02/21 16:42:09 ossec-remoted(1243): ERROR: Couldn't receive message from peer.
2018/02/21 16:42:09 ossec-remoted(1403): ERROR: Incorrectly formatted message from agent '062' (host '10
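To check whether the errors really are spread across different agents, the 1403 lines can be tallied per agent ID. The following is a minimal sketch; the regular expression is written against the log lines quoted above, and the function name is illustrative. The host field is ignored because it is truncated in some of the pasted lines.

```python
import re
from collections import Counter

# Matches the error 1403 lines quoted above and captures the agent ID.
ERROR_1403 = re.compile(
    r"ossec-remoted\(1403\): ERROR: Incorrectly formatted message from agent '(\d+)'"
)


def count_malformed_by_agent(log_lines):
    """Count 'Incorrectly formatted message' errors per agent ID."""
    counts = Counter()
    for line in log_lines:
        match = ERROR_1403.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts
```

Feeding it the lines of ossec.log (for example `count_malformed_by_agent(open(path))`, with `path` pointing at the log) gives a per-agent count; a flat distribution would support the observation that no single agent is responsible.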

My script will detect the above and restart the manager.  However this is not a very comfortable position to be in.  I have seen posts on the ossec group about the remoted daemon stopping and everyone suggests checking for duplicate IPs.  I have checked and with 200 agents so far, there are no duplicate IPs in the client.keys file.
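A duplicate-IP check like the one mentioned above can be automated. The sketch below assumes each client.keys line has the form `ID NAME IP KEY` separated by whitespace, and it skips entries whose IP is `any`; both the function name and the sample data in the usage note are invented for illustration.

```python
from collections import defaultdict


def duplicate_ips(lines):
    """Map each IP that appears more than once to the agent IDs that use it."""
    by_ip = defaultdict(list)
    for line in lines:
        fields = line.split()
        # Skip blank or malformed lines (fewer than ID, NAME, IP, KEY).
        if len(fields) < 4:
            continue
        agent_id, ip = fields[0], fields[2]
        # Agents registered with "any" have no fixed address to compare.
        if ip.lower() == "any":
            continue
        by_ip[ip].append(agent_id)
    return {ip: ids for ip, ids in by_ip.items() if len(ids) > 1}
```

For example, with made-up entries `001 web01 10.0.0.5 ...` and `003 web03 10.0.0.5 ...`, the function would return `{"10.0.0.5": ["001", "003"]}`; an empty result matches the finding above that there are no duplicates.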


Back to the csyslogd now:
A programmer posted a suggested fix for the memory leak in this post from 2012, but I have not had time to check it out.


///////////////////////////////////////////////////

Do you have any information that might be helpful for these issues?  

Thanks,
Robert

rafael...@wazuh.com

Feb 22, 2018, 9:13:01 AM
to Wazuh mailing list
Hi Robert,

Thank you for the information you provided. We have checked that ossec fork, and our code already has the fix.
The memory leak must be somewhere else.

We would be grateful if you could post your syslog configuration and the global configuration from your ossec.conf file.

Thank you!


On Wednesday, February 7, 2018 at 11:24:51 PM UTC+1, Robert H wrote:

Robert H

Feb 22, 2018, 11:55:06 AM
to Wazuh mailing list
Hi Rafael,
Thanks for the update. Glad to know the code has been improved.

Is this what you're referring to for the global conf?  We are not using email notifications.  

<ossec_config>
  <global>
    <jsonout_output>yes</jsonout_output>
    <alerts_log>yes</alerts_log>
    <logall>no</logall>
    <logall_json>no</logall_json>
    <email_notification>no</email_notification>
    <smtp_server>smtp.example.wazuh.com</smtp_server>
    <email_from>oss...@example.wazuh.com</email_from>
    <email_to>reci...@example.wazuh.com</email_to>
    <email_maxperhour>12</email_maxperhour>
  </global>


  <syslog_output>
    <server>127.0.0.1</server>
    <level>3</level>
    <format>cef</format>
  </syslog_output>

We are sending the syslog output to itself, where an HP SmartConnector receives it and forwards it to a SIEM.

Here's an example of the mem usage.

# free -m
             total       used       free     shared    buffers     cached
Mem:         16080      15892        188          0         12        423    (after the manager restarts, the used will reduce to 1.5GB from 15.8 GB)
-/+ buffers/cache:      15456        624  

From top:
 8664 ossecm    20   0 14.4g  14g  664 S  0.0 91.6   0:38.46 ossec-csyslogd    (%mem is 91%)  the next highest mem use is 2.6%


Regards,
Robert

Robert H

Feb 22, 2018, 7:14:07 PM
to Wazuh mailing list
Also Rafael,
Do you have any information about why the remoted service would be stopping?  It seems to me it correlates to this error in the ossec.log.

2018/02/21 16:30:19 ossec-remoted(1403): ERROR: Incorrectly formatted message from agent '139' (host '10.x.x.x')
2018/02/21 16:42:05 ossec-remoted(1403): ERROR: Incorrectly formatted message from agent '032' (host '10
2018/02/21 16:42:05 ossec-remoted(1243): ERROR: Couldn't receive message from peer.
2018/02/21 16:42:09 ossec-remoted(1403): ERROR: Incorrectly formatted message from agent '085' (host '10
2018/02/21 16:42:09 ossec-remoted(1243): ERROR: Couldn't receive message from peer.
2018/02/21 16:42:09 ossec-remoted(1403): ERROR: Incorrectly formatted message from agent '062' (host '10

Also,
Yesterday we tried connecting a couple hundred agents using the API. The manager seems to be having some issues with stability (maybe in part due to the remoted issue), but what is the maximum number of agents you would install at once via the bat/PowerShell script method using the API? We have added 10 or 20 before and it seemed to be okay. Should adding 100 or 300 in a single deployment overload the manager?

Thanks,
Robert

rafael...@wazuh.com

Feb 23, 2018, 9:42:47 AM
to Wazuh mailing list
Hi Robert,

Thanks for posting your configuration. We weren't able to reproduce the issue, so I wanted to know whether you could have a meeting next week so we can look at the issue on your system.

I am awaiting your response.

Best regards.


Robert H

Feb 23, 2018, 1:53:22 PM
to Wazuh mailing list
Hi Rafael,
Thanks for the suggestion.  We may like to do that in the near future.  First a couple updates.

As for csyslogd, I realized last night that something I'd considered during the initial testing months ago may be contributing to the syslog memory issue. When the manager was originally installed and a manager image created, the enable remote syslog option was selected. There is no log information coming to the manager via syslog. So I think it's possible the issue is that the manager was set to receive syslog while also sending syslog out to itself (meant to go to an ArcSight connector and into that SIEM; we will also connect this to the Kibana/Wazuh app). In other words, the manager was sending syslog out to itself and receiving syslog at the same time. I have removed the <remote> syslog configuration and will monitor it to confirm whether the memory issue continues or is gone.

I think I have nailed the second issue, with remoted, down with this example.  I enabled logging on the manager.  This is from the ossec.log file.    When the ERROR occurs it knocks out remoted.  

2018/02/23 10:33:33 ossec-remoted: DEBUG: Agent <name removed> sent HC_STARTUP from 0.0.0.0.
2018/02/23 10:33:33 ossec-remoted: DEBUG: New TCP connection at 10.x.x.x.
2018/02/23 10:33:33 ossec-remoted: DEBUG: Agent <name removed> sent HC_STARTUP from 0.0.0.0.
2018/02/23 10:33:33 wazuh-modulesd:database: DEBUG: Synchronizing file '<removed>/ossec/queue/agent-info/<name removed>-10.x.x.x'
2018/02/23 10:33:33 ossec-remoted: DEBUG: New TCP connection at 10.x.x.x.
2018/02/23 10:33:33 ossec-remoted: DEBUG: Agent <name removed> sent HC_STARTUP from 0.0.0.0.
2018/02/23 10:33:33 ossec-remoted(1403): ERROR: Incorrectly formatted message from agent '259' (host '10.x.x.x').

I checked the status and it's stopped running and events stopped flowing
 managerstatus 
Deleting PID file '<removed>/ossec/var/run/ossec-remoted-30790.pid' not used...
ossec-monitord is running...
ossec-logcollector is running...
ossec-remoted not running...
ossec-syscheckd is running...
ossec-analysisd is running...
ossec-maild not running...
ossec-execd is running...
wazuh-modulesd is running...
ossec-csyslogd is running...

I waited a couple minutes and the alerts started flowing again.  I checked the manager and remoted was running now.

My script checks whether remoted is running and writes a timestamp whenever it restarts the manager. It runs every 6 minutes (*/6).
This timestamp shows the script restarted the manager which got remoted running again and alerts flowing again.

Fri Feb 23 10:36:05 PST 2018

I then checked the manager status after the restart and it's running and alerts are flowing again.
 managerstatus 
ossec-monitord is running...
ossec-logcollector is running...
ossec-remoted is running...
ossec-syscheckd is running...
ossec-analysisd is running...
ossec-maild not running...
ossec-execd is running...
wazuh-modulesd is running...
ossec-csyslogd is running...

Do you have any ideas about why this is happening?

Regards,
Robert

Robert H

Feb 25, 2018, 8:48:17 PM
to Wazuh mailing list
Hi Rafael,
Yes, we are interested in a meeting. Please let me know the time frames available. Also, we have made some progress by finding this information, but we still have a problem with remoted and syslogd stopping. We did a pilot group of about 30 agents and it went fine. Then we added 300 agents, and now we have agents connecting and disconnecting and taking up all the 1514 ports/sockets. We will see all 300 agents showing as disconnected, but a great many ESTABLISHED connections from agent computers on port 1514 at the manager.
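One way to quantify the piled-up connections is to count established sockets on port 1514 per remote address. The sketch below parses text in the column layout printed by `netstat -tn` (protocol, queues, local address, foreign address, state); the function name is illustrative, and treating more than one connection from the same agent address as a sign of stale sockets is an assumption, not something confirmed in this thread.

```python
from collections import Counter


def established_per_peer(netstat_text, port=1514):
    """Count ESTABLISHED TCP connections to the local `port`, per remote IP."""
    counts = Counter()
    for line in netstat_text.splitlines():
        fields = line.split()
        # Expect: proto, recv-q, send-q, local addr, foreign addr, state
        if len(fields) < 6 or not fields[0].startswith("tcp"):
            continue
        local, foreign, state = fields[3], fields[4], fields[5]
        if state != "ESTABLISHED":
            continue
        if local.rsplit(":", 1)[-1] != str(port):
            continue
        counts[foreign.rsplit(":", 1)[0]] += 1
    return counts
```

Sorting the result with `counts.most_common()` would show which agent addresses hold the most connections at the manager.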

This is the post about adjusting CentOS 6, NIC settings.  

Thanks,
Robert

rafael...@wazuh.com

Mar 1, 2018, 3:52:48 AM
to Wazuh mailing list
Hi Robert,

Sorry for the late response; it has been a very busy week. Following up on what you posted, do you have any updates on csyslogd?

We can have a meeting tomorrow, or next week if tomorrow doesn't work for you. The available time frames are 11:00-13:30 and 15:00-19:00 (UTC+1).

Best regards.

Robert H

Mar 6, 2018, 2:25:09 PM
to Wazuh mailing list
Hi Rafael,
Luis and I are working together. We received information about bugs/issues with using the TCP protocol for agent communication in versions 2.x. Since we are at the beginning of the project, we decided to remove all the agents and upgrade everything to the current 3.2.1 version. So hopefully the syslogd memory leak, the remoted malformed-message error, and the TCP race condition will just go away.

Thanks for your help and information.

Regards,
Robert