Improvement Wazuh and Development Language Pack


Mert Nar

Mar 19, 2020, 5:22:01 AM
to Wazuh mailing list
Hello All,
I want to contribute to the Wazuh development process by adding some new modules and a Turkish language pack. To do that, I need to set up a development environment for my forked Wazuh GitHub sources (wazuh, wazuh-app, wazuh-kibana). How can I prepare a complete development environment to test all the distributed packages in one project? How do I run the Wazuh project in a development environment?
Here is what I want to do:
- On the manager side, I want to add a new module that improves the correlation engine with machine learning techniques. I have some ideas about this, and we might write an academic paper on it, but first I need to prepare a test environment.
- Is it possible to develop a language pack? Is there a data file that contains all the English strings? Or are there static files that contain all the menu entries, interface links, and messages?

Thanks,
Mert

Nicolas Papp

Mar 19, 2020, 10:33:30 AM
to Mert Nar, Wazuh mailing list
Hi Mert,
My name is Nicolas and I am part of the core development team at Wazuh. It is great that you want to contribute to the project, and we encourage you to ask for all the help and guidance you need. The first thing I recommend is to set up two VMs, one with a manager and one with an agent. You can rely on our documentation to do that. I recommend starting your changes in core, and then moving on to Kibana and the App once all your changes have been tested in core.


You can also find instructions for installing the Elastic Stack at the following link: https://documentation.wazuh.com/3.9/installation-guide/installing-elastic-stack/index.html

It is my understanding that you want to add a new module for analysisd, which is our correlation engine. You can do that inside the analysisd module itself or as a stand-alone module in wazuh-modules. You can fork our code from GitHub (https://github.com/wazuh/wazuh) and make all the changes you want there. You can also work with the ruleset repository to define new rules.

Regarding a language pack, it will not be an easy task, since we do not really have functionality you can extend to add a new language. You will probably need to redefine:
- All the log messages used in core (which are in the src/error_messages folder).
- All the ruleset descriptions.
- All the installation messages.
You will probably also need to redefine the APIs and Elasticsearch templates.
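A minimal sketch of how the core log messages could be collected into a catalog for translation. It assumes the messages are defined as C string macros of the form `#define NAME "text"`; the macro names and message texts below are made up for illustration and are not taken from the Wazuh sources.

```python
import re

# Assumption: each translatable message is a C macro of the form
#   #define NAME "text"
# Macros whose value is not a string literal are skipped.
DEFINE_RE = re.compile(
    r'^[ \t]*#[ \t]*define[ \t]+(\w+)[ \t]+"((?:[^"\\]|\\.)*)"',
    re.MULTILINE,
)

def extract_messages(header_text):
    """Return {macro_name: message_text} for every string-valued #define."""
    return {name: text for name, text in DEFINE_RE.findall(header_text)}

# Hypothetical header content, for illustration only.
sample_header = '''
#define EXAMPLE_OPEN_ERROR "(0001): Could not open file %s."
#define EXAMPLE_LIMIT 10
#define EXAMPLE_STARTED "(0002): Started (pid: %d)."
'''

catalog = extract_messages(sample_header)
for name, text in sorted(catalog.items()):
    print(name, "->", text)
```

A catalog like this only lists what would need translating. Substituting the translated strings at build time or run time is a separate design decision.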
 
I do not want to overwhelm you with information in just one email. I recommend you to join our Slack channel so we can have a more fluent conversation and we can help you out throughout the process. I can help you get started with the core part and when this is done we can have people from other teams pitch in to help you take your changes all the way up to the Kibana GUI.

Best Regards,
Nicolas Papp


Nicolas Papp

Mar 24, 2020, 10:43:18 AM
to Mert Nar, Wazuh mailing list
Hello Mert,
In this case I wouldn't recommend using Logcollector to pre-process the information you want. You should probably define a new module in wazuh_modules, since the input sources you are mentioning are not all in Logcollector. For example:
- Network traffic: we have an integration with Suricata, which is another open source project: https://documentation.wazuh.com/3.11/learning-wazuh/suricata.html?highlight=suricata

You should probably hook up to those sources, and then in the module translate the data to your vector space (you mentioned n-grams and Markov chains; I am guessing it could also be Petri nets, state machines, etc.). Then you can add a specific decoder on the manager end (decoders/plugins) to process that and define the rules in XML format to check for matches.
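To make the "translate it to your vector space" step concrete, here is a minimal sketch that turns a sequence of observed events into n-grams and estimates first-order Markov transition probabilities from the bigram counts. The event names are hypothetical and nothing here uses a Wazuh API; a real module would read the events from whatever source it hooks into.

```python
from collections import Counter, defaultdict

def ngrams(sequence, n):
    """Return all contiguous n-grams of a sequence as tuples."""
    return [tuple(sequence[i:i + n]) for i in range(len(sequence) - n + 1)]

def transition_probabilities(sequence):
    """Estimate first-order Markov transition probabilities from bigram counts."""
    counts = defaultdict(Counter)
    for current, following in ngrams(sequence, 2):
        counts[current][following] += 1
    return {
        state: {nxt: c / sum(followers.values()) for nxt, c in followers.items()}
        for state, followers in counts.items()
    }

# Hypothetical stream of API-call names observed on an agent.
calls = ["open", "read", "read", "close", "open", "write", "close"]

print(ngrams(calls, 3))
probs = transition_probabilities(calls)
for state in sorted(probs):
    print(state, probs[state])
```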

The only thing that I don't understand is how you will evaluate these Markov chains. Maybe if you send me some of the literature you have been reading I could understand the idea a little better and give you better advice. I worked for a few years in telecommunications, and many of the stochastic processing algorithms there are described by Markov chains too, so I am familiar with the concept, but I still do not completely understand how you plan to use it here.
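One common way to evaluate a Markov chain in this kind of setting, offered only as a guess at what could be meant and not as the method from this thread, is to score a new sequence by its average log-likelihood under a chain learned from known-good behaviour and to flag sequences that score too low. The transition table, the sequences, and the floor value below are all made-up assumptions.

```python
import math

def average_log_likelihood(sequence, transitions, floor=1e-6):
    """Mean log-probability of the observed transitions under the chain.

    Transitions never seen in the baseline get a small floor probability
    (an arbitrary assumption) instead of zero, so the log stays finite.
    """
    pairs = list(zip(sequence, sequence[1:]))
    if not pairs:
        return 0.0
    total = sum(math.log(transitions.get(a, {}).get(b, floor)) for a, b in pairs)
    return total / len(pairs)

# Hypothetical baseline chain learned from normal behaviour.
baseline = {
    "open": {"read": 0.5, "write": 0.5},
    "read": {"read": 0.5, "close": 0.5},
    "write": {"close": 1.0},
    "close": {"open": 1.0},
}

normal = ["open", "read", "close", "open", "write", "close"]
odd = ["open", "close", "close", "write", "read"]

normal_score = average_log_likelihood(normal, baseline)
odd_score = average_log_likelihood(odd, baseline)
print(normal_score, odd_score)
```

A rule could then compare the score against a threshold chosen from validation data; where that threshold should sit depends entirely on the model and dataset.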

Best Regards,
Nicolas


On Tue, Mar 24, 2020 at 4:30 AM Mert Nar <mrt....@gmail.com> wrote:
Hi Nicolas,
Thank you for your email. When I reviewed the logcollector C code directory, I saw that it is about reading files and logs on the system. There are 6 log types, and each function reads logs if it is enabled in the config. However, I can't understand how to collect network traffic or the list of processes running on the agent through their API (or system function) calls.
Let me explain what I want to do here. I am an academic. I have seen that the literature contains many methods that are not used in real life. All of them are freely available and could contribute to cyber security projects like Wazuh. This junction point could be an opportunity to merge them. So, how do we do that?
Researchers have a dataset. They extract features and test their proposed method with as many parameters and as much mathematical background as possible. Then they explain everything and publish the paper as the final step of the theory. That should be the starting point of practice, if we create an environment in which they can test their methods in Wazuh.
This is only possible by giving academics two degrees of freedom: one for the features and one for the model. Features relate to logcollector. They may use different data types in their models, such as network traffic, API call functions, disassembly of executable files, etc. They will want to process their data vectors with techniques such as n-grams, Markov chains, etc. These should be coded in a module placed in analysisd.
First, I want to implement my model, which is based on a Markov chain. It has two versions: one using network traffic data and one using API calls. Once I succeed in collecting this data and sending it to analysisd, I will implement the model and attach it to analysisd.
I hope that this is not a dream and that it is possible.
Thanks,
Mert

On Sun, Mar 22, 2020 at 7:10 AM Nicolas Papp <nicola...@wazuh.com> wrote:
Hi Mert, there is an illustration at https://documentation.wazuh.com/3.10/user-manual/capabilities/log-data-collection/how-it-works.html that can give you a little more perspective:

[Attached image: image.png]
What Logcollector does is basically read log files. Those log files could be on either the manager or the agent, so both components run Logcollector. In the case of an agent, the log line is inserted into an event and sent to the manager. After that, the event is received by Analysisd, which does all the decoding and rule matching; that is where you probably want to hook your module. If a match is produced in analysisd, an alert is generated, and you can even configure a response to be executed on the agent. Check this link if you want more insight on that: https://documentation.wazuh.com/3.10/user-manual/capabilities/active-response/how-it-works.html
This illustration sums it up pretty well:
[Attached image: image.png]
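The flow described above (log line, then event, then decoding, then rule matching, then alert) can be sketched as a toy pipeline. This is only a conceptual illustration under simplified assumptions: the decoder and rule structures are invented Python dictionaries, not Wazuh's real decoder or XML rule formats, and the log line is a made-up example.

```python
import re

# Toy decoders: (name, regex with named groups). Not Wazuh's real decoder format.
DECODERS = [
    ("example_sshd",
     re.compile(r"sshd\[\d+\]: Failed password for (?P<user>\S+) from (?P<srcip>\S+)")),
]

# Toy rules: compare one decoded field to a fixed value. Not Wazuh's XML rules.
RULES = [
    {"id": 1, "decoder": "example_sshd", "field": "user", "equals": "root",
     "description": "Failed password for root"},
]

def decode(log_line):
    """Wrap a raw log line in an event and fill in fields if a decoder matches."""
    event = {"full_log": log_line, "decoder": None, "fields": {}}
    for name, pattern in DECODERS:
        match = pattern.search(log_line)
        if match:
            event["decoder"] = name
            event["fields"] = match.groupdict()
            break
    return event

def match_rules(event):
    """Return one alert per rule whose decoder and field condition match the event."""
    alerts = []
    for rule in RULES:
        if (rule["decoder"] == event["decoder"]
                and event["fields"].get(rule["field"]) == rule["equals"]):
            alerts.append({"rule_id": rule["id"],
                           "description": rule["description"],
                           "srcip": event["fields"].get("srcip")})
    return alerts

line = "Mar 22 07:10:01 host sshd[1234]: Failed password for root from 192.0.2.10 port 22 ssh2"
alerts = match_rules(decode(line))
print(alerts)
```

In this picture, the log is the raw line, the event is the wrapper that carries it (plus any decoded fields) to the manager, and a custom module would sit at the decode or rule-matching stage.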
Please let me know if there is anything else I can help with.
Best Regards,
Nicolas

On Fri, Mar 20, 2020 at 8:23 PM Mert Nar <mrt....@gmail.com> wrote:
Hi Nicolas,
Thank you for your fast response and informative email. It has been a very good start for me.
Before I tried to understand the Wazuh code, I had already set up these virtual machines, so I already have an idea of what it is.
But I can't completely understand the logcollector module. How does it work? What exactly does it collect as a log? There are some defined rules and decoders, okay, but what is the relation between a log and an event? After collecting them, how are logs (or events) related to rules and decoders in the code?
Are a log and an event different or the same?
Thank you for your time, I really appreciate it.
Mert

On Thu, Mar 19, 2020 at 5:33 PM Nicolas Papp <nicola...@wazuh.com> wrote: