Way to search and count different texts in different logfiles


Danny de Waard

Sep 24, 2020, 5:27:41 AM
to Prometheus Users
Hi all,

I am looking for a good way to search for and count text in log files.
For instance, I want to count the words ERROR, WARN, and INFO in a log file.
I also want to be able to count all the different response codes in an access or ssl-request log file:
how many 200, 304, 404, and so on, and maybe also sum the transferred bits to get an average.

I now have a Perl script that counts matches based on regexes and runs every minute, but I'm really looking for an exporter that can do this and publish the results the way node_exporter does, for instance.

Is there any exporter that can do this in a simple way?

Stuart Clark

Sep 24, 2020, 5:32:09 AM
to Danny de Waard, Prometheus Users
This sounds like a good fit for mtail or the grok exporter:

https://github.com/google/mtail
https://github.com/fstab/grok_exporter

--
Stuart Clark

Danny de Waard

Sep 24, 2020, 6:57:45 AM
to Prometheus Users
I did look at grok, but it looked so complex.
I'm now looking at mtail, but my first question is: can I monitor multiple files with multiple rules in one mtail?

For instance, I want to count [WARN ] and [ERROR] tags in 5 different log files, and the metric output has to be separated per file, like this:

{LogA_ERROR_Metric} nn
{LogA_WARN_Metric} nn

{LogB_ERROR_Metric} nn
{LogB_WARN_Metric} nn

{LogC_ERROR_Metric} nn
{LogC_WARN_Metric} nn

On Thursday, September 24, 2020 at 11:32:09 UTC+2, Stuart Clark wrote:

Danny de Waard

Sep 24, 2020, 7:21:15 AM
to Prometheus Users
Okay... I made a simple test like this.

counter error_messages_in_log
counter warn_messages_in_log
counter total_messages_in_log

# To make ex_test.go happy
#strptime("2017-12-07T16:07:14Z", "2006-01-02T15:04:05Z07:00")

/(.*)/ {
  $1 =~ /ERROR/ {
    error_messages_in_log++
  }
  $1 =~ /WARN/ {
    warn_messages_in_log++
  }
  total_messages_in_log++
}

It counts ERROR and WARN in a file and exposes the metrics.
It does this for one file; how can I monitor multiple files like this?
Can I incorporate filenames?

On Thursday, September 24, 2020 at 12:57:45 UTC+2, Danny de Waard wrote:

Michael Ströder

Sep 24, 2020, 7:52:43 AM
to Danny de Waard, Prometheus Users
On 9/24/20 1:21 PM, Danny de Waard wrote:
> Okay... I made a simple test like this.
>
> counter error_messages_in_log
> counter warn_messages_in_log
> counter total_messages_in_log

Why not use labels for the log level?

Here's what I'm doing in my Æ-DIR for counting messages per log level
written by various Python software:

https://gitlab.com/ae-dir/ansible-ae-dir-server/-/blob/master/templates/mtail/aedir_proc.mtail.j2
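
A minimal sketch of the label idea, not copied from the linked template: one counter with a level label replaces the separate per-level counters. The metric name log_messages_total, the label name level, and the bracketed tag format (based on the [WARN ] and [ERROR] tags mentioned earlier in the thread) are assumptions for illustration.

```mtail
# Hypothetical metric and label names; adjust to your own conventions.
counter log_messages_total by level

# Match a bracketed level tag such as "[ERROR]" or "[WARN ]".
# The named capture group becomes the label value.
/\[(?P<level>ERROR|WARN|INFO) *\]/ {
  log_messages_total[$level]++
}
```

Note that this pattern picks up only the first tag on each line, so a line carrying two tags is counted once, under the first level found.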

Ciao, Michael.

Danny de Waard

Sep 25, 2020, 9:28:22 AM
to Prometheus Users
That helped me some, Michael.

I have made a complete regex for my ssl_request_log and I am able to use these fields.

counter apache_http_requests_total by request_method, http_version, response_code
counter apache_http_bytes_total by request_method, http_version, response_code
counter apache_http_ip_total by request_method, http_version, IP
counter apache_http_endpoint_total by request_method, http_version, endpoint

/^(?P<IP>\d+\.\d+\.\d+\.\d+)\ +(?P<hostname>[0-9A-Za-z-\.]+)\ +-\ +-\ +\[(?P<timestamp>\d{2}\/\w{3}\/\d{4}:\d{2}:\d{2}:\d{2} \+\d{4})\]\ +(?P<tls_version>TLSv\d\.\d)\ +(?P<cipher>[0-9A-Za-z-]+)\ +\"(?P<request_method>[A-Z]+)\ +(?P<endpoint>[A-Za-z\/_\.]+)\ +(?P<http_version>HTTP\/\d\.\d)\"\ +(?P<response_code>\d{3})\ +(?P<response_size>[\d-]+)\ +(?P<time_to_serve>\d+)\ +\"(?P<url>[0-9A-Za-z-\.:?=\/_&+]+)\"\ +\"(?P<browser>[0-9A-Za-z-\.:?=\/_;\(\) ]+)\"\ +\"(?P<stat1>[\w-]+)\"\ +\"(?P<stat2>[\w-]+)\"/ {

  apache_http_requests_total[$request_method][$http_version][$response_code]++
  $response_size > 0 {
      apache_http_bytes_total[$request_method][$http_version][$response_code] += $response_size
      apache_http_ip_total[$request_method][$http_version][$IP]++
      apache_http_endpoint_total[$request_method][$http_version][$endpoint]++
  }
}

So that is working.

Now I'm left with the question of how to most simply use an mtail program to scan for text in different files with different setups.
I now have this:

counter ERROR_in_log by log_file
counter WARN_in_log by log_file
counter API_Inc_in_log by log_file
counter total_hits_in_log by log_file


/(.*)/ {
  $1 =~ /ERROR/ {
    ERROR_in_log[getfilename()]++
  }
  $1 =~ /WARN/ {
    WARN_in_log[getfilename()]++
  }
  $1 =~ /Incoming API request from/ {
    API_Inc_in_log[getfilename()]++
  }
  total_hits_in_log[getfilename()]++
}

But I think it can be streamlined further.
Also, I suspect that if a log line contains both ERROR and WARN, the WARN is not counted.
So if I add more elements, some might be skipped.
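
On the ERROR/WARN suspicion: as far as I understand mtail's semantics, sibling pattern blocks that are not chained with else are each tested against the line, so a line containing both words should increment both counters. One possible leaner layout, as a sketch only, folds the three match counters into a single counter with an extra label and drops the outer /(.*)/ wrapper. The metric names log_matches_total and log_lines_total and the label name kind are made up for illustration; getfilename() is the builtin already used above.

```mtail
# Hypothetical names: one counter for all match kinds, one for line totals.
counter log_matches_total by log_file, kind
counter log_lines_total by log_file

# Each block is tested independently, so one line can hit several of them.
/ERROR/ {
  log_matches_total[getfilename()]["ERROR"]++
}
/WARN/ {
  log_matches_total[getfilename()]["WARN"]++
}
/Incoming API request from/ {
  log_matches_total[getfilename()]["api_incoming"]++
}

# /$/ matches every line, including empty ones.
/$/ {
  log_lines_total[getfilename()]++
}
```

Each log_file/kind pair then becomes its own time series. A single mtail process can follow several files when its --logs flag is given a comma-separated list or a glob, so one program like this should cover all of them.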

On Thursday, September 24, 2020 at 13:52:43 UTC+2, Michael Ströder wrote:

Michael Ströder

Sep 25, 2020, 10:06:45 AM
to Danny de Waard, Prometheus Users
On 9/25/20 3:28 PM, Danny de Waard wrote:
> That helped me some Michael.
>
> I have made a complete regex for my ssl_request_log and i am able to use
> these fields.

I didn't get that you're parsing Apache httpd logs.

There are examples available for various service logs:

https://github.com/google/mtail/tree/master/examples

Ciao, Michael.

Danny de Waard

Sep 25, 2020, 10:26:44 AM
to Michael Ströder, Prometheus Users
I did see those as well. My Apache log is somewhat different.
But now I'm interested in part 2 of my question.
Can that be made leaner and meaner?

On Fri, Sep 25, 2020 at 16:06, Michael Ströder <mic...@stroeder.com> wrote: