Are there json format files available somewhere for built-in formats?


Piotr Dobrogost

Dec 3, 2020, 9:08:59 AM
to lnav
Hi,

What timestamp formats are accepted by the built-in "generic_log" format, and is this documented somewhere?
At https://lnav.readthedocs.io/en/latest/formats.html# there is a format called "block_log", but there is no "generic_log", which is what lnav displays when I open the log files I'm working with.
Btw, what is the difference between "generic_log" and "block_log"?

Are there json format files available somewhere for built-in formats?

Regards,
Piotr Dobrogost

Timothy Stack

Dec 3, 2020, 1:09:24 PM
to Piotr Dobrogost, lnav
On Thu, Dec 3, 2020 at 6:09 AM Piotr Dobrogost <p.dob...@gmail.com> wrote:
Hi,

What timestamp formats are accepted by built-in "generic_log" format and is this documented somewhere?

The default set of timestamp formats used by all formats is here:


The formats are "compiled" at build time into functions so the format
string is not interpreted at run time.

At https://lnav.readthedocs.io/en/latest/formats.html# there is a format called "block_log" but there is no "generic_log" which is being displayed by lnav when opening log files I'm working with.

The builtin formats that are regex-based are here:


The "generic_log" format is implemented by this C++ code:


Btw, what is the difference between "generic_log" and "block_log"?

The block_log is used for files where a log message is a line with just
the date followed by the actual data.  So, something like:

    Mon Sep 24 21:02:12 PDT 2018

    Removing old temporary files:

    Cleaning out old system announcements:

    Removing stale files from /var/rwho:
 
Whereas the generic_log is used for files where there is a timestamp
followed by a log level and some text.  The generic_log should only
be used if all of the other log formats failed to match.
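
As a rough illustration of the block_log layout described above, the sketch below groups lines into messages, starting a new message at each line that contains only a date. It is written in Python for brevity and is not lnav's implementation; `DATE_LINE` and `group_blocks` are hypothetical names, and the pattern only covers the date(1)-style stamp from the sample.

```python
import re

# Hypothetical pattern for a line holding only a date(1)-style stamp,
# e.g. "Mon Sep 24 21:02:12 PDT 2018". lnav's real pattern may differ.
DATE_LINE = re.compile(
    r"^[A-Z][a-z]{2} [A-Z][a-z]{2} +\d{1,2} \d{2}:\d{2}:\d{2} [A-Z]{2,5} \d{4}$"
)


def group_blocks(lines):
    """Group lines into (date_line, body_lines) pairs."""
    blocks = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # blank separator lines carry no data
        if DATE_LINE.match(line):
            blocks.append((line, []))  # a date-only line opens a new message
        elif blocks:
            blocks[-1][1].append(line)  # anything else belongs to the open message
    return blocks


sample = [
    "Mon Sep 24 21:02:12 PDT 2018",
    "",
    "Removing old temporary files:",
    "",
    "Cleaning out old system announcements:",
    "",
    "Removing stale files from /var/rwho:",
]
blocks = group_blocks(sample)
```

With the sample from this message, `blocks` holds a single entry whose body is the three "Removing/Cleaning" lines.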

Are there json format files available somewhere for built-in formats?

The format files are written to ~/.lnav/formats/default/*.sample when lnav
starts up.
Regards,
Piotr Dobrogost

tim

Piotr Dobrogost

Dec 3, 2020, 2:54:19 PM
to lnav
On Thursday, December 3, 2020 at 7:09:24 PM UTC+1 Tim wrote:
On Thu, Dec 3, 2020 at 6:09 AM Piotr Dobrogost wrote:
Hi,

What timestamp formats are accepted by built-in "generic_log" format and is this documented somewhere?

The default set of timestamp formats used by all formats is here:


What function do you use to parse these? I'm asking in the context of specifiers used which seem to be a superset of those used with strptime as shown here https://pubs.opengroup.org/onlinepubs/9699919799/functions/strptime.html

Am I right that the 2020-12-02 03:47:17,360 format is not on the list?
If so, how does lnav still manage to treat a log file with timestamps in that format as "generic_log"?

Is there any provision for timestamps enclosed in delimiters?
For instance our log format has timestamps enclosed in square brackets - "[2020-12-02 03:47:17,360]".


Regards,
Piotr Dobrogost

Timothy Stack

Dec 3, 2020, 3:38:29 PM
to Piotr Dobrogost, lnav
On Thu, Dec 3, 2020 at 11:54 AM Piotr Dobrogost <p.dob...@gmail.com> wrote:
On Thursday, December 3, 2020 at 7:09:24 PM UTC+1 Tim wrote:
On Thu, Dec 3, 2020 at 6:09 AM Piotr Dobrogost wrote:
Hi,

What timestamp formats are accepted by built-in "generic_log" format and is this documented somewhere?

The default set of timestamp formats used by all formats is here:


What function do you use to parse these? I'm asking in the context of specifiers used which seem to be a superset of those used with strptime as shown here https://pubs.opengroup.org/onlinepubs/9699919799/functions/strptime.html

The parser for each conversion is in here:


The parser used at runtime for a given format is:


The date_time_scanner in the following file does the work of checking
a string against a collection of formats:

Am I seeing right that 2020-12-02 03:47:17,360 format is not on the list?

The "%Y-%m-%d %H:%M:%S" format should match that, the millisecond
portion is handled automatically by the date_time_scanner.
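
To make the idea concrete, here is a minimal Python sketch of "try each format in turn, then look at what is left over for a fractional part". It is only an approximation of the behaviour described above, not a port of date_time_scanner; the `CANDIDATES` list, the `FRACTION` pattern, and `scan_timestamp` are all hypothetical names, and the hand-written prefix regexes exist only because Python's `strptime` rejects trailing text.

```python
import re
from datetime import datetime

# Each entry pairs a prefix regex with the strptime format that parses it.
# This is an illustrative subset, not lnav's real format list.
CANDIDATES = [
    (re.compile(r"\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}"), "%Y-%m-%d %H:%M:%S"),
    (re.compile(r"\d{1,2} [A-Za-z]{3} \d{4} \d{2}:\d{2}:\d{2}"), "%d %b %Y %H:%M:%S"),
]

# Assumed leftover handling: a period or comma followed by up to six digits.
FRACTION = re.compile(r"[.,](\d{1,6})")


def scan_timestamp(text):
    """Return (datetime, remaining_text), or None if no format matches."""
    for prefix_re, fmt in CANDIDATES:
        prefix = prefix_re.match(text)
        if prefix is None:
            continue
        try:
            stamp = datetime.strptime(prefix.group(0), fmt)
        except ValueError:
            continue  # looked right but was not a valid date; try the next format
        end = prefix.end()
        fraction = FRACTION.match(text, end)
        if fraction is not None:
            # Pad to microseconds: "360" -> 360000
            stamp = stamp.replace(microsecond=int(fraction.group(1).ljust(6, "0")))
            end = fraction.end()
        return stamp, text[end:]
    return None


result = scan_timestamp("2020-12-02 03:47:17,360 INFO started")
```

In this sketch the same base format accepts both `,360` and `.360`, because the fraction is matched separately from the format string.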

If so then how lnav still manages to treat log file with timestamps in such format as "generic_log" format?

Is there any provision for timestamps enclosed in delimiters?
For instance our log format has timestamps enclosed in square brackets - "[2020-12-02 03:47:17,360]".

The generic_log will try a few different formats; some of them have
brackets:

If you have a custom log format, have you tried to define a custom format
for lnav?  I can help with that if you provide some sample log messages.
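
For the bracketed timestamps mentioned above, a pattern along these lines could serve as a starting point when drafting a custom format. This is a hedged sketch in Python: only the timestamp shape "[2020-12-02 03:47:17,360]" comes from this thread, while the level and body layout after it, the sample line, and the `LINE_RE` name are assumptions. Python spells named groups `(?P<name>...)`, which may need adjusting for the regex dialect lnav uses.

```python
import re

# Optional square brackets around the timestamp, then an assumed
# "LEVEL message" layout. Adjust to the real log lines.
LINE_RE = re.compile(
    r"^\[?(?P<timestamp>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}[,.]\d{3})\]?"
    r"\s+(?P<level>[A-Z]+)\s+(?P<body>.*)$"
)

# Hypothetical sample line; only the timestamp is taken from the thread.
match = LINE_RE.match("[2020-12-02 03:47:17,360] INFO worker started")
fields = match.groupdict() if match else {}
```

Checking a pattern like this against a handful of real lines before putting it in a format file makes it easier to see which part fails to match.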

tim

 

 

Regards,
Piotr Dobrogost

--
You received this message because you are subscribed to the Google Groups "lnav" group.
To unsubscribe from this group and stop receiving emails from it, send an email to lnav+uns...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/lnav/70764f40-05d8-499e-a6ac-943dec8665a1n%40googlegroups.com.

Piotr Dobrogost

Dec 4, 2020, 2:31:33 PM
to lnav
On Thursday, December 3, 2020 at 9:38:29 PM UTC+1 Tim wrote:
On Thu, Dec 3, 2020 at 11:54 AM Piotr Dobrogost wrote:
On Thursday, December 3, 2020 at 7:09:24 PM UTC+1 Tim wrote:
On Thu, Dec 3, 2020 at 6:09 AM Piotr Dobrogost wrote:
Hi,

What timestamp formats are accepted by built-in "generic_log" format and is this documented somewhere?

The default set of timestamp formats used by all formats is here:


What function do you use to parse these? I'm asking in the context of specifiers used which seem to be a superset of those used with strptime as shown here https://pubs.opengroup.org/onlinepubs/9699919799/functions/strptime.html

The parser for each conversion is in here:


Thank you for providing references to source code. Still, I guess conversions you support in timestamps' formats are based on some well known ones used when parsing date and time. If so can you point me to any documentation on these conversions? I guess it would be easier for most people to read their description in plain English rather than dive into source code to find out what they are supposed to mean.
 

Am I seeing right that 2020-12-02 03:47:17,360 format is not on the list?

The "%Y-%m-%d %H:%M:%S" format should match that, the millisecond
portion is handled automatically by the date_time_scanner.

If milliseconds are handled elsewhere (external to formats), how is it that some timestamp formats in Makefile.am include the .%f and .%L conversions?
Speaking of milliseconds: is a comma supported everywhere a period is after the %S conversion? I see only one pair of formats in Makefile.am that seems to support this:
"%d %b %Y %H:%M:%S.%L"
"%d %b %Y %H:%M:%S,%L"
What about SCRUB_PATTERN (https://github.com/tstack/lnav/blob/8494aefd5072ddbc689957f3e4210f550ec7d49b/src/log_format_impls.cc#L86) which seems to handle only comma and not period?
 

The generic_log will try a few different formats, there are some that have


How can I know from within UI that lnav managed to parse timestamps in my log file correctly?

I would suggest keeping order of characters in the character class in the patterns at https://github.com/tstack/lnav/blob/8494aefd5072ddbc689957f3e4210f550ec7d49b/src/log_format_impls.cc#L92 the same across patterns so that the difference would be easier to notice. For instance the difference between [\\w:,/\\.-]+ and the following [\\w: \\.,/-]+ is only extra space in the latter yet as other characters are in different order it's harder to compare them and notice the difference.
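
To show the point about ordering, here is a small Python sketch that compares the two character classes as sets. The class bodies are the ones quoted above with the C++ string escaping removed (`\\w` becomes `\w`); `class_members` is a hypothetical helper that treats each escape as one member and does not understand ranges such as `a-z`.

```python
import re


def class_members(body):
    """Split the inside of a character class into members, keeping escapes whole."""
    return set(re.findall(r"\\.|.", body))


first = class_members(r"\w:,/\.-")
second = class_members(r"\w: \.,/-")

only_in_first = first - second
only_in_second = second - first
```

Here `only_in_second` contains just the space character and `only_in_first` is empty, which is hard to see by eye when the members are listed in a different order.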

Is the plan to use raw strings for regexes (to improve readability) everywhere? I noticed it's being used at https://github.com/tstack/lnav/blob/8494aefd5072ddbc689957f3e4210f550ec7d49b/src/log_format_impls.cc#L102 but not in other places.

If you have a custom log format, have you tried to define a custom format
for lnav?  I can help with that if you provide some sample log messages.

Thank you very much for your offer to help with a custom format. I guess that's what might be necessary in the end. However I wanted to know what's available out of the box as our log format is pretty standard (comes from Python logging module in the standard library) thus I somehow expected it to be handled already.

Best regards,
Piotr Dobrogost

Timothy Stack

Dec 4, 2020, 3:45:28 PM
to Piotr Dobrogost, lnav
On Fri, Dec 4, 2020 at 11:31 AM Piotr Dobrogost <p.dob...@gmail.com> wrote:
On Thursday, December 3, 2020 at 9:38:29 PM UTC+1 Tim wrote:

The parser for each conversion is in here:


Thank you for providing references to source code. Still, I guess conversions you support in timestamps' formats are based on some well known ones used when parsing date and time. If so can you point me to any documentation on these conversions?

I think it's covered under the "timestamp-format" description in here:

 
I guess it would be easier for most people to read their description in plain English rather than dive into source code to find out what they are supposed to mean.
 

Am I seeing right that 2020-12-02 03:47:17,360 format is not on the list?

The "%Y-%m-%d %H:%M:%S" format should match that, the millisecond
portion is handled automatically by the date_time_scanner.

If milliseconds are handled elsewhere (external to formats) how is it that some timestamp formats in Makefile.am include .%f .%L conversions?

That might just be an oversight, or those may be cases where the automatic
handling doesn't quite work.
 
Talking about milliseconds; is comma supported everywhere period is after %S conversion?

It's not necessarily after the '%S'; the date_time_scanner just looks at what
remains after the previous parsing has finished.
 
I see only one pair of formats in the Makefile.am which seems to support this:
"%d %b %Y %H:%M:%S.%L"
"%d %b %Y %H:%M:%S,%L"
What about SCRUB_PATTERN (https://github.com/tstack/lnav/blob/8494aefd5072ddbc689957f3e4210f550ec7d49b/src/log_format_impls.cc#L86) which seems to handle only comma and not period?

The scrub functionality was an early feature that was turned off and hasn't been
re-enabled.  It is somewhat obsoleted by the ":hide-fields" command.

The generic_log will try a few different formats, there are some that have


How can I know from within UI that lnav managed to parse timestamps in my log file correctly?

Press 'p' to have lnav display the results of parsing the top line in the view.  It
looks like this:

[Attached screenshot: Screen Region 2020-12-04 at 12.30.21.png]

The "Received Time" is the timestamp that lnav computed.  Note that the
year was correctly determined even though it is not in the syslog message
timestamp.  In cases like that, lnav uses the timestamp from the file's
last-modified-time (or gzip header).
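
The year inference can be sketched roughly as follows in Python. This is not lnav's code: `fill_in_year` and `reference_time_for` are hypothetical helpers, and the rule of stepping back one year when the result would fall after the reference time is an assumption added here to cope with files that span New Year.

```python
import os
from datetime import datetime


def reference_time_for(path):
    """Use the file's last-modified time as the reference point."""
    return datetime.fromtimestamp(os.path.getmtime(path))


def fill_in_year(stamp_without_year, reference):
    """Parse a syslog-style stamp such as 'Dec 4 12:30:21' using the reference year."""

    def parse(year):
        # Prepending the year avoids parsing a day-of-month without any year.
        return datetime.strptime(f"{year} {stamp_without_year}", "%Y %b %d %H:%M:%S")

    candidate = parse(reference.year)
    if candidate > reference:
        # Assumed rollover rule: a message cannot be newer than the file,
        # so it must belong to the previous year.
        candidate = parse(reference.year - 1)
    return candidate


same_year = fill_in_year("Dec 4 12:30:21", datetime(2020, 12, 4, 13, 0, 0))
rolled_back = fill_in_year("Dec 31 23:59:00", datetime(2021, 1, 1, 0, 5, 0))
```

For a real file, `fill_in_year(stamp, reference_time_for(path))` would combine the two helpers; month names are parsed according to the current locale.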

The other information shows the regex pattern that is being used and the
extracted fields and their values.

You could also execute a SQL query and see the results there:

  ;SELECT * FROM generic_log LIMIT 10

I would suggest keeping order of characters in the character class in the patterns at https://github.com/tstack/lnav/blob/8494aefd5072ddbc689957f3e4210f550ec7d49b/src/log_format_impls.cc#L92 the same across patterns so that the difference would be easier to notice. For instance the difference between [\\w:,/\\.-]+ and the following [\\w: \\.,/-]+ is only extra space in the latter yet as other characters are in different order it's harder to compare them and notice the difference.

Is the plan to use raw strings for regexes (to improve readability) everywhere? I noticed it's being used at https://github.com/tstack/lnav/blob/8494aefd5072ddbc689957f3e4210f550ec7d49b/src/log_format_impls.cc#L102 but not in other places.

lnav is an old code base that has existed since before C++11.  Migrating
it to new language features is an ongoing process.
 
If you have a custom log format, have you tried to define a custom format
for lnav?  I can help with that if you provide some sample log messages.

Thank you very much for your offer to help with a custom format. I guess that's what might be necessary in the end. However I wanted to know what's available out of the box as our log format is pretty standard (comes from Python logging module in the standard library) thus I somehow expected it to be handled already.

I didn't think Python's default log format even included a timestamp...  Many
Python programs end up customizing their log format, so I wouldn't know
how to handle it in anything but generic_log.
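
For reference, a short Python sketch of where such timestamps come from in the standard logging module. The default `logging.BASIC_FORMAT` has no timestamp, but a formatter that includes `%(asctime)s` without a `datefmt` produces the "2020-12-02 03:47:17,360" shape. The bracketed layout, logger name, and message below are assumptions made for the example, not the format from the question.

```python
import logging
import re

# The stock format used by logging.basicConfig() carries no timestamp.
default_format = logging.BASIC_FORMAT

# Hypothetical application format with a bracketed %(asctime)s.
formatter = logging.Formatter("[%(asctime)s] %(levelname)s %(message)s")

# Build a record by hand so the example needs no handlers or global config.
record = logging.LogRecord(
    name="demo",
    level=logging.INFO,
    pathname="demo.py",
    lineno=1,
    msg="service started",
    args=None,
    exc_info=None,
)
line = formatter.format(record)

# With no datefmt, asctime is "YYYY-MM-DD HH:MM:SS,mmm" (comma before the milliseconds).
looks_bracketed = re.fullmatch(
    r"\[\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}\] INFO service started", line
)
```

Since each program chooses its own layout around `%(asctime)s`, only the timestamp portion is predictable.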

tim

Best regards,
Piotr Dobrogost
