Syslog with @cee json and fluentd.

Christian Hedegaard

Jan 28, 2014, 9:05:12 PM

to flu...@googlegroups.com

Guys, we have a system currently that uses rsyslog to pass json messages using the @cee syntax. We currently have a pipe that takes these and puts them into an elasticsearch cluster. We would like to replace this pipeline with fluentd but aren’t sure what the best way to do this is.

Does the default in_syslog plugin support @cee JSON, and will it put the messages into an ES index properly formatted and in the logstash format? Or will we have to stop using JSON and find something else, such as a fluentd plugin for rsyslog or for our application that currently formats the JSON for syslog?

Am I making sense? It’s the end of the day and I’ve been working on Elasticsearch all day long so my thoughts are kinda cloudy :)

Masahiro Nakagawa

Jan 29, 2014, 3:25:31 AM

to flu...@googlegroups.com

Hi Christian,

I'm not familiar with rsyslog's @cee JSON syntax.

Could you show me an example?

Is only the body JSON, or is the entire content JSON?

<6>Sep 11 00:00:00 localhost logger: {message json}

or

{"pri": 6, "time": ..., "host": "localhost"}

?

If you can change the format to the simple syslog format, it should be easy to handle with fluentd.

Thanks

Masahiro


Christian Hedegaard

Jan 29, 2014, 2:05:07 PM

to flu...@googlegroups.com

http://www.rsyslog.com/tag/cee/

Here is an example @cee log message from our chat server:

Jan 29 19:03:17 chat-01 borogrove: @cee: {"es-service": "chat", "from": "blah", "relayed-for": "CHS", "to": "BOB", "ts": "2014-01-29 19:03:17.789740", "logged_at": 1391022197, "type": "presence", "msg": "disconnected"}

The “@cee:” marker tells syslog and/or a parser that JSON begins right after it.
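As a rough illustration of that layout, here is a minimal Python sketch that splits such a line into its syslog header and its JSON payload. It is not fluentd or rsyslog code; the function name `parse_cee_line` and the regular expression are assumptions made for this example, and the pattern only covers lines shaped like the sample above.

```python
import json
import re

# Assumed layout, taken from the sample line: "<timestamp> <host> <ident>: @cee: <json>"
CEE_LINE = re.compile(
    r"^(?P<time>\w{3}\s+\d{1,2} \d{2}:\d{2}:\d{2}) "
    r"(?P<host>\S+) (?P<ident>[^:\s]+): @cee:\s*(?P<payload>\{.*\})\s*$"
)


def parse_cee_line(line):
    """Return (header, record) for an @cee line, or None if the line does not match."""
    match = CEE_LINE.match(line)
    if match is None:
        return None
    try:
        record = json.loads(match.group("payload"))
    except ValueError:
        # The marker was present but the payload was not valid JSON.
        return None
    header = {key: match.group(key) for key in ("time", "host", "ident")}
    return header, record
```

Lines without the marker, or with a payload that is not valid JSON, come back as None, so ordinary syslog traffic can fall through to whatever handles plain messages.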

Kiyoto Tamura

Jan 29, 2014, 2:08:39 PM

to flu...@googlegroups.com

Hi Christian,

Just to clarify, this is basically a syslog format where the "message" section is JSON, right?

Kiyoto

Christian Hedegaard

Jan 29, 2014, 2:16:31 PM

to flu...@googlegroups.com

Correct.

Right now we use this with a simple in-house parser to take these messages and stick them into elasticsearch using the bulk api. The parser looks at the “es-service” field and inserts messages into that index (in this case “chat”). From what I’ve seen with fluentd, however, you can send syslog to fluentd and then into ES but I’ve not found a way to send messages into specific indexes based on the content of the message.
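To make the index-per-message idea concrete, here is a minimal Python sketch of how a parser could turn decoded records into a bulk request body, picking the index from the “es-service” field. It is not the in-house parser described above; `bulk_body` and the fallback index name `syslog` are hypothetical, and depending on the Elasticsearch version the action line may need more fields (for example a `_type`).

```python
import json


def bulk_body(records, index_field="es-service", default_index="syslog"):
    """Build a newline-delimited bulk body: one action line and one source line per record."""
    lines = []
    for record in records:
        # The target index comes from the message content itself.
        index = record.get(index_field, default_index)
        lines.append(json.dumps({"index": {"_index": index}}))
        lines.append(json.dumps(record))
    # The bulk body must end with a newline.
    return "\n".join(lines) + "\n"
```

The returned string would be sent as the body of a bulk request; sending it is left out here to keep the sketch free of client-library assumptions.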

We have a large ES cluster and we put certain data into certain indexes. Everything currently flows through syslog, but I’m trying to change that. I want to use fluentd as our pipeline, which ultimately will not just be used for syslog but other things such as app tracebacks and other messages in our environment.

What I want to be able to do is have syslog relay all these messages into fluentd on the local host (for example: on a chat server), and the chat-specific messages would go into a chat index in ES for searching, but the rest of the system logs would just be put to disk and backed up into S3.

So far all I can figure out is how to put everything to disk/S3 and/or everything into elasticsearch. I can’t figure out how to tag certain messages one way and other messages a different way so that when they end up on our centralized server some make it into ES and some make it to disk.
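Since fluentd routes events by tag, the split described here comes down to choosing a tag per message based on its content. The Python sketch below only illustrates that decision; it is not a fluentd plugin, and the function name `choose_tag` and the tag names `es.<service>` and `archive.syslog` are made up for this example.

```python
import json

CEE_COOKIE = "@cee:"


def choose_tag(message):
    """Return a hypothetical routing tag for the message part of one syslog line."""
    _, cookie, payload = message.partition(CEE_COOKIE)
    if cookie:
        try:
            record = json.loads(payload)
        except ValueError:
            record = None
        # Only @cee messages that name a service are meant for Elasticsearch.
        if isinstance(record, dict) and "es-service" in record:
            return "es." + str(record["es-service"])
    # Everything else is treated as a plain system log bound for disk/S3.
    return "archive.syslog"
```

With tags like these, the centralized server could match `es.*` to an Elasticsearch output and `archive.*` to a file or S3 output, assuming the messages are re-tagged before they are forwarded.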

Compounded on top of all of this is that we have a lot of messages going through syslog with the @cee json structure, which is what allows us to cherry-pick messages and put them into different ES indices. I want to do this with fluentd instead.

Masahiro Nakagawa

Feb 3, 2014, 2:35:23 AM

to flu...@googlegroups.com

> Compounded on top of all of this is that we have a lot of messages going through syslog with the @cee json structure, which is what allows us to cherry-pick messages and put them into different ES indices. I want to do this with fluentd instead.

I see.
I think supporting CEE JSON is not difficult, but I'm not sure whether CEE JSON is popular.
The formats the parser currently supports are popular ones, e.g. json, apache, csv, tsv.

If syslog with JSON is popular, we can add such a parser for the syslog format.

Masahiro

Kiyoto Tamura

Feb 3, 2014, 2:52:50 AM

to flu...@googlegroups.com

For what it's worth, if we can support this format, nxlog -> Fluentd will be much easier and more robust (http://docs.fluentd.org/articles/windows)

Kiyoto

Tim Gunter

Jun 17, 2014, 6:33:17 PM

to flu...@googlegroups.com, chede...@red5studios.com

Hey all.

I wrote a fluent output filter to decode arbitrary fields in messages. I am using it to decode JSON encoded syslog message fields in my own configurations.
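For readers unfamiliar with the idea, decoding a field means taking a record whose message field holds a JSON string and expanding it into real fields. The Python sketch below shows that general technique only; it is not the plugin mentioned above, and `decode_field`, the `message` field name, and the handling of the `@cee:` prefix are assumptions for illustration.

```python
import json


def decode_field(record, field="message", prefix="@cee:"):
    """Return a copy of record with the JSON in `field` merged in; unchanged if it does not decode."""
    raw = record.get(field)
    if not isinstance(raw, str):
        return dict(record)
    text = raw.strip()
    if text.startswith(prefix):
        text = text[len(prefix):]
    try:
        decoded = json.loads(text)
    except ValueError:
        return dict(record)
    if not isinstance(decoded, dict):
        return dict(record)
    # Drop the raw field and promote the decoded keys to top-level fields.
    merged = {key: value for key, value in record.items() if key != field}
    merged.update(decoded)
    return merged
```

Records whose field is missing, is not a string, or does not hold a JSON object are returned as an unchanged copy, so plain syslog messages pass through untouched.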