kafka-connect-syslog - How to preserve the original syslog event without any modification


Guilhem Marchand

Aug 19, 2018, 6:19:26 PM
to Confluent Platform

Hello,

I am trying to implement a syslog data collection to Kafka using the Kafka connect with the syslog connector from:

https://github.com/rmoff/kafka-connect-syslog

After lots and lots of attempts, I finally managed to produce and consume the syslog data, BUT the message that ends up in Kafka always carries some level of modification compared to the original syslog event.

Example of a raw event sent from a rsyslog client:

<86>Aug 19 21:51:34 ip-10-0-0-xxx sshd[31126]: pam_unix(sshd:session): session closed for user ubuntu

With the string converter, using the following properties:

key.converter=org.apache.kafka.connect.storage.StringConverter
value.converter=org.apache.kafka.connect.storage.StringConverter
key.converter.schemas.enable=false
value.converter.schemas.enable=false

Example of events produced to the kafka topic:

Struct{date=Sun Aug 19 21:23:33 UTC 2018,facility=3,host=ip-10-0-0-6,level=6,message=ip-10-0-0-6 systemd[1]: Started Session 897 of user ubuntu.,charset=UTF-8,remote_address=ip-172-18-0-1.eu-west-2.compute.internal/172.18.0.1:35656,hostname=ip-172-18-0-1.eu-west-2.compute.internal}
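
(The Struct{...} text is a side effect of the string converter: the connector emits a structured Connect record, and the string converter simply calls toString() on it, so what lands in the topic is the record's debug representation rather than the original line.)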

With the JSON converter, using the following properties:

key.converter=org.apache.kafka.connect.json.JsonConverter
value.converter=org.apache.kafka.connect.json.JsonConverter
key.converter.schemas.enable=false
value.converter.schemas.enable=false

Example:

{"date":1534714091000,"facility":10,"host":"ip-10-0-0-6","level":6,"message":"ip-10-0-0-6 sshd[30008]: pam_unix(sshd:session): session closed for user ubuntu","charset":"UTF-8","remote_address":"ip-172-18-0-1.eu-west-2.compute.internal/172.18.0.1:47500","hostname":"ip-172-18-0-1.eu-west-2.compute.internal"}

With the Avro converter, using the following properties:

key.converter=io.confluent.connect.avro.AvroConverter
value.converter=io.confluent.connect.avro.AvroConverter
# schema registry location assumed; AvroConverter requires it
key.converter.schema.registry.url=http://localhost:8081
value.converter.schema.registry.url=http://localhost:8081
internal.key.converter=org.apache.kafka.connect.json.JsonConverter
internal.value.converter=org.apache.kafka.connect.json.JsonConverter
internal.key.converter.schemas.enable=false
internal.value.converter.schemas.enable=false

Example (binary Avro rendered as plain text):

� ip-10-0-0-6 sudo:   ubuntu : TTY=pts/2 ; PWD=/home/ubuntu ; USER=root ; COMMAND=/usr/bin/vi /usr/share/kafka-connect-syslog/config/syslog.properties
UTF-8 rip-xxxxxxx.eu-west-2.compute.internal/172.18.0.1:52186 Pip-xxxxxxx.eu-west-2.compute.internal
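
That garbage is expected when binary Avro is read as plain text. To verify that the records round-trip, the Avro console consumer should print them back as JSON; a sketch, assuming a schema registry on localhost:8081 and a topic named syslog (both assumptions):

# registry URL and topic name are assumptions
kafka-avro-console-consumer --bootstrap-server localhost:9092 --topic syslog \
  --property schema.registry.url=http://localhost:8081 --from-beginning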


So far the best result is with the JSON converter, but that is still not the original data as streamed by syslog.

Is it possible to preserve and produce the original syslog event without modifying its structure?
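
One idea, sketched but not verified: if the connector exposed the complete raw line as a single string field, an ExtractField transform chained with the string converter would emit just that field's text. The field name raw_message below is hypothetical (the message field in the examples above only carries part of the original line):

# raw_message is a hypothetical field name; this only works if the
# connector actually emits the full, unparsed line in one field
transforms=rawOnly
transforms.rawOnly.type=org.apache.kafka.connect.transforms.ExtractField$Value
transforms.rawOnly.field=raw_message
value.converter=org.apache.kafka.connect.storage.StringConverter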

Thank you for your help

Guilhem





shobha warrier

Feb 14, 2019, 12:58:15 AM
to Confluent Platform
We are facing issues while configuring the syslog connector for a remote server. It works fine locally, but not against the remote server. Any details will help.

Cyrus Vafadari

Feb 14, 2019, 10:41:54 AM
to confluent...@googlegroups.com
Hm, it seems the connector repo you linked is down. How similar is it to Confluent's newly released preview connector? https://www.confluent.io/connector/kafka-connect-syslog/

Cyrus


BadwolF ForeveR

Mar 6, 2019, 4:39:52 AM
to Confluent Platform
Just went through the documentation of Confluent's new syslog connector. It seems like you would not be able to get the source message: the connector applies an SMT (Single Message Transform) that reshapes each message into a structured format (presumably so it can be represented as JSON or Avro). Also, the recommendation is to run it as a standalone connector with one thread accepting and producing messages to Kafka, which does not sound scalable considering the volume of syslog from devices.

I would recommend a multi-threaded custom implementation that processes incoming messages in parallel and then produces (stores) them to Kafka as bytes/String (bytes would be better). I have developed such a custom ingestion framework (not open source; client specific) to ingest syslog from Juniper, McAfee, Palo Alto, Fortinet and other systems. Not sure which language you are comfortable with, but you could start by looking into the Actor model (in any language; say Scala, which is what I built in). You can get TCP and/or UDP ingestion out of the box (we did both, with a preference for UDP, considering 350K to 500K EPS).
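
To illustrate the produce side of that approach, a minimal sketch in Java using byte-array serialization (topic name and broker address are assumptions; the multi-threaded socket handling is left out):

import java.nio.charset.StandardCharsets;
import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.ByteArraySerializer;

public class RawSyslogProducer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092"); // assumption
        props.put("key.serializer", ByteArraySerializer.class.getName());
        props.put("value.serializer", ByteArraySerializer.class.getName());

        try (KafkaProducer<byte[], byte[]> producer = new KafkaProducer<>(props)) {
            // in a real ingest path this line arrives verbatim off the TCP/UDP socket
            byte[] raw = "<86>Aug 19 21:51:34 ip-10-0-0-xxx sshd[31126]: pam_unix(sshd:session): session closed for user ubuntu"
                    .getBytes(StandardCharsets.UTF_8);
            producer.send(new ProducerRecord<>("syslog-raw", raw)); // topic name is an assumption
        }
    }
}

Storing the raw bytes untouched means any parsing or enrichment can happen downstream, and the original event is always recoverable from the topic.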

I cannot give definitive figures about the syslog connector per se, but Kafka works like a charm. Just make sure to have a proper number of partitions per topic (we had 500 for one source generating 6 billion events per day, to support a latency SLA of 1 minute).
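
Partitions are set when the topic is created (they can be added later, but not removed), so it is worth sizing them up front. With a recent Kafka the topic above would be created roughly like this (all values are placeholders, not a recommendation):

# placeholder values; size partitions to your own EPS and latency SLA
kafka-topics --bootstrap-server localhost:9092 --create --topic syslog-raw \
  --partitions 500 --replication-factor 3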

Hope it helps.