Fluentd pods go into CrashLoopBackOff when setting workers in the conf file


mohit jain

Oct 8, 2020, 1:54:45 AM
to Fluentd Google Group
Hi,

I am using fluentd 1.11.1 with ruby 2.4.10. When I configure <system> worker 4 </system>, the pods do not come up: they keep restarting and then go into the CrashLoopBackOff state. I have checked the logs but did not find any error message. Can anyone guide me on why I am facing this issue?

mohit jain

Oct 9, 2020, 12:40:14 AM
to Fluentd Google Group
The worker configuration ( <system> worker 4 </system> ) works fine with fluentd version 1.9.2.

Mr. Fiber

Oct 12, 2020, 6:02:23 PM
to Fluentd Google Group
> <system> worker 4 </system>
Is this a typo? 'workers' is correct, not 'worker'.
One popular cause is using a plugin like in_tail that doesn't support multi-worker mode.
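
If that is the case, one option (assuming a Fluentd v1 release that supports the <worker N> directive) is to pin the single-worker-only plugin to one worker while the rest of the pipeline uses all workers. A minimal sketch; the path, tag, and parser below are illustrative placeholders, not from this thread:

```
<system>
  workers 4
</system>

# Plugins inside <worker 0> run only in worker 0;
# everything outside runs in every worker.
<worker 0>
  <source>
    @type tail
    path /var/log/app.log
    pos_file /var/log/fluentd/app.log.pos
    tag app.logs
    <parse>
      @type none
    </parse>
  </source>
</worker>
```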


mohit jain

Oct 13, 2020, 4:27:37 AM
to Fluentd Google Group
It was my mistake (it is workers, not worker). With workers it is not working in fluentd 1.11.1, but with version 1.9.2 it works fine.

Mr. Fiber

Oct 15, 2020, 11:20:06 AM
to Fluentd Google Group
Could you paste a reproducible conf so I can check the problem?

mohit jain

Oct 19, 2020, 9:55:20 AM
to Fluentd Google Group
Hi, please find attached the conf file I am using for my case. With this configuration you can reproduce the issue.

<system>
  workers 4
</system>
 
<source>
  @type systemd
  path /var/log/journal
  tag journal
  <entry>
    fields_strip_underscores true
    fields_lowercase true
  </entry>
</source>
# system-fluentd config
# drop fluentd logs
<match fluent.**>
  @type null
</match>
 

# forward all tenants logs to public elasticsearch
# do not split per type now as standard k8s logs do not have type
<match xxx.logging.**>
  @type elasticsearch_dynamic
  @log_level info
  include_tag_key true
  host elasticsearch
  port 9200
  logstash_format true
  logstash_prefix log-${tag_parts[2]}
  reload_connections false
  reconnect_on_error true
  reload_on_failure true
  request_timeout 10s
  <buffer>
    @type file
    chunk_limit_size 8MB
    path /var/log/td-agent/es-fluentd-buffer/xxx.logging.all
  </buffer>
</match>


mohit jain

Oct 21, 2020, 10:01:52 PM
to Fluentd Google Group
Hello, can you please help me with this issue?

Mr. Fiber

Oct 23, 2020, 8:58:47 PM
to Fluentd Google Group
Hmm... your configuration doesn't work with v1.9.2 on my local environment.

2020-10-24 00:56:24 +0000 [error]: #0 config error file="f.conf" error_class=Fluent::ConfigError error="Plugin 'systemd' does not support multi workers configuration (Fluent::Plugin::SystemdInput)"

Could you recheck whether the combination of multi-worker mode and the systemd input plugin really works with v1.9.2?
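
To isolate which plugin rejects multi-worker mode without crash-looping the pod, a cut-down config can be checked locally. A sketch; the null match is only there to make the config complete:

```
<system>
  workers 4
</system>

# keep only the suspect input
<source>
  @type systemd
  path /var/log/journal
  tag journal
</source>

# discard events
<match journal>
  @type null
</match>
```

Running `fluentd --dry-run -c test.conf` should surface the same Fluent::ConfigError at startup if the input plugin is single-worker only.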

mohit jain

Oct 24, 2020, 12:33:19 AM
to Fluentd Google Group
Can you please check this configuration? It works with fluentd version 1.9.2.

<system>
  workers 4
</system>

<source>
  @type prometheus
</source>

<source>
  @type prometheus_monitor
  <labels>
    host ${hostname}
  </labels>
</source>

<match xxx.logging.**>
  @type elasticsearch_dynamic
  @log_level info
  include_tag_key true
  host elasticsearch
  port 9200
  logstash_format true
  logstash_prefix log-${tag_parts[2]}
  reload_connections false
  reconnect_on_error true
  reload_on_failure true
  request_timeout 10s
  <buffer>
    @type file
    chunk_limit_size 8MB
    path /var/log/td-agent/es-fluentd-buffer/xxx.logging.all
  </buffer>
</match>


Mr. Fiber

Oct 26, 2020, 7:19:58 AM
to Fluentd Google Group
Yes. It works because the prometheus plugin supports multi-worker mode.

mohit jain

Oct 26, 2020, 7:26:13 AM
to Fluentd Google Group
Okay, but the same configuration is not working with fluentd 1.11.1. May I know why, and how can I solve this issue if I am using fluentd 1.11.1?

Mr. Fiber

Oct 26, 2020, 10:16:03 AM
to Fluentd Google Group
> how can I solve this issue if I am using fluentd 1.11.1 version.

I'm not sure because your configuration works with fluentd v1.11.1 on my environment.
Maybe the problem is not the configuration.

$ fluentd -c p.conf
2020-10-26 11:35:39 +0000 [info]: parsing config file is succeeded path="p.conf"
2020-10-26 11:35:39 +0000 [info]: gem 'fluent-plugin-elasticsearch' version '4.2.2'
2020-10-26 11:35:39 +0000 [info]: gem 'fluent-plugin-prometheus' version '1.8.4'
2020-10-26 11:35:39 +0000 [info]: gem 'fluent-plugin-systemd' version '1.0.2'
2020-10-26 11:35:39 +0000 [info]: gem 'fluentd' version '1.11.1'
2020-10-26 11:35:40 +0000 [info]: using configuration file: <ROOT>

  <system>
    workers 4
  </system>
  <source>
    @type prometheus
  </source>
  <source>
    @type prometheus_monitor
    <labels>
      host ${hostname}
    </labels>
  </source>
  <match xxx.logging.**>
    @type elasticsearch_dynamic
    @log_level "info"
    include_tag_key true
    host "127.0.0.1"

    port 9200
    logstash_format true
    logstash_prefix "log-${tag_parts[2]}"
    reload_connections false
    reconnect_on_error true
    reload_on_failure true
    request_timeout 10s
    <buffer>
      @type "file"
      chunk_limit_size 8MB
      path "./es-fluentd-buffer/xxx.logging.all"
    </buffer>
  </match>
</ROOT>
2020-10-26 11:35:40 +0000 [info]: starting fluentd-1.11.1 pid=8662 ruby="2.7.2"
2020-10-26 11:35:40 +0000 [info]: spawn command to main:  cmdline=["/path/to/2.7.2/bin/ruby", "-Eascii-8bit:ascii-8bit", "/path/to/2.7.2/bin/fluentd", "-c", "p.conf", "--under-supervisor"]
2020-10-26 11:35:46 +0000 [info]: adding match pattern="xxx.logging.**" type="elasticsearch_dynamic"
2020-10-26 11:35:48 +0000 [warn]: #2 Detected ES 7.x: `_doc` will be used as the document `_type`.
2020-10-26 11:35:49 +0000 [warn]: #1 Detected ES 7.x: `_doc` will be used as the document `_type`.
2020-10-26 11:35:49 +0000 [warn]: #0 Detected ES 7.x: `_doc` will be used as the document `_type`.
2020-10-26 11:35:49 +0000 [info]: adding source type="prometheus"
2020-10-26 11:35:49 +0000 [warn]: #3 Detected ES 7.x: `_doc` will be used as the document `_type`.
2020-10-26 11:35:49 +0000 [info]: #2 starting fluentd worker pid=8696 ppid=8662 worker=2
2020-10-26 11:35:49 +0000 [info]: #2 fluentd worker is now running worker=2
2020-10-26 11:35:49 +0000 [info]: adding source type="prometheus_monitor"
2020-10-26 11:35:49 +0000 [info]: #0 starting fluentd worker pid=8694 ppid=8662 worker=0
2020-10-26 11:35:49 +0000 [info]: #1 starting fluentd worker pid=8695 ppid=8662 worker=1
2020-10-26 11:35:49 +0000 [info]: #3 starting fluentd worker pid=8697 ppid=8662 worker=3
2020-10-26 11:35:49 +0000 [info]: #0 fluentd worker is now running worker=0
2020-10-26 11:35:49 +0000 [info]: #3 fluentd worker is now running worker=3
2020-10-26 11:35:49 +0000 [info]: #1 fluentd worker is now running worker=1

mohit jain

Nov 3, 2020, 3:05:30 AM
to Fluentd Google Group
Hi,

I tried the workers parameter with the configuration below. I executed case 1 and case 2, but it is not working with fluentd 1.11.1 whenever the number of workers is more than 1; the pods are exiting with exit code 137. Can you please suggest how I can resolve this issue?
Case 1: workers with fluentd 1.11.1
I am using fluentd 1.11.1; in the configuration file I enabled workers together with the http input plugin. When I deploy the fluentd chart, the pod starts for a few seconds and then exits with exit code 137. To resolve this I increased the pod memory to 2Gb (I configured 4 workers), but after increasing the memory the issue remained the same. Please find below the configuration I used.

Case 2: workers with fluentd 1.9.2
I used the same configuration with fluentd 1.9.2 and it works fine; in this case too I am using 4 workers, with pod memory 500Mi and CPU 500m.
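
On the exit code itself: 137 is 128 + 9, meaning the process was killed with SIGKILL rather than crashing on its own; in Kubernetes that usually points to the OOM killer or the kubelet killing the container (for example after a failed liveness probe), not to a Fluentd configuration error. A quick shell demonstration of the encoding:

```shell
# A process terminated by signal N exits with status 128 + N.
# SIGKILL is signal 9, so an OOM-killed container reports 137.
sh -c 'kill -9 $$'
echo "exit status: $?"   # prints "exit status: 137"
```

If `kubectl describe pod` shows `OOMKilled` as the last termination reason, note that each Fluentd worker is a separate Ruby process, so memory limits may need to scale with the worker count.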

Configuration

<system>
  workers 4
</system>

<source>
  @type http
  @log_level error
  @id input_http_ipv4
  port 9000
  bind 0.0.0.0
</source>

<source>
  @type http
  @log_level error
  @id input_http_ipv6
  port 9000
  bind ::
</source>

<filter pmdata>
  @type record_transformer
  enable_ruby
  <record>
    date ${ require 'date'; DateTime.rfc3339(record["measurement_time"]).strftime('%Y-%m-%d') }
  </record>
</filter>

<filter fmdata>
  @type record_transformer
  enable_ruby
  <record>
    date ${ require 'date'; DateTime.rfc3339(record["alarm_time"]).strftime('%Y-%m-%d') }
  </record>
</filter>

<filter {*security_logs*,*debug_logs*,*LI_logs*,*audit_logs*}>
  @type record_transformer
  enable_ruby
  <record>
    date ${ require 'date'; DateTime.rfc3339(record["log_event_time_stamp"]).strftime('%Y-%m-%d') }
  </record>
</filter>

<filter fmdata>
  @type elasticsearch_genid
  hash_id_key _hash
</filter>

<filter pmdata>
  @type elasticsearch_genid
  hash_id_key _hash
</filter>

<filter {*security_logs*,*debug_logs*,*LI_logs*,*audit_logs*}>
  @type elasticsearch_genid
  hash_id_key _hash
</filter>

<match pmdata>
  @type copy
  <store>
    @type elasticsearch
    @log_level error
    index_name ${tag}-${dnf_name}-${date}
    type_name pm_data
    host elasticsearch
    port 9200
    # same key name as specified in hash_id_key
    id_key _hash
    # Elasticsearch doesn't like keys that start with _
    remove_keys _hash
    logstash_format false
    bulk_message_request_threshold 5M
    request_timeout 30s
    reconnect_on_error true
    reload_on_failure true
    reload_connections false
    <buffer tag, date, dnf_name>
      @type file
      @log_level error
      path /data/fluentdlogs/pm
      timekey 1d
      flush_thread_count 4
      chunk_limit_size 4MB
      overflow_action block
      flush_mode interval
      flush_interval 5s
      total_limit_size 1GB
    </buffer>
  </store>
</match>

<match fmdata>
  @type copy
  <store>
    @type elasticsearch
    @log_level error
    index_name ${tag}-${dnf_name}-${date}
    type_name fm_data
    host elasticsearch
    port 9200
    # same key name as specified in hash_id_key
    id_key _hash
    # Elasticsearch doesn't like keys that start with _
    remove_keys _hash
    logstash_format false
    bulk_message_request_threshold 5M
    request_timeout 30s
    reconnect_on_error true
    reload_on_failure true
    reload_connections false
    <buffer tag, date, dnf_name>
      @type file
      @log_level error
      path /data/fluentdlogs/fm
      timekey 1d
      flush_thread_count 4
      chunk_limit_size 4MB
      overflow_action block
      flush_mode interval
      flush_interval 5s
      total_limit_size 2GB
    </buffer>
  </store>
</match>

<match {*security_logs*,*debug_logs*,*LI_logs*,*audit_logs*}>
  @type copy
  <store>
    @type elasticsearch
    @log_level error
    index_name ${tag}-${facility}-${dnf_name}-${date}
    type_name logs_data
    host elasticsearch
    port 9200
    # same key name as specified in hash_id_key
    id_key _hash
    # Elasticsearch doesn't like keys that start with _
    remove_keys _hash
    logstash_format false
    bulk_message_request_threshold 5M
    request_timeout 30s
    reconnect_on_error true
    reload_on_failure true
    reload_connections false
    <buffer tag, date, facility, dnf_name>
      @type file
      @log_level error
      path /data/fluentdlogs/logs
      timekey 1d
      flush_thread_count 4
      chunk_limit_size 4MB
      overflow_action block
      flush_mode interval
      flush_interval 5s
      total_limit_size 5GB
    </buffer>
  </store>
</match>

Thanks in advance
