Getting "queue size exceeds limit" error


Huy Nguyen

Feb 26, 2014, 6:44:15 PM
to flu...@googlegroups.com
I'm getting a "queue size exceeds limit" error with Fluentd. Which configuration or mechanism should I look into to understand this better, and what should I adjust to prevent it from happening?

Here are the error and my Fluentd configuration: https://gist.github.com/nvquanghuy/35aba82d7e3dbe24938e


Thank you!

Kiyoto Tamura

Feb 26, 2014, 8:36:51 PM
to flu...@googlegroups.com
Hi Huy,

Thanks for the report.

In general, this means that buffer_queue_limit is not large enough. So, one solution is to raise that value for out_tdlog.
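A sketch of what that might look like (the tag pattern, apikey, and buffer values below are placeholders, not taken from your gist):

<match td.*.*>
  type tdlog
  apikey YOUR_API_KEY            # placeholder
  buffer_type file
  buffer_path /var/log/fluent/td # placeholder path
  buffer_queue_limit 512         # default is 256; raise to allow more pending chunks
</match>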

As far as the root cause is concerned, there are two possibilities:

1. more messages are coming into Fluentd than before (= incoming throughput is increasing)
2. the output plugin isn't flushing chunks fast enough (= outgoing throughput is decreasing)

Has there been any change recently?

Kiyoto





--
Check out Fluentd, the open source data collector for high-volume data streams

Huy Nguyen

Feb 26, 2014, 9:08:48 PM
to flu...@googlegroups.com
Thanks a lot, Kiyoto-san, for your help. We restructured our infrastructure a bit; I will explain in a later post. It's possible that the message stream is choking somewhere.

On a related note, is this also part of the same issue?

2014-02-27 01:52:35 +0000 [warn]: Size of the emitted data exceeds buffer_chunk_limit.
2014-02-27 01:52:35 +0000 [warn]: This may occur problems in the output plugins ``at this server.``
2014-02-27 01:52:35 +0000 [warn]: To avoid problems, set a smaller number to the buffer_chunk_limit
2014-02-27 01:52:35 +0000 [warn]: in the forward output ``at the log forwarding server.``

Masahiro Nakagawa

Feb 27, 2014, 9:11:31 AM
to flu...@googlegroups.com
Hi Huy,

This error happens when one of your chunks is larger than the destination server's buffer_chunk_limit.

You should decrease the buffer_chunk_limit of the agent server and
increase the buffer_chunk_limit of the destination server.

This sometimes happens when the user uses the MongoDB plugin:
the MongoDB plugin sets buffer_chunk_limit to 8MB, but the agent server sends larger chunks.
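As an illustration, the agent/destination pairing might look roughly like this (host names, tags, and database names are made up; the 8MB figure is the cap mentioned above):

# Agent server: keep outgoing chunks below the destination's limit
<match app.**>
  type forward
  buffer_chunk_limit 4m
  <server>
    host log-server.example.com  # placeholder
    port 24224
  </server>
</match>

# Destination server: the MongoDB output caps chunks at 8m
<match app.**>
  type mongo
  database logs      # placeholder
  collection app     # placeholder
  buffer_chunk_limit 8m
</match>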


Christian Hedegaard

Feb 27, 2014, 1:48:26 PM
to flu...@googlegroups.com

Guys, I’m having a very similar problem. I’m using the syslog input plugin and forwarding to a central log server.

Here are the errors and my configs:

https://gist.github.com/chschs/2c16fc9aa888b5e8352a

Would love some advice here.

Kiyoto Tamura

Feb 27, 2014, 2:29:15 PM
to flu...@googlegroups.com
Hi Christian,

As I understand, your setup is

(in_syslog -> out_forward) -> (in_forward -> output plugins)
<----------Server A--------->  <---------Server B---------->

...and you have the configuration for Server A in your gist, right?

As Masa said, when out_forward on Server A has a larger chunk size than the output plugins on Server B, the above error happens consistently.

So, it's helpful to know how the data is being processed on Server B (the machine with ip=10.x.x.x in your example) to troubleshoot this issue.

Thanks,

Kiyoto


Christian Hedegaard

Feb 27, 2014, 2:37:51 PM
to flu...@googlegroups.com

It’s actually the same config.

Every host listens with in_forward on localhost:24224 and with in_syslog on localhost:5140, and forwards its syslog to localhost:5140. Every host forwards its Fluentd messages to the log server, including the log server itself, so that the log server’s own messages get into Fluentd as well. In other words, the log server is also a client.

The only difference on the log server is an additional config that matches on syslog and puts messages into Elasticsearch.

 

Here’s that config:

<match syslog.*.{warn,eror,crit,alert,emerg}>
  type elasticsearch
  logstash_format true
  logstash_prefix syslog
  index_name syslog
  type_name syslog
  flush_interval 3
  host search
  port 9200
</match>

<match yell.error>
  type elasticsearch
  logstash_format true
  logstash_prefix yell_error
  index_name yell_error
  type_name error
  flush_interval 3
  host search
  port 9200
</match>

Kiyoto Tamura

Feb 27, 2014, 3:20:57 PM
to flu...@googlegroups.com
Hi Christian,

Thanks for the information.

So, it's like this:

(in_syslog/forward -> out_forward) -> (in_syslog/forward -> elasticsearch)
<----------Server A-------------->    <------------Server B------------->

I suggest that you decrease the buffer_chunk_limit value for out_forward on Server A (I suppose there are multiple ones) and increase the buffer_chunk_limit value for out_elasticsearch on Server B.

So, as a config file, it should look something like this:

Server A:

<source>
  type syslog
  port 5140
  tag syslog
</source>
<source>
  type forward
  port 24224
</source>
<match log>
  type forward
  port 24224
  buffer_chunk_limit 4m # the actual value isn't as important. need to be smaller than Server B's
</match>
<match syslog.*.*>
  type forward
  port 5140
  buffer_chunk_limit 4m
</match>

Server B:
<source>
  type forward
  port 24224
</source>
<source>
  type forward
  port 5140
</source>

<match syslog.*.{warn,eror,crit,alert,emerg}>
  type elasticsearch
  logstash_format true
  logstash_prefix syslog
  index_name syslog
  type_name syslog
  flush_interval 3
  host search
  port 9200
</match>

Kiyoto Tamura

Feb 28, 2014, 1:25:35 AM
to flu...@googlegroups.com
Hey Christian,

Sorry, I actually misread your error log.

What it is complaining about is the buffer queue limit being exceeded (which is the issue Huy shared in his first email on this thread).

For that, you need to increase the size of buffer_queue_limit (the default value is 256), NOT buffer_chunk_limit.
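One way to think about it: the maximum amount an output can buffer is roughly buffer_chunk_limit × buffer_queue_limit, so a forward output like the sketch below (values illustrative, host as in your earlier example) could hold up to about 4m × 512 = 2GB before raising the queue-size error:

<match syslog.*.*>
  type forward
  buffer_chunk_limit 4m
  buffer_queue_limit 512   # doubled from the default 256
  <server>
    host 10.x.x.x          # placeholder
    port 24224
  </server>
</match>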

Let us know if you have more questions.