[rabbitmq-discuss] Can RabbitMQ handle big messages?

Zabrane Mickael

Mar 10, 2012, 1:26:55 PM
to rabbitmq...@lists.rabbitmq.com
Hi there,

I have a set (~1 million) of high-resolution PNG files, each between 100MB and 1GB in size.
I'd like to know whether RabbitMQ is capable of handling such files as messages.

What's the max allowed message size?

Thx ...

Regards,
Zabrane

Jerry Kuch

Mar 10, 2012, 1:34:43 PM
to Zabrane Mickael, rabbitmq...@lists.rabbitmq.com
Hi, Zabrane:  In theory the AMQP protocol IIRC allows crazy large message payloads (2^64 bytes I believe).  In practice though, that's madness since you end up with potential copying and buffering along the way that could make a broker very unhealthy.

For moving bulk binary data as in your app, you'd be wise to fragment at the producer and reassemble at the destination/consumer. Keeping your chunks in the 100KB range might be a decent place to start experimenting.  
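
A minimal sketch of the fragment-and-reassemble idea described above. It is not taken from any RabbitMQ client library: the `Chunk` type, the `file_id` field and the function names are hypothetical, and the 100KB chunk size is just the suggested starting point. It works on in-memory bytes for brevity; for files in the 100MB to 1GB range you would read each chunk from disk rather than load the whole file.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator

CHUNK_SIZE = 100 * 1024  # ~100KB starting point; tune by experiment


@dataclass(frozen=True)
class Chunk:
    file_id: str    # hypothetical identifier tying the chunks of one file together
    index: int      # position of this chunk within the file
    total: int      # total number of chunks for the file
    payload: bytes  # the chunk's bytes, published as one message body


def split(file_id: str, data: bytes, chunk_size: int = CHUNK_SIZE) -> Iterator[Chunk]:
    """Producer side: yield fixed-size chunks (the last one may be shorter)."""
    total = max(1, -(-len(data) // chunk_size))  # ceiling division, at least one chunk
    for index in range(total):
        start = index * chunk_size
        yield Chunk(file_id, index, total, data[start:start + chunk_size])


def reassemble(chunks: Iterable[Chunk]) -> bytes:
    """Consumer side: rebuild the payload, tolerating out-of-order arrival."""
    ordered = sorted(chunks, key=lambda c: c.index)
    if not ordered or len(ordered) != ordered[0].total:
        raise ValueError("missing chunks")
    if [c.index for c in ordered] != list(range(len(ordered))):
        raise ValueError("duplicate or missing chunk index")
    return b"".join(c.payload for c in ordered)
```

Each `Chunk` would be published as its own message, with `file_id`, `index` and `total` carried in message headers or a small envelope so the consumer knows when a file is complete.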

_______________________________________________
rabbitmq-discuss mailing list
rabbitmq...@lists.rabbitmq.com
https://lists.rabbitmq.com/cgi-bin/mailman/listinfo/rabbitmq-discuss

Zabrane Mickael

Mar 10, 2012, 2:46:31 PM
to Jerry Kuch, rabbitmq...@lists.rabbitmq.com
Thanks Jerry.

I'll experiment with that advice and come back soon.

Regards,
Zabrane

Zabrane Mickael

Mar 12, 2012, 2:33:25 AM
to Jerry Kuch, rabbitmq...@lists.rabbitmq.com
Hi guys,

Before trying to reinvent the wheel: has anyone faced this problem before and would like to
share some code (i.e. splitting a big file and reconstructing it when reading from a RabbitMQ queue)?

Thanks ...

Regards,
Zabrane


Josh Geisser

Mar 12, 2012, 4:24:40 AM
to Zabrane Mickael, Jerry Kuch, rabbitmq...@lists.rabbitmq.com

No code available anymore, but I used to test messages of 512MB and larger with no problems at all
(except of course that it takes ages to upload/download, and getting any %-complete is almost impossible).

I used py-AMQP at the time.

FYI & cheers

Josh


-- 
----
ASG at hnet

Michael Klishin

Mar 12, 2012, 4:28:26 AM
to Zabrane Mickael, Rabbit-Mq Discuss-Mailing List
Zabrane Mickael:

> Before trying to reinvent the wheel: has anyone faced this problem before and would like to
> share some code (i.e. splitting a big file and reconstructing it when reading from a RabbitMQ queue)?

You can use Nanite's file streaming implementation as an example:
https://github.com/ruby-amqp/nanite/blob/master/lib/nanite/streaming.rb

Keep in mind that the project is old but the chunking part is just as relevant
today.

MK

Zabrane Mickael

Mar 12, 2012, 6:57:33 AM
to Michael Klishin, Rabbit-Mq Discuss-Mailing List
Thanks for the Nanite pointer Michael.

Regards,
Zabrane

Emile Joubert

Mar 12, 2012, 8:26:34 AM
to Zabrane Mickael, rabbitmq...@lists.rabbitmq.com
Hi Zabrane,

On 10/03/12 18:34, Jerry Kuch wrote:
> Hi, Zabrane: In theory the AMQP protocol IIRC allows crazy large
> message payloads (2^64 bytes I believe). In practice though, that's
> madness since you end up with potential copying and buffering along the
> way that could make a broker very unhealthy.

I have successfully processed messages as large as 2GB using RabbitMQ,
where 2GB was about 5% of the total RAM. If the ratio between message
size and total RAM stays low then you can send even larger messages, up
to the limit Jerry mentioned.


-Emile

Irmo Manie

Mar 12, 2012, 8:51:17 AM
to Emile Joubert, rabbitmq...@lists.rabbitmq.com
RabbitMQ should not really be used for big file transfers, or only
with great care and by fragmenting the files into smaller separate
messages.
When running a single broker instance you'd still be safe, but in a
clustered setup very big messages can break the cluster.

Clustered nodes are connected via a single TCP connection, which must
also transport the (Erlang) heartbeat. If your big message takes more
time to transfer between nodes than the heartbeat timeout (anywhere
between ~20-45 seconds if I'm correct), the cluster will break and your
message is lost.
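
A rough back-of-the-envelope sketch of the timing argument above. The function names are hypothetical. The 20-second timeout is just the lower end of the range quoted in this post, not a verified Erlang default, and the 100 Mbit/s link speed is an assumed figure. The estimate is idealised: it assumes the inter-node link is fully available to the one message.

```python
def transfer_seconds(message_bytes: int, link_bits_per_second: float) -> float:
    """Idealised time to push one message over the inter-node link."""
    return message_bytes * 8 / link_bits_per_second


def exceeds_timeout(message_bytes: int, link_bits_per_second: float,
                    timeout_seconds: float) -> bool:
    """True if a single message would occupy the link longer than the timeout."""
    return transfer_seconds(message_bytes, link_bits_per_second) > timeout_seconds


# Assumed figures: a 1GB file vs. a 100KB chunk on a 100 Mbit/s link, 20 s timeout.
ONE_GB = 10**9
CHUNK = 100 * 1024
LINK = 100e6
print(transfer_seconds(ONE_GB, LINK))       # time for the whole file as one message
print(exceeds_timeout(ONE_GB, LINK, 20.0))  # the whole file in one message
print(exceeds_timeout(CHUNK, LINK, 20.0))   # a single 100KB chunk
```

Under these assumptions the whole file occupies the link for far longer than the timeout, while a single chunk takes only milliseconds, which is the practical case for fragmenting.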

The preferred architecture for file transfer over AMQP is to just send
a message with a link to a downloadable resource and let the file
transfer be handled by a specialized protocol like FTP :-)
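
A minimal sketch of the "send a link, not the file" approach. The field names (`url`, `size`, `sha256`), the function names and the example URL are all hypothetical; nothing here is a RabbitMQ or AMQP convention. The point is only that the message body stays tiny regardless of the file size.

```python
import hashlib
import json


def build_file_notice(url: str, content: bytes) -> bytes:
    """Build a small message body that points at the file instead of carrying it."""
    notice = {
        "url": url,                                     # where the consumer fetches the file
        "size": len(content),                           # lets the consumer sanity-check the download
        "sha256": hashlib.sha256(content).hexdigest(),  # integrity check after download
    }
    return json.dumps(notice, sort_keys=True).encode("utf-8")


def parse_file_notice(body: bytes) -> dict:
    """Consumer side: decode the notice before fetching the file out of band."""
    return json.loads(body.decode("utf-8"))


# Hypothetical usage: the body below is what would be published to the broker.
body = build_file_notice("ftp://files.example.com/images/0001.png", b"...png bytes...")
print(len(body), parse_file_notice(body)["url"])
```

For real files the checksum would be computed while streaming from disk rather than from bytes held in memory.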


- Irmo

Zabrane Mickael

Mar 12, 2012, 9:00:09 AM
to Irmo Manie, rabbitmq...@lists.rabbitmq.com
> Clustered nodes are connected via 1 tcp connection, which must also
> transport a (erlang) heartbeat. If your big message takes more time to
> transfer between nodes than the heartbeat timeout (anywhere between
> ~20-45 seconds if I'm correct), the cluster will break and your
> message is lost.

I see.

> The preferred architecture for file transfer over AMQP is to just send
> a message with a link to a downloadable resource and let the file
> transfer be handled by a specialized protocol like FTP :-)


Nice idea. I'll try to implement that.

Regards,
Zabrane

Carl Hörberg

Mar 12, 2012, 9:23:00 AM
to Zabrane Mickael, rabbitmq...@lists.rabbitmq.com
Is it possible to configure a max message size limit?

Emile Joubert

Mar 12, 2012, 10:13:45 AM
to Carl Hörberg, rabbitmq...@lists.rabbitmq.com
Hi Carl,

On 12/03/12 13:23, Carl Hörberg wrote:
> is it possible to configure a max message size limit?

No, but you can limit the maximum fragment size during connection tuning.
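
To illustrate the difference between the two limits: the fragment size negotiated during connection tuning (`frame-max` in AMQP 0-9-1) bounds each frame on the wire, not the message, so a large body is simply carried in more frames. The sketch below assumes the AMQP 0-9-1 framing overhead of 8 bytes per frame (7-byte header plus 1-byte end marker); the function name is hypothetical and 131072 is only an example value.

```python
FRAME_OVERHEAD = 8  # AMQP 0-9-1: 7-byte frame header + 1-byte frame-end marker


def body_frame_count(message_bytes: int, frame_max: int) -> int:
    """Number of content-body frames needed to carry a message body."""
    per_frame = frame_max - FRAME_OVERHEAD  # payload bytes that fit in one frame
    if per_frame <= 0:
        raise ValueError("frame_max too small")
    return -(-message_bytes // per_frame)  # ceiling division


# Example: a 1 GiB body with an assumed frame-max of 131072 bytes.
print(body_frame_count(2**30, 131072))
```

Client libraries usually expose this tuning value as a connection parameter, so check your client's documentation for the exact name.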

-Emile

Carl Hörberg

Mar 12, 2012, 10:39:45 AM
to Emile Joubert, rabbitmq...@lists.rabbitmq.com
If Irmo is correct, wouldn't that be considered a bug?

Or can fragment size tuning prevent that? If so, what might be a decent value?

Tony Garnock-Jones

Mar 12, 2012, 11:31:12 AM
to Irmo Manie, rabbitmq...@lists.rabbitmq.com
On 12 March 2012 08:51, Irmo Manie <irmo....@gmail.com> wrote:
> Clustered nodes are connected via 1 tcp connection, which must also
> transport a (erlang) heartbeat. If your big message takes more time to
> transfer between nodes than the heartbeat timeout (anywhere between
> ~20-45 seconds if I'm correct), the cluster will break and your
> message is lost.

Wow! Is that really the case? Erlang's distribution breaks if no heartbeats have been received *even if there's traffic coming in on the wire*? Sounds like an Erlang bug. Or perhaps there's some subtlety in the design I'm not seeing!

Regards,
  Tony
--
Tony Garnock-Jones
tonygarn...@gmail.com
http://homepages.kcbbs.gen.nz/tonyg/

Zabrane Mickael

Mar 12, 2012, 11:42:52 AM
to Tony Garnock-Jones, rabbitmq...@lists.rabbitmq.com
It's not a bug. You have to be aware of it to build reliable systems when multiple Erlang nodes are connected.

Regards,
Zabrane

Jerry Kuch

Mar 12, 2012, 11:50:19 AM
to Zabrane Mickael, rabbitmq...@lists.rabbitmq.com
It still seems slightly strange though, since if an Erlang node is
actively reading bytes off a connection, it seems like that could serve
as proxy for the heartbeat, which would only really be essential when
nothing else is going on... But there could easily be some subtlety of
the design or implementation that's eluding me...

Jerry

Matthew Sackman

Mar 12, 2012, 12:02:14 PM
to rabbitmq...@lists.rabbitmq.com
On Mon, Mar 12, 2012 at 04:42:52PM +0100, Zabrane Mickael wrote:
> It's not a bug. You have to be aware of it to build reliable systems in case of multiple connected Erlang nodes.

For those of us struggling to follow this, if you're currently in the
act of receiving data from node X, why can't you assume node X is still
alive? I.e. what is wrong with treating arbitrary data from node X as
evidence it's still alive, in lieu of a heartbeat from node X?

Matthew

Zabrane Mickael

Mar 12, 2012, 12:23:59 PM
to Matthew Sackman, rabbitmq...@lists.rabbitmq.com
On Mar 12, 2012, at 5:02 PM, Matthew Sackman wrote:

> On Mon, Mar 12, 2012 at 04:42:52PM +0100, Zabrane Mickael wrote:
>> It's not a bug. You have to be aware of it to build reliable systems in case of multiple connected Erlang nodes.
>
> For those of us struggling to follow this, if you're currently in the
> act of receiving data from node X, why can't you assume node X is still
> alive? I.e. what is wrong with treating arbitrary data from node X as
> evidence it's still alive, in lieu of a heartbeat from node X?

http://learnyousomeerlang.com/distribunomicon

Matthew Sackman

Mar 12, 2012, 12:37:15 PM
to rabbitmq...@lists.rabbitmq.com
On Mon, Mar 12, 2012 at 05:23:59PM +0100, Zabrane Mickael wrote:
> http://learnyousomeerlang.com/distribunomicon

Mmm, the "Bandwidth is infinite" section mentions this bug in Erlang,
but says nothing about the justification for it:

"Worse than that, Erlang knows whether nodes are alive or not by sending
a thing called heartbeats. Heartbeats are small messages sent at a
regular interval between two nodes basically saying "I'm still alive,
keep on keepin' on!". They're like our Zombie survivors routinely
pinging each other with messages; "Bill, are you there?" And if Bill
never replies, then you might assume he's dead (or out of batteries)
and he won't get your future communications. Anyway, heartbeats are sent
over the same channel as regular messages.

"The problem is that a large message can thus hold heartbeats back. Too
many large messages keeping heartbeats at bay for too long and either of
the nodes will eventually assume the other is unresponsive and
disconnect from each other. That's bad. In any case, the good Erlang
design lesson to keep this from happening is to keep your messages
small. Everything will be better that way."

Somewhat ironic that AMQP itself does understand that receiving any data
from a peer indicates the peer is alive, and furthermore has the
ability to multiplex messages so that a single large message doesn't
block other messages.
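
A toy sketch of the multiplexing point above, not AMQP wire code. Frames are modelled as plain `(channel, sequence)` tuples and the `interleave` function is hypothetical. It only shows that when frames from different channels take turns on one connection, a short message does not have to wait behind every frame of a long one.

```python
from itertools import zip_longest
from typing import Iterable, Iterator, Tuple

Frame = Tuple[str, int]  # (channel name, frame sequence number); illustrative only
_DONE = object()         # sentinel marking an exhausted stream


def interleave(*streams: Iterable[Frame]) -> Iterator[Frame]:
    """Round-robin frames from several channels onto one connection."""
    for group in zip_longest(*streams, fillvalue=_DONE):
        for frame in group:
            if frame is not _DONE:
                yield frame


big = [("ch1", i) for i in range(4)]  # a large message split into 4 frames
small = [("ch2", 0)]                  # a small message that fits in one frame
print(list(interleave(big, small)))
```

In the printed order the single `ch2` frame goes out right after the first `ch1` frame instead of after all four.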

Tony Garnock-Jones

Mar 12, 2012, 12:39:03 PM
to Zabrane Mickael, rabbitmq...@lists.rabbitmq.com

Yes, that repeats the information that Irmo started this subthread with. It doesn't address Matthew's question at all, though.

Perhaps the erlang list is a better place for us to be asking about this, Matthew, since it's not directly about Rabbit - are you on that list? I'm not currently subscribed.

Tony

Zabrane Mickael

Mar 13, 2012, 1:41:25 AM
to Tony Garnock-Jones, rabbitmq...@lists.rabbitmq.com
This leads me to this question:

Let's assume I'm able to ensure that all my messages are smaller than 100KB.

How many messages can one RabbitMQ node handle at any given time? Is there any limit?

Regards,
Zabrane

Jerry Kuch

Mar 13, 2012, 2:03:05 AM
to Zabrane Mickael, rabbitmq...@lists.rabbitmq.com
Hi, Zabrane:

Ultimately you'll be limited by disk space. If a queue gets large with messages
that are either unconsumed, or delivered but not ACKed, and the broker determines
that it's under memory pressure, it will page messages to files on disk, blocking
producers in the meantime using TCP back pressure. The mechanism is discussed here:

http://www.rabbitmq.com/memory.html

In practice you don't want to routinely be flirting with the memory watermark, and
as a rule, its value is probably best left at the default 0.40 level. In production
you should make sure your monitoring/alerting system is watching broker memory usage,
and probably the lengths and memory consumption of queues of importance to your app.
If queues are getting uncharacteristically backed up, it's often because something
has changed or gone wrong (unexpected producer load, crashed or buggy consumers,
etc.).
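
A small sketch of the kind of check a monitoring script might run, based on the advice above. The function names, the queue names and the depth threshold are hypothetical. The 0.40 fraction is the default watermark mentioned in this post, and the queue depths would come from whatever monitoring source you already use.

```python
from typing import Dict, List


def watermark_bytes(total_ram_bytes: int, fraction: float = 0.40) -> int:
    """Memory level corresponding to the broker's high watermark fraction."""
    return int(round(total_ram_bytes * fraction))


def backed_up_queues(queue_depths: Dict[str, int], max_depth: int) -> List[str]:
    """Return the names of queues whose backlog exceeds the alert threshold."""
    return sorted(name for name, depth in queue_depths.items() if depth > max_depth)


# Assumed figures: a 10GB machine and made-up queue depths.
print(watermark_bytes(10_000_000_000))
print(backed_up_queues({"images": 250_000, "thumbnails": 12}, max_depth=100_000))
```

An alert on the second check is usually a prompt to look at the consumers, since a growing backlog tends to mean something has changed or gone wrong.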

Best regards,
Jerry

Zabrane Mickael

Mar 13, 2012, 2:09:22 AM
to Jerry Kuch, rabbitmq...@lists.rabbitmq.com
Crystal clear. Thanks Jerry.

Regards,
Zabrane

Viraj Gupte

Aug 6, 2014, 11:08:49 AM
to rabbitmq...@googlegroups.com, rabbitmq...@lists.rabbitmq.com, zabr...@gmail.com
Hi Emile Joubert,
How much time did it take to process messages 2GB in size? Did you try working in a clustered environment?