BasicPublish with .Net byte[]


kevin chandler

Aug 12, 2014, 10:29:33 PM
to rabbitm...@googlegroups.com
So I have a typical client that publishes messages to a broker using BasicPublish. What is not typical is that this client publishes around 5000 messages a minute when it is busy. The message body is a JSON-serialized class. When running a test, sometimes a record or two gets corrupted, which shows up as non-Base64 data in the body of the message. The corruption is very inconsistent, and which records are corrupt varies from run to run. After I turned on RabbitMQ tracing (from the UI's admin page), the corrupt data showed up in its Received Message trace. I then added tracing of my own to the client application immediately before the call to BasicPublish, and it indicates that the data going into BasicPublish looks fine. I have multiple traces showing good data going in while the RabbitMQ tracing shows bad data arriving at the broker.

After some thought, I considered that this might be a case of .NET and the unmanaged Rabbit software not sharing memory pointers correctly. Maybe the garbage collector is freeing some of the pointers before the unmanaged code is done with them. I decided to keep the byte[] message bodies that I passed to BasicPublish in a List<byte[]> to prevent the garbage collector from freeing that memory. This worked: multiple runs without a single failure.

1.  What does the AMQP attribute class around the Private BasicPublish do?
2.  Have others encountered this problem?
3.  If so, how did you work around preventing the GC from freeing your data prior to the unmanaged Rabbit code completing its tasks?

Thanks in advance for your help,
Kevin

Michael Klishin

Aug 13, 2014, 12:36:29 AM
to kevin chandler, rabbitm...@googlegroups.com
 On 13 August 2014 at 06:29:39, kevin chandler (kevin.ch...@gmail.com) wrote:
> > After some thought, I considered this might be a case of .Net
> and the unmanaged Rabbit software not sharing memory pointers
> correctly. 

What is that "unmanaged Rabbit software"?

> Maybe the Garbage Collection is freeing some of the
> pointers before the unmanaged code is done with them. I decided
> to just keep my message body's byte[] data that I passed to BasicPublish
> around in a List to prevent the garbage collector from
> freeing that memory. This worked. Multiple runs without a single
> failure..

This is quite interesting and I don't think I've seen this with any other
client. 5000 messages a second is not a lot, by the way.

> 1. What does the AMQP attribute class around the Private BasicPublish
> do?

Do you mean _Private_BasicPublish in the generated protocol serialisation
code or BasicPublish in IModel?

The latter is used by our codegen tool to generate methods that serialise
and deserialise protocol methods. Those attributes don't have anything
to do with memory management and are not used at runtime, only when
the client is built.

> 2. Have others encountered this problem?

I personally have never seen it reported.

> 3. If so, how did you work around preventing the GC from freeing
> your data prior to the unmanaged Rabbit code completing its tasks?

Again, what is the "unmanaged Rabbit code"? RabbitMQ .NET client is pure C#.
--
MK

Staff Software Engineer, Pivotal/RabbitMQ

kevin chandler

Aug 13, 2014, 8:49:49 AM
to rabbitm...@googlegroups.com, kevin.ch...@gmail.com
Thanks for responding Michael,

> What is that "unmanaged Rabbit software"?

Maybe my understanding of the Rabbit software is wrong, but as I see it the .NET client for Rabbit is a wrapper written in managed .NET code. The BasicPublish calls all seem to go down to a _Private_BasicPublish call, and that call does not seem to be in the .NET wrapper code. Given the attribute we talked about below, I assumed that the implementation of _Private_BasicPublish is not under .NET's control (unmanaged). I have previously interfaced with C++ libraries and COM objects, which require attributes on the unmanaged methods that describe how the .NET garbage collector is to handle memory pointers passed into them. From what I have seen in my testing, it is acting as though the .NET garbage collector believes that no one is using these memory pointers (the body of the message) and frees them even though the core Rabbit software is still using them. As I said before, I might be way off base.


> This is quite interesting and I don't think I've seen this with any other
> client. 5000 messages a second is not a lot, by the way.

I am glad I am not treading on new ground.
 
> > 1. What does the AMQP attribute class around the Private BasicPublish
> > do?
>
> Do you mean _Private_BasicPublish in the generated protocol serialisation
> code or BasicPublish in IModel?

I was talking about the _Private_BasicPublish. I was tired of looking at it last night and was going from memory.
 

> Again, what is the "unmanaged Rabbit code"? RabbitMQ .NET client is pure C#.

From what I am seeing, RabbitMQ .NET is a managed-code wrapper around an implementation in another language. At least that is how it looked after reading the source code: I saw no real implementation in any of the C# code. The C# code just repackages the call for _Private_BasicPublish, which seems to live somewhere else. If I am wrong and this is all managed code, that would be excellent, since it would be much easier to debug.

Michael Klishin

Aug 13, 2014, 8:59:57 AM
to kevin chandler, rabbitm...@googlegroups.com
On 13 August 2014 at 16:49:55, kevin chandler (kevin.ch...@gmail.com) wrote:
> > Maybe my understanding of the Rabbit software is wrong but the
> .Net Client wrapper for Rabbit is .Net managed source code. It
> seems as though the BasicPublish calls all go down to a _Private_BasicPublish
> call. This call does not seem to be in the .Net wrapper code. With
> the attribute that we talked about below, I am under the assumption
> that the implementation of _Private_BasicPublish is not under
> .Net's control (unmanaged).

It is created by the code generator. Build the client from source with VS/msbuild.exe/xbuild, then look under gensrc.

kevin chandler

Aug 13, 2014, 10:29:55 AM
to rabbitm...@googlegroups.com, kevin.ch...@gmail.com
Thanks Michael,

We have gotten that code built.  It clears up some things.  Now it is scary that maybe this is a communication problem or a problem on the server side of the world.  Stay tuned.  I might have more details/questions as to what is happening.

Thanks again,

Michael Klishin

Aug 13, 2014, 10:32:51 AM
to kevin chandler, rabbitm...@googlegroups.com
On 13 August 2014 at 18:30:00, kevin chandler (kevin.ch...@gmail.com) wrote:
> > Now it is scary that maybe this is a communication problem or
> a problem on the server side of the world. Stay tuned. I might have
> more details/questions as to what is happening.

RabbitMQ treats message bodies as opaque byte streams. I have doubts it is
a Rabbit issue: if it was, it would have been reported to us many times
by now.

But if you can produce a test case that reproduces your problem, we'll
be happy to investigate. 

gatesvp

Aug 13, 2014, 7:09:55 PM
to rabbitm...@googlegroups.com
My first guess here would be multiple threads publishing messages on the same Channel simultaneously. That would explain why the data handed to the client looks good while the data arriving at Rabbit looks bad.

I just spent some time with this on my own system and here is what I had to build out:
  • ConnectionFactory => generally 1 per service/webapp per server.
  • IConnection => generally 1 per service, generated by the ConnectionFactory; ensure it is open before you create an IModel (Channel) from it.
  • IModel => 1 per thread; do not share, and ensure it is disposed of correctly (Dispose()).
Does this match up with your system behaviour?

5k messages per minute is a relatively light load, but it is enough that Channels shared between threads would cause an issue.
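
The layout above (one connection per service, one channel per thread) could be sketched roughly as follows. This is only a minimal sketch, assuming the IModel-based RabbitMQ .NET client API (ConnectionFactory, CreateConnection, CreateModel, CreateBasicProperties, BasicPublish); PerThreadPublisher and its members are hypothetical names made up for illustration and are not part of the client.

```csharp
using System;
using System.Text;
using System.Threading;
using RabbitMQ.Client;

// Hypothetical helper: one shared connection, one channel (IModel) per thread.
public sealed class PerThreadPublisher : IDisposable
{
    private readonly IConnection _connection;
    private readonly ThreadLocal<IModel> _channel;

    public PerThreadPublisher(string hostName)
    {
        var factory = new ConnectionFactory { HostName = hostName };
        _connection = factory.CreateConnection();

        // Each thread lazily gets its own channel the first time it publishes.
        // trackAllValues lets Dispose() find every channel that was created.
        _channel = new ThreadLocal<IModel>(
            () => _connection.CreateModel(), trackAllValues: true);
    }

    public void Publish(string exchange, string routingKey, string json)
    {
        IModel channel = _channel.Value; // never handed to another thread
        byte[] body = Encoding.UTF8.GetBytes(json);
        channel.BasicPublish(exchange, routingKey,
                             channel.CreateBasicProperties(), body);
    }

    public void Dispose()
    {
        foreach (IModel channel in _channel.Values)
        {
            channel.Dispose();
        }
        _channel.Dispose();
        _connection.Dispose();
    }
}
```

With thread-pool threads the channels stay open for the lifetime of the publisher, so this fits a long-lived service better than short-lived worker threads.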

Regards;
@gatesvp

Michael Klishin

Aug 14, 2014, 1:13:15 AM
to gatesvp, rabbitm...@googlegroups.com
On 14 August 2014 at 03:10:01, gatesvp (gat...@gmail.com) wrote:
> > 5k messages / minutes is a relatively light load, but it is enough
> that Channels shared between threads would cause an issue.

Publishing on a shared channel is more likely to result in incorrect
(out-of-order) frame interleaving and a connection error, not incorrect
message bodies on the consumer end.
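
To illustrate the interleaving point: in AMQP 0-9-1 a single publish goes out as a method frame, a content header frame, and then the body frame(s), so two threads writing to one channel at once can mix their frames. The sketch below is a self-contained model of that idea, not the client's actual implementation; FakeChannel and Demo are hypothetical names, and the lock stands in for whatever synchronisation an application would need if a channel really had to be shared.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

// Hypothetical stand-in for a channel: records the frames "sent" on the wire.
public sealed class FakeChannel
{
    private readonly List<string> _wire = new List<string>();
    private readonly object _publishLock = new object();

    public IReadOnlyList<string> Wire { get { return _wire; } }

    // One publish is several frames: method, content header, then body.
    public void Publish(int messageId)
    {
        lock (_publishLock) // keeps all frames of one message together
        {
            _wire.Add("method:" + messageId);
            _wire.Add("header:" + messageId);
            _wire.Add("body:" + messageId);
        }
    }

    // True when every message's three frames are adjacent and in order.
    public bool FramesAreContiguous()
    {
        if (_wire.Count % 3 != 0) return false;
        for (int i = 0; i < _wire.Count; i += 3)
        {
            if (!_wire[i].StartsWith("method:", StringComparison.Ordinal)) return false;
            string id = _wire[i].Substring("method:".Length);
            if (_wire[i + 1] != "header:" + id) return false;
            if (_wire[i + 2] != "body:" + id) return false;
        }
        return true;
    }
}

public static class Demo
{
    // Publishes from several threads, then checks the recorded frame order.
    public static bool Run(int threads, int perThread)
    {
        var channel = new FakeChannel();
        var workers = new List<Thread>();
        for (int t = 0; t < threads; t++)
        {
            int offset = t * perThread;
            var worker = new Thread(() =>
            {
                for (int i = 0; i < perThread; i++) channel.Publish(offset + i);
            });
            workers.Add(worker);
            worker.Start();
        }
        foreach (Thread worker in workers) worker.Join();
        return channel.Wire.Count == threads * perThread * 3
            && channel.FramesAreContiguous();
    }
}
```

Without the lock, frames from different threads could land between one message's method, header, and body frames (and List<T> itself is not safe for concurrent writes), which is the kind of ordering error a broker would reject at the connection level.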

kevin chandler

Sep 3, 2014, 10:30:57 AM
to rabbitm...@googlegroups.com, gat...@gmail.com
Thanks Michael and gatesvp for your help,

I certainly had some threading issues. Clients were using my support libraries in ways I had not considered. Once I solved those problems, this problem seems to have gone away. I need to be more defensive in my coding. Your support and comments were very valuable to me during this investigation.

Thanks to all,
Kevin
 