Rhino ESB serializer collection limitation


René M. A

Jun 17, 2011, 10:39:40 AM
to rhino-t...@googlegroups.com
Is there a reason why the serializer has a limitation stating that collections cannot contain more than 256 elements?

I have a scenario where the message contains many small elements and they must be in the same message.

Tom Cabanski

Jun 17, 2011, 11:14:10 AM
to Rhino Tools Dev
The limitation is there because of sizing concerns when it comes to
the message put on the message queue. I realize the limit is kind of
arbitrary because, as you point out, if the collection contains really
small things, serializing more than 256 should not be an issue. I
don't recall if there is a way to override this via configuration. I
do know we worked around some limitations in the serializer for
XDocument in one of our projects by registering an
IValueConvertor<T>. The interface is quite simple. I imagine you
could do the same for your class.
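
As a rough illustration of the workaround described above, here is a minimal sketch of a serializer that rejects collections above a fixed size but lets a registered per-type converter take over serialization for a specific type. This is not Rhino ESB's code or API: it is written in Python rather than C# for brevity, and every name in it (`Serializer`, `register_converter`, `MeterIdList`, `MAX_ITEMS`) is hypothetical.

```python
from typing import Any, Callable, Dict, Type

MAX_ITEMS = 256  # mirrors the hardcoded limit discussed in this thread


class Serializer:
    """Toy serializer: refuses large collections unless a converter is registered."""

    def __init__(self) -> None:
        # Maps a type to a function that turns a value of that type into text.
        self._converters: Dict[Type, Callable[[Any], str]] = {}

    def register_converter(self, typ: Type, to_text: Callable[[Any], str]) -> None:
        self._converters[typ] = to_text

    def serialize(self, value: Any) -> str:
        # A registered converter takes over completely for its type.
        converter = self._converters.get(type(value))
        if converter is not None:
            return converter(value)
        if isinstance(value, (list, tuple)):
            if len(value) > MAX_ITEMS:
                raise ValueError(
                    f"collection has {len(value)} items, limit is {MAX_ITEMS}"
                )
            return "[" + ",".join(self.serialize(v) for v in value) + "]"
        return str(value)


class MeterIdList:
    """Hypothetical wrapper type, so the ids are not treated as a plain collection."""

    def __init__(self, ids):
        self.ids = list(ids)


serializer = Serializer()
serializer.register_converter(MeterIdList, lambda m: ";".join(str(i) for i in m.ids))

# A plain list of 1000 ids is rejected by the size check...
try:
    serializer.serialize(list(range(1000)))
    rejected = False
except ValueError:
    rejected = True

# ...but the wrapped list goes through the converter as a single value.
packed = serializer.serialize(MeterIdList(range(1000)))
```

The point of the sketch is only that the size check applies to generic collections, while a type with its own converter is serialized as one opaque value. The receiving side would need the matching conversion back from text, which is omitted here.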

René M. A

Jun 17, 2011, 1:41:14 PM
to rhino-t...@googlegroups.com
Thanks. I checked the source code; the limit is hardcoded.

I realize the message size concern, but I do not think that this specific limit provides much value.

If the RSB committers agree with me, I can provide a pull request to fix this.

If not I can work around it the way you suggest.

Corey Kaylor

Jun 17, 2011, 1:45:46 PM
to rhino-t...@googlegroups.com
No, providing a fix for something that was put in intentionally would probably not be pulled in.


--
You received this message because you are subscribed to the Google Groups "Rhino Tools Dev" group.
To view this discussion on the web visit https://groups.google.com/d/msg/rhino-tools-dev/-/13Aa-rAcBugJ.
To post to this group, send email to rhino-t...@googlegroups.com.
To unsubscribe from this group, send email to rhino-tools-d...@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/rhino-tools-dev?hl=en.


Tom Cabanski

Jun 17, 2011, 2:14:26 PM
to Rhino Tools Dev
Corey,

If the default limit remained as is but there was a configuration
switch to change it, do you think that would be something that would
get pulled back into the main code base?


Corey Kaylor

Jun 17, 2011, 2:40:44 PM
to rhino-t...@googlegroups.com
Probably not. Overriding it for specific cases makes sense, and there are hooks, as described already, to do so. Keep in mind that saying you need to send more than 256 items in a collection is essentially saying that all of these items must be transactionally consistent. Is this really the case?

René M. A

Jun 17, 2011, 4:33:52 PM
to rhino-t...@googlegroups.com
Yes, the message contains commands which must be executed against a number (potentially many) of hardware units on a low-bandwidth network. The ids of the requested hardware units are also part of the message. The service receiving the message knows the network and how to efficiently execute the commands in the network based on the list of hardware unit ids.

So to make it short, the list must be kept together. I know I can split the message and use sagas to combine it again, or implement a serializer extension; I just thought it was a bit much work to do simply to transfer a list of ids.

Glad we had the discussion though. Thanks for the input.

Ayende Rahien

Jun 17, 2011, 5:19:06 PM
to rhino-t...@googlegroups.com
I put that limit in, and made it hard enough to bypass for a reason.
Have you considered the implications of this?
Usually when you have collections of ids, it means a query per id, and when you have that many queries, it is VERY expensive. It also opens the door to messages with thousands or tens of thousands (or more) of items.
Unbounded result sets are a very problematic (and common) thing.

I understand the requirement as set, but what about the other stuff that is involved?


René M. A

Jun 18, 2011, 2:17:50 PM
to rhino-t...@googlegroups.com
We are also concerned with the other stuff involved, e.g. saving the request so that it can be correlated with the results at a later time, validating the request in terms of whether the specific commands make sense for the hardware units requested, etc.
This is expensive for the large requests, but those requests (often with thousands of ids) are usually only carried out once a day, often during the night. It is acceptable that such a request takes hours to complete (worst case up to 6-8 hours, with most of the time spent in the network). The priority here is that the radio network is utilized efficiently, because if not, the request will take more than the night to complete.

During the day several smaller requests will be made against more specific hardware ids. These requests can contain between 1 and 500 ids, and again the network is the primary bottleneck.

I understand your concern regarding unbounded result sets, and hopefully we will be able to design it differently in the future as the hardware gets smarter and better. For now, though, I can't see how we can do it much differently, and performance is a major concern for us. As to whether the collection size limit of 256 items is reasonable or not, I guess it depends; I still believe we have a valid scenario where it should be bypassed.

Ayende Rahien

Jun 18, 2011, 2:36:58 PM
to rhino-t...@googlegroups.com
Are you seriously talking about a _single transaction_ taking 6 - 8 _hours_ ??
That is going to cause a _whole lot_ of problems. And I find it pretty hard to believe that what you have here is an OLTP process of some kind.
What is the message? What is the processing associated with it?

Basically, what are you trying to _do_ ?

As an aside, please note that there is a difference between the notion of a message in RSB terms and the network utilization involved.
Using Rhino Queues, for example, we will bundle multiple messages into a single round trip to the remote server, not make a single call per message.



René M. A

Jun 18, 2011, 3:46:43 PM
to rhino-t...@googlegroups.com
No, the entire processing end-to-end can take 6-8 hours worst case. It does not happen in a single transaction.

Here is what happens:

1. A client puts in a request for a job which must be executed in a low-bandwidth network (e.g. radio), reading/writing configuration and other values from a list of utility meters (once a day, this list can be very long: thousands of meters).
2. The application receiving the request saves it in order to be able to link the results to the original request when they arrive at a later time.
3. The application sends the request as a message, using RSB, to a service which hosts a component that knows about the network and how to efficiently execute the request. This component is a legacy component and beyond the scope of our project.
4. The service starts an async job using the legacy component.
5. During job execution, results from the individual meters become available to the service, which sends them back, using RSB, to the application which received and saved the original request.
6. The application saves these results as they arrive (results typically arrive in small chunks).

What's important in relation to my original question is that the entire message (or messages, if split into a sequence) must be received by the service before the legacy component is invoked. It is way too expensive, in terms of inefficient bandwidth usage, to execute smaller parts of the request as separate jobs against the legacy component.
So what I refer to as a single transaction is passing the entire message to the service. It does not have to be a single physical transaction as long as I know when all the ids of the meters involved in the request have been received.

I know there are several ways this can be dealt with using RSB to bypass the 256 collection size limit in the serializer. I just felt that it was a very artificial limit, and that's why I asked if it still made sense to maintain it. I am still not convinced that it is the responsibility of the serializer to enforce such a limit.
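
For illustration, the split-and-recombine alternative mentioned in this post could look roughly like the sketch below: the sender splits the id list into chunks that each stay within the limit, and the receiver only hands the full list on once every chunk has arrived. This is a plain Python sketch under stated assumptions, not RSB saga code; `MeterIdChunk`, `split_request` and `RequestAssembler` are hypothetical names, and state is kept in memory only.

```python
import uuid
from dataclasses import dataclass
from typing import Dict, List, Optional

MAX_ITEMS = 256  # per-message collection limit discussed in this thread


@dataclass
class MeterIdChunk:
    request_id: str       # correlates all chunks of one request
    index: int            # position of this chunk, starting at 0
    total: int            # number of chunks in the whole request
    meter_ids: List[int]


def split_request(meter_ids: List[int], limit: int = MAX_ITEMS) -> List[MeterIdChunk]:
    """Split a list of ids into chunks of at most `limit` ids each."""
    request_id = str(uuid.uuid4())
    parts = [meter_ids[i:i + limit] for i in range(0, len(meter_ids), limit)]
    return [MeterIdChunk(request_id, i, len(parts), part) for i, part in enumerate(parts)]


class RequestAssembler:
    """Collects chunks and returns the full id list once all have arrived."""

    def __init__(self) -> None:
        self._pending: Dict[str, Dict[int, List[int]]] = {}

    def receive(self, chunk: MeterIdChunk) -> Optional[List[int]]:
        received = self._pending.setdefault(chunk.request_id, {})
        received[chunk.index] = chunk.meter_ids  # a redelivered chunk just overwrites
        if len(received) < chunk.total:
            return None  # still waiting for more chunks
        del self._pending[chunk.request_id]
        # Reassemble in the original order, regardless of arrival order.
        return [mid for i in range(chunk.total) for mid in received[i]]


ids = list(range(1000))
chunks = split_request(ids)

assembler = RequestAssembler()
result = None
for chunk in reversed(chunks):  # deliver out of order on purpose
    result = assembler.receive(chunk)
```

Only when `receive` returns a list would the service invoke the legacy component, so the job still runs once against the complete set of meters. A real implementation would also need durable state, a timeout for requests that never complete, and handling for chunks redelivered after completion, none of which is covered here.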

Ayende Rahien

Jun 19, 2011, 1:40:04 AM
to rhino-t...@googlegroups.com
If not in the serializer, where would you put it?


René M. Andersen

Jun 20, 2011, 2:13:48 AM
to rhino-t...@googlegroups.com
Well, I probably would not put it in RSB at all. I think it depends too much on the context which limits should be imposed and what their thresholds should be. Regarding unbounded result sets, this is something that should be taken care of/considered in the application, no matter which limitations exist in RSB.

If I were to put it in RSB, the serializer seems like a fine place to put it, since it knows that a collection does in fact exist on the message.

This is however not a big issue for me, since it is possible to find a way around it and I do understand the concern for unbounded result sets.