I'd like to start a new thread on the pros/cons of changing ROS' message / serialization layer. As a lot of people have noticed, the current solution is good but not ideal and there are several open source libraries that attempt to solve the same problem. The goal for this discussion is to gather feedback / use cases with the aim of creating a concrete specification for messages in ROS 2.0.
Obviously no library is perfect, and so we're going to have to weigh a bunch of different options. To start off, here is a long and undoubtedly incomplete list of criteria to judge serialization libraries:
ease of use (e.g., defining new messages, manipulating message in code)
size of messages on wire and in memory
ability to evolve message definitions / backwards compatibility
speed of serializing / deserializing messages
data types supported (e.g., numeric types, strings, lists, dictionaries)
effort to maintain library and integrate with ROS
number of language bindings and ease of porting to new systems
minimum hardware specs (can it run on arm or other embedded devices)
size and healthiness of the project
reflection (in the CS sense)
default values
support for namespaces
time to compile generated messages
ease of providing compatibility with current ROS message definitions
At a high level, it seems like we can do one of two things. One is to decide on The One True Serialization Layer that ROS uses. If you don't speak it, you can't communicate with anything in ROS. The other is to work out a way to support multiple serialization libraries. One important thing to keep in mind when thinking about these two approaches is what the user level API will look like (e.g., what type of object is given to callbacks that the typical user writes).
The clear advantage of choosing The One True Serialization Layer is simplicity. Interoperability between nodes written by different people is made easier, and we can probably pick something that works for almost everyone. The user level API will be similar to what ROS currently does: users are given direct access to whatever message type we select. The problem with this approach is that the best serialization library today may not serve our needs in 3 years, and introducing major changes will be difficult.
The other high level choice is to make the serialization layer modular / pluggable so that it can support multiple serialization libraries. This idea could take on a bunch of different forms. One would be to change the concept of topics so that the type includes the encoding (e.g., topic /sensors/camera_img is of type protobuf:sensor_msgs/Image and topic /sensors/camera_img2 is of type rosmsg:/sensor_msgs/Image). A slight variation of this would be to have nodes deal with the details of serialization so that when a subscriber connects to a publisher they negotiate the serialization format in addition to the transport protocol. Nodes could support multiple formats (e.g., protobuf, msgpack, rosmsg etc.) and users could optionally specify which encoding they'd like for their particular application.
For the multiple serialization solutions, it would probably be a bad idea to directly expose the fact that there are multiple message formats, as that would require users to register a new callback for each message type. This would place a higher burden on development and hurt interoperability. Instead, it seems like if we go the modular route we should define a general "ROS message API" that abstracts away the details of different libraries. This API would define what user level callbacks are given. This approach would let users get the intrinsic advantages of a serialization library (e.g., that it encodes certain types of data very efficiently), but at the cost of hiding specific features that the API doesn't support (e.g., not every serialization library supports a set datatype or versioning for messages). I imagine that there would also be an "abstraction penalty" in any API that supports conversion between multiple underlying message representations; however, this penalty might be insignificant.
Given all of this, a couple of questions for people are:
What requirements do you have for a message library?
Would you prefer to have a single serialization approach or have support for multiple representations?
Have you run into any other fundamental problems with the way ROS currently serializes messages?
I’ve started a ROS NG wiki page: http://www.ros.org/wiki/sig/NextGenerationROS Currently it has a link to this google group and a link to my own analysis of the features provided by various serialization libraries (e.g., protobuf, rosmsg, msgpack):
http://www.ros.org/wiki/sig/NextGenerationROS/MessageFormats
Cheers,
Ben
However, you've also brought up physical types, which would be useful when dealing with robots. Which ones, though? My vote is for SI units and SI derived units, but that still leaves a moderately long list to choose from:
http://en.wikipedia.org/wiki/SI_base_unit
http://en.wikipedia.org/wiki/SI_derived_unit
Ok, so I guess it is up to me to chime in for the real-time/embedded crowd. Please note that these are my opinions and others with embedded experience may have different answers. Also, I am a C++ guy and do not know squat about Python.
Let me get Ben's questions answered first:

1) Support for key-value types and bitfields. Bitfields can be key for reducing message size and memory usage. Serialization should be binary; XML/JSON/BSON are highly inefficient. Your list left off the ability to set value ranges, i.e., min and max values for a radian measurement. This is a rare capability in IDLs.

2) Multiple serialization methods as the infrastructure is key for future flexibility and expansion. If multiple, however, the end user should select how their system is configured. The negotiation of types makes sense, but development and debugging of a robot or team will be much easier if the dev team can choose one and enforce it regardless of where their packages came from. Protobuf, thrift, and msgpack all look good to me. Use of a 3rd party format increases interoperability. So just pick one to start with and prove out the pluggable infrastructure. If it is necessary to transition to another down the road, it will be easier this way.

3) Problems with how ROS serializes messages? Yes, see below.

Thoughts on other people's posts:

Cem mentioned a couple of important things: unique IDs and versioning. I disagree on the use of UUIDs for message IDs. To have repeatable UUID creation, name-based UUIDs need to be used, which are just a hash of the namespace+name in the 128-bit UUID data format. I use something similar, hashing the namespace+name into a 32-bit unsigned int and adding another 32-bit uint holding the hashed IDL message definition as the version (in my case the IDL is XML). 32 bits is large enough to hold the data with few collisions, but small enough to send with each message. Why send 128 bits when 64 bits has more relevant information? It also has the advantage of being usable by switch statements on processors without 128-bit int support (unless something happened with 128-bit ints that I am unaware of).
Something as simple as using a switch vs. cascading if statements can have a performance impact on small systems with limited speed and memory.

I investigated HDF5 for DAQ files (like a rosbag) and found the API to be cumbersome when you want to serialize dynamic-length arrays; basically it requires you to allocate temporary arrays and then serialize the temporary array.

Introspection: From what I can see, protobuf uses a separate definition structure which I assume is compiled into the message code, which simplifies the process. If someone is using C or C++ they most likely will be doing so for efficiency, and reflection is not an efficient data access method. It may be good enough to let the development language handle it; especially for C, I highly doubt someone would use introspection. Also, supporting char/short and smaller types is important at the embedded level to reduce memory footprint, as would be bitfields.
I agree with Damon on separating serialization from transport and from the user level. Industry is starting to implement comm plugins to their middleware layers. We have a version; I hear iRobot also has a version. This is possible if the data and serialization mechanisms are separate. One of my main criticisms of roscpp is that it is not modular at all. There is no reason the message generator could not output a serialization plugin which roscpp then loads. The issue of changing serialization formats then becomes a message generator problem.

Realtime/embedded:

I think it will be important to define what the scope of ROS 2.0 will be w.r.t. realtime/embedded. ROS 1.0 specifically states that it does not support real-time operation. And that's ok. There are methods and techniques to handle moving data across the realtime boundary. In my case, I feel it is important to understand the concerns, as it may yield more efficient code for everyone: how the data is sent has implications for the serialization approach.

The key is to minimize latency for message access. For realtime, it is important in order to meet timing constraints. For embedded, it is important because lower latency comes from shorter code paths, which in turn means fewer CPU cycles and less power usage. One basic method for this is to never allocate memory after startup. Another is to avoid serialization whenever possible. For example, serialization of high-def stereo images is expensive. I continue to run into people who think GigE cameras are just cameras that have a GigE jack. In reality they require that much bandwidth per camera.

At GDRS we use RCSLIB, an open source comms package maintained by NIST: http://www.isd.mel.nist.gov/projects/rcslib/. Being designed for realtime, NML does nice things such as using shared memory for processes on the same system, using pre-allocated fixed-size buffers, and sending the data from these buffers to remote systems via the appropriate comm layer.
All of this is transparent to the user, hidden behind standard write/read calls (there are no callbacks). RCSLIB also constrains messages to be Plain Old Data. This means that sending a point cloud between two processes is essentially a memcpy. This of course places an added design burden on the developer w.r.t. message definition and how to lay out the communication channels.

For example, using roscpp, sending a stereo image pair has multiple memory allocations:

- allocate the message wrapper
- allocate 2 vectors of image data (assuming you use vector::reserve() and don't use push_back() to fill the vector)
- copy the message if nodelets are used
- iterate through the message to calculate its size, allocate a buffer to hold the serialized message
- allocate the deserialization buffer on the subscriber side
- allocate the message wrapper
- allocate each vector of image data to hold the de-serialized images

Sending the same message through RCSLIB is zero allocations.

These aspects of pre-allocating memory, POD, and minimizing serialization have one very important side issue: the majority of existing ROS messages were not designed with this in mind. Addressing this would be non-trivial: it would require a review and modification of core ROS messages, a code generator upgrade, NodeHandle updates, and updating all nodes which use these messages. Think how common ros::Header is, with its use of string.
On Thu, May 30, 2013 at 11:15 PM, Bob Dean <bob....@gmail.com> wrote:
> Let me get Ben's questions answered first:
> 1) support for key-value types and bitfields. bitfields can be key for reducing message size and memory usage.

What exactly would you like to see for bitfield support? I guess I'm thinking that you can do the following in a ROS message definition:

bitfield.msg:

    uint8 opts
    uint8 READ=1
    uint8 WRITE=2
    uint8 FLUSH=4

and then in code have something like

    if (bitfield_msg.opts & bitfield_msg.READ) {
      bitfield_msg.opts |= bitfield_msg.FLUSH;
      ...
    }

But does this not cover what you're thinking of?

> Your list left off the ability to set value ranges. i.e. min and max value for a radian measurement This is a rare capability in IDLs.

This is an interesting idea. Currently ROS doesn't have an explicit mechanism to "validate" or limit a field's value. Do you have a pointer to an IDL that supports something like this? What happens when your code tries to set a field to an invalid value?
REP 117 (http://www.ros.org/reps/rep-0117.html) seems like it's covering a similar area. Code that processes range messages like sensor_msgs/LaserScan is supposed to follow well-specified application logic. At first blush, REPs like this seem better than augmenting the IDL, as the question of what you do when a message's field has bad data seems application -- not message type -- dependent.
> I agree with Damon on separating serialization from transport and from the user level. Industry is starting to implement comm plugins to their middleware layers. We have a version, I hear iRobot also has a version. This is possible if the data and serialization mechanisms are separate. One of my main criticisms of roscpp is that it is not modular at all. There is no reason the message generator could not output a serialization plugin which roscpp then loads. The issue of changing serialization formats then becomes a message generator problem.
The lack of modularity / separation in ROS' client libraries and specification definitely seems like a major theme of what people want to change.

> Realtime/embedded: I think it will be important to define what the scope of ros 2.0 will be w.r.t realtime/embedded. ROS1.0 specifically states that it does not support real-time operation. And that's ok. There are methods & techniques to handle moving data across the realtime boundary. In my case, I feel it is important to understand the concerns as it may yield more efficient code for everyone as how the data is sent has implications upon the serialization approach.

+100 for making sure ROS has an explicit scope and a set of things that it's designed to be good at.

Cheers,
Ben
The example IDL for both would be JAUS, which is now behind a paywall. I have an in-house IDL I can draw examples from. Hopefully iPhone typing will be legible.

JAUS calls them "bit ranges". Our in-house IDL defines them as such:

    <bitrange name="flags" type="unsigned char">
      <bitvalue name="changed" comment="set if object has changed"/>
      <bitvalue name="reset" comment="if set, reset device"/>
      <bitvalue name="controller id" size="4"/>
    </bitrange>
This example has two single-bit fields and one 4-bit field. The message generator uses accessors for data access. Setters check range values if necessary and return false. To support bitfields in C++, the appropriate masks and shift operators are generated, as are accessors for each bit field. Such as:

    bool TestMsg::flags();                                    // full field value
    bool TestMsg::flagsChanged();                             // get
    bool TestMsg::flagsChanged(bool newValue);                // set
    bool TestMsg::flagsControllerId(unsigned char newValue);  // set
    unsigned char TestMsg::flagsControllerId();               // get
I vote for an accessor based approach. We'll have greater flexibility that way. That said, I don't know enough about highly constrained systems to know if this will work. Bob, do you have any opinions?
-10 to making breaking changes to an API affecting every ROS program for non-compelling reasons. "Greater flexibility" is nice, but hardly worth all that effort.

Since we also want a Plain Old Data representation for low-end messages, requiring retro-fitting of accessors seems wrong.

Also, these messages need to be represented in several languages. Let's keep it simple and not get too bogged down in optimizing certain cases for one or two of them.
joq
On Tuesday, June 4, 2013 10:31:24 AM UTC-4, Jack O'Quin wrote:
> -10 to making breaking changes to an API affecting every ROS program for non-compelling reasons. "Greater flexibility" is nice, but hardly worth all that effort.
> Since we also want a Plain Old Data representation for low-end messages, requiring retro-fitting of accessors seems wrong.
> Also, these messages need to be represented in several languages. Let's keep it simple and not get too bogged down in optimizing certain cases for one or two of them.
At some point a line needs to be drawn as to what is ROS1.0 and what is ROS2.0. The only guarantee about that line is that there will be API breaking changes in some fashion with ROS2.0, and a migration path would need to be defined as well as a mechanism to interoperate between the two (such as supporting rosmsg1.0 in ros 2.0).
POD support itself is an API-breaking move; common ROS messages such as Header and PointCloud2 are not POD. POD also does not negate the usefulness of accessors for C++, as C structs and C++ classes will have the same memory layout assuming certain requirements are met, such as not using virtual methods or dynamic-length arrays.
--
You received this message because you are subscribed to the Google Groups "ROS SIG NG ROS" group.
To unsubscribe from this group and stop receiving emails from it, send an email to ros-sig-ng-ros+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.
From a CS/elegance side, putting the data behind accessors provides great flexibility from the viewpoint of the framework. It provides the flexibility to change the system at the cost of removing that flexibility from the user. The user group this will most affect is those doing very high bandwidth communications, where one or two copies are a significant cost. The use case of image processing and pipelining jumps to my mind as one of the most common use cases in this realm.

We need to keep in mind these tradeoffs in our design.
Opinion/rant/pet peeve: I do object strongly to using get/set in accessor names if it is otherwise obvious from the language syntax what is occurring. Using the Point class above, x() obviously returns the value x, and x(123) sets it. There is code out there where substantial portions of the text are either "get" or "set", which is ri-diculous. The API is also closer to the way the "struct" version of the class would be used. (I will stop ranting now, but email me if you want to know why _ for member names is potentially more ri-dic-u-lous.)
I am not sure I understand your point, Tully, unless it is my inexperience with Python. In C++ the accessors are effectively optimized away, or there are methods to construct them such that access is as fast as direct member access. No need to copy data...
On Tuesday, June 4, 2013 2:28:40 PM UTC-4, Tully Foote wrote:
> From a CS/elegance side putting the data behind accessors provides great flexibility from the viewpoint of the framework. It provides the flexibility to change the system at the cost of removing that flexibility from the user. The user group whom this will most effect is those doing very high bandwidth communications where one of two copies are a significant cost. The use case of image processing and pipelining jumps to my mind as one of the most common use case in this realm.
> We need to keep in mind these tradeoffs in our design.
>
> Tully
Once the format of the messages is fixed and unambiguous, we can access the contents how we feel like.
So is the consensus that no matter which message architecture we'd like to use, if there isn't an existing API which we like, we would write our own API which wraps or implements it?

On Thu, Jun 6, 2013 at 6:23 AM, cfka...@gmail.com <cfka...@gmail.com> wrote:
> Once the format of the messages is fixed and unambiguous, we can access the contents how we feel like.
On 06.06.2013 18:22, Michael Gratton wrote:
On 07/06/13 03:41, Dirk Thomas wrote:
So, I'd like to put API discussions on the backburner while we
figure out the wire format of the messages themselves.
Exactly this, yes.
I can only emphasize that this should not (!) be the goal here. One
of the most important goals for ROS 2.0 is the reuse of existing
libraries to reduce long term maintenance. Defining again our own
wire format and implementing again our own serialization is not
desirable.
I can understand the need to reduce maintenance overhead, however it is
exactly the composition of nodes into a graph and interoperable passing
of messages between them that defines ROS as what it is. If you take
that away ROS becomes just a meta-distribution of vaguely related
software packages, with little value-add.
If you think that any of my proposed options would reduce interoperability, then we are not on the same page.
The option with the pluggable serialization would even increase interoperability, because it becomes much easier to integrate systems which might not be able to use the "one-and-only" library we might end up selecting.
This is why I recommend the pluggable serialization over the one-and-only serialization:
the user land code stays generic and therefore agnostic to the serialization, and that enables long-term interoperability - even if in the future we add another serialization library or switch the default one, all existing and future nodes will transparently work together.
It would be exactly reinventing the wheel.

So it seems strange to hitch the cart to something that (as you point
out) is inevitably going to require some compromise anyway. Surely if
there is one thing that the project wants to get absolutely right, it is
interoperability and fit-for-purpose of message passing between nodes.
To me that seem to imply maintaining it in-house.
If you think of this as a third option, where the major (only?) downside
is maintenance effort, then that provides a way to evaluate if it is
actually not worthwhile or not, rather than declaring by fiat it is off
the table.
There are serialization libraries out there - and they are really sophisticated.
Thinking that we can just create a better (or even equal) solution is not realistic without putting significant work into a domain which already has adequate solutions.
Reusing is the key here - and that does not imply any reduced interoperability.
Providing a standardized API with thin layers of remapping code from that API to a specific serialization library is orders of magnitude less code and less complexity than writing a custom serialization library.

In comparing it with the other two options, making serialisation
pluggable will require a similar order of magnitude of maintenance
effort and greatly harm interoperability, and while choosing one
3rd-party serialisation library is clearly the winner in terms of
maintenance effort, it only maintains interoperability for
supported/suitable platforms and introduces non-trivial risk.
We still have not explicitly defined our requirements and other desiderata. We should do that before going much further.
Dirk is arguing against inventing a *new* serialization library. I agree with him. But, several plausible options have been mentioned already. Surely we can consider each of them as a potential TOTSL without reinventing any wheels?
* rosmsg
* thrift
* protobuf
* msgpack
* JAUS ?
* DDS ?

Please add others if I forgot to mention one you care about.
Requirements:
1) Easy to define new messages in the IDL. I don't want to have to specify much except for a field name and a type.
2) Messages are strongly typed. I want to quickly reference a file that tells me about all the fields in a message.
3) Support for all standard numeric types (fixed width ints and float/double) and strings.
4) Easy to define namespaces.
5) Minimal time to generate code and compile into existing applications.
Desiderata:
1) A KeyValue data structure. I know this can sometimes run afoul of requirement #3.
2) Versioned messages / fields to enable longer running services. I get by without this for my own stuff, but for writing a public facing API it would be very nice to have.
I'm FINALLY at a real keyboard and can write sensible messages! Hallelujah!
UUID thoughts:
API:
My thought is that there will likely be several competing libraries/APIs as we hash out what is most easily used for each language/domain. Pluggable layers will only work if we fix what we're serializing completely. That means deciding on basic types (e.g., ints, floats, bit vectors, strings, maps, etc.). Do we have that kind of consensus?
If not, then we could do something crazy, as in the next section:
Serialization format:

I agree with what many others have said about finding a single library/format, and making that THE format. But, just to play devil's advocate, let's think about what would happen if we went in the exact opposite direction and declared that serialized messages all have exactly two fields:
- A message type identifier (as everyone can probably guess by now, for me that means a 128 bit UUID)
- An arbitrarily long byte array. (I'm assuming that there is a length encoding or termination method so we know where the end is)
Programmers will be 100% responsible for decoding the byte array into whatever in-memory format that is best suited for their own language, or for finding a library that is able to do it for them. I'm sure that there are other advantages and disadvantages to this approach, which brings up the following two questions:
- Do the advantages outweigh the disadvantages?
- Is this approach the best approach?
Per Bob Dean's email, support for real enums would be nice,
although bitfields strike me as being a serialisation detail?
> Desiderata:
> 1) A KeyValue data structure. I know this can sometimes run afoul of
> requirement #3.
On key-value types, I wonder if that is possibly open to abuse of
requirement #2 - in that if your message has a key-value map of some
sort then it can be used to stuff whatever objects in there that you
want. Limiting it to specific key and value types (e.g. a
map<int,string>) might help a bit? Also, if you're going to admit a
key-value map, you may as well add other collections types such as lists
and sets.
> Dirk is arguing against inventing a *new* serialization library. I
> agree with him. But, several plausible options have been mentioned
> already. Surely we can consider each of them as a potential TOTSL
> without reinventing any wheels?

So which do and don't meet these requirements and desiderata?
On Wed, Jun 12, 2013 at 9:23 AM, CFK <cfka...@gmail.com> wrote:
> I'm FINALLY at a real keyboard and can write sensible messages! Hallelujah!
>
> UUID thoughts:

I like UUIDs and use them in other ROS messages. Here is a small metapackage that defines a ROS UniqueID message, plus C++ and Python methods for using it.
However, I don't understand something. With pub/sub, why send a type signature in every message?
Currently, I believe the MD5sum is verified between the subscriber and the publisher at connection time. Isn't that better for most of our purposes?
* Can extend messages by adding fields in a compatible way.
* Fairly simple IDL.
* Works on some real-time supervisors. (Which ones?)
--joq
--
You received this message because you are subscribed to the Google Groups "ROS SIG NG ROS" group.
To unsubscribe from this group and stop receiving emails from it, send an email to ros-sig-ng-ro...@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.
--
William Woodall
ROS Development Team
Do you mean for the wire format of the message, or for the libraries that encode the message type?
> On Jun 12, 2013, at 7:56 PM, William Woodall wrote:
>
>> On Wed, Jun 12, 2013 at 4:40 PM, Jack O'Quin <jack....@gmail.com> wrote:
>> In my old-fashioned software engineering vocabulary, "requirement" had a very specific meaning: "without this, you have no useful product". Not everyone shares that specialized definition, but it remains an important concept. To make myself clearer, I'll divide some desirable features into three categories: must-have, good-to-have, and nice-to-have.
>>
>> I left some good ideas out because I personally don't consider them important enough to worry about at this high level of discussion. I probably left out some others because I forgot. Some items likely require engineering trade-offs to resolve conflicts between them, but hopefully none are totally contradictory.
>>
>> must-have:
>> -------------
>>
>> * Commercial-friendly, permissive, open-source license (BSD, Apache, MIT, ...).
Actually, now that I think about it, maybe the format should be copyrighted (copylefted?) as well? I mean, is it possible for someone to claim ownership of the format itself after we've defined it, locking everyone out of using it without paying a fee (like Rambus?)? I'd like to avoid that possibility if at all possible.
>> * CPU and storage overhead as good as rosmsg, or close enough.
>> * Excellent language support for C++ and Python.
>> * Works on Linux, OSX, Windows.
>> * Can represent most or all existing ROS messages. A great deal of thought has gone into many of these messages. They have proven useful for a wide range of robotic systems. That is not a trivial accomplishment.
>> * Messages can define fields using messages defined in another package. This follows from the previous requirement, but is worth explicit mention.
>> * Handle very large messages, for things like maps, images and point clouds.
How large is 'very large'? MessagePack handles maps and lists of up to 2^32 items, is that large enough?
>> * Can represent a message as an "object" in supported languages.
Might be difficult; if I serialize a Python object, and then read it out via a C library, what can I expect?
>> * Detect message version inconsistencies somehow.
Can we consider different versions of messages as different types of messages? In that case, typing via UUIDs or some other mechanism can handle it.
On Wed, Jun 12, 2013 at 11:17 AM, Jack O'Quin <jack....@gmail.com> wrote:
> However, I don't understand something. With pub/sub, why send a type signature in every message?
pub/sub assumes that we have a connection-oriented transport. What if we're over UDP, or something else similar? What if we're bagging all messages that all nodes are publishing, and then want to figure out what each message type was afterwards?
> Currently, I believe the MD5sum is verified between the subscriber and the publisher at connection time. Isn't that better for most of our purposes?
Unfortunately, no. Assume we have two different message types, each of which contains a single boolean. One message causes the robot to self-destruct, while the other causes it to take out the garbage. The MD5 of the types will be the same, so figuring out which we're talking about is impossible. If we decide to hash the entire message definition, then we'll do better, but I know I've been guilty of writing basic ping-like messages that have no content whatsoever; the information is in the topic it was transmitted on. With UUIDs, every message type will have a different identifier, and it isn't subject to human failings of various kinds.
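A hedged illustration of the collision argument: hashing only the field list cannot distinguish two single-boolean types, while a UUID minted once per type (and recorded alongside its definition) trivially can. The type names and field strings below are invented for the example:

```python
import hashlib
import uuid

# Both hypothetical types serialize the exact same single-boolean field list.
self_destruct_fields = "bool trigger"
take_out_garbage_fields = "bool trigger"

md5_a = hashlib.md5(self_destruct_fields.encode()).hexdigest()
md5_b = hashlib.md5(take_out_garbage_fields.encode()).hexdigest()
assert md5_a == md5_b  # hashing the fields cannot tell the two types apart

# A UUID generated once per type, at definition time, always can.
id_self_destruct = uuid.uuid4()
id_take_out_garbage = uuid.uuid4()
assert id_self_destruct != id_take_out_garbage
```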
Is there any reason why XMPP & extensions hasn't been suggested for
master & co? If you leave aside the IM aspects, it is a mature,
lightweight, extensible protocol that is well supported, intrinsically
decentralised (i.e. handles multimaster for free), has a large number of
libraries in a number of languages, and existing extensions for
signalling (i.e. negotiation and establishment of out-of-band streams)
and pubsub. It is also an IETF standard that seems to be free of
patent and copyright problems.
The downside is poor support for in-band binary data, but if it is used
as a naming/pubsub/signalling solution then that's not a problem, and
maybe is a benefit in the end for people who want to support pluggable
serialisation libraries anyway.
If the idea is that the ROS 2.0 messaging system will be used for naming
and pubsub as well as passing data between nodes then you have a
chicken-egg problem, which can be avoided with a dedicated naming service.
Using XMPP as one, nodes establish a connection to their local master on
startup as an XMPP client - master would act as an XMPP server. Nodes
can then do pubsub and access services a similar way as they do now.
Nodes on different masters can be accessed transparently, communicating
via XMPP's built-in server-to-server federation support. This would
actually be a good way to access parameters and have parameter changes
pushed, too.
Please forgive the lack of quotations; also, I'm jumping around between topics after reading the last 10 posts or so.
Question: what is the line between serialization and messaging API? I think there is a grey area between the two. There have been requests to not discuss this in this thread; however, there are concepts where API and serialization are tightly coupled.
Message ids (32-bit, md5, or uuid) are an example of this. It is likely that the serialization library itself may not provide a message id or version support, and the ROS API would need to add it. Popular approaches would be to calculate these values from the message IDL. Bit fields are another example: encoding them in serialization is just supporting ints. Using them effectively is an API issue.
As there is likely to be a (hopefully thin) codegen wrapper that adds on to serialization, there are requirements for those tools:
- multi language.
Example: MD5 generation in the current system is Python-only. I have tried to recreate the md5 calculation in C++ code and never got it to work. I actually resorted to writing a Ruby script to walk the ROS install tree, find message files, and parse out the md5 value, as it was faster than the Python tools. A C or C++ library that provided this functionality without wrapping the Python code would be useful.
- minimize client dependencies
- the IDL and extensions should be machine-readable: http://en.wikipedia.org/wiki/Machine-readable_data As a government contractor, machine-readable is usually a requirement for interoperability.
- Deterministic. If I run the tools in Maryland, and person Y runs them in California, the results should be the same. This is one of the reasons I disagree with unique UUIDs. Using a named UUID generation scheme is deterministic. Related to this is that no message is unique; it is always combined with other metadata, such as topic name and version information, which adds semantic meaning.
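As a rough illustration of the kind of deterministic computation being discussed, here is a much-simplified md5 over a flattened field list. The real ROS 1 algorithm also expands nested message types and handles constants, so this is a sketch of the idea, not a reimplementation:

```python
import hashlib

def message_md5(fields):
    """Hash a flattened list of (type, name) field pairs from a .msg file.

    Simplified: ignores constants and does not expand nested message types,
    both of which the real ROS 1 tooling handles.
    """
    text = "\n".join(f"{ftype} {fname}" for ftype, fname in fields)
    return hashlib.md5(text.encode()).hexdigest()

# Same inputs give the same digest in Maryland, California, or anywhere else.
point_fields = [("float64", "x"), ("float64", "y"), ("float64", "z")]
digest = message_md5(point_fields)
assert digest == message_md5(point_fields)
assert len(digest) == 32
```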
Multiplexing messages on the same channel/topic/connection is a common method of interoperability between systems, and is also dependent upon the serialization wrapper functionality we have been discussing. To multiplex, each message needs an additional header which allows the message to be properly delivered. In general this header requires the message type, message size, connection/destination id, and an optional version field. Translating this into ROS 1.0 land, it would be the md5, size, and topic name, with the version currently encoded in the md5. Using ints, this header could be 20 bytes in size. Using ROS 1.0 concepts it would be much larger. In the scope of processing a single message, that's not a big deal. In the scope of hundreds of thousands of messages per second, the added message size becomes non-trivial w.r.t. network load.
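A sketch of one possible fixed-size multiplexing header along the lines described above. The field choice and widths are illustrative only; the sequence field is my own addition to reach the ~20 bytes mentioned:

```python
import struct

# Hypothetical layout: type id, version, connection id, payload size, sequence,
# each a little-endian uint32.
HEADER = struct.Struct("<IIIII")
assert HEADER.size == 20

def frame(msg_type: int, version: int, conn_id: int, seq: int,
          payload: bytes) -> bytes:
    # Prepend the fixed header so multiple message types can share one channel.
    return HEADER.pack(msg_type, version, conn_id, len(payload), seq) + payload

def deframe(data: bytes):
    # Split the header back out and slice exactly the declared payload size.
    msg_type, version, conn_id, size, seq = HEADER.unpack_from(data)
    return (msg_type, version, conn_id, seq), data[HEADER.size:HEADER.size + size]

meta, body = deframe(frame(7, 1, 42, 0, b"ping"))
assert meta == (7, 1, 42, 0) and body == b"ping"
```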
On Mon, Jun 17, 2013 at 10:31 AM, Bob Dean <bob....@gmail.com> wrote:
> - Deterministic. If I run the tools in Maryland, and person Y runs them in California, the results should be the same. This is one of the reasons I disagree with unique UUIDs. Using a named UUID generation scheme is deterministic. Related to this is that no message is unique, it is always combined with other metadata such as topic name and version information which adds semantic meaning.
I'm afraid I still have to disagree with you on this. The UUID needs to be different because while the fields are identical between different message types, the meanings may not be. As I mentioned before, the canonical example is the empty message. The meaning may depend on the context in which it is sent (e.g., topic, sender, etc.).
Moreover, a given message may change meaning over time. This might happen while a system is being developed, with the meaning shifting subtly over time, while the content stays the same. In this case, we don't want to reuse the message ID, which would happen if we used the hash of the message as the ID.
On Wed, Jun 19, 2013 at 7:31 AM, CFK <cfka...@gmail.com> wrote:
> On Mon, Jun 17, 2013 at 10:31 AM, Bob Dean <bob....@gmail.com> wrote:
>> - Deterministic. If I run the tools in Maryland, and person Y runs them in California, the results should be the same. This is one of the reasons I disagree with unique UUIDs. Using a named UUID generation scheme is deterministic. Related to this is that no message is unique, it is always combined with other metadata such as topic name and version information which adds semantic meaning.
> I'm afraid I still have to disagree with you on this. The UUID needs to be different because while the fields are identical between different message types, the meanings may not be. As I mentioned before, the canonical example is the empty message. The meaning may depend on the context in which it is sent (e.g., topic, sender, etc.).

-1 for uuids as a way of capturing semantic information. +1 for determinism.

tl;dr: I think UUIDs will add complexity, but won't add much in terms of safety.

Longer version: Currently pubs and subs negotiate and have to agree on the md5sum (representing the message definition) as well as the message type (e.g., sensor_msgs/LaserScan) before exchanging data, and I think that is sufficient to eliminate most of the errors you get when using a message passing system. As Cem has pointed out, a remaining problem is that messages might have the same fields and type, but be semantically different. However, I think that's where topic names come in. If an application still really needs to differentiate these messages, they should use something like Jack's UUID library.

> Moreover, a given message may change meaning over time. This might happen while a system is being developed, with the meaning shifting subtly over time, while the content stays the same. In this case, we don't want to reuse the message ID, which would happen if we used the hash of the message as the ID.
Is this also a +1 for requiring the ability to evolve message definitions over time (e.g., optional fields in protobuf)?
Optional fields only allow you to add more fields at a later date. That is handy, but it still doesn't address the original problem of shifting meanings. I'm trying to make the semantic meaning of a message clear despite the fact that its content, message definition, and topic may remain the same. MD5 won't let us do that.
If you want to go for a pub/sub negotiation method, then use UUIDs instead of MD5; the number of bits are the same, and you get the freedom to change semantic meaning without being forced to change the fields themselves.
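For what it's worth, RFC 4122 name-based (version 5) UUIDs give exactly the deterministic scheme Bob describes while still producing a distinct 128-bit id per type name: the same namespace and name yield the same UUID on any machine, any day. The namespace choice below is hypothetical:

```python
import uuid

# Derive a project namespace once; any agreed-upon, fixed namespace would do.
ros_ns = uuid.uuid5(uuid.NAMESPACE_DNS, "ros.org")  # hypothetical choice

scan_id = uuid.uuid5(ros_ns, "sensor_msgs/LaserScan")
assert scan_id == uuid.uuid5(ros_ns, "sensor_msgs/LaserScan")  # reproducible anywhere
assert scan_id != uuid.uuid5(ros_ns, "sensor_msgs/Imu")        # distinct per type name
```

Under this scheme a semantic change can be signalled by changing the name fed into the generator (e.g. appending a version suffix), without touching the fields themselves.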