The future of ROS 2.0 protocol changes

Aaron Sims

Sep 14, 2014, 7:55:33 PM

to ros-sig...@googlegroups.com

Dear ROS Teams,

Thanks for an informative time at ROSCon 2014. I had to leave about an hour and a half early to catch a flight, and I was left with my head spinning about the decision to change protocols in ROS 2.0. This is an important decision, and I will share my concerns, as well as my thoughts on the direction I personally would have taken with ROS 2.0 and its protocols. This topic seems a better fit for a blog; however, an open discussion is important.

First, let me say that the direction of ROS 2.0 makes sense, but the complete shift to the DDS protocol does not seem wise. There are many good reasons to implement DDS in ROS; however, the way it is implemented can make or break ROS in the future.

Many well-intended protocol implementations end up at dead ends. Two examples are:
HTTP NG (Next Generation)
http://www.w3.org/Protocols/HTTP-NG/Activity.html
Gnutella 2
http://en.wikipedia.org/wiki/Gnutella2

I'd like to share the approach of the Happy Artist RMDMIA RCSM (Robot Control System Messenger) API. The RCSM implements a plugin architecture that allows multiple robot operating system clients to be registered and accessed, to run simultaneously, and to interoperate, or simply to plug in a single client such as the Happy Artist ROS Client for TCPROS/UDPROS. For a ROS user, this allows any protocol to plug in and work, whether a 3rd party vendor DDS implementation, TCPROS/UDPROS, or any other protocol. In the Happy Artist RMDMIA implementation, the RCSM component serves as an interface to the rest of the system so that protocols can change while code written for autonomous/user control follows the same pattern, allowing interchangeability and forward/backward compatibility with other aspects of the system. If ROS were to approach 2.0 like this, libraries like MoveIt (just an example of a library name; I have no idea whether it is tied to the TCPROS/UDPROS protocols in the ROS 1.0 implementation) could run on any communication protocol without ever needing to know anything about the protocol they were operating on. This approach needs to be applied to all areas of the system so that a user of any component can move their libraries to ROS 2.0 with 100% forward and backward compatibility.
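
To make the plugin idea concrete, here is a minimal sketch in Python. It is not the actual RCSM API: the names Transport, InProcessTransport and Messenger are hypothetical, and the "protocols" are in-process stand-ins rather than real TCPROS, UDPROS or DDS clients. It only shows application code publishing and subscribing without knowing which protocol plugins are registered.

```python
from abc import ABC, abstractmethod
from collections import defaultdict
from typing import Callable, Dict, List

Callback = Callable[[bytes], None]


class Transport(ABC):
    """Hypothetical plugin interface; one subclass per wire protocol."""

    @abstractmethod
    def publish(self, topic: str, payload: bytes) -> None: ...

    @abstractmethod
    def subscribe(self, topic: str, callback: Callback) -> None: ...


class InProcessTransport(Transport):
    """Stand-in for a real protocol client; delivers messages locally."""

    def __init__(self) -> None:
        self._subscribers: Dict[str, List[Callback]] = defaultdict(list)

    def publish(self, topic: str, payload: bytes) -> None:
        for callback in self._subscribers[topic]:
            callback(payload)

    def subscribe(self, topic: str, callback: Callback) -> None:
        self._subscribers[topic].append(callback)


class Messenger:
    """Registry that application code talks to instead of a protocol."""

    def __init__(self) -> None:
        self._transports: Dict[str, Transport] = {}

    def register(self, name: str, transport: Transport) -> None:
        self._transports[name] = transport

    def publish(self, topic: str, payload: bytes) -> None:
        # Fan out to every registered protocol plugin.
        for transport in self._transports.values():
            transport.publish(topic, payload)

    def subscribe(self, topic: str, callback: Callback) -> None:
        for transport in self._transports.values():
            transport.subscribe(topic, callback)


messenger = Messenger()
messenger.register("tcpros_stub", InProcessTransport())
messenger.register("dds_stub", InProcessTransport())

received: List[bytes] = []
messenger.subscribe("/joint_states", received.append)
messenger.publish("/joint_states", b"\x01\x02")
```

With two stub plugins registered, the subscriber receives the message once per plugin. A real implementation would also have to decide how to deduplicate or route between protocols, which this sketch leaves out.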

I am personally concerned that researchers may find themselves using ROS alternatives due to the time/money that will be lost while they wait for a viable ROS 2.0 solution. Why would a researcher/developer invest resources into a platform that was going to completely change nullifying their personal investment in that platform?

It is evident ROS 2.0 needs support similar to ACID transactions for autonomous robotics that the DDS implementation will support (primarily for people safety, and system stability of commercialized autonomous systems).

It is a bad, bad idea to throw the current ROS users under the bus in the interim. The implementation architecture direction for ROS 2.0 I am suggesting alleviates this scenario, and allows ROS to backtrack a little with its users to continue improving ROS 1.0 functionality in TCPROS, and UDPROS.

In the interim here are some ROS 1.0 improvements that could greatly improve ROS performance with substantial latency decrease, while increasing data bandwidth:

Add support for UDPROS on every ROS supported client, and server with a few minor enhancements. (Some of these suggestions came out of the BOF at ROSCon discussion on latency and performance, and some I have been attempting to subtly hint at with multiple questions on ROS Answers over the last year).

UDP Jumbogram support: This is a very simple enhancement. Allow unbounded UDP datagram sizes: during the protocol handshake (XMLRPC negotiation in UDPROS), add a flag to the publisher configuration that either allows UDP jumbogram support or specifies the maximum datagram size. ROS currently puts its datagram size limit around 1500 bytes. If the system hardware supports jumbograms, the only change is supporting a configurable maximum datagram size.
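
As a rough illustration of what a configurable maximum datagram size changes, the sketch below splits one message into datagrams for a given size budget. The 8-byte header layout and the opcode values are assumptions modeled loosely on UDPROS, the function name fragment is hypothetical, and 9000 bytes is used only as a typical jumbo-frame figure. It is not taken from any ROS client library.

```python
import struct
from typing import List

# Assumed header layout: connection id, opcode, message id, block field.
HEADER = struct.Struct("<IBBH")


def fragment(payload: bytes, max_datagram_size: int,
             connection_id: int = 1, message_id: int = 0) -> List[bytes]:
    """Split payload into datagrams no larger than max_datagram_size."""
    chunk_size = max_datagram_size - HEADER.size
    if chunk_size <= 0:
        raise ValueError("max_datagram_size must exceed the header size")
    chunks = [payload[i:i + chunk_size]
              for i in range(0, len(payload), chunk_size)] or [b""]
    datagrams = []
    for index, chunk in enumerate(chunks):
        # Assumption: the first datagram carries the total block count,
        # later datagrams carry their own block index.
        opcode = 0 if index == 0 else 1
        block_field = len(chunks) if index == 0 else index
        header = HEADER.pack(connection_id, opcode, message_id, block_field)
        datagrams.append(header + chunk)
    return datagrams


payload = bytes(60000)               # e.g. one mid-sized sensor message
standard = fragment(payload, 1500)   # roughly today's limit
jumbo = fragment(payload, 9000)      # negotiated larger limit
```

For this 60000-byte payload the 1500-byte budget needs 41 datagrams and the 9000-byte budget needs 7, so the per-message header and syscall overhead drops accordingly.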

UDP Multicast support: Adding a header attribute called multicast could allow a topic or service (I am not aware that services are currently supported over UDPROS, but they should be) to configure the associated Publisher to send to a multicast group. The concern that multiple subscribers to a topic could cause a major problem with system performance can be alleviated by a feature requested by a person at the BOF: specifying a numeric limit on the subscribers of a particular topic. Note: Based on my personal UDPROS implementation experience, Multicast support would be fairly simple to add to the Happy Artist Java client, so I assume the same might apply on other platforms that have UDPROS support.
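
For reference, the socket-level part of UDP multicast is small. The sketch below uses only the Python standard library; the group address, port and function names are assumptions for illustration, and it covers none of the ROS-side negotiation (header attribute, discovery, connection bookkeeping).

```python
import socket
import struct

GROUP = "239.255.0.1"  # assumed administratively scoped multicast group
PORT = 45000           # assumed port


def membership_request(group: str, interface: str = "0.0.0.0") -> bytes:
    """Pack the ip_mreq structure expected by IP_ADD_MEMBERSHIP."""
    return struct.pack("4s4s", socket.inet_aton(group),
                       socket.inet_aton(interface))


def make_publisher_socket(ttl: int = 1) -> socket.socket:
    """UDP socket whose multicast datagrams stay on the local network."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_MULTICAST_TTL,
                    struct.pack("b", ttl))
    return sock


def make_subscriber_socket(group: str = GROUP, port: int = PORT) -> socket.socket:
    """UDP socket bound to the port and joined to the multicast group."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind(("", port))
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP,
                    membership_request(group))
    return sock


def publish(sock: socket.socket, payload: bytes,
            group: str = GROUP, port: int = PORT) -> None:
    """One send reaches every subscriber that joined the group."""
    sock.sendto(payload, (group, port))
```

A subscriber would call make_subscriber_socket() and then recvfrom(); a publisher would call publish(make_publisher_socket(), data). Note that the publisher never learns who joined the group.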

Maximum Topic Subscribers support: Allow the system designer to specify the maximum number of subscribers on any given topic. The use case given was a high bandwidth topic that multiple users subscribe to, causing the system to hang. The user wanted to implement a UDPROS topic with Multicast support for a single subscriber. Note: Based on my personal (somewhat limited) knowledge of the CPP client, it uses low level arrays rather than Data Collection Objects, so the incremental nature of arrays in CPP code should make this task fairly straightforward to implement.
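
The subscriber cap could look roughly like the following. This is a hypothetical sketch: LimitedPublisher and SubscriberLimitReached are invented names, not part of roscpp or any other client, and the delivery is in-process.

```python
from typing import Callable, Dict, Optional


class SubscriberLimitReached(Exception):
    """Raised when a topic already has its maximum number of subscribers."""


class LimitedPublisher:
    def __init__(self, topic: str, max_subscribers: Optional[int] = None) -> None:
        self.topic = topic
        self.max_subscribers = max_subscribers  # None means unlimited
        self._subscribers: Dict[str, Callable[[bytes], None]] = {}

    def add_subscriber(self, subscriber_id: str,
                       callback: Callable[[bytes], None]) -> None:
        is_new = subscriber_id not in self._subscribers
        at_limit = (self.max_subscribers is not None
                    and len(self._subscribers) >= self.max_subscribers)
        if is_new and at_limit:
            raise SubscriberLimitReached(
                f"{self.topic} allows at most {self.max_subscribers} subscriber(s)")
        self._subscribers[subscriber_id] = callback

    def remove_subscriber(self, subscriber_id: str) -> None:
        self._subscribers.pop(subscriber_id, None)

    def publish(self, payload: bytes) -> None:
        for callback in self._subscribers.values():
            callback(payload)


camera = LimitedPublisher("/camera/image_raw", max_subscribers=1)
frames = []
camera.add_subscriber("recorder", frames.append)

try:
    camera.add_subscriber("viewer", lambda payload: None)
    second_accepted = True
except SubscriberLimitReached:
    second_accepted = False

camera.publish(b"frame-0")
```

Here the second subscriber is refused and only the first one receives the frame. How a refused subscriber, or an introspection tool, should be told about the refusal is the part a real design would need to settle.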

If you have any questions or would like to have further discussion, I look forward to discussion.

Sincerely,

Aaron

Jonathan Bohren

Sep 14, 2014, 11:52:19 PM

to ros-sig-ng-ros

On Sun, Sep 14, 2014 at 7:55 PM, Aaron Sims <aaronhap...@gmail.com> wrote:

If you have any questions or would like to have further discussion, I look forward to discussion.

First, a lot of what you're saying sounds like it's in the same vein as something I posted following ROSCon 2013, and I still agree that there's a lot that can be done with minimal effort to improve ROS 1.x now and as ROS 2.0 starts getting rolled out:

https://groups.google.com/forum/#!topic/ros-sig-ng-ros/cMCVHwaBVIU

In response to my post, Esteve Fernandez put together a simple re-implementation of the ROS core which used ZeroMQ, and while it seemed like a great first step, none of these improvements ever made it into the ROS core. I believe most of this is because resources at OSRF are limited, and work which might look like hacking improvements into the current ROS core may not seem as important as potentially re-working it from the ground up.

On Sun, Sep 14, 2014 at 7:55 PM, Aaron Sims <aaronhap...@gmail.com> wrote:

It is a bad, bad idea to throw the current ROS users under the bus in the interim. The implementation architecture direction for ROS 2.0 I am suggesting alleviates this scenario, and allows ROS to backtrack a little with its users to continue improving ROS 1.0 functionality in TCPROS, and UDPROS.

I also agree that it's really important not to make things a mess for ROS users in the transition from ROS 1.0 to ROS 2.0. Especially since there's a huge wealth of code out there which isn't even working with current releases.

With respect to improving ROS 1.0 without fundamentally breaking anything that currently works, I think there's plenty of motivation and skill out there [1] [2] [3], but it really needs to be organized, focused, and coordinated so that it makes its way into the ROS core releases. If it's going to reach everyone, then this organization needs to come from OSRF or someone else who can be devoted to such an effort full-time.

[1] http://wiki.ros.org/ethzasl_message_transport

[2] http://micros.nudt.edu.cn/

[3] https://github.com/esteve/ros_comm/tree/zeromq_thrift

-jon

Mike Purvis

Sep 15, 2014, 10:51:14 AM

to ros-sig...@googlegroups.com

I appreciated the talks from Dirk and William about the DDS decision and the future of the APIs— I like the look of this future, especially the effort to unify node/nodelet APIs so that the decision to run nodes in many processes vs one is made at launch time rather than at authoring time.

I don't feel at all thrown under the bus by DDS/2.0; much as we did with catkin, Clearpath will continue to use the legacy tools for a period of time while we evaluate the new hotness, and then gradually switch our software over, probably beginning with peripheral drivers and generic components, and finishing with the platform software itself.

That said, some concerns:

As raised during the Q&A, it is critical that ROS 2.0 reverse the trend of making each ROS release harder and harder to get into, learn about, and get started with. Given the state of things presently, it's a very legitimate concern that a further fracturing of documentation/tutorial effort could render ROS 2.0 completely inaccessible to anyone who was not already an expert in ROS 1.x. It'll be important that the majority of documentation explaining ROS 2.0 is targeted at non-users, not migrating ROS 1.0 users.
Along these same lines, the installation needs to get shorter and simpler: rosdep, rosinstall, and wstool should be installed automatically, and the rosdep (xylem?) init/update should be part of its install, not a separate step. For ROS 2.0's supported platforms, I feel we should strive to offer a "one step setup", perhaps a homebrew-like python -c "$(curl -fsSL https://ros.org/install)", or a downloadable "install deb" with an installation walk-through. Power users might still prefer to do everything their own way, but for the majority, especially those getting started for the first time, I feel strongly that it should be a one-click affair.
I'm unsure if the future of ROS community docs are the wiki or some version-controlled sphinx-type thing, but either way, having a ROS1/2 switcher like the rosbuild/catkin one isn't enough. There are still rosbuild-centric tutorial pages on the ROS wiki— how about a warning box which appears automatically on wiki pages which haven't been thumb-upped in the last 6 months, perhaps combined with a "hot list" page that shows warning boxed pages sorted by traffic to them. (This doesn't need to wait for ROS 2.0; it would be great to have immediately...)
ROS is more than just a generic publish/subscribe middleware— a lot more. I'm glad to hear about the intent to fix the parameter service, but what about issues specific to robotics, like runtime dynamic URDF, or multimaster? These are tough problems, and it'd be great to see OSRF take a really active role in solving these, either by spearheading development, or providing direct counsel-to and endorsement-of other groups working on these.
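
The stale-page warning and "hot list" suggested above amount to a filter and a sort. A minimal sketch follows, assuming the wiki can report a last thumbs-up date and a traffic count per page; the WikiPage fields, the page titles and the numbers are all made up for illustration.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import List


@dataclass
class WikiPage:
    title: str
    last_thumbs_up: date
    monthly_views: int


def stale_hot_list(pages: List[WikiPage], today: date,
                   max_age_days: int = 183) -> List[WikiPage]:
    """Pages not thumbed-up within max_age_days, busiest first."""
    cutoff = today - timedelta(days=max_age_days)
    stale = [page for page in pages if page.last_thumbs_up < cutoff]
    return sorted(stale, key=lambda page: page.monthly_views, reverse=True)


pages = [
    WikiPage("rosbuild tutorial (hypothetical)", date(2012, 5, 1), 900),
    WikiPage("catkin tutorial (hypothetical)", date(2014, 8, 1), 5000),
    WikiPage("install guide (hypothetical)", date(2013, 1, 10), 4000),
]
hot_list = stale_hot_list(pages, today=date(2014, 9, 15))
hot_titles = [page.title for page in hot_list]
```

Every page in hot_list would get the warning box, and the ordering tells maintainers which stale pages are misleading the most readers.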

Hope that's helpful. Thanks again for a great ROSCon.

Mike

Brian Gerkey

Sep 15, 2014, 11:12:51 AM

to ros-sig...@googlegroups.com

Looks like we're doing this over on ros-users@, at least for a while.

--
You received this message because you are subscribed to the Google Groups "ROS SIG NG ROS" group.
To unsubscribe from this group and stop receiving emails from it, send an email to ros-sig-ng-ro...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

William Woodall

Sep 17, 2014, 9:39:16 PM

to ros-sig...@googlegroups.com

Thanks Mike, I'll try to address some of these points inline, and hopefully they will cover the salient points. Also I am getting around to answering everyone's emails inline, but it takes a lot of time, so bear with me.

On Mon, Sep 15, 2014 at 7:51 AM, Mike Purvis <mpu...@clearpathrobotics.com> wrote:

I appreciated the talks from Dirk and William about the DDS decision and the future of the APIs— I like the look of this future, especially the effort to unify node/nodelet APIs so that the decision to run nodes in many processes vs one is made at launch time rather than at authoring time.

I don't feel at all thrown under the bus by DDS/2.0; much as we did with catkin, Clearpath will continue to use the legacy tools for a period of time while we evaluate the new hotness, and then gradually switch our software over, probably beginning with peripheral drivers and generic components, and finishing with the platform software itself.

That is how I envision the transition to ROS 2.0: attrition. People should feel free to experiment with ROS 2.0 in their ROS 1 systems and convert as they see fit and as the value proposition of ROS 2.0 improves. In the meantime we are not changing ROS 1, allowing users to focus on their application without worrying about sweeping changes with every new ROS distribution. This cool down of ROS 1 development will also help us fix documentation and tutorials and let them stay fixed.

That said, some concerns:
As raised during the Q&A, it is critical that ROS 2.0 reverse the trend of making each ROS release harder and harder to get into, learn about, and get started with. Given the state of things presently, it's a very legitimate concern that a further fracturing of documentation/tutorial effort could render ROS 2.0 completely inaccessible to anyone who was not already an expert in ROS 1.x. It'll be important that the majority of documentation explaining ROS 2.0 is targeted at non-users, not migrating ROS 1.0 users.

If I think about the areas of ROS which, in my opinion, contribute the most to the steepness of the learning curve, I believe we have ideas for how to improve some of those parts in ROS 2.0. Here are just some examples:

Build system/Build tools: We are working on addressing this in ROS 1, and any improvements in ROS 1 will be available in ROS 2:

We finally ratified and deployed package.xml version 2 which adds the <depend> tag among other improvements.
We have been working on catkin_tools along with help from the community (especially Jonathan Bohren) and it will address many of the usability issues with building and debugging catkin workspaces.
We have a prototype for catkin_simple which automates almost all of the CMake code for catkin packages, which I plan to focus on after we get catkin_tools out and being used.
After those things are in place I plan to work on build system tutorials to help with the current lack of centralized, top to bottom documentation. All of this will go on while we are working on core parts of ROS 2.

Version fragmentation: ROS 1 is suffering from a buildup of incremental changes, e.g. "new in X", which will be easier to avoid with ROS 2.0, at least at the beginning. I would also advocate for tying documentation versions to the VCS for packages. This allows users to get the right version of documentation for each package (and doesn't exclude "new in X" style messages).
Aftermarket features: These are things like dynamic_reconfigure and actions, which are second-class citizens in ROS 1 and are hard to use as a result. By making them built-in concepts or merging them with existing concepts (parameters and dynamic reconfigure) we can make them simple to use and document.
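
On the package.xml version 2 point above: in format 2, a single <depend> tag stands in for build_depend, build_export_depend and exec_depend together. The sketch below shows that expansion with the standard library XML parser. The package name and dependency list are hypothetical, and expand_dependencies is an illustrative helper, not part of catkin or catkin_pkg.

```python
import xml.etree.ElementTree as ET

# Hypothetical manifest used only for illustration.
PACKAGE_XML = """<?xml version="1.0"?>
<package format="2">
  <name>example_pkg</name>
  <version>0.1.0</version>
  <description>Hypothetical package used only for illustration.</description>
  <maintainer email="dev@example.com">Example Dev</maintainer>
  <license>BSD</license>
  <buildtool_depend>catkin</buildtool_depend>
  <depend>roscpp</depend>
  <exec_depend>rviz</exec_depend>
</package>
"""

# The three dependency kinds that <depend> is shorthand for in format 2.
EXPANSION = ("build_depend", "build_export_depend", "exec_depend")


def expand_dependencies(xml_text: str) -> dict:
    """Return the effective dependency lists after expanding <depend>."""
    root = ET.fromstring(xml_text)
    if root.get("format") != "2":
        raise ValueError("this sketch only handles package format 2")
    resolved = {tag: set() for tag in EXPANSION}
    for tag in EXPANSION:
        resolved[tag].update(el.text.strip() for el in root.findall(tag))
    for el in root.findall("depend"):
        for tag in EXPANSION:
            resolved[tag].add(el.text.strip())
    return {tag: sorted(names) for tag, names in resolved.items()}


resolved = expand_dependencies(PACKAGE_XML)
```

In this example roscpp ends up in all three lists while rviz stays an execution-only dependency, which is the repetition the <depend> tag removes from hand-written manifests.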

Along these same lines, the installation needs to get shorter and simpler: rosdep, rosinstall, and wstool should be installed automatically, and the rosdep (xylem?) init/update should be part of its install, not a separate step. For ROS 2.0's supported platforms, I feel we should strive to offer a "one step setup", perhaps a homebrew-like python -c "$(curl -fsSL https://ros.org/install)", or a downloadable "install deb" with an installation walk-through. Power users might still prefer to do everything their own way, but for the majority, especially those getting started for the first time, I feel strongly that it should be a one-click affair.

Since we sort of have a clean slate, we can choose our dependencies more carefully. Also DDS replaces some of our dependencies with one single dependency. Since this will be simplified we can focus on making ROS 2 more portable. We also have plans to make an SDK packaging of variants like desktop and ros-core. This will make it easier to get started with ROS on more platforms, further minimizing the barrier to entry. Many of these things like the SDK packaging could be used on ROS 1 if successful, but it is easier to experiment with ROS 2 since it is in a smaller scope for now.

I'm unsure if the future of ROS community docs are the wiki or some version-controlled sphinx-type thing, but either way, having a ROS1/2 switcher like the rosbuild/catkin one isn't enough. There are still rosbuild-centric tutorial pages on the ROS wiki— how about a warning box which appears automatically on wiki pages which haven't been thumb-upped in the last 6 months, perhaps combined with a "hot list" page that shows warning boxed pages sorted by traffic to them. (This doesn't need to wait for ROS 2.0; it would be great to have immediately...)

I like the idea of a version controlled documentation system on something like readthedocs.org, but I imagine there will be some discussion that needs to be had there.

ROS is more than just a generic publish/subscribe middleware— a lot more. I'm glad to hear about the intent to fix the parameter service, but what about issues specific to robotics, like runtime dynamic URDF, or multimaster? These are tough problems, and it'd be great to see OSRF take a really active role in solving these, either by spearheading development, or providing direct counsel-to and endorsement-of other groups working on these.

These issues can be worked on in parallel with our other ROS 2.0 efforts by interested parties, but I feel strongly that our improvements to the core infrastructure are needed in order to support future generations of solutions for these issues. Once we get a handle on the core system we can use our time at OSRF to champion the resolution of these issues if no one else has already done it. However, in the meantime I think it is important for us to go depth first on the core issues before looking into the breadth of potential future work.

Hope that's helpful. Thanks again for a great ROSCon.

Mike

William Woodall

ROS Development Team

wil...@osrfoundation.org

http://wjwwood.io/

William Woodall

Sep 17, 2014, 9:56:30 PM

to ros-sig...@googlegroups.com

Hi Jonathan, let me try to touch on a few of these points too:

On Sun, Sep 14, 2014 at 8:51 PM, Jonathan Bohren <jonatha...@gmail.com> wrote:

On Sun, Sep 14, 2014 at 7:55 PM, Aaron Sims <aaronhap...@gmail.com> wrote:
If you have any questions or would like to have further discussion, I look forward to discussion.
First, a lot of what you're saying sounds like it's in the same vein as something I posted following ROSCon 2013, and I still agree that there's a lot that can be done with minimal effort to improve ROS 1.x now and as ROS 2.0 starts getting rolled out:
https://groups.google.com/forum/#!topic/ros-sig-ng-ros/cMCVHwaBVIU

In response to my post, Esteve Fernandez put together a simple re-implementation of the ROS core which used ZeroMQ, and while it seemed like a great first step, none of these improvements ever made it into the ROS core. I believe most of this is because resources at OSRF are limited, and work which might look like hacking improvements into the current ROS core may not seem as important as potentially re-working it from the ground up.

I agree that there are a lot of things we could do in ROS 1's transport system. We could replace TCPROS/UDPROS with ZeroMQ and friends, and then gain some improvements in the system, like maybe some performance and less code in roscpp. However, maintaining roscpp isn't the main burden for us; it is the fact that there are improvements we want to make, and making them to roscpp requires either modifying the API or extending a custom system which is not being used by anyone but us. So when approaching ROS 2.0 the only options on the table were using ZeroMQ and friends to build a replacement with improvements for ROS 1's transport, or using something like DDS to replace it all. Iterating on ROS 1 and only making non-breaking changes was never the goal, because I believe the improvements which can be made with this approach are limited in scope.

Also, in case anyone didn't realize, Esteve works for us at OSRF on the ROS team, so maybe he can comment on his work replacing roscpp internals with ZeroMQ and thrift.

On Sun, Sep 14, 2014 at 7:55 PM, Aaron Sims <aaronhap...@gmail.com> wrote:
It is a bad, bad idea to throw the current ROS users under the bus in the interim. The implementation architecture direction for ROS 2.0 I am suggesting alleviates this scenario, and allows ROS to backtrack a little with its users to continue improving ROS 1.0 functionality in TCPROS, and UDPROS.

I also agree that it's really important not to make things a mess for ROS users in the transition from ROS 1.0 to ROS 2.0. Especially since there's a huge wealth of code out there which isn't even working with current releases.
With respect to improving ROS 1.0 without fundamentally breaking anything that currently works, I think there's plenty of motivation and skill out there [1] [2] [3], but it really needs to be organized, focused, and coordinated so that it makes its way into the ROS core releases. If it's going to reach everyone, then this organization needs to come from OSRF or someone else who can be devoted to such an effort full-time.

We heard overwhelmingly from users that they did not want more breaking changes at the lowest level of ROS 1 and that they wanted us to slow down development, and from others that there were fundamental shortfalls in the abilities of TCPROS/UDPROS. So given those desires we obviously chose to build the improvements in from the ground up; the only questions that remained were what improvements and what dependencies (ZeroMQ vs DDS).

[1] http://wiki.ros.org/ethzasl_message_transport

For this one, they must change the API to support the new features. So this is not a non-breaking change and is explicitly opt-in, which I think is perfectly fine. People interested in their approach can take advantage of it if they like, without us trying to change the API a little at a time.

[2] http://micros.nudt.edu.cn/

I can't find anything there (in English at least) which describes their approach, but if they are doing QoS and realtime they will also be modifying the API.

[3] https://github.com/esteve/ros_comm/tree/zeromq_thrift

Esteve can comment on this if he likes. I don't disagree that it would be nice to replace TCPROS with ZeroMQ and friends, but I don't think it makes sense in a cost/benefit analysis.

-jon

William Woodall

Sep 17, 2014, 10:50:48 PM

to ros-sig...@googlegroups.com

Hi Aaron, I appreciate you taking the time to give us some feedback. Let me see if I can address some of your proposals and questions inline. However, ultimately we may need to break off into smaller, more focused discussions.

On Sun, Sep 14, 2014 at 4:55 PM, Aaron Sims <aaronhap...@gmail.com> wrote:

Dear ROS Teams,

Thanks for an informative time at ROSCon 2014. I had to leave about an hour and a half early to catch a flight, and I was left with my head spinning about the decision to change protocols in ROS 2.0. This is an important decision, and I will share my concerns, as well as my thoughts on the direction I personally would have taken with ROS 2.0 and its protocols. This topic seems a better fit for a blog; however, an open discussion is important.

First, let me say that the direction of ROS 2.0 makes sense, but the complete shift to the DDS protocol does not seem wise. There are many good reasons to implement DDS in ROS; however, the way it is implemented can make or break ROS in the future.

Many well-intended protocol implementations end up at dead ends. Two examples are:
HTTP NG (Next Generation) 
http://www.w3.org/Protocols/HTTP-NG/Activity.html
Gnutella 2
http://en.wikipedia.org/wiki/Gnutella2

Trust me, I was skeptical of DDS in college and when we started looking at it for ROS 2, but having taken a hard look at it and where it has been used, I have a hard time finding any technical problems with it or situations where its credibility as an evolving and growing standard is diminished. Simply pointing out the fact that there are other protocols which have not succeeded is not good evidence about DDS's future, in my opinion.

I'd like to share the approach of the Happy Artist RMDMIA RCSM (Robot Control System Messenger) API.

I tried to find this on your site, but apparently I'm missing it, what does RMDMIA stand for?

The RCSM implements a plugin architecture that allows multiple robot operating system clients to be registered and accessed, to run simultaneously, and to interoperate, or simply to plug in a single client such as the Happy Artist ROS Client for TCPROS/UDPROS. For a ROS user, this allows any protocol to plug in and work, whether a 3rd party vendor DDS implementation, TCPROS/UDPROS, or any other protocol. In the Happy Artist RMDMIA implementation, the RCSM component serves as an interface to the rest of the system so that protocols can change while code written for autonomous/user control follows the same pattern, allowing interchangeability and forward/backward compatibility with other aspects of the system. If ROS were to approach 2.0 like this, libraries like MoveIt (just an example of a library name; I have no idea whether it is tied to the TCPROS/UDPROS protocols in the ROS 1.0 implementation) could run on any communication protocol without ever needing to know anything about the protocol they were operating on. This approach needs to be applied to all areas of the system so that a user of any component can move their libraries to ROS 2.0 with 100% forward and backward compatibility.

Ok a few points to make here, first this sounds a lot like something our own Morgan Quigley was a fan of, which he called the Multi Middleware Matrix, which he pronounced "Mmm... ROS" :). In this system we would have a middleware abstraction layer which could be implemented by anything, ROS 1's TCP/UDPROS, DDS, ZeroMQ, MQTT, etc...

The problem we identified with this approach was complexity. In order to provide an interface that was this flexible, we would either need to make it ridiculously limited and implement all the higher level features above this interface, often reimplementing functionality which some of the middlewares provide but others don't, or, worse, we would have to work hard to make limited protocols like TCPROS implement the more advanced features we want to expose in ROS 2 via the abstraction layer.
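
The trade-off can be shown with a small sketch. The backends and their feature sets below are hypothetical placeholders (backend_a, backend_b, backend_c), not a statement about what TCPROS, DDS or ZeroMQ actually provide; the point is only how the two options play out once feature sets differ.

```python
from typing import Dict, Set

# Hypothetical feature sets, for illustration only.
BACKEND_FEATURES: Dict[str, Set[str]] = {
    "backend_a": {"pubsub", "latching"},
    "backend_b": {"pubsub", "latching", "deadline_qos", "multicast"},
    "backend_c": {"pubsub", "multicast"},
}

# What the higher level API would like to expose.
WANTED: Set[str] = {"pubsub", "latching", "deadline_qos", "multicast"}


def lowest_common_denominator(backends: Dict[str, Set[str]]) -> Set[str]:
    """Option 1: the interface exposes only what every backend has."""
    return set.intersection(*backends.values())


def features_to_emulate(backends: Dict[str, Set[str]],
                        wanted: Set[str]) -> Dict[str, Set[str]]:
    """Option 2: expose everything, emulating what each backend lacks."""
    return {name: wanted - features for name, features in backends.items()}


common = lowest_common_denominator(BACKEND_FEATURES)
emulation_work = features_to_emulate(BACKEND_FEATURES, WANTED)
```

With these made-up sets, option 1 leaves an interface that offers plain pub/sub only, and option 2 means writing and maintaining emulation code for two of the three backends.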

In the end we decided that making a bridge to ROS 1 was more tractable than a transparent compatibility layer using the MMM. We have a placeholder for this on our design website which we haven't had time to fill out, but we intend to: https://github.com/ros2/design/blob/gh-pages/articles/40_ros_with_modular_middleware.md

That being said, we have an interface in ROS 2 currently called the `ros_middleware_interface` which provides this abstraction to some degree. This will be a C interface which hides all DDS details (which are never used above it) and that will be used to create bindings in other languages. So we have this abstraction, but we only plan to support DDS, at least for now.

MoveIt! is designed to be decoupled from ROS, but in practice I don't know how easy it is to decouple from ROS.

I am personally concerned that researchers may find themselves using ROS alternatives due to the time/money that will be lost while they wait for a viable ROS 2.0 solution. Why would a researcher/developer invest resources into a platform that was going to completely change nullifying their personal investment in that platform?

I guess I don't see the logic in this conclusion, because everything we've heard from our research community is that it meets their basic needs and for us to stop changing things. Which we've responded to by slowing the release cycle and allocating the time we spend on ROS 1 for improvements to tools, documentation, and bug fixes. This means that ROS 1 should be more attractive to them, and I don't see why they would seek alternatives, except a smaller percentage of them which will look for a middleware with more features, which is the focus of ROS 2.

It is evident ROS 2.0 needs support similar to ACID transactions for autonomous robotics that the DDS implementation will support (primarily for people safety, and system stability of commercialized autonomous systems).

These more "grown up" middleware features, which deal with durability, consistency, etc., are one of the reasons that we selected DDS. It will help us address the concerns of our other major customers: people building deployed systems and products.

It is a bad, bad idea to throw the current ROS users under the bus in the interim. The implementation architecture direction for ROS 2.0 I am suggesting alleviates this scenario, and allows ROS to backtrack a little with its users to continue improving ROS 1.0 functionality in TCPROS, and UDPROS.

I passionately feel that we are not throwing anyone under the bus. We intend to support good backwards interoperability for ROS 1, because we know that if people cannot try ROS 2 with their ROS 1 code in a mixed system then it will likely be impossible for ROS 2 to get traction.

In the interim here are some ROS 1.0 improvements that could greatly improve ROS performance with substantial latency decrease, while increasing data bandwidth:

Add support for UDPROS on every ROS supported client, and server with a few minor enhancements. (Some of these suggestions came out of the BOF at ROSCon discussion on latency and performance, and some I have been attempting to subtly hint at with multiple questions on ROS Answers over the last year).

This has been an open issue in rospy for a long time. I am told, though I don't know for myself, that it is not trivial to implement this in rospy. Other client libraries are almost completely community controlled, like ROSJava, so it would be up to them to support it if they felt they needed it.

UDP Jumbogram support: This is a very simple enhancement. Allow unbounded UDP datagram sizes: during the protocol handshake (XMLRPC negotiation in UDPROS), add a flag to the publisher configuration that either allows UDP jumbogram support or specifies the maximum datagram size. ROS currently puts its datagram size limit around 1500 bytes. If the system hardware supports jumbograms, the only change is supporting a configurable maximum datagram size.

I don't see any reason we couldn't do this, but currently a very small fraction of our users would benefit from it. Even if more users might use UDPROS if we spent time on it, I am doubtful that it makes sense for us at OSRF to do it, since we try to focus our resources on the items which benefit the most people, unless we are doing some focused work for a client.

UDP Multicast support: Adding a header attribute called multicast could allow a topic or service (I am not aware that services are currently supported over UDPROS, but they should be) to configure the associated Publisher to send to a multicast group. The concern that multiple subscribers to a topic could cause a major problem with system performance can be alleviated by a feature requested by a person at the BOF: specifying a numeric limit on the subscribers of a particular topic. Note: Based on my personal UDPROS implementation experience, Multicast support would be fairly simple to add to the Happy Artist Java client, so I assume the same might apply on other platforms that have UDPROS support.

I think you underestimate the multicast problem. Take a simple feature like latching, for example: it requires the notion of a connection (it actually violates the anonymous part of our anonymous pub/sub design). Without a connection behind pub/sub, many things in ROS 1 will fall over, not the least of which is the discovery system. UDPROS works around this by manually maintaining the notion of a connection over UDP, which I think is part of why rospy doesn't support it currently. Extending this manual connection maintenance to UDPM is not a trivial task, in my opinion. Others have added UDPM support for ROS, but in order to do so they needed to introduce a new API in the meantime. Also, UDPM is something we get for free in DDS, which allows us to focus on other, higher-level features. I, for one, am glad of that, since this is one wheel I don't feel the need to reinvent.

Maximum Topic Subscribers support: Allow the system designer to specify the maximum number of subscribers on any given topic. The use case given was a high-bandwidth topic that multiple users subscribe to, causing the system to hang. The user wanted to implement a UDPROS topic with Multicast support for a single subscriber. Note: Based on my personal knowledge of the CPP client (somewhat limited), the client uses low-level arrays rather than Data Collection Objects, so the incremental nature of arrays in CPP code should make this task fairly straightforward to implement.

This would be possible to implement, though typically our tools and concepts are not set up to handle this kind of limitation set by publishers. For instance, what happens when the max has already been reached and you try to introspect with rostopic? These types of changes rarely come without many consequences in the ecosystem. But if people are interested in the feature we can discuss it in more detail, probably best in a GitHub issue.
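
The introspection concern can be illustrated with a small in-memory sketch. `LimitedPublisher` and `request_topic` are hypothetical names chosen for illustration (the latter only loosely echoes the XMLRPC call); this is not how roscpp is implemented.

```python
class LimitedPublisher:
    """Hypothetical publisher that rejects subscribers beyond a fixed cap."""

    def __init__(self, topic, max_subscribers):
        self.topic = topic
        self.max_subscribers = max_subscribers
        self.subscribers = []

    def request_topic(self, subscriber_id):
        """Return True if the connection is accepted, False if the cap is reached."""
        if len(self.subscribers) >= self.max_subscribers:
            return False
        self.subscribers.append(subscriber_id)
        return True


pub = LimitedPublisher("/camera/image_raw", max_subscribers=1)
accepted = pub.request_topic("/image_viewer")
rejected = pub.request_topic("/rostopic_echo")  # an introspection tool arriving late
```

Here the second request stands in for a tool like rostopic: once the cap is reached it is refused like any other subscriber, so the tooling would need some way to report or work around that.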

The user wanted to implement a UDPROS topic with Multicast support for a single subscriber.

I don't understand what you mean by this sentence, could you clarify it for me?

If you have any questions or would like to discuss this further, I look forward to it.

Sincerely,

Aaron

--

You received this message because you are subscribed to the Google Groups "ROS SIG NG ROS" group.
To unsubscribe from this group and stop receiving emails from it, send an email to ros-sig-ng-ro...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Jonathan Bohren

Sep 18, 2014, 1:13:58 AM9/18/14

to ros-sig-ng-ros

On Sep 17, 2014 10:50 PM, "William Woodall" <wil...@osrfoundation.org> wrote:

>> UDP Multicast support: Adding a header attribute called multicast could allow a topic or service (not aware service is currently supported for UDPROS, but it should be) to configure the associated Publisher to perform a broadcast (A concern of multiple subscribers to a topic causing a major problem with system performance can be alleviated with a feature request by a person in the BOF get together for specifying the numeric limit to subscribers of a particular topic). Note: (Based on my personal UDPROS implementation experience, adding Multicast support would be fairly simple to add into the Happy Artist Java client, so I assume the same might apply on other platforms that have UDPROS support).
>
> I think you underestimate the multicast problem. Take a simple feature like latching for example, this requires the notion of a connection (it actually violates the anonymous part of our anonymous pub/sub design). Without a connection behind pub/sub many things in ROS 1 will fall over, not the least of which is the discovery system. UDPROS works around this by maintaining the notion of a connection over UDP manually, which I think is part of why rospy doesn't support it currently.

Will, I agree that latching doesn't work with UDPM, but I think the latching pattern doesn't even make sense in the majority of situations where someone will want to use UDPM to decrease bandwidth overhead.

For those unfamiliar with latching, UDPM is useful for high bandwidth / high frequency topics, whereas latching is useful when you have sparse event-like messages on a topic and you're worried that you subscribed too late to get the most recent event.
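
As a rough illustration of why latching leans on a connection, here is a minimal in-memory sketch. `LatchedPublisher` and `on_subscriber_connected` are made-up names, not the roscpp or rospy API.

```python
class LatchedPublisher:
    """Toy latched topic: remembers the last message for late subscribers."""

    def __init__(self):
        self._last = None
        self._callbacks = []

    def publish(self, msg):
        self._last = msg
        for callback in self._callbacks:
            callback(msg)

    def on_subscriber_connected(self, callback):
        # Latching depends on this per-subscriber event: the publisher has to
        # notice the new connection in order to replay the last message to it.
        self._callbacks.append(callback)
        if self._last is not None:
            callback(self._last)


received = []
pub = LatchedPublisher()
pub.publish("map v1")                         # published before anyone listens
pub.on_subscriber_connected(received.append)  # late subscriber still gets it
```

With plain multicast the publisher never sees an equivalent of `on_subscriber_connected`, so there is no natural moment at which to replay the stored message.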

-j

William Woodall

Sep 18, 2014, 2:07:16 AM9/18/14

to ros-sig...@googlegroups.com

Sure, maybe latching is a bad example, but it is part of the API and it either has to silently not work when set along with UDPM, or you need to modify the API in some way. There are other features to consider, like the fact that you can ask a publisher how many subscribers it has (which is done by counting the number of subscription connections: http://docs.ros.org/api/roscpp/html/publication_8cpp_source.html#l00370).

I'm not saying it isn't possible to support UDPM, I'm just pointing out that it is a more involved task than had previously been suggested. If you are ok with a new API then the ethzasl_message_transport might suit your needs, or we might be able to add a UDPM system to ROS 1 which does not overlap with the existing pub/sub, but that requires both the publisher and the subscriber to use the UDPM interface.

Robert Dean

Sep 18, 2014, 10:18:27 AM9/18/14

to ros-sig...@googlegroups.com

question inlined...

On Wed, Sep 17, 2014 at 10:50 PM, William Woodall <wil...@osrfoundation.org> wrote:

Hi Aaron, I appreciate you taking the time to give us some feedback. Let me see if I can address some of your proposals and questions inline. However, ultimately we may need to break off into smaller, more focused discussions.

On Sun, Sep 14, 2014 at 4:55 PM, Aaron Sims <aaronhap...@gmail.com> wrote:
Ok a few points to make here, first this sounds a lot like something our own Morgan Quigley was a fan of, which he called the Multi Middleware Matrix, which he pronounced "Mmm... ROS" :). In this system we would have a middleware abstraction layer which could be implemented by anything, ROS 1's TCP/UDPROS, DDS, ZeroMQ, MQTT, etc...

The problem we identified with this approach was complexity. In order to provide an interface that was this flexible, we would either need to make it ridiculously limited and implement all the higher-level features above this interface, often reimplementing functionality which some of the middlewares provide but others don't, or, worse, we would have to work hard to make limited protocols like TCPROS implement the more advanced features we want to expose in ROS 2 via the abstraction layer.

In the end we decided that making a bridge to ROS 1 was more tractable than a transparent compatibility layer using the MMM. We have a placeholder for this on our design website which we haven't had time to fill out, but we intend to: https://github.com/ros2/design/blob/gh-pages/articles/40_ros_with_modular_middleware.md

That being said, we have an interface in ROS 2 currently called the `ros_middleware_interface` which provides this abstraction to some degree. This will be a C interface which hides all DDS details (which are never used above it) and that will be used to create bindings in other languages. So we have this abstraction, but we only plan to support DDS, at least for now.

Would you please provide some examples or use cases against the Mmm solution? For example, which features would need to be re-implemented? I ask as the RCTA RFrame Mmm style solution is roughly half the lines of code of roscpp. (still trying to figure out if we can release any of it)

I think you underestimate the multicast problem.

General point on this and other "small feature" requests: many solutions to problems are small code changes, which then require 6+ months of testing. This is not due to the complexity of the change but rather the complexity of the system and the potential side effects, which make verifying such changes tedious. In other words, depending on the problem, code complexity != system complexity.

William Woodall

Sep 18, 2014, 2:46:20 PM9/18/14

to ros-sig...@googlegroups.com

On Thu, Sep 18, 2014 at 7:18 AM, Robert Dean <bob....@gmail.com> wrote:

Would you please provide some examples or use cases against the Mmm solution? For example, which features would need to be re-implemented? I ask as the RCTA RFrame Mmm style solution is roughly half the lines of code of roscpp. (still trying to figure out if we can release any of it)

Well, any new feature we want to expose to the user which is not supported by an underlying implementation. For example, suppose we used this MMM to provide a ROS 1 implementation using TCPROS, and then wanted to expose DDS's reliable communication through the ROS 2 API. Now we would need to provide some way of doing reliable pub/sub with ROS 1 to get feature parity. We could choose not to support some options in the API depending on which implementations are available, but that just means the API has to be more complex: it must first check what can be done, and then decide how to refuse when the underlying implementation doesn't support it. With the bridge, what is and isn't supported (in terms of communication features) can be described more explicitly, and we can have clear ways to mitigate or deal with these differences.
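
A minimal sketch of the "check first, then refuse" burden described above. Everything here is hypothetical: `Backend`, `create_publisher`, the feature names, and the two example backends are illustrations only and do not reflect the real `ros_middleware_interface`.

```python
class UnsupportedFeature(Exception):
    """Raised when a requested option is not offered by the chosen backend."""


class Backend:
    """Hypothetical middleware backend described only by the features it offers."""

    def __init__(self, name, features):
        self.name = name
        self.features = frozenset(features)


def create_publisher(backend, topic, **qos):
    """Refuse any requested QoS option the backend cannot honor."""
    requested = {key for key, value in qos.items() if value}
    missing = requested - backend.features
    if missing:
        raise UnsupportedFeature(f"{backend.name} lacks: {', '.join(sorted(missing))}")
    return {"backend": backend.name, "topic": topic, "qos": dict(qos)}


dds_like = Backend("dds_like", {"reliable", "deadline"})
udpros_like = Backend("udpros_like", set())  # best-effort only in this sketch

ok = create_publisher(dds_like, "/cmd_vel", reliable=True)
try:
    create_publisher(udpros_like, "/cmd_vel", reliable=True)
    error = None
except UnsupportedFeature as exc:
    error = str(exc)
```

Every caller now has to handle the refusal path, and every new option adds another row to the capability table for each backend, which is the complexity being argued against.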

Other features which we would like to add to ROS 2, but would cause problems in the MMM API either for implementations or the code which goes on top:

- "Latching" with a history > 1

- Supporting deadlines, causing the publisher to send at a minimum frequency
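
Both items in the list above can be sketched in a few lines, under the assumption that "history" means keeping the last N samples for late subscribers and "deadline" means a maximum allowed gap between consecutive samples. `HistoryBuffer` and `missed_deadlines` are hypothetical helpers, not DDS or ROS APIs.

```python
from collections import deque


class HistoryBuffer:
    """Keeps the last `depth` samples so a late subscriber can receive them."""

    def __init__(self, depth):
        self._samples = deque(maxlen=depth)  # oldest samples drop off automatically

    def publish(self, msg):
        self._samples.append(msg)

    def replay(self):
        return list(self._samples)


def missed_deadlines(publish_times, deadline):
    """Return (previous, next) timestamp pairs whose gap exceeds the deadline."""
    return [(a, b) for a, b in zip(publish_times, publish_times[1:]) if b - a > deadline]


buf = HistoryBuffer(depth=3)
for msg in ["e1", "e2", "e3", "e4"]:
    buf.publish(msg)
late_view = buf.replay()  # what a late subscriber would be sent
gaps = missed_deadlines([0.0, 0.1, 0.35, 0.4], deadline=0.2)
```

Neither behavior exists in TCPROS or UDPROS today, so an abstraction layer exposing them would have to implement them on top of those transports.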

The other big problem with MMM is that some solutions can clearly be divided into discovery, transport, and serialization, but others cannot, which requires you to support multiple everything, not just transports and serializations. I imagine that the complexity of a discovery system which must work with and aggregate both DDS's discovery system and ROS 1's would be quite large. Also consider incompatible features in message formats which must be accounted for:

- Default values

- Optional fields

- Member vs method based access

I don't think something like MMM is impossible; obviously you seem to have done something like that with RCTA RFrame. But I do believe it is a more complicated design, and with our resources I think choosing one solution (DDS) without tying ourselves to it (ros_middleware_interface) and then providing a bridge to ROS 1 is the right strategy.

Just some stats, near as I can estimate roscpp consists of about 17,300 lines of CPP code. Our current code base is around 2,800, but obviously is not nearly as feature-full, so I expect it will rise, but we'll have to see where it ends up. I would guess that the amount of code for an MMM would be much greater if you include the code to support each built-in implementation, but maybe I'm wrong about that.

Aaron Sims

Sep 18, 2014, 9:47:47 PM9/18/14

to ros-sig...@googlegroups.com

Hi William,

Thanks for taking the time to respond to my concerns. I appreciate your contribution.

I'm not skeptical of DDS. I believe it has real value in the right domains. The right domains might be medical devices, autonomous vehicles, autonomous aircraft, or any mission-critical scenario where guaranteed transactions are necessary for safety purposes. That said, if I were in a situation where every microsecond counts, let's say a game of laser-based dogfights between autonomously controlled vehicles, I would choose UDPROS over DDS in a heartbeat, because it would give a substantial time advantage over any DDS implementation. On the data level UDPROS is a very efficient communication protocol. DDS may have performance improvements over TCPROS, but I am certain it wouldn't stand in the same ring as the UDPROS protocol for data throughput efficiency and low latency.

I am not a proponent of any protocol, but one thing I do know from almost 20 years of experience working with middleware is that protocols change. Protocols are not consistent. One user may want one protocol and another user a different one, so standardizing on a single protocol is a risky proposition for any middleware framework. This is one thing Java did right in frameworks like JDBC and JNDI, where a user just needs a driver for a particular DB vendor and everything else just works. I think whoever wrote the TCPROS and UDPROS protocols was a genius. The deeper I got into those protocols, the more I realized how well thought out their design is. UDPROS improved on TCPROS by moving the connection header into the requestTopic XMLRPC request, making the UDP communication an already negotiated connection and minimizing the data communication code logic.
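
For readers who have not looked at the connection header, here is a minimal sketch of its framing as I understand it: a 4-byte little-endian total length followed by length-prefixed `key=value` fields. The helper names are hypothetical, the field values are examples only, and how the encoded header is carried inside the requestTopic call is left out.

```python
import struct


def encode_header(fields):
    """Frame key=value fields: 4-byte little-endian length before each field and before the whole body."""
    body = b""
    for key, value in fields.items():
        field = f"{key}={value}".encode("utf-8")
        body += struct.pack("<I", len(field)) + field
    return struct.pack("<I", len(body)) + body


def decode_header(data):
    """Inverse of encode_header; returns a dict of the fields."""
    (total,) = struct.unpack_from("<I", data, 0)
    fields, offset = {}, 4
    while offset < 4 + total:
        (length,) = struct.unpack_from("<I", data, offset)
        offset += 4
        key, _, value = data[offset:offset + length].decode("utf-8").partition("=")
        fields[key] = value
        offset += length
    return fields


header = {
    "callerid": "/listener",   # example values only
    "topic": "/chatter",
    "type": "std_msgs/String",
    "md5sum": "*",
}
wire = encode_header(header)
round_trip = decode_header(wire)
```

Because this exchange happens once during negotiation, the data path afterwards only has to deal with the small fixed datagram header.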

I can't imagine that whoever developed the ROS protocols is involved in maintaining them, or even giving input on them; otherwise, developers would not be making statements about the Shape Shifter object being related to dynamic configuration, when it seems the TCPROS probe message is itself a piece of dynamic configuration: it queries the service types, MD5 sums, request type, and response type so that a service can be called without needing to know anything about it. All services and topics can be discovered using the XMLRPC methods, and connection headers can be used to obtain the message definitions that define the structure and types associated with any given message.

It's my assumption that the reason UDPROS hasn't been implemented on clients other than CPP is that nobody knows how the thing works. I think the assumption that implementing UDPROS clients is harder than it looks is based on a lack of understanding of the protocol implementation. My offer is still open to consult with other developers who are interested in adding the protocol to their clients. Now that I have personally implemented UDPROS in Java I know how it works, and truth be told, I was stuck for over a year trying to get the information I needed on answers.ros.org. Once I figured out how the protocol works, it was fairly straightforward, and I believe that with guidance other devs could implement the protocol in short order.

RMDMIA stands for Robotic Mission Decision Manager Intelligent Agent. A brief side note: I know the DDS implementation could be of benefit to the commercial release of RMDMIA. If I were in an autonomous vehicle, I would want to know those transactions were guaranteed so the vehicle would arrive at its destination safely. In a low-cost autonomous robot or a low-latency sensor-based system, I would probably stick with UDPROS. In a low-cost robot, I would probably want to stick to TCPROS and UDPROS.

I'm glad the plan is to have complete backwards compatibility with ROS 1. That clarification makes me happy:)

I am confident that the others who commented on the UDP issues clarified what I was bringing up, so I'll assume this is no longer a point of confusion unless you say otherwise.

Thanks again for all your hard work William! I appreciate your contributions.

Sincerely,

Aaron

Bo Ding

Sep 24, 2014, 10:57:41 PM9/24/14

to ros-sig...@googlegroups.com

Hi, everyone. If you want to try the DDS protocol in ROS 1.x programs right now, you can go to the micROS RT project website, http://cyberdb.github.io/micROS-RT. The goal of this project is to provide scalable, robust, and QoS-assuring message delivery while keeping compatibility with ROS 1.x programming paradigms.

The features of micROS RT include:

(1) Built-in multicast support. A significant performance advantage can be obtained when there are n subscribers on a ROS topic (n>=2).

(2) Robustness in adverse network environments. As a mature, industry-level product, DDS is expected to be more robust in adverse network environments.

(3) QoS assurance in the message delivery process. For example, you can set the transport priority and latency budget of messages on a ROS topic.

(4) Existing ROS programs can easily benefit. No modification is needed; the only thing you have to do is replace a library file in the ROS installation path.

(5) Interoperable with the official ROS kernel. micROS RT can choose the underlying message delivery protocol by means of the protocol negotiation mechanism.

We have tested the current version of micROS RT (v0.20beta) with the examples in the roscpp_tutorials package, the turtle_sim package, the turtle_tf package, and some real ROS applications. Preliminary performance tests have shown the benefit of introducing multicast and QoS assurance in message delivery under certain circumstances. We are still improving our code, and any feedback would be greatly appreciated.

----

Bo Ding

micROS Team, NUDT

bd...@msn.com

William Woodall

Jun 30, 2015, 5:37:10 PM6/30/15

to ros-sig...@googlegroups.com

Bo,

I just realized no one responded here... Sorry about that, for some reason I thought we had.

I think your approach to integrating DDS into ROS 1 is a really good idea, but we have other things we want to change in the ROS API, which is why we decided to go for a breaking set of API changes. Additionally, we wanted to tackle systems with fewer resources, like embedded and OS-less systems, which is hard to achieve given assumptions made by ROS 1's implementation. That led us to approach the problem in a different way. The intrinsic value added by using DDS as a transport in ROS 1 is that you get UDP multicast support and some of the more straightforward DDS QoS settings, like priority, which is great.

It'd be great to continue to hear updates from you guys, and I've pointed some people to your project as alternative ways to use DDS with ROS.

Bo Ding

Jul 3, 2015, 11:22:18 AM7/3/15

to ros-sig...@googlegroups.com

Hi William,

Thanks for your comments. Your words are very much to the point. A major design motivation of micROS RT is that we need some real-time features in data distribution, as well as multicast support, in our practice with ROS 1.x. So we enhanced ROS by integrating DDS into it, and we hope that our effort can contribute to the robotics community, especially those who have the same requirements as ours.

We are also tracking the development progress of ROS 2.0 on GitHub. The architecture of ROS 2.0 is obviously the result of much thought. We believe that it will be a milestone in the development of robot software infrastructure.

Thanks again!

Sincerely,

Bo Ding

On Wednesday, July 1, 2015 at 5:37:10 AM UTC+8, William Woodall wrote:
