[Discourse.ros.org] [Next Generation ROS] Rostopic implementation discussion


Geoffrey Biggs

Oct 19, 2016, 3:04:08 AM
to ros-sig...@googlegroups.com
gbiggs
October 19

The stretch goals for Beta 1 (due in mid-December) include rostopic. rostopic is a fairly vital tool when developing with ROS, and is probably one of the most-used tools in the ROS-based robot developer's toolkit. However, there are some challenges in developing rostopic for ROS 2, particularly the rostopic list command.

In ROS 1, rostopic can easily go off and get the list of known topics from the master, because the master Knows All. This approach is not so straightforward in ROS 2. The use of distributed discovery in DDS means that, by default, no one knows for certain the entire state of the DDS network. Furthermore, when a new participant starts up, it has absolutely no knowledge of the state of the network (beyond what may be hard-coded in) and has to wait for the discovery process to begin producing results. This process may be very quick for things on the local machine, but for remote computers it is likely to take a significant time.

If we implement rostopic list by making it start up, wait for a given length of time, then print the result, there is a good chance that it will not give a complete picture of the ROS graph and so will not be a useful tool.

The purpose of this topic is to decide how to implement rostopic such that it provides the same rapid response as ROS 1, while dealing with the distributed nature of the graph in ROS2, and with as complete information as possible given the limitations of distributed discovery.

To get right into it, my proposal is:

  • A daemon is started by something, e.g. roslaunch, or on system start up, or when starting the terminal and sourcing setup.bash, or some kind of start_all_the_ros_stuff_ill_need command.
  • This daemon's job is to listen for changes in the graph. It records all changes and stores the current state of the graph. The recent work on listening for graph changes by @wjwwood would be used.
  • The daemon provides a ROS topic on a well-known topic, e.g. /ros/topics (based on the idea that the /ros namespace is reserved for infrastructure topics). Connecting to this topic will provide the current state of the graph.
    • This topic probably needs to be a service to avoid wasting resources broadcasting something only needed intermittently.
    • If it is a service, there is still value in having a topic that broadcasts on change for long-running tools to listen to.
    • It may be worth considering having a separate domain for the infrastructure topics.
  • Each computer being used should run a copy of the daemon so that connecting to it does not involve traversing the network and so is fast.
  • The rostopic list command, upon starting, connects to the daemon's topic, gets the current graph info, and prints it out.
  • If rostopic cannot find the expected topic, then it prints a warning, then waits a configurable length of time (with a suitable default), running the DDS discovery process and listening for graph information. After the time limit it prints the result.
    • If you see that warning, you should understand that something is broken in your ROS system.
    • This is a downside of the daemon approach: It is a single point of failure for the rostopic functionality.
  • The daemon can also provide the same information over any other useful protocols. One particularly useful one would be a REST service, so that tools could connect and get the information without needing to use DDS or potentially be a part of the ROS graph. For example, a web-based tool.
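
As a rough illustration of the daemon's bookkeeping described in the list above, here is a minimal sketch of an in-memory graph store. The class and method names (`GraphCache`, `on_topic_added`, `on_topic_removed`, `snapshot`) are hypothetical; a real daemon would feed these callbacks from whatever graph-change notification the rmw layer ends up exposing.

```python
import threading


class GraphCache:
    """Hypothetical in-memory store for the daemon's view of the graph."""

    def __init__(self):
        self._lock = threading.Lock()
        self._topics = {}  # topic name -> set of type names

    def on_topic_added(self, name, type_name):
        # Record a topic reported by a graph-change event.
        with self._lock:
            self._topics.setdefault(name, set()).add(type_name)

    def on_topic_removed(self, name, type_name):
        # Forget a type, and the topic itself once no types remain.
        with self._lock:
            types = self._topics.get(name)
            if types is None:
                return
            types.discard(type_name)
            if not types:
                del self._topics[name]

    def snapshot(self):
        # Return a sorted copy so callers never see later mutations.
        with self._lock:
            return {
                name: sorted(types)
                for name, types in sorted(self._topics.items())
            }


cache = GraphCache()
cache.on_topic_added("/chatter", "std_msgs/String")
cache.on_topic_added("/scan", "sensor_msgs/LaserScan")
cache.on_topic_removed("/scan", "sensor_msgs/LaserScan")
print(cache.snapshot())
```

A command such as rostopic list would then only need to fetch and print one snapshot, instead of waiting for discovery itself.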

The daemon mentioned above could be easily extended to also provide information for rosnode, rosservice, and similar commands.

If we use the above approach, I would like to fix the ROS topics provided by the daemon as well as any other protocol interfaces/ports, and data formats (e.g. YAML) so that they are usable by other tools and tool implementors, such as rviz and rqt_graph.

There has also been work done on rostopic and friends by the fine folks at Erle Robotics:

Issue: ros_comm dev. support

opened by vmayoral on 2016-02-23
Hello,
We recently have been using ROS2 actively and look forward to its adoption. Our results are promising however we've reached a...
ready

Their implementation I think uses a listener waiting for information to pour in on a specific topic. However rather than trying to summarise their work myself, it would be great if @vmayoral could drop by and give us a description himself.



Víctor Mayoral Vilches

Oct 19, 2016, 4:26:52 AM
to ros-sig...@googlegroups.com
vmayoral
October 19

Hey @gbiggs,

Totally agree with your reasoning above. Good to hear that there's someone else interested in this :).

The issue you pointed out above summarizes our work pretty nicely I believe. We had an initial (more complex) implementation for OpenSplice and then switched to FastRTPS. As for now, we've got simple rostopic list and rosnode list functionalities published and available. Changes are needed in rclpy, rcl, rmw, rmw_<dds_implementation>.

Motivated by your comments I just went ahead and submitted a set of pull requests to integrate the changes needed upstream. Here they are:
- https://github.com/ros2/rclpy/pull/43
- https://github.com/ros2/rcl/pull/84
- https://github.com/ros2/rmw/pull/74
- https://github.com/eProsima/ROS-RMW-Fast-RTPS-cpp/pull/62

@marguedas, i think you'll be interested in this.
Cheers!

Ingo Lütkebohle

Oct 19, 2016, 4:49:54 AM
to ros-sig...@googlegroups.com
iluetkeb
October 19

A dedicated information service seems to be a good approach, and I'm not overly worried about introducing a point of failure there.

However, when googling for this, some patents popped up! Check out https://www.google.ch/patents/US20150055509 and https://www.google.ch/patents/US20110258313. They are not exact matches, but particularly "network assisted peer discovery" is pretty close.

btw, from the first patent, you can find a number of other patents related to DDS. I know this is tangential to your question, but it has me a bit worried here.

Geoffrey Biggs

Oct 19, 2016, 8:14:36 AM
to ros-sig...@googlegroups.com
gbiggs
October 19

Could you give a brief overview of your approach, for a basis of discussion? How much does it differ from my proposal above?

@iluetkeb That's a little disturbing. There's probably prior art (I basically described DNS), but it could put off companies.

Víctor Mayoral Vilches

Oct 19, 2016, 8:50:49 AM
to ros-sig...@googlegroups.com
vmayoral
October 19
gbiggs:

Could you give a brief overview of your approach, for a basis of discussion? How much does it differ from my proposal above?

@gbiggs, the current proposal from our side (shared in the PRs above) uses the FastRTPS primitives to inspect the existing DDS participants/topics and report those using the ROS tooling. This simple approach did the job for us but we understand that a more generic and DDS-vendor agnostic layer might be put in place at some point.

Hope that helps clarify our approach.

Secretary Birds

Oct 19, 2016, 11:44:50 AM
to ros-sig...@googlegroups.com
SecretaryBirds
October 19

I am very interested in this discussion, as we at ASI have been playing around with ROS2 a lot. We had implemented our own version of rostopic list by using Node::get_topic_names_and_types().

We haven't had the time to really dig into the rcl and rmw layers to understand fully what's going on, but using this call has worked out fairly well. I will attest to @gbiggs's statement that using this call comes at the cost of time. We have to let the tool spin for many seconds (currently around 15 seconds) just to be sure we've collected info from everyone in the system. Pretty inconvenient from a timing perspective, but still better than nothing!
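
To make the wait-time trade-off concrete, here is a small sketch (not the ASI implementation) of a polling loop that stops early once the result has stopped changing. `collect_until_stable` and its parameters are hypothetical, and `query` stands in for any call that returns the topic names known so far. The demo uses a fake clock so it runs instantly.

```python
import time


def collect_until_stable(query, timeout=15.0, settle=2.0, poll=0.1,
                         clock=time.monotonic, sleep=time.sleep):
    """Poll query() until its result is unchanged for `settle` seconds,
    or until `timeout` seconds have passed, whichever comes first."""
    start = clock()
    last = frozenset(query())
    last_change = start
    while True:
        now = clock()
        if now - start >= timeout or now - last_change >= settle:
            return sorted(last)
        sleep(poll)
        current = frozenset(query())
        if current != last:
            last, last_change = current, clock()


class FakeClock:
    """Simulated time source so the demo does not really wait."""

    def __init__(self):
        self.now = 0.0

    def __call__(self):
        return self.now

    def sleep(self, seconds):
        self.now += seconds


fake = FakeClock()


def fake_query():
    # The local topic is visible at once; the remote one is
    # "discovered" after one simulated second.
    topics = ["/local"]
    if fake.now >= 1.0:
        topics.append("/remote")
    return topics


result = collect_until_stable(fake_query, timeout=15.0, settle=2.0,
                              poll=0.5, clock=fake, sleep=fake.sleep)
print(result, "after", fake.now, "simulated seconds")
```

The weakness is the one raised earlier in the thread: a participant that takes longer than `settle` to appear is silently missed, so the result is only as complete as the chosen wait allows.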

Also, I've noticed a new flaw in this approach which I think may be related to my previous discussion here. When we set the Participant Index for OpenSplice to 'none' I think it's causing the get_topic_names_and_types() to come up empty-handed. Or at least, it's coming up empty in a sporadic way. I still have to investigate that some more.

Secretary Birds

Oct 19, 2016, 11:56:51 AM
to ros-sig...@googlegroups.com
SecretaryBirds
October 19

As an aside, we've also got a rostopic echo working on most basic messages. It's a simple python hack-fest where we have a python script that does:

  • takes commandline input for topic name, message type and qos
  • does some smart file searching for .msg files
  • does some string-replacing on a listener-template.py file
  • executes that newly generated listener.py.
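
The string-replacing step in the list above could look roughly like the sketch below. It is not the ASI script: the template text, the `render_listener` helper, and the use of `string.Template` are all assumptions made for illustration, and the generated source is only printed, not executed.

```python
from string import Template

# Hypothetical stand-in for the contents of listener-template.py.
LISTENER_TEMPLATE = Template('''\
from $package.msg import $msg_type

TOPIC = "$topic"
QOS_PROFILE = "$qos"
''')


def render_listener(topic, message_type, qos="default"):
    """Fill the template for a type given as 'package/MessageName'."""
    package, msg_type = message_type.split("/")
    return LISTENER_TEMPLATE.substitute(
        package=package, msg_type=msg_type, topic=topic, qos=qos)


source = render_listener("/chatter", "std_msgs/String")
print(source)
```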

This has helped tremendously, even as hacky as it is.

I could post the code if anyone thinks it'll be useful.

Dirk Thomas

Oct 19, 2016, 12:23:49 PM
to ros-sig...@googlegroups.com
dirk-thomas
October 19

I think the idea of a daemon process to optionally provide the requested information faster is a good approach :thumbsup:

I would like to comment on several aspects mentioned in the thread:

  • Imo all of the following cases should work:
    • The daemon got started before (by launch, the system, wherever, this should be an implementation detail).

    • The daemon gets started on demand by the first invocation of a command line tool needing it (slower on first call).
    • No daemon is there but the tool should still provide reasonable results (trade off between wait time and completeness).
  • The command line tools are only one use case of that interface. It should be possible to write code on-top of it which accesses the information without caring about the internal optimization of having a daemon. So all the functionality should be exposed in an API which is then also being used to implement the cli.

  • I am not convinced that the daemon should have a ROS-based interface. In order to call its service the client would still need to wait for some discovery phase. Even if that is local it will take additional time which we want to avoid.
    Therefore I think the daemon should provide its interface using a different protocol. That protocol needs to work without any "discover" and be available across all targeted platforms.

  • With the choice of the protocol comes also the decision how that daemon is being identified. It could be one single daemon per ROS graph or it could be one daemon per host to make the queries "local". If the daemon would e.g. expose its interface via d-bus you would probably run it on each host. Someone mentioned a REST service; it could be either a single global one or a local one per host. Anyway we need to determine how the daemon is being referred to in the context of running two separate ROS systems on a single machine.

  • Any of these tools and interfaces should be independent of any specific rmw implementation. If the rmw interface is being implemented by the newest and hottest discovery / marshalling / transport solution the tools should continue to work. The rmw interface might need to be extended to provide all the necessary information.
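
One way to picture the non-ROS interface discussed in the list above is a tiny HTTP endpoint on the loopback address, which a client can reach without any discovery phase. This is only a sketch using the Python standard library: the `/topics` path, the `GRAPH` placeholder data, and the handler name are assumptions, not an agreed interface.

```python
import json
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

# Placeholder data; a real daemon would serve its live graph state.
GRAPH = {"topics": {"/chatter": ["std_msgs/String"]}}


class GraphHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path != "/topics":  # hypothetical resource path
            self.send_error(404)
            return
        body = json.dumps(GRAPH["topics"]).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the demo output quiet


# Port 0 asks the operating system for any free port.
server = HTTPServer(("127.0.0.1", 0), GraphHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()
port = server.server_address[1]

# Client side: a plain HTTP GET, with no DDS participant involved.
url = f"http://127.0.0.1:{port}/topics"
with urllib.request.urlopen(url, timeout=5) as response:
    topics = json.loads(response.read())
print(topics)

server.shutdown()
server.server_close()
```

A fixed, well-known port (or a per-host socket) would replace the random port in practice; how the daemon is addressed is exactly the open question raised in the list.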

Barrett Strausser

Oct 19, 2016, 1:45:56 PM
to ros-sig...@googlegroups.com
Barrett_Strausser
October 19

Non-Roboticist outsider looking in.

Service discovery in a distributed environment, while not necessarily a solved problem, has many implementations currently in use in production in a variety of companies and deployments.

Off the top of my head I'm thinking about etcd and Consul.

Has anything like that been considered?

Mikael Arguedas

Oct 19, 2016, 2:05:21 PM
to ros-sig...@googlegroups.com
marguedas
October 19

Glad to see that many people manifested interest in this discussion.
As expected, these tools are needed by everybody, and that has resulted in a lot of prototype solutions being created by each and every one of us. Hence the need for this discussion to provide a global rmw-agnostic solution for these tools.

Thanks @gbiggs for this great summary.

A couple notes / open questions on top of what has been stated:
  • Agreed with @dirk-thomas's comment that the daemon should be seen as a way to provide information faster rather than as a single point of failure. The feature should hence work without it (at a performance cost).

  • :+1: on the fact that the ROS side of things should be rmw_implementation agnostic and allow users to develop their own tools or daemons using the protocol of their choice.

  • Protocol-wise I'd be curious to know what the preference of all of you is. It seems to me that REST services became very popular now that most people provide web interfaces to their systems, but would it be your preferred choice?

  • In the case of multiple juxtaposed ROS systems, it comes down to how do we identify "different" ROS systems. An approach could be to have the daemon aggregate all graph information for a given DDS Domain, and thus create a daemon for each DDS Domain used. What other approach for differentiating ROS systems could be used?

  • Another point being addressed is the tradeoff between duplicating daemons (and information storage) on each host vs having a single global daemon and service. If the default protocol allows it, this could be left up to the end user by providing an option to specify when and how to launch the daemon. And the way to query daemon information should be adapted accordingly. We will still need to specify the default behavior for this.

Geoffrey Biggs

Oct 20, 2016, 2:53:57 AM
to ros-sig...@googlegroups.com
gbiggs
October 20
dirk-thomas:
  • Imo all of the following cases should work:
    • The daemon got started before (by launch, the system, wherever, this should be an implementation detail).
    • The daemon gets started on demand by the first invocation of a command line tool needing it (slower on first call).
    • No daemon is there but the tool should still provide reasonable results (trade off between wait time and completeness).
  • The command line tools are only one use case of that interface. It should be possible to write code on-top of it which accesses the information without caring about the internal optimization of having a daemon. So all the functionality should be exposed in an API which is then also being used to implement the cli.

Thanks for enumerating those. I agree whole-heartedly with all of them. (Especially the last! I implement my tools like that and it's been very useful.)

dirk-thomas:

I am not convinced that the daemon should have a ROS-based interface. In order to call its service the client would still need to wait for some discovery phase. Even if that is local it will take additional time which we want to avoid.
Therefore I think the daemon should provide its interface using a different protocol. That protocol needs to work without any "discover" and be available across all targeted platforms.

I strongly support having a non-ROS based interface. You've listed the advantages already. However perhaps it could be useful to have a ROS-based interface in case someone wants to use the functionality from a node? Perhaps this should be a lower-priority feature unless we find an actual use case for it.

dirk-thomas:

With the choice of the protocol comes also the decision how that daemon is being identified. It could be one single daemon per ROS graph or it could be one daemon per host to make the queries "local". If the daemon would e.g. expose its interface via d-bus you would probably run it on each host. Someone mentioned a REST service; it could be either a single global one or a local one per host. Anyway we need to determine how the daemon is being referred to in the context of running two separate ROS systems on a single machine.

Yes, there are several ways that this could go. I think that if we use an environment variable, we can build an automatic mechanism without too much trouble. For example, the rostopic tool/library could look for an environment variable defining the graph information daemon's address, and if it isn't defined attempt to contact one on the local machine. If that doesn't work then it could request the daemon be launched on the local machine.
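
The lookup order described above can be sketched as follows. The variable name `ROS_GRAPH_DAEMON_ADDRESS`, the default address, and both helper functions are hypothetical; `fetch` and `launch` are injected so the sketch stays independent of any particular protocol. Re-raising when an explicitly configured daemon is unreachable is an assumption of this sketch, not something settled in the discussion.

```python
import os

DAEMON_ENV_VAR = "ROS_GRAPH_DAEMON_ADDRESS"  # hypothetical variable name
DEFAULT_LOCAL_ADDRESS = "127.0.0.1:11511"    # hypothetical default


def resolve_daemon_address(environ=None):
    """Return (address, source), preferring the environment variable."""
    environ = os.environ if environ is None else environ
    configured = environ.get(DAEMON_ENV_VAR, "").strip()
    if configured:
        return configured, "environment"
    return DEFAULT_LOCAL_ADDRESS, "local-default"


def query_graph(fetch, launch, environ=None):
    """Ask the daemon for the graph, launching a local one if needed."""
    address, source = resolve_daemon_address(environ)
    try:
        return fetch(address)
    except ConnectionError:
        if source == "environment":
            raise  # assumption: a configured but dead daemon is an error
        launch(address)
        return fetch(address)


# Demo with fakes: the first fetch fails, the daemon is "launched",
# and the second fetch succeeds.
launched = []


def fake_fetch(address):
    if not launched:
        raise ConnectionError(address)
    return ["/chatter"]


def fake_launch(address):
    launched.append(address)


topics = query_graph(fake_fetch, fake_launch, environ={})
print(topics, launched)
```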

dirk-thomas:

Any of these tools and interfaces should be independent of any specific rmw implementation. If the rmw interface is being implemented by the newest and hottest discovery / marshalling / transport solution the tools should continue to work. The rmw interface might need to be extended to provide all the necessary information.

Absolutely agree. This is another benefit of hiding it behind a well-known service description, I think: It becomes easier to abstract away the implementation. Then we can look into using existing technologies such as those @Barrett_Strausser mentioned and see if they do what we need already.

marguedas:

Protocol-wise I'd be curious to know what the preference of all of you is. It seems to me that REST services became very popular now that most people provide web interfaces to their systems, but would it be your preferred choice?

I'm in favour of a REST service as the initial goal. I've also been told by someone locally who does a lot of tool implementations, using the browser as an interface, that he wants "everything" to be available over REST. I'm assuming he's talking about introspection facilities, not absolutely everything.

However, I think the information should be available on other protocols from the same daemon if someone wants it that way. If we specify the data structure and the protocol used to access it separately then it should be relatively easy to add additional access methods.
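
A minimal sketch of that separation, assuming a hypothetical `GraphSnapshot` structure and two illustrative encoders: the data structure is defined once, and each access method only decides how to serialise it.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class GraphSnapshot:
    """Hypothetical protocol-independent description of the graph."""
    topics: dict = field(default_factory=dict)  # name -> list of types
    nodes: list = field(default_factory=list)


def to_json(snapshot):
    # Suitable for a REST response body.
    return json.dumps(asdict(snapshot), sort_keys=True)


def to_text(snapshot):
    # One topic name per line, as a command-line listing might print.
    return "\n".join(sorted(snapshot.topics))


# Adding an access method means adding one entry here.
ENCODERS = {"json": to_json, "text": to_text}

snapshot = GraphSnapshot(
    topics={
        "/scan": ["sensor_msgs/LaserScan"],
        "/chatter": ["std_msgs/String"],
    },
    nodes=["talker"],
)
print(ENCODERS["text"](snapshot))
print(ENCODERS["json"](snapshot))
```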

marguedas:

In the case of multiple juxtaposed ROS systems, it comes down to how do we identify "different" ROS systems. An approach could be to have the daemon aggregate all graph information for a given DDS Domain, and thus create a daemon for each DDS Domain used. What other approach for differentiating ROS systems could be used?

Being able to configure which domains the daemon aggregates data for, and run different daemons for different domains, sounds like a sensible use case. Especially with the potential for using domains for security zone partitioning.
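
As one possible reading of per-domain daemons, each daemon could be addressed by deriving its port from the domain ID. The sketch below reads the `ROS_DOMAIN_ID` environment variable; the base port and the port arithmetic are purely hypothetical choices for illustration.

```python
import os

BASE_PORT = 11511  # hypothetical base port for graph daemons


def current_domain_id(environ=None):
    """Read the domain ID from the environment, defaulting to 0."""
    environ = os.environ if environ is None else environ
    return int(environ.get("ROS_DOMAIN_ID") or "0")


def daemon_port_for_domain(domain_id):
    """Map a domain ID to the port of the daemon serving that domain."""
    if domain_id < 0:
        raise ValueError("domain ID must be non-negative")
    return BASE_PORT + domain_id


port = daemon_port_for_domain(current_domain_id({"ROS_DOMAIN_ID": "7"}))
print(port)
```

With a scheme like this, two ROS systems on the same machine in different domains would talk to different daemons without any extra configuration.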
