ROS 2.0 and node lifecycle management


Adolfo Rodríguez Tsouroukdissian

Jun 15, 2015, 9:21:52 AM6/15/15
to ros-sig...@googlegroups.com
All,

In the ROS 2.0 Developer preview presented at ROScon 2014 [1], there was mention that existing lifecycle systems such as those of Orocos and OpenRTM were being investigated for inspiration. I was wondering where things stand in this respect.

An initial question would be: what came out of the study of the existing systems? In particular, was there any convergence on a desired set of states and transitions?

In [1], it is also mentioned that one of the benefits of nodes having a standard lifecycle is to be
"Execution agnostic (in a custom main or in an off-the-shelf container)". What would this off-the-shelf container look like: dynamically loadable plugins implementing the lifecycle-management interface? I can imagine lots of interesting deploy/launch-time and runtime tooling that could take advantage of this.

Finally, has any thought been given to supporting node composition (see diagram below) and, if so, to specifying composition at deploy/launch time and modifying it at runtime? IMO, standard lifecycle management and node composition would be killer features of ROS 2.0.

[Inline image: node composition diagram]


--
Adolfo Rodríguez Tsouroukdissian
Senior robotics engineer
adolfo.r...@pal-robotics.com
http://www.pal-robotics.com

PAL ROBOTICS S.L
c/ Pujades 77-79, 4º4ª
08005 Barcelona, Spain.
Tel. +34.93.414.53.47
Fax.+34.93.209.11.09
Skype: adolfo.pal-robotics
Facebook - Twitter - PAL Robotics YouTube Channel


CONFIDENTIALITY NOTICE: This e-mail and the accompanying document(s) may contain confidential information which is privileged and intended only for the individual or entity to whom they are addressed.  If you are not the intended recipient, you are hereby notified that any disclosure, copying, distribution or use of this e-mail and/or accompanying document(s) is strictly prohibited.  If you have received this e-mail in error, please immediately notify the sender at the above e-mail address.

Dirk Thomas

Jun 15, 2015, 12:44:04 PM6/15/15
to ros-sig...@googlegroups.com
Hi Adolfo,

We don't have any news on a life cycle implementation in the ROS 2 prototype yet.
Hopefully there will be news in a month or two on this topic.

A component-based interface for nodes in general (with or without a life cycle) will allow us to either run a component in a separate process or load several of them dynamically into one process and run them together.
In ROS 1 this is a programming-time decision, since nodes and nodelets use different APIs.
In ROS 2 it will be a deploy-time decision - likely even with a way to change the configuration at runtime.

The additional life cycle will allow tools like roslaunch to switch from the initialization state to running once all components have finished their init phase.
It also makes it easy to suspend all components in a process, unload one of them, load a different one, initialize it, and then resume operation.
These kinds of operations will be available as a service interface on the process, to enable orchestration from the outside.
But these are currently only ideas/concepts, not yet implemented in the ROS 2 prototype.
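Dirk's description implies a small state machine behind each component. A minimal sketch of what such a lifecycle could look like, with transitions guarded so that e.g. a component cannot be activated before it has been initialized; every name here is hypothetical, not an actual ROS 2 API:

```cpp
#include <cassert>
#include <map>
#include <set>

// Illustrative lifecycle states, roughly matching the
// "init -> running -> suspended -> unloaded" flow described above.
enum class State { Unconfigured, Initializing, Inactive, Active, Finalized };

class LifecycleNode {
public:
  State state() const { return state_; }

  // Attempt a transition; return false if it is not legal from the
  // current state (e.g. you cannot activate an unconfigured node).
  bool transition(State target) {
    static const std::map<State, std::set<State>> allowed = {
      {State::Unconfigured, {State::Initializing}},
      {State::Initializing, {State::Inactive}},
      {State::Inactive,     {State::Active, State::Finalized}},
      {State::Active,       {State::Inactive, State::Finalized}},
    };
    auto it = allowed.find(state_);
    if (it == allowed.end() || it->second.count(target) == 0) return false;
    state_ = target;
    return true;
  }

private:
  State state_ = State::Unconfigured;
};
```

An external orchestrator (roslaunch, or a service call on the process) would then simply drive these transitions for every component before declaring the system "running".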

As far as I understand, your example of node composition is what nodelets already provide in ROS 1 (minus a life cycle).
That will definitely be possible in ROS 2 too.
In ROS 1 it is also possible to add new components to a process and remove existing ones.
ROS 2 will likely be more flexible, allowing more ways to remap topics within the process, and will also keep the ROS graph within the process introspectable.

Cheers,
- Dirk



--
You received this message because you are subscribed to the Google Groups "ROS SIG NG ROS" group.
To unsubscribe from this group and stop receiving emails from it, send an email to ros-sig-ng-ro...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Bob Dean

Jun 15, 2015, 12:55:20 PM6/15/15
to ros-sig...@googlegroups.com


On Monday, June 15, 2015 at 12:44:04 PM UTC-4, Dirk Thomas wrote:
Hi Adolfo,

we don't have any news on a life cycle implementation in the ROS 2 prototype yet.
Hopefully there will be news in a month or two on this topic.


Dirk,

Is the "news in a month or two" because it is currently being worked on and the prototype is not quite ready, or because that is when there will be less busy time for discussion?

Thanks,
Bob 

Dirk Thomas

Jun 15, 2015, 1:09:13 PM6/15/15
to ros-sig...@googlegroups.com
We have not done any work on the component life cycle for the current ROS 2 prototype.
We did some investigation into the component life cycle at the end of last year, but that was not backed by a prototype implementation.

The component life cycle is one of the next todos on the list.
I would expect that we have some progress on it within the next month or two.

There is always room for discussion - please feel free to ask or propose anything anytime on this mailing list.

- Dirk




Adolfo Rodríguez Tsouroukdissian

Jun 16, 2015, 8:54:27 AM6/16/15
to ros-sig...@googlegroups.com


On Mon, Jun 15, 2015 at 6:44 PM, Dirk Thomas <dth...@osrfoundation.org> wrote:
Hi Adolfo,

Hi Dirk,
 

we don't have any news on a life cycle implementation in the ROS 2 prototype yet.
Hopefully there will be news in a month or two on this topic.

Ack. I'd be grateful if this list could be kept in the loop when the time comes.

A component-based interface for nodes in general (with or without a life cycle) will allow us to either run a component in a separate process or load multiple of them dynamically in process and run them together.
In ROS 1 this is a programming time decision since nodes and nodelets use different API.
In ROS 2 that will be a deploy time decision - likely even with a way to change the configuration at run time.

The additional life cycle will allow tools like roslaunch to switch from the initialization state to running once all components have finished their init phase.
It also allows to easily suspend all component in a process, unload one of them, load a different one, initialize it and then resume operation.

Suspending and resuming (stopping/starting) a node is a much-needed feature.

These kind of operations will be available as a service interface on the process to enable orchestration from the outside.

I wonder how small the footprint of nodes (with lifecycle) can be made. For instance, if a composite node,
1. is the only one that manages the lifecycle of its composing nodes, and
2. the composite and its internals live in the same process,

then could shared-memory comms be used for the internal nodes, potentially keeping the ROS footprint overhead 'small'?

Related to the above, will the choice of transport (shared memory vs. socket) be a deploy-time decision, and how will defaults be set?

I ask these questions to get an idea of the extent to which ROS 2 nodes could be used to implement computational graphs for control. More questions will probably follow ;-)

But these are currently only the ideas / concepts without being implemented in the ROS 2 prototype yet.

As far as I understand your example about node composition is what nodelets already provide in ROS 1 (minus a life cycle).

Yes, but only to a certain extent. As you state above, the node != nodelet distinction is an important one. Also, nodelets have to live in the same process. Although I'm mostly interested in that use case, one can imagine a composite node consisting of nodes living in different processes or on different machines.

That will definitely be possible in ROS 2 too.
In ROS 1 it is also possible to add new components to a process and remove existing ones.
ROS 2 will likely improve flexibility to allow more ways to remap topics within the process and also keep the ROS graph within the process introspectable.

Thanks Dirk,

Adolfo.
 

Dirk Thomas

Jun 16, 2015, 12:23:52 PM6/16/15
to ros-sig...@googlegroups.com
Hi Adolfo,

regarding the transport choices:
(1) DDS uses sockets to communicate, independent of the process layout.
(2) Some DDS vendors provide the option to use shared memory to communicate between endpoints on the same system, but that still requires serialization/deserialization.
(3) ROS will provide optimized intra-process communication, so endpoints in the same process will only exchange references (like nodelets), which does not require any serialization/deserialization.

Choosing between (1) and (2) is currently only configurable through vendor-specific configuration.
The default will likely be (3) whenever possible, since it will be the fastest approach.
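To make the difference between (2) and (3) concrete, here is a toy sketch of option (3): within one process, publishing only hands a shared pointer to each subscriber's callback, so the message is never serialized or copied. The class and all names are invented for illustration, not an actual ROS 2 API:

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <memory>
#include <vector>

// Hypothetical intra-process topic: publish() moves a reference to the
// same message object to every in-process subscriber -- zero copies,
// zero (de)serialization.
template <typename MsgT>
class IntraProcessTopic {
public:
  using Callback = std::function<void(std::shared_ptr<const MsgT>)>;

  void subscribe(Callback cb) { subscribers_.push_back(std::move(cb)); }

  void publish(std::shared_ptr<const MsgT> msg) {
    for (auto & cb : subscribers_) cb(msg);  // share, don't copy
  }

private:
  std::vector<Callback> subscribers_;
};

// A large message type where avoiding serialization matters.
struct Image { std::vector<uint8_t> data; };
```

With option (2), by contrast, the publisher would still serialize `Image::data` into a shared-memory segment and the subscriber would deserialize it back out.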

I would assume that it will be possible to use the service API of components to orchestrate them from the outside "at will".
So implementing a computational graph for control should be possible.
We also think that it could, e.g., be used to implement "synchronous" pipelines like ecto's in ROS 1.

Cheers,
- Dirk

Adolfo Rodríguez Tsouroukdissian

Jun 16, 2015, 12:31:50 PM6/16/15
to ros-sig...@googlegroups.com
On Tue, Jun 16, 2015 at 6:23 PM, Dirk Thomas <dth...@osrfoundation.org> wrote:
Hi Adolfo,

Dirk,

regarding the transport choices:
(1) DDS uses sockets to communicate independent of the process layout
(2) some DDS vendors provide the option to use shared memory to communicate between endpoints which are on the same system but that still requires serialization / deserialization
(3) ROS will provide an optimized intra-process communication, so if the endpoints are in the same process they will only exchange references (like nodelets) which does not require any serialization / deserialization

Choosing between (1) and (2) is currently only configurable through the vendor specific configuration.
The default will likely be to choose (3) whenever possible since it will be the fastest approach.

Nice.
 

I would assume that it will be possible to use the service API of components to orchestrate them from the outside "at will".
So implementing a computational graph for control should be possible.
We also think that it can e.g. be used to implement "synchronous" pipelines like with ecto in ROS 1.

In the synchronous case there would be no need for synchronization, as computation is serialized. The next question would be whether making the synchronization policy configurable and extensible has been considered. I could imagine mutexes being a default synchronization primitive, but the synchronous case could simply do away with them. Non-synchronous control-oriented applications might be interested in implementing (as an extension) lock-free synchronization primitives to ensure forward progress of the control threads.
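For concreteness, a classic example of such a primitive is a single-producer/single-consumer ring buffer: neither side ever blocks, so a real-time consumer thread always makes forward progress regardless of what the producer is doing. A minimal sketch (not from any ROS codebase):

```cpp
#include <array>
#include <atomic>
#include <cassert>
#include <cstddef>

// Wait-free single-producer/single-consumer ring buffer. push() and pop()
// each touch only their own index with a store, so neither thread can ever
// block the other. Holds at most N-1 elements with this scheme.
template <typename T, std::size_t N>
class SpscRing {
public:
  bool push(const T & value) {  // called by the (non-real-time) producer
    auto head = head_.load(std::memory_order_relaxed);
    auto next = (head + 1) % N;
    if (next == tail_.load(std::memory_order_acquire)) return false;  // full
    buf_[head] = value;
    head_.store(next, std::memory_order_release);
    return true;
  }

  bool pop(T & out) {  // called by the real-time consumer; never waits
    auto tail = tail_.load(std::memory_order_relaxed);
    if (tail == head_.load(std::memory_order_acquire)) return false;  // empty
    out = buf_[tail];
    tail_.store((tail + 1) % N, std::memory_order_release);
    return true;
  }

private:
  std::array<T, N> buf_{};
  std::atomic<std::size_t> head_{0}, tail_{0};
};
```

A control thread using this simply skips an update when `pop()` returns false, instead of sleeping on a mutex held by a non-real-time thread.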

Thanks again,

William Woodall

Jun 16, 2015, 2:34:22 PM6/16/15
to ros-sig...@googlegroups.com
Currently we are not doing any of the locking, as intra-process comms between nodes will use the middleware to asynchronously communicate the addresses of shared pointers. This allows us to more easily mimic the QoS settings of the inter-process topic. At least that's the leading idea for how to implement it. So in that case locking and thread communication will be done by the middleware. Both OpenSplice and RTI have documents which detail their threading model and how to configure it in different ways. I haven't looked to see if they let you implement your own locking strategies. Here's OpenSplice's "deployment" manual:


In that document they talk about deployment configurations, including shared memory vs socket comms and internal threading models.

Another strategy would be to do the intra-process comms through our own custom queueing and synchronization, in which case the locking would be configurable by implementing your own Executor class. There's no guarantee, after all, that you're even using an executor with more than one thread, in which case no locking would be needed. Our goal is that all work done by our code happens in threads created by the Executor, which you can override and control if you like. The middleware will have its own threads, for reading sockets and the like, but the vendors are pretty careful to describe those threads and how they work. As you can imagine, their customers care about that stuff, so their documentation on the subject is decent.
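A toy sketch of that idea: a base Executor that owns the callback queue and exposes its locking strategy as overridable hooks, plus a single-threaded variant that elides locking entirely. All names here are hypothetical, not the eventual ROS 2 API:

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <mutex>
#include <queue>

// Base executor: owns the work queue; subclasses choose the
// synchronization strategy by overriding lock()/unlock().
class Executor {
public:
  virtual ~Executor() = default;

  void post(std::function<void()> work) {
    lock();
    queue_.push(std::move(work));
    unlock();
  }

  // Drain and run all queued callbacks; returns how many ran.
  std::size_t spin_some() {
    std::size_t n = 0;
    for (;;) {
      lock();
      if (queue_.empty()) { unlock(); break; }
      auto work = std::move(queue_.front());
      queue_.pop();
      unlock();
      work();  // run outside the lock
      ++n;
    }
    return n;
  }

protected:
  virtual void lock() { mutex_.lock(); }
  virtual void unlock() { mutex_.unlock(); }

private:
  std::queue<std::function<void()>> queue_;
  std::mutex mutex_;
};

// Single-threaded variant: only one thread ever touches the queue,
// so locking can be a no-op (or, elsewhere, a lock-free strategy).
class SingleThreadedExecutor : public Executor {
protected:
  void lock() override {}
  void unlock() override {}
};
```

A real-time variant could override the same two hooks with a lock-free or priority-inheritance strategy without touching any other code.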

So far for intra-process comms we've only done prototypes and proofs of concepts. However, I'm starting work on this issue just now, so there will be more concrete details in the next few months.
William Woodall
ROS Development Team

Robert Dean

Jun 17, 2015, 12:01:45 AM6/17/15
to ros-sig...@googlegroups.com
From a read of the OpenSplice deployment doc, they discuss their threads. I do not see any real mention of the application itself and how it should run; of course that could be due to bedtime being an hour ago. Long story short, I do not see any restrictions which OpenSplice would place on an intra-process message-passing layer.

I agree that nodelets + transparent intra-process messaging should allow composition as described in the OP.

I do not understand this statement from Adolfo: "Non-synchronous control-oriented applications might be interested in implementing (as an extension) lock-free synchronization primitives to ensure forward progress of the control threads."

Lock-free does not guarantee forward progress of control threads vs. message queueing/passing; the design of the control logic within those threads does, combined with proper runtime analysis/profiling and testing. Implementing lock-free data structures is non-trivial, and cases where lock-free provides an actual runtime performance improvement over std::mutex are extremely rare. Adolfo, maybe an example use case is needed? Maybe you have encountered one of these rare cases.

Luetkebohle Ingo (CR/AEA2)

Jun 17, 2015, 3:17:12 AM6/17/15
to ros-sig...@googlegroups.com

Hi,

 

Regarding the life-cycle: The OROCOS component life-cycle might be worth a look, it’s pretty simple but covers all the bases in one, common life-cycle.

 

There are also component models with hierarchical life-cycles (e.g., in robotics, SmartSoft comes to mind), but I believe that’s not necessary. If anybody wants hierarchical life-cycles, I’d be happy to provide more rationale against ;-)

 

Regarding intra-process data passing, I would only like to add that lock-free data passing is not primarily for improving performance. In fact, it can make performance worse in some cases. The real advantage is that they avoid a scheduling point in the kernel that could lead to context switches or (worse) priority inversion. So, this is primarily important when we’re talking real-time guarantees. Btw, OROCOS has implementations of such data structures that we might be able to re-use (not sure about licensing, but otherwise I see no issues).

 

Mit freundlichen Grüßen / Best regards

Ingo Luetkebohle
Software Design and Analysis (CR/AEA2)

Tel.
+49(711)811-12248
Fax +49(711)811-0
Ingo.Lue...@de.bosch.com

Adolfo Rodríguez Tsouroukdissian

Jun 17, 2015, 12:20:47 PM6/17/15
to ros-sig...@googlegroups.com


On Tue, Jun 16, 2015 at 8:34 PM, William Woodall <wil...@osrfoundation.org> wrote:

Hey William,

Currently we are not doing any of the locking, as intra-process comms between nodes will be using the middleware to asynchronously communicate the addresses of shared pointers. This allows us to more easily mimic the QoS settings of the interprocess topic. At least that's the leading idea for how to implement it. So in that case locking and thread communication will be done with the middleware. Both OpenSplice and RTI have documents which detail their threading model and how to configure them in different ways. I haven't looked to see if they let you implement your own locking strategies. Here's OpenSplice's "deployment" manual:


In that document they talk about deployment configurations, including shared memory vs socket comms and internal threading models.

I'll get back on this in a while. There's a lot to chew on in that document (not all is relevant to this thread, but still).


Another strategy would be to do the intra-process comms through our own custom queueing and synchronization, in which case the locking would be configurable by you implementing your own Executor class. There's no guarantee after all that you're even using an executor with more than one thread, in which case no locking would be needed. Our goal is that all work done by our code is done in threads created by the Executor, which you can override and control if you like. The middleware will have it's own threads, for reading sockets and the like, but the vendors are pretty careful to describe those threads and how they work. As you can imagine their customers care about that stuff, so their documentation on the subject is decent.

OK, thanks for the pointer. I have to think about this, to have a clearer idea of what the compromises are.

So far for intra-process comms we've only done prototypes and proofs of concepts. However, I'm starting work on this issue just now, so there will be more concrete details in the next few months.

What is (are) the main driver(s) behind rolling your own intra-process comms?

- Not all DDS vendors support shared memory, and you want to remain vendor-agnostic.
- Not having to pull in DDS dependencies in minimal-footprint deployments not requiring it.
- Other?

The above question does not imply a positive/negative opinion on the matter. I just want to understand the requirements that are being considered.

Thanks and cheers,

Adolfo.
 


Dirk Thomas

Jun 17, 2015, 12:24:06 PM6/17/15
to ros-sig...@googlegroups.com
Hi Adolfo,

What is (are) the main driver(s) behind rolling your own intra-process comms?.

- Not all DDS vendors support shared memory, and you want to remain vendor-agnostic.
- Not having to pull in DDS dependencies in minimal-footprint deployments not requiring it.

Shared memory is not the same as the intra-process we are developing (which will pass references like nodelets do).
In the case of shared memory you still need to serialize and deserialize the messages.
When passing references, that is not necessary at all.

- Dirk

Adolfo Rodríguez Tsouroukdissian

Jun 17, 2015, 12:27:23 PM6/17/15
to ros-sig...@googlegroups.com
On Wed, Jun 17, 2015 at 6:24 PM, Dirk Thomas <dth...@osrfoundation.org> wrote:
Hi Adolfo,

What is (are) the main driver(s) behind rolling your own intra-process comms?.

- Not all DDS vendors support shared memory, and you want to remain vendor-agnostic.
- Not having to pull in DDS dependencies in minimal-footprint deployments not requiring it.

Shared memory is not the same as the intra-process we are developing (which will pass references like nodelets do).
In the case of shared memory you still need to serialize and deserialize the messages.

Ah, thanks for clearing that up. I was mistakenly assuming that the shared-memory approach skipped (de)serialization as well. So there is a performance gain that could be significant for certain data types.

Adolfo.

When passing reference that is not necessary at all.

- Dirk


Adolfo Rodríguez Tsouroukdissian

Jun 17, 2015, 12:59:56 PM6/17/15
to ros-sig...@googlegroups.com
On Wed, Jun 17, 2015 at 6:01 AM, Robert Dean <bob....@gmail.com> wrote:
from a read of the OpenSlice deployment doc, they discuss their threads. I do not see any mention really of the application itself and how it should run. of course that could be due to bedtime being an hour ago.  long story short, I do not see any restrictions which OpenSplice would place on an intra-process message passing layer. 

i agree that nodelets + transparent intra-process messaging should allow composition as described in the OP.  


Hello Bob,
 
i do not understand this statement from adolfo: "Non-synchronous control-oriented applications might be interested in implementing (as an extension) lock-free synchronization primitives to ensure forward progress of the control threads."

lock-free does not guarantee forward progress of control threads vs message queueing/passing, the design of the control logic within those threads does combined with proper runtime analysis/profiling and testing. 

I'm thinking about data being shared between non real-time and real-time threads, and whether it will be feasible to implement (hard?) real-time computation graphs with ROS2 (plus some extensions).
 
Implementing lock-free data structures is non-trivial,

Yes, I would not want to do that. Fortunately there are good implementations out there. Ingo mentions in another post the primitives in Orocos RTT, but there are others as well; liblfds, for instance, is written in C and has some pretty decent platform requirements (no experience with it; I learned about it from the microblx project).
and cases where lock-free provides an actual runtime performance improvement over std::mutex are extremely rare. 

I agree with you that in general lock-free algorithms are slower than their locking counterparts, but performance and forward-progress guarantees are orthogonal.
 
Adolfo, maybe an example use case is needed? maybe you have encountered one of these rare cases.

When you need to guarantee forward progress, i.e., in any situation where you are not OK with blocking; also for avoiding priority inversion (though you could use a priority-inheritance mutex for that as well).

Cheers,

Adolfo.
 
(snip)
 

William Woodall

Jun 17, 2015, 4:20:02 PM6/17/15
to ros-sig...@googlegroups.com
On Wed, Jun 17, 2015 at 12:17 AM, Luetkebohle Ingo (CR/AEA2) <Ingo.Lue...@de.bosch.com> wrote:

Hi,

 

Regarding the life-cycle: The OROCOS component life-cycle might be worth a look, it’s pretty simple but covers all the bases in one, common life-cycle.


We've been looking at OROCOS and OpenRTC (an implementation of the RTC standard, which is also done by the OMG: http://www.omg.org/spec/RTC/) for inspiration on what our life cycle should look like. We haven't committed to using one of the models as a standard just yet, but I wouldn't be surprised if we ended up using one of them. We've been in contact with the people behind OpenRTC at AIST and discussed this topic with them at length.
 

 

There are also component models with hierarchical life-cycles (e.g., in robotics, SmartSoft comes to mind), but I believe that’s not necessary. If anybody wants hierarchical life-cycles, I’d be happy to provide more rationale against ;-)


I'm not familiar with SmartSoft, nor have I used hierarchical life cycles before. I for one would be interested in your opinion on that pattern vs. a flat hierarchy like what I assume is in OROCOS and OpenRTC.
 

 

Regarding intra-process data passing, I would only like to add that lock-free data passing is not primarily for improving performance. In fact, it can make performance worse in some cases. The real advantage is that they avoid a scheduling point in the kernel that could lead to context switches or (worse) priority inversion. So, this is primarily important when we’re talking real-time guarantees. Btw, OROCOS has implementations of such data structures that we might be able to re-use (not sure about licensing, but otherwise I see no issues).


This is the pivotal issue for our intra-process comms at the moment. If we can demonstrate that the message passing of shared pointer addresses meets the needs of hard or soft real-time situations, then the locking will be a matter of configuring or customizing the middleware.

However, based on our discussions, I'm beginning to think that in order to meet all the varied needs we'll probably have to consider doing custom intra-process comms and exposing the threading and locking primitives as overridable parts of the Executor class. If we go this route, then reusing components from OROCOS or liblfds might be something we investigate. It should even be possible to avoid depending on them directly, and instead provide a package which contains real-time and/or lock-free versions of the Executor, to be used in specific situations.

Luetkebohle Ingo (CR/AEA2)

Jun 18, 2015, 4:02:23 AM6/18/15
to ros-sig...@googlegroups.com

Regarding the life-cycle: The OROCOS component life-cycle might be worth a look, it’s pretty simple but covers all the bases in one, common life-cycle.

 

We've been looking at OROCOS and OpenRTC (which is an implementation of the RTC standard which is also done by OMG: http://www.omg.org/spec/RTC/) for inspiration on how our life cycle should be. We haven't committed to just using one of the models as a standard just yet, but I wouldn't be surprised if we ended up using one of them. We've been in contact with the guys behind OpenRTC at AIST and discussed this topic with them at length.

 

As far as I can tell, these are virtually indistinguishable, as both are of the basic “initialized, active, inactive, error” variety.

 

One aspect that I couldn’t discern from the RTC spec, but which the OpenRTC guys surely must have handled, is what to do with ports during the “inactive” state. In Orocos RTT, “operations” (essentially, service ports) can be invoked in all states, but the component’s update thread is only invoked during the “started” state (what RTC would call active). This also means that data coming in on data flow ports is only handled during the active state.

 

In contrast, the RTC spec is a bit more ambiguous and only says “However, the behavioral contracts of such connections are dependent on the interfaces exposed by the ports and are not described normatively by this specification.” (section 5.4.2.3).

 

I think this aspect is fairly important and should be clarified. Distinguishing two kinds of ports, and putting that into the spec, like Orocos RTT does it, would be a good approach in my opinion.

There are also component models with hierarchical life-cycles (e.g., in robotics, SmartSoft comes to mind), but I believe that’s not necessary. If anybody wants hierarchical life-cycles, I’d be happy to provide more rationale against ;-)

 

I'm not familiar with SmartSoft, nor have I used hierarchical life cycles before. I for one would be interested in your opinion on that pattern vs. a flat hierarchy like what I assume is in OROCOS and OpenRTC.

 

 

SmartSoft currently considers the mode of operation (in RTC terminology) to be part of the state, and puts a hierarchy below the "active" state to realize it. We've discussed this with them at length and I think we could convince them to treat this separately, much like RTC does. This is most likely not reflected in their code, though.

 

 

Regarding intra-process data passing, I would only like to add that lock-free data passing is not primarily for improving performance. In fact, it can make performance worse in some cases. The real advantage is that they avoid a scheduling point in the kernel that could lead to context switches or (worse) priority inversion. So, this is primarily important when we’re talking real-time guarantees. Btw, OROCOS has implementations of such data structures that we might be able to re-use (not sure about licensing, but otherwise I see no issues).

 

This is the pivotal issue for our intra-process comms at the moment. If we can demonstrate that the message passing of shared pointer addresses meets the needs of hard or soft real-time situations, then the locking will be a matter of configuring or customizing the middleware.

 

I think this could be a problem, depending on exactly how it’s done. Does the C++ standard say anything about whether implementations *have* to use atomic compare and swap, or could an implementation also fall back to a mutex? The latter would be an issue because of the blocking semantics.

 

If you use a single shared_ptr object, the atomic_store and atomic_load functions could probably meet real-time requirements. However, I would think this is error-prone: if someone modifies the pointer the "regular" way, it would be unsafe.
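For reference, this is what the usage Ingo describes looks like with the C++11 free functions `std::atomic_store`/`std::atomic_load` for `shared_ptr` (deprecated in C++20 in favor of `std::atomic<std::shared_ptr<T>>`, and not guaranteed to be lock-free — exactly his concern). The surrounding names are illustrative:

```cpp
#include <memory>

// Hypothetical latest-value slot shared between a writer and a reader
// thread. Safe only as long as EVERY access goes through the atomic
// free functions -- a plain assignment or read of g_latest elsewhere
// would be a data race, which is what makes this pattern error-prone.
struct Setpoint { double position; };

std::shared_ptr<const Setpoint> g_latest;

void writer_publish(double pos) {
  // Atomically swap in a freshly allocated message.
  std::atomic_store(&g_latest,
      std::shared_ptr<const Setpoint>(std::make_shared<const Setpoint>(Setpoint{pos})));
}

std::shared_ptr<const Setpoint> reader_take() {
  // Atomic counterpart; `return g_latest;` here would be unsafe.
  return std::atomic_load(&g_latest);
}
```

Whether these calls are actually lock-free is implementation-defined (`std::atomic_is_lock_free` on the pointer reports it), so on its own this does not settle the real-time question.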

 

However, based our discussions I'm beginning to think that in order to meet all the varied needs, we'll probably have to consider doing custom intra-process comms and expose the threading and locking primitives as overridable parts of the Executor class. If we go this route, then reusing components from OROCOS or liblfds might be something we investigate. It should even be possible to avoid depending on them directly, but rather provide a package which contains a real-time and/or lock-free versions of the Executor, to be used in specific situations.

 

Orocos RTT makes this configurable on a per-port basis.

 

Personally, I think your idea of considering locking in conjunction with the threading is good. Something like RTC's ExecutionContext (which is internally sequential) could be a reasonable approach to simplify this for the user. E.g., if components are only in one context, you don't need locking; otherwise you do, for those which are in multiple ones.

 

Cheers,

Ingo

 

Robert Dean

Jun 18, 2015, 8:19:24 AM6/18/15
to ros-sig...@googlegroups.com
From my point of view there is an important difference between Orocos and RTC, in that Orocos has a "two-phase" start and stop procedure. We have a similar process in our systems, and have found it an invaluable design pattern for bringing up the system and shutting it down. For example, we mandate that data publishers be set up in phase 1, and subscribers in phase 2. This prevents deadlock on startup due to a publication resource not existing (shared memory is used for everything local).
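The two-phase pattern described above can be sketched as an orchestrator that runs phase 1 on every component before any component enters phase 2, so no subscriber can try to connect to a publisher that does not exist yet. The interface is hypothetical:

```cpp
#include <cassert>
#include <string>
#include <vector>

// Toy component with a two-phase bring-up: resources that others depend
// on (publishers) are created in phase 1, consumers of those resources
// (subscribers) only in phase 2.
struct Component {
  std::string name;
  std::vector<std::string> * log;  // records bring-up order for the demo
  void phase1() { log->push_back(name + ":publishers"); }
  void phase2() { log->push_back(name + ":subscribers"); }
};

// Orchestrator: complete phase 1 for ALL components before starting
// phase 2 for any of them.
void bring_up(std::vector<Component> & components) {
  for (auto & c : components) c.phase1();
  for (auto & c : components) c.phase2();
}
```

Shutdown would simply run the two phases in reverse order (tear down subscribers everywhere, then publishers).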

I think someone will need to dig up and link the actual C++ spec for mutex to satisfy Ingo. I know that it uses CAS, since I dug into the GCC source. If memory serves, std::mutex calls the GCC builtin pthread mutex code, which in turn has had #define'd CAS semantics for many years. I suspect the spec will defer to the C++ memory model, which in turn will say "use CAS if the hardware supports it".

atomic<shared_ptr> is not currently lock-free, as a shared_ptr is multiple pointers in size. In our system there is a CAS spinlock which wraps shared_ptr assignments when needed, which is about as fast as it can be.
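Such a CAS spinlock around shared_ptr assignment might look like the following. This is an illustrative sketch of the pattern, not the code Bob refers to:

```cpp
#include <atomic>
#include <memory>

// Minimal test-and-set spinlock built on std::atomic_flag (which is
// guaranteed lock-free). The critical section below is only a pointer
// copy, so the spin, when it happens at all, is very short.
class SpinLock {
public:
  void lock() {
    while (flag_.test_and_set(std::memory_order_acquire)) { /* spin */ }
  }
  void unlock() { flag_.clear(std::memory_order_release); }

private:
  std::atomic_flag flag_ = ATOMIC_FLAG_INIT;
};

// shared_ptr holder whose store/load are guarded by the spinlock,
// since a shared_ptr (two pointers wide) cannot be swapped with one CAS.
template <typename T>
class GuardedPtr {
public:
  void store(std::shared_ptr<T> p) {
    lock_.lock();
    ptr_ = std::move(p);
    lock_.unlock();
  }

  std::shared_ptr<T> load() {
    lock_.lock();
    auto copy = ptr_;  // bumps the refcount while the lock is held
    lock_.unlock();
    return copy;
  }

private:
  SpinLock lock_;
  std::shared_ptr<T> ptr_;
};
```

Note that a spinlock still has blocking semantics in the sense Ingo raises below: a reader can spin while a preempted writer holds the flag.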

I have another question, though: what is the real difference between a system call which forces a context switch, and a CAS spinlock which results in the timeslice expiring during the spin?

This leads to another, larger question: what is the responsibility of the middleware, and what is the responsibility of the design patterns developers choose for their systems? To date, I have not seen a case (this discussion included) which was not solvable at the system-architecture level via a design pattern, which results in reduced middleware complexity. I believe we have reached the point of needing well-defined use cases, described at a more detailed level than is occurring here, using real-world examples. This would allow features to be discussed and priorities assigned; if, for example, a feature is solvable with a design pattern, that would reduce middleware complexity and also reduce the amount of code which requires safety certification. Likewise, use-case analysis would allow comparison of proposed ROS 2 technologies, such as mutually exclusive callbacks, with alternative solutions from the community. I hate to say it, as I sit on four currently, but there may be a need for an actual Working Group with regular meetings and homework.


--

Luetkebohle Ingo (CR/AEA2)

unread,
Jun 18, 2015, 8:53:20 AM6/18/15
to ros-sig...@googlegroups.com

Hi Bob,


I concur with most of your points; however, the difference between lock-free and std::mutex has nothing to do with whether CAS is used. It has to do with std::mutex having blocking semantics. This is always true, irrespective of whether it is implemented using a system call or using CAS.


cheers,


Dr.-Ing. Ingo Lütkebohle
Software Design and Analysis -- Robotics (CR/AEA2)

Tel.
+49(711)811-12248
Fax +49(711)811-0

Mobil +49-1525-8813417

Ingo.Lue...@de.bosch.com

From: ros-sig...@googlegroups.com <ros-sig...@googlegroups.com> on behalf of Robert Dean <bob....@gmail.com>
Sent: Thursday, June 18, 2015 14:19

To: ros-sig...@googlegroups.com
Subject: Re: [ros-sig-ng-ros] ROS 2.0 and node lifecycle management

Jonathan Bohren

unread,
Jun 18, 2015, 9:51:50 AM6/18/15
to ros-sig...@googlegroups.com
Dirk, Will,

Once ROS2's C++ node and middleware libraries are deployed, how different do you expect a system using them to be from a system using Orocos with ROS typekits and DDS transport plugins?

-j

Geoffrey Biggs

unread,
Jun 19, 2015, 1:55:08 AM6/19/15
to ros-sig...@googlegroups.com
On 2015-06-18, 08:19 -0400, Robert Dean wrote:
> From my point of view there is an important difference between Orocos and
> RTC: Orocos has a "two phase" start and stop procedure. We have a similar
> process in our systems, and have found it an invaluable design pattern for
> bringing the system up and shutting it down. For example, we mandate that
> data publishers be set up in phase 1, and subscribers in phase 2. This
> prevents deadlock on startup due to a publication resource not existing
> (shared memory is used for everything local).

This paradigm is indeed very useful in the cases you describe. Which is why RTC
has it, too. :)

It is also possible to not need this paradigm, at least for setting up
connections. In many of our transports, a subscriber can set up the resources
just as well as a publisher, so it doesn't matter which one gets to it first.

[snip]

> This leads to another, larger question: what is the responsibility of the
> middleware, and what is the responsibility of the design patterns developers
> choose to use for their systems? To date, I have not seen a case (this
> discussion included) which was not solvable at the system-architecture level
> via a design pattern, which results in reduced middleware complexity. I
> believe that we have reached the point of needing well-defined use cases,
> described in more detail than is occurring here, using real-world examples.
> This would allow features to be discussed and priorities assigned — if, for
> example, a feature is solvable with a design pattern, that would reduce
> middleware complexity and also reduce the amount of code which requires
> safety certification. Likewise, use-case analysis would allow comparison of
> proposed ROS 2 technologies, such as mutually exclusive callbacks, with
> alternative solutions from the community. I hate to say it, as I sit on four
> currently, but there may be a need for an actual working group with regular
> meetings and homework.

I agree on the need for some well-defined use cases. Otherwise, we are just
playing around with "wouldn't it be neat if"s. As fun as that is, it may not
necessarily lead to the best design.

For what it's worth, I've drafted a node life cycle document for
design.ros2.org. It's based on our experience and on recent discussions amongst
the people who handle this stuff at the OMG.

https://github.com/ros2/design/pull/34

The draft was written while I was lacking sleep somewhere over far-northern
Russia, so treat it like a 3AM piece of code. It's rambling, incomplete
(especially the composite node stuff), and possibly incoherent. Still, I hope
it can be a basis for moving this discussion forward to producing a design.
Please edit the article (add use cases!) as you see fit.

Geoff

Adolfo Rodríguez Tsouroukdissian

unread,
Jun 19, 2015, 6:01:01 AM6/19/15
to ros-sig...@googlegroups.com
[snip]


I agree on the need for some well-defined use cases. Otherwise, we are just
playing around with "wouldn't it be neat if"s. As fun as that is, it may not
necessarily lead to the best design.

For what it's worth, I've drafted a node life cycle document for
design.ros2.org. It's based on our experience and on recent discussions amongst
the people who handle this stuff at the OMG.

https://github.com/ros2/design/pull/34

Geoff,

Thanks for putting this together, I added some comments inline.


The draft was written while I was lacking sleep somewhere over far-northern
Russia, so treat it like a 3AM piece of code. It's rambling, incomplete
(especially the composite node stuff), and possibly incoherent. Still, I hope
it can be a basis for moving this discussion forward to producing a design.
Please edit the article (add use cases!) as you see fit.

Your 3PM pieces of code must be really good then ;-)

Adolfo

Geoff


--
You received this message because you are subscribed to the Google Groups "ROS SIG NG ROS" group.
To unsubscribe from this group and stop receiving emails from it, send an email to ros-sig-ng-ro...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Adolfo Rodríguez Tsouroukdissian

unread,
Sep 7, 2015, 8:02:14 AM9/7/15
to ros-sig...@googlegroups.com, Herman Bruyninckx, Peter Soetens, Markus Klotzbuecher
On Fri, Jun 19, 2015 at 12:01 PM, Adolfo Rodríguez Tsouroukdissian <adolfo.r...@pal-robotics.com> wrote:
[snip]


I agree on the need for some well-defined use cases. Otherwise, we are just
playing around with "wouldn't it be neat if"s. As fun as that is, it may not
necessarily lead to the best design.

For what it's worth, I've drafted a node life cycle document for
design.ros2.org. It's based on our experience and on recent discussions amongst
the people who handle this stuff at the OMG.

https://github.com/ros2/design/pull/34

All,

I just wanted to ping back to say that over the past weeks the discussion has continued at https://github.com/ros2/design/pull/34. As suggested in the PR, I'm looping in some people who have good experience with component lifecycles, in case they want to chime in.

Best,

Adolfo.
