Metrics Architecture


Harald Scheirich

Nov 7, 2016, 2:13:41 PM
to openSurgSim
In our latest project, Vascular Surgery, requirements have come up that drive the need for more sophisticated handling of metrics data. One is the need for synthesized metrics, e.g. taking multiple measurements and state changes and extracting a metric from them. The other is the need to stream metrics data to the outside of the simulation.

While some of the proposed components will be implemented inside of the Vascular Surgery project, they will serve as a template for future development in this direction.

Measurement

Current Approach
At the moment most of the OSS architecture is driven by directed pushing and pulling of data. Components interact with other components either by observing another component's state (pull) and acting accordingly, or by receiving data from components (push). In most cases these are 1-1 relationships: one sender, one receiver. All of these interactions are synchronous on the thread that performs the activity. This means that when pushing data from the high-performance threads, care has to be taken not to block the pushing thread with data processing tasks on the receiver's side.
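One way to keep a pushing thread from being held up is for the receiver to only store incoming data during the push and do the actual processing later on its own thread. The following is a minimal sketch of that idea; `SampleQueue` and its methods are hypothetical names for illustration, not existing OSS classes.

```cpp
#include <cassert>
#include <mutex>
#include <vector>

// Hypothetical receiver-side buffer, not an OSS class.
// push() is meant to be called from the sending thread and only copies
// the sample; any expensive work is done by whoever calls drain().
template <typename T>
class SampleQueue
{
public:
    void push(const T& sample)
    {
        std::lock_guard<std::mutex> lock(m_mutex);
        m_pending.push_back(sample);
    }

    // Hands all pending samples to the caller. The lock is held only
    // for the duration of the swap.
    std::vector<T> drain()
    {
        std::vector<T> result;
        std::lock_guard<std::mutex> lock(m_mutex);
        result.swap(m_pending);
        return result;
    }

private:
    std::mutex m_mutex;
    std::vector<T> m_pending;
};

// Example: two pushes on the sender side, one drain on the receiver side.
inline void exampleSampleQueue()
{
    SampleQueue<double> queue;
    queue.push(0.25);
    queue.push(0.5);
    std::vector<double> batch = queue.drain();
    assert(batch.size() == 2);
    assert(queue.drain().empty());
}
```

The sender still takes a short lock, so this reduces rather than removes the coupling between the two threads.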

Problems
This works reasonably well but lacks flexibility. For example, if we need metrics in multiple places, we have to implement this for each component separately. If we pull metrics data and the sampling is done from a different thread, we run the risk of missing state changes. So we not only have to keep a history of all the state changes, we also need to keep track of when the last snapshot was taken so that the correct amount of history can be examined.
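To illustrate the bookkeeping this requires, here is a small sketch of a history with a per-reader cursor. `StateHistory` is a hypothetical helper, not an OSS class, and synchronization between threads is left out to keep the example short.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper, not an OSS class: records every state change and
// lets a reader ask for everything since its last snapshot.
// Not thread-safe; locking is omitted for brevity.
template <typename State>
class StateHistory
{
public:
    void record(const State& state) { m_changes.push_back(state); }

    // Latest value only; this is all a plain pull would see.
    const State& current() const
    {
        assert(!m_changes.empty());
        return m_changes.back();
    }

    // Returns the changes recorded since *cursor and advances the cursor
    // to the end of the history.
    std::vector<State> since(std::size_t* cursor) const
    {
        assert(*cursor <= m_changes.size());
        auto first = m_changes.begin() + static_cast<std::ptrdiff_t>(*cursor);
        std::vector<State> result(first, m_changes.end());
        *cursor = m_changes.size();
        return result;
    }

private:
    std::vector<State> m_changes;
};
```

If the state goes 0, 1, 0 between two samples, `current()` reports 0 both times and the transition to 1 is lost, while `since()` with the reader's cursor returns both intermediate changes.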

Publish Subscribe
To solve this communication problem we are introducing a publish/subscribe interface into OSS. There is currently a merge request online that does this as a component. After some internal discussions it seems more reasonable to introduce this architecture inside the `Runtime` class and make it automatic inside of OSS.

The current component uses `boost::any` for type erasure to produce an interface that lets us transport any kind of data. The receiver does need to know what kind of data was transported in order to cast it back to its original form.
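The type-erasure idea can be sketched as follows. This is not the component from the merge request: `MessageBus`, the string topics, and the method names are assumptions made for illustration, and `std::any` (C++17) is used as a stand-in for `boost::any` so the example needs no external library.

```cpp
#include <any>
#include <cassert>
#include <cstddef>
#include <functional>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical publish/subscribe hub, not the OSS implementation.
class MessageBus
{
public:
    using Callback = std::function<void(const std::any&)>;

    // Registers a receiver for a topic; several receivers may share a topic.
    void subscribe(const std::string& topic, Callback callback)
    {
        assert(callback);
        m_subscribers[topic].push_back(std::move(callback));
    }

    // Delivers the data synchronously on the calling thread and returns
    // the number of receivers that were called.
    std::size_t publish(const std::string& topic, const std::any& data) const
    {
        auto found = m_subscribers.find(topic);
        if (found == m_subscribers.end())
        {
            return 0;
        }
        for (const auto& callback : found->second)
        {
            callback(data);
        }
        return found->second.size();
    }

private:
    std::map<std::string, std::vector<Callback>> m_subscribers;
};

// Example: the receiver has to know that "force" carries a double.
inline double exampleForceTotal()
{
    MessageBus bus;
    double total = 0.0;
    bus.subscribe("force", [&total](const std::any& data) {
        total += std::any_cast<double>(data);
    });
    bus.publish("force", 1.5);
    bus.publish("force", 2.0);
    return total;
}
```

If a receiver casts to the wrong type, `std::any_cast` throws `std::bad_any_cast` (with `boost::any` the equivalent is `boost::bad_any_cast`), which is the cost of the sender and receiver agreeing on the type only by convention.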

We did consider using a messaging library like `ZeroMQ` at this point, but most libraries of this kind only transport untyped byte data, which means that we would have to convert all our data to byte streams for transport. This was considered too inconvenient for use inside of OSS.

This approach not only lets us distribute data to multiple receivers, it also lets receivers keep track of metrics from multiple components and therefore synthesize other metrics from the input data (e.g. the correct execution of an ordered procedure).
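As an illustration of such a synthesized metric, the sketch below checks whether a sequence of step events arrives in the expected order. `OrderedProcedureMetric`, its methods, and the step labels are hypothetical; in practice `onStep()` would be called from whatever subscription mechanism delivers the events.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical synthesized metric, not an OSS class: consumes step events
// from any number of components and tracks whether they arrive in order.
class OrderedProcedureMetric
{
public:
    explicit OrderedProcedureMetric(std::vector<std::string> expectedSteps)
        : m_expected(std::move(expectedSteps))
    {
        assert(!m_expected.empty());
    }

    // A step that matches the next expected one advances the procedure;
    // anything else is counted as an error and otherwise ignored.
    void onStep(const std::string& step)
    {
        if (m_next < m_expected.size() && step == m_expected[m_next])
        {
            ++m_next;
        }
        else
        {
            ++m_errors;
        }
    }

    bool isComplete() const { return m_next == m_expected.size(); }
    std::size_t errorCount() const { return m_errors; }

private:
    std::vector<std::string> m_expected;
    std::size_t m_next = 0;
    std::size_t m_errors = 0;
};
```

With expected steps A, B, C and received events A, C, B, C, the metric reports one error (the early C) and a completed procedure. How out-of-order steps should be scored is a design decision this sketch does not try to settle.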

Communication
Currently our metrics are gathered after the simulation has ended. This works well for 'After-Action' type metrics but doesn't work for any kind of data that might be continuously updated. One of our partners (Aptima) wants to integrate their product with OSS, and one of their requirements is to be able to stream data from OSS into their product, PM-Engine.

As we are currently using a WebSocket connection to communicate with our own metrics, and the connection with Aptima's PM-Engine also uses WebSocket, the approach being considered is to connect OSS to the server that communicates with the outside entities. Any metrics data that needs to go outside is sent via sockets to that server, from where it is passed on to the external software that displays the metrics.

At the point of communicating from the inside of OSS with the outside, messaging APIs again become an option; it just seemed simpler to use the existing server connection for communication.