Is there any app/module/manager that can detect the quality of network links?

Mao Jianwei

Oct 17, 2020, 11:25:40 AM
to ONOS Discuss
Hi friends,

Happy to see that ONOS has been updated and iterated to 2.4.0 :)

I'm interested in that question:
Is there any app/module/manager, which we can use to detect the quality of network links, in ONOS now?
Such as latency(delay), jitter, packet loss ratio, and so on.

Thanks,
Mao

Eder

Oct 17, 2020, 12:53:48 PM
to ONOS Discuss, Mao Jianwei

Hi,

I am not aware of any app on ONOS that can measure this. As far as I know (maybe someone can correct me), latency, jitter, and packet loss are stats that the switches themselves would have to measure first. To measure latency you'd need two synchronized switches that can timestamp packets, or two hosts that can do the same and are obviously kept synchronized too. Measuring something like bandwidth is generally easier, since the switch stats tell you how many bytes were sent every N seconds. Dropped packets might be reported by the switch in the port stats reply (I have not checked this).
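
To illustrate the bandwidth point, here is a minimal sketch of deriving throughput from two successive byte-counter readings. The names `PortSample` and `throughput_bps` are hypothetical and not part of ONOS or OpenFlow, and the 64-bit counter width is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PortSample:
    """One reading of a port's byte counter (hypothetical structure)."""
    timestamp_s: float  # controller-side time of the reading, in seconds
    tx_bytes: int       # cumulative bytes transmitted on the port


COUNTER_MODULUS = 2 ** 64  # assumption: counters are 64 bits wide


def throughput_bps(prev: PortSample, curr: PortSample) -> float:
    """Average transmit rate in bits per second between two samples."""
    interval = curr.timestamp_s - prev.timestamp_s
    if interval <= 0:
        raise ValueError("samples must be in increasing time order")
    # The modulo keeps the delta correct if the counter wrapped once.
    delta_bytes = (curr.tx_bytes - prev.tx_bytes) % COUNTER_MODULUS
    return delta_bytes * 8 / interval
```

The result is an average over the polling interval, so short bursts between two readings are smoothed out.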

You can, however, measure the RTT with tools like ping, which gives you the time an ICMP request/response took to complete. You could interpret the one-way time of a packet as RTT/2, but this is likely not what actually happened, since the two directions can differ. You can measure packet loss with tools like iperf, if I remember correctly.

On the other hand, you can set parameters like delay, loss, or jitter in Mininet. You could then feed ONOS this information, as if the controller knew the network conditions, and calculate paths (or anything else you'd like to do) based on the link delay or packet loss you configured.

Cheers, 

Dan Martin

Oct 17, 2020, 3:50:58 PM
to Eder, Mao Jianwei, ONOS Discuss
The MEF standards for connectivity fault management and link fault management are good for layer 2. There’s BFD for layer 3.
Without that you’ll have to know who the neighbor is and look at the port stats on both sides of the link for drops and errors.
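
A rough sketch of the both-sides comparison: take the transmit packet counter on one end of the link and the receive packet counter on the other end, sampled over about the same interval. The function name and arguments are hypothetical. Because the two counters are never read at exactly the same instant, the result is only an approximation.

```python
def link_loss_ratio(tx_before: int, tx_after: int,
                    rx_before: int, rx_after: int) -> float:
    """Approximate one-direction loss ratio on a link.

    tx_* are packet counts sent by the upstream port, rx_* are packet
    counts received by the downstream port, read at the start and end
    of the same measurement interval.
    """
    sent = tx_after - tx_before
    received = rx_after - rx_before
    if sent <= 0:
        return 0.0  # nothing was sent, so no loss can be observed
    # Clamp at zero: skew between the two readings can make
    # 'received' slightly larger than 'sent'.
    lost = max(sent - received, 0)
    return lost / sent
```

For example, 1000 packets sent and 990 received over the interval gives a ratio of about 0.01.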

If you’re lucky the box you’re using will be able to stream that to you and you can use solr/nifi/cassandra to make sense of it.

Look for SAA, RPM, and rttmon; I’m sure there are other vendors who implement stuff like that.

Mao Jianwei

Oct 17, 2020, 11:09:04 PM
to ONOS Discuss, Eder, Mao Jianwei
Hi Eder,

Thanks for your suggestions. I haven't found a module in ONOS that can measure link quality, either.

As for timestamping packets, I agree: if two switches have to stamp one probe packet using their own clocks, then synchronizing those clocks is both necessary and difficult.
In addition, timestamping packets places higher requirements on the switches, because most switches/routers/vSwitches only perform table lookup and packet forwarding. By the way, there is no protocol message, in OpenFlow for example, that can report time data from the switch to the controller.
As for switch stats, they provide source data for calculating bandwidth and packet loss ratio, but no time information for latency/jitter.
As for ICMP, it is an end-to-end method between a host client and server. Using it to measure RTT between switches requires a loopback or interface address configured on each switch, plus timestamping ability. And if ICMP is run between switches, there is no way to report the RTT result to the controller.
As for iperf, it is a good way to measure link capacity and device performance, but using it for routine measurement may affect network performance.

So maybe we need a new active measurement method in the control plane, i.e. performed by ONOS, that needs no timestamping at the switches and puts no performance pressure on the whole network. What do you think? :)

Cheers,
Mao

Mao Jianwei

Oct 17, 2020, 11:24:58 PM
to ONOS Discuss, dan.d...@gmail.com, Mao Jianwei, ONOS Discuss, Eder
Hi Martin,

I believe there are some standards and third-party business solutions that can be used to measure link quality, for example iFIT.

But there is no analogous function in ONOS, so maybe we need to develop one :)

Cheers,
Mao

Eder

Oct 18, 2020, 9:51:04 AM
to ONOS Discuss, Mao Jianwei, Eder
Hi,

> As for timestamping packets, I agree: if two switches have to stamp one probe packet using their own clocks, then synchronizing those clocks is both necessary and difficult.
> Besides, timestamping packets places higher requirements on the switches, because most switches/routers/vSwitches only perform table lookup and packet forwarding. By the way, there is no protocol message, in OpenFlow for example, that can report time data from the switch to the controller.

I think timestamping and a synchronization protocol such as PTP (or something similar) might be necessary on the switches, unfortunately. You can, of course, install PTP-aware interfaces on hosts and test end-to-end delay for a particular path without involving the switches in the timestamping task. I sometimes hear other researchers and engineers comment that the queueing delay in a particular forwarding device can be significantly higher than the delay for a packet to traverse a particular link, which might make sense. You can measure this with P4, In-band Network Telemetry (INT), and BMv2 or Tofino targets/devices, I'd say. If the Tofino devices support PTP, there might be a way to carry timestamps in the packet using INT (we would have to see which timestamps are supported in the metadata). Then, at the end of the path, this information is extracted from the packets and collected.

> As for switch stats, they provide source data for calculating bandwidth and packet loss ratio, but no time information for latency/jitter.

Yes, bandwidth should be fine to calculate because the controller receives the bytes sent/received on ports. This is something that ONOS already reports (I think) in the web UI. You can get a little more specific by calculating per-flow instead of per-port bandwidth. And as we said, maybe you can get some info on dropped packets, but I don't know whether a reported drop is due to a flow rule whose treatment is to drop or due to congestion (we'd have to look into this). As you said, there is nothing about delay or jitter, which makes sense given the tools and protocols needed to measure these properly.

> As for ICMP, it is an end-to-end method between a host client and server. Using it to measure RTT between switches requires a loopback or interface address configured on each switch, plus timestamping ability. And if ICMP is run between switches, there is no way to report the RTT result to the controller.

When I mentioned ICMP, I was thinking about measuring the approximate time a packet takes to traverse a particular path, which involves one or more switches, and where queueing might matter more than the delay of traversing a link. I think I have seen some papers that calculate the value you are looking for by sending a packet to a switch, forwarding it to another switch, and then sending it back to the controller: Controller -> Sw1 -> Sw2 -> Controller. I think they use several statistical methods to calculate the control plane link and processing delay (I cannot remember how) and then subtract this from the overall delay they measured. This, however, might not be as exact as expected, and it also requires the controller to query all switches at regular intervals.
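
As a very rough sketch of that subtraction idea (not the exact method from those papers, which I cannot recall): all timestamps are taken at the controller, the control channel to each switch is assumed to be symmetric so its one-way delay is about half the controller-to-switch RTT, and a median over several samples stands in for whatever statistics the papers use. The function name and parameters are hypothetical.

```python
from statistics import median
from typing import Sequence


def estimate_link_delay_ms(probe_total_ms: Sequence[float],
                           ctrl_rtt_sw1_ms: Sequence[float],
                           ctrl_rtt_sw2_ms: Sequence[float]) -> float:
    """Estimate the Sw1 -> Sw2 link delay in milliseconds.

    probe_total_ms:  times for Controller -> Sw1 -> Sw2 -> Controller probes
    ctrl_rtt_sw1_ms: controller <-> Sw1 round-trip times
    ctrl_rtt_sw2_ms: controller <-> Sw2 round-trip times
    """
    if not (probe_total_ms and ctrl_rtt_sw1_ms and ctrl_rtt_sw2_ms):
        raise ValueError("each sample list needs at least one value")
    total = median(probe_total_ms)
    # Assumption: symmetric control channels, so one-way delay = RTT / 2.
    ctrl_to_sw1 = median(ctrl_rtt_sw1_ms) / 2
    sw2_to_ctrl = median(ctrl_rtt_sw2_ms) / 2
    # Measurement noise can push the difference below zero; clamp it.
    return max(total - ctrl_to_sw1 - sw2_to_ctrl, 0.0)
```

The estimate still includes switch processing and queueing time, so it is an upper bound on pure link propagation delay rather than an exact value.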

Additionally, it is true that I don't think there is any way to report the RTT to the controller. I remember that OpenFlow accommodates experimenter messages, but I don't know whether they could be used for this.

> As for iperf, it is a good way to measure link capacity and device performance, but using it for routine measurement may affect network performance.

This is true, and I also see it as end-to-end: you could test throughput, but you might not know which device/link is performing worse (if any).

> So maybe we need a new active measurement method in the control plane, i.e. performed by ONOS, that needs no timestamping at the switches and puts no performance pressure on the whole network. What do you think? :)

This seems like a good idea. You'd have to consider that, without timestamping, measuring delay gets complicated. Using P4 and extending the work in the link I posted earlier could be a good use case, but still a difficult one, as it might require specialized hardware. If you use BMv2, then Andy and Antonin make a really good point at the very end of this file. If you'd prefer to involve the controller and use OpenFlow switches, then you might be looking for work similar to this or this.

Cheers,