Traceroute RFC - request for feedback by end of January

Lai Yi Ohlsen

Jan 18, 2022, 6:57:41 PM
to dis...@measurementlab.net
Hi everyone, 

M-Lab has published an RFC discussing our transition from MDA traceroute data. If you use our traceroute dataset or plan to in the future, please take a look at the RFC and send us your feedback by the end of January. Excerpt below. 

----

Background
M-Lab’s traceroute-caller (TRC) tool was designed and developed in early 2019 as a sidecar service running on M-Lab servers. Its purpose is to collect traceroute data toward any remote IP address after the TCP connection between that address and an M-Lab server closes. TRC uses the scamper tool to run traceroutes.

The initial version of TRC called scamper to run the tracelb command and saved the resulting traceroute as the traceroute datatype, which was later renamed to the scamper1 datatype.

As described in scamper’s manual page, the tracelb command is used to infer all per-flow load-balanced paths between a source and destination using the Multipath Discovery Algorithm (MDA). Starting in 1Q22, TRC also supports regular traceroutes, which take much less time to run and return a simpler result that is saved as the scamper2 datatype.

M-Lab would like to start collecting regular traceroutes (in addition to MDA traceroutes) in 1Q22 and stop collecting MDA traceroutes by the end of 2Q22.

Request for Comments
We are publishing this RFC to get feedback from the community regarding our decision to stop running MDA traceroutes by the end of 2Q22 and, instead, run regular traceroutes with the Paris traceroute algorithm.

If you use MDA traceroutes (scamper1 datatype) and this decision impacts you, please let us know via reply to the dis...@measurementlab.net mailing list. Also, please let us know if you are planning research that would benefit from regular traceroutes (scamper2 datatype).

Based on your feedback, we will decide whether we need to continue running MDA traceroutes and how to support them beyond what we already have.


Thanks! Happy New Year. 

--

Timur Friedman

Jan 19, 2022, 2:39:26 PM
to Lai Yi Ohlsen, dis...@measurementlab.net
Hello Lai Yi,

Am I correct in understanding that two things motivate this change?

- The complexity of parsing the multipath traceroute data structure
- The time that it takes to run the MDA in order to collect a multipath traceroute

If so, there are alternatives to abandoning the collection of multipath traceroutes. After all, M-Lab has the largest set of multipath traceroutes (well over one billion!) going back over many years. Anyone who wants to study how multipath routing has evolved over time in the internet would be hard-pressed to find any other comparable dataset.

Regarding the complex data structure, a single-path traceroute could easily be extracted from a multipath traceroute via post-processing. It would not be necessary to conduct the measurement twice.

And regarding the time that it takes, the problem has been solved in Kevin Vermeulen's Diamond-Miner work.

https://www.usenix.org/conference/nsdi20/presentation/vermeulen

While Kevin's work focuses on rapidly collecting traceroutes towards all of the internet's routable prefixes in a short period of time, the principles behind the speed-up apply equally well to a trace towards a single destination. In a first round of probing, packets to all hop counts can be sent in parallel, and ten rounds of probing are almost always enough to complete a multipath traceroute. The time required is therefore slightly less than that of a classic traceroute, whose duration scales with the length of the route.
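
As a rough illustration of the timing argument above, here is a minimal sketch of the two probing schedules. It rests on simplifying assumptions that are not taken from the Diamond-Miner paper: each round (or each sequentially probed hop) costs one fixed wait, and the function names are hypothetical.

```python
def classic_trace_time(hops: int, wait_s: float) -> float:
    """Sequential probing: one wait per hop, so time grows with route length."""
    return hops * wait_s


def round_based_trace_time(rounds: int, wait_s: float) -> float:
    """Round-based probing: all hop counts are probed in parallel within a
    round, so time grows with the number of rounds, not the route length."""
    return rounds * wait_s


if __name__ == "__main__":
    wait_s = 1.0  # assumed wait per hop or per round, in seconds
    rounds = 10   # "ten rounds ... are almost always enough"
    for hops in (8, 12, 16, 24):
        print(hops, classic_trace_time(hops, wait_s),
              round_based_trace_time(rounds, wait_s))
```

Under this simplified model, the round-based schedule is faster whenever the route has more hops than the number of rounds needed.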

We've distilled the Diamond-Miner probing engine into free, open-source, liberally licensed code that we call Caracal.

https://github.com/dioptra-io/caracal

With Caracal, we could easily speed up M-Lab's multipath route tracing while producing both multipath and single path outputs in the existing formats that M-Lab provides.

Incidentally, we are producing daily surveys from a single vantage point of multipath traceroutes to all routable IPv4 prefixes in the internet, and these are available to any researcher upon request through our Iris platform.

https://iris.dioptra.io/#/

So if M-Lab continues to collect multipath traceroutes, it will no longer be alone in doing so, which should enhance the value of M-Lab's multipath data for the research community.

Kind regards,

Timur


--
You received this message because you are subscribed to the Google Groups "discuss" group.
To unsubscribe from this group and stop receiving emails from it, send an email to discuss+u...@measurementlab.net.
To view this discussion on the web visit https://groups.google.com/a/measurementlab.net/d/msgid/discuss/CAD3WrcO1ReqyZNtqn57MSws33khjyTMW%2B6MoSQtsRj%3DdZAgjew%40mail.gmail.com.

Stephen Soltesz

Jan 19, 2022, 3:16:21 PM
to Timur Friedman, Lai Yi Ohlsen, dis...@measurementlab.net
Thank you, Timur, comments below.

On Wed, Jan 19, 2022 at 2:39 PM Timur Friedman <t...@oxus.net> wrote:
> Hello Lai Yi,
>
> Am I correct in understanding that two things motivate this change?
>
> - The complexity of parsing the multipath traceroute data structure
> - The time that it takes to run the MDA in order to collect a multipath traceroute

From my perspective, the primary challenge is using the multipath traceroute data. People often expect single-path traceroutes, or may require them when combining this dataset with other datasets.

> If so, there are alternatives to abandoning the collection of multipath traceroutes. After all, M-Lab has the largest set of multipath traceroutes (well over one billion!) going back over many years. Anyone who wants to study how multipath routing has evolved over time in the internet would be hard-pressed to find any other comparable dataset.

I like this point. M-Lab was founded to collect longitudinal data. "Don't throw out the baby with the bath water" so to speak. Do you know anyone who is doing this type of research on multipath traceroutes?

> Regarding the complex data structure, a single-path traceroute could easily be extracted from a multipath traceroute via post-processing. It would not be necessary to conduct the measurement twice.

Can you say more about how to easily extract a single-path traceroute from a multipath traceroute? We touched on this conversationally but the mechanism was unclear. There are two potential users here: those who wish to process the raw archive files themselves, and those who wish to use BigQuery.

Best,
Stephen

Timur Friedman

Jan 19, 2022, 4:11:12 PM
to discuss, Stephen Soltesz, la...@measurementlab.net, dis...@measurementlab.net
Hi Stephen,

From time to time there have been systematic studies of multipath routing in the internet, the most recent that comes to mind being Rafael Almeida's doctoral thesis in 2019 on "Classification of Load Balancing in the Internet" (he's also the lead author on an IEEE Infocom 2020 paper of the same name), but I am not sure that, besides our own group, Dioptra.io, and, of course, M-Lab, there are groups that are engaged in long-term efforts to document it.

Multipath routing is one of those things like MPLS tunnels and IP anycast that reflect ongoing fundamental changes in how things work in the internet. To me, it would be a pity to see a break in this key dataset that allows people in the future to study how this particular architectural aspect of the network has evolved.

In order to extract a single-path traceroute, one simply needs to focus on a single flow identifier, what the RIPE Atlas folks call the "Paris ID", and pick out the interface at each hop that corresponds to that identifier in the multipath traceroute.

There are some implementation details to be addressed, such as what happens in the rare case that a per-packet load balancer is traversed, and the same Paris ID returns multiple interfaces at the same hop; or which path to prefer if one is longer than another, or has a better response rate from interfaces than another. So long as the design decisions are well documented, I do not see any fundamental difficulties.

Users would not need to do this processing themselves if it were provided as a post-processed output from the multipath traceroute tool itself. The tool could provide both the multipath traceroute and an example single-path traceroute corresponding to a single Paris ID. 
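
The extraction described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the `Reply` record is a hypothetical flattened form of a multipath trace (it is not the scamper1 format), and the tie-breaking rule for per-packet load balancers (keep the first interface seen) is one possible design decision among those mentioned.

```python
from collections import defaultdict
from typing import Dict, List, NamedTuple, Optional


class Reply(NamedTuple):
    ttl: int      # hop count of the probe that provoked the reply
    flow_id: int  # flow identifier ("Paris ID") of that probe
    addr: str     # interface address that replied


def extract_single_path(replies: List[Reply], flow_id: int) -> List[Optional[str]]:
    """Return one address per hop for the chosen flow, None where no reply."""
    by_ttl: Dict[int, List[str]] = defaultdict(list)
    for r in replies:
        if r.flow_id == flow_id and r.addr not in by_ttl[r.ttl]:
            by_ttl[r.ttl].append(r.addr)
    if not by_ttl:
        return []
    path: List[Optional[str]] = []
    for ttl in range(1, max(by_ttl) + 1):
        addrs = by_ttl.get(ttl)
        # Assumed design decision: if a per-packet load balancer yields
        # several interfaces for one flow at one hop, keep the first seen.
        path.append(addrs[0] if addrs else None)
    return path


replies = [
    Reply(1, 0, "10.0.0.1"), Reply(1, 1, "10.0.0.1"),
    Reply(2, 0, "10.0.1.1"), Reply(2, 1, "10.0.1.2"),
    Reply(4, 0, "10.0.3.1"), Reply(4, 1, "10.0.3.1"),
]
print(extract_single_path(replies, 0))
# ['10.0.0.1', '10.0.1.1', None, '10.0.3.1']
```

Hop 3 did not answer in this made-up trace, so it appears as `None`, much as a classic traceroute would print `*`.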

Kind regards,

Timur

Kavé Salamatian

Jan 19, 2022, 5:07:38 PM
to Timur Friedman, discuss, Stephen Soltesz, la...@measurementlab.net
Hello all, 

I am also using these data, and I am aware of some people at Columbia who are also using them.

All the best

Rgds

Timur Friedman

Jan 20, 2022, 8:23:08 AM
to Matt Mathis, discuss, Stephen Soltesz, la...@measurementlab.net, Kavé Salamatian, Maxime MOUCHET
Hello Matt,

Agreed: multipath tracing ought not come at the cost of obtaining a decent single-path trace.

This strikes me as an implementation issue that could be addressed by tweaking the current tracing tool.

It is possible to design a route tracing tool so that classic Traceroute's demultiplexing issue goes away. 

The original insight was Rob Beverly's: to craft the traceroute probe packets in such a way that the ICMP replies are entirely self-identifying. This does away with the need to maintain state on each probe packet sent, in order to match each reply with its corresponding probe packet. He based his high-speed single-path probing tool Yarrp on this insight.

Kevin Vermeulen extended the same principle to multipath probing: his Diamond-Miner tool similarly crafts the probe packets so that the ICMP replies are self-identifying, including, in this case, the flow identifiers, or "Paris IDs", of each probe packet that provoked a reply.
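
To make the idea of self-identifying replies concrete, here is a minimal sketch. The field layout is purely hypothetical and is not the actual Yarrp, Diamond-Miner, or Caracal encoding: it assumes the initial TTL is carried in the IP identification field and the flow identifier in the UDP source port. An ICMP error quotes the offending packet's IP header plus at least the first 8 bytes of its payload, so both fields come back in the reply and no per-probe state is needed.

```python
import socket
import struct
from typing import Tuple

BASE_SPORT = 24000  # assumed base source port; flow_id is added to it
DPORT = 33434       # conventional traceroute destination port


def build_probe_headers(src: str, dst: str, ttl: int, flow_id: int) -> bytes:
    """Build the IPv4 + UDP headers of a probe (checksums left at 0 here)."""
    ip_header = struct.pack(
        "!BBHHHBBH4s4s",
        0x45, 0, 28,    # IPv4, no options, 20 + 8 bytes in total
        ttl,            # IP identification carries the initial TTL (assumed)
        0, ttl, 17, 0,  # no fragmentation, TTL, protocol UDP, checksum 0
        socket.inet_aton(src), socket.inet_aton(dst),
    )
    udp_header = struct.pack("!HHHH", BASE_SPORT + flow_id, DPORT, 8, 0)
    return ip_header + udp_header


def decode_quoted_headers(quoted: bytes) -> Tuple[str, int, int]:
    """Recover (destination, initial TTL, flow_id) from ICMP-quoted bytes."""
    initial_ttl = struct.unpack("!H", bytes(quoted[4:6]))[0]
    dst = socket.inet_ntoa(bytes(quoted[16:20]))
    sport = struct.unpack("!H", bytes(quoted[20:22]))[0]
    return dst, initial_ttl, sport - BASE_SPORT


probe = build_probe_headers("192.0.2.1", "198.51.100.7", ttl=9, flow_id=3)
quoted = bytearray(probe)
quoted[8] = 1  # routers decrement the TTL field, so it cannot identify the hop
print(decode_quoted_headers(quoted))
```

The decoder needs nothing but the reply itself, which is what lets a prober keep very many probes outstanding without a lookup table.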

Now, our Iris measurement platform regularly runs Diamond-Miner route traces towards all routable IPv4 prefixes of the internet at a rate of 100,000 probe packets per second. Under these conditions, there are hundreds of thousands of outstanding probe packets at any given moment, and when the replies do come in, they are successfully associated with the trace of a particular route or, occasionally, if the reply is corrupted in some way, discarded.

The 100,000 probe packet per second limit is self-imposed so as not to trigger warnings with our ISP. Otherwise, Diamond-Miner could be scaled to run faster, say at a million probe packets per second or more, while all the time correctly identifying the replies.

We have embodied this behavior in our liberally licensed, free, open-source Caracal probing engine. As I mentioned earlier in this thread, we could easily provide the route tracing tool that M-Lab needs on the basis of this engine. It would be sure to obtain a single-path route trace alongside each multipath trace. And it would withstand whatever case load you send its way. I'm including Maxime Mouchet, the lead developer and maintainer of the current version of Caracal, in the conversation.

Kind regards,

Timur




On Thu, Jan 20, 2022 at 5:58 AM Matt Mathis <mattm...@google.com> wrote:
Another consideration is that multipath traceroute is more likely to outright fail (zero output) on some of the most interesting paths, because it reaches a time limit or some other resource limit.

My wish would be to increase the coverage of the single path traceroutes, possibly by relaxing the coverage on the multipath traceroute.

My chronic worry is our ability to assure that we are properly demuxing the ICMP replies, under our worst case loads.
 
Thanks,
--MM--
The best way to predict the future is to create it.  - Alan Kay

We must not tolerate intolerance;
       however our response must be carefully measured: 
            too strong would be hypocritical and risks spiraling out of control;
            too weak risks being mistaken for tacit approval.


Matt Mathis

Jan 20, 2022, 8:23:08 AM
to Timur Friedman, discuss, Stephen Soltesz, la...@measurementlab.net, Kavé Salamatian
Another consideration is that multipath traceroute is more likely to outright fail (zero output) on some of the most interesting paths, because it reaches a time limit or some other resource limit.

My wish would be to increase the coverage of the single path traceroutes, possibly by relaxing the coverage on the multipath traceroute.

My chronic worry is our ability to assure that we are properly demuxing the ICMP replies, under our worst case loads.
 
Thanks,
--MM--

Ethan Katz-Bassett

Jan 20, 2022, 3:02:23 PM
to Timur Friedman, Matt Mathis, discuss, Stephen Soltesz, la...@measurementlab.net, Kavé Salamatian, Maxime MOUCHET, Ítalo Fernando Scotá Cunha, Kévin Vermeulen, Choffnes, David
Chiming in as one of the people at Columbia Kavé mentioned as using the data.

[I'm also adding the other people on my M-Lab project to the thread, including Kevin, who developed the Diamond-Miner tool that Timur mentioned.]

I agree with everything that Timur said:
- Most of the drawbacks of multipath tracing that have been mentioned have been solved or seem to be addressable.
- I hope you'll continue issuing multipath measurements.
- It seems to make sense to extract (in post processing, when storing the data) a single path measurement from each of those and "present" that as the basic traceroute that many users will use without looking at the multipath measurement. The multipath measurements can be in a separate table for those who want them, perhaps with a foreign key to join between the two.
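
As one possible shape for the two-table layout in the last point, here is a small sketch using an in-memory SQLite database. All table and column names are hypothetical, and M-Lab's actual BigQuery schemas may look quite different; the point is only that a shared `trace_id` key lets users start from the single-path table and join to the multipath data when they need it.

```python
import sqlite3
from typing import List, Tuple


def build_demo_db() -> sqlite3.Connection:
    """Create hypothetical single-path and multipath tables sharing trace_id."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE trace (
            trace_id INTEGER PRIMARY KEY, src TEXT, dst TEXT
        );
        CREATE TABLE multipath_hop (
            trace_id INTEGER REFERENCES trace(trace_id),
            ttl INTEGER, flow_id INTEGER, addr TEXT
        );
        CREATE TABLE single_path_hop (
            trace_id INTEGER REFERENCES trace(trace_id),
            ttl INTEGER, addr TEXT
        );
    """)
    conn.execute("INSERT INTO trace VALUES (1, '192.0.2.1', '198.51.100.7')")
    conn.executemany("INSERT INTO multipath_hop VALUES (?, ?, ?, ?)", [
        (1, 1, 0, "10.0.0.1"), (1, 1, 1, "10.0.0.1"),
        (1, 2, 0, "10.0.1.1"), (1, 2, 1, "10.0.1.2"),
    ])
    conn.executemany("INSERT INTO single_path_hop VALUES (?, ?, ?)", [
        (1, 1, "10.0.0.1"), (1, 2, "10.0.1.1"),
    ])
    return conn


def hops_with_width(conn: sqlite3.Connection, trace_id: int) -> List[Tuple]:
    """For each single-path hop, count distinct multipath interfaces there."""
    return conn.execute("""
        SELECT s.ttl, s.addr, COUNT(DISTINCT m.addr) AS width
        FROM single_path_hop AS s
        JOIN multipath_hop AS m
          ON m.trace_id = s.trace_id AND m.ttl = s.ttl
        WHERE s.trace_id = ?
        GROUP BY s.ttl, s.addr
        ORDER BY s.ttl
    """, (trace_id,)).fetchall()


print(hops_with_width(build_demo_db(), 1))
# [(1, '10.0.0.1', 1), (2, '10.0.1.1', 2)]
```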

Our most common use case has been when we are working on something related to Internet routing/measurement and need a large set of multipath routes or load balancing routers, to help us test the behavior of our measurements in that setting or to help us interpret those results. It's useful to have large sets of such measurements already available (from Timur's platform and/or M-Lab) that we can easily use to join with our measurements.

Ethan

Lai Yi Ohlsen

Jan 21, 2022, 2:21:15 PM
to Ethan Katz-Bassett, Timur Friedman, Matt Mathis, discuss, Stephen Soltesz, Kavé Salamatian, Maxime MOUCHET, Ítalo Fernando Scotá Cunha, Kévin Vermeulen, Choffnes, David
Hi everyone, 

Thank you so much for your feedback thus far. It is clear there is an interest in M-Lab continuing to collect multipath traceroutes. We have some follow-up questions that we will circle back with next week. 

Have a great weekend! 

Lai Yi Ohlsen

Jan 31, 2022, 5:26:11 PM
to Ethan Katz-Bassett, Timur Friedman, Matt Mathis, discuss, Stephen Soltesz, Kavé Salamatian, Maxime MOUCHET, Ítalo Fernando Scotá Cunha, Kévin Vermeulen, Choffnes, David
Hi again,

As noted by Saied in our follow-up to discuss@ ("Traceroute Format Change RFC Results"), we will continue to collect MDA traceroutes as before and archive them as scamper1 datatype. With this decision, we have some follow-up questions about implementation and are in the process of organizing a meeting with the participants on this thread to discuss these details. 

If you would like to also be included in this discussion re: MDA traceroutes or have recommendations for others to include, please let me know by replying to this email. We are also seeking feedback about the use of our traceroute data more broadly -- please see the thread "Accessing Traceroute data" for more information. 

Thanks! 