Seeking algorithm assistance for determining shortest 500k route

361 views
Skip to first unread message

Paul Melzer

unread,
Aug 31, 2016, 4:19:01 PM8/31/16
to or-tools-discuss
Hello, 

I need some guidance from any and all who can offer it. I am an ultrarunner (long distance) and I am planning to run [in one event] all the city streets of my town, Redlands, California (USA), as a charity for a local nonprofit. The length will be approximately 500k and I do not want to repeat any more streets than I have to. Every time I stop and leave the course, I must then return to the same place to continue on. Given the number of streets (nodes and sides [street blocks and crossings]), I believe only a computer can find the shortest route(s). I am willing to tackle any work necessary to provide the basic information needed (how many nodes, whether odd or even, name each node and/or side, etc.), but am not a mathematician nor computer scientist. I do happen to live in the home town of ESRI, however I do not know anyone there who can assist me.

I have a line graph of the city streets (not yet vectorized), which I attach below as my exhibit. 

I do so appreciate any and all assistance. (I found my way here because I saw that Vincent Furnon frequents this group and, I believe, is both an expert in the problem and a runner...I do not have Mr. Furnon's contact information.)

My kind thanks, in advance,

Paul Melzer
redlands-runaround-map-flat.jpg

Rob

unread,
Sep 1, 2016, 12:48:39 PM9/1/16
to or-tools-discuss
Hi Paul

I'm not an ultra-distance runner, but I like big TSPs. There are a few aspects to your problem description that make it tricky. Firstly, you need a reasonably accurate distance matrix based on the walking distance between places (you can apply a factor to this to adjust to running speed). Whether you use a TSP (Travelling Salesman Problem) formulation or a VRP (Vehicle Routing Problem) formulation, an accurate distance matrix would be needed for both. There are a few ways to do this but all fall outside the scope of this group (dm's are more data generation or input than optimisation).

In a TSP formulation you would find the shortest path to visit all streets exactly once. You may want to model a street as two streets (one in each direction) if you need to visit each pavement on a street (i.e. run both directions of a road). It's actually easier if you need to run both sides of the road because you explicitly model the direction of travel then. It doubles the size of the problem, but actually makes it easier to model (there's a subtle point about the direction/bearing you approach the midpoint of a road in). You could use the Concorde TSP Solver (http://www.math.uwaterloo.ca/tsp/concorde.html) to generate a "super-tour" which is the shortest path to visit all your nodes in one run. This is probably not possible in a single run in practice when you run the tour, but you could split your routes up each day if someone is collecting you in a vehicle (driving far in a vehicle doesn't matter, you effort on foot is more "costly"). 

In the VRP you would model each day of your run as a route and would probably focus on generating routes that match how long you can run for each day? As a runner, you might want to model the gradient of the roads (but this is optional). This may produce a result that's more user-friendly than the TSP result above, but will not be as short as the route produced by concorde (let's say for a non-trivially sized problem). 

The issue in both of these formulations is going to be the size of the problem you're solving when you model every street. In a small portion of the network there can be thousands of nodes which won't blend well with the or-tools routing library (which is very good, but not at that kind of size - not in my experience anyway). If you're happy to use an approximate solution (rather than an optimal one) in concorde you can probably get a solution which will be very very close to an optimal super tour.

Lastly, if you've already decided your route should be 500km, does it really matter which 500km it is? Technically there's no such thing as a shortest 500km route when it's defined as 500km. Perhaps I've misunderstood your question and you just want 500km's worth of non-overlapping adjacent edges which wouldn't require as complicated an algorithm/approach to solve. 

Cheers
Rob 

Paul Melzer

unread,
Sep 1, 2016, 7:14:53 PM9/1/16
to or-tools-discuss
Hello Rob! Thanks so much for giving some thought to my problem. I've written comments to yours within them.

Paul


On Thursday, September 1, 2016 at 9:48:39 AM UTC-7, Rob wrote:
Hi Paul

I'm not an ultra-distance runner, but I like big TSPs. There are a few aspects to your problem description that make it tricky. Firstly, you need a reasonably accurate distance matrix based on the walking distance between places (you can apply a factor to this to adjust to running speed). Whether you use a TSP (Travelling Salesman Problem) formulation or a VRP (Vehicle Routing Problem) formulation, an accurate distance matrix would be needed for both. There are a few ways to do this but all fall outside the scope of this group (dm's are more data generation or input than optimisation).

I can determine a fairly exact measurement of one long street (in meters, say), and once vectorized, can figure out measurements for each side. Tedious, yes, but quite doable.  

In a TSP formulation you would find the shortest path to visit all streets exactly once.

I understand that a eulerian path is not possible. I understand that there will be duplications [beyond the obvious cul-de-sacs which will require two lengths to return to the node]. My goal is a route that, as if drawn without lifting pencil from paper, a continuous route. The ultimate running of it would require no exit from the path (sleeping/eating, etc., all down en route); allowing for such exits, the requirement is that the return be to the same place of exit, thus honoring the original "unbroken" path.
 
You may want to model a street as two streets (one in each direction) if you need to visit each pavement on a street (i.e. run both directions of a road). It's actually easier if you need to run both sides of the road because you explicitly model the direction of travel then. It doubles the size of the problem, but actually makes it easier to model (there's a subtle point about the direction/bearing you approach the midpoint of a road in). You could use the Concorde TSP Solver (http://www.math.uwaterloo.ca/tsp/concorde.html) to generate a "super-tour" which is the shortest path to visit all your nodes in one run.

The Traveling Salesman Problem only requires visiting each node, not each edge (street), so that model would not work without some hand tweaking [and a more than likely increase in distance]. 

This is probably not possible in a single run in practice when you run the tour, but you could split your routes up each day if someone is collecting you in a vehicle (driving far in a vehicle doesn't matter, you effort on foot is more "costly"). 

In the VRP you would model each day of your run as a route and would probably focus on generating routes that match how long you can run for each day? As a runner, you might want to model the gradient of the roads (but this is optional). This may produce a result that's more user-friendly than the TSP result above, but will not be as short as the route produced by concorde (let's say for a non-trivially sized problem). 

Again, breaking up into different "chunks" doesn't fit with the goal of an unbroken route. In fact, I will not know when I will break from each day's running. The first day may be 24 hours, with several rests and a short sleep break on the side of the road, followed by day two of another long run. Can't really say when my first "exit" will be from the route, and would not want to be bound by an expected distance for each day. 

The issue in both of these formulations is going to be the size of the problem you're solving when you model every street. In a small portion of the network there can be thousands of nodes which won't blend well with the or-tools routing library (which is very good, but not at that kind of size - not in my experience anyway). If you're happy to use an approximate solution (rather than an optimal one) in concorde you can probably get a solution which will be very very close to an optimal super tour.

Lastly, if you've already decided your route should be 500km, does it really matter which 500km it is? Technically there's no such thing as a shortest 500km route when it's defined as 500km. Perhaps I've misunderstood your question and you just want 500km's worth of non-overlapping adjacent edges which wouldn't require as complicated an algorithm/approach to solve. 

The 500k is an estimate of the distance of streets within the city limit. This is not taking into account the streets that will be required to run twice, so there will be more that 500k—just how much more, I want to find out. I could simply try to draw it out by hand, but I am 100% certain that I would be less efficient than an algorithm that would take the nodes and edges and try each and every option possible (millions), to determine the actual shortest route(s). (There may be several ties for shortest route.) 

Cheers
Rob 

 I sort of expect that each node will have to be named, and each edge will have to be measured. Couldn't these tasks be accomplished more readily by a computer?

Alan Glennon

unread,
Sep 1, 2016, 7:52:44 PM9/1/16
to or-tools-discuss
You can use the open source GIS software QGIS (qgis.org), its Chinese Postman Solver Plugin (install via QGIS Plugins), and street data. I have attached a screenshot of a solution for the streets in the center of town.


The code for the Chinese Postman Solver is available at: https://github.com/rkistner/chinese-postman
The easiest way to add it though is to install it in QGIS: Plugins --> Manage and Install Plugins --> Settings (and select Show Experimental Plugins) --> Close. You can then do a Plugin search for Chinese Postman Solver and press Install.

Good luck!
Alan Glennon
CEO Arogi.com



Screenshot from 2016-09-01 16-30-35.png

Rob

unread,
Sep 2, 2016, 2:45:32 AM9/2/16
to or-tools-discuss
Hi Paul
I think Alan's suggestion is probably the path of least resistance. The Chinese postman algorithm has polynomial running time which trumps my suggestions. Let us know how it works out!
Cheers
Rob

Vincent Furnon

unread,
Sep 2, 2016, 4:33:32 AM9/2/16
to or-tools...@googlegroups.com
Although or-tools contains most of the bits to build a polynomial-time Chinese postman algorithm, it is not available yet (it's something we'd like to add eventually). So your best bet is indeed Alan's suggestion for now.
Note that if you wanted to minimize the extra distance due to run interruptions, your problem would probably become NP-complete and or-tools could be used in that case (actually solving a VRP on the dual graph (or line graph) of the original street graph).
Good luck with your run,
Vincent

--
You received this message because you are subscribed to the Google Groups "or-tools-discuss" group.
To unsubscribe from this group and stop receiving emails from it, send an email to or-tools-discuss+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Paul Melzer

unread,
Sep 2, 2016, 2:42:10 PM9/2/16
to or-tools-discuss
Alan, Rob, and Vincent, I now have a solid directional path to follow—I'll return to give an update. Keeping 

Vincent, my run interruptions (exits from the route) will be mostly picked up by vehicle, so I'll not be adding any distance on foot. (If I need to exit route in order to get over a block to a restaurant or similar required pit stop, then yes, I'll have some extra distance, but this need not be considered for the routing.)

Thank you each for taking time and giving some thought to this challenge. Your assistance gives me confidence.

Kindly,

Paul

Alan Glennon

unread,
Sep 2, 2016, 3:42:48 PM9/2/16
to or-tools-discuss
Here is a solution using streets from the Redlands' city website. The solution does not include several streets disconnected from the core network. Using your local knowledge, you might add a few streets outside the city limits boundary in order to access dangling roads and also reduce the amount of backtracking near the border. If I were running it, I would probably remove the Interstate too. 

The attached GeoJSON structure possesses implicit directionality and can be visualized with arrow symbols in any standard mapping software.

Here's the solution to beat:
Total length of roads: 567.737 km
Total length of path: 758.215 km
Length of sections visited twice: 190.478

Good times!
Alan




redpathUTMZone11N.geojson
redpath.png

Paul Melzer

unread,
Sep 2, 2016, 4:07:21 PM9/2/16
to or-tools-discuss
Extraordinary! Now, there are several tweaks to the Redlands city streets proper that need to be done. 

1) connecting two loose ends just outside city limits that will shorten the overall length of run (bottom center of the map you see the road pass just outside of city limits; including that short length in the routing will remove two doubled lengths). There is at least one other similar instance. 

2) Another case is in cup-de-sacs that end within a few yards of an adjacent, perpendicular road. By connecting beyond the dead end, we can also reduce the number of doubles lengths. In this instance, there are many , MANY of these situations in this city.

3) There are two places—to the east and once length of street on the north side—where the road(s) are noncontiguous, disconnected. These would require a "lifting of the pencil," so to speak, and thus are excluded from the route.

4) There are also several instances where the roads are not maintained—perhaps they are considered to be owned by the city at this time, however are not yet built and thus do not yet exist.

All of these tweaks I have considered over the past several months and created a layer in Photoshop, above a scanned printed map from the city of it''s public streets. The small jpeg I supplied in my original post is that layer. Here is a link to the psd file, in case anyone would care to have a gander. (Warning, size is 175 mb...should complete uploading shortly.)

These tweaks will reduce the route by many miles—enough, certainly, to make the effort worthwhile. (Again, I fully expect I will need to do the grunt work, if it indeed grunt work. Question I have is how tweak-able the online map is.)

Paul

Alan Glennon

unread,
Sep 2, 2016, 6:08:17 PM9/2/16
to or-tools-discuss
Paul,

The source datasets are attached. Take a bit of time to learn how to edit in QGIS and you can tweak the data however you wish.

These two YouTube videos seem good candidates to get rolling:
https://www.youtube.com/watch?v=Ai3k7nLfkE8 (Basic Line editing)

Quickstart:
1) Install and start QGIS (qgis.org)
2) Select Layer->Add Layer->Vector Layer (then select the street and city limit GeoJSON files). The data should now appear in the main map window. 
3) In the Layers Panel (left side of the screen), see the names of your two data files. Click on the streets data file name, and it will highlight.
4) In Windows or Linux, write click and select Toggle Editing. There is also a pencil icon near the top left that affords the same thing.
5) In this Editing Mode, it is possible to delete lines, move vertices, and create new lines. For example, you could highlight a road using the "Select Features by area or single click" tool (in the top center menu, the tools has a yellow square, dotted square outline, and an arrow cursor), and press Delete on your keyboard. The road should disappear.
6) When finished, select Toggle Editing, and it will prompt whether to save the changes or not.

As you advance, you might experiment with the bulk processing tools under the menu: Vector --> Geoprocessing Tools.
To create my quick and dirty Redlands road dataset, I used the "Clip" tool to filter out roads beyond the Redlands city limit.

If you have questions, send a note. glennon [at] gmail.com

Now back to or-tools!

Regards,
Alan Glennon
redlandsstreetsUTM.geojson
redlandsUTM11N.geojson
Reply all
Reply to author
Forward
0 new messages