Finding shortest paths between two groups of points: which algorithm to use?

1,081 views
Skip to first unread message

Dmitriy Tarasov

unread,
Jul 27, 2021, 12:07:03 PM7/27/21
to networkx-discuss
My Python is rusty and I'm a complete newcomer to Networkx, so sorry for such an obvious question. I have two sets of points -- let's call them Origins and Destinations --and an undirected graph. For every Origin, I need to find the nearest Destination and calculate the shortest path to it, saving all shortest paths to the same file. Am I right that I will first need to find the node on a graph that's closest to the Origin and calculate the path to a node nearest to a Destination? Is Dijkstra's algorithm what I should use, or are there other algorithms in Networkx that are better for this?

Edward Kerstein

unread,
Jul 31, 2021, 8:39:00 AM7/31/21
to networkx-discuss
Hi Dmitriy,

A couple questions:
  • What is the size of the dataset? How many nodes and edges? I ask because certain networkx functions may be impractical / slow at scale.
  • Are Origins and Destinations guaranteed to be part of the same connected component? Or could there be multiple, separate connected components to the dataset?
Here's how I might approach it. Assuming there is only one connected component and the graph is a reasonable size, say under 10k nodes:
  1. For each origin, calculate an eccentricity dictionary (the shortest length to all other nodes).
  2. Get the nearest destination from the eccentricity dict (there may be several at the same distance).
  3. Calculate all shortest paths between this origin and destination.
Here's some untested code / pseudocode:

# For each origin node
for origin_node in origin_list:
     # Get shortest lengths to all other nodes
     origin_eccentricity_dict = nx.eccentricity(G, origin_node)
     # Convert dict to list of tuples sorted from low to high [[('a', 1), ('c', 1), ('b', 2)]
     origin_eccentricity_list = list(sorted(origin_eccentricity_dict.items(), key=lambda x: x[1]))
     # Get nearest destination. There may be multiple at the same length but we will pick the first.
     destination_node = origin_eccentricity_list[0][0]
     # Calculate all shortest paths between origin and destination. This is a list of lists [[0, 1, 2], [0, 10, 2]]
     shortest_paths_list = [p for p in nx.all_shortest_paths(G, origin_node, destination_node)])

Hope that helps,
Edward Kerstein

Dmitriy Tarasov

unread,
Aug 1, 2021, 3:37:15 PM8/1/21
to networkx-discuss
The number of origins and destination points is usually in the double digits (the biggest dataset that I've seen so far is, I think, between 600 and 700 points). As for the graphs, there are, I guess, a couple of thousand nodes and edges (the graphs represent networks of roads in rural counties, my plan is to download them from OpenStreetMap using OSMnx). So, I think that the origins and destinations should all be part of the same connected component. What I'm trying to do here is similar to the Closest Facility solver in ArcGIS, with distance as the cost and a GeoDataFrame of the shortest paths (which I'll write to a shapefile or feature class) as the output.

Dmitriy Tarasov

unread,
Aug 15, 2021, 7:09:32 PM8/15/21
to networkx-discuss
I've finally run the code, and the list of tuples sorted from low to high seems to be empty; destination_node = origin_eccentricity_list[0][0] results in the error "list index out of range." I'm now trying to figure out what is going wrong and where.

On Saturday, July 31, 2021 at 8:39:00 AM UTC-4 eker...@gmail.com wrote:

Nicolas Cadieux

unread,
Aug 16, 2021, 9:23:35 AM8/16/21
to networkx...@googlegroups.com
Hi,
Out of range error means python cannot retrieve the value in the index position you are indicating.  Same error will happen if the list is empty.


Le 15 août 2021 à 19:09, Dmitriy Tarasov <dmitriy...@uconn.edu> a écrit :


--
You received this message because you are subscribed to the Google Groups "networkx-discuss" group.
To unsubscribe from this group and stop receiving emails from it, send an email to networkx-discu...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/networkx-discuss/bc694ff6-10e7-4784-9f02-3a1fdee8be2bn%40googlegroups.com.

Nicolas Cadieux

unread,
Aug 16, 2021, 9:35:57 AM8/16/21
to networkx...@googlegroups.com



Le 27 juill. 2021 à 12:07, Dmitriy Tarasov <dmitriy...@uconn.edu> a écrit :

My Python is rusty and I'm a complete newcomer to Networkx, so sorry for such an obvious question. I have two sets of points -- let's call them Origins and Destinations --and an undirected graph. For every Origin, I need to find the nearest Destination and calculate the shortest path to it, saving all shortest paths to the same file. Am I right that I will first need to find the node on a graph that's closest to the Origin and calculate the path to a node nearest to a Destination?

Yes, networkX, does not do this for you.  If a node is disconnected from the graph, no path can be found to it.  

I have some functions that will do that found i. This code It can be found in here https://gitlab.com/njacadieux/upstream_downstream_shortests_path_dijkstra. Used the beta version as it’s easier to follow.


Method to the madnesses is found here.


Is Dijkstra's algorithm what I should use, or are there other algorithms in Networkx that are better for this?

Some Info on this is found in the paper.  Dijkstra one to one is probably the best unless you have peculiar situations. My code uses one to one Dijkstra but disguised as a all to all.  The goal of the beta version is to change the input to have a real one to one option.  It’s on the todo list when I get time.

--
You received this message because you are subscribed to the Google Groups "networkx-discuss" group.
To unsubscribe from this group and stop receiving emails from it, send an email to networkx-discu...@googlegroups.com.

Nicolas Cadieux

unread,
Aug 16, 2021, 9:44:52 AM8/16/21
to networkx...@googlegroups.com
Hi,

If you have trouble with your network, write to me.  I am currently working on a QGIS model that will find problems in a undirected street network.  I should be done this week.  Sorry for not helping sooner, I found this in my spam.


Le 1 août 2021 à 15:37, Dmitriy Tarasov <dmitriy...@uconn.edu> a écrit :

The number of origins and destination points is usually in the double digits (the biggest dataset that I've seen so far is, I think, between 600 and 700 points). As for the graphs, there are, I guess, a couple of thousand nodes and edges (the graphs represent networks of roads in rural counties, my plan is to download them from OpenStreetMap using OSMnx). So, I think that the origins and destinations should all be part of the same connected component. What I'm trying to do here is similar to the Closest Facility solver in ArcGIS, with distance as the cost and a GeoDataFrame of the shortest paths (which I'll write to a shapefile or feature class) as the output.
--
You received this message because you are subscribed to the Google Groups "networkx-discuss" group.
To unsubscribe from this group and stop receiving emails from it, send an email to networkx-discu...@googlegroups.com.
Message has been deleted

Dmitriy Tarasov

unread,
Aug 17, 2021, 1:53:56 PM8/17/21
to networkx-discuss
I am trying to find an alternative to the Closest Facility solver in ArcGIS (my current version of ArcGIS does not include it). I know that Networkx cannot draw routes between facilities and incidents, but only between the nodes that are closest to them, so I'll need to find such nodes. My incidents I read from a shapefile using GeoPandas:

import osmnx as ox
import geopandas as gpd
incidents_fp = "C:/Users/User Name/fakepath/incidents.shp
incidents = gpd.read_file(incidents_fp) # Reading incidents from a shapefile. Am I right that this creates a GeoDataFrame?

Next, for every incident, I try to find the nearest node on a graph and to append that node to a GeoDataFrame that will be used for finding the shortest paths. This is where I get an error:

incident_nodes = gpd.GeoDataFrame(columns=["geometry"], crs="EPSG:2235") # creates an empty GeoDataFrame to hold the nodes
for incident in incidents:
    node_near_incident = ox.distance.nearest_nodes(graph_proj, incident.geometry.x, incident.geometry.y, return_dist=False)

What could be causing this error? The incidents do have a column named geometry. This is what they look like when printed:
Incidents_printed.JPG
Sorry for the silly questions, my Python is rusty and I've never worked with GeoPandas (or Pandas) or Networkx before.

Nicolas Cadieux

unread,
Aug 17, 2021, 10:36:07 PM8/17/21
to networkx...@googlegroups.com
Hi,

It may be easier to use the functions found in the geopandas library (or shapely) instead of using osmnx.  Geopandas object are shapely object. I would ask the questions on those forums as networkX people don’t know much about road networks and GIS geometries as a hole.  You will have better luck in the geopandas communities probably.

BTW, have you tried Qneat3 in QGiS? https://root676.github.io/


Le 17 août 2021 à 13:53, Dmitriy Tarasov <dmitriy...@uconn.edu> a écrit :


I am trying to find an alternative to the Closest Facility solver in ArcGIS (my current version of ArcGIS does not include it). I know that Networkx cannot draw routes between facilities and incidents, but only between the nodes that are closest to them, so I'll need to find such nodes. My incidents I read from a shapefile using GeoPandas:

import osmnx as ox
import geopandas as gpd
incidents_fp = "C:/Users/User Name/fakepath/incidents.shp
incidents = gpd.read_file(incidents_fp) # Reading incidents from a shapefile. Am I right that this creates a GeoDataFrame?

Next, for every incident, I try to find the nearest node on a graph and to append that node to a GeoDataFrame that will be used for finding the shortest paths. This is where I get an error:

incident_nodes = gpd.GeoDataFrame(columns=["geometry"], crs="EPSG:2235") # creates an empty GeoDataFrame to hold the nodes
for incident in incidents:
    node_near_incident = ox.distance.nearest_nodes(graph_proj, incident.geometry.x, incident.geometry.y, return_dist=False)

What could be causing this error? The incidents do have a column named geometry. This is what they look like when printed:
<Incidents_printed.JPG>

Dmitriy Tarasov

unread,
Aug 20, 2021, 12:17:24 PM8/20/21
to networkx-discuss
Thank you! I will look into Qneat3.
Reply all
Reply to author
Forward
0 new messages