Efficiently sorting through a generator of paths in order to find the longest one

77 views
Skip to first unread message

Altus

unread,
Mar 7, 2024, 8:26:26 AMMar 7
to networkx-discuss

I am trying to write a function that calculates the maximum number of unexplored cells in a grid that is reachable within a given fuel limit. For example I have a 10x10 grid, and each cell in the grid is connected to each other and their weight is the Manhattan distance between them. From this grid I create an adjacency matrix whose dimensions are then 100x100, and create a graph from the adjacency matrix using the networkx library. I then use the all_simple_paths function which returns a generator that contains all the paths from a source to a target node.

Now comes the problem, when trying to sort through this generator in order to find the path that contains the most unexplored cells (in this case just equal to the path length), my program either runs out of memory, or it takes a really long time to execute.

I have tried to convert the generator into a list, and then sort the list from longest to shortest path length, but this just causes the python script to keep growing in memory size until it takes all available memory on my PC (16GB). I have also tried using the sorted function to sort the generator as below, but this just takes too long to execute, and eventually runs out of memory as well.

```
G = nx.from_numpy_array(A) sorted_paths = sorted(nx.all_simple_paths(G, source=current_node, target=target_node), key=len, reverse=True)
```
If anybody has any suggestions as to how I can tackle my problem, it would be greatly appreciated. Whether it be sorting the generator, or maybe another function inside networkx/another python library that could help me find the max number of unexplored cells in a grid.

Dan Schult

unread,
Mar 7, 2024, 8:50:57 AMMar 7
to networkx...@googlegroups.com
To find the max, you don't need to sort them all. Just keep track of the biggest one, iterate through the yielded outputs without storing them.
Something like:
```
max_path_length = 0
max_path = None
for path in nx.all_simple_paths(...):
    if len(path) > max_path_length:
        max_path_length = len(path)
        max_path = path
```

--
You received this message because you are subscribed to the Google Groups "networkx-discuss" group.
To unsubscribe from this group and stop receiving emails from it, send an email to networkx-discu...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/networkx-discuss/2e7fefa1-d571-4b26-86cf-ec9d5a1c18c9n%40googlegroups.com.

Altus

unread,
Mar 8, 2024, 1:15:01 AMMar 8
to networkx-discuss
I tried something similar to this, but as I mentioned in my post I need to find the max, but it must be the max that is still below my fuel limit. Maybe I implemented it incorrectly, but when I tried this it took very long to complete and was still running out of memory. I will give it another attempt today and report back

Ross Barnowski

unread,
Mar 8, 2024, 1:58:06 AMMar 8
to networkx...@googlegroups.com
> I then use the all_simple_paths function which returns a generator that contains all the paths from a source to a target node.

For a 10x10 grid, i.e. a complete graph with 100 nodes, there are 2.5624945006073462e+154 (give or take a few 10^140) simple paths between any two nodes (see https://oeis.org/A000522), so it's not really really a question of memory vs. compute time. At the risk of giving away homework hints, the "maximum number of unexplored cells" sounds like it could also be interpreted as the "minimum number of explored cells".

Altus

unread,
Mar 8, 2024, 3:13:47 AMMar 8
to networkx-discuss
Thanks for the reply Ross, just to clear up, it is not a homework problem :). I gathered that the number of paths between 2 nodes in a fully connected graph will be massive, the link you provided is insightful. I do not think your suggestion of solving the complement of my problem would work for my use case, my graph consists only of unexplored cells (it is a subset of my original grid that has both explored and unexplored cells). 

Something that might work, but I'm not sure as I don't have much experience with graph theory, is using my original grid (2D array in python with 0s representing explored cells and 1s unexplored cells), and turning that into a graph where the edge weights are the opposite of my exploration status definition? Then I could use the shortest path function, which will return the path that instead maximizes the number of unexplored cells. The only concern I would have then is how would I calculate the actual distance of the path, as I need to ensure this is below a given fuel limit. Or is my train of thought not along the lines of what you were trying to hint towards?
Reply all
Reply to author
Forward
0 new messages