Neo4j Isochrones, All Nodes Reachable

BalzaelRisingDark

no leída,

7 jul 2016, 11:21:107/7/16

a Neo4j

Hi everyone,
I'm new in Neo4j world, I'm trying to elaborate isochrones in very large DB created from osm datas and loaded into Neo4j.
Isochrones should be intended as: "All nodes reachable from a starting node within a certain time", in my db the nodes are for example the crossroads of a city and the relations are the streets with the time needed in minutes to go from a crossroad to another.
I wrote something like this in cypher

MATCH (S) WHERE S.osm_id="1683208894" //to find the starting node

MATCH path = (S)-[*0..75]->(E) //to find all nodes connected with a max of hops of 75 relations, if I do not put a limit it will calculate every possible path for every possible node in a 10GB db... not the best..

WITH E,MIN( REDUCE(cost=0.0, r IN RELATIONSHIPS(path) | cost + toFloat(r.minutes)) ) AS cost //to calculate cost

WHERE cost<15 //if i want only the isochrones of 15 min

RETURN cost,collect(E) as isochrones

ORDER BY cost

The problem is that execution's time degenerate after 70 relations and 70 is enough for isochrones of 12 minutes, I want to find at least 30 minutes isochrones.
Is that possible in Cypher without waiting a year?
Will be better if go for it by Java? Am I forced to write an apposite algorithm for this?

THANKS!
M.G.

Nodes.png

Michael Hunger

no leída,

7 jul 2016, 11:48:157/7/16

a ne...@googlegroups.com,Craig Taverner

Hi,

1. you should use a label + index or constraint to find your starting node

2. I'd change it to use shortest path perhaps, at least rewrite it a bit

3. do you have only one relationship-type ?

// lookup via label + osm_id in index/constraint

MATCH (S:Waypoint) WHERE S.osm_id="1683208894"

MATCH path = (S)-[rels *0..75]->(E)

WHERE REDUCE(cost=0.0, r IN rels | cost + toFloat(r.minutes)) < 15

WITH E, REDUCE(cost=0.0, r IN rels | cost + toFloat(r.minutes)) as cost

RETURN cost, collect(distinct E) as isochrones

ORDER BY cost

I'm not sure if the cost expression is pulled into the expansion, please compare the query above with the one below

explain MATCH (S:Waypoint) WHERE S.osm_id="1683208894"

MATCH path = (S)-[rels *0..75]->(E)

WITH E, REDUCE(cost=0.0, r IN rels | cost + toFloat(r.minutes)) as cost

WHERE cost < 15

RETURN cost, collect(distinct E) as isochrones

ORDER BY cost

In reality you want to use an path expansion algorithm. If you use Neo4j 3.0.x you can check out the graph algorithms of the APOC library (like dikjstra, A*, allSimplePaths etc)

https://github.com/neo4j-contrib/neo4j-apoc-procedures#graph-algorithms-work-in-progress

Or perhaps check out Neo4j Spatial which got an uplift for 3.0 as well, you could use a boundary-circle for a first estimate, e.g. 30km and then do the post-filtering with the actual cost

http://github.com/neo4j-contrib/spatial

Craig can say more to that.

--
You received this message because you are subscribed to the Google Groups "Neo4j" group.
To unsubscribe from this group and stop receiving emails from it, send an email to neo4j+un...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Craig Taverner

no leída,

8 jul 2016, 5:44:268/7/16

a ne...@googlegroups.com

Regarding Neo4j Spatial, I'm curious if that was used to import the OSM data in the first place? If so, the graph model you get is not ideal for routing, or in fact any algorithm that searches for least cost paths, which yours does. This is because the graph is the full OSM graph with all intermediate nodes stored. You would want to make a routing graph with only points of decision as nodes, and relationships between including the distance (or in your case 'minutes'). Since your query already used r.minutes, I'm guessing you have already done this.

Then I would say Michaels ideas are the best. The problem with a pure Cypher query like your's is that it will produce all length 75 paths with a one-direction depth first search, filtering on cost. With the graph algorithms accessible through apoc, you could get a bi-directional bredth first search which, while taking more memory to run, should also run faster (if there is a path between S and E).

Michaels idea to use Neo4j Spatial to pre-filter the possible set of E nodes to only those within a certain distance is also one way to reduce the search space and speed things up before doing the path searches.

On Thu, Jul 7, 2016 at 3:36 PM, BalzaelRisingDark <gallega...@gmail.com> wrote:

Se ha eliminado el mensaje

MattiaGallegati

no leída,

11 jul 2016, 10:30:3911/7/16

a Neo4j

Thanks Michael and Craig,
1)Done, thanks.
2)shortestpath seems to make things go worst... e.g. MATCH path = shortestPath((S)-[:CONNECTED*1..2]->(E)) increase my execution..
3)yes, CONNECTED

APOC: awesome, but wich algo can I use in order to increase performance? I have only the starting node and not the ending node.. I tried path.exapand but it is quite similar to my solution..

SPATIAL: well that is an idea that I was evaluating.. first of all: if I reduce the number of nodes still the number of relations and the paths to calculate won't change will this be enought in order to reduce ex. time? Doesn't it depends from the number of relations that my query finds starting from the node specified?
Is that possible to query on a query's result (like nested queryes)? How can I do that?

THANK YOU SO MUCH

Responder a todos

Responder al autor

Reenviar