Latency issue with large relationship dataset

gram...@wootcloud.com

Feb 7, 2018, 6:49:52 AM
to Neo4j
I need help debugging a latency issue with a large relationship dataset.

Please find the details below:

System configuration
8 core, 32 GB VM on cloud

Neo4j configuration
page cache - 20 GB
heap - 8 GB

Object model
Nodes are connected by a "COMMUNICATING_TO" relationship, which carries a "timestamp" property.

Query
Find all communications between nodes within a given time period, removing duplicate communications between the same pair of nodes.
`MATCH (n1)-[r:COMMUNICATING_TO]->(n2) WHERE r.timestamp >= <fromTimestamp> AND r.timestamp <= <toTimestamp> RETURN {id:id(n1)} as fromNode, COLLECT(DISTINCT {id:id(n2)}) as toNode`
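To make the expected result shape concrete, here is a minimal Python sketch of what the query is meant to compute, run against a tiny in-memory list instead of the database. The sample data and the `communications` helper are made up for illustration only; they are not part of my setup.

```python
from collections import defaultdict

# Hypothetical sample data: one tuple per COMMUNICATING_TO relationship,
# as (from_node_id, to_node_id, timestamp).
relationships = [
    (1, 2, 100),
    (1, 2, 150),  # same pair again inside the window: should be deduplicated
    (1, 3, 120),
    (2, 3, 400),  # outside the window: should be ignored
    (3, 1, 200),
]


def communications(rels, from_ts, to_ts):
    """Map each source node id to its distinct target node ids,
    considering only relationships with from_ts <= timestamp <= to_ts."""
    targets = defaultdict(set)
    for src, dst, ts in rels:
        if from_ts <= ts <= to_ts:
            targets[src].add(dst)  # the set drops duplicate pairs
    return {src: sorted(dsts) for src, dsts in targets.items()}


print(communications(relationships, 100, 200))
# prints {1: [2, 3], 3: [1]}
```

Each key corresponds to fromNode and each list to the toNode collection returned by the Cypher query.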

Data
100K nodes, with 500 million relationships between them.

Challenge
A single day contains 2 million relationships, and the query takes ~50 seconds for that time range.

Any suggestions for optimizing the query, the system parameters, or the object model are appreciated.

Thanks.