Possible to match part of the value of a relationship's property in a Cypher query?


kumar...@gmail.com

Aug 28, 2015, 7:03:59 AM
to Neo4j
I am new to graph databases and am trying to model application relationships in Neo4j. The main objective is a graph that shows which data sets flow from one application to another, so that I can search for the flow of a specific data set.

Eg:

DataSet A
DataSet B
DataSet C
DataSet D

App A
App B
App C
App D
App E

This is how the relations will build up over a period of time:
DataSet A -GoesTo-> App A -SendsTo {DataOf: "DataSet A"}-> App B
DataSet B -GoesTo-> App A -SendsTo {DataOf: "DataSet A, DataSet B"}-> App B -SendsTo {DataOf: "DataSet B"}-> App C -SendsTo {DataOf: "DataSet B"}-> App E
DataSet C -GoesTo-> App C -SendsTo {DataOf: "DataSet C"}-> App D -SendsTo {DataOf: "DataSet C"}-> App E
DataSet D -GoesTo-> App B -SendsTo {DataOf: "DataSet B, DataSet D"}-> App C -SendsTo {DataOf: "DataSet B, DataSet D"}-> App E

The DataOf value on each relation will have further data sets appended to it as more data is added. I have read through some examples and the documentation to figure out whether this approach is right or whether I should create a new relation for every data set, but I cannot find a case like mine. I would appreciate advice from those with more experience on how to approach this. If my approach is right, how can I query the flow of a specific data set? I have put some data in the DB, and my queries do not work when I try to match only DataSet B or DataSet D, etc.

Thanks

Sumit Gupta

Aug 28, 2015, 9:48:46 PM
to Neo4j
Hi,

Assuming that you have two labels:

1. "App" - for all nodes of type "App". Every node labelled "App" has a property holding the name of the app, for example "name = App A".
2. "DataSet" - for all nodes of type "DataSet". Every node labelled "DataSet" has a property holding the name of the data set, for example "name = DataSet A".

If this is the model, then the query below should work:

match (n:DataSet {name:"DataSet B"})-[r]-(n1) return n,r,n1;


If it does not, please post your creation script.
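
To match part of the DataOf value on the relationship itself, a regular expression in the WHERE clause is one option. This is only a sketch: the relationship type SendsTo and the property DataOf are taken from your description, and the ", " separator is an assumption about how you append the values, since I have not seen your creation script.

```cypher
// Assumed: relationship type SendsTo with a string property DataOf such as
// "DataSet A, DataSet B", values separated by a comma and a single space.
// Cypher's =~ must match the whole string, so the optional groups cover
// whatever comes before and after the data set we are looking for.
MATCH (a:App)-[r:SendsTo]->(b:App)
WHERE r.DataOf =~ "(.*, )?DataSet B(, .*)?"
RETURN a.name, r.DataOf, b.name;
```

A plain ".*DataSet B.*" pattern would also work but would match a longer name such as "DataSet B2" as well. From Neo4j 2.3 onwards there is also a CONTAINS operator for simple substring tests. Either way the property is scanned rather than looked up, so this will not scale as well as a model where the data set is part of the graph structure.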

Thanks,
Sumit

Craig Taverner

Aug 30, 2015, 10:32:35 AM
to ne...@googlegroups.com
I think using properties on relationships to refer to nodes is not ideal. It is like a non-indexed foreign key in an RDBMS, which is not an efficient solution. You compound the problem by putting multiple dataset references into a single SendsTo relationship. So I think there are two improvements you could make.

The first is one you already suggested yourself: create a separate SendsTo relationship for each dataset, so that you don't build up arrays (or long strings) of foreign keys.
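
A minimal sketch of that first option, assuming App nodes named "App A" and "App B" already exist and keeping the SendsTo type and DataOf property from the original post (all of these names are assumptions to adjust to your own script):

```cypher
// Assumed names: label App with names like "App A", relationship type SendsTo,
// property DataOf (taken from the original post).
// Statement 1: one SendsTo relationship per data set instead of
// a single relationship carrying "DataSet A, DataSet B".
MATCH (a:App {name:"App A"}), (b:App {name:"App B"})
CREATE (a)-[:SendsTo {DataOf:"DataSet A"}]->(b)
CREATE (a)-[:SendsTo {DataOf:"DataSet B"}]->(b);

// Statement 2 (run separately): an exact match on the property is now enough.
MATCH (a:App)-[:SendsTo {DataOf:"DataSet B"}]->(b:App)
RETURN a.name, b.name;
```

With one value per relationship there is no need for substring or regex matching, although DataOf is still a foreign key stored as a string.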

The second is to get rid of the dataset reference completely and replace it with a relationship back to the dataset itself. This requires splitting the SendsTo relationship in two, with an intermediate node that can have relationships back to the datasets. This node represents data flowing from one app to another, and its relationships back to the datasets tell us which dataset(s) are involved. In other words, replace your DataOf property with graph structure.

CREATE (dA:Dataset {name:"A"})
CREATE (dB:Dataset {name:"B"})
CREATE (dC:Dataset {name:"C"})
CREATE (dD:Dataset {name:"D"})

CREATE (aA:App {name:"A"})
CREATE (aB:App {name:"B"})
CREATE (aC:App {name:"C"})
CREATE (aD:App {name:"D"})
CREATE (aE:App {name:"E"})

CREATE (sAB:SendsData)
CREATE (sBC:SendsData)
CREATE (sCD:SendsData)
CREATE (sDE:SendsData)
CREATE (sCE:SendsData)

// Apps communication
CREATE (aA)-[:SENDS_DATA]->(sAB)-[:SENDS_TO]->(aB)
CREATE (aB)-[:SENDS_DATA]->(sBC)-[:SENDS_TO]->(aC)
CREATE (aC)-[:SENDS_DATA]->(sCD)-[:SENDS_TO]->(aD)
CREATE (aD)-[:SENDS_DATA]->(sDE)-[:SENDS_TO]->(aE)
CREATE (aC)-[:SENDS_DATA]->(sCE)-[:SENDS_TO]->(aE)

// Dataset A data flow
CREATE (dA)-[:GOES_TO]->(aA)
CREATE (sAB)-[:SENDS_USING]->(dA)

// Dataset B data flow
CREATE (dB)-[:GOES_TO]->(aA)
CREATE (sAB)-[:SENDS_USING]->(dB)
CREATE (sBC)-[:SENDS_USING]->(dB)
CREATE (sCE)-[:SENDS_USING]->(dB)

// Dataset C data flow
CREATE (dC)-[:GOES_TO]->(aC)
CREATE (sCD)-[:SENDS_USING]->(dC)
CREATE (sDE)-[:SENDS_USING]->(dC)

// Dataset D data flow
CREATE (dD)-[:GOES_TO]->(aB)
CREATE (sBC)-[:SENDS_USING]->(dD)
CREATE (sCE)-[:SENDS_USING]->(dD)

This creates a graph like that in the attached image. I'm not sure how you plan to use this, and depending on your actual queries the graph could be adjusted to optimize for your case, but at least this could be a starting point.
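
As an example of querying this model, here is a sketch of how the flow of a single dataset could be listed. It uses only the labels and relationship types from the script above; the aliases sender and receiver are just names I picked for the output columns.

```cypher
// Find every app-to-app hop that carries Dataset B.
// Each SendsData node sits between a sending and a receiving App,
// and points at the datasets it carries via SENDS_USING.
MATCH (d:Dataset {name:"B"})<-[:SENDS_USING]-(s:SendsData),
      (from:App)-[:SENDS_DATA]->(s)-[:SENDS_TO]->(to:App)
RETURN from.name AS sender, to.name AS receiver
ORDER BY sender, receiver;
```

With the data created above this should return the hops A to B, B to C and C to E. Changing the name in the first pattern gives the flow for any other dataset, with no string matching involved.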

Regards, Craig



Screenshot 2015-08-30 16.31.05.png