Account Options

  1. Sign in
The old Google Groups will be going away soon, but your browser is incompatible with the new version.
Google Groups Home
« Groups Home
Need help with designing deep tree of type relationships - can Neo4J cope?
There are currently too many topics in this group that display first. To make this topic appear first, remove this option from another topic.
There was an error processing your request. Please try again.
flag
  17 messages - Collapse all  -  Translate all to Translated (View all originals)
The group you are posting to is a Usenet group. Messages posted to this group will make your email address visible to anyone on the Internet.
Your reply message has not been sent.
Your post was successful
 
From:
To:
Cc:
Followup To:
Add Cc | Add Followup-to | Edit Subject
Subject:
Validation:
For verification purposes please type the characters you see in the picture below or the numbers you hear by clicking the accessibility icon. Listen and type the numbers you hear
 
Ian Tompsett  
View profile  
 More options Nov 15 2012, 8:51 am
From: Ian Tompsett <iantompset...@gmail.com>
Date: Thu, 15 Nov 2012 05:51:04 -0800 (PST)
Local: Thurs, Nov 15 2012 8:51 am
Subject: Need help with designing deep tree of type relationships - can Neo4J cope?

I'm after some database design help, which will need some explanation to
set the context.

I'm using Neo4J (1.8 community edition, server mode - not embedded) to
store a map of the type relationships within the DLLs deployed within my
organisation. I don't think it matters, but they're .NET DLLs, generated
from C# code.

I have about 2500 DLLs or so to parse. In the process of parsing I build up
a list of all C# types detected, and for each type the list of other types
that it uses (by variable reference, method reference or base type
reference). Hope that makes sense - I want to end up with a complete tree
of type dependencies to allow impact analysis.

Once I have all of the types, certain known types are identified - we call
them 'Entity' types. Example entity types are web pages, web service
proxies, database commands.

There is then a GUI that sits on top of this information that allows a user
to select a known entity and see it's tree of entity dependencies: up the
tree to see the entities which use the entity in question (and all the way
up the tree, not just one layer); and down the tree to see the entities
used by the entity in question (and again further down to see downstream
entity dependencies). In this view, the types that are not known entities
are not shown, although their existence is still needed to link known
entities.

So basically I'm looking at a very deep tree, with certain types of nodes
to be shown, but the intervening nodes not shown

I can successfully get the data into the Neo4J database - I manage to get
all of the DLL, type and type relationship information into a QuickGraph
bidirectional graph which I then save as a GraphML file. Using
Neo4jClient's Gremlin support it's then a one step operation to get the
data imported into Neo4J via loadGraphML.

Once the data is there, I end up with 120702 different type nodes and
471144 relationships between these types - these relationships I have given
the type 'TypeUsedByType'.

From these 120702 type nodes I then manage to identify 6500 types that are
my special 'Entity' types.

And then the uncertainty about what approach begins as I'm having trouble
getting Neo4J to return the tree (up and down) for a single type.

So far I have given each Type node that refers to an 'Entity'  an
additional property - the numberic EntityTypeID of the particular entity
type (eg WebPage, DBCommand, WebService etc). I have also created a set of
nodes for the different entity types and added a relationship between each
identified entity and the node for that entity type.

I would then like a set of relationships between the known entities - say
'EntityUsedByEntity'. This would cut out the middle men - the types that
come between entities that aren't entities themselves. This is the set of
relationships I would like to show in the GUI.

So to identify these EntityUsedByEntity relationships, I need to take each
of my 6500 identified entities and go down its 'TypeUsedByType'
relationships until a known entity is found. At that stage, I would like to
stop (I don't need to go any further) and create an EntityUsedByEntity
relationship between the source entity and the found entity.

Which comes to my question: how best to do this in a way that Neo4J can
cope with. I should say up front, I'm trying this in cypher, but if gremlin
seems a better choice I'll try that. Using the java traversal API doesn't
seem an option as I'm accessing Neo4J as a server - and I'm writing in C#.

Ideally I'd like a single cypher query to create all the EntityUsedByEntity
relationships

start entity=node:node_auto_index(NodeType = 'TypeVertex')   // currently I
only have an index on the node type, but could create one for the entity
nodes
match entity<-[:TypeUsedByType*1..]-entity2       // but how to match only
the first one down each path?
where has(entity.EntityTypeId) and has(entity2.EntityTypeId)
create unique entity<-[:EntityUsedByEntity]-entity2

I had no luck with this - it went away and never came back, so I just took
a single entity node and changed the start expression to just refer to it.
ie start entity=node(99731). Even then it had troubles...

I've tried making the length of the relationships unbounded, but in the
particular case I'm looking at, that doesn't seem to come back - I can
bound it at 11 and it comes back in 17 seconds on my machine (i7 laptop
running Windows XP). If I leave it unbound, it doesn't seem to come back.

As mentioned, when I identified my entities I also created a relationship
back to the underlying entity node. I tried including that in the match
clause, but that didn't seem to make any difference.

The problem seems to me that I don't know how many relationships to go
down, so I can't use an upper bound. But that then means that cypher will
find all the paths before applying the where clause. From my cursory look
at the traversal API it seems possible to apply a filter and each step
down, but I can't see how to do this in cypher - is it possible?

I must say that I'm a bit surprised that Neo4J has trouble with a deep tree
like this. Is there something else I should be doing - smarter cypher,
different relationships etc.

Thanks heaps if you've managed to read this far, I hope this makes enough
sense. Any advice/suggestions would be greatly appreciated

BTW I used to have this data in Oracle and tried to find the relationships
with connect by queries. It was having trouble (aka not coming back) when I
was trying to populate all of the EntityUsedByEntity type relationships (by
inserting into an EntityRelationship table) in one operation. But I'm
pretty sure it would cope with my single type case that's causing me issues
in Neo4J. But I was expecting Neo4J to be better at coping with these heavy
relationships. Hopefully there's something reasonably obvious I'm missing
here.

Ian


 
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
Peter Neubauer  
View profile  
 More options Nov 15 2012, 10:29 am
From: Peter Neubauer <peter.neuba...@neotechnology.com>
Date: Thu, 15 Nov 2012 16:28:40 +0100
Local: Thurs, Nov 15 2012 10:28 am
Subject: Re: [Neo4j] Need help with designing deep tree of type relationships - can Neo4J cope?

Ian,
I think we can figure this out. Could you do a samll-ish sample over at
http://console.neo4j.org/usage.html, so we have a dataset to play with?
Then, we could reason about this. To start with, what do you mean by "the
first one down each path"? In your example (in the console) what would that
path be?

Thanks Ian!

/peter

Cheers,

/peter neubauer

G:  neubauer.peter
S:  peter.neubauer
P:  +46 704 106975
L:   http://www.linkedin.com/in/neubauer
T:   @peterneubauer

Neo4j 1.8 GA -
http://www.dzone.com/links/neo4j_18_release_fluent_graph_literacy.html

On Thu, Nov 15, 2012 at 2:51 PM, Ian Tompsett <iantompset...@gmail.com>wrote:


 
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
Ian Tompsett  
View profile  
 More options Nov 15 2012, 3:53 pm
From: Ian Tompsett <iantompset...@gmail.com>
Date: Thu, 15 Nov 2012 12:53:37 -0800 (PST)
Local: Thurs, Nov 15 2012 3:53 pm
Subject: Re: [Neo4j] Need help with designing deep tree of type relationships - can Neo4J cope?

Thanks Peter

I'll have a look at the best way to get a sample for you - I should be able
to choose a subset DLLs that still have the appropriate depth. I haven't
played with the console, and my initial data load was done via a bit of
gremlin from within neo4jclient so that might take me a little while.
There'll also be a timezone lag too - I'm in Australia.

In the meantime, re "the first down each path": my cypher queries return a
(large) number of paths. Each of those paths follows a tree of type
dependencies. Within each of those paths some of the nodes (ie types) will
be my known entities. So far each path, I only want the path down as far as
the first matching entity - even though the path could extend much further.

For example given a single path of type usage:  *A*
->uses->B->uses->C->uses->*D*->uses->E->uses->*F*->uses->G->uses->H

And say A, D and F are known entity types, when starting from A I would
like my query to return the path to D, without the need to get to F. I
would then create a direct relationship between A and D, but no direct
relationship between A and F. Separately I would create a relationship
between D and F.

So far I've only managed to get my queries to return all of the paths in
their entirety and then perform the filtering afterwards (using a where
clause). This means that all of the paths are hugely deep and have a lot of
repeated information.

Hope that makes it a bit clearer.

Thanks
Ian


 
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
Michael Hunger  
View profile  
 More options Nov 15 2012, 4:14 pm
From: Michael Hunger <michael.hun...@neotechnology.com>
Date: Thu, 15 Nov 2012 22:14:07 +0100
Local: Thurs, Nov 15 2012 4:14 pm
Subject: Re: [Neo4j] Need help with designing deep tree of type relationships - can Neo4J cope?

Ian,

if you're in Australia and using neo4jclient you should try to chat to Tatham and Romiko which are also in downunder and the authors of that library. They have quite a lot of modeling experience.

Michael

Am 15.11.2012 um 21:53 schrieb Ian Tompsett:


 
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
Ian Tompsett  
View profile  
 More options Nov 15 2012, 4:34 pm
From: Ian Tompsett <iantompset...@gmail.com>
Date: Thu, 15 Nov 2012 13:34:25 -0800 (PST)
Local: Thurs, Nov 15 2012 4:34 pm
Subject: Re: [Neo4j] Need help with designing deep tree of type relationships - can Neo4J cope?

Thanks Michael

I had wondered the same thing, but for now my issue has morphed into more
of a cypher question rather than a Neo4jClient issue. But drawing on Tatham
and Romiko's modeling experience would be easier given the timezone
compatibility - any idea how to get their attention here? Do you know if
they cruise these forums much?

Ian


 
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
Jens Dietrich  
View profile  
 More options Nov 15 2012, 7:42 pm
From: Jens Dietrich <jens.dietr...@gmail.com>
Date: Thu, 15 Nov 2012 16:42:10 -0800 (PST)
Local: Thurs, Nov 15 2012 7:42 pm
Subject: Re: [Neo4j] Need help with designing deep tree of type relationships - can Neo4J cope?

Hi Ian,

I had to deal with a similar scenario - finding patterns in software
dependency graphs that need reasoning about paths (= computation of
transitive closure). If you are interested, this has resulted in this
tools: http://xplrarc.massey.ac.nz/ , the queries are in the analyse menu.
The largest graphs (extracted from the JRE) have a similar size to your
graphs.

In my experience the current version of cypher does not cope very well with
this (there are some older discussions on this issue in the forum, and a
benchmarking project to demonstrate some of the problems:
http://code.google.com/p/graph-query-benchmarks/). You might want to try
guery as an alternative to query your neo4j database
- http://code.google.com/p/gueryframework/ - the latest version has a
blueprints adapter. Have a look at the test cases, they contain some
examples similar to what you need.

Cheers, Jens (from even more down under .. NZ)


 
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
Ian Tompsett  
View profile  
 More options Nov 16 2012, 12:09 am
From: Ian Tompsett <iantompset...@gmail.com>
Date: Thu, 15 Nov 2012 21:09:05 -0800 (PST)
Local: Fri, Nov 16 2012 12:09 am
Subject: Re: [Neo4j] Need help with designing deep tree of type relationships - can Neo4J cope?

OK, I think I've created a suitable (albeit very small) test case
at http://tinyurl.com/chyu863

Not surprisingly this returns queries pretty quickly, but should be enough
to get the structure in place. I've added some noise in there to include
some paths that shouldn't be followed.

Hopefully it's clear enough - TypeA is the start point (node 4) and it's an
entity. It has 3 direct type relationships - to TypeB, Type AC and Type AD.
And then there's downstream type relationships, some of which are known
entity types (reflected by both an EntityTypeId property, and a
relationship back to the EntityType nodes). TypeA has entity relationships
to 2 and 3, but not 4 - as that's via TypeAD

So initially what I'd like is the most efficient cypher query for detecting
the direct entity relationships for TypeA - EntityType2 and EntityType3 in
my example. That is, I'd like to see the following relationships returned

TypeA-TypeD
TypeA-TypeAC
TypeA-TypeAD

Thanks again for your help


 
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
Niels Hoogeveen  
View profile  
 More options Nov 16 2012, 9:25 am
From: Niels Hoogeveen <nielshoog...@gmail.com>
Date: Fri, 16 Nov 2012 06:25:56 -0800 (PST)
Local: Fri, Nov 16 2012 9:25 am
Subject: Re: [Neo4j] Need help with designing deep tree of type relationships - can Neo4J cope?

I don't have a direct answer to the situation described here, though I
would like to add my two cents anyway.

Thinking about this problem yesterday, it occurred  to me that a traversal
description over a graph can be viewed as a formal language expression,
where the alphabet of the language consists of RelationshipTypes.

The following operations can easily be achieved:

concatenation: creation of paths
union: choice  between two traveral descriptions
kleene star: repeated traversal description

These three operations give a regular language, and with that have the same
expressive power as a regular expression.

It is possible to add intersection and complement, resulting in a visibly
pushdown language. I am not an expert on Cypher or Gremlin, but as far as I
am concerned, all these operations can be supported in a query language.
For the content management system multispective.com, I have implemented
these operations, giving a reasonably powerful language for querying a
Neo4j database.

Niels


 
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
okkytant...@gmail.com  
View profile  
 More options Nov 16 2012, 1:32 am
From: okkytant...@gmail.com
Date: Fri, 16 Nov 2012 06:32:34 +0000
Local: Fri, Nov 16 2012 1:32 am
Subject: Re: [Neo4j] Need help with designing deep tree of type relationships - can Neo4J cope?

T
Sent from my BlackBerry®
powered by Sinyal Kuat INDOSAT


 
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
Peter Neubauer  
View profile  
 More options Nov 17 2012, 10:21 am
From: Peter Neubauer <peter.neuba...@neotechnology.com>
Date: Sat, 17 Nov 2012 16:20:50 +0100
Local: Sat, Nov 17 2012 10:20 am
Subject: Re: [Neo4j] Need help with designing deep tree of type relationships - can Neo4J cope?

Ian,
I am not sure I am totally with you, but this query gives you the results
you want:

start t1=node(5)
match p=t1-[:TypeUsedByType*1..]-t2-[:Entity]->e1
return t1.name, e1.name,t2.name, length(p) as l
order by l asc
limit 3

http://console.neo4j.org/r/yiix14

Does that make sense?

/peter

Cheers,

/peter neubauer

G:  neubauer.peter
S:  peter.neubauer
P:  +46 704 106975
L:   http://www.linkedin.com/in/neubauer
T:   @peterneubauer

Neo4j 1.8 GA -
http://www.dzone.com/links/neo4j_18_release_fluent_graph_literacy.html

On Fri, Nov 16, 2012 at 6:09 AM, Ian Tompsett <iantompset...@gmail.com>wrote:


 
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
Ian Tompsett  
View profile  
 More options Nov 18 2012, 9:22 am
From: Ian Tompsett <iantompset...@gmail.com>
Date: Sun, 18 Nov 2012 06:22:54 -0800 (PST)
Local: Sun, Nov 18 2012 9:22 am
Subject: Re: [Neo4j] Need help with designing deep tree of type relationships - can Neo4J cope?

Thanks Peter

What I'm after is a reusable way of only picking up the types that are
entities that are directly related to TypeA. By directly related, I mean
there are no other entities between the type in question and TypeA.

If you try the following

start t1=node(5)
match p=t1-[:TypeUsedByType*1..]->t2-[:Entity]->e1
return t1.name, e1.name,t2.name, length(p) as l, extract (n in nodes(p) :
n.name)

You'll see the TypeAF appears in the list. You've managed to filter out by
basically knowing the results up front (ie the Limit 3). What I'd like to
do is filter it out by detecting that there is another entity (TypeAD)
between TypeAF and TypeA.

In the cypher query I've played around with trying a second match clause to
exclude t2 values that satisfy the first match, but do not have an entity
between them and t1. So that would exclude TypeAF in our example.

Hope that makes sense? Any ideas how to cypher that? Something like

Something like:
start t1=node(5)
match p=t1-[:TypeUsedByType*1..]->t2-[:Entity]->e1 ,
      p2=t2<-[:TypeUsedByType*1..]-t3<-[:TypeUsedByType]-t1,
p3=t3-[:Entity]-e2
where p1 exists but p2 and p3 don't???
return t1.name, e1.name,t2.name, length(p) as l, extract (n in nodes(p) :
n.name)

I think I'm looking for an outer join that fails sort of concept here, to
use SQL parlance.

Thanks


 
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
Ian Tompsett  
View profile  
 More options Nov 20 2012, 5:03 pm
From: Ian Tompsett <iantompset...@gmail.com>
Date: Tue, 20 Nov 2012 14:03:17 -0800 (PST)
Local: Tues, Nov 20 2012 5:03 pm
Subject: Re: [Neo4j] Need help with designing deep tree of type relationships - can Neo4J cope?

HI Peter (or anyone else..)

Have you had a chance to look at this? I'm getting concerned that the
answer to my initial question is "no" - especially given Jens' comment.

Whilst I do have a deep hierarchy to contend with, I don't need to look at
it all, I'd just like to find a way of getting my cypher query to
understand how early it can stop.

Unless my only option is the traversal API?

Thanks
Ian


 
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
Michael Hunger  
View profile  
 More options Nov 20 2012, 6:22 pm
From: Michael Hunger <michael.hun...@neotechnology.com>
Date: Wed, 21 Nov 2012 00:22:21 +0100
Local: Tues, Nov 20 2012 6:22 pm
Subject: Re: [Neo4j] Need help with designing deep tree of type relationships - can Neo4J cope?

Ian,

I think you want to filter paths, depending on the nodes that are in there:

something like this:

start t1=node(5)
match p=t1-[:TypeUsedByType*1..]->t2,t2-[:Entity]->e1

where SINGLE(n in tail(nodes(p))
             where has(n.EntityTypeId))

return t1.name, e1.name,t2.name, length(p) as l, extract (n in nodes(p) : n.name)

I check if there is only a single node (aka t2) that has an EntityType property
tail(nodes(p)) because otherwise t1 would also be in there.
and e1 also has a property EntityType so I split the path :)

see: http://console.neo4j.org/r/m909wo

Cheers

Michael

Am 20.11.2012 um 23:03 schrieb Ian Tompsett:


 
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
Marko Rodriguez  
View profile  
 More options Nov 20 2012, 5:05 pm
From: Marko Rodriguez <okramma...@gmail.com>
Date: Tue, 20 Nov 2012 15:05:44 -0700
Local: Tues, Nov 20 2012 5:05 pm
Subject: Re: [Neo4j] Need help with designing deep tree of type relationships - can Neo4J cope?

Hi,

Have you looked at Gremlin? It is lazy and has path and tree operators.

Good luck,
Marko.

http://markorodriguez.com
On Nov 20, 2012 3:03 PM, "Ian Tompsett" <iantompset...@gmail.com> wrote:


 
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
Ian Tompsett  
View profile  
 More options Nov 21 2012, 7:13 am
From: Ian Tompsett <iantompset...@gmail.com>
Date: Wed, 21 Nov 2012 04:13:52 -0800 (PST)
Local: Wed, Nov 21 2012 7:13 am
Subject: Re: [Neo4j] Need help with designing deep tree of type relationships - can Neo4J cope?

Thanks Michael

That looks promising, although I haven't had a chance to try it with my
real set of data (I'm not working today) to guage performance. I'll let you
know.

Ian


 
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
Ian Tompsett  
View profile  
 More options Nov 21 2012, 7:18 am
From: Ian Tompsett <iantompset...@gmail.com>
Date: Wed, 21 Nov 2012 04:18:30 -0800 (PST)
Local: Wed, Nov 21 2012 7:18 am
Subject: Re: [Neo4j] Need help with designing deep tree of type relationships - can Neo4J cope?

Hi Marko

Firstly, apologies if my reply gets duplicated - I posted a reply (or at
least thought I did) but it doesn't seem to have appeared.

I had a little look at gremlin - enough to think it has potential. But it's
a bit foreign to me so I haven't quite got my head around how to try it.
Don't suppose you've any examples doing something similar that you could
point me at? (Gremlin isn't the only lazy one).

Thanks
Ian


 
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
Ian Tompsett  
View profile  
 More options Nov 25 2012, 11:54 pm
From: Ian Tompsett <iantompset...@gmail.com>
Date: Sun, 25 Nov 2012 20:54:53 -0800 (PST)
Local: Sun, Nov 25 2012 11:54 pm
Subject: Re: [Neo4j] Need help with designing deep tree of type relationships - can Neo4J cope?

Michael

Well I've finally managed to try it with my real data. It worked out for a
simple case - a type that had one Entity related to it directly. That type
wasn't using many others. Then I went a little up the type usage hierarchy,
but by no means to the top, and it collapsed in a heap. I tried it once and
went to lunch (after it had been going for about half an hour). When I came
back it had stopped with an "Invalid Query" error (why it couldn't tell me
that straight away I don't quite understand). I tried again and several
hours later still no result.

The type I chose used 5 entities directly (detected by leaving out the
'*1..' from your query), but I don't know how deep the hierarchy would be
underneath - looking at the hierarchy in an unbounded manner like that
seems a recipe for failure.

This I'm doing via the Data browser web interface.

I used your suggested query verbatim (apart from changing the start node).

I think you now understand my data requirements, do you think there is
another way I should structure things (eg create different relationships)
that would help?

Or perhaps I'm faced with the choice of trying to work out how to do this
using Gremlin or just accepting that this isn't Neo4J's thing and looking
elsewhere.

Thanks
Ian


 
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
End of messages
« Back to Discussions « Newer topic     Older topic »