Subgraphs

515 views
Skip to first unread message

Derrick Johnson

unread,
Apr 11, 2013, 10:49:05 AM4/11/13
to gremli...@googlegroups.com
A while ago, there was a discussion on how to obtain a subgraph. I want to revisit this conversation, and see if  Tinkerpop can handle my use case.

I have a large graph, and I want to dynamically obtain subgraphs based off of edge properties. For example:

Using my Facebook social network, I want to get a graph of friends that are female, and also friends that are male, in the same graph.  Is there any way of doing this with the API?  FOr example:

Graph g = myNetwork.getSubgraph(vertices: "sex"="male", edges:"only_links_between_males")

I hope this makes sense. In other words, I want the simplest way of getting a *Graph* (not a just a separate collection of vertices and a collection of edges) which has a certain type of node, and no dangling edges..


Thanks,
-Derrick

Stephen Mallette

unread,
Apr 11, 2013, 2:18:48 PM4/11/13
to gremli...@googlegroups.com
I think that your case can be easily done with a little bit of Gremlin.  This is my usual approach:

gremlin> g = TinkerGraphFactory.createTinkerGraph()
==>tinkergraph[vertices:6 edges:6]

// here's the subgraph i want...just those people who know others
gremlin> g.E.has('label','knows')           
==>e[7][1-knows->2]
==>e[8][1-knows->4]

// this is my target subgraph
gremlin> sg = new TinkerGraph()                                                                                      
==>tinkergraph[vertices:0 edges:0]

// define a "Get Or Create" function for vertices...use ElementHelper to copy properties
gremlin> def goc(v,g){nv=g.getVertex(v.id);if(nv==null){nv=g.addVertex(v.id,ElementHelper.getProperties(v))};nv}       
==>true

// subgraph away
gremlin> g.E.has('label','knows').sideEffect{sg.addEdge(it.id,goc(it.outV.next(),sg),goc(it.inV.next(),sg),it.label,ElementHelper.getProperties(it))}.iterate()
==>null
gremlin> sg.E
==>e[7][1-knows->2]
==>e[8][1-knows->4]

From here you can out the subgraph to GraphML, do more gremlin just over the subgraph, etc.  What's really nice about using TinkerGraph in this way is that it's the only graph implementation that will preserve the element identifiers.  Of course, the limitation is that TinkerGraph is in-memory only so your subgraph will that ceiling hanging over its head...but other than that, this works well.  I'm sure you could flip this gremlin into a function such as the one you described.

I added just added basically the same stuff to: http://gremlindocs.com/#recipes/subgraphing

HTH,

Stephen




--
You received this message because you are subscribed to the Google Groups "Gremlin-users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to gremlin-user...@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.
 
 

Zack Maril

unread,
Apr 11, 2013, 3:16:15 PM4/11/13
to gremli...@googlegroups.com
I've been thinking about finding subgraphs and algorithms based on subgraphs for a few weeks now. I don't like the idea of pulling out vertices from the graph and creating another graph some place else. It seems brittle. You are taking the vertices out of the graph and storing them someplace else to play with, while the graphs are getting updated in real time. Race conditions might eventually pop up and you have to deal with all this state and moving copies of bits around. Yuck. 

In a way, you are currently asking a question about how to find the vertex or edge induced subgraph of a certain graph with gremlin and then exporting and working with that. What I want to explore in the next few months is "traversal induced subgraphs", or minors of the graph, defined by some combination of traversals and filters. Effectively, for any given algorithm you want to run on a property graph, you could define a traversal that transforms the given graph into the minor and then run the algorithm on that minor. The series of traversals would correspond to edge contractions and the filters on properties would correspond to vertex and edge deletions. The interesting thing about attacking the subgraph problem this way means that you can run your algorithms on the graph as it exists now and not have to worry about transforming the graph somehow. 

All of this is a direct result of my work building Ogre[0]. I can define a traversal as a function that is then later used in another query like so: 

(require '[ogre.core :as q])

(defn friend-of-friend [pipe] 
    (-> pipe (q/--> :friend) (q/--> :friend))

(q/query sample-vertex
             friend-of-friend
             into-vec!)

This is perfectly legal in Ogre and is encouraged. So we can define all sorts of generalized algorithms that take in a property graph, a traversal method and any other needed arguments and then executes the algorithm based on the minor induced by the traversal method. This would allow us to work on the graph in real time without worrying about exporting stuff. In the same way, it might also provide a cleaner way to build a generalized graph algorithm library without too much hassle. I've thought a bit about how to do this in pure Gremlin, but it's not obvious to me how to make it work cleanly. I'm not as good in Java with throwing functions around as I am in Clojure. I'm fairly confident this approach will lead interesting results in regards to Titan/Titanium integration. The general difficultly of serializing closures/functions in Faunus means that this will be harder to accomplish with that library though. 

Just some thoughts I had that were tangentially related. 
-Zack

Reply all
Reply to author
Forward
0 new messages