I've been thinking about finding subgraphs and algorithms based on subgraphs for a few weeks now. I don't like the idea of pulling out vertices from the graph and creating another graph some place else. It seems brittle. You are taking the vertices out of the graph and storing them someplace else to play with, while the graphs are getting updated in real time. Race conditions might eventually pop up and you have to deal with all this state and moving copies of bits around. Yuck.
In a way, you are currently asking a question about how to find the vertex or edge induced subgraph of a certain graph with gremlin and then exporting and working with that. What I want to explore in the next few months is "traversal induced subgraphs", or minors of the graph, defined by some combination of traversals and filters. Effectively, for any given algorithm you want to run on a property graph, you could define a traversal that transforms the given graph into the minor and then run the algorithm on that minor. The series of traversals would correspond to edge contractions and the filters on properties would correspond to vertex and edge deletions. The interesting thing about attacking the subgraph problem this way means that you can run your algorithms on the graph as it exists now and not have to worry about transforming the graph somehow.
All of this is a direct result of my work building Ogre[0]. I can define a traversal as a function that is then later used in another query like so:
(require '[ogre.core :as q])
(defn friend-of-friend [pipe]
(-> pipe (q/--> :friend) (q/--> :friend))
(q/query sample-vertex
friend-of-friend
into-vec!)
This is perfectly legal in Ogre and is encouraged. So we can define all sorts of generalized algorithms that take in a property graph, a traversal method and any other needed arguments and then executes the algorithm based on the minor induced by the traversal method. This would allow us to work on the graph in real time without worrying about exporting stuff. In the same way, it might also provide a cleaner way to build a generalized graph algorithm library without too much hassle. I've thought a bit about how to do this in pure Gremlin, but it's not obvious to me how to make it work cleanly. I'm not as good in Java with throwing functions around as I am in Clojure. I'm fairly confident this approach will lead interesting results in regards to Titan/Titanium integration. The general difficultly of serializing closures/functions in Faunus means that this will be harder to accomplish with that library though.
Just some thoughts I had that were tangentially related.
-Zack