I was talking with Chris Diehl on the side about natural language processing and graph traversals. I thought many of you might like the ideas, and I think it's pretty powerful, so I thought I would share.
----------------------
QUESTION: What are your thoughts on processing unstructured text during a graph traversal?
ANSWER: The beauty of Gremlin 0.7 is that Groovy/Java is at your fingertips. Thus, you can use any of the known NLP packages for Java when traversing a graph. For example, let's assume we have a POM test Java package (to review: the POM test is a psychology test that can determine from written text the psychological state of an individual along six dimensions: anger, confusion, fatigue, vigor, tension, and depression; see http://en.wikipedia.org/wiki/Profile_of_mood_states). Now let's say we are traversing the Enron email data set.
I assume a graph model as such:
marko<---from---email---to--->chris (where an email has the property textBody which is the unstructured text -- i.e. the email body)
g.v(1).inE[[label:'from']].outV{POM.isAngry(it.textBody)}.outE[[label:'to']].inV
The query above will return all the people that Marko wrote an angry email to (assuming vertex 1 is Marko). To be clear, I'm assuming there is some POM.class with a static method POM.isAngry(String freeText) that returns true or false depending on whether the unstructured text can be categorized as having hostility/anger in it. This POM.isAngry() method is called within a filter closure on the email vertices (that is, the current object is filtered out if the closure returns false).
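To make that assumption concrete, here is a minimal sketch of what such a POM class could look like in Java. Everything in it is hypothetical: the word list, the threshold, and the keyword-counting approach are stand-ins chosen for illustration, not a real Profile of Mood States scorer and not an existing library. A real implementation would sit behind the same static method signature.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

// Hypothetical stand-in for the POM class assumed in this post.
// It is a naive keyword counter, NOT a validated mood-state scorer.
final class POM {
    // Illustrative word list only, made up for this example.
    private static final Set<String> ANGER_WORDS = new HashSet<>(Arrays.asList(
            "angry", "furious", "outraged", "unacceptable", "hate"));

    // Assumed threshold: this many anger words marks the text as angry.
    private static final int THRESHOLD = 2;

    private POM() {}

    // Returns true if freeText contains at least THRESHOLD anger words.
    public static boolean isAngry(String freeText) {
        if (freeText == null) {
            return false;
        }
        int hits = 0;
        // Lowercase the text and split on any run of non-letters.
        for (String token : freeText.toLowerCase(Locale.ROOT).split("[^a-z]+")) {
            if (ANGER_WORDS.contains(token)) {
                hits++;
            }
        }
        return hits >= THRESHOLD;
    }
}
```

With a class like this on the classpath, the closure in the traversal can call POM.isAngry(it.textBody) directly, since Groovy can call static Java methods as-is.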
So, in short, with the body of NLP Java packages out there, using them in Gremlin during a graph traversal is trivial given the beauty of Groovy.
Enjoy!
Marko.