[Gremlin and Free Text] Using NLP in a Gremlin Traversal

Marko Rodriguez

Jan 26, 2011, 11:13:13 AM

to gremlin-users

Hello everyone,

I was talking with Chris Diehl on the side about natural language processing and graph traversals. I thought many of you might like the ideas, and since I think the combination is pretty powerful, I thought I would share.

----------------------

QUESTION: What are your thoughts on processing unstructured text during a graph traversal?

ANSWER: The beauty of Gremlin 0.7 is that Groovy/Java is at your fingertips. Thus, you can use any of the existing NLP packages for Java while traversing a graph. For example, let's assume we have a Java package implementing the POM test (to review: the POM test is a psychology test that can determine, from written text, the psychological state of an individual in terms of anger, confusion, fatigue, vigor, tension, and depression; see http://en.wikipedia.org/wiki/Profile_of_mood_states). Now let's say we are traversing the Enron email data set.

I assume a graph model as such:
marko<---from---email---to--->chris (where an email has the property textBody which is the unstructured text -- i.e. the email body)

g.v(1).inE[[label:'from']].outV{POM.isAngry(it.textBody)}.outE[[label:'to']].inV

The query above returns all the people to whom Marko wrote an angry email (taking vertex 1 to be Marko). To be clear, I'm assuming there is a class POM with a static method POM.isAngry(String freeText) that returns true or false depending on whether the unstructured text can be categorized as hostile or angry. POM.isAngry() is called within a filter closure (that is, the current object is filtered out if the closure returns false).
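To make the assumed POM.isAngry(String freeText) contract concrete, here is a minimal Java sketch. It is only a stand-in: the class body, the word list, and the keyword-matching approach are all hypothetical and are not a real POM or NLP implementation. A real version would delegate to an actual NLP library.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

// Hypothetical stand-in for the POM class assumed in the post.
// Only the static isAngry(String) signature comes from the post;
// everything else here is an illustrative assumption.
final class POM {

    // Illustrative word list only; not a validated mood lexicon.
    private static final Set<String> ANGER_WORDS = new HashSet<>(Arrays.asList(
            "angry", "furious", "outraged", "hate", "unacceptable"));

    private POM() {
        // Static utility class; not meant to be instantiated.
    }

    // Returns true if the free text contains at least one anger keyword.
    public static boolean isAngry(String freeText) {
        if (freeText == null) {
            return false; // e.g. an email vertex with no textBody property
        }
        // Lowercase the text and split on runs of non-letters to get word tokens.
        for (String token : freeText.toLowerCase(Locale.ROOT).split("[^a-z]+")) {
            if (ANGER_WORDS.contains(token)) {
                return true;
            }
        }
        return false;
    }
}
```

With a class like this on the classpath, the filter closure in the traversal can call POM.isAngry(it.textBody) directly, since Groovy can invoke Java static methods as-is.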

So, in short, with the body of NLP Java packages out there, using them in Gremlin during a graph traversal is trivial given the beauty of Groovy.

Enjoy!
Marko.

http://markorodriguez.com
