I'm on a topic modeling mailing list
(https://lists.cs.princeton.edu/mailman/listinfo/topic-models), and there
is a lot of related research going on there. For performance reasons, nearly
all of the code is in either Java or C++, although there are Python and
R "wrappers" for much of the code. Most of the code I've seen go by is
open source and could probably be incorporated into SwiftRiver if it can
be somehow wrapped in PHP or Python. I don't know what's inside Techmeme
or any of the "commercial" journalism automation products, though.

I've got all of the R NLP code I could find (and the underlying C++ and
Java code) in the Social Media Analytics Research Toolkit if you want to
play with that. It's all open source, so if you (or SwiftRiver) see
something you like, just grab it. ;-)
--
http://twitter.com/znmeb http://borasky-research.net
"A mathematician is a device for turning coffee into theorems." -- Paul
Erdős