Seeking Project Teammates on sentiment analysis with IMDB and Twitter data

Yitong Zhou

Feb 4, 2013, 8:01:49 PM
to 10-701-spri...@googlegroups.com
Mingtao and I have teamed up, and we are thinking about using movie and Twitter data for sentiment analysis and rating prediction.

We currently have an IMDB movie review dataset of 50,000 labeled reviews, split evenly into 25k positive (ratings 7~10) and 25k negative (ratings 0~4), plus 50,000 unlabeled documents (neutral ratings) for unsupervised learning. We can also get access to the TREC Microblog dataset and the Netflix Challenge dataset. Tweets and movie reviews are both short documents, so they may share some characteristics while differing in others, which makes them a good pair for practicing sentiment analysis or information retrieval.
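
To make the labeling concrete, here is a minimal sketch of how a rating could be mapped to a sentiment label under the thresholds above. The function name `label_from_rating` is hypothetical, and treating ratings 5~6 as "no label" is our assumption about what "neutral" means here.

```python
from typing import Optional


def label_from_rating(rating: int) -> Optional[str]:
    """Map a 0-10 rating to a sentiment label (hypothetical helper)."""
    if not 0 <= rating <= 10:
        raise ValueError(f"rating out of range: {rating}")
    if rating >= 7:
        return "positive"  # ratings 7~10
    if rating <= 4:
        return "negative"  # ratings 0~4
    return None  # assumption: 5~6 count as neutral and get no label


if __name__ == "__main__":
    for r in (2, 5, 9):
        print(r, label_from_rating(r))
```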

Some topics we find interesting:
1) Sentiment analysis on movie reviews, validated on the IMDB dataset.
2) Adapting that model to current Twitter data to try to predict the IMDB rating of newly released movies. (Maybe we can try this on movies released in April and May to validate its effectiveness.)
3) Real-time tweet filtering: given a query or topic at a certain timestamp, pick out the top relevant tweets posted after that time.
4) Possibly novelty detection as well.
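
As a rough idea of what a baseline for topic 1 might look like, here is a small sketch of a bag-of-words Naive Bayes classifier with add-one smoothing, using only the Python standard library. The choice of Naive Bayes, the class name `NaiveBayesSentiment`, and the four toy reviews are all illustrative assumptions. The toy reviews are made up and are not taken from the IMDB dataset.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


class NaiveBayesSentiment:
    """Multinomial Naive Bayes over word counts (illustrative sketch)."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha          # smoothing constant (add-one by default)
        self.word_counts = {}       # label -> Counter of token counts
        self.doc_counts = Counter() # label -> number of training documents
        self.vocab = set()

    def fit(self, texts, labels):
        for text, label in zip(texts, labels):
            tokens = tokenize(text)
            self.doc_counts[label] += 1
            self.word_counts.setdefault(label, Counter()).update(tokens)
            self.vocab.update(tokens)
        return self

    def predict(self, text):
        total_docs = sum(self.doc_counts.values())
        tokens = [t for t in tokenize(text) if t in self.vocab]
        best_label, best_score = None, -math.inf
        for label, n_docs in self.doc_counts.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + self.alpha * len(self.vocab)
            # log prior + sum of smoothed log likelihoods
            score = math.log(n_docs / total_docs)
            for tok in tokens:
                score += math.log((counts[tok] + self.alpha) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Made-up toy data, only to show the interface.
train_texts = [
    "a wonderful, moving film with great acting",
    "great story and a wonderful cast",
    "boring plot and terrible acting",
    "a terrible, boring waste of time",
]
train_labels = ["positive", "positive", "negative", "negative"]

model = NaiveBayesSentiment().fit(train_texts, train_labels)

if __name__ == "__main__":
    print(model.predict("wonderful acting"))
    print(model.predict("boring and terrible"))
```

With the real data, the same `fit`/`predict` interface could be trained on the 25k/25k labeled reviews and then tried on tweets for topic 2, though tweets would likely need their own tokenization (hashtags, mentions, URLs).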

We are both master's students in MISM and coding geeks who really enjoy mathematical induction and agile programming (even though it always tires us out). If you are interested or have other interesting ideas, please contact us.

Yitong Zhou (yit...@andrew.cmu.edu)
Mingtao Zhang (ming...@andrew.cmu.edu)