Hi,
I'm trying to reproduce the classifiers published in "Twitter
Sentiment Classification using Distant Supervision" to use as a
baseline for my research, which is tweet sentiment classification
in pt-BR.
I'm using the dataset provided at http://help.sentiment140.com/for-students.
With Naive Bayes, I was able to get performance similar to the
published results.
However, I couldn't train an SVM classifier, since the dataset is
fairly large (1.6M tweets).
I tried the scikit-learn
implementation with a linear kernel and unigrams (33k) as
features. All the matrices are already stored in sparse form.
My best attempt was a bagging ensemble of smaller SVMs (32k),
which ran in a couple of hours.
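For reference, here is a minimal sketch of the kind of setup
described above, assuming scikit-learn's CountVectorizer, SVC and
BaggingClassifier. The toy tweets, labels, and all parameter
values (n_estimators, max_samples) are hypothetical placeholders,
not the actual configuration used on the 1.6M dataset:

```python
from sklearn.ensemble import BaggingClassifier
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.svm import SVC

# Toy stand-in data (hypothetical); the real corpus has 1.6M tweets.
tweets = ["i love this", "so happy today", "great movie", "what a nice day",
          "i hate this", "so sad today", "awful movie", "what a bad day"] * 5
labels = [1, 1, 1, 1, 0, 0, 0, 0] * 5

# Unigram bag-of-words; fit_transform returns a scipy sparse matrix.
vectorizer = CountVectorizer(ngram_range=(1, 1))
X = vectorizer.fit_transform(tweets)

# Bagging ensemble of linear-kernel SVMs, each trained on a random
# subset of the data (sizes here are illustrative only).
ensemble = BaggingClassifier(
    SVC(kernel="linear"),
    n_estimators=5,
    max_samples=0.5,
    random_state=0,
)
ensemble.fit(X, labels)

# Classify new tweets with the same vectorizer.
new_tweets = ["i love this movie", "i hate this day"]
predictions = ensemble.predict(vectorizer.transform(new_tweets))
print(predictions)
```

Each base SVM only sees a fraction of the rows, which is what
keeps the training time of the ensemble manageable.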
Am I missing any detail?
Could you explain how you trained the SVM classifier that gave
the published results?
Thank you in advance,
Breno.