The XML file is complemented with a Python algorithm that deals with negation ("not good"), intensity ("very good") and emoticons ("baaad >:-D"). This algorithm has changed a lot and might be entirely different from what is wrapped in TextBlob.
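To give a rough idea of what handling negation and intensity can look like, here is a toy sketch. It is not the code in Pattern: the lexicon, the multipliers and the function name `toy_polarity` are all made up for illustration, and emoticons are left out.

```python
# Toy illustration only: NOT Pattern's actual algorithm or lexicon.
LEXICON = {"good": 0.7, "bad": -0.7, "nice": 0.6}  # hypothetical word -> polarity
INTENSIFIERS = {"very": 1.3, "really": 1.3}        # hypothetical multipliers
NEGATIONS = {"not", "never"}

def toy_polarity(text):
    """Average polarity of the scored words in text, between -1.0 and +1.0."""
    scores = []
    negate, boost = False, 1.0
    for word in text.lower().split():
        if word in NEGATIONS:
            negate = True
        elif word in INTENSIFIERS:
            boost *= INTENSIFIERS[word]
        elif word in LEXICON:
            score = LEXICON[word] * boost
            if negate:
                score = -score
            # Clamp so that stacked intensifiers stay inside the valid range.
            scores.append(max(-1.0, min(1.0, score)))
            negate, boost = False, 1.0
        else:
            # Modifiers only carry over to the word that directly follows them.
            negate, boost = False, 1.0
    return sum(scores) / len(scores) if scores else 0.0
```

In this sketch "not good" simply flips the sign of "good", and "very good" scales it up. A real system has to make more careful choices here (for example, how far a negation reaches).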
The only way to reliably measure the accuracy of a sentiment analysis system is to compare its output to (thousands of) human assessments. "Seems better" is always a problem, because accuracy is a statistical measure: the system can be wrong about specific cases, which the human eye is very good at spotting, and humans themselves tend to disagree about any personal opinion 30% of the time.
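Comparing against human assessments can be as simple as the sketch below. The helper `accuracy`, the four labeled examples and the stand-in predictor are hypothetical and only there so the snippet runs on its own; a real test set needs thousands of items.

```python
def accuracy(predict, labeled):
    """Fraction of (text, human_label) pairs where predict(text) matches the label."""
    correct = sum(1 for text, label in labeled if predict(text) == label)
    return correct / float(len(labeled))

# Hypothetical hand-labeled sample (True = positive, False = negative).
labeled = [
    ("a wonderful book", True),
    ("dull and far too long", False),
    ("not bad at all", True),
    ("a waste of paper", False),
]

# Stand-in predictor that calls everything positive, as a baseline.
def always_positive(text):
    return True

score = accuracy(always_positive, labeled)
```

With Pattern, the predictor could be something like `lambda text: sentiment(text)[0] > 0`, assuming the first element of the result is the polarity. Always compare against a trivial baseline like the one above: on a test set that is mostly positive, a system that looks accurate may not be doing much.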
Typical problems are domain adaptation (e.g., what works well on book reviews might not work very well on hotel reviews or political tweets) and sarcasm. The sentiment analysis in Pattern has been tested on book reviews and movie reviews. The accuracy has dropped by 1-2% with new updates to the algorithm, but this actually means that it has become stronger in other domains; in other words, it generalizes better.
Overall, classifiers will reach the same 70-80% accuracy as the lexicon + algorithm approach used by Pattern, unless you have a lot of training data and time to fine-tune the classifier. Classifiers offer a prediction, but they do not offer insight into why, such as the assessments (which words contributed which score).
It is not difficult to extend Pattern's lexicon with your own scores:
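from pattern.en import sentiment
# Add your own entries to the lexicon (polarity ranges from -1.0 to +1.0).
sentiment.annotate("wicked party", polarity=0.7)
sentiment.annotate("nice job stupid", polarity=-0.9)
# print() with a single argument works in both Python 2 and Python 3.
print(sentiment("wicked party this weekend!"))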
Have a look at the Sentiment.annotate() method in pattern/text/__init__.py for the details.
If you do want to use classifiers, use SVM and focus on lots of high-quality training data instead of tweaking parameter values.
Tom