Loading the sentiwordnet file

MT

Jun 17, 2011, 8:31:50 AM
to Pattern
Hi,

After reading this thread http://groups.google.com/group/pattern-for-python/browse_thread/thread/1537aa6f4a797853, I
tried to use this sample code:

from pattern.en import wordnet

wordnet.sentiment.load(path="/SOMEDIRECTORY/SentiWordNet/SentiWordNet_3.0.0_20100908.txt")

# Note: the + in that example should be a minus (SWN stores the
# negative value as "0.125", not as "-0.125").
def sentiment_score(w):
    positive, negative, objective = wordnet.sentiment[w]
    return positive - negative

# Testing sentiment_score: sentiment_score("happy") should be 0.875
print(sentiment_score("happy"))


but I am still getting this:

Traceback (most recent call last):
  File "election-twitter.py", line 5, in <module>
    wordnet.sentiment.load(path="/SOMEDIRECTORY/SentiWordNet/SentiWordNet_3.0.0_20100908.txt")
  File "/usr/local/lib/python2.7/dist-packages/pattern/en/wordnet/__init__.py", line 353, in load
    self.load_sentiwordnet()
  File "/usr/local/lib/python2.7/dist-packages/pattern/en/wordnet/__init__.py", line 368, in load_sentiwordnet
    for s in open(glob.glob(path)[0]).readlines():
IndexError: list index out of range

Any ideas, please? Thanks.

Tom De Smedt

Jun 29, 2011, 7:18:40 PM
to Pattern
The error says that glob.glob(path) is an empty list. It seems that
the script is looking in the wrong folder for the sentiwordnet file.
Note that the sentiwordnet file is not included in the Pattern module,
it's a separate project you have to download. Once you have the file
you can place it in pattern/en/wordnet/ after which

from pattern.en import wordnet
wordnet.sentiment.load()

should work.
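
For anyone hitting the same IndexError, here is a minimal sketch of the failure mode using only the standard library. The helper name find_first and the folder name are made up for illustration; they are not part of Pattern.

```python
import glob
import os


def find_first(pattern):
    """Return the first path matching a glob pattern, or None if nothing matches."""
    matches = glob.glob(pattern)
    # glob.glob() returns an empty list when nothing matches, so indexing
    # [0] directly raises IndexError, as in the traceback above.
    return matches[0] if matches else None


# Hypothetical location; substitute the folder where you saved the file.
path = find_first(os.path.join("SOMEDIRECTORY", "SentiWordNet*.txt"))
if path is None:
    print("SentiWordNet file not found; check the folder and file name")
else:
    print("found " + path)
```

Checking the match list before indexing turns the bare IndexError into a message that says which file is missing.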

T



William Richard

Feb 18, 2013, 12:12:35 PM
to pattern-f...@googlegroups.com
I realize this is old, but I think I know why this isn't working. Before I start my explanation, though, I want to say that I'm an outsider - I'm not familiar with how this project is maintained, and by pointing out what I believe to be a typo, I mean no offense or insult.

wordnet.sentiment.load() is the following:

    def load(self, path=None):
        # Backwards compatibility with pattern.en.wordnet.Sentiment.
        if path is not None:
            self._path = path
        self._parse()

Notice that the optional path argument is stored in self._path.

Then self._parse() is called, which looks for the sentiwordnet file via the _parse_path function, which is as follows:

    def _parse_path(self):
        """ For backwards compatibility, look for SentiWordNet*.txt in:
            pattern/en/parser/, pattern/en/wordnet/, or the given path.
        """
        try: f = (
            glob(os.path.join(self.path)) + \
            glob(os.path.join(MODULE, self.path)) + \
            glob(os.path.join(MODULE, "..", "wordnet", self.path)))[0]
        except IndexError:
            raise ImportError, "can't find SentiWordnet data file"
        return f

Notice that it looks in self.path - without the underscore before path.

A quick search of the sentiment file reveals that load() is the only method that sets or uses the _path attribute.

This appears to be a simple typo: after I changed '_path' in load() to 'path', everything works as expected.

The change is minimal, so I'm not sure it's worth me becoming a contributor, or dealing with a pull request to fix it.
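
The mismatch can be reproduced in isolation. The sketch below uses a hypothetical Loader class (not Pattern's actual code) whose lookup method reads self.path, with one load method that writes to self._path and one that writes to self.path.

```python
class Loader(object):
    """Hypothetical stand-in for the sentiment loader; not Pattern's class."""

    def __init__(self, path="default.txt"):
        self.path = path

    def load_buggy(self, path=None):
        if path is not None:
            self._path = path  # stored under a name nothing else reads
        return self._parse_path()

    def load_fixed(self, path=None):
        if path is not None:
            self.path = path  # the same attribute _parse_path() reads
        return self._parse_path()

    def _parse_path(self):
        # Only self.path is consulted here, mirroring the lookup quoted above.
        return self.path


print(Loader().load_buggy("custom.txt"))  # prints default.txt: argument ignored
print(Loader().load_fixed("custom.txt"))  # prints custom.txt
```

In the buggy variant the custom path is silently dropped, so the lookup falls back to the default location and fails if the file is not there.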

Tom De Smedt

Feb 21, 2013, 6:17:16 PM
to pattern-f...@googlegroups.com
@Richard: Thanks for your bug report! I'll add your changes to the latest version.

@MT: Once this is fixed it should work, but I'd like to point out that the more recent releases of Pattern have sentiment analysis bundled in (I should update the example case study). In many cases it even outperforms SentiWordNet, since it focusses on adjectives that occur frequently in real-world language use instead of assigning a score to each word in a sentence, which can lead to noise in longer sentences.

Using Pattern's sentiment analysis is easy:

from pattern.en import sentiment
print(sentiment("it was a terrible day"))
print(sentiment("it was a terribly exciting day"))
print(sentiment("it was not terribly exciting"))

The return value is a (polarity, subjectivity) tuple, where polarity is a number between -1 and +1 indicating negative or positive tone.
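
One way to consume that tuple is sketched below. The describe helper, its 0.1 threshold, and the example tuples are all illustrative assumptions, not part of Pattern or values it returns.

```python
def describe(score, threshold=0.1):
    """Map a (polarity, subjectivity) tuple to a coarse label.

    The default threshold of 0.1 is an arbitrary illustrative cutoff,
    not a value taken from Pattern.
    """
    polarity, _subjectivity = score
    if polarity > threshold:
        return "positive"
    if polarity < -threshold:
        return "negative"
    return "neutral"


# Made-up tuples standing in for sentiment(...) results.
print(describe((-1.0, 1.0)))  # prints negative
print(describe((0.05, 0.3)))  # prints neutral
```

With Pattern installed, the same helper could be called as describe(sentiment("it was a terrible day")).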

Best,
Tom
