average polysemy of nouns, verbs, adj, adv

melo.d...@hotmail.com

Mar 8, 2013, 10:14:48 AM
to nltk-users
Hello,

Does anybody know how to do this exercise? "The polysemy of a word is the
number of senses it has. Using WordNet, we can determine that the noun
dog has 7 senses with: len(wn.synsets('dog', 'n')). Compute the
average polysemy of nouns, verbs, adjectives and adverbs according to
WordNet"

We tried the following, but it ends with an error.

>>> from nltk.corpus import wordnet as wn
>>> def average_polysemy(pos):
...     synset_list = list(wn.all_synsets(pos))
...     lemma_list = [synset.lemma_names for synset in synset_list]
...     contador = 0
...     for lemma in lemma_list:
...         contador_new = len(wn.synsets(lemma, pos))
...         contador = contador_new + 1
...     return contador / len(synset_list)


>>> average_polysemy("n")

Thanks

zabbarob

Mar 9, 2013, 9:10:02 AM
to nltk-...@googlegroups.com
Hey,

Is this what you need?

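from __future__ import division, print_function
from nltk.corpus import wordnet as wn

pos_tags = [wn.ADJ, wn.ADV, wn.NOUN, wn.VERB]
# `lemmas` was not defined in the original post; assuming here that it is
# meant to be every lemma name in WordNet.
lemmas = set(wn.all_lemma_names())
# One (pos, number of senses) pair for each lemma that has senses with that POS.
polysemy = ((p, len(wn.synsets(l, p))) for l in lemmas for p in pos_tags if len(wn.synsets(l, p)) > 0)
# pos -> (number of lemmas seen, total number of senses)
pos_count = {k: (0, 0) for k in pos_tags}
for pos, count in polysemy:
    pos_count[pos] = (pos_count[pos][0] + 1, pos_count[pos][1] + count)
for pos in pos_count:
    print(pos, pos_count[pos][1] / pos_count[pos][0])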
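
For comparison, here is a minimal sketch of the same average written as a function, closer to the shape of the attempt in the question. That attempt passes a whole collection of lemma names to wn.synsets at once and overwrites its counter on every iteration, so the sketch counts senses one lemma at a time and sums them. The names average_polysemy, wordnet_average_polysemy and toy_senses are made up for illustration, and the sense counts in the toy check are invented, not WordNet figures. It assumes Python 3 and, for the WordNet helper, a recent NLTK with the WordNet corpus downloaded.

```python
def average_polysemy(lemmas, count_senses):
    """Mean number of senses per distinct lemma.

    count_senses is any callable that maps a lemma to its number of senses.
    """
    distinct = set(lemmas)
    if not distinct:
        return 0.0
    return sum(count_senses(lemma) for lemma in distinct) / len(distinct)


def wordnet_average_polysemy(pos):
    """Average polysemy for one WordNet part of speech: 'n', 'v', 'a' or 'r'."""
    # Imported here so that average_polysemy can be tried without NLTK installed.
    from nltk.corpus import wordnet as wn
    return average_polysemy(
        wn.all_lemma_names(pos=pos),
        lambda lemma: len(wn.synsets(lemma, pos)),
    )


# Toy check with a hand-made sense inventory (invented counts, not WordNet data).
toy_senses = {"dog": 7, "cat": 8, "lemma": 3}
toy_average = average_polysemy(["dog", "cat", "lemma", "dog"], toy_senses.get)
print(toy_average)  # 6.0
```

With WordNet available, wordnet_average_polysemy("n") should give the figure for nouns, and likewise "v", "a" and "r" for verbs, adjectives and adverbs. Note that this averages over distinct lemma names, not over synsets, which seems to be what the exercise asks for.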

Cya,
Robert.

Radhika Gaonkar

Mar 9, 2013, 4:05:38 PM
to nltk-...@googlegroups.com
Hey, here is what I'm doing as the first part of my keyword extraction:

  1. I extract the contents of the anchor, title and all heading tags, and run the text processing on them.
  2. Later I take the entire raw text of the web page, after cleaning out all the tags, JS, etc.
The problem is that this raw text also contains the contents from point 1. Since I have already done the text processing for point 1, I don't want to repeat it in point 2. Is there a way to do this? One obvious way would be to loop over the text and eliminate the words already processed. Is there any alternative?
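
One possible alternative to explicit loops, sketched under the assumption that both steps produce plain lists of tokens: subtract the counts of the already-processed tokens from the counts of the full text with collections.Counter. The function name remaining_token_counts and the sample tokens are hypothetical.

```python
from collections import Counter


def remaining_token_counts(full_text_tokens, already_processed_tokens):
    """Counts of the tokens left after removing the already-processed occurrences.

    Counter subtraction drops every token whose count falls to zero or below.
    """
    return Counter(full_text_tokens) - Counter(already_processed_tokens)


# Hypothetical tokens: the page text contains the heading tokens as well.
full = ["nltk", "wordnet", "polysemy", "nltk", "average"]
heads = ["nltk", "polysemy"]
remaining = remaining_token_counts(full, heads)
print(remaining)
```

Here one occurrence of "nltk" survives because it appears once more in the body than in the headings, while "polysemy" disappears entirely. If every occurrence of a heading word should be skipped instead, filtering against set(heads) would do that.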


Thanks


--
Radhika Gaonkar
3rd year B.E. Hons Computer Science
BITS Pilani K. K. Birla Goa Campus
Contact no. || +91 9004753662
