wn.synsets(someword)[0]
>> 2) Maximizing sentence internal similarity before computing similarity
> Could you expand on this a little?
He means you could pick a single sense for each ambiguous word so as
to maximize the overall similarity among the senses contained in the
sentence. You could do this by brute force: enumerate all combinations
of senses (e.g. sense 4 of w[0], sense 1 of w[1], ...) and, for each
combination, combine the n(n-1)/2 pairwise similarity scores somehow.
There's also a greedy algorithm for doing this called "lexical
chaining".
-Steven Bird
That is, use wn.synsets(word)[0] instead of wn.synsets(word), so each
word contributes a single sense. Limiting the lookup by part of speech
would also likely help.
>> 2) Maximizing sentence internal similarity before computing similarity
> Could you expand on this a little? Do you mean something like using
> synsets that are most likely to co-occur? Without a synset n-gram
> corpus, this doesn't seem that easy to me, but I might be missing the
> point.
One possible way of disambiguating a sentence is:
1. Enumerate all possible synset combinations
2. For each one, compute the sum of pairwise similarities
3. Take the synset combination that maximizes the internal similarity
(If you don't want to enumerate all synset combinations, you could use
a sampling approach instead.)
Good luck,
Jordan