Re: [Senserelate-developers] WordNet::SenseRelate

Ambikesh jayal

Apr 5, 2010, 8:30:46 PM
to linasv...@gmail.com, Ted Pedersen, si...@cs.utah.edu, sata...@gmail.com, senserelate...@lists.sourceforge.net, senserelate-users, wn-...@googlegroups.com
Dear Linas,

Thanks for developing WordNet::SenseKey. It is exactly the program I was looking for. Thanks also to Siddharth and Ted for pointing me to it.

However, the output from WordNet::SenseKey seems to be a bit different from the corresponding value shown by the WordNet web interface. For example, for the sense "distinct#a#1", WordNet::SenseKey gives the sense key "distinct%5:00:00:different:00", whereas the WordNet web interface shows "distinct%3:00:00:different:00". I apologise if I am missing something here.


Here are more examples.


Output from WordNet::SenseKey.pm:
****distinct#a#1 [distinct%5:00:00:different:00]
****distinct#a#2 [distinct%3:00:00::]
****distinct#a#3 [distinct%5:00:00:separate:00]
****distinct#a#4 [distinct%5:00:00:definite:00]
****distinct#a#5 [distinct%5:00:00:clear:00]

Output from the WordNet web interface (http://wordnetweb.princeton.edu/perl/webwn):
distinct#1 (distinct%3:00:00:different:00)
distinct#2 (distinct%3:00:00::)
distinct#3 (distinct%3:00:00:separate:00)
distinct#4 (distinct%3:00:00:definite:00)
distinct#5 (distinct%3:00:00:clear:00)

Regards, 
Ambikesh Jayal. 
School of IS, Computing & Maths,
Brunel University,
Uxbridge, UB8 3PH,
United Kingdom.

--- On Mon, 5/4/10, Siddharth Patwardhan <si...@cs.utah.edu> wrote:

From: Siddharth Patwardhan <si...@cs.utah.edu>
Subject: Re: [Senserelate-developers] WordNet::SenseRelate
To: "Ted Pedersen" <dulu...@gmail.com>
Cc: sata...@gmail.com, "Ambikesh jayal" <jayal...@yahoo.com>, senserelate...@lists.sourceforge.net, "senserelate-users" <senserel...@lists.sourceforge.net>
Date: Monday, 5 April, 2010, 19:14

I remember a little while back Linas Vepstas wrote something that could
deal with sensekeys. He's released it on CPAN:

http://search.cpan.org/dist/WordNet-SenseKey/

-- Sid.

On Mon, 2010-04-05 at 17:34 -0500, Ted Pedersen wrote:
> Hi Bano,
>
> Very impressive memory. :) That had totally slipped my mind, but
> indeed it is here (on my very own web page :)
>
> http://www.d.umn.edu/~tpederse/wordnet.html
>
> Here's the short description from that page...
>
> Map from QueryData to WordNet sense-keys
>
> QueryData identifies WordNet senses using a word#pos#sense format.
> WordNet identifies senses using sense-keys (aka mnemonics). This
> program creates a mapping between the QueryData format and the WordNet
> sense-key format. (This tool is not specific to Senseval-2 data - it
> is generally useful if you are using QueryData to access WordNet.)
>
> So, this sounds very much like what Ambikesh may want to use. Thanks
> for pointing this out, I absolutely missed this!
>
> Thanks!
> Ted
>
> On Mon, Apr 5, 2010 at 5:26 PM, Satanjeev Banerjee <sata...@gmail.com> wrote:
> > Hi Ted,
> >
> > I'm pretty rusty with Senserelate, but I vaguely recall having written a
> > program (way back when!) that at least created a map between the sensekeys
> > and the word#pos#sense format (but maybe we are talking of something else
> > here?) I googled around for it, and found this link:
> > http://www.d.umn.edu/~tpederse/Code/Readme-qd2wn.txt. Does this program
> > still exist? As far as I remember, it depended on the minutiae of the
> > various file formats in WordNet, so I wouldn't be surprised if those formats
> > have changed now rendering the program useless :-).
> >
> > -Bano
> >
> > On Mon, Apr 5, 2010 at 6:11 PM, Ted Pedersen <dulu...@gmail.com> wrote:
> >>
> >> Hi Ambikesh,
> >>
> >> See my comments inline...
> >>
> >> On Mon, Apr 5, 2010 at 4:43 PM, Ambikesh jayal <jayal...@yahoo.com>
> >> wrote:
> >> >
> >> > Hi,
> >> > The WordNet::SenseRelate returns the value in the format "infer#v#5". To
> >> > run my experiments I need to compare it with a value in the format
> >> > "infer%2:31:01::".
> >> > 1. Is there a function that takes sense key as input and returns the
> >> > corresponding sense number? For example inputting "infer%2:31:01::" should
> >> > return "infer#v#5".
> >>
> >> I am not sure, but if there is it would be in WordNet::QueryData.
> >>
> >> http://search.cpan.org/dist/WordNet-QueryData/
> >>
> >> While we use WordNet::QueryData, we don't include all of its
> >> functionality, so this might be something that they provide but we
> >> don't use. There is a mailing list devoted to QueryData that might be
> >> the best place to ask this - it's a google group named wn-perl
> >> (details can be found at the site above).
> >>
> >> >
> >> > 2. Can WordNet::SenseRelate  be configured to return the results in the
> >> > format  "infer%2:31:01::" ?
> >>
> >> No, we only support the wps format (word#part-of-speech#sense, as in
> >> dog#n#2).
> >>
> >> > Also can WordNet::SenseRelate  be configured for list of stopwords,
> >> > special characters?
> >>
> >> Yes. See the stoplist option described here
> >>
> >>
> >> http://search.cpan.org/dist/WordNet-SenseRelate-AllWords/lib/WordNet/SenseRelate/AllWords.pm
> >>
> >> and here
> >>
> >> http://search.cpan.org/dist/WordNet-SenseRelate-AllWords/utils/wsd.pl
> >>
> >> and find a sample stoplist here :
> >>
> >>
> >> http://cpansearch.perl.org/src/TPEDERSE/WordNet-SenseRelate-AllWords-0.19/samples/default-stoplist-raw.txt
> >>
> >> ;)
> >>
> >> Good luck,
> >> Ted
> >>
> >> > Thanks,
> >> > Regards,
> >> > Ambikesh Jayal.
> >> > School of IS, Computing & Maths,
> >> > Brunel University,
> >> > Uxbridge, UB8 3PH,
> >> > United Kingdom.
> >> > Email: ambikes...@brunel.ac.uk
> >> > Webpage: http://people.brunel.ac.uk/~cspgaaj
> >> >
> >>
> >>
> >> --
> >> Ted Pedersen
> >> http://www.d.umn.edu/~tpederse
> >>
> >>
> >> ------------------------------------------------------------------------------
> >> Download Intel® Parallel Studio Eval
> >> Try the new software tools for yourself. Speed compiling, find bugs
> >> proactively, and fine-tune applications for parallel performance.
> >> See why Intel Parallel Studio got high marks during beta.
> >> http://p.sf.net/sfu/intel-sw-dev
> >> _______________________________________________
> >> senserelate-developers mailing list
> >> senserelate...@lists.sourceforge.net
> >> https://lists.sourceforge.net/lists/listinfo/senserelate-developers
> >
> >
>
>
>
> --
> Ted Pedersen
> http://www.d.umn.edu/~tpederse
>


Benjamin R. Haskell

Apr 5, 2010, 9:38:37 PM
to wn-...@googlegroups.com, linasv...@gmail.com, Ted Pedersen, si...@cs.utah.edu, sata...@gmail.com, senserelate...@lists.sourceforge.net, senserelate-users
(My cross-posting will fail, since I don't think I'm subscribed to some
of the lists -- feel free to forward as desired.)

The web interface is incorrect in the cases you list. Internally, the
Perl library we used at the lab for all of our projects was somewhat
stupid about the 'adjective'/'satellite' distinction. ('Stupid' in the
sense that it conflated 3 and 5 -- I gave it an option called
conflate35, which defaulted to true.)

The good news is that the 3/5 distinction is redundant with the
:'head-word':'head-sense' trailing portion of a sense key. You can just
do:

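sub recover_sense35 {
    local $_ = shift;
    # A key ending in '::' has no head word: a plain adjective, keep the 3.
    # Any other %3 key names a head word, so it is a satellite (5).
    /::$/ or s/%3/%5/;
    $_
}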

to get the correct senses. I'm kind of surprised that the web interface
is wrong -- I thought I'd corrected that at some point -- but maybe it
got reverted when the site moved. (And I'm no longer working at WordNet
anyway.)
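
As a quick check of the idea against the keys quoted earlier in this thread, here is a minimal sketch. The helper name fix_adj_type is hypothetical and not part of any module; it switches 3 to 5 only when the key names a head word (that is, does not end in '::') and then compares the result with the WordNet::SenseKey output listed above.

```perl
use strict;
use warnings;

# Hypothetical helper: promote an adjective key (3) to satellite (5)
# only when the key names a head word, i.e. does not end in '::'.
sub fix_adj_type {
    my ($key) = @_;
    $key =~ s/%3:/%5:/ unless $key =~ /::$/;
    return $key;
}

# Web-interface keys from this thread, paired with the
# WordNet::SenseKey output reported for the same senses.
my %expected = (
    'distinct%3:00:00:different:00' => 'distinct%5:00:00:different:00',
    'distinct%3:00:00::'            => 'distinct%3:00:00::',
    'distinct%3:00:00:separate:00'  => 'distinct%5:00:00:separate:00',
    'distinct%3:00:00:definite:00'  => 'distinct%5:00:00:definite:00',
    'distinct%3:00:00:clear:00'     => 'distinct%5:00:00:clear:00',
);

for my $web (sort keys %expected) {
    my $fixed = fix_adj_type($web);
    printf "%-32s %s\n", $fixed,
        $fixed eq $expected{$web} ? 'matches' : 'differs';
}
```

Keys for other parts of speech (for example "infer%2:31:01::") pass through unchanged, since only a %3 prefix is ever rewritten.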

Best,
Ben

Ambikesh Jayal

Apr 5, 2010, 10:38:45 PM
to wn-...@googlegroups.com, linasv...@gmail.com, Ted Pedersen, si...@cs.utah.edu, sata...@gmail.com, senserelate...@lists.sourceforge.net, senserelate-users
Hi Ben,

>> However, the output from WordNet::SenseKey seems to be a bit different from the corresponding value shown by the WordNet web
>> interface. For example, for the sense number "distinct#a#1", WordNet::SenseKey shows the sense key as
>> "distinct%5:00:00:different:00", whereas the WordNet web interface shows the sense key as "distinct%3:00:00:different:00".

>The web interface is incorrect in the cases you list. 

You mean the web interface should have shown "distinct%3:00:00:different:00" for the sense number "distinct#a#1"?


Regards,
Ambikesh Jayal.
PhD Student (Final Year)

School of IS, Computing & Maths,
Brunel University,
Uxbridge, UB8 3PH,
United Kingdom.

Benjamin R. Haskell

Apr 5, 2010, 10:51:56 PM
to wn-...@googlegroups.com, linasv...@gmail.com, Ted Pedersen, si...@cs.utah.edu, sata...@gmail.com, senserelate...@lists.sourceforge.net, senserelate-users
On Tue, 6 Apr 2010, Ambikesh Jayal wrote:

> Hi Ben,
> >>However, the output from WordNet::SenseKey seems to be a bit different
> >>from the corresponding value shown by the WordNet web interface.
> >>For example, for the sense number "distinct#a#1",
> >>WordNet::SenseKey shows the sense key as "distinct%5:00:00:different:00",
> >>whereas the WordNet web interface shows the sense key as
> >>"distinct%3:00:00:different:00".
>
> >The web interface is incorrect in the cases you list. 
>
> You mean the web interface should have
> shown "distinct%3:00:00:different:00" for the sense
> number "distinct#a#1"?

No. The correct sense key has a 5 if it lists a headword. Any key that
*doesn't* end with '::' (which denotes two empty colon-separated fields)
should have a 5.

distinct#a#2 is the only sense of 'distinct' that should be a '3'
(adjective) rather than a '5' (satellite). (Assuming the sense-ordering
is correct in the web interface.) So, most of the adjectives appear to
be wrong. I'm fairly certain that's to do with the server change (which
happened long ago now), because we used sense keys internally for many
things. (Since they're relatively stable, as opposed to sense numbers
which are less stable, or file-offsets which are almost entirely
unstable).
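
To make the layout concrete, here is a minimal sketch that splits a sense key into its fields and applies the rule above. It assumes the lemma%ss_type:lex_filenum:lex_id:head_word:head_id layout, with field names taken from the WordNet sense index documentation; parse_sense_key and expected_ss_type are hypothetical helper names, not functions of any existing module.

```perl
use strict;
use warnings;

# Split a key of the form lemma%ss_type:lex_filenum:lex_id:head_word:head_id
# into named fields.
sub parse_sense_key {
    my ($key) = @_;
    my ($lemma, $rest) = split /%/, $key, 2;
    # The -1 limit keeps trailing empty fields, so '::' gives two empty strings.
    my ($ss_type, $lex_filenum, $lex_id, $head_word, $head_id)
        = split /:/, $rest, -1;
    return {
        lemma       => $lemma,
        ss_type     => $ss_type,
        lex_filenum => $lex_filenum,
        lex_id      => $lex_id,
        head_word   => $head_word,
        head_id     => $head_id,
    };
}

# Hypothetical helper: the ss_type a key should carry under the rule above.
# An adjective key (3) that names a head word is really a satellite (5).
sub expected_ss_type {
    my ($key) = @_;
    my $f = parse_sense_key($key);
    return 5 if $f->{ss_type} == 3 && length $f->{head_word};
    return $f->{ss_type};
}

for my $key ('distinct%3:00:00:different:00', 'distinct%3:00:00::') {
    printf "%s -> ss_type %s\n", $key, expected_ss_type($key);
}
# prints:
# distinct%3:00:00:different:00 -> ss_type 5
# distinct%3:00:00:: -> ss_type 3
```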

Best,
Ben

Ambikesh Jayal

Apr 5, 2010, 11:03:37 PM
to wn-...@googlegroups.com, linasv...@gmail.com, Ted Pedersen, si...@cs.utah.edu, sata...@gmail.com, senserelate...@lists.sourceforge.net, senserelate-users
Thanks, Ben. Your reply is very helpful and much appreciated.


Regards,
Ambikesh Jayal.
School of IS, Computing & Maths,
Brunel University,
Uxbridge, UB8 3PH,
United Kingdom.
Email: ambi...@gmail.com

