Newbie question: read contents of synset

abligh

Nov 18, 2009, 12:35:08 PM
to wn-perl
I have a small perl snippet below. It fails to find synonyms of the
word "beautiful", printing:
Form: beautiful#a Sense: beautiful#a#1 Synset: beautiful#a#1
Form: beautiful#a Sense: beautiful#a#2 Synset: beautiful#a#2

I would have thought querySense("beautiful#a#1","syns") would return
an array containing the words after "==>" in the output of
"/usr/local/WordNet-3.0/bin/wn 'beautiful' -synsa", i.e. starting with
"beauteous". Instead the synset only seems to contain the word itself.

What am I doing wrong?

Alex



#!/usr/bin/perl
use strict;
use warnings;
use WordNet::QueryData;

my $wn = WordNet::QueryData->new( noload => 1 );

# Print the 'syns' relation for every sense of every valid form of a word.
sub thesaurus
{
    my $word = shift @_;
    my @forms = $wn->validForms($word);

    foreach my $form (@forms)
    {
        # Given a word#pos form, querySense lists its senses (word#pos#n).
        my @senses = $wn->querySense($form);
        foreach my $sense (@senses)
        {
            my @syns = $wn->querySense($sense, "syns");
            print STDERR "Form: $form Sense: $sense Synset: ",
                join(", ", @syns), "\n";
        }
    }
}

# thesaurus() prints its results itself, so call it directly.
thesaurus("beautiful");

Jason Rennie

Nov 18, 2009, 12:49:01 PM
to wn-...@googlegroups.com
I don't have the wn binary installed, but I can tell you that "beauteous" is related to the beautiful#a#1 synset via the "similar to" WordNet relation. You're not doing anything wrong AFAICT: QueryData is returning the correct information ("beautiful" has two synsets, each containing only the word "beautiful").


QueryData is not a replacement for the wn binary.  QD is an interface to the WordNet database.  It helps to understand some of the internal WN structure.  The QD man page provides some of this information.  I imagine others here can provide pointers to more detailed information about the database structure.

Jason

Benjamin R. Haskell

Nov 18, 2009, 1:09:08 PM
to wn-perl
On Wed, 18 Nov 2009, abligh wrote:

> I would have thought querySense("beautiful#a#1","syns") would return an
> array containing the words after "==>" in the output of
> "/usr/local/WordNet-3.0/bin/wn 'beautiful' -synsa", i.e. starting with
> "beauteous". Instead the synset only seems to contain the word itself.
>
> What am I doing wrong?

The words after '==>' in the output of wn beautiful -synsa are probably
'satellite' synsets, rather than synonyms. The organization of most
adjectives is different than the other parts of speech. Usually there is
a pair of roughly-antonymous 'head' synsets, each with related 'satellite'
synsets. The terminology comes from the arrangement:

         satellite1    satellite1
                |        |
satellite2 -- head1 -- head2 -- satellite2
                |        |
         satellite3    satellite3

e.g. in the 'beautiful' case,

head1: { beautiful -- (delighting the senses ... }
    satellite1: { beauteous }
    satellite2: { bonny, bonnie, comely, fair, sightly }
    satellite3: { dishy }
    ...
vs.
head2: { ugly -- (displeasing to the senses ... }
    satellite1: { disfigured }
    satellite2: { evil-looking } # <-- really?
    satellite3: { fugly }


The 'syns' relation is only for 'true' synonyms (words in the same
synset). The 'sim' relation returns the satellites (if the synset's a
head) or the head (if the synset's a satellite). And then you'd want to
call 'syns' on the returned results.

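#!/usr/bin/perl
use strict;
use warnings;

use WordNet::QueryData;
my $wn = WordNet::QueryData->new( noload => 1 );

sub thesaurus {
    # word -> word#pos forms -> word#pos#sense entries
    my @forms = map { $wn->validForms($_) } @_;
    my @sense = map { $wn->querySense($_) } @forms;
    # For adjective senses, follow 'sim' to reach the satellites (or the head).
    my @adj = map { $wn->querySense($_, 'sim') } grep { /#a#/ } @sense;
    # Collect the members of every synset found, directly and via 'sim'.
    return map { $wn->querySense($_, 'syns') } @sense, @adj;
}

print "Syns: ", join("\t", thesaurus("beautiful")), "\n";
__END__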
prints:
Syns: beautiful#a#1 beautiful#a#2 beauteous#a#1 bonny#a#1
bonnie#a#1 comely#a#2 fair#a#3 sightly#a#1 dishy#a#1
exquisite#a#4 fine-looking#a#1 good-looking#a#1
better-looking#a#1 handsome#a#1 well-favored#a#1
well-favoured#a#1 glorious#a#3 resplendent#a#1 splendid#a#1
splendiferous#a#1 gorgeous#a#1 lovely#a#1 picturesque#a#1
pretty#a#1 pretty-pretty#a#1 pulchritudinous#a#1
ravishing#a#1 scenic#a#1 stunning#a#4 pleasant#a#1
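
For anyone without a WordNet install handy, the difference between 'syns' and 'sim' can be sketched with a toy in-memory table. This is only an illustration under stated assumptions: the %synset hash and the related() helper are hypothetical and are not part of WordNet::QueryData, and the table holds just the head and the three satellites listed above, not the real database.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Toy stand-in for the adjective layout (NOT the QueryData API).
# Each sense maps to the members of its own synset ('syns') and to
# the synsets linked to it by 'similar to' ('sim').
my %synset = (
    'beautiful#a#1' => {
        syns => ['beautiful#a#1'],
        sim  => ['beauteous#a#1', 'bonny#a#1', 'dishy#a#1'],
    },
    'beauteous#a#1' => {
        syns => ['beauteous#a#1'],
        sim  => ['beautiful#a#1'],
    },
    'bonny#a#1' => {
        syns => ['bonny#a#1', 'bonnie#a#1', 'comely#a#2',
                 'fair#a#3', 'sightly#a#1'],
        sim  => ['beautiful#a#1'],
    },
    'dishy#a#1' => {
        syns => ['dishy#a#1'],
        sim  => ['beautiful#a#1'],
    },
);

# Hypothetical helper: look up one relation for one sense in the toy table.
sub related {
    my ($sense, $relation) = @_;
    my $entry = $synset{$sense} or return;
    return @{ $entry->{$relation} || [] };
}

# 'syns' alone never leaves the head synset.
my @direct = related('beautiful#a#1', 'syns');

# Following 'sim' first, then 'syns', reaches the words in the satellites.
my @via_sim = map { related($_, 'syns') } related('beautiful#a#1', 'sim');

print "syns only:     @direct\n";
print "sim then syns: @via_sim\n";
```

The head's own synset contains a single word, which is why the original query returned only "beautiful"; the other words only show up after hopping through 'sim'.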

abligh

Nov 20, 2009, 5:03:24 AM
to wn-perl
> The words after '==>' in the output of wn beautiful -synsa are probably
> 'satellite' synsets, rather than synonyms.  The organization of most
> adjectives is different than the other parts of speech.  Usually there is
> a pair of roughly-antonymous 'head' synsets, each with related 'satellite'
> synsets (the terminology coming from the arrangement:

Thanks, that's very helpful.