Computing similarity between concepts (not words)


vijay

Jun 7, 2012, 11:15:03 AM
to ukblist
Hello,

I'd like to use PPVs (personalized PageRank vectors) to compute the
similarity between concepts (synsets). The similarity.pl script expects
word pairs and a dictionary. For my application, however, I already have
disambiguated concepts and do not want to use a dictionary; I just want
the cosine of the PPV vectors on the concept graph. How do I do this?
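
For reference, the cosine asked about here is the dot product of the two vectors divided by the product of their norms. Below is a minimal sketch, assuming each PPV has been dumped to a text file with one "concept value" pair per line. The file layout and the function name `cosine` are assumptions made for illustration, not ukb's documented output format.

```shell
# cosine FILE_A FILE_B
# Prints the cosine between two sparse vectors stored as "concept value" lines.
# Concepts missing from a file are treated as having value 0.
cosine() {
  awk '
    # First file: remember every value and accumulate its squared norm.
    NR == FNR { a[$1] = $2; na += $2 * $2; next }
    # Second file: accumulate its squared norm and the dot product.
    { nb += $2 * $2; if ($1 in a) dot += a[$1] * $2 }
    END {
      if (na == 0 || nb == 0) print "0.0000"
      else printf "%.4f\n", dot / sqrt(na * nb)
    }
  ' "$1" "$2"
}
```

Dropping the division by `sqrt(na * nb)` gives the plain dot product, which is what `--sim dot` requests in the commands in this thread.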

I tried using a dummy dictionary with just the concept ids as the
dictionary entries, but I'm not sure if this is the right approach.

Here is an example with words and a dictionary:
echo "abyssinian_cat alley_cat" | ./similarity.pl --sim dot --ukbargs "--concept_graph --only_ctx_words --only_synsets --dict_file ../lkb_sources/30/wnet30_dict.txt --kb_binfile ../bin/wnet30_rels.bin"
abyssinian_cat alley_cat 0.1137679974

To avoid using words, I created a dummy dictionary (test_dict.txt) that
maps each concept id to itself, and ran the script with the concept ids:
02124313 02124313-n:0
02122510 02122510-n:1

echo "02124313 02122510" | ./similarity.pl --sim dot --ukbargs "--concept_graph --only_synsets --dict_file test_dict.txt --kb_binfile ../bin/wnet30_rels.bin"
02124313 02122510 0.1137679974
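
A dummy dictionary like the one above can be generated rather than typed by hand. This is a minimal sketch, assuming one concept id per input line and a single POS letter for all of them. The function name `make_dummy_dict` is hypothetical, and the `:1` weight is an assumption (the hand-written entries above use `:0` and `:1`).

```shell
# make_dummy_dict POS
# Reads concept ids (one per line) on stdin and prints one dictionary entry
# per id in the form "ID ID-POS:1", so that each "word" maps to itself.
make_dummy_dict() {
  awk -v pos="$1" 'NF { printf "%s %s-%s:1\n", $1, $1, pos }'
}

# Example use (writes the two entries for the concepts in this thread):
#   printf '02124313\n02122510\n' | make_dummy_dict n > test_dict.txt
```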

As expected, I get the same result. However, I would like to avoid
using the dummy dictionary altogether.

Thanks!

Vijay Garla
Yale Computational Biology and Bioinformatics

Aitor Soroa

Jun 8, 2012, 8:46:21 AM
to ukb...@googlegroups.com
Hi vijay,

If you change line 86 of the similarity.pl script like this:

- print O $pair->[$x] . "#n#1#1" ;
+ print O $pair->[$x] . "##1#2" ;

you can use ukb without a dictionary. For instance, I am able to run:

$ echo "02124313-n 02122510-n" | ./similarity.pl -x ../ --sim dot --ukbargs "--concept_graph --kb_binfile graph/wnet30_rels.bin"
02124313-n 02122510-n 0.1137679992
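
The one-line change to similarity.pl described above can also be applied from the shell. This is a minimal sketch, assuming line 86 of your copy is the `print O` line shown in the diff. Line numbers may differ between versions, so check first with `sed -n '86p' similarity.pl`. The function name `patch_similarity` is hypothetical.

```shell
# patch_similarity FILE
# Saves a backup as FILE.orig, then rewrites line 86 of FILE so that the
# suffix "#n#1#1" becomes "##1#2". All other lines are left unchanged.
patch_similarity() {
  cp "$1" "$1.orig" &&
  sed '86s/"#n#1#1"/"##1#2"/' "$1.orig" > "$1"
}

# Example use:
#   patch_similarity ./similarity.pl
```

If line 86 does not contain `"#n#1#1"`, the substitution does nothing and the file is left identical to the backup.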

Please note that you have to explicitly add the POS tag to the input
pairs ("02124313-n" instead of "02124313", etc.). Also, the
--concept_graph option becomes mandatory for this approach to work.
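
If the concept ids are stored without the POS tag, the suffix can be appended before piping the pairs to similarity.pl. This is a minimal sketch, assuming whitespace-separated ids that all share the same POS. The function name `add_pos` is hypothetical.

```shell
# add_pos POS
# Appends "-POS" to every whitespace-separated id on each line of stdin.
add_pos() {
  awk -v pos="$1" '{ for (i = 1; i <= NF; i++) $i = $i "-" pos; print }'
}

# Example use:
#   echo "02124313 02122510" | add_pos n
# prints "02124313-n 02122510-n", which can then be piped to similarity.pl.
```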

hope this helps,
aitor

--
be well
aitor