Compiling the gloss file with compile_kb?

vict...@gmail.com

Sep 19, 2013, 9:33:47 AM
to ukb...@googlegroups.com

Hi everyone,
I am experimenting with the tool from the command line, and it works great in general. However, I am having trouble building the graph from both WordNet 3.0 and the gloss file with the following command:
./compile_kb -o wn30+gloss.bin wnet30_rels.txt wnet30g_rels.txt

The output of the above command is always "multiple occurrences", without further hints.

Any tips on how to proceed?

Thanks a lot in advance!

Victor Yan

Aitor Soroa

Sep 19, 2013, 10:42:19 AM
to ukb...@googlegroups.com
Hi Victor,

the proper way to create the graph is this:

cat wnet30_rels.txt wnet30g_rels.txt | ./compile_kb -o wn30+gloss.bin -

This behaviour changed in version 2.0 and is not properly
documented. Sorry for that.
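
For anyone scripting this step, the piped invocation can be wrapped in a small shell function; a minimal sketch follows. The `-o` option and the `-` argument (read from standard input) are taken from the command above. The function name `build_graph`, the `COMPILE_KB` variable, and the readability check are assumptions added for illustration and are not part of ukb.

```shell
#!/bin/sh
# Sketch: build one graph from several relation files by streaming them
# to compile_kb on standard input ("-").
# COMPILE_KB and build_graph are hypothetical names used for illustration.
COMPILE_KB="${COMPILE_KB:-./compile_kb}"

build_graph() {
    out="$1"; shift
    # Fail early if an input file is missing or unreadable, so that cat
    # does not feed a partial stream to compile_kb.
    for f in "$@"; do
        [ -r "$f" ] || { echo "missing input: $f" >&2; return 1; }
    done
    # Concatenate all relation files and pipe them in as a single stream.
    cat "$@" | "$COMPILE_KB" -o "$out" -
}

# Usage:
# build_graph wn30+gloss.bin wnet30_rels.txt wnet30g_rels.txt
```

The check runs before anything is piped, so a mistyped file name stops the build instead of silently producing a graph from only some of the inputs.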

best,
aitor

vict...@gmail.com

Sep 19, 2013, 3:32:56 PM
to ukb...@googlegroups.com
Thanks a lot, Aitor, that works like a charm :)

May I also ask if the old wn1.7 version still works substantially better than the 3.0 one as described in your paper? What suggestions would you give to ordinary users on the version to use?

Thanks,
Victor

Aitor Soroa

Sep 27, 2013, 4:19:19 AM
to ukb...@googlegroups.com
Hi Victor,

sorry for the delay.

> May I also ask if the old wn1.7 version still works substantially
> better than the 3.0 one as described in your paper? What suggestions
> would you give to ordinary users on the version to use?

In our experiments (described in [1]), wn1.7 worked better because the
gold standards ("Senseval 2 all words" and "Senseval 3 all words") are
annotated using wn1.7. On the other hand, the "Semeval 2007 all words"
and "Semeval 2007 coarse grained all words" datasets are annotated with
wn2.1, and for those datasets ukb performs best using the wn2.1+xwn
graph.

So in summary, the closer the graph you use is to the wn version the
dataset is annotated with, the better the result.

We also analyzed the impact of using wn3.0+glosses on those
datasets. Take a look at Section 6.4.3 in [1].

[1] Agirre E., Lopez de Lacalle O., Soroa A. 2013. Random Walks for
Knowledge-Based Word Sense Disambiguation. Computational Linguistics,
40:1. ISSN 0891-2017.

http://www.mitpressjournals.org/doi/pdf/10.1162/COLI_a_00164


hope this helps,
aitor

vict...@gmail.com

Sep 28, 2013, 5:28:07 AM
to ukb...@googlegroups.com
Hi Aitor,
That's very helpful, I'll have a further look at that. Thanks a ton!

Best,
Victor