Thanks for sharing this, although I guess what I see in that thread is
part of my question. In that discussion it says that the ordering of
the instances within each cluster is important (because it affects the
"flattening" process used in S-recall and S-precision). However, in
the gold standard data, the order of the instances in the cluster is
simply based on the instance number. That doesn't seem to really have
much to do with the "quality" or confidence of the clustering, and so
I'm puzzled as to why the gold standard data is ordered that way. For
example (and this is a simple made-up example, but it reflects what I
see in the gold standard data):
dog.n dog.n.3 dog.n.1
dog.n dog.n.4 dog.n.1
dog.n dog.n.6 dog.n.1
dog.n dog.n.1 dog.n.2
dog.n dog.n.2 dog.n.2
dog.n dog.n.5 dog.n.3
dog.n dog.n.7 dog.n.3
In the gold standard data, what I think I see is data that is sorted
by the third column (sense) and then by the second column (instance id).
I really don't see how that fits in with the idea of ordering the
within-cluster results based on some notion of confidence or quality.
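
For what it's worth, here is a minimal sketch of how that sort order
could be checked. It assumes each line has three whitespace-separated
columns (lemma, instance id, sense) and that the instance and sense
labels end in an integer, as in my made-up example above; the function
names are just mine, and the real gold files may need different
parsing.

```python
def parse_key_line(line):
    """Split a 'lemma instance sense' line and extract the trailing integers.

    Assumes labels look like 'dog.n.3' (an assumption based on the example).
    """
    lemma, instance, sense = line.split()
    instance_num = int(instance.rsplit(".", 1)[1])
    sense_num = int(sense.rsplit(".", 1)[1])
    return lemma, instance_num, sense_num


def is_sorted_by_sense_then_instance(lines):
    """Return True if lines are ordered by sense, then by instance number."""
    keys = [(sense, inst) for _, inst, sense in map(parse_key_line, lines)]
    return keys == sorted(keys)


example = [
    "dog.n dog.n.3 dog.n.1",
    "dog.n dog.n.4 dog.n.1",
    "dog.n dog.n.6 dog.n.1",
    "dog.n dog.n.1 dog.n.2",
    "dog.n dog.n.2 dog.n.2",
    "dog.n dog.n.5 dog.n.3",
    "dog.n dog.n.7 dog.n.3",
]

print(is_sorted_by_sense_then_instance(example))
```

If a check like this holds for every lemma in the gold data, then the
within-cluster order is fully determined by the instance number, which
is what puzzles me.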
"In order to perform the flattening procedure, WSD/WSI must provide
snippets in each cluster already sorted by the confidence according to
which the snippet belongs to the cluster, and must rank clusters
according to their diversity."
This is from the task page, but what I see in the gold data does not
seem to be organized by confidence, unless by some miracle confidence
is associated with the instance number. :) So my question really is,
how does the gold standard data reflect an ordering based on
"confidence", or what detail am I missing?
Thanks!
Ted