Hi, John. Could we use the slots in the truth data for training?

张东旭

Aug 20, 2014, 11:10:46 AM

to trec...@googlegroups.com

I just found out that the offsets in the truth data show where the query appears, not where the slot value is (which is what I had believed).

I am also confused by something you mentioned earlier:

>In the official release, we will include another script that generates the
>*initial* entity profiles in the same JSON schema as last year's
>filter-topics.json. The important thing is that this will only include
>the three special slots:
>
> * entity_type
>
> * canonical_name
>
> * external_profiles
>
>All of the other slots are considered evaluation data for SSF. They
>cannot be used for training in CCR nor SSF.

Could we use the slot values in the truth data for training?

If not, what can I use for SSF training?

(I have used the KBP2014 training data for some of the slots, but there are still 13 slot types that KBP doesn't have.)

John R. Frank

Aug 21, 2014, 6:52:23 AM

to trec...@googlegroups.com

> I just found out that the offsets mentioned in the truthdata showed
> us the place where the query appeared , instead of the slot value (What
> I once believed)

I see why that is confusing. The truth data is really only *coreference*
truth data. It is document-level coreference. The offsets are those found
by the NER tagger for the mention of the entity, and they have not been
adjusted by the human assessors.

> > In the official release, we will include another script that generates
> > the *initial* entity profiles in the same JSON schema as last year's
> > filter-topics.json. The important thing is that this will only
> > include the three special slots:
> >
> > * entity_type
> >
> > * canonical_name
> >
> > * external_profiles
> >
> > All of the other slots are considered evaluation data for SSF. They
> > cannot be used for training in CCR nor SSF.
>
> Could we use slot value in the truth data for training?

That is correct: the slot values in the truth data cannot be used for
training.

> If not, what can I use for SSF training? (I have used the KBP2014
> training data for some of slots, but there are still 13 types of
> slots KBP doesn't have)

Since the assessors were asked to focus on specific entities, we asked
them to capture slots that were natural for the entities. We tried to
audit those selections and guide them toward KBP and ACE slots, so I
believe there are at least close analogs for all of the slots in ACE and
KBP. If you find one that you believe is not close enough, let's discuss
it specifically.

John

张东旭

Aug 21, 2014, 7:57:56 AM

to trec...@googlegroups.com

Thanks, John!

I didn't know about ACE before; I'll look into it.

So we can't use the 'slot' information in the truth data for training SSF, is that right?

What can I use from the truth data for training in the SSF track?

On Thursday, August 21, 2014 at 6:52:23 PM UTC+8, John R. Frank wrote:

John R. Frank

Aug 21, 2014, 8:45:09 AM

to trec...@googlegroups.com

> Thanks, John! I didn't know ACE before, I'll figure it out. So, we can't
> use the 'slot' information in the truthdata for training ssf, is that
> right?

Correct.

> What can I use from truth data for training ssf track?

ACE and KBP have produced quite a few corpora with annotations that you
can use as truth data. There are also some trained systems that may be
fruitfully applied to KBA, such as the RelationFactory from UMass.

jrf
