> I just found out that the offsets in the truth data show
> the place where the query appeared, instead of the slot value (which is
> what I once believed).
I see why that is confusing. The truth data is really only *coreference*
truth data, at the document level. The offsets are the ones the NER
tagger found for the mention of the entity, and they have not been
adjusted by the human assessors.
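
As a rough illustration (the record layout and field names below are
hypothetical, not the actual truth data format, and I am treating the
offsets as character positions in a Python string), slicing a document by
a truth-data offset pair gives back the entity mention rather than a slot
value:

```python
# Hypothetical example: the text, field names, and offsets are made up.
doc_text = "Jane Doe founded Acme Corp in 1999."

# A coreference truth record points at the *mention of the entity*.
truth_record = {"entity": "Jane_Doe", "start": 0, "end": 8}


def mention_at(text, start, end):
    """Return the span of text covered by the half-open range [start, end)."""
    return text[start:end]


mention = mention_at(doc_text, truth_record["start"], truth_record["end"])
print(mention)
```

The slot fill in this toy sentence ("Acme Corp") sits elsewhere in the
text, and the record says nothing about where.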
> > In the official release, we will include another script that generates
> > the *initial* entity profiles in the same JSON schema as last year's
> > filter-topics.json. The important thing is that this will only
> > include the three special slots:
> >
> > * entity_type
> >
> > * canonical_name
> >
> > * external_profiles
> >
> > All of the other slots are considered evaluation data for SSF. They
> > cannot be used for training in either CCR or SSF.
>
> Could we use the slot values in the truth data for training?
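No, they cannot: as stated above, slot values in the truth data are
evaluation data for SSF and are off limits for training.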
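
To make the split concrete, here is a minimal sketch that separates the
three special slots from everything else in a profile. The flat dict
layout, the `split_profile` helper, and the `founder_of` slot are
assumptions for illustration only; the real schema follows last year's
filter-topics.json and may be shaped differently.

```python
import json

# The three special slots named in the quoted message above.
SPECIAL_SLOTS = ("entity_type", "canonical_name", "external_profiles")


def split_profile(profile):
    """Split a profile dict into (initial profile, held-out slots)."""
    initial = {k: v for k, v in profile.items() if k in SPECIAL_SLOTS}
    held_out = {k: v for k, v in profile.items() if k not in SPECIAL_SLOTS}
    return initial, held_out


# Hypothetical profile: every value here is made up.
profile = {
    "entity_type": "PER",
    "canonical_name": "Jane Doe",
    "external_profiles": ["https://example.org/jane_doe"],
    "founder_of": ["Acme Corp"],
}

initial, held_out = split_profile(profile)
print(json.dumps(initial, indent=2))
```

Only `initial` would be fair game as system input; everything in
`held_out` is evaluation data.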
> If not, what can I use for SSF training? (I have used the KBP2014
> training data for some of the slots, but there are still 13 slot
> types that KBP doesn't have.)
Since the assessors were asked to focus on specific entities, we asked
them to capture slots that were natural for the entities. We tried to
audit those selections and guide them toward KBP and ACE slots, so I
believe ACE and KBP contain at least a close analog for every slot. If
you find one that you believe is not close enough, let's discuss it
specifically.
John