Thanks for your interest in the LM-KBC challenge. I hope this answers your questions sufficiently. If you have any further questions, don’t hesitate to ask us again.
- You are not limited to the Wikidata API. The entity disambiguation can be seen as part of the tasks in both tracks. As a baseline method, you could just use the Wikidata API. For a higher F1 score, you might try to develop your own disambiguation method.
- Yes, this is allowed.
- This is a more difficult question. In your case, we believe that fine-tuning on Wikidata5m might lead to data leakage; hence, using it for the fact prediction component is not allowed. The same holds for other Wikidata-based datasets. However, it would be okay to fine-tune your component on text-based datasets, and it would also be okay to use Wikidata5m or another Wikidata-based dataset for the entity disambiguation method.
- Yes, using these kinds of models is allowed! The only important point to keep in mind is that your model should not be trained on Wikidata.
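As a concrete starting point for the Wikidata API baseline mentioned above, here is a minimal sketch of entity disambiguation via the public `wbsearchentities` endpoint. The function names are ours, and simply taking the top-ranked candidate is a deliberately naive strategy; a custom disambiguation method could rerank candidates using context.

```python
import json
import urllib.parse
import urllib.request

WBSEARCH_ENDPOINT = "https://www.wikidata.org/w/api.php"

def build_search_url(label, language="en", limit=5):
    """Build a wbsearchentities query URL for a surface form."""
    params = {
        "action": "wbsearchentities",
        "search": label,
        "language": language,
        "format": "json",
        "limit": str(limit),
    }
    return WBSEARCH_ENDPOINT + "?" + urllib.parse.urlencode(params)

def disambiguate(label, language="en"):
    """Return the QID of the top-ranked candidate for a label, or None.

    Naive baseline: trusts Wikidata's own ranking of search results.
    """
    url = build_search_url(label, language)
    with urllib.request.urlopen(url, timeout=10) as resp:
        data = json.load(resp)
    candidates = data.get("search", [])
    return candidates[0]["id"] if candidates else None
```

For example, `disambiguate("Berlin")` would return the QID of the top search hit; where the top hit is wrong (ambiguous labels like "Paris"), that is exactly where a tailored method can improve F1.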
As a general guideline, we recommend self-assessing whether the usage of a certain model or dataset is fair or not. If in doubt, just drop us another message.
Best regards,
The LM-KBC Team