Hi Jan,
> On 27. Apr 2022, at 12:11, Jan Matti Dollbaum <janmatti...@gmail.com> wrote:
>
> We have started using the recommender function, which works quite well already, but which would be even more helpful if it were able to draw upon the annotations of earlier projects. Is there a way to transfer this knowledge so that it is available to the recommender in each new project that we set up?
The built-in INCEpTION recommenders learn from annotated texts. Bootstrapping them usually works by importing pre-annotated texts and blocking the annotators from working on these texts, so that only the recommenders learn from them. This way, the recommenders can learn from the pre-annotated data as well as from the additional training data provided by the annotators.
If you have a significant amount of annotated data - in particular if it is spread over multiple projects - then the recommended approach would be to export that data and to externally train a model from it. Then hook that model up to INCEpTION as an external recommender. The downside here is that this particular recommender usually won't be trainable anymore and won't learn from additional annotations. But if your training data is good enough, then that is not a problem. Also, you can organize your annotation into iterations and export the newly annotated data after each iteration, re-train the external model and update it.
You'd export the project as XMI CAS, use dkpro-cassis [1] to extract the training data and transform it into whatever your ML library needs, train a model with that library, and then use the external recommender API [2] to hook up your model. As for the ML library, maybe try spaCy or scikit-learn (sklearn). The external recommender project includes an example `SklearnMentionDetector` that is CRF-based and might be suitable for adaptation.
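As a rough sketch of the "transform" step: many sequence taggers, CRF-based ones included, expect one BIO tag per token, so the span annotations need to be projected onto tokens. The function name `spans_to_bio` and the hard-coded offsets below are made up for illustration. In practice you would read the token and annotation offsets from the exported XMI with dkpro-cassis, and the layer and feature names depend on your project.

```python
from typing import List, Tuple

Span = Tuple[int, int]              # (begin, end) character offsets
LabeledSpan = Tuple[int, int, str]  # (begin, end, label)


def spans_to_bio(tokens: List[Span], entities: List[LabeledSpan]) -> List[str]:
    """Assign one BIO tag per token based on the entity spans covering it.

    Hypothetical helper, not part of dkpro-cassis or INCEpTION.
    Assumes entity spans do not overlap each other.
    """
    tags = ["O"] * len(tokens)
    for ent_begin, ent_end, label in entities:
        inside = False
        for i, (tok_begin, tok_end) in enumerate(tokens):
            # A token belongs to the entity if it lies fully within the span.
            if tok_begin >= ent_begin and tok_end <= ent_end:
                tags[i] = ("I-" if inside else "B-") + label
                inside = True
    return tags


# Toy data standing in for offsets read from an exported document.
text = "Angela Merkel visited Paris ."
tokens = [(0, 6), (7, 13), (14, 21), (22, 27), (28, 29)]
entities = [(0, 13, "PER"), (22, 27, "LOC")]

tags = spans_to_bio(tokens, entities)
for (begin, end), tag in zip(tokens, tags):
    print(text[begin:end], tag)
```

The resulting token/tag pairs are what you would then turn into features for the ML library of your choice.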
Cheers,
-- Richard
[1] https://github.com/dkpro/dkpro-cassis
[2] https://github.com/inception-project/inception-external-recommender