Transfer accumulated recommender knowledge between projects


Jan Matti Dollbaum

Apr 27, 2022, 6:12:03 AM
to incepti...@googlegroups.com

Dear INCEpTION team,

for reasons of data storage and management, we are currently organizing our texts into relatively small projects that are easy to monitor, export and reimport. We have started using the recommender function, which works quite well already, but which would be even more helpful if it were able to draw upon the annotations of earlier projects. Is there a way to transfer this knowledge so that it is available to the recommender in each new project that we set up?

Many thanks and best wishes,

Jan

Richard Eckart de Castilho

Apr 28, 2022, 2:53:10 AM
to inception-users
Hi Jan,

> On 27. Apr 2022, at 12:11, Jan Matti Dollbaum <janmatti...@gmail.com> wrote:
>
> We have started using the recommender function, which works quite well already, but which would be even more helpful if it were able to draw upon the annotations of earlier projects. Is there a way to transfer this knowledge so that it is available to the recommender in each new project that we set up?

The built-in INCEpTION recommenders learn from annotated texts. Bootstrapping for them usually works by importing pre-annotated texts, blocking the annotators from using these texts such that only the recommenders learn from them. This way, the recommenders can learn from the pre-annotated data as well as from the additional training data provided by the annotators.

If you have a significant amount of annotated data - in particular if it is spread over multiple projects - then the recommended approach would be to export that data and to externally train a model from it. Then hook that model up to INCEpTION as an external recommender. The downside here is that this particular recommender usually won't be trainable anymore and won't learn from additional annotations. But if your training data is good enough, then that is not a problem. Also, you can organize your annotation into iterations and export the newly annotated data after each iteration, re-train the external model and update it.

You'd export the project as XMI CAS and use dkpro-cassis [1] to extract the training data and transform it into whatever your ML library needs. Then you'd use the ML library to train a model, and finally use the external recommender API [2] to hook up your model. As for the ML library, maybe try spaCy or scikit-learn. The external recommender project includes an example `SklearnMentionDetector` that is CRF-based and might be suitable for adaptation.
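As a rough illustration of the "transform it into whatever your ML library needs" step, here is a minimal sketch that turns character-offset annotations into per-token BIO tags, a format many sequence taggers (CRF-based ones included) accept. It uses only the Python standard library. The function name `spans_to_bio`, the tuple layouts and the sample sentence are assumptions made for this example; in a real pipeline the token and annotation offsets would be read from the exported XMI (for instance with dkpro-cassis), which is not shown here.

```python
from typing import List, Tuple

Token = Tuple[int, int]       # (begin, end) character offsets of a token
Span = Tuple[int, int, str]   # (begin, end, label) of an annotation


def spans_to_bio(tokens: List[Token], spans: List[Span]) -> List[str]:
    """Return one BIO tag per token.

    A token receives a tag only if it lies completely inside a span.
    Overlapping spans are not handled specially: a later span simply
    overwrites the tags written by an earlier one.
    """
    tags = ["O"] * len(tokens)
    for span_begin, span_end, label in spans:
        first = True
        for i, (tok_begin, tok_end) in enumerate(tokens):
            if tok_begin >= span_begin and tok_end <= span_end:
                tags[i] = ("B-" if first else "I-") + label
                first = False
    return tags


# Hypothetical sample data standing in for one exported sentence.
text = "Angela Merkel visited Paris"
tokens = [(0, 6), (7, 13), (14, 21), (22, 27)]
spans = [(0, 13, "PER"), (22, 27, "LOC")]

tags = spans_to_bio(tokens, spans)
for (begin, end), tag in zip(tokens, tags):
    print(text[begin:end], tag)
```

The resulting token/tag pairs can then be pooled across all exported projects before training, which is what makes the knowledge from earlier projects available to the external model.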

Cheers,

-- Richard

[1] https://github.com/dkpro/dkpro-cassis
[2] https://github.com/inception-project/inception-external-recommender

Jan Matti Dollbaum

Apr 28, 2022, 3:47:20 AM
to incepti...@googlegroups.com
Hi Richard,

thanks a lot. It seems, then, that we have two options: importing pre-annotated texts into the existing projects and blocking them for annotators, or training an external recommender. We are working on a completely external solution that would produce automatically pre-annotated XMIs to import into INCEpTION, but we can probably adapt the pipeline according to your suggestions. We will be in touch in case further questions arise.

Thanks again!
Jan

Richard Eckart de Castilho

Apr 28, 2022, 3:59:27 AM
to incepti...@googlegroups.com
Hi,

> On 28. Apr 2022, at 09:47, Jan Matti Dollbaum <janmatti...@gmail.com> wrote:
>
> We are working on a completely external solution that would produce automatically pre-annotated xmis to import into INCEpTION but probably can twist the pipeline according to your suggestions.

importing pre-annotated documents works as well, if you can do it. I believe the main difference from a recommender-based approach is that with pre-imported annotations, annotators can more easily overlook annotations that need correcting, whereas with recommenders, annotators have to make a conscious decision to accept, reject or correct each suggestion.

Cheers,

-- Richard