Using BratReader to read files with unknown labels


Alain Désilets

Oct 7, 2019, 9:24:40 AM
to dkpro-core-user
Here is the problem I am facing. 

I need to write a sentence classification pipeline that takes a set of brat files in which some of the sentences have been labelled with brat tags. The pipeline takes as input a list of strings, which are the labels that the classifier must learn to apply.

My problem is that, given the current design of BratReader, those labels have to be mapped to an annotation class. Whenever BratReader encounters a label that is not mapped to a UIMA annotation, it raises an exception.

But my users do not have the ability to define new UIMA annotation classes. They can only create new brat labels and apply them to the files.

I was wondering if there would be a way to have unknown labels assigned to a "generic" UIMA annotation class, say "Label", where getValue() of a Label would return the corresponding brat label.

Does this capability already exist?

If not, would it be worth it for me to create it?

Thx.

Richard Eckart de Castilho

Oct 7, 2019, 10:58:31 AM
to dkpro-core-user
On 7. Oct 2019, at 15:24, Alain Désilets <alainde...@gmail.com> wrote:
>
> I was wondering if there would be a way to have unknown labels assigned to a "generic" UIMA annotation class, say "Label", where getValue() of a Label would return the corresponding brat label.
>
> Does this capability already exist?

I think we have this. DKPro Core master has recently received an upgrade to the brat -> UIMA mapping capabilities - check out:

https://github.com/dkpro/dkpro-core/blob/a73a72ed5db1c876f753c1b54c5718babced9335/dkpro-core-io-brat-asl/src/test/java/org/dkpro/core/io/brat/BratReaderWriterTest.java#L184

I think what you are looking for is the "subCatFeature", which tells the BratReader to put the brat label into the specified feature of the target UIMA type. I'm afraid there is no documentation for this so far, but the mapping configuration classes in

https://github.com/dkpro/dkpro-core/tree/a73a72ed5db1c876f753c1b54c5718babced9335/dkpro-core-io-brat-asl/src/main/java/org/dkpro/core/io/brat/internal/mapping

and the unit test mentioned above may give some clues.
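
To make the idea concrete, here is a minimal plain-Java sketch of "one generic type, label stored in a feature". It is an illustration only: it does not use the BratReader or the DKPro Core mapping API, and GenericBratLabels and Label are hypothetical names invented for this example. It assumes the usual brat standoff layout for text-bound annotations (ID, tab, label and offsets, tab, covered text).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration, not part of DKPro Core.
class GenericBratLabels {

    // Stand-in for a generic annotation type whose "value" holds the brat label.
    static final class Label {
        private final int begin;
        private final int end;
        private final String value;

        Label(int begin, int end, String value) {
            this.begin = begin;
            this.end = end;
            this.value = value;
        }

        int getBegin() { return begin; }
        int getEnd() { return end; }
        String getValue() { return value; }
    }

    // Turns every text-bound line ("T..." IDs) into a Label, whatever its brat label is.
    static List<Label> parse(List<String> annLines) {
        List<Label> labels = new ArrayList<>();
        for (String line : annLines) {
            if (!line.startsWith("T")) {
                continue; // relations, events, attributes, notes are ignored here
            }
            String[] fields = line.split("\t");
            if (fields.length < 2) {
                continue; // malformed line, skipped in this sketch
            }
            // Second field: "<label> <begin> <end>", possibly with ";"-separated fragments.
            String[] head = fields[1].split("[ ;]");
            if (head.length < 3) {
                continue;
            }
            int begin = Integer.parseInt(head[1]);             // start of first fragment
            int end = Integer.parseInt(head[head.length - 1]); // end of last fragment
            labels.add(new Label(begin, end, head[0]));
        }
        return labels;
    }

    public static void main(String[] args) {
        List<String> ann = Arrays.asList(
                "T1\tClaim 0 12\tSome example",
                "T2\tMyNewLabel 13 30\tanother sentence.");
        for (Label label : parse(ann)) {
            System.out.println(label.getValue() + " [" + label.getBegin() + "," + label.getEnd() + ")");
        }
    }
}
```

With this shape, a label newly invented in brat needs no new class: the classifier code only compares getValue() against the list of label strings it was given.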

Cheers,

-- Richard