Need Help Importing Brat Project into Inception


Filipe Cunha

Feb 14, 2024, 8:20:47 AM
to inception-users
Hi everyone,

I'm currently working on a project which uses Brat for text annotation (several entities, relations and attributes). However, we are now considering transitioning to Inception for its potential enhanced features. 
In this regard, I would like to ask some questions:

Is Inception able to import brat projects out of the gate? 
If not, what would be the main challenges associated with this transition?
Is there any documentation/guides on this transition that we can follow?

Thank you in advance for your assistance!

Best regards,
Filipe Cunha

Richard Eckart de Castilho

Feb 16, 2024, 12:31:16 PM
to inception-users
Hi Filipe

> On 14. Feb 2024, at 13:58, Filipe Cunha <ryz...@gmail.com> wrote:
>
> I'm currently working on a project which uses Brat for text annotation (several entities, relations and attributes). However, we are now considering transitioning to Inception for its potential enhanced features.
> In this regard, I would like to ask some questions:
>
> Is Inception able to import brat projects out of the gate?

Not at the moment. The annotation scheme design philosophies of INCEpTION and brat are quite different. An import would require defining a mapping from brat to a (custom) annotation scheme defined in INCEpTION, and that is not viable via the UI.

> If not, what would be the main challenges associated with this transition?

The design philosophy mainly. Annotation schemes defined by people using UIMA (INCEpTION)
tend to be more differentiated than in brat.

If you read the DKPro Core Reader documentation, you might get an idea:

https://dkpro.github.io/dkpro-core/releases/2.2.0/docs/format-reference.html#format-Brat

That said, philosophy evolves. I could imagine that INCEpTION moves towards a situation where a special "brat compatible" project template could be included that would facilitate the import of brat data at the expense of being able to define a custom schema mapping.

> Is there any documentation/guides on this transition that we can follow?

- DKPro Core includes a UIMA-based reader and writer component for the brat format.
This could be used to write a Java program that reads brat and writes UIMA CAS that
INCEpTION may consume - but the process is not trivial.

- Alternatively, DKPro Cassis could help with writing to the UIMA CAS
format in a Python-based script. You still need some Python to load and map the
brat format, though.
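
As an illustration of the "load" half of that Python route, here is a minimal sketch of a reader for brat standoff `.ann` files. It only handles text-bound annotations (`T`), binary relations (`R`) and attributes (`A`/`M`); events, notes and normalizations are ignored. The `BratDocument` class, the `parse_ann` function and the sample data (including the attribute name `Polarity`) are hypothetical and not taken from the attached files. Mapping the result to a UIMA CAS, for example with DKPro Cassis, is not shown.

```python
from dataclasses import dataclass, field


@dataclass
class BratDocument:
    """Parsed content of one brat .ann file (hypothetical helper class)."""
    spans: dict = field(default_factory=dict)       # "T1" -> (type, [(start, end), ...], text)
    relations: dict = field(default_factory=dict)   # "R1" -> (type, arg1_id, arg2_id)
    attributes: dict = field(default_factory=dict)  # "A1" -> (name, target_id, value)


def parse_ann(ann_text: str) -> BratDocument:
    """Parse T, R and A/M lines of brat standoff; other line types are skipped."""
    doc = BratDocument()
    for line in ann_text.splitlines():
        if not line.strip():
            continue
        ann_id, _, rest = line.partition("\t")
        if ann_id.startswith("T"):
            # T1<TAB>Type start end[;start end ...]<TAB>covered text
            head, _, covered = rest.partition("\t")
            span_type, _, offsets = head.partition(" ")
            fragments = [tuple(int(n) for n in frag.split())
                         for frag in offsets.split(";")]
            doc.spans[ann_id] = (span_type, fragments, covered)
        elif ann_id.startswith("R"):
            # R1<TAB>Type Arg1:T1 Arg2:T2
            rel_type, arg1, arg2 = rest.split()[:3]
            doc.relations[ann_id] = (rel_type,
                                     arg1.split(":", 1)[1],
                                     arg2.split(":", 1)[1])
        elif ann_id.startswith(("A", "M")):
            # A1<TAB>Name T1 [value]; binary attributes carry no value
            parts = rest.split()
            value = parts[2] if len(parts) > 2 else True
            doc.attributes[ann_id] = (parts[0], parts[1], value)
    return doc


# Made-up sample for the text "GNR ended the party"
SAMPLE_ANN = (
    "T1\tParticipant 0 3\tGNR\n"
    "T2\tEvent 4 9\tended\n"
    "R1\tSRLINK_agent Arg1:T2 Arg2:T1\n"
    "A1\tPolarity T2 Pos\n"
)

if __name__ == "__main__":
    print(parse_ann(SAMPLE_ANN))
```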

Do you have an example of what your brat data looks like (i.e. how simple/complex it is)? Maybe that could inform a way towards facilitating brat import into INCEpTION in the future.

Cheers,

-- Richard

Filipe Cunha

Feb 27, 2024, 9:24:41 AM
to inception-users
Thank you for your reply Richard,
Here is an example of a text file and corresponding annotation file generated in Brat.

Best Regards,
Filipe 

sample.ann
sample.txt

Richard Eckart de Castilho

Mar 10, 2024, 4:39:57 PM
to incepti...@googlegroups.com
Hi Filipe,

is this what you would expect your data to look like in INCEpTION?

Cheers,

-- Richard

Screenshot 2024-03-10 at 21.36.38.png

Filipe Cunha

Mar 12, 2024, 6:40:13 AM
to inception-users
Hi Richard,
I might need to make some changes regarding the span annotation display order, but other than that it looks perfect.
Could you share your magic with me, please?

Cheers,
Filipe

Richard Eckart de Castilho

Mar 12, 2024, 8:11:21 AM
to incepti...@googlegroups.com
Hi Filipe,

> On 12. Mar 2024, at 11:40, Filipe Cunha <ryz...@gmail.com> wrote:
>
> I might need to make some changes regarding the span annotation display order, but other than that it looks perfect.
> Could you share your magic with me, please?

I have started a pull request here:

https://github.com/inception-project/inception/pull/4621

I think this may need a bit of additional work - I just don't know exactly what yet.
The most annoying part at the moment may be that brat data always comes as two
files (`.ann` and `.txt`) and INCEpTION requires that an annotation file is always
a single file. So in order to import a single brat document, you have to put the
`.ann` and the `.txt` file into a ZIP file and then import that. And you have to
have one such ZIP for every document - you cannot put multiple `.ann` and `.txt`
files into the same ZIP!
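
The one-ZIP-per-document packaging can be scripted. The following is a minimal sketch that assumes all `.ann` and `.txt` files sit side by side in one folder and share a base name; the function name `zip_brat_pairs` and the folder names in the usage line are hypothetical.

```python
import zipfile
from pathlib import Path


def zip_brat_pairs(source_dir: str, target_dir: str) -> list:
    """Create one ZIP per brat document, each holding exactly one .txt and one .ann."""
    src, dst = Path(source_dir), Path(target_dir)
    dst.mkdir(parents=True, exist_ok=True)
    created = []
    for ann in sorted(src.glob("*.ann")):
        txt = ann.with_suffix(".txt")
        if not txt.exists():
            continue  # skip annotation files without a matching text file
        archive = dst / (ann.stem + ".zip")
        with zipfile.ZipFile(archive, "w", zipfile.ZIP_DEFLATED) as zf:
            # Files are stored byte-for-byte, so line endings are not altered
            zf.write(txt, arcname=txt.name)
            zf.write(ann, arcname=ann.name)
        created.append(archive)
    return created


if __name__ == "__main__":
    for path in zip_brat_pairs("brat_data", "brat_zips"):
        print(path)
```

Each resulting ZIP would then be imported as a separate document.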

Also, the current implementation requires that you start with the "Basic annotation
(span/relation)" template or at least have the "basic span" and "basic relation"
layers in your project - because all brat data is mapped into these two layers.

Cheers,

-- Richard

Filipe Cunha

Mar 18, 2024, 7:47:20 PM
to inception-users
Hi again Richard,
I tried to follow your instructions, however, I am having trouble when importing the brat annotations.
I'm using Inception with Docker. I have used the following command to start the container:
$ docker run -it --name inception -v /srv/inception:/export -p8080:8080 ghcr.io/inception-project/inception:31.3

I created the  /srv/inception folder and the file "settings.properties" with format.brat.enabled=true before running the command above.
However, when importing the zip file (containing the txt file and ann file), the brat format is not present within the available options. (image below)
screenshot_.png

Best regards,
Filipe 

Richard Eckart de Castilho

Mar 19, 2024, 2:43:20 AM
to inception-users
Hi Filipe,

> On 19. Mar 2024, at 00:47, Filipe Cunha <ryz...@gmail.com> wrote:
>
> I tried to follow your instructions, however, I am having trouble when importing the brat annotations.
> I'm using Inception with Docker. I have used the following command to start the container:
> $ docker run -it --name inception -v /srv/inception:/export -p8080:8080 ghcr.io/inception-project/inception:31.3
>
> I created the /srv/inception folder and the file "settings.properties" with format.brat.enabled=true before running the command above.
> However, when importing the zip file (containing the txt file and ann file), the brat format is not present within the available options. (image below)<screenshot_.png>

The new feature is available only in 32.0-SNAPSHOT.

I have pushed an image here:

https://github.com/inception-project/inception/pkgs/container/inception-snapshots/192721365?tag=32.0-SNAPSHOT

Note that if you upgrade your installation to that image and later want to downgrade back to a 31.x release version,
there *might* be problems. So if you want to try it, I would advise doing so using a fresh data folder / database.
If you upgrade your existing system using this image, I would strongly recommend you make sure you have a backup
that you can go back to.

https://github.com/inception-project/inception/blob/main/inception/inception-doc/src/main/resources/META-INF/asciidoc/admin-guide/upgrade_backup.adoc

Cheers,

-- Richard

Filipe Cunha

Mar 19, 2024, 1:55:39 PM
to inception-users
Hello Richard,

I have pulled your image but still no luck. The Brat format option is not found when importing the zip file inside the project.
To be sure that the volume is working correctly, I have entered the container and confirmed that the settings.properties file was passed to the container to the "/export" path.
The file only contains one line: format.brat.enabled=true
Is there anything that I might be missing?
I tried to recreate the container from scratch from the image with version 32.0 and create a new inception project without any data but still no luck...

Thank you again
Filipe

Richard Eckart de Castilho

Mar 19, 2024, 1:57:22 PM
to inception-users
Hi,

> On 19. Mar 2024, at 18:55, Filipe Cunha <ryz...@gmail.com> wrote:
>
> I have pulled your image but still no luck. The Brat format option is not found when importing the zip file inside the project.
> To be sure that the volume is working correctly, I have entered the container and confirmed that the settings.properties file was passed to the container to the "/export" path.
> The file only contains one line: format.brat.enabled=true
> Is there anything that I might be missing?
> I tried to recreate the container from scratch from the image with version 32.0 and create a new inception project without any data but still no luck...

Look for a line like this when you start INCEpTION:

```
2024-03-19 07:35:45 INFO [main] [SYSTEM] boot - Settings: /.../inception/settings.yml (file exists)
```

If the line does not say "file exists", then your setting is not picked up.
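
That check can also be run against a saved log file. This is a minimal sketch that only looks for the `Settings:` boot line quoted above and the `(file exists)` suffix; what the line looks like when the file is missing is an assumption, and the function name `settings_status` is hypothetical.

```python
import re

# Matches the boot line shown above; the "(file exists)" suffix is optional
SETTINGS_LINE = re.compile(r"Settings: (?P<path>.+?)(?P<exists> \(file exists\))?$")


def settings_status(log_text: str):
    """Return (path, picked_up) for the first 'Settings:' line, or None if absent."""
    for line in log_text.splitlines():
        match = SETTINGS_LINE.search(line.rstrip())
        if match:
            return match.group("path"), match.group("exists") is not None
    return None


if __name__ == "__main__":
    sample = ("2024-03-19 07:35:45 INFO [main] [SYSTEM] boot - "
              "Settings: /.../inception/settings.yml (file exists)")
    print(settings_status(sample))
```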

-- Richard

Filipe Cunha

Mar 19, 2024, 7:14:23 PM
to inception-users
Hi,

I have that line.
Here you have my logs and the settings.properties file I'm using.

- Filipe

settings.properties
application.log

Richard Eckart de Castilho

Mar 20, 2024, 1:57:09 AM
to inception-users
Hi Filipe,

> On 20. Mar 2024, at 00:14, Filipe Cunha <ryz...@gmail.com> wrote:
>
> I have that line.
> Here you have my logs and the settings.properties file I'm using.

You know what - I just tried enabling brat directly via JAVA_OPTS when starting docker and
it also does not work. I am puzzled at this point, but since I can reproduce this, I can
debug it. Will get back to you once I get to the bottom of this...

Cheers,

-- Richard

Richard Eckart de Castilho

Mar 20, 2024, 2:20:49 PM
to inception-users
Hi again,

> On 20. Mar 2024, at 06:56, Richard Eckart de Castilho <richard...@gmail.com> wrote:
>
>> I have that line.
>> Here you have my logs and the settings.properties file I'm using.
>
> You know what - I just tried enabling brat directly via JAVA_OPTS when starting docker and
> it also does not work. I am puzzled at this point, but since I can reproduce this, I can
> debug it. Will get back to you once I got to the bottom of this...

Ok, my memory slipped - I had already changed the name of the properties:

- `format.brat-basic.enabled=true` - this is the one you want to use to enable brat import
- `format.brat-custom.enabled=true` - this one can be used to enable brat export, but the export strategy is quite different from the import strategy.
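
If the settings file is managed by a script, the two properties can be written as in the following minimal sketch. It assumes a plain `key=value` file; the function name `enable_brat_formats` and the `/srv/inception/settings.properties` path in the usage line (taken from the Docker volume used earlier in this thread) are illustrative only.

```python
from pathlib import Path


def enable_brat_formats(settings_path, enable_import=True, enable_export=True):
    """Set the brat format switches in a key=value settings file and return its lines."""
    path = Path(settings_path)
    wanted = {}
    if enable_import:
        wanted["format.brat-basic.enabled"] = "true"   # brat import
    if enable_export:
        wanted["format.brat-custom.enabled"] = "true"  # brat export
    lines = path.read_text(encoding="utf-8").splitlines() if path.exists() else []
    # Drop earlier definitions of the same keys, keep every other line untouched
    kept = [line for line in lines if line.split("=", 1)[0].strip() not in wanted]
    kept += [f"{key}={value}" for key, value in wanted.items()]
    path.write_text("\n".join(kept) + "\n", encoding="utf-8")
    return kept


if __name__ == "__main__":
    print(enable_brat_formats("/srv/inception/settings.properties"))
```

INCEpTION needs to be restarted afterwards so that the settings file is read again.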

Here is also a bit more documentation:

- https://github.com/inception-project/inception/blob/410fa10efdb40ef8d7d55d33efc35c4a6164e507/inception/inception-io-brat/src/main/resources/META-INF/asciidoc/user-guide/formats-brat-basic.adoc
- https://github.com/inception-project/inception/blob/410fa10efdb40ef8d7d55d33efc35c4a6164e507/inception/inception-io-brat/src/main/resources/META-INF/asciidoc/user-guide/formats-brat-custom.adoc

Cheers,

-- Richard

Filipe Cunha

Mar 20, 2024, 5:05:48 PM
to inception-users
Hi there, Richard,
I was able to select the brat format option to import my file sample (the one I gave you earlier). However, I cannot replicate the results you obtained on your screenshot.
When I import my files I get the following error:
error.png

Just to be sure,  I deleted the annotation associated with the spans 262-272. After that I was able to import the file, however, it seems that the annotation spans are not aligned correctly.
For example, if you compare your results with mine, you can see that the token "GNR" is annotated as a participant in your screenshot (as it should), contrary to what happens in my version:

missaligned annotations.png 

Thank you again for your support.
I hope we can find a way to make this work, and that this effort may help other people in the future.

- Filipe

Richard Eckart de Castilho

Mar 21, 2024, 1:20:36 AM
to incepti...@googlegroups.com
Hi,

> On 20. Mar 2024, at 22:05, Filipe Cunha <ryz...@gmail.com> wrote:
>
> I was able to select the brat format option to import my file sample (the one I gave you earlier). However, I cannot replicate the results you obtained on your screenshot.

what operating system are you using?

Cheers,

-- Richard

Richard Eckart de Castilho

Mar 21, 2024, 2:04:25 AM
to incepti...@googlegroups.com
Hi Filipe,

> On 20. Mar 2024, at 22:05, Filipe Cunha <ryz...@gmail.com> wrote:
>
> I was able to select the brat format option to import my file sample (the one I gave you earlier). However, I cannot replicate the results you obtained on your screenshot.
> When I import my files I get the following error:<error.png>
>
> Just to be sure, I deleted the annotation associated with the spans 262-272. After that I was able to import the file, however, it seems that the annotation spans are not aligned correctly.
> For example, if you compare your results with mine, you can see that the token "GNR" is annotated as a participant in your screenshot (as it should), contrary to what happens in my version:

So I tried this again and here is what I did (I'm on macOS):

* drag-and-dropped the example files attached to your email to a folder
* marked both of them, right-clicked and selected "Compress files" that created an "Archive.zip"
* created a fresh project in INCEpTION using the "Basic annotation (span/relation)" template
* Clicked on "import documents"
* Selected "brat (basic)"
* Selected the "Archive.zip" file and imported
* Clicked on "dashboard"
* Opened the annotation page

I did not make any changes to the files.

> Thank you again for your support.
> I hope we can find a way to make this work, and that this effort may help other people in the future.

My suspicion is that one of two things happened on your side (I suspect you might be using Windows):

1) there is a platform-specific behavior in my brat reader code that works differently on your operating system than on mine
2) you opened the text file with an editor that changed line endings - the file attached to your mail uses unix-style line feeds (LF)

So I used the command "unix2dos" to convert the line endings in the sample text file from your mail to Windows/DOS line endings (CRLF),
repeated my process of zipping the data up and importing it.

Now I also get the error you reported in your mail:

```
Unable to load annotations: Start position of range [262-272] is not part of any visible row. In a sentence-based editor, this is most likely caused by annotations outside sentences.
```
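
Since brat offsets are character positions in the `.txt` file, a shifted file can be detected before importing by comparing each span's covered text with the text actually found at its offsets. The following is a minimal sketch, not the check INCEpTION performs; the function name and the sample data are made up. When reading real files, open the text with `newline=""` so that Python does not translate line endings.

```python
def find_offset_mismatches(text: str, ann_text: str) -> list:
    """Return (id, expected, actual) for spans whose offsets do not match the text."""
    mismatches = []
    for line in ann_text.splitlines():
        if not line.startswith("T"):
            continue  # only text-bound annotations carry offsets
        ann_id, head, covered = line.split("\t", 2)
        offsets = head.split(" ", 1)[1]
        # Discontinuous spans list several "start end" fragments separated by ";"
        fragments = [frag.split() for frag in offsets.split(";")]
        actual = " ".join(text[int(start):int(end)] for start, end in fragments)
        if actual != covered:
            mismatches.append((ann_id, covered, actual))
    return mismatches


# Made-up sample: offsets are valid for LF line endings only
LF_TEXT = "Line one\nGNR ended it"
ANN = "T1\tParticipant 9 12\tGNR\n"

if __name__ == "__main__":
    print(find_offset_mismatches(LF_TEXT, ANN))                        # no mismatch
    print(find_offset_mismatches(LF_TEXT.replace("\n", "\r\n"), ANN))  # shifted by CRLF
```

Every CRLF before a span shifts that span by one character, which is consistent with annotations drifting out of alignment further down the document.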

Could you please verify that when you compress and import exactly the files you attached to your mail without intermediate editing you get the same error?

If yes, I need to go looking for the bug on my side.

If no, the problem should be resolved.

Cheers,

-- Richard

Filipe Cunha

Mar 21, 2024, 5:55:21 AM
to inception-users
Hi, 
I'm using Windows as my OS.
I downloaded the files again, zipped them using the Windows Subsystem for Linux, and still got the same error.
Could it be that Inception behaves differently when opening files on windows?

Cheers,
- Filipe

Filipe Cunha

Mar 21, 2024, 6:12:30 AM
to inception-users
Never mind, I'm running it on docker so my last message does not make sense.

Richard Eckart de Castilho

Mar 21, 2024, 7:20:57 AM
to inception-users
Hi,

> On 21. Mar 2024, at 10:55, Filipe Cunha <ryz...@gmail.com> wrote:
>
> I'm using Windows as my OS.
> I downloaded the files again, zipped them using the Windows Linux subsystem and still got the same error
> Could it be that Inception behaves differently when opening files on windows?

It is possible that it does. Thanks for checking. I know what I need to look for now :)

-- Richard

Filipe Cunha

Mar 21, 2024, 7:21:24 AM
to inception-users
I used the dos2unix command on the .txt file and it works now.
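
For anyone without `dos2unix` at hand, the same CRLF-to-LF conversion can be sketched in Python. The files are handled as bytes so that nothing else in the content changes; the function names and the `sample.txt` usage line are illustrative.

```python
from pathlib import Path


def to_unix_line_endings(data: bytes) -> bytes:
    """Replace Windows/DOS line endings (CRLF) with Unix line feeds (LF)."""
    return data.replace(b"\r\n", b"\n")


def normalize_file(path) -> bool:
    """Rewrite the file in place if it contains CRLF; return True if it was changed."""
    file_path = Path(path)
    original = file_path.read_bytes()
    converted = to_unix_line_endings(original)
    if converted != original:
        file_path.write_bytes(converted)
    return converted != original


if __name__ == "__main__":
    print(normalize_file("sample.txt"))
```

The conversion has to happen before the `.txt` and `.ann` files are zipped for import.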
I'm now trying to configure the layers to fit our use case.
For instance, we have Event and Participant entities in our annotation schema. Each of them has its own attributes. I added a tagset to the label feature with ["Event","Participant"]. After adding all the attributes to the span layer, all the features are displayed during the annotation of a certain span. Would it be possible to filter the attributes that are displayed, for example, showing only the Event attributes if the label Event was selected?

Thank you.

Richard Eckart de Castilho

Mar 24, 2024, 9:25:41 AM
to incepti...@googlegroups.com
Hi,

> On 21. Mar 2024, at 12:21, Filipe Cunha <ryz...@gmail.com> wrote:
>
> Would it be possible to filter the attributes that are displayed, for example, showing only the Event attributes if the label Event was selected?

In theory yes, but only if the attributes are string attributes and have an associated tagset.

See: https://inception-project.github.io/releases/31.3/docs/user-guide.html#sect_constraints_conditional_features

-- Richard

Filipe Cunha

Apr 8, 2024, 6:48:46 AM
to inception-users
Hi Richard,

Inception was able to meet most of the requirements we have, however, there is one problem that we are facing.
Let's suppose that we have Event, Participant and Time entities which are manually annotated on a first pass.
On a second annotation pass, we annotate relations between these entities, however, we have specific relations to connect certain entity types.
As we used your pull request to import data from brat, all entities are contained on the same Span layer, thus they are identified by a span layer feature. Is it possible to constrain the relations from the Relation layer according to that feature from the span layer? 
For instance, the relation "SRL_Agent" is used to connect an Event with a Participant, but should never be used to connect an Event with a Time.

Richard Eckart de Castilho

Apr 9, 2024, 5:47:11 AM
to incepti...@googlegroups.com
Hi,

> On 8. Apr 2024, at 12:48, Filipe Cunha <ryz...@gmail.com> wrote:
>
> As we used your pull request to import data from brat, all entities are contained on the same Span layer, thus they are identified by a span layer feature. Is it possible to constrain the relations from the Relation layer according to that feature from the span layer?
> For instance, we the relation "SRL_Agent" is used to connect an Event with a Participant, but should never be used to connect an Event with a Time.

First, you need to create tagsets, in particular for the "label" feature of your span and relation layers, because the Constraints functionality only applies to features with a tagset. You need to add all the values you use for these labels to the tagsets (cf. https://github.com/inception-project/inception/pull/4701).

Once you have that, go to the project settings, to the Constraints panel and create a new rule set with this content:

```
/* Constraint rules set created by admin */
import custom.Relation as Relation;

Relation {
Governor.label = "Event" & Dependent.label = "Participant" -> label = "SRLINK_agent";
Governor.label = "Event" & Dependent.label = "Time" -> label = "TLINK_isIncluded";
}
```

Add more rules as required.

That will re-order the dropdowns such that the "valid" relation labels appear first depending
on the label of the spans they attach to.

There is currently no option to disallow "invalid" labels (those not covered by a constraint rule), but
you can open a feature request:

https://github.com/inception-project/inception/issues/new/choose
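
Until such an option exists, "invalid" combinations could be caught after the fact by checking exported annotations against the same table that the constraint rules encode. This is a minimal sketch that works on plain Python data; how the spans and relations are obtained from an export is left open, and the function name and sample IDs are hypothetical.

```python
# Allowed relation labels per (governor label, dependent label) pair,
# mirroring the two constraint rules shown above
ALLOWED = {
    ("Event", "Participant"): {"SRLINK_agent"},
    ("Event", "Time"): {"TLINK_isIncluded"},
}


def find_invalid_relations(span_labels, relations, allowed=ALLOWED):
    """Return (relation label, (governor label, dependent label)) for each violation.

    span_labels: dict mapping span id -> label
    relations:   iterable of (relation label, governor id, dependent id)
    """
    invalid = []
    for rel_label, governor_id, dependent_id in relations:
        pair = (span_labels[governor_id], span_labels[dependent_id])
        if rel_label not in allowed.get(pair, set()):
            invalid.append((rel_label, pair))
    return invalid


if __name__ == "__main__":
    spans = {"T1": "Event", "T2": "Participant", "T3": "Time"}
    rels = [("SRLINK_agent", "T1", "T2"), ("SRLINK_agent", "T1", "T3")]
    print(find_invalid_relations(spans, rels))
```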

Cheers,

-- Richard


Screenshot 2024-04-09 at 11.35.37.png
Screenshot 2024-04-09 at 11.35.26.png

Filipe Cunha

Apr 24, 2024, 5:47:16 AM
to inception-users
Hi,

I have been trying these constraints to filter the available relations according to the Governor and Dependent labels, however,  I'm facing the following issue:
I have some conditions that should include a logical OR. For example:

Governor.label = "Measure" & (Dependent.label = "Participant" | Dependent.label = "Event") ->
label = "MLINK_distance" | label = "MLINK_length";  
}
In this case, I want the relations MLINK_distance and MLINK_length to be applied to a Governor with the label Measure, and the Dependent can be either Participant or Event.
I tried to create this condition, but Inception won't let me add the logical OR on the condition. How does Inception handle these cases?

Best Regards,
Filipe

Richard Eckart de Castilho

Apr 24, 2024, 5:56:40 AM
to inception-users
Hi,

> On 24. Apr 2024, at 11:47, Filipe Cunha <ryz...@gmail.com> wrote:
>
> I have been trying these constraints to filter the available relations according to the Governor and Dependent labels, however, I'm facing the following issue:
> I have some conditions that should include a logical OR. For example:
>
> Governor.label = "Measure" & (Dependent.label = "Participant" | Dependent.label = "Event") -> label = "MLINK_distance" | label = "MLINK_length";
> }
> In this case, I want the relations MLINK_distance and MLINK_length to be applied to a Governor with the label Measure, and the Dependent can be either Participant or Event .
> I tried to create this condition, but Inception won't let me add the logical OR on the condition. How does Inception handle these cases?

You have to rewrite it as two rules, both using only the conjunction (&) on the left side.

Governor.label = "Measure" & Dependent.label = "Participant" -> label = "MLINK_distance" | label = "MLINK_length";
Governor.label = "Measure" & Dependent.label = "Event" -> label = "MLINK_distance" | label = "MLINK_length";
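
With many label combinations, writing these rules by hand gets tedious. The expansion of an OR on the left side into separate conjunction-only rules can be generated; the following is a minimal sketch with a hypothetical function name that produces rule lines in the syntax shown above. The lines still need to be placed inside the `Relation { ... }` block of the rule set.

```python
from itertools import product


def expand_rules(governor_labels, dependent_labels, relation_labels):
    """Emit one conjunction-only rule per (governor, dependent) label combination."""
    # Alternatives on the right side of a rule may be joined with "|"
    rhs = " | ".join(f'label = "{label}"' for label in relation_labels)
    return [
        f'Governor.label = "{gov}" & Dependent.label = "{dep}" -> {rhs};'
        for gov, dep in product(governor_labels, dependent_labels)
    ]


if __name__ == "__main__":
    for rule in expand_rules(["Measure"],
                             ["Participant", "Event"],
                             ["MLINK_distance", "MLINK_length"]):
        print(rule)
```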

Sorry for the verbosity. Feel free to raise a feature request to improve the situation.

-- Richard

Filipe Cunha

Apr 24, 2024, 8:24:10 AM
to inception-users
Hi, 
Where should I create this feature request? Should I use the GitHub issues tab? (seems a bad practice)

Filipe

Richard Eckart de Castilho

Apr 24, 2024, 8:50:00 AM
to incepti...@googlegroups.com
Hi,

> On 24. Apr 2024, at 14:24, Filipe Cunha <ryz...@gmail.com> wrote:
>
> Where should I create this feature request? Should I use the GitHub issues tab? (seems a bad practice)

Yes, the GitHub issues tab - there is a template there for feature requests, please fill it in.

Out of curiosity: why does it seem to be a bad practice?

-- Richard

Filipe Cunha

May 3, 2024, 8:45:06 AM
to inception-users
Hi again Richard
I found another problem regarding the end of the sentences.
Looking at the .txt file that I shared (in our first messages) as an example we have the first three sentences:

"Redação, 11 out 2020
VAM // JH
Covid-19: GNR acabou com festa ilegal com 50 pessoas em São Brás de Alportel"

However, when importing this document into Inception, it ignores the newline characters and these three sentences are joined into one:


Captura de ecrã 2024-05-03 133859.png

This problem can also be seen in the first print screen that you sent in our first messages.

For some reason, Inception is ignoring some newline characters. Looking at other documents, I'm having difficulties identifying the pattern, as sometimes it respects the end of the line and sometimes it doesn't. Maybe it requires a dot at the end of the sentence? Weird...

Thank you in advance,

- Filipe 

Richard Eckart de Castilho

May 3, 2024, 9:01:32 AM
to inception-users
Hi,

> On 3. May 2024, at 14:45, Filipe Cunha <ryz...@gmail.com> wrote:
>
> For some reason, Incpetion is ignoring some new line characters. Looking at other documents, I'm having difficulties identifying the pattern as sometimes it respects the end of the line, and sometimes it doesn't. Maybe it requires a dot at the end of the sentence?

The sentence splitter that INCEpTION uses will require a dot, yes.

Do you have sentence boundaries in your original brat data?

-- Richard

Filipe Cunha

May 3, 2024, 9:44:56 AM
to inception-users
We use newline characters (\n). Is it possible to change this?

Richard Eckart de Castilho

May 3, 2024, 9:48:55 AM
to incepti...@googlegroups.com

> On 3. May 2024, at 15:44, Filipe Cunha <ryz...@gmail.com> wrote:
>
> We use newline characters (\n). Is it possible to change this?

When importing from plain text, there is one format that treats each line as a sentence.

But when importing from brat, there is no such option (currently).

Assuming you are mainly interested in visual presentation and not in the sentence boundaries themselves,
you can switch the editor from "brat (sentence-based)" to "brat (line-based)" either from the
preferences dialog on the annotation page (per user) or from the annotation tab in the project settings
(for all users of a project). You will then also want to enable the "allow crossing sentence boundaries"
option for all of your layers to ensure the (wrong) sentence boundaries don't interfere with your
annotation process.
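
To illustrate what "each line is a sentence" means in terms of offsets, here is a minimal sketch that computes one (begin, end) character span per non-empty line. It is not how INCEpTION implements its line-oriented text format or the line-based editor; the function name is hypothetical and the sample text is the three lines quoted earlier in this thread.

```python
def line_sentence_spans(text: str) -> list:
    """Return one (begin, end) character span per non-empty line of LF-separated text."""
    spans, begin = [], 0
    for line in text.split("\n"):
        end = begin + len(line)
        if line.strip():
            spans.append((begin, end))
        begin = end + 1  # skip the newline character itself
    return spans


SAMPLE_TEXT = (
    "Redação, 11 out 2020\n"
    "VAM // JH\n"
    "Covid-19: GNR acabou com festa ilegal com 50 pessoas em São Brás de Alportel"
)

if __name__ == "__main__":
    for begin, end in line_sentence_spans(SAMPLE_TEXT):
        print(begin, end, repr(SAMPLE_TEXT[begin:end]))
```

None of the three lines ends with a dot, which matches the observation that a dot-based splitter merges them into a single sentence.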

-- Richard

Filipe Cunha

May 16, 2024, 11:58:27 AM
to inception-users
Hello Richard,
We need to export our annotations in brat format. Do you have any plans for this?

Thank you
- Filipe

Richard Eckart de Castilho

May 16, 2024, 1:32:43 PM
to incepti...@googlegroups.com
Hi,

> On 16. May 2024, at 17:58, Filipe Cunha <ryz...@gmail.com> wrote:
>
> We need to export our annotations in brat format. Do you have any plans for this?

If you enable them in your `settings.properties` file, you have two options for exporting
data to brat:

- https://inception-project.github.io/releases/32.2/docs/user-guide.html#sect_formats_brat_custom
- https://inception-project.github.io/releases/32.2/docs/user-guide.html#sect_formats_brat_basic

-- Richard

Filipe Cunha

May 16, 2024, 6:45:33 PM
to inception-users
Hi,
I have format.brat-basic.enabled=true in my settings.properties file.
As you can see in the next image, I can import brat annotated files into my Inception project:


Captura de ecrã 2024-05-16 234226.png



However, when I try to export the annotated file, the brat format is not included in the export format options:

Captura de ecrã 2024-05-16 233818.png

Thank you,
- Filipe

Richard Eckart de Castilho

May 17, 2024, 12:44:20 AM
to inception-users
Hi Filipe,

> On 17. May 2024, at 00:45, Filipe Cunha <ryz...@gmail.com> wrote:
>
> I have the format.brat-basic.enabled=true on my settings.properties
> As you can see in the next image, I can import brat annotated files into my inception project :
>
> However, when I try to export the annotated file, the brat format is not included in the export format options:

you are right - I trusted the documentation, but the code says something different.

Please enable the "brat custom" format using `format.brat-custom.enabled=true`
to get a brat export option.

I am fixing the documentation.

Cheers,

-- Richard