Attached is a small project that illustrates some improvements I propose to make to BratReader. I didn't put them in a Pull Request because I wanted to show the difference between the current implementation (BratReader.java) and my proposed change (BratReaderForgiving.java). I also didn't want to commit the latter only to have to rename it to BratReader.java later on.
The unit tests in BratIssuesTest.java illustrate the problems that these improvements aim to fix.
In a nutshell, the proposed reader allows you to:
1) Create a BratReader without having to provide mappings.
- BratReader already knows mappings for the Annotations defined in the standard dkpro-core type system (e.g. Person)
- If you provide a PARAM_MAPPING, those mappings are added to the default ones (Note: not working at the moment because of a bug with serialization of Mapping)
- If the .ann file contains a label that is defined in neither the default mappings nor the PARAM_MAPPING mappings, it will use a "catch-all" Annotation type (NamedEntity for the moment, but could be something else)
2) Pass a directory or file to PARAM_SOURCE_LOCATION without having to worry about things like:
- Adding *.ann at the end of the directory path (the Reader adds it automatically)
- Making sure to pass the .ann file as opposed to the .txt file (the Reader automatically converts it to the .ann path)
- Making sure that the single file, or all the .txt files in the directory, have a corresponding .ann file (the Reader automatically creates empty .ann files for orphan .txt files)
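To make the two behaviours above concrete, here is a minimal sketch in plain Java. It is not the code from BratReaderForgiving.java and it does not use the DKPro Core API: the class name ForgivingBratSketch, the method names, and the use of plain strings for type names are all assumptions made for illustration only.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeSet;
import java.util.stream.Stream;

// Hypothetical helper class; none of these names exist in DKPro Core.
final class ForgivingBratSketch {
    // Catch-all type name used when a label has no mapping (an assumption).
    static final String CATCH_ALL = "NamedEntity";

    // User mappings are added on top of the defaults; unknown labels
    // fall back to the catch-all type.
    static String resolveType(String label, Map<String, String> defaults,
                              Map<String, String> userMappings) {
        Map<String, String> merged = new HashMap<>(defaults);
        merged.putAll(userMappings);
        return merged.getOrDefault(label, CATCH_ALL);
    }

    // Swap a .txt extension for .ann; any other path is returned unchanged.
    static Path toAnnPath(Path p) {
        String name = p.getFileName().toString();
        if (name.endsWith(".txt")) {
            return p.resolveSibling(name.substring(0, name.length() - 4) + ".ann");
        }
        return p;
    }

    // Accept a directory or a single .txt/.ann file and return the sorted
    // .ann paths, creating an empty .ann file wherever one is missing.
    static List<Path> resolveAnnFiles(Path source) throws IOException {
        List<Path> candidates = new ArrayList<>();
        if (Files.isDirectory(source)) {
            try (Stream<Path> entries = Files.list(source)) {
                entries.filter(Files::isRegularFile)
                       .filter(p -> {
                           String n = p.getFileName().toString();
                           return n.endsWith(".txt") || n.endsWith(".ann");
                       })
                       .forEach(candidates::add);
            }
        } else {
            candidates.add(source);
        }
        // A TreeSet removes duplicates (a.txt and a.ann map to the same path).
        TreeSet<Path> annFiles = new TreeSet<>();
        for (Path candidate : candidates) {
            Path ann = toAnnPath(candidate);
            if (Files.notExists(ann)) {
                Files.createFile(ann); // empty annotation file for an orphan .txt
            }
            annFiles.add(ann);
        }
        return new ArrayList<>(annFiles);
    }
}
```

With this sketch, passing a directory, a .txt file, or an .ann file all end up producing the same kind of result: a list of existing .ann paths. Whether creating files on disk is acceptable for a reader (as opposed to treating a missing .ann as empty in memory) is a design choice that would need discussion.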
If this seems appropriate, I will create a feature request followed by a Pull Request.