Hi Berg,
> On 19. Aug 2025, at 14:33, Berg Oliveira <file...@gmail.com> wrote:
>
> I couldn't find a pattern way to break sentences, nor any Inception settings to determine this.
> This break sometimes makes it difficult to annotate tokens (in my case, NER), requiring manual work.
If you provide your input files in such a way that each sentence is on its own line (and does not span
multiple lines), then you can import the texts in the format "Plain text (one sentence per line)".
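
As a rough sketch of how such a pre-processing step might look: the function name and the
regex-based splitter below are only placeholders for illustration (a naive split after ".", "!"
or "?"), so swap in whatever sentence splitter works for your texts.

```python
import re

# Naive placeholder rule: a sentence ends at ., ! or ? followed by whitespace.
# Replace this with the sentence splitter of your choice.
_SENTENCE_END = re.compile(r"(?<=[.!?])\s+")


def to_one_sentence_per_line(text: str) -> str:
    """Return the text with exactly one sentence per line."""
    # Collapse existing line breaks so that no sentence spans multiple lines.
    flat = " ".join(text.split())
    sentences = [s for s in _SENTENCE_END.split(flat) if s]
    return "\n".join(sentences)


if __name__ == "__main__":
    raw = "This is one sentence. Here is\nanother one! Is this the third?"
    print(to_one_sentence_per_line(raw))
```

Saving the result as a plain text file gives you something you can import with the
one-sentence-per-line format.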
Otherwise, if you have programming skills, you could prepare your text in XMI format using the dkpro-cassis
Python library and use a sentence/token splitter of your choice.
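
A minimal sketch of that approach, assuming the DKPro Core type system that ships with
dkpro-cassis and a deliberately naive regex splitter: token_spans, sentence_spans and write_xmi
are hypothetical helper names made up for this example, and the cassis calls are written from
memory, so please check them against the cassis documentation for your installed version.

```python
import re
from typing import List, Tuple

SENTENCE_TYPE = "de.tudarmstadt.ukp.dkpro.core.api.segmentation.type.Sentence"
TOKEN_TYPE = "de.tudarmstadt.ukp.dkpro.core.api.segmentation.type.Token"


def token_spans(text: str) -> List[Tuple[int, int]]:
    """Placeholder tokenizer: runs of word characters or single punctuation marks."""
    return [(m.start(), m.end()) for m in re.finditer(r"\w+|[^\w\s]", text)]


def sentence_spans(text: str) -> List[Tuple[int, int]]:
    """Placeholder splitter: a sentence ends at a ".", "!" or "?" token."""
    spans = []
    start = None
    end = None
    for begin, end in token_spans(text):
        if start is None:
            start = begin
        if text[begin:end] in {".", "!", "?"}:
            spans.append((start, end))
            start = None
    if start is not None:
        # Trailing text without final punctuation still forms a sentence.
        spans.append((start, end))
    return spans


def write_xmi(text: str, xmi_path: str) -> None:
    """Write the text with Sentence/Token annotations to an XMI file."""
    # Imported here so the span helpers above also work without cassis installed.
    from cassis import Cas, load_dkpro_core_typesystem

    typesystem = load_dkpro_core_typesystem()
    cas = Cas(typesystem=typesystem)
    cas.sofa_string = text

    Sentence = typesystem.get_type(SENTENCE_TYPE)
    Token = typesystem.get_type(TOKEN_TYPE)
    for begin, end in sentence_spans(text):
        cas.add(Sentence(begin=begin, end=end))
    for begin, end in token_spans(text):
        cas.add(Token(begin=begin, end=end))

    cas.to_xmi(xmi_path)
```

Calling write_xmi(text, "document.xmi") would then produce a file in which the sentence and token
boundaries are exactly the ones your splitter decided on, rather than the ones determined at import.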
Finally, there is very experimental functionality in INCEpTION to adjust sentence and token boundaries.
If you want to help test this, let me know and I can tell you how to activate it. However, please
note that it is really not well tested, and changing sentence/token boundaries may have unexpected and
so far unknown side effects. So it is best to only test this and not use it for serious work.
Cheers,
-- Richard