Hi Roland,
> On 20. Sep 2021, at 12:19, 'Roland Meyer' via inception-users <incepti...@googlegroups.com> wrote:
>
> is it possible to insert tokens into corpus texts during annotation?
>
> Background: We are looking for a way to handle ellipsis annotation via inserting empty categories (as suggested in the enhanced UD analysis – https://universaldependencies.org/u/overview/enhanced-syntax.html#ellipsis) which mostly behave like usual tokens, e.g. can be linked in coreference chains and the like. Inserting an empty span does not look quite right – it seems to be attached to a token rather than behave as a separate one, or am I mistaken there?

We have started looking into making tokens and sentences editable objects in INCEpTION, but the feature is far from usable yet.
At the moment, working with zero-width spans is the only option for ellipsis annotation. Zero-width spans are not attached to tokens. The UI only allows creating zero-width spans at the beginning of a token, at the end of a token, or within a token; you cannot create an annotation on the whitespace between tokens. But that does not imply that the span is attached to the token.

That said, a zero-width span does not behave like a token either. Currently, a token is an invisible annotation type, and a few other annotations such as POS and Lemma internally hook up to the token. The UD importer/exporter knows how to work with these internal Token, POS and Lemma types, but if I remember correctly, it currently does not support ellipsis.

You can build custom annotation layers which allow zero-width annotations and also define relations over them, but you can then only export the data as UIMA CAS XMI or WebAnno TSV. A conversion to/from CoNLL-U would need to happen externally, e.g. using a Python script with the help of the DKPro Cassis library for XMI files.
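
To give an idea of what such an external script could look like, here is a rough sketch. It is not something INCEpTION ships, and it only covers one step: reading token and zero-width span offsets from an XMI export and assigning CoNLL-U IDs, where a zero-width span becomes an empty node i.j after word i. The layer name webanno.custom.Ellipsis and the file names are hypothetical placeholders for your own project, and for simplicity the sketch treats the whole document as one sentence and only produces the ID column.

```python
from typing import List, Tuple

Span = Tuple[int, int]  # (begin, end) character offsets

# DKPro Core token type used by INCEpTION for its internal tokens
TOKEN_TYPE = "de.tudarmstadt.ukp.dkpro.core.api.segmentation.type.Token"
# Hypothetical custom span layer that allows zero-width annotations
EMPTY_TYPE = "webanno.custom.Ellipsis"


def conllu_ids(tokens: List[Span], empties: List[Span]) -> List[Tuple[str, Span]]:
    """Assign CoNLL-U IDs to the tokens and zero-width spans of one sentence.

    Tokens get 1, 2, 3, ... and a zero-width span at offset p gets i.j,
    where i is the number of tokens that start before p (0 if none).
    """
    rows = []
    pending = sorted(empties)
    k = 0        # index of the next zero-width span to place
    word_id = 0  # ID of the last token emitted
    sub_id = 0   # counter for empty nodes after the current token
    for tok in sorted(tokens):
        # Zero-width spans at or before this token's start precede it
        while k < len(pending) and pending[k][0] <= tok[0]:
            sub_id += 1
            rows.append((f"{word_id}.{sub_id}", pending[k]))
            k += 1
        word_id += 1
        sub_id = 0
        rows.append((str(word_id), tok))
    # Remaining zero-width spans follow the last token
    for span in pending[k:]:
        sub_id += 1
        rows.append((f"{word_id}.{sub_id}", span))
    return rows


def load_spans(xmi_path: str, typesystem_path: str) -> Tuple[List[Span], List[Span]]:
    """Read token and zero-width span offsets from an XMI export via DKPro Cassis."""
    # Imported here so conllu_ids() works without the library installed
    from cassis import load_cas_from_xmi, load_typesystem  # pip install dkpro-cassis

    with open(typesystem_path, "rb") as f:
        typesystem = load_typesystem(f)
    with open(xmi_path, "rb") as f:
        cas = load_cas_from_xmi(f, typesystem=typesystem)
    tokens = [(t.begin, t.end) for t in cas.select(TOKEN_TYPE)]
    empties = [(a.begin, a.end) for a in cas.select(EMPTY_TYPE) if a.begin == a.end]
    return tokens, empties
```

For example, with the tokens of "Mary likes tea and John coffee" and a zero-width span at the start of "coffee", conllu_ids() yields the IDs 1, 2, 3, 4, 5, 5.1, 6. A real converter would still need to split by sentence and fill in the remaining CoNLL-U columns, including the enhanced dependencies that refer to the empty node.
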
To create a zero-width span on a layer that supports it, press "shift" and click at the position where you want to insert the span.
Here are some issues related to making tokens/sentences editable objects:
https://github.com/inception-project/inception/projects/53
Cheers,
-- Richard