ZWNJ Character


Yotam Leibovici

Jul 13, 2025, 7:01:38 AM
to inception-users

Hi,
There seems to be an issue with the ZWNJ character (Unicode U+200C) — it's currently displayed as a diamond with a question mark inside, instead of being invisible.

Would appreciate it if this could be looked into.
Thanks

Richard Eckart de Castilho

Jul 13, 2025, 7:09:41 AM
to incepti...@googlegroups.com
Hi Yotam,

> On 13. Jul 2025, at 13:01, Yotam Leibovici <yotam.l...@gmail.com> wrote:
>
> There seems to be an issue with the ZWNJ character (Unicode U+200C) — it's currently displayed as a diamond with a question mark inside, instead of being invisible.

Some browsers do not count invisible (zero-width) characters, which causes problems when we render annotations.
Therefore, INCEpTION replaces many such characters with a non-breaking space, so ZWNJ should show up as a space in most cases.

The brat view is a bit different, because the replacement character there is configurable and defaults to \uFFFD (the Unicode replacement character, which typically renders as a diamond with a question mark).

You can change this by adding a line like

ui.brat.white-space-replacement-character=X

to your settings.properties file, where X is the character you want to use. Note that it has to be a character that browsers count as visible, e.g.

ui.brat.white-space-replacement-character=\u00A0

If you believe the default replacement character for the brat view should be changed, please open a feature request
and explain your reasoning.

https://github.com/inception-project/inception/issues/new/choose

-- Richard
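
The replacement described in this reply can be illustrated with a minimal Python sketch. It is not INCEpTION's code: the `ZERO_WIDTH` set and the `replace_zero_width` function are hypothetical names invented for illustration, and the real list of replaced characters may differ. The sketch only shows why a one-for-one swap is used: the text keeps its length, so annotation offsets computed on the original text still point at the same characters.

```python
# Hypothetical set of zero-width characters (an assumption for illustration;
# INCEpTION's actual list may differ).
ZERO_WIDTH = {"\u200b", "\u200c", "\u200d", "\ufeff"}


def replace_zero_width(text: str, replacement: str = "\u00a0") -> str:
    """Swap each zero-width character for exactly one replacement character."""
    if len(replacement) != 1:
        raise ValueError("replacement must be a single character")
    return "".join(replacement if ch in ZERO_WIDTH else ch for ch in text)


text = "ab\u200ccd ef"               # ZWNJ sits at index 2
rendered = replace_zero_width(text)  # what the editor would display

# Same length, so an annotation over [6, 8) covers "ef" in both strings.
assert len(rendered) == len(text)
assert text[6:8] == rendered[6:8] == "ef"
```

Simply deleting the zero-width characters instead would shift every later offset by one per deleted character, which is the kind of mismatch the replacement avoids.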

Yotam Leibovici

Jul 13, 2025, 7:35:09 AM
to inception-users

Thank you very much for the quick reply.
I'm not sure I understood. In the HTML editor, the ZWNJ is displayed correctly — not as a space.
Is there a way to configure the same behavior for the brat editor as well?
On Sunday, July 13, 2025 at 14:09:41 UTC+3, Richard Eckart de Castilho wrote:

Richard Eckart de Castilho

Jul 13, 2025, 7:36:41 AM
to inception-users

> On 13. Jul 2025, at 13:35, Yotam Leibovici <yotam.l...@gmail.com> wrote:
>
>
> Thank you very much for the quick reply.
> I'm not sure I understood. In the HTML editor, the ZWNJ is displayed correctly — not as a space.
> Is there a way to configure the same behavior for the brat editor as well?

Try adding the following line to your settings.properties file:

ui.brat.white-space-replacement-character=\u00A0

-- Richard

Yotam Leibovici

Jul 13, 2025, 9:01:57 AM
to inception-users
After adding that line, the ZWNJ is now displayed as a normal space, even though it should appear with no width.

On Sunday, July 13, 2025 at 14:36:41 UTC+3, Richard Eckart de Castilho wrote:

Richard Eckart de Castilho

Jul 13, 2025, 11:18:49 AM
to inception-users

> On 13. Jul 2025, at 15:01, Yotam Leibovici <yotam.l...@gmail.com> wrote:
>
> After adding that line, the ZWNJ is now displayed as a normal space, even though it should appear with no width.

As I said: some browsers do not count zero-width characters. That breaks the calculation of annotation positions within the document.
For that reason, we have to replace zero-width characters with characters that do have a width and are counted by browsers.

See also: https://github.com/inception-project/inception/issues/1849

Cheers,

-- Richard


Richard Eckart de Castilho

Jul 13, 2025, 2:36:20 PM
to inception-users
Hi,

> On 13. Jul 2025, at 15:01, Yotam Leibovici <yotam.l...@gmail.com> wrote:
>
> After adding that line, the ZWNJ is now displayed as a normal space, even though it should appear with no width.

Try setting the replacement character to \u200A (hair space) and let me know if that provides a better experience... and if you can still reliably annotate.

-- Richard
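
For orientation, the characters mentioned in this thread can be looked up with Python's standard `unicodedata` module. This is only a lookup aid, and the `describe` helper is a hypothetical name. The Unicode category says nothing definite about how a given browser measures the character, so whether annotation still works reliably has to be checked in the editor itself.

```python
import unicodedata

# Code points mentioned in this thread: ZWNJ, the brat default,
# the non-breaking space, and the hair space.
candidates = ["\u200c", "\ufffd", "\u00a0", "\u200a"]


def describe(ch: str) -> str:
    """Return the code point, official Unicode name and general category."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)} ({unicodedata.category(ch)})"


for ch in candidates:
    print(describe(ch))
# U+200C ZERO WIDTH NON-JOINER (Cf)
# U+FFFD REPLACEMENT CHARACTER (So)
# U+00A0 NO-BREAK SPACE (Zs)
# U+200A HAIR SPACE (Zs)
```

ZWNJ is a format character (Cf), while both space candidates are space separators (Zs). The hair space is simply a much narrower space than the non-breaking space.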

Yotam Leibovici

Jul 14, 2025, 6:03:34 AM
to inception-users

I made the change and it is indeed a good solution to the problem. Thank you for the effort in finding it.
Are these characters included in the Trim Annotation Repair so that annotations cannot start or end with such characters?
On Sunday, July 13, 2025 at 21:36:20 UTC+3, Richard Eckart de Castilho wrote:

Richard Eckart de Castilho

Jul 15, 2025, 4:33:49 AM
to incepti...@googlegroups.com
Hi,

> On 14. Jul 2025, at 12:03, Yotam Leibovici <yotam.l...@gmail.com> wrote:
>
> Are these characters included in the Trim Annotation Repair so that annotations cannot start or end with such characters?

That probably depends a bit on the INCEpTION version that you are using, but yes, they should be.

The current list of trim chars is here:

https://github.com/inception-project/inception/blob/main/inception/inception-support/src/main/java/de/tudarmstadt/ukp/inception/support/text/TrimUtils.java#L78-L130

-- Richard
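
The idea of trimming an annotation so that it neither starts nor ends with such characters can be sketched in Python as follows. This is an illustration under assumptions, not the TrimUtils implementation: `TRIM_CHARS` is a small hypothetical set (the authoritative list is in the linked file), and `trim_span` is an invented helper name.

```python
# Hypothetical trim set for illustration; the real list lives in TrimUtils.java.
TRIM_CHARS = {" ", "\t", "\n", "\u00a0", "\u200a", "\u200b", "\u200c", "\u200d"}


def trim_span(text: str, begin: int, end: int) -> tuple[int, int]:
    """Shrink the half-open span [begin, end) until it no longer starts or
    ends with a trim character. An all-trim span collapses to begin == end."""
    while begin < end and text[begin] in TRIM_CHARS:
        begin += 1
    while end > begin and text[end - 1] in TRIM_CHARS:
        end -= 1
    return begin, end


text = "see \u200cword\u200c now"
begin, end = trim_span(text, 3, 11)  # span originally includes space + ZWNJ on both sides
assert (begin, end) == (5, 9)
assert text[begin:end] == "word"
```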

