POS annotation

محمد علي‎

Nov 2, 2023, 11:23:56 AM

to inception-users

I would like to annotate Arabic text with a new part-of-speech (POS) tagset. I created the tagset using the 'tagsets' option and then followed these steps: New Project > Standard Project > Settings > Layer > Part of Speech > Granularity (which I was unable to change to character level) > XPOS > my new tagset. In Arabic, one token can consist of more than one morpheme, for example a verb with an attached object pronoun, as in كتبه (كتبـ + ـه). I want to annotate كتبـ as a verb and ـه as a pronoun, but I cannot find a way to split one token into several morphemes so that each gets its own POS tag. Is there a solution?

Richard Eckart de Castilho

Nov 2, 2023, 11:26:05 AM

to inception-users

Hi,

Instead of using the built-in POS layer, you could create your own custom layer for part-of-speech tagging. On that layer, you can set the granularity to character level. The drawback is that you won't be able to export your data in CoNLL formats etc.; you can only export as CAS XMI, CAS JSON, or WebAnno TSV.

-- Richard
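
To illustrate what character-level granularity means for the example in the question, here is a minimal Python sketch. It is not INCEpTION's API or export format. The `Span` class, the `covered_text` helper, and the tag names `VERB` and `PRON` are hypothetical and only show how two character-offset spans inside the single token كتبه could each carry their own tag.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Span:
    """A character-level annotation (hypothetical model, not INCEpTION's)."""
    begin: int  # inclusive character offset into the text
    end: int    # exclusive character offset into the text
    pos: str    # tag from a custom tagset (names here are assumptions)


def covered_text(text: str, span: Span) -> str:
    """Return the characters that a span covers."""
    return text[span.begin:span.end]


def spans_fit(text: str, spans: list[Span]) -> bool:
    """Check that spans stay inside the text and do not overlap."""
    ordered = sorted(spans, key=lambda s: s.begin)
    previous_end = 0
    for s in ordered:
        if s.begin < previous_end or s.end > len(text) or s.begin >= s.end:
            return False
        previous_end = s.end
    return True


# The token كتبه written as code points: kaf, teh, beh, heh (4 characters).
token = "\u0643\u062a\u0628\u0647"

# One span per morpheme: the verb stem (3 characters) and the pronoun suffix.
spans = [Span(0, 3, "VERB"), Span(3, 4, "PRON")]

if spans_fit(token, spans):
    for s in spans:
        print(covered_text(token, s), s.pos)
```

Because the spans are defined by character offsets rather than token boundaries, the verb stem and the pronoun suffix can be tagged separately without changing the tokenization.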
