Morpheme Segmentation for Spanish and Catalan

Lokesh S Pugalenthi

Dec 27, 2022, 9:00:45 AM
to unim...@googlegroups.com, Grasso, Stephanie M
Good Morning,

I'm trying to construct a word-to-morpheme mapper (e.g. 'knocking' -> ['knock', 'ing']) for Spanish and Catalan. It appears that UniMorph lets one derive morphological information for a given word. Is it possible to use UniMorph for morpheme segmentation?

Kind regards,
Lokesh Pugalenthi

Kat Vylomova

Dec 28, 2022, 3:34:13 AM
to Lokesh S Pugalenthi, unim...@googlegroups.com, Grasso, Stephanie M, Khuyagbaatar Batsuren
Dear Lokesh,

Yes, the most recent UniMorph update contains segmentations extracted from Wiktionary. Spanish has segmentations for 65k lemmas, and Catalan has segmentations for ~15k lemmas (Table 7). Some cases might be ambiguous in terms of segmentation (and add some noise to the data), so any feedback is very welcome. :-)
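As a rough sketch of how such a mapper could be built once you have the segmentation data: the snippet below assumes a tab-separated file with the word in one column and its segmentation in another, with morphemes joined by a separator such as "|". The column positions, the separator, and the sample rows are all assumptions made up for illustration, not the confirmed UniMorph layout, so please check the actual files and adjust the parameters. Because some words can have more than one segmentation, the map keeps a list of alternatives per word.

```python
import csv
import io
from collections import defaultdict


def load_segmentations(handle, word_col=0, seg_col=1, sep="|"):
    """Build a word -> list-of-segmentations map from tab-separated rows.

    word_col, seg_col and sep are assumptions about the file layout;
    adjust them to match the real segmentation files.
    """
    mapping = defaultdict(list)
    reader = csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
    for row in reader:
        # Skip blank or short rows that lack the expected columns.
        if len(row) <= max(word_col, seg_col):
            continue
        word = row[word_col].strip()
        morphemes = [m for m in row[seg_col].strip().split(sep) if m]
        # Keep every distinct segmentation, since some words are ambiguous.
        if word and morphemes and morphemes not in mapping[word]:
            mapping[word].append(morphemes)
    return dict(mapping)


# Invented sample rows for illustration only (not taken from UniMorph).
SAMPLE = "hablando\thabl|ando\ngatos\tgato|s\ngatos\tgat|os\n"

segmentations = load_segmentations(io.StringIO(SAMPLE))
print(segmentations.get("hablando"))
print(segmentations.get("gatos"))
```

With a real file you would pass `open(path, encoding="utf-8", newline="")` instead of the in-memory sample. Words missing from the data simply return `None` from `.get()`, so a fallback segmenter can be plugged in for those.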

I am CC'ing Huygaa (Khuyagbaatar), who worked on the segmentation data and is also leading a corresponding task; he might be able to provide some extra help and comments.

Warm regards,
Kat