Using PolyLex for German


denis lebailly

Jul 30, 2015, 6:31:18 AM
to Unitex-GramLab
Hello,

I am trying to match German words regardless of whether or not they are agglutinated with other words.
Example: searching for <produkt> should find produkt, but also Abfallprodukt.
Using PolyLex during preprocessing does not seem to help, and I would like to know whether there is any way other than the morphological mode.

Thanks in advance for your help

Denis

denis lebailly

Jul 30, 2015, 9:20:57 AM
to Unitex-GramLab, lebail...@gmail.com
Hello,

I forgot I had to write in English on this list, so I am re-posting:

I'm using GramLab to do named entity extraction in German. I cannot match words if they are agglutinated with other words.
If I'm looking for <produkt>, I am not able to match Abfallprodukt.
I thought I had to use PolyLex, but it doesn't see Abfallprodukt as a compound word, even though produkt is in the dictionary.

Is there a way other than the morphological mode to solve this agglutination problem?

Thanks in advance

Denis

eric.laporte

Sep 22, 2015, 7:30:20 AM
to Unitex-GramLab, lebail...@gmail.com
Dear Denis,

If PolyLex analyses Abfallprodukt as a combination of Abfall with Produkt, it adds Abfallprodukt to the dictionary, so that <Abfallprodukt> will match it, but <Produkt> will not. You can check the PolyLex combinations found in your corpus by opening the decomp.txt file in the directory of your corpus.
In addition to the morphological mode, you can try morphological filters, perhaps <N><<produkt(e|en)?$>> .
Best,

Eric
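
To see which tokens the regular expression inside the suggested filter <N><<produkt(e|en)?$>> would accept, here is a minimal Python sketch. It only checks the regular expression part; the <N> constraint (the token must be a noun in the dictionary) is not modelled. Python's re module is a stand-in for whatever engine Unitex uses, so details such as case handling are an assumption, and the function name and token list are made up for illustration.

```python
import re

# Regular expression taken from the filter <N><<produkt(e|en)?$>>.
# Assumption: Python's re behaves closely enough to the Unitex engine
# for this simple, case-sensitive suffix pattern.
SUFFIX = re.compile(r"produkt(e|en)?$")

def ends_in_produkt(token: str) -> bool:
    """Return True if the token ends in produkt, produkte or produkten."""
    return SUFFIX.search(token) is not None

# Hypothetical sample tokens, not taken from any real corpus.
tokens = ["Abfallprodukt", "Abfallprodukte", "Nebenprodukten",
          "Produktion", "Produkt"]
matches = [t for t in tokens if ends_in_produkt(t)]
print(matches)
```

Under these assumptions the compounds are accepted and Produktion is rejected. The bare capitalised noun Produkt is also rejected, because the pattern is lowercase; if the filter turns out to be case-sensitive in Unitex as well, a pattern such as [Pp]rodukt(e|en)?$ may be worth trying.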
