test writing / MateLemmatizer Lemma integration

cherep...@googlemail.com

Jan 29, 2017, 7:24:32 AM

to dkpro-tc-users

Hi,

Probably because of my lack of experience with DKPro/UIMA, I have run into the following problems.

First, I am having trouble writing my second test for a feature. It seems that MateLemmatizer does not work correctly in this code.

// Segmenter: adds Sentence and Token annotations
AnalysisEngineDescription desc = createEngineDescription(BreakIteratorSegmenter.class);
// Lemmatizer: adds Lemma annotations, configured for German
AnalysisEngineDescription desc2 = createEngineDescription(MateLemmatizer.class, MateLemmatizer.PARAM_LANGUAGE, "de");

AnalysisEngine engine = createEngine(desc);
AnalysisEngine engine2 = createEngine(desc2);

// Create the test document and run both engines on it, segmenter first
JCas jcas = engine.newJCas();
jcas.setDocumentLanguage("de");
jcas.setDocumentText("Er fährt und machte eine Runde von etwas mit Bäume sei geredet. Ja ja gesponnen ausgebrochen");
engine.process(jcas);
engine2.process(jcas);

// Classification target spanning the whole document text
TextClassificationTarget target = new TextClassificationTarget(jcas, 0, jcas.getDocumentText().length());
target.addToIndexes();

For the Lemma annotations I get the same results as for the Token annotations from the previous engine (BreakIteratorSegmenter). Here is part of the output of

System.out.println(a.getType().toString());
System.out.println(a.getCoveredText().toString());

>>de.tudarmstadt.ukp.dkpro.core.api.segmentation.type.Token
>>geredet
>>de.tudarmstadt.ukp.dkpro.core.api.segmentation.type.Lemma
>>geredet
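
One thing that may explain this output: in UIMA, getCoveredText() returns the span of document text that an annotation covers, so a Lemma annotation placed over a token prints the token's surface form. DKPro Core's Lemma type stores the lemma itself in its value feature, which is read with getValue(). The sketch below illustrates the difference with a plain-Java stand-in. The Span class, the describe() helper, and the lemma "reden" are assumptions made for illustration; they are not the UIMA classes and not actual MateLemmatizer output.

```java
// Hypothetical stand-in for a UIMA annotation: a span over the document
// text plus an optional "value" feature. Not the real UIMA/DKPro classes.
final class Span {
    private final String type;
    private final String documentText;
    private final int begin;
    private final int end;
    private final String value; // null if the type carries no value feature

    Span(String type, String documentText, int begin, int end, String value) {
        this.type = type;
        this.documentText = documentText;
        this.begin = begin;
        this.end = end;
        this.value = value;
    }

    String getType() {
        return type;
    }

    // Mirrors the idea of getCoveredText(): the text between begin and end.
    String getCoveredText() {
        return documentText.substring(begin, end);
    }

    // Mirrors the idea of Lemma.getValue(): the content of the value feature.
    String getValue() {
        return value;
    }
}

class CoveredTextVsValue {

    // Formats a span as "type | covered text | value" for comparison.
    static String describe(Span s) {
        return s.getType() + " | " + s.getCoveredText() + " | " + s.getValue();
    }

    public static void main(String[] args) {
        String text = "sei geredet.";
        int begin = text.indexOf("geredet");
        int end = begin + "geredet".length();

        // Token and Lemma cover the same characters of the document.
        Span token = new Span("Token", text, begin, end, null);
        // "reden" is an assumed lemma value, used here only for illustration.
        Span lemma = new Span("Lemma", text, begin, end, "reden");

        System.out.println(describe(token)); // Token | geredet | null
        System.out.println(describe(lemma)); // Lemma | geredet | reden
    }
}
```

In the real pipeline, the corresponding check would be to print the value feature of each Lemma annotation (getValue()) rather than its covered text, and to compare that with the token text.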

My second question: if I want to work with the annotations produced by MateLemmatizer, is it enough to add it at the right position in the method "AnalysisEngineDescription getPreprocessing()", or is there anything else I should take care of?

Thank you!

Johannes Daxenberger

Jan 30, 2017, 12:16:46 PM

to cherep...@googlemail.com, dkpro-tc-users

Hi,

As for your first question, I do not really understand what the problem is. What output did you expect? What does not work as intended?
In any case, this question is probably better asked (as you noted) in the context of DKPro Core, i.e. at dkpro-c...@googlegroups.com

As for your second question: it depends on how you are using the output of the preprocessing components. As long as the annotations your feature extractors rely on (e.g. Token, POS tag, etc.) are still produced, you can freely change the classes in the getPreprocessing() method. If that change, however, results in different annotations being generated, you also need to adapt and/or change the feature extractors relying on these annotations.

Best,
Johannes

On 29.01.17, 13:24, "cherepanov.ic via dkpro-tc-users" <dkpro-t...@googlegroups.com> wrote:
