Problem with Models for Lexical Substitution


Benjamin Klein

Aug 4, 2014, 5:47:07 PM
to dkpro-simil...@googlegroups.com
Hello,

I wanted to know whether the following code requires the 'Models for Lexical Substitution'. I downloaded TWSI2.zip and made the changes explained here (https://code.google.com/p/dkpro-similarity-asl/wiki/SettingUpTheResources), but I got an empty output file and the program terminated.

Has anyone been able to run the similarity measure in the code below, and can you help me overcome my problems with it?

Thank you!

The relevant code:
// Lexical Substitution System wrapper for
// // Resnik word similarity measure, aggregated according to Mihalcea et al. (2006)
// configs.add(new FeatureConfig(
// createExternalResourceDescription(
// TWSISubstituteWrapperResource.class,
// TWSISubstituteWrapperResource.PARAM_TEXT_SIMILARITY_RESOURCE, createExternalResourceDescription(
//     MCS06AggregateResource.class,
//     MCS06AggregateResource.PARAM_TERM_SIMILARITY_RESOURCE, createExternalResourceDescription(
//     ResnikRelatednessResource.class,
//     ResnikRelatednessResource.PARAM_RESOURCE_NAME, "wordnet",
//     ResnikRelatednessResource.PARAM_RESOURCE_LANGUAGE, "en"
//     ),
//     MCS06AggregateResource.PARAM_IDF_VALUES_FILE, UTILS_DIR + "/word-idf/" + mode.toString().toLowerCase() + "/" + dataset.toString() + ".txt")),
// "word-sim",
// "TWSI_MCS06_Resnik_WordNet"
// ));
//

Torsten Zesch

Aug 4, 2014, 7:58:11 PM
to DKPro Similarity Users
Is there any output on the console that could point to the source of
the problem?
Without some more information it is hard to debug that.

-Torsten
> --
> You received this message because you are subscribed to the Google Groups
> "DKPro Similarity Users" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to dkpro-similarity-...@googlegroups.com.
> For more options, visit https://groups.google.com/d/optout.

Benjamin Klein

Aug 6, 2014, 12:07:17 PM
to dkpro-simil...@googlegroups.com
Thank you for your quick response.

Here is the output:

TWSI_MCS06_Resnik_WordNet
log4j:WARN No appenders could be found for logger (org.springframework.core.io.support.PathMatchingResourcePatternResolver).
log4j:WARN Please initialize the log4j system properly.
[Warning] taggerModelDir not found, setting to default !
[a, male, gentleman, guy, person, gent, male gender, gentlemen, son, fellow, individual, men only, be, ride, a, bicycle, .]
[a, male, gentleman, guy, person, gent, male gender, gentlemen, son, fellow, individual, men only, be, ride, a, bike, .]
[a, female, lady, girl, womankind, womenfolk, gal, womanhood, female gender, gentler sex, gentle sex, sex, and, male, gentleman, guy, person, gent, male gender, gentlemen, son, fellow, individual, men only, be, dance, in, the, rain, .]
[a, male, gentleman, guy, person, gent, male gender, gentlemen, son, fellow, individual, men only, and, female, lady, girl, womankind, womenfolk, gal, womanhood, female gender, gentler sex, gentle sex, sex, be, dance, in, rain, .]
[someone, be, draw, .]
[someone, be, dance, .]
[a, male, gentleman, guy, person, gent, male gender, gentlemen, son, fellow, individual, men only, and, a, female, lady, girl, womankind, womenfolk, gal, womanhood, female gender, gentler sex, gentle sex, sex, be, kiss, each, other, .]
[a, male, gentleman, guy, person, gent, male gender, gentlemen, son, fellow, individual, men only, and, a, female, lady, girl, womankind, womenfolk, gal, womanhood, female gender, gentler sex, gentle sex, sex, be, talk, to, each, other, .]
[a, female, lady, girl, womankind, womenfolk, gal, womanhood, female gender, gentler sex, gentle sex, sex, be, slice, an, onion, .]
[a, female, lady, girl, womankind, womenfolk, gal, womanhood, female gender, gentler sex, gentle sex, sex, be, cut, an, onion, .]
[a, individual, human, being, creature, human being, individual person, body, character, human subject, individual party, individual subject, man, personage, be, peel, shrimp, .]
[a, individual, human, being, creature, human being, individual person, body, character, human subject, individual party, individual subject, man, personage, be, prepare, shrimp, .]
[a, female, lady, girl, womankind, womenfolk, gal, womanhood, female gender, gentler sex, gentle sex, sex, be, cut, broccoli, .]
[a, female, lady, girl, womankind, womenfolk, gal, womanhood, female gender, gentler sex, gentle sex, sex, be, slice, broccoli, .]
[a, male, gentleman, guy, person, gent, male gender, gentlemen, son, fellow, individual, men only, be, play, a, musical instrument, string, acoustic guitar, electric guitar, guitar instrument, instrument, guitar part, guitar playing, stringed instrument, guitar instrumental, .]
[a, female, lady, girl, womankind, womenfolk, gal, womanhood, female gender, gentler sex, gentle sex, sex, be, play, the, musical instrument, string, acoustic guitar, electric guitar, guitar instrument, instrument, guitar part, guitar playing, stringed instrument, guitar instrumental, .]
[two, male, gentleman, guy, person, gent, male gender, gentlemen, son, fellow, individual, men only, be, fistfight, in, a, ring, .]
[two, male, gentleman, guy, person, gent, male gender, gentlemen, son, fellow, individual, men only, fistfight, in, a, ring, .]
[a, small, lad, young man, male, fellow, child, guy, man, gentleman, youngster, male youth, be, play, with, a, canine, pooch, hound, domesticated canine, mutt, puppy, mongrel, pup, stray, .]
Error with instanceSrc and Dest differ in # of attributes: 24 != 1528java.lang.IllegalArgumentException: Src and Dest differ in # of attributes: 24 != 1528
Aug 05, 2014 1:30:31 AM org.apache.uima.analysis_engine.impl.PrimitiveAnalysisEngine_impl callAnalysisComponentProcess(410)
SEVERE: Exception occurred
org.apache.uima.analysis_engine.AnalysisEngineProcessException: Annotator processing failed.    
at org.apache.uima.analysis_engine.impl.PrimitiveAnalysisEngine_impl.callAnalysisComponentProcess(PrimitiveAnalysisEngine_impl.java:394)
at org.apache.uima.analysis_engine.impl.PrimitiveAnalysisEngine_impl.processAndOutputNewCASes(PrimitiveAnalysisEngine_impl.java:298)
at org.apache.uima.analysis_engine.impl.AnalysisEngineImplBase.process(AnalysisEngineImplBase.java:267)
at org.apache.uima.fit.pipeline.SimplePipeline.runPipeline(SimplePipeline.java:224)
at org.apache.uima.fit.pipeline.SimplePipeline.runPipeline(SimplePipeline.java:145)
at dkpro.similarity.experiments.sts2013baseline.FeatureGeneration.generateFeatures(FeatureGeneration.java:324)
at dkpro.similarity.experiments.sts2013baseline.Pipeline.runTrain(Pipeline.java:114)
at dkpro.similarity.experiments.sts2013baseline.Pipeline.main(Pipeline.java:87)
Caused by: java.lang.NullPointerException
at dkpro.similarity.algorithms.lexsub.TWSISubstituteWrapper.getSubstitutions(TWSISubstituteWrapper.java:134)
at dkpro.similarity.algorithms.lexsub.TWSISubstituteWrapper.getSubstitutions(TWSISubstituteWrapper.java:84)
at dkpro.similarity.algorithms.lexsub.TWSISubstituteWrapper.getSimilarity(TWSISubstituteWrapper.java:51)
at dkpro.similarity.uima.resource.JCasTextSimilarityResourceBase.getSimilarity(JCasTextSimilarityResourceBase.java:22)
at dkpro.similarity.uima.annotator.SimilarityScorer.process(SimilarityScorer.java:97)
at org.apache.uima.analysis_component.JCasAnnotator_ImplBase.process(JCasAnnotator_ImplBase.java:48)
at org.apache.uima.analysis_engine.impl.PrimitiveAnalysisEngine_impl.callAnalysisComponentProcess(PrimitiveAnalysisEngine_impl.java:378)
... 7 more

Torsten Zesch

Aug 6, 2014, 1:19:25 PM
to Benjamin Klein, DKPro Similarity Users
ok, it seems that the TWSI is not properly initialized.
Are you sure you have properly installed everything that is needed?

You could try to debug the constructor of TWSISubstituteWrapper in
order to see why this is not properly loaded.

In the development trunk of the project I have also changed the
component so that it no longer swallows exceptions; a swallowed
exception might be what complicates the case here.

-Torsten
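
To illustrate why swallowed exceptions make a failure like the one above hard to trace, here is a minimal, generic sketch. It is not the actual TWSISubstituteWrapper code; the class name `ExceptionHandlingSketch`, both method names, and the file name are hypothetical. The first variant prints the error and returns null, so the caller only fails later with a NullPointerException; the second fails immediately and keeps the original cause.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;

// Hypothetical illustration only; not taken from the DKPro Similarity or TWSI sources.
class ExceptionHandlingSketch {

    // Swallowing variant: reports the problem on stderr and returns null,
    // so the real cause is easy to miss and the caller fails later.
    static List<String> loadSwallowing(Path file) {
        try {
            return Files.readAllLines(file);
        } catch (IOException e) {
            System.err.println("Error with " + file + ": " + e);
            return null;
        }
    }

    // Propagating variant: fails at the point of the problem and
    // attaches the original exception as the cause.
    static List<String> loadPropagating(Path file) {
        try {
            return Files.readAllLines(file);
        } catch (IOException e) {
            throw new IllegalStateException("Could not load " + file, e);
        }
    }

    public static void main(String[] args) {
        Path missing = Paths.get("no-such-model-file.txt");
        // Prints an error message to stderr, then "null" to stdout.
        System.out.println(loadSwallowing(missing));
        // Throws IllegalStateException with the I/O error as its cause.
        loadPropagating(missing);
    }
}
```

With the propagating variant, the stack trace points at the file that could not be read instead of at the later null dereference.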

Benjamin Klein

Aug 6, 2014, 1:24:30 PM
to Torsten Zesch, DKPro Similarity Users
OK. I followed these instructions:

Models for Lexical Substitution

The lexical substitution system based on supervised word sense disambiguation (Biemann, 2012) automatically provides substitutions for a set of about 1,000 frequent English nouns with high precision.

In order to use this system, download the word models here, and extract them to $DKPRO_HOME/TWSI2. In a final step, edit $DKPRO_HOME/TWSI2/conf/TWSI2_config.conf and set the correct absolute path for mainDir.

Is there anything else that I needed to do? 

Torsten Zesch

Aug 6, 2014, 1:30:34 PM
to Benjamin Klein, DKPro Similarity Users
That should be it, but - well - we don't know for sure that your
configuration is right.
Did you set the DKPRO_HOME environment variable?
What does your TWSI2_config.conf look like?

-Torsten
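
One quick way to answer the first question from inside the JVM is a small check like the sketch below. It only assumes the layout given in the setup instructions ($DKPRO_HOME/TWSI2/conf/TWSI2_config.conf); the class name `DkproHomeCheck` and its methods are hypothetical helpers, not part of DKPro Similarity. An IDE may launch the program with a different environment than the shell, so the check should be run the same way as the pipeline.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper; not part of DKPro Similarity.
class DkproHomeCheck {

    // Config location relative to DKPRO_HOME, as described in the setup instructions.
    static Path configPath(String dkproHome) {
        return Paths.get(dkproHome, "TWSI2", "conf", "TWSI2_config.conf");
    }

    // Returns a one-line report on whether the variable is set and the config file exists.
    static String describe(String dkproHome) {
        if (dkproHome == null || dkproHome.isEmpty()) {
            return "DKPRO_HOME is not set in this JVM's environment";
        }
        Path config = configPath(dkproHome);
        return Files.isRegularFile(config)
                ? "Found config: " + config
                : "Config missing: " + config;
    }

    public static void main(String[] args) {
        // Reads the variable exactly as the running JVM sees it.
        System.out.println(describe(System.getenv("DKPRO_HOME")));
    }
}
```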

Benjamin Klein

Aug 6, 2014, 1:51:51 PM
to Torsten Zesch, DKPro Similarity Users
Yes, I set it.

Here is the TWSI2_config.conf:

mainDir=F:/workspace4/TWSI2/
trainingsData=data/TWSI2/TWSI2_trainSentences.txt
ambiguousWordsFile=data/TWSI2/TWSI2_ambiguous_words.txt
inventoryFile=data/TWSI2/TWSI2_inventories.txt
substitutionsFile=data/TWSI2/TWSI2_substitutions.txt
monosemousWordsFile=data/TWSI2/TWSI2_targets-singlesense.txt
lemmaMapFile=data/TWSI2/TWSI2_lemma_fullform.txt
modelFolder=data/TWSI2/models_TWSI2
taggerModelPrefix=data/postagger/ptb3-tagger
wekaClassifier=weka.classifiers.functions.SMO
wekaOptions=


where DKPRO_HOME = F:/workspace4
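
A config like this can be sanity-checked with a short program such as the sketch below. It rests on two assumptions that the thread does not confirm: that the file can be parsed as a Java properties file, and that the relative entries are resolved against mainDir. The class name `TwsiConfigCheck` and the choice of which keys to skip are hypothetical; taggerModelPrefix is skipped because it names a prefix rather than a file.

```java
import java.io.IOException;
import java.io.Reader;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Properties;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper; not part of DKPro Similarity or TWSI.
class TwsiConfigCheck {

    // Keys that are not plain file or folder paths (assumption based on their names).
    static final Set<String> NON_PATH_KEYS = new HashSet<>(Arrays.asList(
            "mainDir", "wekaClassifier", "wekaOptions", "taggerModelPrefix"));

    // Lists every path entry that does not exist once resolved against mainDir.
    static List<String> missingPaths(Properties config) {
        Path mainDir = Paths.get(config.getProperty("mainDir", ""));
        List<String> missing = new ArrayList<>();
        // TreeSet gives a stable, alphabetical report order.
        for (String key : new TreeSet<>(config.stringPropertyNames())) {
            if (NON_PATH_KEYS.contains(key)) {
                continue;
            }
            Path resolved = mainDir.resolve(config.getProperty(key).trim());
            if (!Files.exists(resolved)) {
                missing.add(key + " -> " + resolved);
            }
        }
        return missing;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java TwsiConfigCheck <path to TWSI2_config.conf>");
            return;
        }
        Properties config = new Properties();
        try (Reader in = Files.newBufferedReader(Paths.get(args[0]))) {
            config.load(in);
        }
        List<String> missing = missingPaths(config);
        if (missing.isEmpty()) {
            System.out.println("All path entries exist under mainDir.");
        } else {
            missing.forEach(entry -> System.out.println("Missing: " + entry));
        }
    }
}
```

If the assumption about mainDir holds, any line reported as missing would point to an incomplete extraction or a wrong mainDir value.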

Torsten Zesch

Aug 6, 2014, 1:55:35 PM
to Benjamin Klein, DKPro Similarity Users
ok, that looks correct.
I am sorry, but all I can think of now is that you have to debug the
TWSI code to find why it does not get initialized properly.

-Torsten