setting up dkpro-similarity


PW

Apr 27, 2016, 9:26:18 PM
to DKPro Similarity Users
Hi all,

I want to set up dkpro-similarity to compute the similarity between two sentences as simply as possible (without any pipeline). I can successfully compute all measures that do not require additional resources.

I added the following dependency to my Maven POM, but it doesn't seem to be recognized when I write import org.dkpro.similarity.algorithms.vsm.VectorComparator; in the main file.
<dependency>
<groupId>org.dkpro.similarity</groupId>
<artifactId>dkpro-similarity-algorithms-vsm-asl</artifactId>
<version>2.2.0-20150918.201552-5</version>
</dependency>

Also, I am having difficulty finding an example file that simply computes the similarity score using measures that involve resources, e.g. Resnik.

Can anybody guide me on what to do?

Thanks!

Torsten Zesch

Apr 28, 2016, 3:05:46 AM
to PW, DKPro Similarity Users
The version
<version>2.2.0-20150918.201552-5</version>
doesn't look right.
Try simply 2.2.0, which is the latest release version.

-Torsten

--
You received this message because you are subscribed to the Google Groups "DKPro Similarity Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to dkpro-similarity-...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Nicolai Erbs

Apr 28, 2016, 3:42:38 AM
to dkpro-simil...@googlegroups.com
Hi,

The group IDs have changed with the new version of DKPro Similarity on Maven Central. Please add the following dependencies to your POM:

<dependency>
   <groupId>org.dkpro.similarity</groupId>
   <artifactId>dkpro-similarity-algorithms-vsm-asl</artifactId>
   <version>2.2.0</version>
</dependency>
<dependency>
   <groupId>org.dkpro.similarity</groupId>
   <artifactId>dkpro-similarity-algorithms-lexical-asl</artifactId>
   <version>2.2.0</version>
</dependency>

The imports in the class are then:
import dkpro.similarity.algorithms.api.SimilarityException;
import dkpro.similarity.algorithms.api.TextSimilarityMeasure;
import dkpro.similarity.algorithms.lexical.ngrams.WordNGramContainmentMeasure;

You can get the similarity between two lists of tokens by using:
TextSimilarityMeasure measure = new WordNGramContainmentMeasure(3);
double score = measure.getSimilarity(tokens1, tokens2);
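
For intuition about what the call above returns, here is a minimal sketch of a word n-gram containment score using only the Java standard library. It is not the DKPro Similarity implementation: the class name NGramContainmentSketch, the helper methods, and the choice to divide by the number of distinct n-grams of the first text are assumptions made for illustration, and the library's normalization may differ.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Illustrative sketch only: NOT the DKPro Similarity implementation.
class NGramContainmentSketch {

    // Collects the distinct word n-grams of a token list, each joined by a space.
    static Set<String> ngrams(List<String> tokens, int n) {
        Set<String> result = new HashSet<>();
        for (int i = 0; i + n <= tokens.size(); i++) {
            result.add(String.join(" ", tokens.subList(i, i + n)));
        }
        return result;
    }

    // Containment of text a in text b (assumed definition):
    // (number of n-grams shared by a and b) / (number of n-grams in a).
    // Returns 0.0 when a is too short to contain any n-gram.
    static double containment(List<String> a, List<String> b, int n) {
        Set<String> gramsA = ngrams(a, n);
        if (gramsA.isEmpty()) {
            return 0.0;
        }
        Set<String> shared = new HashSet<>(gramsA);
        shared.retainAll(ngrams(b, n));
        return (double) shared.size() / gramsA.size();
    }

    public static void main(String[] args) {
        List<String> tokens1 = Arrays.asList("the", "quick", "brown", "fox", "jumps");
        List<String> tokens2 = Arrays.asList("the", "quick", "brown", "dog", "jumps");
        System.out.println(containment(tokens1, tokens2, 3));
    }
}
```

In this example only one of the three trigrams of tokens1 ("the quick brown") also occurs in tokens2, so the score is one third under the assumed definition.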

This should work smoothly.  In case you experience any difficulties, please let us know.

Best Regards,
Nicolai



PW

Apr 28, 2016, 11:11:33 AM
to DKPro Similarity Users
Thanks a lot to both of you for the suggestions.
I changed the version to 2.2.0 and tried what Nicolai suggested. It works fine for computing the similarity scores that do not require additional resources.
However, I also want to compute similarity scores using ESA and word similarity measures, e.g. Resnik on WordNet or other resources.
For ESA, I tried the following:

import org.dkpro.similarity.algorithms.vsm.VectorComparator;
...
VectorComparator esa = new VectorComparator(
        new VectorIndexReader(new File("[PATH TO DKPRO_HOME]/ESA/VectorIndexes/wiktionary_en")));
esa.setInnerProduct(InnerVectorProduct.COSINE);
esa.setNormalization(VectorNorm.L2);

score = esa.getRelatedness(lemmas1, lemmas2);

System.out.println("ESA relatedness: " + score);

I cannot import the class.

Thanks,
PW

Nicolai Erbs

May 2, 2016, 4:10:56 AM
to dkpro-simil...@googlegroups.com
Dear PW,

Using ESA to compute text similarity does not work out of the box. You need to download a vector index for your resource and set the DKPRO_HOME environment variable. The documentation at https://dkpro.github.io/dkpro-similarity/settinguptheresources/ will help you with the required steps.
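
Before constructing any ESA measure, it can help to verify that DKPRO_HOME is set and that the index directory actually exists. The following is a minimal sketch using only the Java standard library. The class DkproHomeCheck and the method resolveIndex are hypothetical names, and the relative path ESA/VectorIndexes/wiktionary_en is simply the one used earlier in this thread; adjust it to wherever your index was unpacked.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Optional;

// Hypothetical helper, not part of DKPro Similarity.
class DkproHomeCheck {

    // Resolves an index directory below the given DKPRO_HOME value.
    // Returns an empty Optional if the value is unset or blank,
    // or if the resolved path is not an existing directory.
    static Optional<Path> resolveIndex(String dkproHome, String relativeIndexPath) {
        if (dkproHome == null || dkproHome.trim().isEmpty()) {
            return Optional.empty();
        }
        Path index = Paths.get(dkproHome).resolve(relativeIndexPath);
        return Files.isDirectory(index) ? Optional.of(index) : Optional.empty();
    }

    public static void main(String[] args) {
        // The relative path is an assumption taken from the earlier post.
        Optional<Path> index = resolveIndex(System.getenv("DKPRO_HOME"),
                "ESA/VectorIndexes/wiktionary_en");
        if (index.isPresent()) {
            System.out.println("Vector index found at " + index.get());
        } else {
            System.out.println("DKPRO_HOME is not set or the vector index directory is missing.");
        }
    }
}
```

This only checks the resource setup. It does not address a failing import, which would point to a missing Maven dependency rather than a missing index.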

Best regards,
Nicolai


Torsten Zesch

Feb 7, 2017, 5:07:45 AM
to kevin alex mathews, DKPro Similarity Users
The version of the dkpro-parent-pom doesn't look right:
"de.tudarmstadt.ukp.dkpro.core:dkpro-parent-pom:pom:2.2.0"

The current snapshot version uses:
<parent>
<groupId>org.dkpro</groupId>
<artifactId>dkpro-parent-pom</artifactId>
<version>13</version>
</parent>

2017-02-07 8:33 GMT+01:00 kevin alex mathews <kevinale...@gmail.com>:
I am trying to replicate Step 2: Create a new Project in https://dkpro.github.io/dkpro-similarity/gettingstarted/.

I am getting the following error when I perform a Maven build in Eclipse.

[INFO] Scanning for projects...
[ERROR] [ERROR] Some problems were encountered while processing the POMs:
[FATAL] Non-resolvable parent POM for com.kevin.iitm:com.kevin.iitm.sandbox:0.0.1-SNAPSHOT: Failure to find de.tudarmstadt.ukp.dkpro.core:dkpro-parent-pom:pom:2.2.0 in http://zoidberg.ukp.informatik.tu-darmstadt.de/artifactory/public-releases was cached in the local repository, resolution will not be reattempted until the update interval of ukp-oss-releases has elapsed or updates are forced and 'parent.relativePath' points at wrong local POM @ line 19, column 11
 @
[ERROR] The build could not read 1 project -> [Help 1]
[ERROR]  
[ERROR]   The project com.kevin.iitm:com.kevin.iitm.sandbox:0.0.1-SNAPSHOT (/home/kevin/Documents/Research/workspace/Test/pom.xml) has 1 error
[ERROR]     Non-resolvable parent POM for com.kevin.iitm:com.kevin.iitm.sandbox:0.0.1-SNAPSHOT: Failure to find de.tudarmstadt.ukp.dkpro.core:dkpro-parent-pom:pom:2.2.0 in http://zoidberg.ukp.informatik.tu-darmstadt.de/artifactory/public-releases was cached in the local repository, resolution will not be reattempted until the update interval of ukp-oss-releases has elapsed or updates are forced and 'parent.relativePath' points at wrong local POM @ line 19, column 11 -> [Help 2]
[ERROR]
[ERROR] To see the full stack trace of the errors, re-run Maven with the -e switch.
[ERROR] Re-run Maven using the -X switch to enable full debug logging.
[ERROR]
[ERROR] For more information about the errors and possible solutions, please read the following articles:
[ERROR] [Help 1] http://cwiki.apache.org/confluence/display/MAVEN/ProjectBuildingException
[ERROR] [Help 2] http://cwiki.apache.org/confluence/display/MAVEN/UnresolvableModelException


kevin alex mathews

Feb 7, 2017, 5:25:37 AM
to DKPro Similarity Users, kevinale...@gmail.com
Thanks. The issue is solved.

What about computing similarity scores using measures that involve WordNet, e.g. Resnik?
Which is the right module?

The home page suggests:
algorithms.lsr: Based on lexical-semantic resources such as WordNet or Wikipedia, e.g. GlossOverlap, JiangConrath, LeacockChodorow, Lin, Resnik, WuPalmerComparator
But I can't find the module under the groupId org.dkpro.similarity.

Thanks,
Kevin.

Torsten Zesch

Feb 7, 2017, 5:35:51 AM
to kevin alex mathews, DKPro Similarity Users
Due to dependency problems, the WordNet-based modules are not on Maven Central (yet).
However, you can find them here:

-T
