Tuning parameters for Comparators Make No Difference

Lillian R Ashmore

Jun 23, 2016, 6:13:23 PM

to Silk

Hello! I have a .csv file with generic metabolite names, and I am trying to match each one to the closest node (by Levenshtein distance) in an RDF graph (specified by a data file from WikiPathways), based on the rdfs:label attribute.

Only 111 of the 620 metabolites in my .csv file are matched to the RDF graph. Most importantly, Silk is only returning links that are 100% (perfect) string matches. I want Silk to return imperfect matches, too! I have played around with the threshold and weight of all three available Comparators (Levenshtein, Equality, Jaccard), and changing those values does not change the links I get back for a given Comparator type.
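For reference, the behavior being asked for can be sketched outside of Silk roughly as follows. This is a plain-Python illustration, not Silk's implementation: the function names (`levenshtein`, `normalized_distance`, `closest_label`) are hypothetical, and dividing the edit distance by the length of the longer string is an assumption about how a 0-to-1 threshold might be interpreted.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance counting insertions, deletions and substitutions."""
    if len(a) < len(b):
        a, b = b, a  # keep the inner row as short as possible
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def normalized_distance(a: str, b: str) -> float:
    """Edit distance scaled to [0, 1] by the longer string (an assumption)."""
    longest = max(len(a), len(b))
    return levenshtein(a, b) / longest if longest else 0.0


def closest_label(name, labels, max_distance):
    """Return (label, distance) for the nearest label, or None if it is too far."""
    if not labels:
        return None
    scored = [(normalized_distance(name.lower(), label.lower()), label)
              for label in labels]
    distance, label = min(scored)
    return (label, distance) if distance <= max_distance else None
```

With hypothetical labels `["glucose", "pyruvate", "citrate"]`, calling `closest_label("glucos", labels, 0.5)` picks `"glucose"`, while a maximum distance of `0.0` only admits exact matches and returns `None` for the same input. Note that if a threshold is applied to the raw (integer) edit distance instead, any value below 1 also admits only exact matches; whether Silk treats the threshold that way is not something this sketch can confirm.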

What's going on here? Why are only perfect matches coming back? See the screenshots below:

First run with parameters Threshold = 0.0, Weight = 2:

Second run with parameters Threshold = 0.5, Weight = 1:

Screen Shot 2016-06-23 at 5.10.15 PM.png

Screen Shot 2016-06-23 at 5.10.29 PM.png

Screen Shot 2016-06-23 at 5.10.55 PM.png

Screen Shot 2016-06-23 at 5.11.18 PM.png
