Match score using option --linkfile


Mauro Fraboni

Feb 20, 2017, 9:05:51 AM
to duke
I ran Duke for record linkage using the following command:

java no.priv.garshol.duke.Duke --linkfile=test.txt --progress --singlematch --threads=3 --profile --lookups myexample.xml

I have attached the config file, myexample.xml.

I received the following console output:
9591 processed, 1 records/second; comparisons: 278118

Total records: 44639591
Total matches: 15365
Total non-matches: 24226
Run completed, 1 records/second
39591 records total in 24140 seconds
Reading from source: 0 (0%)
Indexing: 0 (0%)
Searching: 67483 (99%)
Comparing: 3 (0%)
Callbacks: 2 (0%)

Total memory: 203423744, free memory: 26320528, used memory: 177103216


Below are some of the records from the link file. I don't understand why the score is always exactly 1.0. In some of these pairs the compared fields are not exactly the same, so I expected a score lower than 1.0.

+,3,101496659,1.0
+,5,G38595309,1.0
+,4,F39566171,1.0
+,8,201560083,1.0
+,12,501569233,1.0
+,59,F36381040,1.0
+,60,800060848,1.0
+,61,200273240,1.0
+,67,901461448,1.0
+,80,602301688,1.0
+,90,936099939,1.0
+,68,401597965,1.0
+,92,801976856,1.0
+,98,101099351,1.0
+,109,802471989,1.0
+,107,501628224,1.0
+,115,001782027,1.0
+,129,C00865391,1.0
+,121,700806171,1.0
+,289,B16782036,1.0
+,119,J35548153,1.0
+,130,241415483,1.0
+,304,J39897009,1.0
+,320,S35128853,1.0
+,294,A34986139,1.0
+,321,L38612405,1.0
+,718,336569516,1.0
+,768,J38269005,1.0
+,921,940383190,1.0
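
To check whether every line in the link file really has a score of 1.0, rather than just the ones I looked at, a small script along these lines could tally the scores. This is only a sketch: it assumes each line has four comma-separated fields (a sign, two record IDs, and the score, as in the sample above) and that the IDs contain no commas. The function name `tally_scores` is my own, not part of Duke.

```python
from collections import Counter


def tally_scores(lines):
    """Count how often each score value occurs in link-file lines.

    Assumes each line looks like '+,3,101496659,1.0' (sign, id1, id2, score).
    Blank lines and lines that do not fit this shape are skipped.
    """
    counts = Counter()
    for line in lines:
        parts = line.strip().split(",")
        if len(parts) != 4:
            continue  # not a sign,id1,id2,score line
        try:
            score = float(parts[3])
        except ValueError:
            continue  # last field is not a number
        counts[score] += 1
    return counts


# A few lines copied from the link file above.
sample = [
    "+,3,101496659,1.0",
    "+,5,G38595309,1.0",
    "+,4,F39566171,1.0",
]
print(tally_scores(sample))
```

For the full file, the same function can be fed the open file directly, for example `with open("test.txt") as f: print(tally_scores(f))`, where test.txt is the file passed to --linkfile.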
 

Thanks.
myexample.xml