Dear Tiago,
indeed, the recognition rate under a threshold is simply a closed-set recognition rate with a threshold applied, which does not correspond to any standard measure. Also, as pointed out above and as documented, the new implementation of the recognition rate in the DIR branch is not standard either -- at least not under a threshold or with open-set scores. I implemented what I thought might be interesting. Maybe we can try to push a paper with this new measure :-)
The recognition rate is only valid for closed-set scores with no threshold. I am not even sure whether the Handbook of Face Recognition uses this term at all.
I thought that, in the master branch, you were implementing another measure -- what is now the detection_identification_rate in the DIR branch -- as an open-set adaptation of the recognition rate. I assumed this because the name of your test indicated open set, and that turned out to be the case when using the new score IO, see below.
Also, as you pointed out, negative scores without corresponding positive scores are not read by the cmc_four_column function. Since the original post from Jana requested to change that, I have implemented this in the DIR branch: now all scores for all probes are read and returned. When only positive or only negative scores exist for a given probe, the other element of the pair is simply None (an empty array should also work). This is why your test for the open-set recognition rate failed -- it assumed something different. Now, each of the new measures can use this fact and select the score pairs it needs:
* only closed-set scores for the detection_identification_rate
* only open-set scores for the false_alarm_rate
* both types for the open-set recognition rate under a threshold (which is the non-standard method)
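To make the pair handling concrete, the splitting could be sketched roughly like this (a minimal illustration; the function name and the (negatives, positives) tuple layout are my assumption, not the actual DIR-branch code):

```python
def split_pairs(score_pairs):
    """Split probes into closed-set pairs (positives exist) and
    open-set negatives (no positives for that probe)."""
    # a probe with positive scores belongs to the closed-set measures
    closed = [(neg, pos) for neg, pos in score_pairs if pos is not None]
    # a probe with only negative scores is an open-set (unknown) probe
    open_set = [neg for neg, pos in score_pairs if pos is None]
    return closed, open_set
```

Each measure can then pick the part it needs, and the open-set recognition rate uses both.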
I have seen your test cases, and I have adapted them to work with the new measures:
1) equation (1) -- the detection_identification_rate with threshold 0.5 (there is no case with no threshold any more)
* 7/7 for scores-cmc-4col-open-set.txt
* 6/7 for scores-cmc-4col-open-set-one-error.txt
* 6/7 for scores-cmc-4col-open-set-two-errors.txt
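For clarity, equation (1) at rank 1 could be sketched as follows (a simplified sketch under my own naming, not the exact branch code):

```python
def detection_identification_rate(closed_pairs, threshold):
    """Fraction of known probes correctly identified at rank 1
    with a positive score at or above the threshold."""
    correct = 0
    for negatives, positives in closed_pairs:
        best_pos = max(positives)
        # rank-1: the best positive must pass the threshold and beat all negatives
        if best_pos >= threshold and (not negatives or best_pos > max(negatives)):
            correct += 1
    return correct / len(closed_pairs)
```

The denominator here is always the number of known probes, which is why all three files report out of 7.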
2) equation (2) -- the false_alarm_rate with threshold 0.5 (this is an error rate, lower values are better):
* 0/2 for scores-cmc-4col-open-set.txt
* 0/2 for scores-cmc-4col-open-set-one-error.txt
* 1/2 for scores-cmc-4col-open-set-two-errors.txt
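Equation (2) then only looks at the unknown probes; roughly (again a hedged sketch with my own names):

```python
def false_alarm_rate(open_set_negatives, threshold):
    """Fraction of unknown probes whose highest negative score
    reaches the threshold, i.e., that raise a false alarm."""
    alarms = sum(1 for negatives in open_set_negatives
                 if max(negatives) >= threshold)
    return alarms / len(open_set_negatives)
```

Since this is an error rate, the denominator is the number of unknown probes (2 in the test files), and lower is better.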
3) no equation, just my ad-hoc implementation of the open-set recognition_rate under threshold 0.5:
* 7/7 for scores-cmc-4col-open-set.txt
* 6/7 for scores-cmc-4col-open-set-one-error.txt (all open set scores filtered by the threshold)
* 6/8 for scores-cmc-4col-open-set-two-errors.txt (one open-set score was not filtered by the threshold)
4) and without a threshold (all open-set scores count as mis-recognized):
* 7/9 for scores-cmc-4col-open-set.txt
* 6/9 for scores-cmc-4col-open-set-one-error.txt
* 6/9 for scores-cmc-4col-open-set-two-errors.txt
@all: For me this seems reasonable, but I can understand if people don't like that the denominator changes in 3). Let me know your opinion. The Travis builds for this branch are green, so I could merge it into the master branch if there are no objections.