On Thu, 08 Feb 2024 08:18:50 GMT, Willy Nilly wrote:
> Using the accuracy
> figures that you provide, "99% accurate at identifying faces of
> undesirables" means ...
I even clarified what it means: “if it says somebody is on their match
list, there is only 1% chance it’s a false positive”.
> Your statement that the same ratio, 1/100, also applies to "innocent"
> people being tagged as "undesirable" is a total misunderstanding of how
> such ratios work ...
Let’s define
condition U -- person is an undesirable
condition I -- person is identified as an undesirable
We can also have the opposite conditions
condition ¬U -- person is not an undesirable
condition ¬I -- person is not identified as an undesirable
In usual probability notation, P[U] means “probability that the person
who just walked through the door is an undesirable”, and like any
probability, it must have a (real) value between 0 and 1 inclusive.
We can also have conditional probabilities, where P[I|U] means
“probability that a person is identified as an undesirable, given that
they are an undesirable”, and P[I|¬U] means “probability that a person
is (incorrectly) identified as an undesirable, given that they are
*not* an undesirable”.
So my statement about the reliability of the system can be expressed as
P[I|¬U] = 0.01
Note that I didn’t say anything about P[I|U]. That will likely be less
than 1, but its exact value is unimportant for this analysis. Let’s just
say it’s 1. If the actual value is less than 1, then this term makes
even less of a contribution to the total result below, which, we will
soon see, is dominated by the other term.
Note that, by definition, since any condition is either in effect or
is not,
P[I|U] + P[¬I|U] = 1
P[U] + P[¬U] = 1
We also have the probability that any person walking through the door
is actually an undesirable, which I gave as
P[U] = 0.001
or conversely,
P[¬U] = 0.999
So now, by the law of total probability, we can compute P[I], the
probability that the system will register a match, as
P[I] = P[I|U]P[U] + P[I|¬U]P[¬U]
= 1 × 0.001 + 0.01 × 0.999
= 0.01099
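The arithmetic above is easy to sanity-check with a few lines of Python (the variable names are my own):

```python
# Sanity check of the base-rate arithmetic in the post.
p_u = 0.001             # P[U]: base rate of undesirables
p_i_given_u = 1.0       # P[I|U]: hit rate, taken as 1 per the text
p_i_given_not_u = 0.01  # P[I|¬U]: false-positive rate

# Law of total probability: P[I] = P[I|U]P[U] + P[I|¬U]P[¬U]
p_i = p_i_given_u * p_u + p_i_given_not_u * (1 - p_u)

# Bayes' theorem: P[U|I] = P[I|U]P[U] / P[I]
p_u_given_i = p_i_given_u * p_u / p_i

print(f"P[I]   = {p_i:.5f}")          # 0.01099
print(f"P[U|I] = {p_u_given_i:.3f}")  # 0.091
```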
This is about 11 times the value of P[U]! In other words, the system
flags about 11 times as many “undesirables” as are actually present,
so we have to wade through roughly 10 false positives for every
“undesirable” we actually find. Equivalently, by Bayes’ theorem,
P[U|I] = P[I|U]P[U] / P[I]
       = 1 × 0.001 / 0.01099
       ≈ 0.091
so even when the system does register a match, there is only about a
9% chance the person really is an undesirable.
That is the “base-rate effect”.
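The effect is just as visible empirically. Here is a quick Monte Carlo sketch using the same figures (the simulation setup and names are my own):

```python
import random

random.seed(42)

P_U = 0.001             # base rate of undesirables
P_I_GIVEN_U = 1.0       # hit rate, taken as 1 per the text
P_I_GIVEN_NOT_U = 0.01  # false-positive rate
N = 1_000_000           # people walking through the door

true_matches = false_matches = 0
for _ in range(N):
    undesirable = random.random() < P_U
    flag_prob = P_I_GIVEN_U if undesirable else P_I_GIVEN_NOT_U
    flagged = random.random() < flag_prob
    if flagged:
        if undesirable:
            true_matches += 1
        else:
            false_matches += 1

print(true_matches, false_matches)
# Roughly 1,000 genuine matches and 10,000 false ones:
# about 10 false positives per "undesirable" actually found.
```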