Thanks for reaching out. The "Analyzing ML Fairness" section does briefly describe how this feature works, but not in great detail.
With demographic parity, the tool attempts to find a set of thresholds under which the positive prediction rate for each facet (the percentage of people classified in the positive class, meaning the model predicts they will reoffend within 2 years) is equal across all groups. Among all such sets, it then selects the one that minimizes misclassifications (false positives plus false negatives).
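To make the "positive prediction rate" concrete, here is a minimal sketch in Python. The function name and the scores are my own illustration, not the tool's API:

```python
import numpy as np

def positive_rate(scores, threshold):
    """Fraction of examples scored at or above the threshold,
    i.e. predicted positive (predicted to reoffend within 2 years)."""
    scores = np.asarray(scores)
    return float((scores >= threshold).mean())

# Hypothetical model scores for one facet.
scores = [0.1, 0.4, 0.6, 0.9]
print(positive_rate(scores, 0.5))  # → 0.5 (two of four predicted positive)
```

Demographic parity asks that this rate come out (approximately) equal for every facet, each evaluated at its own threshold.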
So, the tool takes the first facet ("African-American" in this case) and tries each possible threshold value for it, from 0.00 to 1.00 in steps of 0.01. At each threshold, it computes that facet's positive prediction rate and then finds, for each of the other facets, the threshold that most closely matches that rate. With all the thresholds in hand, the tool counts the misclassifications they produce. Repeating this for every candidate threshold of the first facet yields one set of faceted thresholds per candidate value, each exhibiting demographic parity. The set with the fewest misclassifications among those is the one displayed in the tool, as it is the best set of thresholds that satisfies demographic parity.
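The search described above can be sketched as a brute-force loop. This is a simplified illustration on synthetic data, with names and structure of my own choosing; the tool's actual implementation may differ in detail:

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical (scores, labels) per facet; replace with real model output.
facets = {
    "A": (rng.random(200), rng.integers(0, 2, 200)),
    "B": (rng.random(200), rng.integers(0, 2, 200)),
}
grid = np.round(np.arange(0.0, 1.01, 0.01), 2)  # 0.00, 0.01, ..., 1.00

def pos_rate(scores, t):
    return (scores >= t).mean()

def errors(scores, labels, t):
    preds = scores >= t
    return int((preds != labels.astype(bool)).sum())  # FP + FN

best = None
first, *rest = facets  # try every threshold for the first facet
s0, y0 = facets[first]
for t0 in grid:
    target = pos_rate(s0, t0)
    chosen = {first: t0}
    total = errors(s0, y0, t0)
    for name in rest:
        s, y = facets[name]
        # threshold whose positive rate best matches the target rate
        t = min(grid, key=lambda th: abs(pos_rate(s, th) - target))
        chosen[name] = t
        total += errors(s, y, t)
    if best is None or total < best[0]:
        best = (total, chosen)  # keep the lowest-error parity set

print(best)
```

The outer loop produces one candidate threshold set per grid value, and the final `best` is the parity-satisfying set with the fewest misclassifications.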
Additionally, if you set the "cost ratio" to something other than 1, the objective for each set of thresholds is no longer the raw count of misclassifications; instead, false positives or false negatives are weighted more heavily depending on the value you set the cost ratio to.
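That weighting amounts to replacing the plain error count with a weighted sum. A minimal sketch, assuming a cost ratio that scales false positives relative to false negatives (the exact convention in the tool may differ):

```python
import numpy as np

def weighted_errors(scores, labels, threshold, cost_ratio=1.0):
    """Misclassification cost with false positives weighted by
    cost_ratio relative to false negatives (illustrative only)."""
    preds = np.asarray(scores) >= threshold
    labels = np.asarray(labels).astype(bool)
    fp = int((preds & ~labels).sum())   # predicted reoffend, did not
    fn = int((~preds & labels).sum())   # predicted no reoffense, but did
    return cost_ratio * fp + fn

scores = np.array([0.2, 0.8, 0.6, 0.3])
labels = np.array([0, 1, 0, 1])
print(weighted_errors(scores, labels, 0.5))                  # FP=1, FN=1 → 2.0
print(weighted_errors(scores, labels, 0.5, cost_ratio=2.0))  # → 3.0
```

With `cost_ratio=1.0` this reduces to the plain misclassification count, so the default behavior described earlier is just the special case.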