It's all relative and any cutoffs would be arbitrary. Also, sometimes we worry about the amount of optimism (overfitting), and sometimes we just care about how good the final overfitting-corrected indexes are.
If you use validation to select from among more than two competing models, you will have to enclose the entire process in an outer bootstrap loop to properly penalize for this additional layer of data dredging.
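As a rough illustration only (not from the original post), here is one way such a nested loop might look in Python. The candidate feature sets, the logistic model, and the use of AUC as the index are all placeholder assumptions; the key point is that the selection step itself is repeated inside every bootstrap resample so the optimism estimate pays for it.

```python
# Hypothetical sketch: bootstrap optimism correction with model selection
# repeated inside each resample. Data, candidate models, and metric are
# illustrative assumptions, not the original author's code.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score

def fit_best(X, y, candidate_feature_sets):
    """Pick the candidate with the best apparent AUC on (X, y)."""
    best = None
    for cols in candidate_feature_sets:
        m = LogisticRegression(max_iter=1000).fit(X[:, cols], y)
        auc = roc_auc_score(y, m.predict_proba(X[:, cols])[:, 1])
        if best is None or auc > best[0]:
            best = (auc, cols, m)
    return best  # (apparent AUC, chosen columns, fitted model)

def optimism_corrected_auc(X, y, candidate_feature_sets, B=200, seed=0):
    """Repeat the *entire* selection-plus-fitting process in each
    bootstrap resample, then subtract the average optimism."""
    rng = np.random.default_rng(seed)
    apparent_auc, _, _ = fit_best(X, y, candidate_feature_sets)
    n, optimism = len(y), []
    for _ in range(B):
        idx = rng.integers(0, n, n)                      # bootstrap resample
        auc_boot, cols_b, m_b = fit_best(X[idx], y[idx], candidate_feature_sets)
        # performance of the resample-selected model on the original data
        auc_orig = roc_auc_score(y, m_b.predict_proba(X[:, cols_b])[:, 1])
        optimism.append(auc_boot - auc_orig)
    return apparent_auc - np.mean(optimism)
```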
You can think of 1 minus the calibration slope as a "proportion of overfitting", i.e., the fraction of what was learned from the data that was based on noise.
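To make that reading concrete, here is a minimal hypothetical sketch: the calibration slope can be estimated by regressing the outcome on the model's linear predictor in validation data, and one minus that coefficient is then the "proportion of overfitting". The function name, data, and use of statsmodels are assumptions for illustration.

```python
# Hypothetical sketch: calibration slope and "proportion of overfitting".
# lp_valid = linear predictor from the fitted model evaluated on validation
# data; y_valid = binary outcomes. Names are illustrative.
import numpy as np
import statsmodels.api as sm

def proportion_overfitting(lp_valid, y_valid):
    # Refit the outcome on the linear predictor (with intercept);
    # the slope coefficient is the calibration slope.
    fit = sm.Logit(y_valid, sm.add_constant(lp_valid)).fit(disp=0)
    slope = fit.params[1]
    return 1.0 - slope
```

Under this reading, a calibration slope of 0.80 would say that roughly 20% of what was apparently learned from the data was noise.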