A couple more thoughts. The rule of 15 is only half of the problem. That
rule ensures that your models are stable and repeatable. But you also
have to ensure adequate power and precision. The very fact that you need
to use propensity scores is an indication that you expect to lose some
power because of multicollinearity between your treatment variable and
various covariates. This loss of power manifests itself in a large
number of observations that can't be matched properly. If you stratify
by your propensity score, you lose power through the imbalance in sample
sizes across the various strata.
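To see where that imbalance comes from, here is a minimal sketch. Everything in it is hypothetical: a single confounder drives treatment, the true propensity is known rather than estimated, and the quintile cut points are arbitrary. The point is only that strata at the extremes of the propensity score are dominated by one group.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1000
# Hypothetical setup: one confounder x drives treatment assignment.
x = rng.normal(size=n)
p = 1.0 / (1.0 + np.exp(-1.5 * x))   # true propensity of treatment
treated = rng.binomial(1, p)

# Stratify on the propensity score into quintiles.
edges = np.quantile(p, [0.2, 0.4, 0.6, 0.8])
stratum = np.digitize(p, edges)

counts = []
for s in range(5):
    in_s = stratum == s
    n_t = int(treated[in_s].sum())
    n_c = int(in_s.sum()) - n_t
    counts.append((n_t, n_c))
    print(f"stratum {s}: treated={n_t:3d}  control={n_c:3d}")
```

The low-propensity strata end up almost all controls and the high-propensity strata almost all treated, which is exactly the imbalance that costs you power.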
You can and should account for this in your sample size justification.
The formulas are tedious but not difficult.
Another point is that matching is almost always a bad choice unless you
have a lot of data that you're willing to throw away. Serious matching
will leave a lot of your data unmatched, especially in the very settings
where propensity score matching is needed. The only time I would match
is when you have lots of controls for every possible treated patient.
Then you're losing observations from the group that has "too many" in it
anyway, so the loss doesn't sting as much.
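A toy greedy caliper match makes the data loss visible. The scores, group sizes, and caliper below are all made up: the treated distribution is shifted above the control distribution, so treated patients at the top simply have no control within the caliper and go unmatched.

```python
import numpy as np

rng = np.random.default_rng(1)
# Hypothetical scores: treated skew high, so some have no nearby control.
ps_treated = rng.uniform(0.3, 0.95, size=100)
ps_control = rng.uniform(0.05, 0.70, size=150)

caliper = 0.02
available = np.ones(len(ps_control), dtype=bool)
matched = 0
for p in np.sort(ps_treated)[::-1]:      # hardest-to-match first
    d = np.abs(ps_control - p)
    d[~available] = np.inf               # each control used at most once
    j = int(np.argmin(d))
    if d[j] <= caliper:
        available[j] = False
        matched += 1

print(f"matched {matched} of {len(ps_treated)} treated patients")
```

Every treated patient whose score sits above the control range plus the caliper is discarded, which is the analysis set shrinking before you have estimated anything.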
And those people who use a rule of 10 instead of a rule of 15 have
nothing really to back themselves up with other than a fear that the
rule of 15 is too harsh. I've actually heard that it might be better to
move in the opposite direction and strive for 20 events per independent
variable.
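The arithmetic behind these rules is trivial, but it's worth seeing how sharply the choice of rule caps your model. A one-liner (the 120-event count is just an example):

```python
def max_covariates(n_events, events_per_variable=15):
    # Rule-of-thumb cap on candidate predictors for a logistic or Cox model:
    # number of events divided by the chosen events-per-variable rule.
    return n_events // events_per_variable

for epv in (10, 15, 20):
    print(f"EPV {epv}: 120 events supports {max_covariates(120, epv)} covariates")
```

With 120 events, the rule of 10 allows 12 covariates, the rule of 15 allows 8, and the rule of 20 only 6 - the disagreement among the rules can halve your covariate budget.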
Finally, from what I've read, there's not a lot of consensus in the
research community about how to compute and use propensity scores.
Whenever there is a lack of consensus, that gives you the green light to
do what you think is best, as there is no definitive source that
everyone uses. Be ready to adapt to a different approach, though, as
peer reviewers are unpredictable and are likely to ask for changes no
matter which approach you choose.