Your adaptation of those ideas looks quite interesting; I am excited to see the result. If the extra computation time is not too high, this will probably work.
I must agree that, in my few experiments, SPSA is an amazing tuner, but without good starting values it will not help much. I am happy to be corrected or enlightened on this, but here is my understanding:
For example, in a few private experiments, when I introduced a new parameter (which could be worthless or, worse, detrimental), SPSA (with default R and A values) did not converge to 0 to cancel the parameter, as I was expecting.
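For concreteness, here is a minimal generic SPSA sketch in Python. This is my reading of the textbook algorithm, not fishtest's actual implementation; the a, c, and A constants are illustrative stand-ins for the R and A values mentioned above, and all numbers are made up. The trailing example shows the effect I mean: a parameter that never touches the objective drifts with the noise instead of being pulled to 0.

```python
import random

def spsa_minimize(f, theta, iters=1000,
                  a=0.2, c=0.1, A=100, alpha=0.602, gamma=0.101):
    """Minimal SPSA loop; f is a noisy objective to minimize
    (think of it as a loss estimated from a batch of games)."""
    theta = list(theta)
    n = len(theta)
    for k in range(iters):
        ak = a / (k + 1 + A) ** alpha   # step size, damped early on by A
        ck = c / (k + 1) ** gamma       # perturbation size
        delta = [random.choice((-1, 1)) for _ in range(n)]  # random +/-1 per dim
        plus  = [t + ck * d for t, d in zip(theta, delta)]
        minus = [t - ck * d for t, d in zip(theta, delta)]
        diff = f(plus) - f(minus)       # two noisy evaluations per iteration
        for i in range(n):
            # simultaneous-perturbation gradient estimate along each dimension
            theta[i] -= ak * diff / (2 * ck * delta[i])
    return theta

# theta[1] is a "worthless" parameter: it never affects the objective.
# SPSA does not drive it to 0; it random-walks around its starting value.
noisy = lambda x: x[0] ** 2 + random.gauss(0, 0.5)
print(spsa_minimize(noisy, [3.0, 3.0]))
```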
Even once you have the best SPSA value, there is still no guarantee that your new parameter will improve on the master code... The extra computation time may be the culprit, or the code may simply be wrong or behave randomly.
SPSA will find the best average value for the parameter, but that average value might do no good in some extreme cases, and those losses can outweigh the gains from the "good" average cases. The different cases are not necessarily distributed "normally".
To get around this, my thought was that by splitting the parameter into several parameters, I would be able to isolate the cases that pay off (see the toy example below). But still, it does not seem to work that way.
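As a toy illustration of the previous two paragraphs (with invented numbers, just to show the shape of the problem): if two position types prefer opposite values of a parameter, the best single "average" value suits neither, and splitting the parameter is the natural, if not always successful, remedy.

```python
# Hypothetical example: 70% of positions prefer v = +50,
# 30% prefer v = -80 (numbers invented for illustration).
def avg_loss(v):
    return 0.7 * (v - 50) ** 2 + 0.3 * (v + 80) ** 2

best = min(range(-100, 101), key=avg_loss)
print(best)  # 11, the weighted average: far from both 50 and -80

# Splitting v into v_A (type-A positions) and v_B (type-B positions)
# would let each converge to its own optimum (50 and -80) instead.
```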
Also, one can look at the tests done by hw (the endless "optimise best move" series) to be convinced that SPSA will eventually converge to "something", but that will never tell you whether the original concept was good, however hard you try to make it work.