This year has been incredibly busy. These forums or GitHub are the best place for assistance; sorry for the StackOverflow silence.
At first glance, StackOverflow has a good idea start with refactoring the setup out from the innermost loop, simply setting new inputs and calling the ControlSystemSimulation. However, your largest gains will probably be from ensuring your universes are as sparse as possible.
As written, you have
a = ctrl.Antecedent(np.arange(0, 70.1, 0.1), 'a')
b = ctrl.Antecedent(np.arange(0, 6.01, 0.01), 'b')
c = ctrl.Consequent(np.arange(0, 12.01, 0.01), 'c')
which are lengths 701, 601, and 1201, respectively. Every calculation applies against these universes.
Importantly, as of Scikit-Fuzzy 0.2 there is no difference in results or accuracy for piecewise-linear membership functions if only the critical points are included. Linear interpolation is used to fill in the gaps, so all you need are the anchor points; these need not be evenly spaced. So, instead of the above, try
a = ctrl.Antecedent([0, 3, 6, 16, 24, 30, 45, 70], 'a')
b = ctrl.Antecedent([0, 0.01, 0.02, 0.03, 0.05, 0.1, 0.12, 6.], 'b')
c = ctrl.Consequent([0, 0.01, 0.02, 0.04, 0.05, 0.1, 0.2, 12], 'c')
which now have lengths of 8, 8, and 8.
When I run this on my workstation, I see a single pass through the slimmed-down FIS function taking 1.7 ms with sparse universes, versus 1.7 ms with non-sparse universes. So this should gain you an additional factor of at least 5.
Beyond that point, it's worth noting that every simulation run is independent. If you have multiple cores, you could create separate ControlSystemSimulation objects for each core and start cranking; this should scale well.
There is likely additional opportunity for optimization in the code itself, but first try making your universes as sparse as possible.
Regards,
Josh