It would. I've thought of that. At the moment the whole application is
small enough that I'm planning on using a home-rolled cooperative
multitasker that dodges the whole context-switch thing at the expense of
weighing down the developer with the need to chop low-priority
computations up into bits that are small enough that they don't bog down
important tasks. So the whole "can't RTOS" thing is moot for me at the
moment.
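
For anyone who hasn't built one, here is a minimal sketch of the kind of cooperative loop described above. It is not my actual code: every name in it (task_t, sum_step, control_step, run_pass, CHUNK) is made up for illustration, and the "important" task is just a counter. The point is only that a low-priority job keeps its progress in a struct and does a small, bounded slice per call, so nothing ever needs a context switch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical names throughout; an illustration, not a real scheduler. */

/* A task does one short slice of work and returns. Nothing is preempted,
 * so there is no context to save. Returns true while work remains. */
typedef bool (*task_fn)(void *ctx);

typedef struct {
    task_fn run;
    void   *ctx;
} task_t;

/* Low-priority job: sum a big array, but only CHUNK elements per call. */
enum { CHUNK = 16 };

typedef struct {
    const int32_t *data;
    size_t         len;
    size_t         pos;   /* progress survives between calls */
    int64_t        sum;
} sum_job_t;

static bool sum_step(void *ctx)
{
    sum_job_t *job = ctx;
    assert(job != NULL && job->pos <= job->len);

    /* Bound the work done in this slice to at most CHUNK elements. */
    size_t left = job->len - job->pos;
    size_t end  = (left > (size_t)CHUNK) ? job->pos + (size_t)CHUNK : job->len;
    while (job->pos < end)
        job->sum += job->data[job->pos++];
    return job->pos < job->len;
}

/* Stand-in for the important task: just counts how often it got to run. */
typedef struct { unsigned ticks; } control_t;

static bool control_step(void *ctx)
{
    control_t *c = ctx;
    c->ticks++;
    return true;
}

/* One pass of the loop: every task gets exactly one slice, in table order. */
static void run_pass(const task_t *tasks, size_t n)
{
    for (size_t i = 0; i < n; i++)
        (void)tasks[i].run(tasks[i].ctx);
}
```

With a 100-element array and the control task listed first in the table, calling run_pass() until the sum is done takes 7 passes, and the control task runs once in each of them. The cost is exactly the one mentioned above: the developer has to pick CHUNK small enough that one slice never delays the important task too long.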

As for the "only one task gets the math processor" idea, I've actually
already been there, done that (sorta), with the ADSP-2101 using an RTOS.
The ADSP-2101 has some hardware context associated with its DSP
functionality that is simply not accessible via software (except by
"push" and "pop" into very shallow hardware stacks). It's not even a
matter of "slow" -- it's "you can't, sucker". So if you want to use its
DSP features in an RTOS you're limited to doing it in one task. (Well,
one task and one ISR, thanks to those shallow stacks.)

All the "regular processor" stuff can be context-switched just fine,
however. So we used the thing exactly that way: we had one task for the
heavy lifting (running a spinning-wheel gyroscope that had to be in
closed loop) with a bunch of tasks to make it play nice with the balance
of the system. That one magic control task was the _only_ task that got
its fingers onto the MAC and associated instructions; everything else was
kept away.

The board, by the way, worked great.

It would be harder to do this with the M4F. Ironically, it's because the
tools support floating point, so the compiler will happily generate FPU
instructions in any file that touches a float. In the case of that 2101,
the tools didn't know what to do with a MAC instruction and never
generated one. So it was easy to tuck all the "DSP" stuff away in
assembly language code that was only called from one C file.

I suppose it might be possible to compile just one or two magic files
using the M4F switch, and compile the rest using the M4 switch (or
whatever the GNU compiler supports -- that's my next task!!!). If so,
and if it works without weird namespace or other collisions, then I'd get
software-synthesized math for most of the thing, and hardware math for
the important stuff.
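
A rough sketch of what that split might look like with GCC, untested on real hardware. The file name control_math.c, the EXPECT_HW_FP define, and the two functions are all hypothetical. The -mfloat-abi and -mfpu options and the __ARM_FP predefined macro do exist in GCC and the Arm C Language Extensions, but the exact command lines in the comment are an assumption to check against your own toolchain. As far as I know, "soft" and "softfp" share a calling convention and so should link together, whereas "hard" does not mix with either.

```c
/* control_math.c -- hypothetical name for the one "magic" file.
 *
 * Assumed GCC invocations (verify against your toolchain's documentation):
 *   this file:
 *     arm-none-eabi-gcc -mcpu=cortex-m4 -mthumb -mfloat-abi=softfp
 *         -mfpu=fpv4-sp-d16 -DEXPECT_HW_FP -c control_math.c
 *   every other file:
 *     arm-none-eabi-gcc -mcpu=cortex-m4 -mthumb -mfloat-abi=soft -c other.c
 */
#include <assert.h>

/* The M4F FPU is single precision, so float had better be 32 bits. */
static_assert(sizeof(float) == 4, "expected a 32-bit float");

/* Fail the build if this file was supposed to get hardware floating point
 * but the compiler was not told to use it. __ARM_FP is only predefined
 * when the compiler may emit FPU instructions. */
#if defined(__arm__) && defined(EXPECT_HW_FP) && !defined(__ARM_FP)
#error "control_math.c was meant to be built with hardware floating point"
#endif

/* Returns 1 if this file was compiled with FPU instructions allowed. */
int built_with_hw_fp(void)
{
#if defined(__ARM_FP)
    return 1;
#else
    return 0;
#endif
}

/* Single-precision multiply-accumulate, standing in for the real control
 * math. Built as above, this should use the FPU; the same source built
 * with -mfloat-abi=soft would call software routines instead. */
float mac_f32(float acc, float a, float b)
{
    return acc + a * b;
}
```

The mirror-image check (an #error when __ARM_FP is defined) could go in a header included by all the other files, so a stray FPU-enabled build of a "regular" file gets caught at compile time rather than in the field.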