Let me try to make the question to the group a bit simpler:
If we were to use the PRUs to expand the parallel I/O capability of a PocketBeagle, would that be useful, and what would the system interface look like?
One example of this being done is the PocketScroller, which uses a series of shift registers to control massive arrays of LEDs on cheap HUB75 panels. That gives you an idea of what the performance and extensibility could be like.
So, what would be a rational way to expose this sort of thing to the Machinekit software? Linux driver? DDR ring buffer?
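To make the DDR ring buffer option a bit more concrete, here is a minimal sketch in C of a single-producer/single-consumer ring that one side (say, a PRU) could fill and the other (say, a Machinekit component) could drain. Everything in it is an assumption for illustration: the `pio_ring` layout, the function names and the slot count are hypothetical, not an existing Machinekit or PRU API, and on real hardware you would also need memory barriers and a properly mapped shared region.

```c
#include <stdint.h>
#include <stdbool.h>

#define RING_SLOTS 256u  /* assumed size; must be a power of two */

/* Hypothetical layout of a memory region shared by the ARM host and a PRU. */
struct pio_ring {
    volatile uint32_t head;               /* count of words written by the producer */
    volatile uint32_t tail;               /* count of words read by the consumer    */
    uint32_t          sample[RING_SLOTS]; /* one word of parallel I/O state per slot */
};

/* Producer side: store one word; returns false if the ring is full. */
static bool pio_ring_push(struct pio_ring *r, uint32_t word)
{
    uint32_t head = r->head;

    if (head - r->tail == RING_SLOTS)
        return false;                     /* consumer has fallen behind */

    r->sample[head & (RING_SLOTS - 1u)] = word;
    /* Real hardware would need a memory barrier before publishing head. */
    r->head = head + 1u;
    return true;
}

/* Consumer side: fetch the oldest word; returns false if the ring is empty. */
static bool pio_ring_pop(struct pio_ring *r, uint32_t *word)
{
    uint32_t tail = r->tail;

    if (tail == r->head)
        return false;                     /* nothing new yet */

    *word = r->sample[tail & (RING_SLOTS - 1u)];
    r->tail = tail + 1u;
    return true;
}
```

Each index has exactly one writer, so neither side needs a lock. The same layout would work in either direction, with the roles of producer and consumer swapped for output.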
Of course, the PocketScroller example doesn't really give you much of a feel for what can be done on the input side. The project is meant to be bidirectional, not just parallel output.
The BeagleLogic project gives a good idea of parallel capture capabilities, though here we are focused on augmenting the stream with shift registers.
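As a rough illustration of what input through shift registers means, here is a small software model in C of a parallel-in/serial-out chain being latched and then clocked out one bit at a time, which is roughly the loop a PRU would run against real pins. The 16-bit chain length, the `piso_*` names and the MSB-first bit order are assumptions for the sketch; no specific part or wiring is implied.

```c
#include <stdint.h>

#define CHAIN_BITS 16u                       /* assumed: two 8-bit stages daisy-chained */
#define CHAIN_MASK ((1u << CHAIN_BITS) - 1u)

/* Software stand-in for a parallel-in/serial-out shift register chain. */
struct piso_chain {
    uint32_t latched;                        /* bits captured at the last load */
};

/* Model of the "load" strobe: capture the state of all input pins at once. */
static void piso_load(struct piso_chain *c, uint32_t pins)
{
    c->latched = pins & CHAIN_MASK;
}

/* Model of one clock pulse: present the MSB on the serial output, then shift. */
static unsigned piso_clock(struct piso_chain *c)
{
    unsigned bit = (c->latched >> (CHAIN_BITS - 1u)) & 1u;

    c->latched = (c->latched << 1) & CHAIN_MASK;
    return bit;
}

/* Latch the inputs, then rebuild the parallel word from the serial stream. */
static uint32_t piso_read_word(struct piso_chain *c, uint32_t pins)
{
    uint32_t word = 0;

    piso_load(c, pins);
    for (unsigned i = 0; i < CHAIN_BITS; i++)
        word = (word << 1) | piso_clock(c);
    return word;
}
```

The point is that one load line, one clock line and one data line recover `CHAIN_BITS` inputs per pass, so the number of inputs grows with chain length while the sample rate drops in proportion.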
Perhaps Machinekit today looks at the PRUs as just a way to dump stepper pulses out, but is there a reasonable way to extend that to include more generic parallel I/O?