What's rarer is in-order processing (using a stream) that can be split across multiple threads, since in-order and multi-threaded tend to be at odds. If you are just processing images, you can usually process each one individually and use separate RPCs. For in-order + multi-threaded, you need to cut up the input data in a server-specific way (e.g., I-frame frequency for video encoding), do the heavy lifting in one thread per chunk, and then recombine at the end. (If the cutting isn't server-specific, the client is more likely to do it.) The cutting and merging easily become application-specific. Your tool would probably help in many cases even with application-specific logic, but it's just not a common enough case for us to have utilities on hand. Most of the cases I've seen show up in server→client processing, like pubsub, where the server is actually sending work to the client yet can't "just do another RPC" to the client.
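To make the cut / thread-per-chunk / recombine shape concrete, here is a rough sketch in plain Python. It is not tied to any RPC library, and every name in it (`cut_into_chunks`, `process_chunk`, `process_in_order`) is made up for illustration. The fixed-size cutting rule and the squaring "work" are stand-ins I'm assuming for whatever application-specific logic you would really have (e.g., cutting at I-frames).

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Iterable, Iterator, List


def cut_into_chunks(stream: Iterable[int], chunk_size: int) -> Iterator[List[int]]:
    """Hypothetical cutting rule: fixed-size chunks.

    A real server would cut at application-specific boundaries instead.
    """
    chunk: List[int] = []
    for item in stream:
        chunk.append(item)
        if len(chunk) == chunk_size:
            yield chunk
            chunk = []
    if chunk:
        # Emit the final, possibly shorter, chunk.
        yield chunk


def process_chunk(chunk: List[int]) -> List[int]:
    """Stand-in for the heavy lifting done independently on each chunk."""
    return [x * x for x in chunk]


def process_in_order(stream: Iterable[int], chunk_size: int = 4, workers: int = 4) -> List[int]:
    """Process chunks on a thread pool, then merge results in input order."""
    merged: List[int] = []
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Executor.map runs the calls concurrently but yields results in the
        # order the chunks were submitted, which preserves stream order.
        for part in pool.map(process_chunk, cut_into_chunks(stream, chunk_size)):
            merged.extend(part)
    return merged


if __name__ == "__main__":
    print(process_in_order(range(10), chunk_size=3))
```

One caveat with this sketch: `Executor.map` collects its input up front rather than lazily, so for a genuinely unbounded stream you would want some bounded window of in-flight chunks instead. The merge step here is also trivial (concatenation); in practice that is where the application-specific part tends to show up.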
That said, I'm certain there are others who will be happy to pick up and use your code. Thanks for posting it!