I have created a gist with my thoughts on what such a function could look like.
The examples shown are a bit contrived, with border_size=(0,0) and non-overlapping 'rolling' windows, but I think the idea should be clear.
The proc_func function can either process a window and return a value synchronously, or create a separate job/process for each window, depending on its implementation. Keeping this logic separate from the windowing function means any multiprocessing-type solution can be used, according to preference. In the asynchronous case the results tuple will be filled with None, or with whatever other value proc_func returns, i.e. it will not hold the asynchronous results themselves. proc_func then has to implement a way to deliver the asynchronous results once they are available (a callback function, for example).
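To make the sync/async distinction concrete, here is a rough sketch of the idea. It is not the code from the gist: the names apply_windows, iter_windows, window_sum and make_async_proc are made up for illustration, the data is a plain list of lists, and borders are left out entirely (the equivalent of border_size=(0,0)). The asynchronous variant assumes a thread pool from concurrent.futures, but any job mechanism could sit behind proc_func.

```python
from concurrent.futures import ThreadPoolExecutor


def iter_windows(shape, window_size):
    """Yield (row_slice, col_slice) for non-overlapping windows over a 2-D shape."""
    rows, cols = shape
    win_r, win_c = window_size
    for r in range(0, rows, win_r):
        for c in range(0, cols, win_c):
            yield slice(r, min(r + win_r, rows)), slice(c, min(c + win_c, cols))


def apply_windows(data, window_size, proc_func):
    """Hypothetical windowing function: it only cuts windows and collects
    whatever proc_func returns; it knows nothing about how work is scheduled."""
    shape = (len(data), len(data[0]))
    results = []
    for rs, cs in iter_windows(shape, window_size):
        window = [row[cs] for row in data[rs]]
        results.append(proc_func(window, (rs.start, cs.start)))
    return tuple(results)


def window_sum(window, origin):
    """Synchronous proc_func: compute and return the value directly."""
    return sum(sum(row) for row in window)


def make_async_proc(executor, func, callback):
    """Build a proc_func that submits one job per window and returns None.
    The real result is handed to callback(origin, value) when the job is done."""
    def proc_func(window, origin):
        future = executor.submit(func, window, origin)
        future.add_done_callback(lambda f: callback(origin, f.result()))
        return None
    return proc_func


data = [[r * 4 + c for c in range(4)] for r in range(4)]

# Synchronous: the results tuple holds the computed values.
sync_results = apply_windows(data, (2, 2), window_sum)

# Asynchronous: the results tuple holds only None; values arrive via the callback.
collected = {}
with ThreadPoolExecutor(max_workers=2) as executor:
    async_proc = make_async_proc(executor, window_sum, collected.__setitem__)
    async_results = apply_windows(data, (2, 2), async_proc)
# Leaving the with-block waits for all submitted jobs to finish.
```

The windowing function is identical in both cases; only the proc_func passed in changes, which is the separation described above.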
Your thoughts?
Riaan