Stephan and other PJS folk-
On 30/04/2013 22:56, Herhut, Stephan A wrote:
> If the programmer also supplied an explicit return that returns something other than undefined, that result will overwrite the result. I have decided to use undefined as a signal rather than presence of a return statement as the former allows for easy implementation of the API as a library.
In my own prototype, since I do use the `set()` method, I've been using
the invocation of that method as the signal meaning "ignore the return
value from the callback." Obviously since you do not want the `set()`
method, that option is not immediately available, unless we treat
`out[i] = expr;` as having the signalling effect, which I infer you do
not want to do.
I don't have any strong allegiance to the `set()` method or the various
write-only/write-once constraints at this point; they were largely an
attempt to restrict the protocol as much as possible.
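
To make the difference between the two signalling conventions concrete, here is a minimal sequential sketch. Everything in it is hypothetical (`makeToken`, `buildUndefinedSignal`, `buildSetSignal` are illustration names, and both variants use a `set()` token purely so they can be compared side by side); it is not the code of either prototype.

```javascript
// Hypothetical token: records whether set() was ever invoked for cell i.
function makeToken(result, i) {
  const token = {
    written: false,
    set(v) { result[i] = v; token.written = true; }
  };
  return token;
}

// Convention A (the quoted proposal): any return value other than
// undefined overwrites whatever was written to the cell.
function buildUndefinedSignal(length, callback) {
  const result = new Array(length).fill(0);
  for (let i = 0; i < length; i++) {
    const ret = callback(i, makeToken(result, i));
    if (ret !== undefined) result[i] = ret;
  }
  return result;
}

// Convention B (the set()-based prototype): invoking set() is the signal,
// so the return value is used only when set() was never called.
function buildSetSignal(length, callback) {
  const result = new Array(length).fill(0);
  for (let i = 0; i < length; i++) {
    const token = makeToken(result, i);
    const ret = callback(i, token);
    if (!token.written) result[i] = ret;
  }
  return result;
}

// A callback that both writes through the token and returns a value.
const writesAndReturns = (i, out) => { out.set(i * 10); return -1; };

console.log(buildUndefinedSignal(3, writesAndReturns)); // [ -1, -1, -1 ]
console.log(buildSetSignal(3, writesAndReturns));       // [ 0, 10, 20 ]
```

The two conventions disagree only for callbacks that both write and return something, which is exactly the case the protocol has to pin down.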
> I have not fleshed out the API for all methods, as I do not quite know what it should look like, yet. So far, I think constructor, map and filter make sense. For the other ones, I am not convinced this is needed. What does reduction on multiple values even mean?
Indeed, I have struggled with that question myself: which methods should
be generalized for this token API?
The rule-of-thumb I've employed is: If one can predict the result shape,
and if it can be used to help avoid intermediate allocations, then a
token (oriented around that result shape) might be warranted.
So:
* constructor and map() make sense.
* filter() : does it make sense? You cannot predict the precise
length ahead of time. I guess the caller could guess the length (and
thus the whole shape); but what happens if they guess wrong? Do they
need to guess the exact length, or will it suffice for them to guess >=
the end length? If they guess wrong, do we freshly allocate a result
then, or throw an exception? (Nonetheless, there is probably some
variant here that is reasonable.)
* reduce() : my intuition is that an outptr does not make sense given
its current API. [1] But this is not a case I feel the need to optimize
right now; I've only been thinking about it in terms of making the API
"uniform" in some sense.
* scan() : seems analogous to reduce(). But then the fact that you
build up a result array/matrix leads me to wonder if I should not be so
quick to dismiss the case of reduce. (I worry in particular about the
row-computations on the image-resize benchmark, and if some sequential
variant of scan could allow us to compute the rows in-place.)
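
The open questions about filter() can be sketched as follows. This assumes one possible variant in which the caller passes a guessed upper bound on the result length; `filterInto`, `guessedLength` and `onOverflow` are hypothetical names for illustration, not part of any proposed API.

```javascript
// Sequential sketch of a filter that writes into preallocated storage.
// guessedLength is the caller's guess; it only needs to be >= the final
// length for the preallocated buffer to be reused.
function filterInto(source, predicate, guessedLength, onOverflow = "throw") {
  let buffer = new Float64Array(guessedLength);
  let count = 0;
  for (const x of source) {
    if (!predicate(x)) continue;
    if (count === buffer.length) {
      // The guess was too small: either reject, or fall back to a
      // fresh (larger) allocation and copy what we have so far.
      if (onOverflow === "throw") {
        throw new RangeError("guessed length too small");
      }
      const bigger = new Float64Array(Math.max(1, buffer.length * 2));
      bigger.set(buffer);
      buffer = bigger;
    }
    buffer[count++] = x;
  }
  // subarray() returns a view on the same storage, so an over-estimate
  // costs unused space but no extra copy.
  return buffer.subarray(0, count);
}

const isEven = (x) => x % 2 === 0;
console.log(Array.from(filterInto([1, 2, 3, 4, 5, 6], isEven, 4)));         // [ 2, 4, 6 ]
console.log(Array.from(filterInto([1, 2, 3, 4, 5, 6], isEven, 2, "grow"))); // [ 2, 4, 6 ]
```

With an over-estimate the result is a trimmed view of the preallocated buffer; with an under-estimate the sketch either throws or silently reallocates, which are the two behaviours the bullet above asks us to choose between.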
----
The most important thing to me is that the token itself have parallel
methods, so that a large sub-computation can itself be distributed.
(This is again related to what I'd like to see for the image-resize
benchmark.) I didn't see this addressed in Stephan's note, but maybe it
was implicit. (Maybe it would just be built into all StructViews in
engines with PJS.)
So, as a toy-demonstration of one potential API method:
var [A,B,C] = [2,3,2];
print(new ParallelMatrix([A], [B,C], (i, outptr) => outptr.gather((j,k) => 1000+100*i+10*j+k)))
prints:
[[[1000,1001], [1010,1011], [1020,1021]],
[[1100,1101], [1110,1111], [1120,1121]]]
The `outptr.gather()` is allowed to fork off parallel threads to compute
the [B,C] elements; so even though outptr is not an instance of
Array/ParallelArray/ParallelMatrix/whatever-we-call-it, it nonetheless
has parallel capabilities.
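
For anyone who wants to play with the shape of this API without a PJS build, here is a purely sequential emulation. `fillShape` and `buildMatrixSketch` are hypothetical stand-ins, the cells are plain nested arrays rather than preallocated storage, and nothing runs in parallel; the sketch only mirrors the calling convention of the toy example.

```javascript
// Builds a nested array of the given shape, calling fn(...indices)
// for every leaf position.
function fillShape(shape, fn, prefix = []) {
  if (shape.length === 0) return fn(...prefix);
  const [n, ...rest] = shape;
  return Array.from({ length: n }, (_, idx) =>
    fillShape(rest, fn, [...prefix, idx]));
}

// Sequential stand-in for the constructor in the toy example: the
// callback receives the outer indices plus a token (outptr) for its cell.
function buildMatrixSketch(outerShape, cellShape, callback) {
  return fillShape(outerShape, (...outerIdx) => {
    let cell;
    const outptr = {
      // A real engine could fork parallel work here; this sketch
      // simply loops over the cell's index space.
      gather(fn) { cell = fillShape(cellShape, fn); }
    };
    const ret = callback(...outerIdx, outptr);
    // If gather() was never called, fall back to the return value.
    return cell !== undefined ? cell : ret;
  });
}

const [A, B, C] = [2, 3, 2];
const m = buildMatrixSketch([A], [B, C],
  (i, outptr) => outptr.gather((j, k) => 1000 + 100 * i + 10 * j + k));

console.log(m[0][0][0]); // 1000
console.log(m[1][2][1]); // 1121
```

Each element encodes its own indices (i in the hundreds digit, j in the tens, k in the units), which makes it easy to check that every cell landed where it should.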
Cheers,
-Felix
[1] At times I've wondered if all we would need is the programmer to
provide the shape of the expected result from the reduction operator, so
that if the user is constructing an array/matrix from a reduce()
invocation, we could do some preallocation and get some reuse of the
associated intermediate results. But inevitably I run into the issue
that the original cell-shape and the intermediate values are not
necessarily of the same domain, and I wonder if this use-case even
actually exists. That's about where I give up on trying to shoe-horn
this into the API.