Thanks both for coming today!
The following link leads to the final message in the logs, so if either
of you wish to refer back to all that was said, go here and scroll back
as far as necessary:
http://snailgeist.com/irc-logs/?message_id=282008
We left it with, as I understand it, one open question: the explicit
control of laziness in Petalisp's compute operation compared to the
ability for an implementation of SRFI-231 to have implicit laziness.
Does an explicit "compute" step give users a useful ability to control
where the "heavy lifting" occurs?
I presume that lazy array operations will take O(1) time, or maybe
O(number of dimensions). The "heavy lifting" is any operation that
actually materialises an array representation, and so will take
O(f(size of the data)) time (where "size of the data" may be much less
than the product of the dimensions in the case of a sparse array, but is
still potentially very large); f will hopefully be at worst linear, and
logarithmic would be nice (though something like an FFT ends up being
O(N log N), of course).
I won't risk defining a definitive boundary between the time cost of
lazy operations and that of materialising an array, but I think it's
clear the former will scale at worst with the dimensionality of the
array, while the latter will scale with some function of the number of
elements in the array (not forgetting sparseness).
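To make that boundary concrete, here's a minimal sketch (in Python, for
illustration only - the class and method names are my invention, not
SRFI-231 or Petalisp API): a lazy transpose only rewrites shape metadata
and the index mapping, so it costs O(rank), while materialising finally
touches every element:

```python
import itertools

class LazyArray:
    """Wraps shape metadata plus a getter; no elements are computed yet."""
    def __init__(self, shape, getter):
        self.shape = shape      # tuple of dimension sizes
        self.getter = getter    # maps an index tuple to an element

    def transpose(self):
        # O(rank): only the shape and the index mapping change.
        return LazyArray(self.shape[::-1],
                         lambda *idx: self.getter(*reversed(idx)))

    def materialise(self):
        # O(product of dimensions): every element is finally computed.
        return {idx: self.getter(*idx)
                for idx in itertools.product(*(range(n) for n in self.shape))}

a = LazyArray((2, 3), lambda i, j: 10 * i + j)
b = a.transpose()        # cheap: no elements touched
elems = b.materialise()  # expensive: computes all 6 elements
```

The point is just that `transpose` does work proportional to the number
of dimensions, while `materialise` does work proportional to the number
of elements.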
As I understand it, a lazy SRFI-231 implementation *could* provide
at-most-O(number of dimensions) implementations of all the operations
that create arrays from other arrays in some form, and only do the
"heavy lifting" when the user first calls array-copy (which by my
reading explicitly materialises an array), array->list[*],
array->vector[*], or an array-getter (directly or via array-ref).
Presumably such an implementation would also cache the materialised form
to avoid having to recreate it.
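That "force on first access, then cache" strategy might look like the
following sketch (Python for illustration; the names are mine, and
forcing the *whole* array on a single element access is just one
possible policy - an implementation could equally memoise per element):

```python
import itertools

class CachingLazyArray:
    def __init__(self, shape, getter):
        self.shape = shape
        self._getter = getter
        self._cache = None          # filled on first access

    def _force(self):
        # The "heavy lifting": materialise all elements once, then reuse.
        if self._cache is None:
            self._cache = {idx: self._getter(*idx)
                           for idx in itertools.product(
                               *(range(n) for n in self.shape))}
        return self._cache

    def ref(self, *idx):            # stands in for array-ref / the getter
        return self._force()[idx]

calls = []
a = CachingLazyArray((2, 2), lambda i, j: calls.append((i, j)) or i + j)
first = a.ref(0, 1)   # forces materialisation: 4 getter calls
second = a.ref(1, 1)  # served from the cache: no further getter calls
```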
Marco's experiences with Petalisp imply that such laziness - compressing
a stack of array operations into one larger operation - is useful for
performance. I can certainly imagine it would reduce garbage-collector
pressure by avoiding unnecessary intermediate arrays, and there are
probably many operations that compose in ways that collapse a sequence
of them into a single operation. But that holds whether the laziness is
explicit or implicit.
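The simplest case of that collapsing is fusing element-wise maps: if a
map just composes with the underlying getter, a whole stack of maps
becomes one traversal with no intermediate arrays. A tiny sketch (again
illustrative Python, not any library's API):

```python
def lazy_map(f, getter):
    # O(1): composes functions; no elements are computed here.
    return lambda *idx: f(getter(*idx))

base = lambda i: i * 1.0
g = lazy_map(lambda x: x + 1, base)   # no intermediate array allocated
h = lazy_map(lambda x: x * 2, g)      # still no intermediate array

# Only here do we touch the data, once per element:
result = [h(i) for i in range(4)]     # each element computed as (i + 1) * 2
```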
And there is also the possibility that making such laziness explicit has
some value: it lets applications create "array recipes" with a low
expected cost, particularly if there is some way of inspecting a
resulting recipe other than just calling (compute) on it - perhaps one
can query the dimensions of the resulting array, or the expected time
complexity of calling (compute) on it, or directly expose the underlying
graph?
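By way of illustration, a recipe could be a small graph node that
answers shape and cost queries without computing anything. This sketch
is entirely hypothetical - the node types and the crude cost model are
my own, not anything in SRFI-231 or Petalisp:

```python
import math

class Recipe:
    def __init__(self, shape, op, inputs=()):
        self.shape = shape      # known up front, no computation needed
        self.op = op            # e.g. "literal", "map", "fft"
        self.inputs = inputs    # child recipes in the graph

    def estimated_cost(self):
        # Crude model: an FFT node costs N log N, anything else costs N,
        # plus the cost of computing its inputs.
        n = math.prod(self.shape)
        own = n * math.log2(n) if self.op == "fft" else n
        return own + sum(r.estimated_cost() for r in self.inputs)

base = Recipe((1024,), "literal")
plan = Recipe((1024,), "fft", (Recipe((1024,), "map", (base,)),))
shape = plan.shape            # queryable without computing anything
cost = plan.estimated_cost()  # likewise: 1024*10 + 1024 + 1024
```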
On the other hand, an explicit compute operation to turn a recipe into
an array is a "performance wart" that users have to think about, rather
than just telling the implementation what they want done and trusting it
to do it - it slightly leaks implementation details in ways that are
only helpful in specific cases. The only case that comes to mind for me
for wanting to control when materialisation occurs is giving a user an
interactive UI for constructing an array recipe: assuming all the lazy
operations complete in somewhat bounded time, they can be allowed to
block the UI, while (compute) runs in a thread and the user is shown a
progress bar.
So - I suspect that WG2 will probably be interested in adopting
SRFI-231bis, because it's already in SRFI form, based on Scheme idioms,
and so on. This isn't "Do we pick SRFI-231 or Petalisp?" so much as
"Does SRFI-231 need to be modified to allow implementations to do all
the good stuff Marco developed in Petalisp?". So I think what we're
asking is for you two good people to work out how to make SRFI-231 as
good as it can be so that we can adopt it, if you're OK with that - and
we'll gladly do anything we can to help :-)
Many thanks for your time,