hi,
sorry to come back so quickly. after the last updates about memory management optimizations,
i want to share the latest optimizations about the cpu, and some thoughts.
the results are good on this configuration, compared to bounded.go.
the solution consists of a new type of stream that detects the input functions and returns instances
of compiled functions stored in a cache.
the example provided is made to get rid of 95% of the reflect package calls using the original api.
for this demonstration the cache was manually created.
but it is reasonable to say that static analysis cannot fully help.
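
to make the idea concrete, here is a minimal sketch of such a cache. it is not the code from the demonstration: the names `mapper`, `compiled` and `compile` are hypothetical, and the cache holds a single hand-written entry. a callback whose function type is known gets a closure that needs no reflection at call time, anything else falls back to `reflect.Value.Call`.

```go
package main

import (
	"fmt"
	"reflect"
)

// mapper is the "compiled" form of a user callback (hypothetical name):
// a plain closure that can be invoked without going through reflect.
type mapper func(v interface{}) interface{}

// compiled is a hand-written cache keyed by the callback's function type.
// Each entry turns a callback of that exact type into a mapper.
var compiled = map[reflect.Type]func(fn interface{}) mapper{
	reflect.TypeOf((func(int) int)(nil)): func(fn interface{}) mapper {
		f := fn.(func(int) int)
		return func(v interface{}) interface{} { return f(v.(int)) }
	},
}

// compile returns a cached fast path when the callback's type is known,
// and falls back to reflect.Value.Call otherwise. The boolean reports
// whether the cache was hit.
func compile(fn interface{}) (mapper, bool) {
	if build, ok := compiled[reflect.TypeOf(fn)]; ok {
		return build(fn), true
	}
	rv := reflect.ValueOf(fn)
	return func(v interface{}) interface{} {
		out := rv.Call([]reflect.Value{reflect.ValueOf(v)})
		return out[0].Interface()
	}, false
}

func main() {
	double, hit := compile(func(x int) int { return x * 2 })
	fmt.Println(double(21), hit) // 42 true

	exclaim, hit := compile(func(s string) string { return s + "!" })
	fmt.Println(exclaim("ok"), hit) // ok! false
}
```

the lookup itself still costs one `reflect.TypeOf` per callback, but that happens once when the stream is built, not once per value.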
overall, there is one thing i want to emphasize.
it's not central, but it matters.
the context management is leaky.
what i notice the most is the additional, useless work
it creates to check for cancellation everywhere,
because in theory and in practice it should be tested
within each independent component.
while this api clearly highlights the matter, i believe it is a phenomenon that
also occurs for regular code writers as soon as they try to write larger
pieces of code.
i need to add more on the fact that static analysis is not a good
option to generate the kind of hot path demonstrated here.
there are some problems inherent both to the proposed api
and to the type system: the call graph can't be built entirely statically upfront,
which leads to extraneous call stacks and type conversions.
at best, static analysis can help discover some hot paths,
but inevitably some won't be resolved until the very last moment.
this all leads to a mixture of stubbed functions
from the cache and from jit reflection, which in both cases is bad for performance,
not to mention maintenance.
my thought is that while regular go code, when written with all optimizations and care,
is definitely better, it is also harder to write, overly redundant, and harder to manage as the amount of code grows.
the language does not provide an elegant solution for that.
the attempts to move toward generics and error handling (?) seem
to try to answer those questions, among others. i guess, i might be wrong.
but for now this is still questionable.
i have tried to provide a solution around that,
implementing in a somewhat-generic way those patterns that the language
has promoted.
the result is that the cost/benefit is not the same for all aspects.
for performance it's a win against simple implementations,
but still a loss against well developed solutions. this is untested, but i strongly believe so.
for usability (?), i say this is a win for this implementation.
i hope the language finds a way to improve expressiveness
so it is easier to consume resources with the best care and performance.