Hi everybody,
I would like to propose a new topic for this group.
In neural networks it's quite important that every "block" used in a network has a corresponding backpropagation function, one that relates a variation of the output back to variations of the inputs and parameters, i.e. the gradients.
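To make concrete what I mean, here is a tiny hand-written block (purely illustrative, not taken from any existing framework): the forward pass computes y = w * x and caches what it needs, and the backward pass maps the incoming output gradient dL/dy back to gradients with respect to the input and the parameter.

#include <iostream>

// A hand-written "block" y = w * x together with its backpropagation function.
struct ScaleBlock {
    double w;    // parameter
    double x{};  // input cached during the forward pass

    double forward(double input) {
        x = input;
        return w * x;
    }

    // Given dL/dy, return dL/dx and dL/dw (chain rule).
    struct Grads { double dx; double dw; };
    Grads backward(double dy) const {
        return {dy * w, dy * x};
    }
};

int main() {
    ScaleBlock b{3.0};
    double y = b.forward(2.0);   // y = 6
    auto   g = b.backward(1.0);  // seed with dL/dy = 1
    std::cout << y << " " << g.dx << " " << g.dw << "\n";  // prints: 6 3 2
}

The whole difficulty, of course, is producing that backward function automatically for arbitrary compositions of blocks, which is where the approaches below come in.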
In recent years several libraries have been proposed, and recently I've been trying to collect and study some of them to understand whether they fit the problem, but in general I have some concerns. One of the approaches, for example, is to take a generic function (either a generic lambda or a templated function), traverse it by feeding in a special type so as to produce a new type representing the AST, transform that type into a new type representing its derivative, and then generate code. This approach has some problems: one is that it discards the original structure and flattens the whole call structure into a single AST, which is redundant, takes more memory to generate, and is harder to optimize. I'm sure there might be more efficient methods, but to my understanding most of them fall under the umbrella of "abusing the type system".
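To make the concern concrete, here is a minimal sketch of that pattern in plain C++ (my own illustration, not the API of any specific library): feeding a placeholder Var type into a generic lambda builds an expression type that plays the role of the AST, and a type-level transform produces the derivative expression. Note how the entire call structure collapses into one nested template type.

#include <iostream>

// Expression "AST" encoded in types.
struct Var   {           double eval(double x) const { return x; } };
struct Const { double c; double eval(double)   const { return c; } };
template <class L, class R>
struct Add { L l; R r; double eval(double x) const { return l.eval(x) + r.eval(x); } };
template <class L, class R>
struct Mul { L l; R r; double eval(double x) const { return l.eval(x) * r.eval(x); } };

template <class L, class R> Add<L, R> operator+(L l, R r) { return {l, r}; }
template <class L, class R> Mul<L, R> operator*(L l, R r) { return {l, r}; }

// Type-level differentiation: map each node type to its derivative expression.
template <class E> struct D;
template <> struct D<Var>   { static Const apply(Var)   { return {1.0}; } };
template <> struct D<Const> { static Const apply(Const) { return {0.0}; } };
template <class L, class R>
struct D<Add<L, R>> {
    static auto apply(Add<L, R> e) { return D<L>::apply(e.l) + D<R>::apply(e.r); }
};
template <class L, class R>
struct D<Mul<L, R>> {
    // product rule: (l*r)' = l'*r + l*r'
    static auto apply(Mul<L, R> e) {
        return D<L>::apply(e.l) * e.r + e.l * D<R>::apply(e.r);
    }
};

int main() {
    auto f = [](auto x) { return x * x + Const{3.0} * x; };  // f(x) = x^2 + 3x
    auto ast  = f(Var{});                      // the whole call structure flattened into one type
    auto dast = D<decltype(ast)>::apply(ast);  // derivative expression, again as a type
    std::cout << dast.eval(2.0) << "\n";       // prints 7, i.e. f'(2) = 2*2 + 3
}

Even in this toy version you can see that f's body is gone after the first call: all that is left is one big nested template type, with no trace of how the computation was originally factored into functions.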
In the last couple of years I've been researching possible "language level" solutions, mostly from a theoretical point of view (trying to figure out whether it was better to have just differentiability, or some more complex mechanism that could allow arbitrary AST transformations, in a toy language), but more recently I was pointed to this effort:
https://gist.github.com/rxwei/30ba75ce092ab3b0dce4bde1fc2c9f1d, implemented by Google engineers in Swift for TensorFlow, and presented yesterday here:
https://www.youtube.com/watch?v=Yze693W4MaU.
What they did was to embed the "differentiability" concept deep into the Swift language, allowing the Swift compiler itself to solve the problem of differentiation in a much more elegant and efficient way.
Is anyone in this group interested in working on how a similar approach could be applied to C++?