Hi all,
Great work, many thanks Thomas!
I was wondering whether there is any implementation of the new gradient TD algorithms with non-linear function approximation, MLPs to be specific?
I have seen the RLGO results in the main papers, but I believe those used a logistic linear model or something of that sort. Are there any reported results for the non-linear case as well?
On my side, I have an implementation in which the multiplication by the Hessian follows the details in Bishop's pattern recognition book. However, I would like to have code or results to consult and verify against.
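
To make the Hessian-vector product concrete, here is a minimal sketch of one way to approximate it, assuming a scalar value function V(s) = w2 . tanh(W1 s) with a single hidden layer and no biases. It uses a central finite difference of the gradient, which is a simpler stand-in for the exact backpropagation-style method and is mainly useful as a cross-check. The network shape and all function names are hypothetical, chosen for illustration only; this is not taken from any gradient TD code.

```python
import numpy as np

def mlp_value_and_grad(theta, s, n_hidden):
    """Return V(s) = w2 . tanh(W1 s) and its gradient w.r.t. the flat vector theta.

    Assumed layout of theta (illustrative): first the entries of W1
    (n_hidden x n_in, row-major), then the n_hidden entries of w2.
    """
    n_in = s.shape[0]
    W1 = theta[: n_hidden * n_in].reshape(n_hidden, n_in)
    w2 = theta[n_hidden * n_in:]
    h = np.tanh(W1 @ s)                      # hidden activations
    value = w2 @ h
    dW1 = np.outer(w2 * (1.0 - h ** 2), s)   # dV/dW1 via the chain rule
    grad = np.concatenate([dW1.ravel(), h])  # dV/dw2 = h
    return value, grad

def hessian_vector_product(theta, s, v, n_hidden, eps=1e-5):
    """Approximate (d^2 V / d theta^2) v by a central difference of gradients."""
    _, g_plus = mlp_value_and_grad(theta + eps * v, s, n_hidden)
    _, g_minus = mlp_value_and_grad(theta - eps * v, s, n_hidden)
    return (g_plus - g_minus) / (2.0 * eps)
```

Each product costs two gradient evaluations and never forms the full Hessian. An exact implementation can be checked against it on a few random vectors v.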
Regards,
Aisha