Hi,
I am trying to replicate the Generative Adversarial Network (GAN) example found
here with Julia. I think I managed to make it work (see
here), but I was wondering if I could get some feedback on one issue. I've read in a few posts about the dangers of training two networks at once, such as longer training times caused by computing gradients with respect to the other network, or unknowingly updating the other network's weights. The original MXNet example uses a function called `.detach()` to remove a variable from the graph, precisely to prevent those problems. It seems to me that this is unnecessary with Knet, since `grad` only takes gradients with respect to the first argument.
Could anyone take a look at my code (see
here) and comment on possible improvements, and on whether my assumption is correct?
Thanks!