Hi,
Here is a convolution layer that I use:
# CONVOLUTION LAYER ############################################################
# Dropout is applied to the input; the default pooling is max pooling with a 2x2
# kernel (note that pooling changes the spatial size of a single sample, but that
# is not a concern while defining the convolution layers).
using Knet   # for conv4, pool, batchnorm, dropout, param, bnmoments, bnparams, relu

struct Conv; w; m; wb; f; p; end

# Forward pass: dropout -> conv4 -> batchnorm -> activation -> pool
(c::Conv)(x) = pool(c.f.(batchnorm(conv4(c.w, dropout(x, c.p); stride=1, padding=0),
                                   c.m, c.wb; training=true)))

# Constructor (an alternative weight initialization would be 0.01*randn(w1,w2,cx,cy))
Conv(w1::Int, w2::Int, cx::Int, cy::Int; f=relu, pdrop=0) =
    Conv(param(w1,w2,cx,cy), bnmoments(), param(bnparams(cy)), f, pdrop)
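For context, I construct and apply a layer like this (the shapes here are just illustrative, not my actual ones):

c = Conv(5, 5, 3, 16)            # 5x5 kernels, 3 input channels, 16 output channels
x = randn(Float32, 28, 28, 3, 8) # dummy minibatch: 28x28 images, 3 channels, batch of 8
y = c(x)                         # conv gives 24x24x16x8, max pooling halves it to 12x12x16x8
size(y)                          # (12, 12, 16, 8)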
And here is how I update the learnable parameters, where d_lenet is the discriminator chain in DCGAN:
fval = @diff d_lenet(xx, yy)   # records the loss on a tape
for p in params(fval)          # all learnable parameters found on the tape
    ∇p = grad(fval, p)         # gradient of the loss with respect to p
    update!(p, ∇p)
end
Problems:
1. Optimizing the generator is tricky, because the generator loss runs through both the generator and the discriminator. Hence, when I use "for p in params(fval)" as above, it iterates over all learnable parameters on the tape, including the discriminator's. Is there a way of selecting only the generator's parameters? I could put a counter in the loop and update only the nth parameter in params(fval), but that is dirty :) (see the first sketch after this list for the kind of thing I am after).
2. When I set the "f" activation in the layers (e.g. the convolution layer above) to relu, all looks fine. But when I set it to leaky_relu (a function that I define myself, roughly as in the second sketch below), it just does not work.
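For problem 1, is something along these lines the intended way? (Here g_lenet, gen_loss, and noise are placeholders for my generator chain, generator loss, and latent input; I have not verified that calling params on the model rather than on the tape behaves as I hope.)

fval = @diff gen_loss(g_lenet, d_lenet, noise) # the loss tape touches both models
for p in params(g_lenet)   # collect Params from the generator struct only
    ∇p = grad(fval, p)     # the gradient is still read off the full tape
    update!(p, ∇p)         # the discriminator's parameters are left untouched
end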
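For problem 2, the kind of definition I mean is an elementwise function that gets broadcast by c.f.(...) in the layer, e.g. (the exact slope here is just for illustration):

leaky_relu(x) = max(0.02 * x, x)   # scalar leaky ReLU, broadcast over the activations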
Thanks