Implementing DCGAN with Struct Chain

burak...@gmail.com

Sep 28, 2021, 1:40:03 PM
to knet-users
Hi,
I am trying to reimplement the dcgan.jl example (https://github.com/denizyuret/Knet.jl/tree/master/examples/dcgan-mnist) using a Chain struct, similar to the approach in the Knet CNN example. I have run into a couple of problems that I could not resolve.

Here is a convolution layer that I use:

using Knet

# CONVOLUTION LAYER ############################################################
# Dropout on the input, then conv4 -> batchnorm -> activation -> pooling.
# Default pooling: max pooling with a 2x2 window (note that this changes the
# spatial size of a sample, but that is not a concern when defining the layers).
struct Conv; w; m; wb; f; p; end

(c::Conv)(x) = pool(c.f.(batchnorm(conv4(c.w, dropout(x, c.p); stride=1, padding=0),
                                   c.m, c.wb; training=true)))

# Constructor: w1 x w2 kernel, cx input channels, cy output channels
Conv(w1::Int, w2::Int, cx::Int, cy::Int; f=relu, pdrop=0) =
    Conv(param(w1,w2,cx,cy), bnmoments(), param(bnparams(cy)), f, pdrop)
    # (weights could alternatively be initialized with 0.01*randn(w1,w2,cx,cy))
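
These layers are composed with a Chain struct, along the lines of the Knet CNN example (only a sketch of what I mean; the (x, y) method returns the loss, here nll as in that example, my actual loss may differ):

# CHAIN ########################################################################
struct Chain
    layers
    Chain(layers...) = new(layers)
end
(c::Chain)(x) = (for l in c.layers; x = l(x); end; x)  # forward pass through all layers
(c::Chain)(x, y) = nll(c(x), y)                        # negative log likelihood loss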

And here is how I update the learnable parameters, where d_lenet is the discriminator chain of the DCGAN:

        fval = @diff d_lenet(xx, yy)    # differentiate the discriminator loss
        for param in params(fval)       # all learnable parameters recorded on the tape
            ∇param = grad(fval, param)  # gradient of the loss w.r.t. this parameter
            update!(param, ∇param)      # in-place parameter update
        end

Problems:
1. Optimizing the generator is tricky, because the generator loss involves both the generator and the discriminator. So when I use "for param in params(fval)" as above, it iterates over all learnable parameters, including the discriminator's. Is there a way to select only the generator's parameters? I could put a counter in the loop and update only the n-th entry of params(fval), but that is dirty :) (See the first sketch after this list for the kind of thing I have in mind.)

2. When I set the "f" function in the layers (e.g. the convolution layer above) to relu, everything looks fine. But when I set it to leaky_relu, a function that I define myself (see the second sketch below), it just does not work.
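
For problem 1, what I was hoping would work is collecting the parameters from the generator struct itself instead of from the tape, roughly like this (untested sketch; g_lenet stands for my generator chain and generator_loss is only a placeholder for however the generator loss is computed):

        gloss = @diff generator_loss(g_lenet, d_lenet, z)  # placeholder generator loss on noise z
        for p in params(g_lenet)     # only the generator's parameters
            ∇p = grad(gloss, p)      # gradient is still read off the tape
            update!(p, ∇p)
        end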
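
For problem 2, my leaky_relu is defined along these lines (the exact slope is not the point; this is just to show the kind of function I mean):

        leaky_relu(x) = max(0.2f0 * x, x)  # scalar leaky ReLU, applied elementwise via c.f.(...) in the layer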

Thanks