I need help thinking about what numbers go into DCGAN Generator models in order to produce larger images

Grayson Earle

Aug 4, 2020, 2:51:59 PM
to Discuss
Hey all,

I really have tried to do my due diligence here, but I can't find much documentation on why certain numbers are chosen. I'm also fairly hazy on how convolutions work in generators (I have a better understanding of them in classifiers), which isn't helping my case. I think my question should be pretty simple for more experienced folks out there to address, though.

Take Google's DCGAN tutorial, for example; here is its generator function:

import tensorflow as tf
from tensorflow.keras import layers

def make_generator_model():
    model = tf.keras.Sequential()
    model.add(layers.Dense(7*7*256, use_bias=False, input_shape=(100,)))
    model.add(layers.BatchNormalization())
    model.add(layers.LeakyReLU())

    model.add(layers.Reshape((7, 7, 256)))
    assert model.output_shape == (None, 7, 7, 256)  # Note: None is the batch size

    model.add(layers.Conv2DTranspose(128, (5, 5), strides=(1, 1), padding='same', use_bias=False))
    assert model.output_shape == (None, 7, 7, 128)
    model.add(layers.BatchNormalization())
    model.add(layers.LeakyReLU())

    model.add(layers.Conv2DTranspose(64, (5, 5), strides=(2, 2), padding='same', use_bias=False))
    assert model.output_shape == (None, 14, 14, 64)
    model.add(layers.BatchNormalization())
    model.add(layers.LeakyReLU())

    model.add(layers.Conv2DTranspose(1, (5, 5), strides=(2, 2), padding='same', use_bias=False, activation='tanh'))
    assert model.output_shape == (None, 28, 28, 1)

    return model

I understand that the input is a noise vector of 100x1, and the output is a 28x28x1 image. My goal is to produce a 512x512x3 image.

In the line:
    model.add(layers.Dense(7*7*256, use_bias=False, input_shape=(100,)))


Where is 7*7*256 coming from? I understand that 7x7 is a factor of the eventual 28x28 size, so that makes sense somewhat, but what is the 256 all about? And in the following layers I notice a pattern, but I'm not sure how to rewrite it so it works for a wholly different image size.
Any help or direction is appreciated.

Thanks!

Sambath S

Aug 4, 2020, 6:07:26 PM
to Discuss, grayso...@gmail.com
model.add(layers.Dense(7*7*256, use_bias=False, input_shape=(100,)))  

This layer takes a (1, 100) noise vector and passes it through a dense layer with 7*7*256 neurons.
The final output the author needed was (28, 28, 1), so he started at 7x7: the two stride-2 layers that follow double it to 14x14 and then 28x28. It is not necessary that you start with exactly this size.

The number 256 (the channel depth) was likely chosen because it produced the best results.
You can change that number if you want, and the code will still work provided you make the necessary adjustments to the inputs of the other layers.

In this case, the author chose 7*7*256 neurons so that he could reshape the output to (7, 7, 256) using Reshape and use this to generate the image by upsampling.
You can use other image sizes too and the code will work.
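To make the shape flow concrete, here is a minimal standalone sketch of the Dense-then-Reshape step (my own example, not from the tutorial):

    import tensorflow as tf
    from tensorflow.keras import layers

    noise = tf.random.normal((1, 100))               # one 100-d noise vector
    x = layers.Dense(7 * 7 * 256, use_bias=False)(noise)
    print(x.shape)                                   # (1, 12544), since 7*7*256 = 12544
    x = layers.Reshape((7, 7, 256))(x)
    print(x.shape)                                   # (1, 7, 7, 256): a 7x7 feature map with 256 channels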

In this case, the author uses Conv2DTranspose to upsample the image; with padding='same', each spatial dimension is multiplied by the stride, as the sketch below shows.
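For example (again just a sketch), one stride-2 Conv2DTranspose doubles the spatial size:

    import tensorflow as tf
    from tensorflow.keras import layers

    x = tf.random.normal((1, 7, 7, 256))             # (batch, height, width, channels)
    # With padding='same', each spatial dimension is multiplied by the stride
    y = layers.Conv2DTranspose(64, (5, 5), strides=(2, 2), padding='same')(x)
    print(y.shape)                                   # (1, 14, 14, 64)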

Sambath S

Aug 4, 2020, 6:17:29 PM
to Discuss, Sambath S, grayso...@gmail.com
I have edited the code so that it outputs a (512, 512, 3) image. Starting from 8x8, six stride-2 Conv2DTranspose layers double the spatial size each time: 8 -> 16 -> 32 -> 64 -> 128 -> 256 -> 512.

def make_generator_model():
    model = tf.keras.Sequential()
    model.add(layers.Dense(8*8*256, use_bias=False, input_shape=(100,)))
    model.add(layers.BatchNormalization())
    model.add(layers.LeakyReLU())

    model.add(layers.Reshape((8, 8, 256)))
    assert model.output_shape == (None, 8, 8, 256) # Note: None is the batch size

    model.add(layers.Conv2DTranspose(128, (5, 5), strides=(1, 1), padding='same', use_bias=False))
    assert model.output_shape == (None, 8, 8, 128)
    model.add(layers.BatchNormalization())
    model.add(layers.LeakyReLU())

    model.add(layers.Conv2DTranspose(64, (5, 5), strides=(2, 2), padding='same', use_bias=False))
    assert model.output_shape == (None, 16, 16, 64)
    model.add(layers.BatchNormalization())
    model.add(layers.LeakyReLU())

    model.add(layers.Conv2DTranspose(32, (5, 5), strides=(2, 2), padding='same', use_bias=False))
    assert model.output_shape == (None, 32, 32, 32)
    model.add(layers.BatchNormalization())
    model.add(layers.LeakyReLU())

    model.add(layers.Conv2DTranspose(16, (5, 5), strides=(2, 2), padding='same', use_bias=False))
    assert model.output_shape == (None, 64, 64, 16)
    model.add(layers.BatchNormalization())
    model.add(layers.LeakyReLU())

    model.add(layers.Conv2DTranspose(8, (5, 5), strides=(2, 2), padding='same', use_bias=False))
    assert model.output_shape == (None, 128, 128, 8)
    model.add(layers.BatchNormalization())
    model.add(layers.LeakyReLU())

    model.add(layers.Conv2DTranspose(4, (5, 5), strides=(2, 2), padding='same', use_bias=False))
    assert model.output_shape == (None, 256, 256, 4)
    model.add(layers.BatchNormalization())
    model.add(layers.LeakyReLU())

    # tanh only on the output layer, so pixel values land in [-1, 1]
    model.add(layers.Conv2DTranspose(3, (5, 5), strides=(2, 2), padding='same', use_bias=False, activation='tanh'))
    assert model.output_shape == (None, 512, 512, 3)

    return model
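
As a quick sanity check (my own addition, assuming the imports and the make_generator_model definition above), you can build the model and push a noise vector through it:

    generator = make_generator_model()
    noise = tf.random.normal([1, 100])
    image = generator(noise, training=False)  # training=False runs BatchNorm in inference mode
    print(image.shape)                        # (1, 512, 512, 3)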


Hope this helps.
