Hi mate,
I've written and debugged a few autoencoders. First check you've got all your fundamentals down:
One, make sure your data is suitably normalized: it should have approximately zero mean
and unit standard deviation at every pixel, for both the input and the target (you may need
to adjust the activation function on the output layer accordingly). Even better, whiten your
data; most autoencoder work uses whitened inputs. Look into ZCA whitening or Local Contrast
Normalization.
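As a rough sketch of what I mean (NumPy only; the array x here is a random placeholder, swap in your own images of shape (N, H, W, C)):

    import numpy as np

    x = np.random.rand(1000, 30, 30, 1).astype(np.float32)  # placeholder: replace with your images

    # Per-pixel standardisation: zero mean, unit std at every pixel position
    mean = x.mean(axis=0, keepdims=True)
    std = x.std(axis=0, keepdims=True) + 1e-8   # epsilon avoids division by zero
    x_norm = (x - mean) / std

    # Optional ZCA whitening on the flattened images (decorrelates pixels)
    flat = x_norm.reshape(len(x_norm), -1)
    cov = np.cov(flat, rowvar=False)
    U, S, _ = np.linalg.svd(cov)
    zca = U @ np.diag(1.0 / np.sqrt(S + 1e-5)) @ U.T
    x_white = (flat @ zca).reshape(x_norm.shape)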
Two, your architecture is a little odd: you use convolutions but don't have any bias parameters.
That's not strictly wrong, but it's probably not what you want. In TensorFlow the raw convolution
op doesn't automatically include a bias term; you have to add one yourself.
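For example (a minimal sketch, not your code, and assuming TF 2.x): with the low-level tf.nn API the bias is a separate variable you add explicitly, whereas the Keras layer includes one by default:

    import tensorflow as tf

    x = tf.random.normal([8, 30, 30, 1])   # placeholder batch of 30x30 grayscale images

    # Filter weights plus a separate bias variable -- tf.nn.conv2d has no bias of its own
    w = tf.Variable(tf.random.normal([3, 3, 1, 16], stddev=0.1))
    b = tf.Variable(tf.zeros([16]))

    h = tf.nn.conv2d(x, w, strides=[1, 1, 1, 1], padding='SAME')  # convolution only
    h = tf.nn.relu(tf.nn.bias_add(h, b))                          # bias added explicitly

    # The Keras layer adds a bias for you (use_bias=True is the default)
    h2 = tf.keras.layers.Conv2D(16, 3, padding='same', activation='relu')(x)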
Three, general NN debugging tip: quadruple-check that everything has the exact shape you expect at
every point, and that every aggregation is performed over the dimension you intend.
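A cheap way to do that (just a sketch with NumPy placeholders) is to assert shapes as you go and to double-check the axis argument of every reduction:

    import numpy as np

    batch = np.random.rand(8, 30, 30, 1)   # placeholder batch: (N, H, W, C)
    recon = np.random.rand(8, 30, 30, 1)   # placeholder reconstruction
    assert batch.shape == (8, 30, 30, 1), batch.shape

    # Per-image reconstruction error: average over pixel/channel axes (1, 2, 3),
    # NOT over the batch axis -- an easy mistake that silently gives wrong numbers.
    per_image_mse = ((batch - recon) ** 2).mean(axis=(1, 2, 3))
    assert per_image_mse.shape == (8,), per_image_mse.shape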
Four, the awkward thing about training autoencoders (especially when you're not doing denoising)
is that reconstruction MSE on its own isn't necessarily a good indicator of the quality of the
internal representation. Look into tools like t-SNE to actually assess the learned
representation.
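For instance (a rough sketch, assuming you can pull the bottleneck activations out as a NumPy array and that you have scikit-learn and matplotlib available; codes and labels below are random placeholders):

    import numpy as np
    from sklearn.manifold import TSNE
    import matplotlib.pyplot as plt

    codes = np.random.rand(500, 32)              # placeholder: your encoder outputs, (N, code_dim)
    labels = np.random.randint(0, 10, size=500)  # placeholder: class ids, (N,)

    emb = TSNE(n_components=2, perplexity=30).fit_transform(codes)
    plt.scatter(emb[:, 0], emb[:, 1], c=labels, s=5, cmap='tab10')
    plt.title('t-SNE of autoencoder codes')
    plt.show()

If the classes form visible clusters in the embedding, the code is capturing something useful even if the MSE looks unremarkable.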
Debugging Tips:
Start small: can you get a simple one-layer network with no pooling or bottleneck to converge?
There's an obvious trivial solution (the identity map), so you should hit near-zero MSE pretty
quickly. Then add layers one at a time, making sure the network converges after each addition.
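Something like this (a minimal Keras sketch on made-up data, purely to illustrate the sanity check) should reach essentially zero MSE, because a single linear layer as wide as the input can learn the identity:

    import numpy as np
    import tensorflow as tf

    x = np.random.rand(1000, 900).astype('float32')  # placeholder: 1000 flattened 30x30 images

    # One dense layer, same width as the input, no bottleneck, no pooling:
    # the identity map is a valid solution, so the loss should drop to ~0.
    model = tf.keras.Sequential([
        tf.keras.layers.Dense(900, activation=None, input_shape=(900,)),
    ])
    model.compile(optimizer='adam', loss='mse')
    model.fit(x, x, epochs=20, batch_size=64, verbose=0)
    print(model.evaluate(x, x, verbose=0))  # should be close to 0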
Try building a fully connected autoencoder and getting it working before moving to
a more complicated convolutional autoencoder architecture. Try replicating something
from a paper before putting together a custom architecture; that way, if it doesn't work,
you'll know the bug is probably in your code rather than in the design.
Continually assess the quality of your internal representation; for highly structured
30x30 images (e.g. MNIST-like digits) you may only need one encoding layer and one
decoding layer to get useful results.
Visualise everything, all the time. If you've done things correctly you should see
very clear structure develop in your layer-1 weights; if you don't, your model probably
isn't converging.
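As a quick sketch of what I mean (assuming a Keras model whose first layer is a conv; the one-layer model below is just a stand-in, swap in your own trained model and adjust the indexing to match):

    import matplotlib.pyplot as plt
    import tensorflow as tf

    # Placeholder model with a single conv layer -- replace with your trained model.
    model = tf.keras.Sequential([
        tf.keras.layers.Conv2D(16, 3, padding='same', input_shape=(30, 30, 1)),
    ])

    # First-layer filters: shape (kh, kw, in_channels, out_channels)
    w = model.layers[0].get_weights()[0]

    fig, axes = plt.subplots(4, 4, figsize=(6, 6))
    for i, ax in enumerate(axes.flat):
        ax.imshow(w[:, :, 0, i], cmap='gray')  # first input channel of filter i
        ax.axis('off')
    plt.suptitle('Layer 1 filters (should show clear structure after training)')
    plt.show()

On a trained network you'd expect edge- or blob-like filters rather than random noise.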
Hope that helps,
Kind Regards
AK