Autoencoder validation loss


Michail Kravchenko

unread,
Jun 8, 2021, 4:59:07 AM6/8/21
to Machine Learning for Physicists
Dear Prof. Dr. Marquardt and all,

I'm currently working on an ML task (for signals from detectors) that contains an autoencoder part. Here is the autoencoder diagram (plotted using this nice online tool: https://alexlenail.me/NN-SVG/LeNet.html):
Autoencoder diagram.jpg
Nothing special, just Conv/ConvTranspose1D and AvgPooling/UpSampling layers.

And model training history (600 epochs in total):
Model loss.png
Note that the y-axes are log-scaled (in both plots), so the validation-loss fluctuations are not very large in relative terms.
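For reference, plotting the losses on a log-scaled y-axis like this is straightforward with matplotlib; the sketch below assumes a Keras-style `history` dict (the numeric values are hypothetical stand-ins for `model.fit(...).history`):

```python
import matplotlib
matplotlib.use("Agg")  # headless backend, no display needed
import matplotlib.pyplot as plt

# Hypothetical values standing in for model.fit(...).history
history = {
    "loss":     [1.0, 0.30, 0.10, 0.05, 0.030],
    "val_loss": [1.1, 0.35, 0.12, 0.08, 0.045],
}

fig, ax = plt.subplots()
ax.plot(history["loss"], label="train")
ax.plot(history["val_loss"], label="validation")
ax.set_yscale("log")  # log scale keeps the large early-epoch losses from flattening the rest
ax.set_xlabel("epoch")
ax.set_ylabel("loss")
ax.legend()
fig.savefig("model_loss.png")
```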

But still, are such fluctuations significant; do I need to worry about them?
Also, how can the best moment to stop training be found in such a case? And should I look at the losses on a log scale at all?
I will be grateful for any comments/hints :)

Regards,
Michail.

Michail Kravchenko

unread,
Jun 8, 2021, 5:00:52 AM6/8/21
to Machine Learning for Physicists
PS The bigger autoencoder diagram:
Autoencoder diagram.jpg

Florian Marquardt

unread,
Jun 8, 2021, 1:09:01 PM6/8/21
to Machine Learning for Physicists
I guess the number of validation samples is not so large, hence the fluctuations (larger than those of the training loss).

In Keras, one can use 'early stopping' to stop training when the validation loss stops improving. And to avoid reacting to the fluctuations, there is a parameter called 'patience', which essentially says that one should wait for a few epochs to see whether the loss really does not get better any more.

See:


(this is a callback that can be used with 'fit', as seen in the example on that page)
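To make the patience logic concrete, here is a minimal pure-Python sketch of the stopping rule (not the Keras implementation itself, just the idea behind it); `min_delta` mirrors the callback parameter of the same name:

```python
def early_stopping_epoch(val_losses, patience=3, min_delta=0.0):
    """Return the epoch (0-based) at which training would stop,
    mimicking the patience logic of early stopping."""
    best = float("inf")
    wait = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best - min_delta:  # improvement: remember it, reset the counter
            best = loss
            wait = 0
        else:                        # no improvement: be patient
            wait += 1
            if wait >= patience:
                return epoch         # patience exhausted, stop here
    return len(val_losses) - 1       # stopping never triggered

# A noisy validation curve: improves, then fluctuates upward.
losses = [1.0, 0.8, 0.85, 0.7, 0.72, 0.71, 0.73, 0.74, 0.70, 0.69]
print(early_stopping_epoch(losses, patience=3))  # stops at epoch 6
```

In Keras itself this corresponds to `tf.keras.callbacks.EarlyStopping(monitor='val_loss', patience=3, restore_best_weights=True)` passed via the `callbacks` argument of `fit`.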

Michail Kravchenko

unread,
Jun 8, 2021, 1:32:44 PM6/8/21
to Machine Learning for Physicists
Dear Prof. Dr. Marquardt,

Thank you for your answer! The full dataset for my task has ~830k events (large enough). I use the following split: 70% for training, 15% for validation, and 15% for test. In your view, could increasing the training and test sets help with these fluctuations?
Also, will early stopping be effective with such fluctuations?

Regards,
Michail.