model evaluates w/ score NaN... Where am I going wrong here?

159 views

Skip to first unread message

dan....@gmail.com

unread,

Apr 14, 2016, 10:27:06 AM4/14/16

to Keras-users

Hi all,

I'm new to Keras (and to deep learning in general), so please forgive whatever level of ignorance I'm betraying here :)

I'm trying to develop a model that trains on images of clock faces, each of which is labeled with a "score" attribute between 0.0 and 100.0. The output when testing should be a single, continuous value in that same range. The network trains without complaining, but then when I evaluate it on a test set, I get a score of NaN returned.

I have a class, CFDataProvider, which provides training and testing data -- it uses opencv's imread() to read in the images, resizes them to the provided image size (in this case, 64x64) and optionally converts them to grayscale, then returns results in a numpy array. It handles randomizing a test set based on a provided seed, and splits the training set into batches of the specified size. To the best of my knowledge this class is working correctly across the board.

So, here's the model train/test code that utilizes that data provider:

https://gist.github.com/MITREDan/6c94a7c83816540d6a1fd72a545603b0

As I said, the printout at the end is "Score: NaN". Same thing happens if I run it as RGB instead of grayscale, change the batch sizes or sample numbers, etc...

As far as I can tell there's no way to make model.train_on_batch(...) operate in verbose mode the way model.fit(...) can, but there are way too many training examples to hold in memory, so I'm using the batch training method.

It's highly likely that I have something totally wrong in the network design. I based it on a network I found online. Or, maybe I'm doing something wrong that has nothing to do with the network design?

Regardless, any assistance would be deeply appreciated!

Thanks,

-Dan

dan....@gmail.com

unread,

May 3, 2016, 2:42:21 PM5/3/16

to Keras-users, dan....@gmail.com

So, as it turns out, the problem here was just that the learning rate had to be lowered dramatically. Set it to 0.0001, and suddenly everything works just fine!

Reply all

Reply to author

Forward

0 new messages