Can Theano overflow (with RNN)?


Yizhou Hu

Oct 29, 2014, 6:10:29 PM
to theano...@googlegroups.com
I am a novice to Theano, and I modified some code from the tutorial to make an LSTM (which is a type of RNN, recurrent neural network). The code is here: https://github.com/xlhdh/classycn/blob/master/lstm.py

I have:

...
output = T.iround(output_sequence)   # round to the nearest integer and cast to an integer dtype
...
co = T.sum(output)
...

I would expect 'co' to be an integer, and it did stay that way after I trained on my own data for a few days (about 40 iterations over the dataset).
But now I am getting back co values like -2.57332079828e+21 and -4.88838717953e+20, and at the same time my loss/cost has become NaN.

It can't be just one of my sequences, because I have been iterating through all of them over and over again, so I wonder whether I ran into some overflow along the way. How can I find out?
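
For example, would it be enough to check the values the compiled function returns? Something like this (just a rough sketch of what I have in mind; train_fn and the threshold are stand-ins for my actual function and data):

import numpy as np

# cost, co = train_fn(x_seq, y_seq)   # my compiled Theano training function (placeholder)
cost, co = float('nan'), -2.57e21     # stand-in values for the sketch

for name, val in [('cost', cost), ('co', co)]:
    arr = np.asarray(val)
    if not np.isfinite(arr).all() or np.abs(arr).max() > 1e12:
        print('suspicious value in %s: %r' % (name, val))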

Thanks!!!

Pascal Lamblin

Oct 29, 2014, 6:18:17 PM
to theano...@googlegroups.com
Yes, if the values in output_sequence get bigger and bigger during training,
it is possible that the numeric types Theano uses can no longer hold those
values, and that an overflow happens.

This can happen, for instance, if the learning rate is too big, if the
weights are initialized with values that are too large, or if the inputs
have not been pre-processed adequately...
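
As a rough illustration (not from your code; the numbers here are just example values), keeping the initial weights small and standardizing the inputs looks something like this:

import numpy as np

rng = np.random.RandomState(1234)
n_in, n_hidden = 100, 50

# small-scale weight initialization (0.01 is only an example scale)
W = rng.uniform(low=-0.01, high=0.01, size=(n_in, n_hidden)).astype('float32')

# standardize the inputs to zero mean and unit variance per feature
X = rng.rand(1000, n_in).astype('float32')
X = (X - X.mean(axis=0)) / (X.std(axis=0) + 1e-8)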

--
Pascal

Frédéric Bastien

Oct 30, 2014, 9:46:09 AM
to theano-users
Just to be sure: the overflow isn't specific to Theano. It is a natural property of floating-point arithmetic on computers, which have finite precision. I know of some software that allows arbitrary precision as long as there is enough memory on the system, but it needs more memory for the data and slows down the computation.

If you want to get rid of the NaN, see Pascal's reply. Most of the time you can modify the algorithm so that it doesn't cause the weights to explode. Also, exploding weights are usually a bad sign in themselves.

If you are using float32, switching to float64 can help. It still has fixed precision, but it can hold bigger numbers.
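
For example (a quick numpy illustration, not anything Theano-specific):

import numpy as np

x = np.float32(1e20)
print(x * x)                          # inf: 1e40 is above the float32 maximum (~3.4e38)
print(np.float64(x) * np.float64(x))  # roughly 1e+40: float64 can hold it (max ~1.8e308)

In Theano, the corresponding switch is the floatX flag (e.g. floatX=float64 in THEANO_FLAGS), if I remember correctly.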

Fred



Yizhou Hu

Oct 30, 2014, 9:23:14 PM
to theano...@googlegroups.com
Thank you - I was thinking that Theano might raise an overflow exception or crash when it recognizes an overflow.
('output' is actually some sig() product of something else, so the overflow probably happened before that...)

Yizhou Hu

Nov 7, 2014, 9:35:01 PM
to theano...@googlegroups.com
Thank you.... My inputs are all between 0 and 1, and they get sigmoid'ed many times before they become the output. If I use the ultra_fast or hard sigmoid, this over/underflow occurs in fewer iterations.
Should I have asked about underflow instead? Could it be the gradients?

Would "numpy.seterr(all='warn')" possibly help debugging? I tried this but nothing happened... 

Yizhou Hu

Nov 8, 2014, 1:21:21 AM
to theano...@googlegroups.com
Thanks to everyone who helped: I think I fixed the problem! (I was being really silly...)

What happened is that my outputs were really large, sig(output) returned exactly 0, and log(0) then gave NaN in my loss function.
The official Theano tutorial explains it: http://deeplearning.net/tutorial/rbm.html#implementation

What fooled me was the -4.88838717953e+20-ish values in my outputs. Initially I thought those were the problem. I guess the NaN weights somehow turned into these numbers after being sigmoid'ed X times, rounded, and summed over. I should have focused on the NaN from the beginning!
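
In case anyone else hits the same thing, this is roughly the mechanism in plain numpy (just an illustration, not my actual loss code; eps is only an example value):

import numpy as np

x = np.float32(-1000.0)           # a very negative pre-activation
p = 1.0 / (1.0 + np.exp(-x))      # exp(1000) overflows, so the sigmoid underflows to exactly 0.0
print(p)                          # 0.0
print(np.log(p))                  # -inf; once this mixes into the cost (e.g. multiplied by a 0 target) you get NaN

# workaround: keep the probabilities away from 0 and 1 before taking the log
eps = 1e-7
print(np.log(np.clip(p, eps, 1 - eps)))   # finite (about -16.1)

In the Theano graph, the same clipping can be done with T.clip on the sigmoid output before T.log (the tutorial linked above covers the numerically stable way to write this kind of loss). Note that the overflow warnings show up here because this is plain numpy; the compiled Theano graph runs C code, which is presumably why numpy.seterr never caught anything for me.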