RNN clone memory problem


He He

Oct 2, 2015, 1:26:10 PM
to torch7

Hi all,

I'm trying to train an RNN based on Karpathy's code at https://github.com/karpathy/char-rnn.

My RNN has ~200 steps, 128 hidden units, and a vocab of ~20k. In clone_many_times (which clones the RNN unit with shared params and grads), even though collectgarbage('count') reports only about 5 MB in use, `top` shows memory filling up quickly, and I eventually hit an out-of-memory error...

I don't think 200 clones should be able to eat up all the memory on this machine, which has >100 GB of RAM...
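For context, this is roughly what clone_many_times does (a simplified sketch of model_utils.lua from char-rnn; I've trimmed the parametersNoGrad handling):

local function clone_many_times(net, T)
  local clones = {}
  local params, gradParams = net:parameters()

  -- serialize the prototype once
  local mem = torch.MemoryFile('w'):binary()
  mem:writeObject(net)

  for t = 1, T do
    -- each clone is deserialized from the same blob with a fresh reader
    local reader = torch.MemoryFile(mem:storage(), 'r'):binary()
    local clone = reader:readObject()
    reader:close()

    -- re-point the clone's weights/gradients at the prototype's storages
    local cloneParams, cloneGradParams = clone:parameters()
    for i = 1, #params do
      cloneParams[i]:set(params[i])
      cloneGradParams[i]:set(gradParams[i])
    end

    clones[t] = clone
    collectgarbage()
  end
  mem:close()
  return clones
end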

Does anyone have any idea? Thanks!!

He

alban desmaison

Oct 2, 2015, 1:32:00 PM
to torch7
You can try putting this at the beginning of your script:
torch.setheaptracking(true)
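Something like this, before you build the model (sketch; the collectgarbage() in the training loop is just illustrative):

require 'torch'
-- make tensor allocations visible to the Lua GC, so collectgarbage()
-- gets triggered even though the storages live outside the Lua heap
torch.setheaptracking(true)

-- ... build the prototype and clones ...
-- then call collectgarbage() periodically, e.g. every N batches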

He He

Oct 2, 2015, 2:00:36 PM
to torch7
Thanks, Alban!

I tried it, and also added collectgarbage() calls inside clone_many_times(), but I still get the out-of-memory problem, and again Lua "thinks" it's only using <5 MB...
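This is how I'm checking, in case I'm measuring the wrong thing (sketch):

-- collectgarbage('count') returns the Lua heap size in KB; if I understand
-- correctly, tensor storages are allocated outside that heap, which might
-- be why top disagrees with it
print(('lua heap: %.2f MB'):format(collectgarbage('count') / 1024))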

He He

Oct 2, 2015, 2:06:56 PM
to torch7
Could it be that the recursive readObject call is what's taking up the memory?
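i.e., this part of the clone loop (sketch): each readObject call rebuilds the whole module, including fresh storages for all of its tensors, before :set() re-points the weights at the shared ones; the non-parameter buffers (outputs, gradInputs, etc.) stay separate for every clone.

-- each iteration deserializes a complete copy of the prototype
local reader = torch.MemoryFile(mem:storage(), 'r'):binary()
local clone = reader:readObject()   -- allocates storages for every tensor in the module
reader:close()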