UnboundLocalError: local variable 'num_sents' referenced before assignment


Mohammad Mumin

Jul 17, 2019, 8:44:39 AM
to Nematus Support
Dear Sir,
Thank you for your previous detailed suggestions; they helped me a lot.
Now I get the following error:

INFO: Building model...
/usr/local/lib/python3.5/dist-packages/tensorflow/python/ops/gradients_impl.py:112: UserWarning: Converting sparse IndexedSlices to a dense Tensor of unknown shape. This may consume a large amount of memory.
  "Converting sparse IndexedSlices to a dense Tensor of unknown shape. "
INFO: Initializing model parameters from scratch...
INFO: Done
INFO: Reading data...
INFO: Done
INFO: Initial uidx=0
INFO: Starting epoch 0
Traceback (most recent call last):
  File "nematus/train.py", line 447, in <module>
    train(config, sess)
  File "nematus/train.py", line 170, in train
    write_summary_for_this_batch)
  File "/playground/nematus/nematus/model_updater.py", line 73, in update
    self._config.max_tokens_per_device)
  File "/playground/nematus/nematus/model_updater.py", line 200, in _split_minibatch_for_device_size
    start_points = range(0, num_sents, max_sents_per_device)
UnboundLocalError: local variable 'num_sents' referenced before assignment

I seek your kind assistance.
Thanks in advance.

Rico Sennrich

Jul 19, 2019, 7:19:54 AM
to nematus...@googlegroups.com
Hi Mohammad,

thanks for reporting this. The bug was triggered by "--max_sentences_per_device". It has now been fixed.
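
For anyone who runs into the same traceback elsewhere, it is the usual Python failure mode of a local variable that is only assigned on one code path. A rough sketch of the pattern (illustrative only, with hypothetical names; this is not the real model_updater.py code):

def split_minibatch(batch, max_sents_per_device=None, max_tokens_per_device=None):
    # Sketch of the failure pattern, not Nematus's implementation.
    if max_tokens_per_device is not None:
        num_sents = len(batch)   # only assigned on this branch
        # ... token-based splitting would go here ...
    # If only --max_sentences_per_device is set, the branch above is skipped,
    # so the next line raises:
    # UnboundLocalError: local variable 'num_sents' referenced before assignment
    start_points = range(0, num_sents, max_sents_per_device)
    return [batch[i:i + max_sents_per_device] for i in start_points]

A fix of this kind assigns num_sents unconditionally (e.g. num_sents = len(batch)) before branching on the two options.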

(we tend to use --token_batch_size and --max_tokens_per_device nowadays, which is slightly more efficient)

best wishes,
Rico


Rico Sennrich

Jul 22, 2019, 6:27:54 AM
to nematus...@googlegroups.com
Hi Mohammad,

if you want to use a similar amount of memory, you can multiply "batch_size" and "maxlen" to get "token_batch_size", and multiply "max_sentences_per_device" and "maxlen" to get "max_tokens_per_device".

With this, you will probably have slightly larger batches on average than before (in terms of number of actual words) because less computation is wasted. You can reduce batch size if you find that larger batches hurt quality; this tends only to be the case if you have very small amounts of training data.
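
As a small worked example of that conversion (a sketch only: maxlen=100 is assumed purely for illustration, so substitute whatever --maxlen you actually train with; the other numbers are the settings quoted below):

# batch_size and max_sentences_per_device come from the settings quoted below;
# maxlen is an assumed value for illustration.
batch_size = 80
max_sentences_per_device = 20
maxlen = 100  # substitute your own --maxlen

token_batch_size = batch_size * maxlen                      # 80 * 100 = 8000
max_tokens_per_device = max_sentences_per_device * maxlen   # 20 * 100 = 2000

print("--token_batch_size", token_batch_size)
print("--max_tokens_per_device", max_tokens_per_device)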

best wishes,
Rico


On 20/07/2019 10:26, Mohammad Mumin wrote:
Thank you very much, Sir. This is a great help to me. I am going to submit an article on English-Bangla NMT using Nematus very soon, if my experiments succeed.
Anyway, I will also use --token_batch_size and --max_tokens_per_device, since these are more efficient.
But I have no clue what values to use.
Usually, I use --batch_size 80 and --max_sentences_per_device 20.
In this case, if my average English sentence length is 18, should I set --token_batch_size to 80*18=1440 and --max_tokens_per_device to 20*18=360?
Can you give me some idea?
Thanks in advance.
Thanks again, Sir.
Best regards.

Mohammad Mumin

Jul 25, 2019, 3:08:56 AM
to Nematus Support
Thank you, Sir, for your explanation.