ShapeSizeException - looking for pointers!


Matt Russell (clojurian)

Apr 27, 2023, 9:02:00 AM
to marian-nmt
Hi,
I'm getting the following error (Marian NMT v1.12.0):

[2023-04-26 12:13:17] Error: Caught std::exception in sub-thread: Expanded shape size 3200000000 exceeds numeric capcacity 2147483647

Could someone point me to where I should be looking?
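
For what it's worth, the limit in the message, 2147483647, is 2^31 - 1, the largest signed 32-bit integer, so it looks as if the element count of some tensor shape no longer fits in a 32-bit int. A minimal Python sketch of that arithmetic follows. The dimensions are hypothetical, picked only because their product equals the number in the error, and the helper names are made up for illustration; this is not Marian's actual check.

```python
import math

# 2147483647, the "numeric capacity" reported in the error message
INT32_MAX = 2**31 - 1


def element_count(shape):
    """Total number of elements in a tensor with the given dimensions."""
    return math.prod(shape)


def fits_in_int32(shape):
    """True if the element count can be stored in a signed 32-bit integer."""
    return element_count(shape) <= INT32_MAX


# Hypothetical dimensions, chosen only so the product matches the error message
example_shape = (100_000, 32_000)
print(element_count(example_shape), fits_in_int32(example_shape))
```

If that reading is right, the thing to look for would be whichever setting makes one dimension of a tensor much larger than in the earlier, domain-specific run.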

Platform: docker (base image: nvidia/cuda:11.7.1-devel-ubuntu20.04)

I've attached the transformer.yaml config I'm using.

The command I'm running is:

 marian -c transformer.yaml \
        --devices 0 1 \
        --model model.npz \
        --train-sets corpus.train.cy corpus.train.en \
        --valid-sets corpus.valid.cy corpus.valid.en \
        --valid-translation-output valid.en.out \
        --valid-log marian-validation.log \
        --log marian.log \
        --quiet-translation

Background:
 I'm using a sentencepiece model and attempting to train a "general" model with the same data set that I previously used to successfully train a domain-specific model. The differences are therefore in the distribution (sizes) of the train and valid sets, and in the size of the vocabulary.
 Between the runs, we also added another GPU to the machine (it was 2 x A600, it is now 3 x A600).

 The size of the (joint vocabulary) sentencepiece model is 1.1 MB in both cases.

 The language direction (en-cy or cy-en) does not affect the result.

Many thanks!
Matt 
transformer.yaml