marian-decoder generates nothing!

Peyman Passban

Jul 5, 2018, 5:29:52 PM

to marian-nmt

Hi all,

I've trained a basic seq2seq model on a small dataset (100K), but when decoding, the model generates nothing.

When I run the decoder with --allow-unk, it generates a sequence of <unk> tokens for every input, and the same number of them for every sentence.

Any ideas?

Cheers

-P

Marcin Junczys-Dowmunt

Jul 5, 2018, 5:38:50 PM

to maria...@googlegroups.com, Peyman Passban

Hi,

please post your training configurations and the commands you are using for translation.

Marcin

--
You received this message because you are subscribed to the Google Groups "marian-nmt" group.
To unsubscribe from this group and stop receiving emails from it, send an email to marian-nmt+...@googlegroups.com.
To post to this group, send email to maria...@googlegroups.com.
Visit this group at https://groups.google.com/group/marian-nmt.
To view this discussion on the web visit https://groups.google.com/d/msgid/marian-nmt/a42a0fa4-7fc9-445f-9fe4-61491b0914af%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Peyman Passban

Jul 5, 2018, 5:43:06 PM

to marian-nmt

Thanks Marcin for your reply,

I'm using the basic ones:

Train:

./marian/build/marian \
--train-sets corpus.en corpus.ro \
--vocabs vocab.en vocab.ro \
--model model.npz

Test:

./marian/build/marian-decoder -m model.npz -v vocab.en vocab.ro <<< "This is a test ."

I trained the model for 8,000 iterations (almost 24 hours).

Cheers

-P

Marcin Junczys-Dowmunt

Jul 5, 2018, 5:53:51 PM

to maria...@googlegroups.com, Peyman Passban

Hm, are you sure these are all your settings? After 8,000 iterations you should not even have a saved model, since by default Marian saves only after 10,000 iterations. Are you training on the CPU? Usually 8,000 iterations would take at most about an hour, and much less on my GPUs. A model becomes usable after maybe 50,000 iterations, more likely 100,000.

Peyman Passban

Jul 5, 2018, 5:56:03 PM

to Marcin Junczys-Dowmunt, maria...@googlegroups.com

Oops, I meant 80,000, all on GPU!

Cheers

-P

--

Sent from my iPhone

Marcin Junczys-Dowmunt

Jul 5, 2018, 6:01:18 PM

to maria...@googlegroups.com

Can you check whether corpus.en and corpus.ro have the same number of lines, and post the first few lines of both corpora and of both vocabularies?
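The check requested above could be sketched roughly as follows. This is a minimal Python sketch, not part of Marian; the function names `count_lines`, `head`, and `check_parallel` are hypothetical helpers introduced only for illustration, and the file names in the usage note are the ones from the training command in this thread.

```python
from itertools import islice


def count_lines(path):
    """Count lines by streaming the file in binary mode (no full load into memory)."""
    with open(path, "rb") as handle:
        return sum(1 for _ in handle)


def head(path, n=5):
    """Return the first n lines as text, with line terminators stripped."""
    with open(path, encoding="utf-8", errors="replace") as handle:
        return [line.rstrip("\r\n") for line in islice(handle, n)]


def check_parallel(source_path, target_path, n=5):
    """Compare the line counts of two corpus files and collect their first n lines."""
    source_count = count_lines(source_path)
    target_count = count_lines(target_path)
    return {
        "source_lines": source_count,
        "target_lines": target_count,
        # A parallel corpus needs exactly one target line per source line.
        "aligned": source_count == target_count,
        "source_head": head(source_path, n),
        "target_head": head(target_path, n),
    }
```

Calling `check_parallel("corpus.en", "corpus.ro")` returns a dictionary whose `aligned` entry is `False` when the two files differ in length. The same `head` helper can be pointed at the vocabulary files to see what their first lines look like.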

Peyman Passban

Jul 5, 2018, 6:03:22 PM

to maria...@googlegroups.com

I've already done that. Everything looks fine except the translation results.

I'll give it a try with a new dataset.

Cheers

-P

Marcin Junczys-Dowmunt

Jul 5, 2018, 6:11:44 PM

to maria...@googlegroups.com, Peyman Passban

OK, can you still let me have a look? Especially at the vocabs. If there is something wrong with them, it might cause the issue you are seeing.

Peyman Passban

Jul 6, 2018, 11:27:16 AM

to Marcin Junczys-Dowmunt, maria...@googlegroups.com

On Thu, Jul 5, 2018 at 6:11 PM, Marcin Junczys-Dowmunt <jun...@amu.edu.pl> wrote:

OK, can you still let me have a look?

Sure, I've just shared the dataset and vocab files with you. Please have a look and let me know if there is any problem!

Especially at the vocabs. If there is something wrong it might cause the issue you are seeing.

Cheers

-P

Marcin Junczys-Dowmunt

Jul 6, 2018, 5:35:32 PM

to maria...@googlegroups.com

On 2018-07-06 14:32, Marcin Junczys-Dowmunt wrote:

I took a look at the files you made available. I am not sure why the output is basically empty, but here are a few observations:

1) There is no preprocessing at all, no tokenization etc. Marian expects preprocessed, at least tokenized, data. You should also familiarize yourself with BPE subwords: https://github.com/rsennrich/subword-nmt

2) It seems your corpora have Windows line endings. Linux uses different line endings. I am not sure how much harm that does, but it's probably not a good idea.

3) Because you have no preprocessing, your vocabularies are huge (about 200,000 items each). This may cause data sparsity and prevent learning.

I recommend trying one of the examples from https://github.com/marian-nmt/marian-examples. Maybe start with training-basics, which is a full example with proper preprocessing.
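Observations 2) and 3) can be checked quickly before retraining. The following is a rough Python sketch under stated assumptions: the helper names are hypothetical, and counting whitespace-separated token types is only a proxy for the vocabulary size, not the way Marian itself builds a vocabulary.

```python
from collections import Counter


def count_crlf_lines(path):
    """Count lines that end with a Windows-style CRLF terminator."""
    with open(path, "rb") as handle:
        return sum(1 for line in handle if line.endswith(b"\r\n"))


def normalize_line_endings(src_path, dst_path):
    """Copy src_path to dst_path, converting CRLF line endings to LF."""
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        for line in src:
            if line.endswith(b"\r\n"):
                line = line[:-2] + b"\n"
            dst.write(line)


def count_token_types(path):
    """Count distinct whitespace-separated tokens (a rough vocabulary-size proxy)."""
    types = Counter()
    with open(path, encoding="utf-8", errors="replace") as handle:
        for line in handle:
            types.update(line.split())
    return len(types)
```

A non-zero result from `count_crlf_lines` means the file still carries Windows line endings. A very large result from `count_token_types` on untokenized text is consistent with the sparsity concern in 3), since punctuation glued to words inflates the number of distinct types.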

On 2018-07-06 14:16, Peyman Passban wrote:

here is the result from another model:

root@a7874f8a1673:/Marian/tep++# /home/ml/Marian/marian/build/marian-decoder -m en-tr.npz -v vocab.en vocab.fa --allow-unk --n-best <<<"hi how are you"
[2018-07-06 21:11:49] [config] allow-unk: true
[2018-07-06 21:11:49] [config] beam-size: 12
[2018-07-06 21:11:49] [config] best-deep: false
[2018-07-06 21:11:49] [config] clip-gemm: 0
[2018-07-06 21:11:49] [config] cpu-threads: 0
[2018-07-06 21:11:49] [config] dec-cell: gru
[2018-07-06 21:11:49] [config] dec-cell-base-depth: 2
[2018-07-06 21:11:49] [config] dec-cell-high-depth: 1
[2018-07-06 21:11:49] [config] dec-depth: 1
[2018-07-06 21:11:49] [config] devices:
[2018-07-06 21:11:49] [config] - 0
[2018-07-06 21:11:49] [config] dim-emb: 512
[2018-07-06 21:11:49] [config] dim-rnn: 1024
[2018-07-06 21:11:49] [config] dim-vocabs:
[2018-07-06 21:11:49] [config] - 60588
[2018-07-06 21:11:49] [config] - 91728
[2018-07-06 21:11:49] [config] enc-cell: gru
[2018-07-06 21:11:49] [config] enc-cell-depth: 1
[2018-07-06 21:11:49] [config] enc-depth: 1
[2018-07-06 21:11:49] [config] enc-type: bidirectional
[2018-07-06 21:11:49] [config] ignore-model-config: false
[2018-07-06 21:11:49] [config] input:
[2018-07-06 21:11:49] [config] - stdin
[2018-07-06 21:11:49] [config] interpolate-env-vars: false
[2018-07-06 21:11:49] [config] layer-normalization: false
[2018-07-06 21:11:49] [config] log-level: info
[2018-07-06 21:11:49] [config] max-length: 1000
[2018-07-06 21:11:49] [config] max-length-crop: false
[2018-07-06 21:11:49] [config] max-length-factor: 3
[2018-07-06 21:11:49] [config] maxi-batch: 1
[2018-07-06 21:11:49] [config] maxi-batch-sort: none
[2018-07-06 21:11:49] [config] mini-batch: 1
[2018-07-06 21:11:49] [config] mini-batch-words: 0
[2018-07-06 21:11:49] [config] models:
[2018-07-06 21:11:49] [config] - en-tr.iter10000.npz
[2018-07-06 21:11:49] [config] n-best: true
[2018-07-06 21:11:49] [config] normalize: 0
[2018-07-06 21:11:49] [config] optimize: false
[2018-07-06 21:11:49] [config] port: 8080
[2018-07-06 21:11:49] [config] quiet: false
[2018-07-06 21:11:49] [config] quiet-translation: false
[2018-07-06 21:11:49] [config] relative-paths: false
[2018-07-06 21:11:49] [config] right-left: false
[2018-07-06 21:11:49] [config] seed: 0
[2018-07-06 21:11:49] [config] skip: false
[2018-07-06 21:11:49] [config] skip-cost: false
[2018-07-06 21:11:49] [config] tied-embeddings: false
[2018-07-06 21:11:49] [config] tied-embeddings-all: false
[2018-07-06 21:11:49] [config] tied-embeddings-src: false
[2018-07-06 21:11:49] [config] transformer-aan-activation: swish
[2018-07-06 21:11:49] [config] transformer-aan-depth: 2
[2018-07-06 21:11:49] [config] transformer-aan-nogate: false
[2018-07-06 21:11:49] [config] transformer-decoder-autoreg: self-attention
[2018-07-06 21:11:49] [config] transformer-dim-aan: 2048
[2018-07-06 21:11:49] [config] transformer-dim-ffn: 2048
[2018-07-06 21:11:49] [config] transformer-ffn-activation: swish
[2018-07-06 21:11:49] [config] transformer-ffn-depth: 2
[2018-07-06 21:11:49] [config] transformer-heads: 8
[2018-07-06 21:11:49] [config] transformer-no-projection: false
[2018-07-06 21:11:49] [config] transformer-postprocess: dan
[2018-07-06 21:11:49] [config] transformer-postprocess-emb: d
[2018-07-06 21:11:49] [config] transformer-preprocess: ""
[2018-07-06 21:11:49] [config] type: amun
[2018-07-06 21:11:49] [config] version: v1.5.0+1582f99
[2018-07-06 21:11:49] [config] vocabs:
[2018-07-06 21:11:49] [config] - vocab.en
[2018-07-06 21:11:49] [config] - vocab.fa
[2018-07-06 21:11:49] [config] word-penalty: 0
[2018-07-06 21:11:49] [config] workspace: 512
[2018-07-06 21:11:49] [config] Model created with Marian v1.5.0+1582f99
[2018-07-06 21:11:49] [data] Loading vocabulary from text file vocab.en
[2018-07-06 21:11:49] [data] Setting vocabulary size for input 0 to 60588
[2018-07-06 21:11:49] [data] Loading vocabulary from text file vocab.fa
[2018-07-06 21:11:52] [memory] Extending reserved space to 512 MB (device gpu0)
[2018-07-06 21:11:52] Loading scorer of type amun as feature F0
[2018-07-06 21:11:52] Loading model from en-tr.iter10000.npz
[2018-07-06 21:11:52] [memory] Reserving 606 MB, device gpu0
[2018-07-06 21:11:53] Best translation 0 : <unk> <unk> <unk> <unk>
0 ||| <unk> <unk> <unk> <unk> ||| F0= -0.813782 ||| -0.813782
0 ||| <unk> <unk> <unk> <unk> <unk> ||| F0= -1.54337 ||| -1.54337
0 ||| <unk> <unk> <unk> ||| F0= -1.67674 ||| -1.67674
0 ||| <unk> <unk> ||| F0= -2.53435 ||| -2.53435
0 ||| <unk> <unk> <unk> <unk> <unk> <unk> ||| F0= -2.98951 ||| -2.98951
0 ||| <unk> <unk> <unk> <unk> <unk> <unk> <unk> ||| F0= -4.42481 ||| -4.42481
0 ||| <unk> ||| F0= -4.81378 ||| -4.81378
0 ||| ||| F0= -5.97186 ||| -5.97186
0 ||| <unk> <unk> <unk> <unk> <unk> jybHay: 21939 ||| F0= -22.214 ||| -22.214
0 ||| <unk> <unk> <unk> <unk> <unk> vlyCk: 88410 ||| F0= -22.218 ||| -22.218
0 ||| <unk> <unk> <unk> <unk> <unk> ahmganH: 1006 ||| F0= -22.2183 ||| -22.2183
0 ||| <unk> <unk> <unk> <unk> <unk> anSatvn: 54540 ||| F0= -22.2186 ||| -22.2186
[2018-07-06 21:11:53] Total time: 0.067736s wall, 0.030000s user + 0.030000s system = 0.060000s CPU (88.6%)

Cheers

-P

On Fri, Jul 6, 2018 at 3:52 PM, Peyman Passban <pe....@gmail.com> wrote:

sorry for spamming, I forgot to attach this:

/home/ml/Marian/marian/build/marian --after-batches 50000 \
--train-sets ./tep++/train.en ./tep++/train.fa \
--model ./tep++/en-tr.npz --vocabs ./tep++/vocab.en ./tep++/vocab.fa \
>log-training-en-fa.txt

/home/ml/Marian/marian/build/marian-decoder -m ./en-tr.npz -v vocab.en vocab.fa <en.txt >out.txt

Cheers

-P

On Fri, Jul 6, 2018 at 3:50 PM, Peyman Passban <pe....@gmail.com> wrote:

Hi Marcin,

I've trained a new model for translating from English to Farsi. I trained it for 4 hours, and the dataset size is 500K.

I just tried to translate the first 10 lines of the training set, but it generated empty lines again. For some unknown reason it also puts "xxxx:8253" on the last line of the translation file, where "xxxx" is a Farsi word in the vocab.fa file and 8253 is its frequency. Do you have any idea why?

Cheers

-P

On Fri, Jul 6, 2018 at 11:30 AM, Peyman Passban <pe....@gmail.com> wrote:

Hey Marcin,

Thanks a million for your attention.

Here is the link to the files: https://drive.google.com/open?id=1wSDZ0_TMyJHe5gcBd9BDGnXhcwFYEaUY

Using this dataset, I tried to train a simple chatbot.

Both the source and target languages are English, which might be a problem; I'm not sure how Marian handles this. Line 9 in the vocab file also looks a bit strange, which could be another reason.

Now I'm training a new model for translating from En to Farsi (Persian).

I'll let you know the result.

Cheers

-P

Marcin Junczys-Dowmunt

Jul 6, 2018, 5:43:22 PM

to maria...@googlegroups.com

Can you try one of the prepared examples from marian-examples, for instance training-basics?

Otherwise if you make the training data available I can give it a try. It should just work, so I am not sure why that would happen.

bici...@gmail.com

Sep 18, 2018, 1:32:23 PM

to marian-nmt

I also receive bogus translation results from marian. Is this issue resolved?

Peyman Passban

Sep 18, 2018, 1:56:32 PM

to maria...@googlegroups.com

I tried many things, but in the end I had to compile Marian again, and then it worked. If you need them, I can share my Docker settings and files with you.

Cheers

-P

Ergun Bicici

Sep 18, 2018, 2:04:41 PM

to maria...@googlegroups.com

Thank you, Peyman. Do you mean marian-dev or marian? I got the same result with both. I managed to build marian with g++ 4.9 and Boost 1.58; marian-dev was compiling with g++ 7.

I'll recompile and rerun, and then post the result.

Best Regards,
Ergun

Ergun Biçici

http://bicici.github.com/

Ergun Bicici

Sep 18, 2018, 4:13:17 PM

to maria...@googlegroups.com

Hi Peyman,

I compiled both marian and marian-dev again and obtained the same results. Here is the command I use:

$marian --type s2s --model model.npz --train-sets train.truecased.bpe.vt7.tr train.truecased.bpe.vt7.en --valid-sets dev.tok.truecased.bpe.500.vt7.tr dev.tok.truecased.bpe.500.vt7.en --vocabs traindev.truecased.bpe.vt7.entr.nmtvocab.yml traindev.truecased.bpe.vt7.entr.nmtvocab.yml --valid-metrics ce-mean-words perplexity translation --cost-type=ce-mean-words --log model.npz.trainlog --valid-log model.npz.validlog --exponential-smoothing --normalize=1 --quiet-translation --disp-freq 500 --mini-batch-fit --maxi-batch 250 -w 512 --learn-rate 0.0003 --lr-report --optimizer-params 0.9 0.98 1e-08 --clip-norm 5 --early-stopping 20

ce-mean-words decreases from 7.47 to 5.12, but every translation is the same sentence.

Here is the en-tr data I use:

https://drive.google.com/open?id=15OXQpdt5PaGmeZzJXXDbxIrza3hCQ1oP

Can you share the g++ compiler and boost versions you used? Thank you.

Best Regards,
Ergun

Ergun Biçici

http://bicici.github.com/

Peyman Passban

Sep 18, 2018, 4:27:16 PM

to maria...@googlegroups.com

Everything looks OK in your settings.

I had the same problem. I tried almost everything, but nothing worked until I recompiled.

By the way, if the loss/cost value is still changing during training, you might need to wait a bit longer; you'll probably get proper translations after a certain number of iterations.

I'm using Boost 1.58.0.1, and the g++ version is 5.4.0.

Good luck!

Cheers

-P

Roman Grundkiewicz

Sep 18, 2018, 4:44:10 PM

to marian-nmt

Hi,

Such low performance or repeated translation outputs usually indicate wrong hyperparameters and are not related to compilation. In your command, for instance, the `--workspace` value seems quite small, so you need to be patient when training even an RNN model, and it is probably too low for training a transformer model.

For compilation issues: make sure that the g++ version you use is supported by your CUDA version. I think GCC 6/7 is not compatible with CUDA 9 or earlier.

Best,
Roman

Ergun Bicici

Sep 18, 2018, 4:59:55 PM

to maria...@googlegroups.com

Thanks. g++ 4.9 worked for me; g++ 5, 6, and 7 gave errors.

Maybe I should flip a coin before my next trial ;)

Best Regards,
Ergun

Ergun Biçici

http://bicici.github.com/

Ergun Bicici

Sep 21, 2018, 2:44:47 AM

to maria...@googlegroups.com

Dear Roman and Dear Peyman,

Thank you for your suggestions. Nematus was running on a 4 GB GPU. Are there any memory-related debugging messages? I would like to get the model to run on a 4 GB GPU, but I can also use a 6 GB GPU.

Without the -w option, workspace is set to 512.

Following your suggestion, I tried -w 2950 / 2000 / 1000 / 700:

[config] workspace: 2950
[memory] Reserving 513 MB, device gpu0
[memory] Extending reserved space to 3072 MB (device gpu0)
[memory] Reserving 513 MB, device gpu0
[memory] Reserving 1026 MB, device gpu0

Even then, some GPU space is reserved, and I received the following error in all cases:

No batches to fetch, run prepare()
Aborted from marian::data::BatchGenerator<DataSet>::BatchPtr marian::data::BatchGenerator<DataSet>::next() [with DataSet = marian::data::CorpusBase; marian::data::BatchGenerator<DataSet>::BatchPtr = std::shared_ptr<marian::data::CorpusBatch>] in marian-dev/src/data/batch_generator.h: 195
Aborted

There might be such memory/workspace related issues that are silently breaking the system without even such error messages.

CUDA 8 might be compatible with gcc 5. marian-dev compiles with gcc 7. I recompiled the code and reran the experiments with marian / marian-dev and obtained the same 0 translation results.

I changed type to nematus:

$marian --type nematus --model model.npz --train-sets train.truecased.bpe.tr train.truecased.bpe.en --valid-sets dev.tok.truecased.bpe.500.tr dev.tok.truecased.bpe.500.en --vocabs traindev.truecased.bpe.en.nmtvocab.yml traindev.truecased.bpe.tr.nmtvocab.yml --dim-vocabs 45993 73268 --valid-metrics ce-mean-words perplexity translation --cost-type=ce-mean-words --log model.npz.trainlog --valid-log model.npz.validlog --exponential-smoothing --normalize=1 --quiet-translation --disp-freq 500 -w 700 --mini-batch-fit --maxi-batch 250 --learn-rate 0.0003 --lr-report --optimizer-params 0.9 0.98 1e-08 --clip-norm 5 --early-stopping 20 --enc-depth 1 --enc-cell-depth 4 --enc-type bidirectional --dec-depth 1 --dec-cell-base-depth 8 --dec-cell-high-depth 1 --dec-cell gru-nematus --enc-cell gru-nematus --tied-embeddings --layer-normalization

Interestingly, ce-mean-words started decreasing from 4.03 this time, but every translation is still the same sentence.

Then I checked the following tutorial:

https://marian-nmt.github.io/examples/mtm2018-labs

I ran:

$marian --type s2s --model model.npz --train-sets train.truecased.bpe.tr train.truecased.bpe.en --valid-sets dev.tok.truecased.bpe.tr dev.tok.truecased.bpe.en --vocabs traindev.truecased.bpe.en.nmtvocab.yml traindev.truecased.bpe.tr.nmtvocab.yml --valid-metrics cross-entropy perplexity bleu --cost-type=cross-entropy --log model.npz.trainlog --valid-log model.npz.validlog --exponential-smoothing --normalize=1 --quiet-translation --disp-freq 500 -w 1000 --mini-batch-fit --maxi-batch 250 --lr-report --early-stopping 20 --layer-normalization --dropout-rnn 0.2 --dropout-src 0.1 --dropout-trg 0.1 --beam-size 12 &

I flipped a coin, and then I started obtaining translation scores :) The bleu validation metric is not described in the documentation. The BLEU scores still vary, and because of BPE the output should be post-processed first.

Marian reported 0.3611 BLEU at iteration 30,000. BLEU then decreased to 0 at iteration 210,000 and at points in between. What does the bleu validation metric do?

If I run the same command with --valid-metrics translation, I get a translation score of 0 again.
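The BPE post-processing mentioned above could look roughly like this. It is a minimal sketch that assumes the data was segmented in the subword-nmt style, where a trailing "@@" marks a subword that continues into the next token; whether the files in this thread use that separator is an assumption, and `remove_bpe` is a hypothetical helper name.

```python
import re

# Matches the "@@ " joiner between subwords, or a dangling "@@" at the end of a line.
_BPE_JOINER = re.compile(r"(@@ )|(@@ ?$)")


def remove_bpe(line):
    """Merge subword units back into full words by deleting the "@@" joiners."""
    return _BPE_JOINER.sub("", line)
```

For example, `remove_bpe("this is a te@@ st .")` gives `"this is a test ."`. Scoring BLEU on the merged output rather than on the raw subword output keeps the score comparable with a reference that has not been segmented.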

Best Regards,
Ergun

Ergun Biçici

http://bicici.github.com/
