different translations w/ different batch parameters?


sjbr...@gmail.com

Jun 9, 2022, 3:21:54 PM
to marian-nmt
Is it expected that one might find different translations (marian-decoder results) of input files, given different mini-batch and maxi-batch parameters?

In my case, I find the translations of an input file will differ given exactly the same input, the same model, and all decoder parameters the same, EXCEPT for the mini- and maxi-batch parameters. In particular:

< maxi-batch: 100
< maxi-batch-sort: src
< mini-batch: 8
< mini-batch-words: 0

> maxi-batch: 1
> maxi-batch-sort: none
> mini-batch: 1
> mini-batch-words: 0

If this is expected behavior, can someone please explain why? If not, are there certain parameter settings that I should be using to prevent this? I want to use batching in order to increase throughput, but I want the translations to be consistent, no matter how many lines of text the client submits for translation.

Thanks,

Steve

Marcin Junczys-Dowmunt

Jun 9, 2022, 3:26:26 PM
to maria...@googlegroups.com

Two possible reasons:

  1. Most likely: the maximum output length is set as a multiplier of the input length, and it is determined by the longest sentence in the batch, so it also applies to the shorter sentences in that batch (see the example below).

You may want to set --max-length / --max-length-crop to larger values, if your model still knows how to handle them.

  2. Minor differences: floating-point calculations MAY sometimes differ for large matrices with many threads, etc. Some non-determinism can happen and minor fluctuations can occur. Matrix sizes are different for different batch sizes. I would think this is rather unlikely, though possible.
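To make point 1 concrete (the numbers are illustrative): with max-length-factor 3, a 5-token sentence that ends up in a batch whose longest source sentence has 40 tokens gets an output budget of 40 × 3 = 120 tokens instead of 5 × 3 = 15, so the decoder is free to produce a much longer hypothesis for it.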


Best,

Marcin


sjbr...@gmail.com

Jun 9, 2022, 4:31:21 PM
to marian-nmt
In all cases I have:

max-length: 200
max-length-crop: true
max-length-factor: 3

Is it possible that "max-length-factor" is overriding the other two settings? I was assuming that the first two would override the third.

Thanks,

Steve

Marcin Junczys-Dowmunt

Jun 9, 2022, 4:32:59 PM
to maria...@googlegroups.com

Not really. Are your differences in long sentences?

sjbr...@gmail.com

Jun 9, 2022, 7:24:48 PM
to marian-nmt
No. I'm doing another translation to test the stats, but I get differences even when the input token stream is much shorter than max-length.

sjbr...@gmail.com

Jun 10, 2022, 5:04:43 PM
to marian-nmt

OK, here is my understanding of how this works. There are a number of parameters involved: max-length-factor, max-length, max-length-crop, mini-batch, maxi-batch, and maxi-batch-sort.

max-length-factor governs the allowable length of the output sentence as a function of the length of the input sentence. If you are not using batching, this all works out the way I naively expected. However, when translating in batches, the nominal input sentence length is set to the length of the longest sentence in the batch. This means you can get different results even with the exact same model, the exact same data, and all parameters the same except for different batch sizes. Depending upon how the input sentences are grouped into batches, the nominal length of the batches can change, yielding different translation results.

By the same reasoning, if you shuffle the sentences and then use batching for translation, you again change the nominal length used for each individual sentence (i.e., the maximum length of the sentences in its new batch). Thus, we can get different results just by changing the order of the inputs. maxi-batch and maxi-batch-sort can ameliorate this effect by grouping sentences of similar lengths into the same batch.
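To make this concrete, here is a toy sketch of the effect as I understand it. This is not Marian's actual code; the rule "output cap = longest source length in the batch × max-length-factor" is just my reading of the behavior described above:

# Toy model of how batching changes the per-sentence output-length cap.
# Assumes the cap is (longest source length in the batch) * max-length-factor.
MAX_LENGTH_FACTOR = 3

def output_caps(batches):
    """Output-length cap applied to each sentence, given a batching."""
    caps = {}
    for batch in batches:
        nominal_len = max(len(sent.split()) for sent in batch)
        for sent in batch:
            caps[sent] = nominal_len * MAX_LENGTH_FACTOR
    return caps

sentences = ["a b", "a b c d e f g h"]  # 2 tokens and 8 tokens

# One sentence per batch (mini-batch: 1): each is capped by its own length.
print(output_caps([[s] for s in sentences]))
# -> {'a b': 6, 'a b c d e f g h': 24}

# Both in one batch (mini-batch: 2): the short sentence inherits the long cap.
print(output_caps([sentences]))
# -> {'a b': 24, 'a b c d e f g h': 24}

With one sentence per batch, the 2-token input can emit at most 6 tokens; batched together with the 8-token input, its budget jumps to 24, so the decoder can produce a much longer (and potentially different) hypothesis for it.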

Is this correct?

If so, has there been any thought given to methods that will keep track of the actual length of each sentence in a batch, so that translations will remain consistent across different batching parameters?

Thanks,

Steve

Marcin Junczys-Dowmunt

Jun 10, 2022, 5:25:51 PM
to maria...@googlegroups.com

Yes, that’s correct.

I honestly never had that issue with normal translation. If your sentences actually tend to result in outputs that are that much longer than the source, that usually indicates that your model is somehow broken (producing repeated outputs?) or that you are doing a somewhat atypical task.

Do you encounter situations where the translation of a sentence that’s not the longest one in a batch needs more than three times the tokens of the longest sentence in the batch? You could increase --max-length-factor to something larger than 3 until the length issue disappears.
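(Illustrative arithmetic: if the longest source sentence in a batch has 20 tokens, raising --max-length-factor from 3 to 4 lifts the output budget for every sentence in that batch from 60 to 80 tokens.)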

sjbr...@gmail.com

Jun 10, 2022, 5:54:46 PM
to marian-nmt
This only really became obvious when I started testing with a pretty immature model against different mini/maxi-batch parameters. I didn't wait for the model to converge, since I only needed it to test the infrastructure in which a "real" model was intended to live. 

I am seeing a bunch of repeated outputs, which only get longer as the --max-length and --max-length-factor parameters grow. I'm sure this is mostly due to the immaturity of the model.

I don't see the kinds of problems that you asked about.

I'm guessing that if I start using a decent model, the differences in translation will largely drop away. 

This "problem" doesn't concern me so much now that I understand better how the above parameters work together. 

Thanks for your help!

Steve

Marcin Junczys-Dowmunt

Jun 10, 2022, 5:57:42 PM
to maria...@googlegroups.com

Ah, that makes sense.
