Two possible reasons.
You may want to set --max-length / --max-length-crop to larger values if your model still knows how to handle them.
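For what it's worth, here is a minimal Python sketch of what those two flags do, based on their documented behaviour (illustration only, not Marian's actual code; the function name is made up):

```python
# Illustration only (not Marian's actual code): how --max-length and
# --max-length-crop interact, based on the flag descriptions.

def apply_max_length(tokens, max_length, crop=False):
    """Return the (possibly cropped) sentence, or None if it is skipped."""
    if len(tokens) <= max_length:
        return tokens
    # --max-length-crop: truncate to max_length instead of omitting
    # the sentence entirely.
    return tokens[:max_length] if crop else None
```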
Best,
Marcin
Not really. Are your differences in long sentences?
Ok. Here is my understanding of how this works. There are a number of parameters involved: max-length-factor, max-length, max-length-crop, mini-batch, maxi-batch, and maxi-batch-sort. max-length-factor governs the allowable length of the output sentence as a function of the length of the input sentence. If you are not using batching, this all works out the way I naively expected. However, when translating in batches, the nominal input sentence length is set to the length of the longest sentence in the batch. This means you can get different results even with the exact same model, the exact same data, and all parameters identical except for the batch size: depending on how the input sentences are grouped into batches, the nominal length of each batch can change, yielding different translation results.
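If that understanding is right, the effect could be sketched in Python roughly like this (purely illustrative, not Marian's actual implementation; decode here is a stand-in for beam search):

```python
# Purely illustrative sketch of the per-batch output length cap
# described above; not Marian's actual implementation.

def decode(src_tokens, max_output_len):
    """Stand-in for beam-search decoding of a single sentence,
    capped at max_output_len output tokens."""
    ...

def decode_batch(batch, max_length_factor=3.0):
    """batch: a list of tokenized source sentences."""
    # The nominal source length is the length of the LONGEST sentence
    # in the batch, not each sentence's own length.
    nominal_src_len = max(len(src) for src in batch)
    max_output_len = int(nominal_src_len * max_length_factor)
    # A short sentence batched with a long one therefore gets a larger
    # output cap than it would get alone, which is how batch composition
    # can change the translation of an individual sentence.
    return [decode(src, max_output_len) for src in batch]
```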
By the same reasoning, if you shuffle the sentences and then use batching for translation, you can again change the nominal length used for each individual sentence (i.e., the max length of the sentences in its new batch). Thus, we can get different results just by changing the order of the inputs. maxi-batch and maxi-batch-sort can ameliorate this effect by trying to group sentences of similar lengths into the same batch.
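And the mitigation could be sketched like this, assuming maxi-batch / maxi-batch-sort behave as described above (again illustrative; the defaults here are made up):

```python
# Illustrative sketch of length-sorted maxi-batching; the numeric
# defaults are made up, not Marian's.

def mini_batches(sentences, maxi_batch=100, mini_batch=8):
    """Yield mini-batches of similar-length sentences."""
    for i in range(0, len(sentences), maxi_batch):
        # Within a maxi-batch, sorting by length keeps each mini-batch's
        # maximum close to every member's own length, so the nominal
        # length (and thus the output cap) varies less with input order.
        chunk = sorted(sentences[i:i + maxi_batch], key=len)
        for j in range(0, len(chunk), mini_batch):
            yield chunk[j:j + mini_batch]
```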
Is this correct?
If so, has there been any thought given to methods that will keep track of the actual length of each sentence in a batch, so that translations will remain consistent across different batching parameters?
Thanks,
Steve
Yes, that’s correct.
I honestly never had that issue with normal translation. If your sentences actually tend to produce outputs that much longer than the source, that usually indicates that your model is somehow broken (producing repeated outputs?) or that you are doing a somewhat atypical task?
Do you encounter situations where the translation of a sentence that is not the longest one in a batch needs more than three times the tokens of the longest sentence in the batch? You can increase --max-length-factor to something larger than 3 until the length issue disappears.
Ah, that makes sense.