Hi All,
I tried to train a model with multitask data on an fr-to-en dataset, but the source and target task losses look very high, and the multitask model gives no improvement over the model without multitask:
S2UT reduced, no auxiliary task: 38.4
S2UT reduced, sc + tc: 37.8
I am using characters for both the source and target tasks. Below are samples of the multitask data for each task (source letter and target letter).
Source_letter:
train.tsv
id tgt_text
common_voice_fr_19510547 l e u r | p r o l i f é r a t i o n | c e l l u l a i r e | e s t | i m p o r t a n t e | e t | i l s | p e u v e n t | é v o l u e r | r a p i d e m e n t | v e r s | u n | g l i o b l a s t o m e
common_voice_fr_19510553 n ' y | a r r i v a n t | p a s | i l s | s ' e n | p r e n n e n t | a u | s u p p o r t | e t | c a u s e n t | d e | n o m b r e u s e s | d é g r a d a t i o n s
common_voice_fr_19510554 c e t t e | s a i s o n | s e r a | t o u t e | l e u r | v i e
dict.txt
| 1917002
e 1314627
t 927951
Target_letter:
train.tsv
id tgt_text
common_voice_fr_17299458 t h e | f l o o r | i s | t o | m r s | a n n i e | l e | h o u e r o u
common_voice_fr_17299459 t h a t | i s | c l e a r
dict.txt
| 1881417
e 1570568
s 831638
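For anyone trying to reproduce the format, here is a minimal sketch of how letter targets and dict.txt entries like the ones above could be generated. It assumes the text is lowercased and split on whitespace, with `|` as the word separator. The function names `to_letters` and `build_dict_lines` are hypothetical and this is not necessarily the exact preprocessing used for the files above.

```python
from collections import Counter


def to_letters(text: str) -> str:
    """Convert a sentence to space-separated characters with '|' between words.

    Assumption: lowercase the text and split on whitespace.
    """
    words = text.lower().split()
    return " | ".join(" ".join(word) for word in words)


def build_dict_lines(letter_lines):
    """Count every symbol (including '|') and return 'symbol count' lines,
    sorted by descending frequency, as in the dict.txt samples."""
    counts = Counter()
    for line in letter_lines:
        counts.update(line.split())
    return [f"{symbol} {count}" for symbol, count in counts.most_common()]


if __name__ == "__main__":
    sentences = ["That is clear", "cette saison sera toute leur vie"]
    letter_lines = [to_letters(s) for s in sentences]
    for line in letter_lines:
        print(line)
    for entry in build_dict_lines(letter_lines):
        print(entry)
```

Comparing the output of something like this against the actual train.tsv and dict.txt is a quick way to check that every symbol in the targets also appears in the dictionary.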
Could anyone please help me understand why I am not getting any improvement? Or am I doing something wrong?
Thanks in advance.
Regards,
Saranya V