Convolutional Sequence to Sequence Learning

ARGHA DHAR

Jan 26, 2021, 3:36:29 AM

to fairseq Users

Hello Everyone,

I am trying to work through this paper: https://arxiv.org/pdf/1705.03122.pdf. Honestly, it is very difficult for someone like me to grasp, and I have to explain it to my professor. I won't ask anyone to explain it for me, but I would be very grateful if anyone could suggest resources that would help me understand the paper, as I didn't find any suitable explanation.

Part 2:

I am trying to implement Bangla-to-English machine translation with a convolutional network. I used fairseq, with the commands given for the convolutional model in examples/translation/README.md. It gave me a BLEU score of 15.98, though I didn't understand the model's architecture. My training set has ~65k sentences. Could I get some idea of how to change the parameter values to get a better score?

I hope I am clear enough. :D

Best

Argha Dhar

Sunil Kumar

Jan 26, 2021, 8:54:33 PM

to fairseq Users

Hi Argha,

This post does not cover the entire paper, but it covers most of it and should help your understanding: https://www.telesens.co/2019/04/21/understanding-incremental-decoding-in-fairseq/

You can check the arguments listed at https://fairseq.readthedocs.io/en/latest/models.html#module-fairseq.models.fconv and pass them during training.

Read https://fairseq.readthedocs.io/en/latest/models.html#adding-new-models to see how new models are added in fairseq, and then check https://github.com/pytorch/fairseq/blob/master/fairseq/models/fconv.py to see how the named architectures you are using are registered. You can then add your own architecture with your own parameters at the bottom of that file and use it for training. Make sure fairseq is installed in editable mode, so that the changes you make inside fairseq are visible when you use it afterwards.
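The pattern used at the bottom of fconv.py can be sketched roughly as follows. This is a minimal illustration that runs without fairseq installed: `ARCH_REGISTRY`, `register_arch`, `base_defaults`, the architecture name `my_small_fconv`, and all default values are hypothetical stand-ins, not fairseq code. In fairseq itself the decorator is `register_model_architecture` from `fairseq.models`, and the named fconv architectures finish by calling `base_architecture(args)`.

```python
from argparse import Namespace

# Hypothetical stand-in for fairseq's architecture registry (not fairseq code).
ARCH_REGISTRY = {}


def register_arch(name):
    """Register a function that fills in architecture defaults under `name`."""
    def decorator(fn):
        ARCH_REGISTRY[name] = fn
        return fn
    return decorator


def base_defaults(args):
    # Stand-in for the shared base defaults; the values are illustrative only.
    args.dropout = getattr(args, "dropout", 0.1)
    args.encoder_embed_dim = getattr(args, "encoder_embed_dim", 512)
    args.encoder_layers = getattr(args, "encoder_layers", "[(512, 3)] * 20")


@register_arch("my_small_fconv")  # hypothetical architecture name
def my_small_fconv(args):
    # getattr keeps any value already present on args (for example one given
    # on the command line) and only falls back to the default otherwise.
    args.dropout = getattr(args, "dropout", 0.3)
    args.encoder_embed_dim = getattr(args, "encoder_embed_dim", 256)
    args.encoder_layers = getattr(args, "encoder_layers", "[(256, 3)] * 4")
    base_defaults(args)  # fills in whatever is still unset


# Simulate a run where only the dropout value was given explicitly.
args = Namespace(dropout=0.2)
ARCH_REGISTRY["my_small_fconv"](args)
print(args.dropout, args.encoder_embed_dim, args.encoder_layers)
```

The point of the `getattr` calls is precedence: an explicitly supplied value wins, then the named architecture's default, then the base default. That is why a small dataset can be tried with a smaller custom architecture without touching the existing ones.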

Hope this is helpful.

As usual, keep posting here if you face any issues.

Thanks.

Dhar

Jan 26, 2021, 9:23:58 PM

to Sunil Kumar, fairseq Users

Hello Sunil,

That’s a huge help. I can’t thank you enough. I will definitely go through the links. God bless you!

ARGHA DHAR

Feb 1, 2021, 11:33:27 PM

to fairseq Users

Hello,

I was away for a few days, which is why this reply is late. As far as I understand, if I set '--arch fconv_wmt_en_fr', my convolutional model architecture will be something like this:

convs = "[(512, 3)] * 6" # first 6 layers have 512 units
convs += " + [(768, 3)] * 4" # next 4 layers have 768 units
convs += " + [(1024, 3)] * 3" # next 3 layers have 1024 units
convs += " + [(2048, 1)] * 1" # next 1 layer uses 1x1 convolutions
convs += " + [(4096, 1)] * 1" # final 1 layer uses 1x1 convolutions

args.encoder_embed_dim = getattr(args, "encoder_embed_dim", 768)
args.encoder_layers = getattr(args, "encoder_layers", convs)
args.decoder_embed_dim = getattr(args, "decoder_embed_dim", 768)
args.decoder_layers = getattr(args, "decoder_layers", convs)
args.decoder_out_embed_dim = getattr(args, "decoder_out_embed_dim", 512)
base_architecture(args)

Or, if I set a different architecture (say 'fconv'), the architecture will be whatever is defined for 'fconv'.
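To see what the layer specification above expands to, the string can be evaluated on its own. This is a small sketch, assuming (as fconv.py does when it builds the model) that the string is turned into a Python list of (channels, kernel_width) tuples with `eval`; the variable names `layers` and `per_width` are only for illustration.

```python
from collections import Counter

# Same layer specification string as in the snippet above.
convs = "[(512, 3)] * 6"
convs += " + [(768, 3)] * 4"
convs += " + [(1024, 3)] * 3"
convs += " + [(2048, 1)] * 1"
convs += " + [(4096, 1)] * 1"

# eval is used on a fixed, trusted string only; the result is a list of
# (channels, kernel_width) tuples, one entry per convolutional layer.
layers = eval(convs)

# Count how many layers use each channel width.
per_width = Counter(channels for channels, _ in layers)

print(len(layers))       # total number of convolutional layers
print(layers[0])         # first layer
print(layers[-1])        # last layer
print(dict(per_width))   # layers per channel width
```

The encoder and the decoder both default to this same specification, so each stack has 6 + 4 + 3 + 1 + 1 = 15 convolutional layers unless a different value is passed for the encoder or decoder layers.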

Am I right?

Thanks,

Argha Dhar

Faiza Nuzhat

Dec 6, 2022, 5:20:52 PM

to fairseq Users

Hello All,

I am using fairseq models for the first time. I am trying to create an FConvModel (the convolutional model) from a checkpoint. I have loaded the state_dict from the checkpoint, but I cannot create the convolutional model to load the checkpoint parameters into.

Could anyone give me some guidance on this?

Thanks,

Faiza
