Stanford MLSys Seminar Episode 58: Shruti Bhosale [Th, 1.35-2.30pm PT]


Karan Goel

Mar 10, 2022, 2:45:03 AM
to stanford-ml...@googlegroups.com
Hi everyone,

We're back with the fifty-eighth episode of the MLSys Seminar on Thursday from 1.35-2.30pm PT. 

We'll be joined by Shruti Bhosale, who will talk about scaling up machine translation. The format is a 30-minute talk followed by a 30-minute podcast-style discussion, where the live audience can ask questions.

Guest: Shruti Bhosale
Title: Scaling Multilingual Machine Translation to Thousands of Language Directions
Abstract: Existing work in translation has demonstrated the potential of massively multilingual machine translation by training a single model able to translate between any pair of languages. However, much of this work is English-Centric by training only on data which was translated from or to English. While this is supported by large sources of training data, it does not reflect translation needs worldwide. In this talk, I will describe how we create a true Many-to-Many multilingual translation model that can translate directly between any pair of 100 languages. We build and open source a training dataset that covers thousands of language directions with supervised data, created through large-scale mining. Then, we explore how to effectively increase model capacity through a combination of dense scaling and language-specific sparse parameters to create high quality models. Our focus on non-English-Centric models brings gains of more than 10 BLEU when directly translating between non-English directions while performing competitively to the best single systems of WMT.
Bio: Shruti Bhosale is a Research Engineer at Facebook AI Research in Menlo Park, focusing on Natural Language Processing. She currently works on projects in massively multilingual machine translation and natural language understanding/generation. Her recent work includes many-to-many machine translation for 100 languages, BASE Layers, and efficient large-scale language models with Mixture of Experts. She graduated with a Master's Degree in Computer Science from the University of Texas at Austin. Prior to Facebook, Shruti built models for people recommendation systems at LinkedIn.

See you all there!

Best,
Karan

Karan Goel

Mar 10, 2022, 4:21:29 PM
to stanford-ml...@googlegroups.com
Reminder: we're starting in 15 minutes!