I2R, A*STAR is holding a seminar where Prof. Mirella Lapata from the University of
Edinburgh will be sharing how deep generative models with latent
representations can be useful for tasks such as opinion summarization
and question paraphrasing. For more details, please see below.
Summarization and Paraphrasing in Quantized Transformer Spaces
Talk Time: Wednesday, June 30, 2021, 3:30-5:30 pm Singapore time
Deep generative models with latent variables have become a major focus of NLP research over the past several years. These models have been used both for generating text and as a way of learning latent representations of text for downstream tasks. While much previous work uses continuous latent variables, discrete variables are attractive because they are more interpretable and typically more space efficient. In this talk we consider learning discrete latent variable models with Quantized Variational Autoencoders, and show how these can be ported to two NLP tasks, namely opinion summarization and paraphrase generation for questions. For the first task, we provide a clustering interpretation of the quantized space and a novel extraction algorithm to discover popular opinions among hundreds of reviews, while for the second task we show that a principled information bottleneck leads to an encoding space that separately represents meaning and surface form, thereby allowing us to generate syntactically varied paraphrases.
Speaker Bio: Mirella Lapata is professor of natural language processing in the School of Informatics at the University of Edinburgh. Her research focuses on getting computers to understand, reason with, and generate natural language. She is the first recipient (2009) of the British Computer Society and Information Retrieval Specialist Group (BCS/IRSG) Karen Sparck Jones award, and a Fellow of the ACL and the Royal Society of Edinburgh. She has also received best paper awards at leading NLP conferences and has served on the editorial boards of the Journal of Artificial Intelligence Research, the Transactions of the ACL, and Computational Linguistics. She was president of SIGDAT (the group that organizes EMNLP) in 2018.
Nancy F. Chen
Lab Head and PI
Institute for Infocomm Research, A*STAR