[SING] 10 June 2021 16:00-17:00: Patrick Lewis [UCL] / PAQ: 65 Million Probably-Asked Questions and What You Can Do With Them


Pan, Liangming

Jun 10, 2021, 3:45:03 AM
to si...@wing.comp.nus.edu.sg
Dear Singapore NLP interest groups: 

Here is the information for the third talk of the WING-NUS NLP Seminar, a series of invited talks this summer given by current rising stars in NLP. The seminar website is at https://wing-nus.github.io/nlp-seminar/. Please join us at the Zoom address below if you're interested. The talk and slides may be made available on the website afterwards, but please do join us live to avoid disappointment.

The talk is open to all, so feel free to circulate this to others who you think might be interested.

WING-NUS NLP Seminar 2021 - Talk 3
   
Title: PAQ: 65 Million Probably-Asked Questions and What You Can Do With Them
Speaker: Patrick Lewis
PhD Student
University College London, Facebook AI Research

Date/Time: 10 June 2021, Thursday, 16:00 to 17:00
Venue: Join Zoom Meeting
http://bit.ly/knmnyn-zoom-nus
ZOOM Room ID: 770 447 8736, PIN: 3244
Chaired by: A/P Min-Yen Kan, School of Computing
(ka...@comp.nus.edu.sg)
   
ABSTRACT:
Open-Domain Question Answering is the task of answering natural language questions with short factual answers. These questions are not accompanied by evidence, and can be from an open set of domains. Models must understand questions, search for and assemble evidence necessary to answer the question, and then generate an answer. Models which directly leverage question-answer (QA) pairs, such as closed-book QA (CBQA) models and QA-pair retrievers, show promise in terms of speed and memory compared to conventional models which retrieve and read from text corpora. QA-pair retrievers also offer interpretable answers, a high degree of control, and are trivial to update at test time with new knowledge. However, these models lack the accuracy of retrieve-and-read systems, as substantially less knowledge is covered by the available training QA-pairs relative to text corpora like Wikipedia. To facilitate improved QA-pair models, we introduce Probably Asked Questions (PAQ), a very large resource of 65M automatically-generated QA-pairs. We introduce a new QA-pair retriever, RePAQ, to complement PAQ. We find that PAQ preempts and caches test questions, enabling RePAQ to match the accuracy of recent retrieve-and-read models, whilst being significantly faster. RePAQ can be configured for size (under 500MB) or speed (over 1K questions per second) whilst retaining high accuracy. Lastly, we demonstrate RePAQ’s strength at selective QA, abstaining from answering when it is likely to be incorrect. This enables RePAQ to “back-off” to a more expensive state-of-the-art model, leading to a combined system which is both more accurate and 2x faster than the state-of-the-art model alone.
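
For readers new to the idea, the selective QA "back-off" described in the abstract can be sketched as below. This is only a toy illustration, not RePAQ itself: the word-overlap (Jaccard) similarity, the 0.6 threshold, and the names `retrieve`, `answer_with_backoff` and `fallback` are all assumptions made for the example, whereas the actual system is described in the talk and paper.

```python
from typing import Callable, List, Optional, Tuple

QAPair = Tuple[str, str]  # (question, answer)


def tokenize(text: str) -> set:
    """Lowercase, drop question marks, split on whitespace."""
    return set(text.lower().replace("?", " ").split())


def jaccard(a: set, b: set) -> float:
    """Word-overlap similarity in [0, 1]; a stand-in for a learned retriever score."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def retrieve(question: str, qa_pairs: List[QAPair]) -> Tuple[Optional[str], float]:
    """Return the answer of the most similar stored question and its score."""
    query = tokenize(question)
    best_answer, best_score = None, 0.0
    for stored_question, stored_answer in qa_pairs:
        score = jaccard(query, tokenize(stored_question))
        if score > best_score:
            best_answer, best_score = stored_answer, score
    return best_answer, best_score


def answer_with_backoff(
    question: str,
    qa_pairs: List[QAPair],
    fallback: Callable[[str], str],
    threshold: float = 0.6,  # hypothetical confidence cut-off
) -> Tuple[str, str]:
    """Answer from the QA-pair store when confident, else defer to a slower model."""
    answer, score = retrieve(question, qa_pairs)
    if answer is not None and score >= threshold:
        return answer, "qa-pair retriever"
    return fallback(question), "fallback"


# Tiny hand-written store standing in for a large QA-pair resource.
pairs = [
    ("who wrote hamlet", "William Shakespeare"),
    ("what is the capital of france", "Paris"),
]
slow_model = lambda q: "(answer from slower model)"  # placeholder for an expensive system
```

With this store, a call such as `answer_with_backoff("Who wrote Hamlet?", pairs, slow_model)` is answered from the cached pairs, while a question with little overlap is passed to `slow_model`. The cheap path handles the questions it is confident about, so the expensive model runs only on the remainder.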

BIO-DATA:
Patrick Lewis is a final-year PhD student splitting his time between University College London and Facebook AI Research in London, supervised by Sebastian Riedel and Pontus Stenetorp. Patrick’s research interests center on Knowledge-intensive Natural Language Processing. His recent work has won Best Paper Awards at AKBC 2020 and EACL 2021, and he led a team which won 2 tracks of the EfficientQA competition at NeurIPS 2020. Patrick focuses on how to represent, store and access knowledge, and on building more powerful, efficient and robust models for knowledge-intensive NLP tasks such as Question Answering.


Please contact me if you have any questions. Contact info: 

Liangming Pan
Web Information Retrieval / Natural Language Processing Group (WING)
National University of Singapore

Thanks very much. Looking forward to your participation.

Here is a list of resources for our past two talks. Feel free to view the video and slides. 


Thanks, 
Liangming


Liangming Pan (潘亮铭)

Web Information Retrieval / Natural Language Processing Group (WING)
National University of Singapore
