Guest: Sasha Rush
Title: Beyond Softmax: Scaling Probabilistic Structure in NLP
Abstract: Progress on large autoregressive models for NLP applications has been transformative, but it has left many practical questions about how to use these approaches in a controllable and efficient manner. This talk explores the challenge of using probabilistic models to impose explicit modeling structure. I show that discrete structured models can now be implemented efficiently on modern hardware with optimizing compilers. These approaches generalize the standard softmax function we all know and love, and in fact are not much harder to use in practice. To show the benefit of this approach, I will describe a factorization of the Transformer into a structured model that lets us learn a fast and accurate parallel translation decoder. The system shows how to take advantage of efficient inference based on basic distributional properties while maintaining the modeling benefits of a deep model.
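To make the softmax connection concrete, here is a minimal sketch (not code from the talk) of the forward algorithm for a linear-chain structured model in PyTorch; the function name and tensor shapes are illustrative assumptions. The log-normalizer of a softmax is the special case of a one-step chain, and running the same logsumexp recursion over longer chains computes the partition function of the full structured distribution.

```python
import torch

def log_partition(emissions, transitions):
    """Log-partition function of a linear-chain model via the forward
    algorithm in the log semiring.

    emissions:   (T, C) per-step scores for each of C labels
    transitions: (C, C) score for moving from label i to label j
    """
    alpha = emissions[0]  # (C,) scores after the first step
    for t in range(1, emissions.size(0)):
        # alpha[i] + transitions[i, j] scores every (prev, next) pair;
        # logsumexp over the previous label marginalizes it out.
        alpha = torch.logsumexp(alpha.unsqueeze(1) + transitions, dim=0) + emissions[t]
    return torch.logsumexp(alpha, dim=0)  # sum over the final label

# With a chain of length T = 1, this reduces exactly to the
# log-normalizer of an ordinary softmax over the scores.
scores = torch.randn(1, 5)
trans = torch.zeros(5, 5)
assert torch.allclose(log_partition(scores, trans),
                      torch.logsumexp(scores[0], dim=0))
```

Swapping the logsumexp for a max turns the same loop into Viterbi decoding, which is one sense in which these generalized softmax operations remain simple to use in practice.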
Bio: Alexander 'Sasha' Rush is an Associate Professor at Cornell Tech in NYC. His group's research is at the intersection of natural language processing, deep learning, and structured prediction, with applications in text generation and efficient inference. He contributes to several open-source projects in NLP and works part-time on HuggingFace Transformers. He recently served as General Chair of ICLR and developed the MiniConf tool used to run virtual ML/NLP conferences. His work has received paper and demo awards at major NLP, visualization, and hardware conferences, as well as an NSF CAREER Award and a Sloan Fellowship.