Hi everyone,
We're super excited to host Tim Dettmers for this week's MLSys Seminar (October 23rd) at 10:30 am PT.
The talk details are as follows:
Title:
Democratizing Foundation Models via k-bit Quantization
Abstract:
Foundation models are effective tools for many tasks but are challenging to finetune and run inference on due to their GPU memory requirements. Compressing foundation models with k-bit quantization makes them more accessible with minimal resources, but quantization can degrade model quality. In this lecture, I will
talk about fundamental insights into how to compress foundation models with quantization while maintaining their predictive performance. We will learn about emergent outliers in large language models (LLMs) and how they affect performance during 8-bit quantization.
We will learn how to do effective k-bit compression of pretrained large language models such that we maximize their density of predictive performance per bit. We will also talk about how to do efficient fine-tuning of quantized 4-bit LLMs (QLoRA) and how this
helps to build state-of-the-art chatbots.
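
If you'd like to try this out before the talk, here is a minimal sketch of loading an LLM in 4-bit NF4 precision through the bitsandbytes integration in Hugging Face transformers (the setup behind QLoRA-style finetuning). The checkpoint name and settings below are my own illustrative choices, not from the talk:

```python
# Minimal sketch: load a causal LM with 4-bit NF4 quantization via bitsandbytes.
# The checkpoint name and hyperparameters are illustrative assumptions.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_name = "huggyllama/llama-7b"  # illustrative checkpoint

# 4-bit NF4 quantization with bf16 compute and double quantization,
# as in the QLoRA recipe.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
    bnb_4bit_use_double_quant=True,
)

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    quantization_config=bnb_config,
    device_map="auto",  # place layers across available GPUs/CPU
)

# Quick generation check: the quantized model runs in a fraction of the
# full-precision memory footprint.
inputs = tokenizer("Quantization makes foundation models", return_tensors="pt").to(model.device)
print(tokenizer.decode(model.generate(**inputs, max_new_tokens=20)[0]))
```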
Bio:
Tim Dettmers is a graduating PhD student advised by Luke Zettlemoyer at the University of Washington in Seattle. He holds degrees in applied
math and computer science and has a background in industrial automation. His primary research goal is to democratize foundation models by making them more efficient and accessible through quantization, sparsification, and machine learning systems that run on consumer-grade hardware. He is the creator of the bitsandbytes library. Tim runs a blog about deep learning, GPUs, and PhD life at
https://timdettmers.com/.
See everyone there!!
Best,
Simran