Stanford MLSys Seminar Episode 45: Baharan Mirzasoleiman [Th, 1.35-2.30pm PT]

Karan Goel

Nov 2, 2021, 11:00:35 AM
to stanford-ml...@googlegroups.com
Hi everyone,

We're back with the forty-fifth episode of the MLSys Seminar on Thursday from 1.35-2.30pm PT. 

We'll be joined by Baharan Mirzasoleiman, who will talk about robust machine learning from massive datasets. The format is a 30-minute talk followed by a 30-minute podcast-style discussion, where the live audience can ask questions.

Guest: Baharan Mirzasoleiman
Title: Data-efficient and Robust Learning from Massive Datasets
Abstract: Large datasets have been crucial to the success of modern machine learning models. However, training on massive data has two major limitations. First, it is contingent on exceptionally large and expensive computational resources, and incurs a substantial cost due to the significant energy consumption. Second, in many real-world applications such as medical diagnosis, self-driving cars, and fraud detection, big data contains highly imbalanced classes and noisy labels. In such cases, training on the entire data does not result in a high-quality model. In this talk, I will argue that we can address the above limitations by developing techniques that can identify and extract representative subsets for learning from massive datasets. Training on representative subsets not only reduces the substantial costs of learning from big data, but also improves the accuracy of the learned models and their robustness against noisy labels. I will discuss how we can develop theoretically rigorous techniques that provide strong guarantees for the quality of the extracted subsets, as well as the learned models’ quality and robustness against noisy labels. I will also show the effectiveness of such methods in practice for data-efficient and robust learning.
Bio: Baharan Mirzasoleiman is an Assistant Professor in the Computer Science Department at UCLA. Her research focuses on developing new methods that enable efficient machine learning from massive datasets. Her methods have immediate application to high-impact problems where massive data volumes prohibit efficient learning and inference, such as huge image collections, recommender systems, Web and social services, video and other large data streams. Before joining UCLA, she was a postdoctoral research fellow in Computer Science at Stanford University. She received her Ph.D. in Computer Science from ETH Zurich. She received an ETH medal for Outstanding Doctoral Thesis, and was selected as a Rising Star in EECS by MIT.

See you all there!

Best,
Karan

Karan Goel

Nov 4, 2021, 4:14:44 PM
to stanford-ml...@googlegroups.com
Reminder: this is in 20 minutes!