Online Social Choice and Welfare Seminar: Daniel Halpern, Tuesday 9 September

Marcus Pivato

Sep 3, 2025, 12:42:09 PM
to social-choice-a...@googlegroups.com, com...@duke.edu
[with apologies for cross-posting]

Dear all,

The next presentation in the Online Social Choice and Welfare Seminar will take place next Tuesday (9 September). Here are the details.

Time: 9PM GMT (2PM San Francisco, 5PM Toronto/Montréal, 6PM Rio de Janeiro, 10PM London, 11PM Paris, 12AM Wednesday in Istanbul, 6AM Wednesday in Seoul, 9AM Wednesday in Auckland)

Speaker: Daniel Halpern (Google Research)

Title: "A Social Choice Perspective on AI Alignment"

Abstract: Consider the problem of aligning large language models (LLMs) with human values. The standard approach begins with pairwise comparisons from users of the form "between these two outputs to the prompt, which do you prefer?" These responses are aggregated into a reward function that assigns numerical scores to outputs and is subsequently used to steer an existing LLM toward higher-reward answers. This process is essential for making LLMs helpful while avoiding dangerous or biased responses.
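For concreteness, here is a minimal sketch of that aggregation step, assuming the standard Bradley-Terry preference model and a linear reward over feature vectors. The feature dimension, synthetic data, and hyperparameters are illustrative assumptions, not details from the papers.

    # Sketch: fit a reward function to pairwise comparisons under a
    # Bradley-Terry model, where P(a preferred to b) = sigmoid(r(a) - r(b))
    # and r is linear in (hypothetical) feature vectors.
    import numpy as np

    rng = np.random.default_rng(0)

    d = 4                           # illustrative feature dimension
    true_w = rng.normal(size=d)     # ground-truth reward weights
    pairs = []                      # list of (winner, loser) feature pairs
    for _ in range(500):
        a, b = rng.normal(size=d), rng.normal(size=d)
        p = 1 / (1 + np.exp(-(true_w @ a - true_w @ b)))
        pairs.append((a, b) if rng.random() < p else (b, a))

    # Maximum-likelihood fit by gradient ascent on sum log sigmoid(w . diff).
    w = np.zeros(d)
    lr = 0.5
    for _ in range(300):
        grad = np.zeros(d)
        for win, lose in pairs:
            diff = win - lose
            p_win = 1 / (1 + np.exp(-(w @ diff)))
            grad += (1 - p_win) * diff
        w += lr * grad / len(pairs)

    cos = true_w @ w / (np.linalg.norm(true_w) * np.linalg.norm(w))
    print(f"cosine similarity between learned and true reward: {cos:.3f}")

With a single, homogeneous population of users this recovers the underlying reward well; the talk's question is what happens when no single such reward exists.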

However, this paradigm faces a fundamental challenge: people often disagree on what constitutes a "better" output. What, then, should we do when faced with diverse and conflicting preferences?

This talk explores two approaches to this challenge rooted in social choice theory. First, we take an axiomatic perspective, arguing that the process of learning reward functions should satisfy minimal requirements such as Pareto Optimality: if all users unanimously prefer one outcome to another, the aggregated reward function should reflect this. We show that current alignment methods necessarily violate these basic axioms. In contrast, we provide a proof-of-concept aggregation rule that is guaranteed to satisfy them.

Second, we explore a more radical approach: representing, rather than resolving, disagreement. Instead of training a single LLM, we train an ensemble, analogous to multi-winner voting systems. We introduce a novel criterion, pairwise calibration, inspired by proportionality. Together, these approaches provide a principled foundation for building AI systems aligned with the pluralism of human values.
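To make the Pareto Optimality axiom concrete, the toy checker below flags unanimous preferences that an aggregated reward reverses. The per-user scores and the failing aggregate are hypothetical illustrations, not the aggregation rules analysed in the papers.

    # Sketch: check the Pareto Optimality axiom for reward aggregation.
    # If every user scores output x above output y, the aggregate should too.
    import itertools
    import numpy as np

    def pareto_violations(user_rewards, agg_reward):
        # user_rewards: (n_users, n_outputs) array of per-user scores.
        # agg_reward: (n_outputs,) array of aggregated scores.
        # Returns pairs (x, y) unanimously ranked x > y that the
        # aggregate fails to rank strictly higher.
        n_outputs = user_rewards.shape[1]
        violations = []
        for x, y in itertools.permutations(range(n_outputs), 2):
            unanimous = np.all(user_rewards[:, x] > user_rewards[:, y])
            if unanimous and agg_reward[x] <= agg_reward[y]:
                violations.append((x, y))
        return violations

    # Two users, three outputs: both prefer output 0 to output 2,
    # yet this (hypothetical) aggregate scores output 2 higher.
    users = np.array([[3.0, 1.0, 2.0],
                      [2.0, 3.0, 1.0]])
    agg = np.array([1.0, 2.0, 1.5])
    print(pareto_violations(users, agg))   # -> [(0, 2)]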

Based on two papers:
https://arxiv.org/abs/2405.14758 (NeurIPS '24)
https://arxiv.org/abs/2506.06298 (Working paper)

(Joint work with Luise Ge, Evi Micha, Ariel D. Procaccia, Itai Shapira, Yevgeniy Vorobeychik, and Junlin Wu.)


To obtain the Zoom link, please subscribe to the Seminar Mailing List, or contact one of the organisers.


Reminder: On the seminar website you can find the video recordings, slides and supplementary materials for all past presentations, as well as information about future presentations.


--
Marcus Pivato
Centre d'Économie de la Sorbonne
Université Paris 1 Panthéon-Sorbonne