Online Social Choice and Welfare Seminar: Daniel Halpern, Tuesday 9 September
Marcus Pivato
Sep 3, 2025, 12:42:09 PM
to social-choice-a...@googlegroups.com, com...@duke.edu
[with apologies for cross-posting]
Dear all,
The next presentation in the Online Social Choice and Welfare Seminar will be next Tuesday (9 September). Here are the details.
Time: 9PM GMT (2PM San Francisco, 5PM Toronto/Montréal, 6PM Rio de Janeiro, 10PM London, 11PM Paris, 12AM Istanbul, 6AM Wednesday in Seoul, 9AM Wednesday in Auckland)
Title: "A Social Choice Perspective on AI Alignment"
Abstract:
Consider the problem of aligning large language models (LLMs) with human values. The standard approach begins with pairwise comparisons from users of the form "between these two outputs to the prompt, which do you prefer?" This response data is aggregated into a reward function, giving numerical scores to outputs, which is subsequently used to steer an existing LLM toward higher-reward answers. This process is essential for making LLMs helpful while avoiding dangerous or biased responses. However, this paradigm faces a fundamental challenge: people often disagree on what constitutes a "better" output. What, then, should we do when faced with diverse and conflicting preferences?

This talk explores two approaches to this challenge rooted in social choice theory. First, we take an axiomatic perspective, arguing that the process of learning reward functions should satisfy minimal requirements such as Pareto Optimality: if all users unanimously prefer one outcome to another, the aggregated reward function should reflect this. We show that current alignment methods necessarily violate these basic axioms. In contrast, we provide a proof-of-concept aggregation rule that is guaranteed to satisfy them. Second, we explore a more radical approach: representing, rather than resolving, disagreement. Instead of training a single LLM, we train an ensemble, analogous to multi-winner voting systems. We introduce a novel criterion, pairwise calibration, inspired by proportionality. Together, these approaches provide a principled foundation for building AI systems aligned with the pluralism of human values.
To obtain the Zoom link, please subscribe to the Seminar Mailing List, or contact one of the organisers.
Reminder: On the seminar website you can find the video recordings, slides and supplementary materials for all past presentations, as well as information about future presentations.