Hi Ben,
Congrats on the ludicrously compelling and meticulous work. It is a monumental contribution, and likely to be seminal. I have some initial questions/thoughts that cropped up while reading; I hope you don't mind my sharing them. None of it is withering criticism, I promise.
- does "prompt-based personalization" matter for some issues but not others? Or did it vary by initial belief strength? I'm admittedly surprised that there wasn't more of an effect there, since I think under a roughly Bayesian model of belief-updating then tailoring an informational argument to someone's priors would help. one worry is explanatorily-helpful details becoming lost in the scale of your stimuli, that is, I'd imagine many people didn't have well-formed opinions about some of the political topics discussed (especially because the issues were chosen to ensure most people had moderate support), so perhaps in those cases they didn't have much of a prior belief to target? the evidence you provide is very strong, but I do still find it hard to believe that tailoring an argument's informational content to someone's strong priors wouldn't help? And perhaps relatedly, I wonder if a stronger form of prior-tailoring, like based on someone's larger causal model of politics and behavior, would have more meaningful returns.
- Semi-relatedly, it might be interesting to look at how the format of the DV question shapes these results. How did DV questions vary in structure, rather than content? Were some very specific and others quite broad? Did some turn on essentially factual disputes ("increasing housing supply will reduce homelessness") while others were moral or politically strategic? I'm not sure "political issues" is a coherent category. The scale of your data should allow this kind of moderation analysis (sketch 2 below shows roughly the model I mean), and it could produce some very interesting results.
- How did the distribution of persuasive effects change with model scale, fine-tuning/RM, etc.?
- Also, it's incredibly cool to see that both ML approaches improved persuasion. How are you thinking about the ambiguity between "people who are more likely to change their minds liked particular kinds of messages" (which you then trained on) and "certain messages are more persuasive for everyone"? Put another way, is it possible that the ML approaches mostly squeezed extra juice out of ready-to-be-persuaded people? (Sketch 3 below is the toy version of the confound I'm worried about.)
- Can you not do supervised fine-tuning for proprietary models? I was under the impression that, e.g., the OpenAI platform allows for this (sketch 4 below is roughly what I had in mind).
- You say " Second, we used 56,283 additional conversations (covering 707 political issues) with GPT-4o to fine-tune a reward model (RM; a version of GPT-4o) that predicted belief change at each turn of the conversation, conditioned on the existing dialogue history." -- but I think this language makes it sound like you collected turn-by-turn belief change scores to do RM with (which you didn't, I think?). This is really minor but I got super excited about the prospect of a turn-by-turn dataset.
So much to dig into! But these were my first impressions.
Cheers,
Tom