The biggest issue with the changes is that while the backend process for 538 has been totally overhauled, the front-end report looks much as it always has. This can easily lead people to misinterpret the report as comparable to the last few elections. Even if the new 538 models were sound, presenting the findings in such a familiar format when the underlying process is so different wouldn't be much better.
This was originally going to be a Model Talk column, our weekly feature for paying subscribers. But since I've received numerous requests for public comment about this, I think I need to put it in front of the paywall. We'll make it up to paid subscribers with another post later this week or this weekend.
When the Silver Bulletin presidential forecast launched last month, I said I wasn't interested in prosecuting the "model wars", meaning having big public debates about forecasting methodology. One reason is that I find these arguments tiresome: I first published an election model in 2008, and it's been the same debates pretty much ever since. But there's also a more pragmatic consideration. If I think a model is unsound, I worry about elevating it by giving it even more attention. Because I do believe in probabilities, after all. Joe Biden's chance of winning another term is hard to forecast because (1) he might still drop out and (2) he's probably not capable of running the sort of normal campaign the model implicitly assumes he can. Biden's chances are probably lower than the current 28 percent in the Silver Bulletin forecast, in other words. But they're certainly not zero. I worry about a news cycle on Nov. 6 when an unsound model is validated because it "won" the model wars based on a sample size of one election.
What also makes this awkward is that the model I'm going to criticize comes from the site I used to work for, 538. I'm sure newsletter readers will know this, but what was formerly the FiveThirtyEight model1 from 2008-2022 is now the Silver Bulletin model — I retained the IP when I left Disney. But I'm not sure the rest of the world knows that. (I still sometimes run into people who think FiveThirtyEight is affiliated with the New York Times, which it hasn't been since 2013.) I worry a little bit about a Naomi Klein / Naomi Wolf situation in which criticism of the 538 model rebounds back on me.
However, various high-profile reporters have contacted me for comment. And I think I have a professional obligation to speak up. Not all that many people have explored the inner workings of models like these. Moreover, we're in an unusual circumstance where the models themselves have become part of the debate about what Biden should do. For instance, the 538 model — which showed Biden with a 53 percent chance of winning as of Thursday afternoon — has been cited by Biden defenders like Ron Klain, the former White House Chief of Staff, as a reason that Biden should stay in the race:
I'm not sure that Klain or anyone else should get their hopes up from the 538 model, however. At best, all it's really saying is that Biden will probably win because he's an incumbent: the polls have very little influence on the 538 forecast at this point. And at worst, it might be buggy. It's not easy to understand what it's doing or why it's doing it.
I thought the 538 model seemed basically reasonable when it was first published in June, showing the race as a toss-up. But its behavior since the debate — Biden has actually gained ground in their forecast over the past few weeks even though their polling average has moved toward Trump by 2 points! — raises a lot of questions. This may be by design — Morris seems to believe it's too early to really look at the polls at all. But if my model were behaving like this, I'd be concerned.
In the Silver Bulletin model, we take steps that are roughly similar to the 538 model. First, we take a current "snapshot" of the race — which is based on polling, although with some fancy adjustments to smooth out the data in states where there isn't much polling — and then regress it toward a prior based on "fundamentals" (which in our case consist of the economy and incumbency).
Right now, for instance, Biden trails by about 2 points nationally in our polling-based estimate, but our fundamentals forecast says he "should" eventually win the popular vote by roughly 2.5 points. Currently, our model uses roughly a 70/30 blend of the snapshot and the fundamentals.
Blending the polls and fundamentals yields a mix where Biden is projected to lose the popular vote by around 0.6 points — not so bad, actually, but keep in mind that Biden's Electoral College position is considerably worse than his standing in the popular vote.
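For the arithmetically inclined, here is the blend described above as a few lines of code. All of the numbers are the ones quoted in this post; margins are expressed as Biden-minus-Trump, in points.

```python
# Silver Bulletin's current blend, per the figures above.
poll_snapshot = -2.0   # Biden trails by ~2 points in the polling snapshot
fundamentals = 2.5     # fundamentals say he "should" win by ~2.5 points
w_polls = 0.70         # roughly a 70/30 polls/fundamentals blend

blend = w_polls * poll_snapshot + (1 - w_polls) * fundamentals
print(round(blend, 2))  # -> -0.65, i.e. Biden loses the popular vote by ~0.6
```

Note that the blended number lands, as it must, somewhere between the two inputs.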
538 projects Trump to lead the polling average by 2.7 points in Wisconsin on Election Day. And it thinks the fundamentals basically show a tie (Biden ahead, but by only 0.2 points). And yet somehow, it predicts Biden to win by 1.3 points in their "full forecast" — a number that falls outside the range spanned by either input. This doesn't make a lot of sense.
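You can make the Wisconsin problem concrete by backing out the fundamentals weight that a simple weighted average would need in order to produce 538's number. (Treating the full forecast as a weighted average of the two components is an assumption on my part — 538 hasn't published the mechanics — but it illustrates why the output is so strange.)

```python
# Wisconsin, margins as Biden-minus-Trump in points, per 538's own numbers.
polls = -2.7        # Trump +2.7 in their projected Election Day average
fundamentals = 0.2  # Biden +0.2
forecast = 1.3      # Biden +1.3 in the "full forecast"

# If forecast = w * fundamentals + (1 - w) * polls, solve for w:
w = (forecast - polls) / (fundamentals - polls)
print(round(w, 2))  # -> 1.38
```

Any weight between 0 and 1 would put the forecast somewhere between Trump +2.7 and Biden +0.2. An implied weight of roughly 1.38 means no ordinary blend of these two components can produce Biden +1.3.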
Now, that could at least theoretically be correct if they think the fundamentals are the more reliable indicator. The problem is, they don't seem to think that, or at least not based on this chart they've published. Look at the error bar for their "fundamentals-only forecast". It's incredibly wide — notably larger than the error for their polling component.2 In fact, the 95 percent interval on their fundamentals estimate covers everything from roughly Trump +20 to Biden +30 (!), results that are nearly impossible in today's highly polarized political environment.
So the chart is telling us that Morris doesn't think the fundamentals are very informative at all. And yet, his model seemingly assigns 85 percent of the weight to the fundamentals. (I'm going to use terms like "seemingly" a lot because of the lack of transparency in what the 538 model is actually doing.) As a principle of model design, it's almost axiomatic that if you're blending two or more components into an average, you'll want to place more weight on the more reliable component. But they're doing just the opposite.
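That "almost axiomatic" principle has a standard formalization: inverse-variance weighting, where each component's weight is proportional to one over its error variance. The fundamentals sigma below is backed out of 538's own chart (a roughly 50-point 95 percent interval, i.e. about ±25 points); the 6-point polling sigma is purely my assumption for illustration, chosen only to be notably smaller, as their chart shows.

```python
# Inverse-variance weighting: weight each component by 1 / sigma^2.
sigma_fund = 25 / 1.96   # ~12.8 points, implied by a Trump +20 to Biden +30 interval
sigma_polls = 6.0        # hypothetical, for illustration only

w_fund = (1 / sigma_fund**2) / (1 / sigma_fund**2 + 1 / sigma_polls**2)
print(round(w_fund, 2))  # -> 0.18
```

Under these (admittedly rough) numbers, the fundamentals would deserve something like an 18 percent weight, not 85 percent.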
It also doesn't explain what's going on in Wisconsin or in other states like Ohio — where, to repeat, 538 has Biden doing much better in their full forecast than in either the polls or the fundamentals.
The explanation that Morris has given for this discrepancy is jargony and hard to parse. I might do a follow-up post where I try my best, but I'm reluctant to do his work for him. (He claims to be too busy to provide a longer explanation.) So far, what he's written raises as many questions as it answers.
But what everyone should know is that statistical models like these are complex and can very easily go wrong. Models can contain coding errors — like, say, flipping a plus sign for a minus sign — or can suffer from incorrect data (e.g. mistakenly inputting an Alabama poll as an Arizona poll). But more than that, models have a lot of complicated components, and if you aren't careful they can be less than the sum of their parts.
Moreover, it's often hard to detect these design flaws through backtesting alone — usually you only learn the hard way once a model is stress-tested under real-world conditions. (Because Morris's model is new this year, it hasn't endured one of those tests yet.) It's a little bit like thinking you've engineered a good car, but for some reason the first time you test drive it, it continuously drifts to the left-hand side of the road and won't go faster than 45 miles per hour. Sorry, but it's time to go back to the lab when that sort of thing happens and not pass off a bug as a feature ("it's actually good that you can only go 45 MPH because you'll get in fewer accidents that way!").
However, even if the 538 model is working as intended, I don't think it's informing us of much. Its thesis is basically this: Joe Biden is a reasonably clear favorite to win the popular vote because he's an incumbent, and it's too early to really update that assumption based on the polling or anything else. Indeed, you shouldn't really think of the 538 model as a polling-based model at all, given that their forecast has actually moved in the opposite direction of the polling so far. I worry that people like Klain are interpreting the 538 model as saying Biden's polling is fine, when it isn't really saying anything about the polling.
Basically, the chart says that polls don't tell you very much until about 75 days before the election (roughly late August), at which point they begin to rapidly converge toward what they're going to say on Election Day. This still isn't a reason to favor fundamentals over polls, because the fundamentals are even less reliable according to 538's calculations.3
But leave that aside for now. I also don't really buy the chart, and I think it badly exaggerates the amount of polling movement we're likely to see. Why? Two reasons. One is that it's based on state polls, not national polls. State polling averages are much noisier for various reasons, but mostly just because each state will only get polled a few times a month, or sometimes much less than that. Much of what a state polling average captures is just statistical noise, not real movement in the race. The error is going to be much less if you use the sort of fancy polling averages that 538 or Silver Bulletin do, which can combine state and national polls to give you a much more precise snapshot of the race. For instance, if there were two or three polls of West Virginia in the 1976 election between Jimmy Carter and Gerald Ford and they bounced around a lot, I don't think that tells us very much about the current environment.
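A quick simulation makes the point about sparse state polling. Every parameter here is assumed for illustration — the idea is simply that even when true opinion never moves at all, an average built from three polls a month will bounce around far more than one built from thirty.

```python
import random

random.seed(0)  # fixed seed so the sketch is reproducible

def monthly_average(n_polls, poll_sd, true_margin=0.0):
    """Average of n_polls noisy readings of a margin that never actually moves."""
    return sum(random.gauss(true_margin, poll_sd) for _ in range(n_polls)) / n_polls

# Twelve months of averages: a sparsely polled state vs. a heavily polled nation,
# with ~3.5 points of error on each individual poll's margin (an assumption).
state = [monthly_average(3, 3.5) for _ in range(12)]
national = [monthly_average(30, 3.5) for _ in range(12)]

spread = lambda xs: max(xs) - min(xs)
print(round(spread(state), 1), round(spread(national), 1))
# The state series swings by several points; the national one barely moves.
```

All of that apparent "movement" in the state series is pure sampling noise, which is exactly the trap a chart built on raw state polling averages falls into.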