Opponent adjustments

Bill Connelly

Dec 15, 2012, 2:05:14 PM

to study-hal...@googlegroups.com

Alright, let's get this listserv going.

A lot of us have set up our own system of ratings in recent years, and while we all know that opponent adjustments are EVERYTHING in college sports, we also probably know that there are a thousand different ways to go about making opponent adjustments. To the extent that you are willing to share (because we are by nature rather protective of our methods), what do you feel is the most sound way of doing so?

Brian Fremeau

Dec 16, 2012, 3:08:19 PM

to study-hal...@googlegroups.com

Good question. I imagine there are many opinions on this topic. Some initial thoughts:

How are opponent-adjustments applied? Is it a season-long performance adjusted against the composite strength of the opposition faced? Is it applied game by game? Is it applied possession by possession or play by play?

Ed Feng had an interesting post regarding the Colley Matrix, which to Ed's surprise (and mine) does not take into consideration the specific results of games. Swap wins with losses against a given schedule, and Colley spits out the same rating: http://thepowerrank.com/2012/12/10/the-shocking-truth-about-the-colley-matrix-bcs-computer-poll/
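For anyone who wants to check this themselves, here is a minimal sketch of the Colley method as I understand it from Colley's published description: ratings solve C r = b, where the diagonal of C is 2 plus games played, each off-diagonal entry is minus the number of meetings between the two teams, and b is 1 + (wins - losses) / 2. Only records and schedules enter, so reversing the results inside a cycle (which leaves every record unchanged) leaves every rating unchanged. The teams and results below are made up for illustration, and the solver is a plain Gauss-Jordan elimination rather than anything Colley specifies.

```python
def colley_ratings(games):
    """games: list of (winner, loser) pairs. Returns {team: rating}."""
    teams = sorted({team for game in games for team in game})
    idx = {team: i for i, team in enumerate(teams)}
    n = len(teams)
    # Colley system C r = b: start from C = 2*I and b = 1, then add each game.
    C = [[2.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    b = [1.0] * n
    for winner, loser in games:
        w, l = idx[winner], idx[loser]
        C[w][w] += 1.0   # one more game played by the winner
        C[l][l] += 1.0   # one more game played by the loser
        C[w][l] -= 1.0   # one more meeting between the two teams
        C[l][w] -= 1.0
        b[w] += 0.5      # b = 1 + (wins - losses) / 2
        b[l] -= 0.5
    # Gauss-Jordan elimination; C is strictly diagonally dominant,
    # so no pivoting is needed.
    for col in range(n):
        for row in range(n):
            if row == col:
                continue
            factor = C[row][col] / C[col][col]
            for k in range(n):
                C[row][k] -= factor * C[col][k]
            b[row] -= factor * b[col]
    return {team: b[idx[team]] / C[idx[team]][idx[team]] for team in teams}

# Hypothetical four-team round robin; A, B, C form a cycle.
original = [("A", "B"), ("B", "C"), ("C", "A"),
            ("A", "D"), ("D", "B"), ("C", "D")]
# Same schedule with the A-B-C cycle reversed; every record is unchanged.
reversed_cycle = [("B", "A"), ("C", "B"), ("A", "C"),
                  ("A", "D"), ("D", "B"), ("C", "D")]

print(colley_ratings(original))
print(colley_ratings(reversed_cycle))
```

Both calls print the same ratings, because the two game lists produce the same C and the same b.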

Matthew Smith challenged the assertion that this was actually a problem, and he, Ed, and I had a bit of a Twitter exchange about it: http://twitter.com/bcfremeau/status/278618320276713472

I'm of the opinion, like Ed, that the specific results do matter and that beating a good team and losing to a poor one is not the same as beating the poor team and losing to the good one. Not precisely the same, anyway. As Matthew Smith pointed out, and I have discussed at FO, my FEI rating system includes a 'relevance' factor as part of the opponent-adjustment. Generally speaking, this means that games against teams that are of similar strength receive more weight in my formula, but I also add more weight to bad losses.

In the current issue of ESPN the Magazine, Peter Keating wrote a college football bit about the SRS (simple rating system), trumping it up as the best rating system in existence. It's absurd to consider any individual system the best without a lot of data to back it up, and Keating's column is pretty thin in that regard. Aside from the hyperbole, the article did get me thinking about a key aspect of SRS that Keating valued -- the output is represented as the "points better than average" of each team. That is, Alabama is 30 points better than an average team, Oregon is 28 points better, Notre Dame is only 22 points better, etc.
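To make the "points better than average" output concrete, here is a rough sketch of SRS in its usual form: a team's rating is its average scoring margin plus the average rating of its opponents, normalized so the mean rating is zero. The damped iteration, the function name, and the three made-up teams are my own choices for illustration, not anything from Keating's article.

```python
def srs_ratings(games, iterations=500):
    """games: list of (team, opponent, margin), margin from team's side.
    Returns {team: points better than average}."""
    margins, opponents = {}, {}
    for team, opp, margin in games:
        # Record each game from both teams' points of view.
        for t, o, m in ((team, opp, margin), (opp, team, -margin)):
            margins.setdefault(t, []).append(m)
            opponents.setdefault(t, []).append(o)
    avg_margin = {t: sum(ms) / len(ms) for t, ms in margins.items()}
    ratings = dict(avg_margin)
    for _ in range(iterations):
        # SRS condition: rating = average margin + average opponent rating.
        target = {
            t: avg_margin[t]
            + sum(ratings[o] for o in opponents[t]) / len(opponents[t])
            for t in ratings
        }
        # Move halfway toward the target each pass to avoid oscillation.
        ratings = {t: 0.5 * ratings[t] + 0.5 * target[t] for t in ratings}
    mean = sum(ratings.values()) / len(ratings)
    return {t: r - mean for t, r in ratings.items()}

# Hypothetical results: A beat B by 10, B beat C by 10, A beat C by 20.
games = [("A", "B", 10), ("B", "C", 10), ("A", "C", 20)]
ratings = srs_ratings(games)
print(ratings)
print(ratings["A"] - ratings["B"])  # read as the expected margin of A over B
```

In this tidy example the margins are perfectly consistent, so the rating differences reproduce every game margin. Real schedules are not this consistent, and reading the difference of two ratings as an expected margin is exactly the linearity assumption in question.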

Most systems represent their data this way. Maybe not in terms of points, but in terms of relationship to average. FEI is represented this way. The thing that jumped out at me from Keating's article was that we take the transitive property for granted when we represent our data this way. Alabama is 30 points better than average, Notre Dame is 22 points better than average, ergo, Alabama is 8 points better than Notre Dame. It makes intuitive sense to draw that conclusion, but I wonder if that's actually what the numbers themselves mean. Does a great team's relationship to an average team have a linear relationship to its relationship to another great team?

I represent strength of schedule as an elite team's likelihood of going undefeated against the schedule. That's quite different from measuring its ability to play against a schedule full of average teams. Should the rating system be reconstructed to measure each team against an elite team rather than an average one?
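A rough sketch of the difference between the two views of schedule strength, under assumptions that are mine and not FEI's: ratings are in points relative to average, single-game win probability is a logistic function of the rating gap with a made-up scale of 10 points, and games are treated as independent so the chance of going undefeated is the product of the per-game win probabilities.

```python
import math

def win_prob(rating_diff, scale=10.0):
    """Assumed logistic win probability; scale is a hypothetical parameter."""
    return 1.0 / (1.0 + math.exp(-rating_diff / scale))

def undefeated_prob(team_rating, opponent_ratings, scale=10.0):
    """Chance of winning every game, treating games as independent."""
    prob = 1.0
    for opp in opponent_ratings:
        prob *= win_prob(team_rating - opp, scale)
    return prob

def average_sos(opponent_ratings):
    """Conventional SOS: plain average of opponent ratings."""
    return sum(opponent_ratings) / len(opponent_ratings)

# Two hypothetical schedules with the same average opponent rating.
all_average = [0.0, 0.0, 0.0, 0.0]
two_elite_two_weak = [25.0, 25.0, -25.0, -25.0]
elite = 25.0  # hypothetical elite team, 25 points better than average

for schedule in (all_average, two_elite_two_weak):
    print(average_sos(schedule), undefeated_prob(elite, schedule))
```

The two schedules look identical by average opponent rating, but the elite team is much less likely to run the table against the one containing two other elite teams.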

Sef Dresslar

Dec 18, 2012, 4:47:45 PM

to study-hal...@googlegroups.com

All good thoughts, and it's something that's bothered me for a while. My stats are adjusted evenly based on an average of a team's opponents (though possession and play stats are weighted by how many possessions and plays the team faced against each opponent, of course). It seems intuitive that games between more evenly matched teams should be given more relevance in a system, but does history bear that out? For a given team, does a retrodictive study where their ratings are recalculated based only on more equal opponents predict the result of a specific matchup more accurately than if they are recalculated based on their most unequal opponents?

Matthew Smith

Dec 21, 2012, 6:05:59 PM

to study-hal...@googlegroups.com

Without going too much into specifics, I consider a number of situational adjustments for my schedule calculations as opposed to just opponent quality. Obviously home/away/neutral, but also distance (see http://cfn.scout.com/2/1063535.html for a demonstration of why this matters), bye weeks (not a huge impact but non-zero), etc.

I've tried in the past to tweak my system to analyze quality of offense vs defense instead of just overall team strength but have never gotten anything useful out of the exercise. There's a lot more data available than there used to be, though, so I may end up playing around with it a bit in the coming offseason. Depending on the level of weekly information that's easy to pull live, I may also try to account for certain luck statistics (mainly fumble luck) on an ongoing basis instead of just using that as part of my preseason projections.

As far as Brian's point about Keating's system goes, I agree. Someone who states that their system is "the best" even with a decent amount of supporting data is overstating their case and at best straddling the line of obnoxiousness; someone who just says it flat out with no backup whatsoever (which it sounds like that guy did) is just being ridiculous.

Brian, one question I had about your schedule calculations: do your "likelihood of going undefeated" schedule calculations actually figure in as a direct input to your model, or are they a function of the ratings you've already calculated? I don't know if this is part of the internal weighting system or just an output that you use because it's interesting. If it's the latter, it might not be a bad idea to list both "standard SOS" (an average, possibly weighted by similarity if you feel that's more appropriate) along with what I'd call "elite SOS" (since for someone like, say, UMass, there's little functional difference between a game against Alabama and one against South Carolina, even though clearly it matters a lot for a top-tier team).

Chris Treadaway

Dec 21, 2012, 9:06:40 PM

to study-hal...@googlegroups.com, study-hal...@googlegroups.com

The model I built for the 2003-2004 bowl season (and my stats project in business school for that matter) was iterative. I started with full season statistics and ran a stepwise regression solving for wins to find out the variables that statistically mattered the most for victory, not points, point differential, yardage differential, or whatever else. All teams were assumed to have the same strength of schedule to start.

Then I looked at interconference records to try to weight conferences against each other, ultimately assigning a dirty expected point differential when trying to compare one team against the next. Then I re-ran the model.

I'd like to say it was more scientific, but I then eyeballed it for the third run -- which attempted to solve for the different rankings (BCS, Coaches, AP).

At the end of the day, it told me that LSU was 7 points better than Oklahoma -- which encouraged this Tiger fan to buy a ticket to watch LSU beat Oklahoma 21-14 for the national title. The model was 72% against the spread for the entire bowl season.

The model was dirty, but I think it got smarter as it went along. And most notably, it helped me identify outliers in the point spreads... games that the model aced 100%.
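For anyone wanting to score a model the same way, here is a small sketch of an against-the-spread tally plus a way to surface the games where the model and the spread disagree most. The conventions are assumptions for illustration: every margin is expressed from the same team's point of view, the spread is written as the market's expected margin for that team, and the sample numbers are invented rather than taken from the 2003-2004 bowls.

```python
def ats_record(games):
    """games: list of (model_margin, spread_margin, actual_margin), all from
    the same team's point of view. Returns (wins, losses, no_decisions)."""
    wins = losses = no_decisions = 0
    for model, spread, actual in games:
        if model == spread or actual == spread:
            # Either the model has no pick or the game landed on the number.
            no_decisions += 1
        elif (model > spread) == (actual > spread):
            wins += 1    # model and result fell on the same side of the spread
        else:
            losses += 1
    return wins, losses, no_decisions

def biggest_disagreements(games, top=3):
    """Games where the model margin is farthest from the spread."""
    return sorted(games, key=lambda g: abs(g[0] - g[1]), reverse=True)[:top]

# Hypothetical games: (model margin, spread margin, actual margin).
sample = [(7, 3, 7), (-3, 4, 10), (10, 6.5, 3), (2, -1, 0)]
print(ats_record(sample))
print(biggest_disagreements(sample, top=1))
```

Sorting by the gap between model and spread is one simple way to define an "outlier"; a real study would also want to check whether those games actually hit more often than the rest.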

I've wanted to go back to re-run it all against results to see if I can refine it or better understand the losers in the model. I'm also eager to use some of the data sets that people have here to see how things like Bill's #s and drive charts and so on can be integrated. I'm convinced the data, if properly harnessed, can do a pretty good empirical job of assessing which teams are better than others.

I was going to do it for the 2012 bowl season, but we just had our first child - a baby girl. So I might not get it done in time for the real bowls.