Monte's method

Danny

Mar 19, 2012, 2:06:10 PM
to Machine March Madness
Monte was kind enough to share more details about his approach for
team "Predict the Madness".
http://blog.smellthedata.com/2012/03/predict-madness-by-monte-mcnair.html

Scott Turner

Mar 19, 2012, 2:20:46 PM
to machine-ma...@googlegroups.com

There are a couple of interesting things here, and I'll try to comment on them in detail when I get a chance.

-- Scott

Scott Turner

Mar 19, 2012, 10:06:45 PM
to machine-ma...@googlegroups.com
On Mon, Mar 19, 2012 at 2:06 PM, Danny <danny...@gmail.com> wrote:

Some comments:

> I want to know how important playing at home is so that I can strip this out for neutral site games.

It's a common assumption that neutral court games should be treated differently from games played at one team's home court, but is that really true? The SI article that looked at home court advantage concluded that it was primarily due to the referees treating the home team differently. That jibes with something I found, namely that large home underdogs don't get a home court advantage (HCA):

http://netprophetblog.blogspot.com/2011/06/idle-experiment.html

Presumably the refs don't give the benefit of the doubt when they know the home team is overmatched.

I did some other experiments (prior to starting the blog, so they aren't documented there) where I trained a predictor on regular season games using just a strength measure for each team, so that the prediction equation looked like this:

    MOV =  (C1 * Strength of Home Team) + (C2 * Strength of Away Team) + C3

C2 was negative, and C3 (along with any difference in magnitude between C1 and C2) was the "home court advantage".
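For anyone who wants to try this, here is a minimal sketch of fitting that equation by ordinary least squares. The data is synthetic and made up purely for illustration, the function name is hypothetical, and this is not necessarily how the original experiment was run.

```python
import numpy as np

def fit_mov_model(home_strength, away_strength, mov):
    """Least-squares fit of MOV = C1*home + C2*away + C3 (hypothetical helper)."""
    X = np.column_stack([home_strength, away_strength, np.ones(len(mov))])
    coeffs, *_ = np.linalg.lstsq(X, mov, rcond=None)
    return coeffs  # array of (C1, C2, C3)

# Synthetic regular-season games; every number here is invented for the demo.
rng = np.random.default_rng(0)
home = rng.normal(0, 10, 500)   # strength of each home team
away = rng.normal(0, 10, 500)   # strength of each away team
mov = 1.0 * home - 0.9 * away + 3.5 + rng.normal(0, 1, 500)

c1, c2, c3 = fit_mov_model(home, away, mov)
print(c1, c2, c3)  # C2 comes out negative; C3 is the constant home edge
```

With real data, the strength columns would be whatever rating is being tested, and C3 plus the gap between C1 and |C2| is what the fit attributes to playing at home.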

I then tested the accuracy of this predictor on NCAA tournament games, first treating the higher seed as the home team, then the lower seed as the home team, and then washing out HCA altogether by dropping C3 and forcing C1 and C2 to be equal in magnitude (with opposite signs).

What I found was that the best prediction was made treating the higher seed as the home team.  This makes some intuitive sense -- the refs are giving the benefit of the doubt to the team that they "know" is the better team.  So I'm a little dubious that there's really no "HCA" in tournament games, although I don't know that anyone else has looked at it.
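A rough sketch of the three treatments, for concreteness. The coefficients and games below are hypothetical, and averaging the magnitudes of C1 and C2 for the neutral case is just one assumed way to force them equal (refitting with that constraint would be another).

```python
def predict_mov(coeffs, s_high, s_low, mode):
    """Predicted margin from the higher seed's point of view."""
    c1, c2, c3 = coeffs
    if mode == "higher_home":      # higher seed plays the "home" role
        return c1 * s_high + c2 * s_low + c3
    if mode == "lower_home":       # lower seed plays the "home" role
        return -(c1 * s_low + c2 * s_high + c3)
    if mode == "neutral":          # no C3, symmetric coefficients (assumption)
        k = (c1 - c2) / 2.0        # c2 is negative, so this averages magnitudes
        return k * (s_high - s_low)
    raise ValueError(f"unknown mode: {mode}")

def accuracy(coeffs, games, mode):
    """Fraction of games where the predicted winner matches the actual winner."""
    hits = sum(
        (predict_mov(coeffs, s_high, s_low, mode) > 0) == (actual > 0)
        for s_high, s_low, actual in games
    )
    return hits / len(games)

# Hypothetical coefficients and (higher seed strength, lower seed strength,
# actual margin for the higher seed) triples; not real tournament data.
coeffs = (1.0, -0.9, 3.5)
games = [(20.0, 18.0, 5), (15.0, 14.0, -2), (30.0, 10.0, 12)]
for mode in ("higher_home", "lower_home", "neutral"):
    print(mode, accuracy(coeffs, games, mode))
```

Run over a real set of tournament games, comparing the three accuracies is the whole experiment.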

> For example, say North Carolina played Duke on January 7th in one of my training games. For North Carolina's profile, I used stats from all of their games before AND after January 7th.

I can see some problems with that -- for example, if the team's lineup changes significantly at some point.  And obviously you can't use any strength measures (like RPI) if they include the game being predicted. 
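To illustrate the second point, here is a small sketch of building a team profile that leaves out the game being predicted. The average-margin "profile", the field names, and the numbers are all hypothetical stand-ins for whatever stats are actually used.

```python
def profile_excluding(games, excluded_id):
    """Average margin over a team's games, leaving out the game being predicted."""
    margins = [g["margin"] for g in games if g["id"] != excluded_id]
    return sum(margins) / len(margins) if margins else None

# Invented results for one team; "g2" stands in for the January 7th game.
unc_games = [
    {"id": "g1", "margin": 10},
    {"id": "g2", "margin": -4},
    {"id": "g3", "margin": 6},
]

# Games before AND after g2 are used, but g2 itself is not.
unc_profile = profile_excluding(unc_games, "g2")
print(unc_profile)
```

A composite rating like RPI would need the same treatment: recompute it without the target game rather than taking the published number.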

> The next thing to do is to take our matchup predictions and maximize our expected points based on the scoring system we are presented with. While this is most beneficial when scoring systems provide bonuses for picking upsets or some other unique scoring, it can still be helpful in basic scoring systems and is better than simply advancing winners round by round.

I think I understand what you're trying to do here, and if so it's pretty clever.  I can't imagine how you try to optimize it over all 63 games, though.

It's similar to a common human picking strategy I've considered implementing: pick upsets in a branch leading to a very strong team. The idea is that if you get the upset wrong, your losses will be cut off when the team that didn't get upset loses to the strong team. (If that makes sense.)
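As a toy version of the expected-points idea, here is a sketch that brute-forces a 4-team bracket. The win probabilities and scoring tables are invented, and this is only my guess at the general shape of the approach, not Monte's actual method. Enumerating every bracket obviously doesn't scale to 63 games.

```python
from itertools import product

def advance_probs(p):
    """For semis (0 vs 1) and (2 vs 3): P(win semi) and P(win title) per team."""
    semi = [p[0][1], p[1][0], p[2][3], p[3][2]]
    title = []
    for t in range(4):
        opponents = (2, 3) if t < 2 else (0, 1)
        title.append(semi[t] * sum(semi[o] * p[t][o] for o in opponents))
    return semi, title

def best_bracket(p, points):
    """Try all 8 brackets; points[r][t] is the payoff for picking team t in round r."""
    semi, title = advance_probs(p)
    best = None
    for a, b in product((0, 1), (2, 3)):   # semifinal picks
        for c in (a, b):                   # champion must be one of the picks
            ev = points[0][a] * semi[a] + points[0][b] * semi[b] + points[1][c] * title[c]
            if best is None or ev > best[1]:
                best = ((a, b, c), ev)
    return best

# p[i][j] = assumed probability that team i beats team j (made-up numbers).
p = [
    [0.0, 0.8, 0.6, 0.7],
    [0.2, 0.0, 0.35, 0.4],
    [0.4, 0.65, 0.0, 0.55],
    [0.3, 0.6, 0.45, 0.0],
]
flat = [[1, 1, 1, 1], [2, 2, 2, 2]]    # basic scoring
bonus = [[1, 3, 1, 2], [2, 2, 2, 2]]   # extra first-round points for underdogs

print(best_bracket(p, flat))
print(best_bracket(p, bonus))
```

With the flat table the best bracket just advances the favorites; with the upset bonus it takes the first-round underdog in the 2 vs 3 game even though that team is less likely to win, which is the kind of effect the scoring-aware step is after.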

-- Scott