Saturday, November 28, 2009

Week 12 NFL Predictions (model v1.1)

I tweaked the model somewhat. While I have not fully eliminated the extremely high/low probability scenarios, they will occur less often in the current version of the model (1.1). This version will still tend to pick the same winners as the previous one, but the predicted scores and win probabilities are somewhat different.

The difference between this version and the previous one is actually more of a bug fix than a substantive change. The previous version used each team's average score for the season to date as a starting point. It then shifted that starting point up or down based on how well the team had done relative to the overall average across teams, along with factors related to the specific opponent. This was redundant: a team that is better than average started with a better-than-average score and was then shifted up further. So this version starts every team at the overall average and simply shifts that value based on how well the team has been doing and on the specific opponent. It's a subtle distinction, but one that should predict fewer blowouts, and that appears to be the case. Below are predictions using v1.0 and then v1.1. I added the predicted margin along with the point spread set by Vegas to see how the model does not just in picking winners but also against the spread (ATS).
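To make the double counting concrete, here is a minimal sketch. The additive form of the shifts, the function names, and the numbers are all assumptions made for illustration; the model's actual adjustment formula is not spelled out here.

```python
def offense_shift(team_avg, league_avg):
    # How far the team's scoring is above or below the league average.
    return team_avg - league_avg


def opponent_shift(opp_allowed_avg, league_avg):
    # How far the opponent's points allowed is above or below the league average.
    return opp_allowed_avg - league_avg


def predicted_score_v10(team_avg, opp_allowed_avg, league_avg):
    # v1.0-style (assumed form): start from the team's own average, then shift.
    # The team's quality is counted twice: once in the baseline, once in the shift.
    return (team_avg
            + offense_shift(team_avg, league_avg)
            + opponent_shift(opp_allowed_avg, league_avg))


def predicted_score_v11(team_avg, opp_allowed_avg, league_avg):
    # v1.1-style (assumed form): start from the league average, then shift.
    return (league_avg
            + offense_shift(team_avg, league_avg)
            + opponent_shift(opp_allowed_avg, league_avg))


# Hypothetical numbers: a strong offense (28 per game) facing a weak defense
# (allows 24 per game) in a league that averages 21 points per team.
v10 = predicted_score_v10(28, 24, 21)
v11 = predicted_score_v11(28, 24, 21)
print(v10, v11)
```

Under these assumptions the two versions differ by exactly the team's distance from the league average, which is why good teams were pushed toward blowout scores in v1.0.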

Version 1.0
Away        Home      Away Score  Home Score  Away Win Prob.  Predicted Margin  Vegas Spread
Packers     Lions     39          16          0.97            -23               -10
Raiders     Cowboys   13          27          0.18            14                13.5
Giants      Broncos   28          18          0.8             -10               -6.5
Browns      Bengals   15          26          0.22            11                13.5
Buccaneers  Falcons   15          36          0.06            21                12
Dolphins    Bills     28          14          0.87            -14               -3
Seahawks    Rams      23          11          0.82            -12               -4
Panthers    Jets      16          23          0.26            7                 3
Redskins    Eagles    10          28          0.07            18                9
Colts       Texans    31          19          0.85            -12               -3.5
Chiefs      Chargers  12          34          0.006           22                13.5
Jaguars     49ers     19          22          0.39            3                 3
Cardinals   Titans    34          21          0.82            -13               3
Bears       Vikings   19          40          0.05            21                11
Steelers    Ravens    20          23          0.42            3                 2.5
Patriots    Saints    35          46          0.23            11                2

These picks are not particularly controversial, but several of the probabilities are extreme.

Version 1.1
Away        Home      Away Score  Home Score  Away Win Prob.  Predicted Margin  Vegas Spread
Packers     Lions     35          18          0.93            -17               -10
Raiders     Cowboys   9           26          0.09            17                13.5
Giants      Broncos   23          22          0.56            -1                -6.5
Browns      Bengals   12          26          0.13            14                13.5
Buccaneers  Falcons   19          33          0.14            14                12
Dolphins    Bills     26          19          0.71            -7                -3
Seahawks    Rams      25          14          0.81            -11               -4
Panthers    Jets      17          24          0.29            7                 3
Redskins    Eagles    14          23          0.2             9                 9
Colts       Texans    26          18          0.77            -8                -3.5
Chiefs      Chargers  16          29          0.05            13                13.5
Jaguars     49ers     20          23          0.41            3                 3
Cardinals   Titans    31          21          0.76            -10               3
Bears       Vikings   20          31          0.18            11                11
Steelers    Ravens    19          21          0.44            2                 2.5
Patriots    Saints    28          32          0.41            4                 2


In the second set of predictions the average absolute predicted margin is 9.25 points, compared to 13.44 in the first version. Neither value is clearly better. The average margins of victory for the last two weeks were 9.38 and 8.19, but three weeks ago the average was 13.92.
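For anyone who wants to check the arithmetic, the two averages can be reproduced from the Predicted Margin columns above. This is just a small sketch; the variable and function names are mine.

```python
from statistics import mean

# Predicted margins copied from the Version 1.0 and Version 1.1 tables,
# in the same game order (Packers/Lions first, Patriots/Saints last).
margins_v10 = [-23, 14, -10, 11, 21, -14, -12, 7, 18, -12, 22, 3, -13, 21, 3, 11]
margins_v11 = [-17, 17, -1, 14, 14, -7, -11, 7, 9, -8, 13, 3, -10, 11, 2, 4]


def mean_abs_margin(margins):
    # Average size of the predicted margin, ignoring which side is favored.
    return mean(abs(m) for m in margins)


avg_v10 = mean_abs_margin(margins_v10)
avg_v11 = mean_abs_margin(margins_v11)
print(round(avg_v10, 2), round(avg_v11, 2))
```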

I hope to make a few additional changes soon, but I wanted to get these numbers up before the majority of the games were played.

Sunday, November 22, 2009

Football Prediction Model v1.0

Before going into the mechanics of the model, I want to examine the accuracy of the predictions this week:

Teams       Predicted Score  Actual Score  Predicted Probability

Dolphins    29               24            .72
Panthers    21               17

Redskins    9                6
Cowboys     26               7             .90

Browns      8                37
Lions       14               38            .74

49ers       18               24
Packers     28               30            .86

Steelers    27               24            .94
Chiefs      9                27

Falcons     31               31            .54
Giants      30               34

Saints      48               38            .95
Buccaneers  16               7

Bills       14               15
Jaguars     21               18            .69

Colts       30               17            .71
Ravens      22               15

Seahawks    20               9
Vikings     39               35            .93

Cardinals   35               21            .95
Rams        11               13

Jets        18               14
Patriots    32               31            .81

Bengals     26               17            .79
Raiders     15               20

Chargers    26               32            .83
Broncos     17               3


While the model correctly picked 11 of the 14 games played so far this week, I think there are some problems with the underlying assumptions. I'm not at all bothered by the Falcons/Giants result, where the model predicted a close game (and nailed the Falcons' final score) and only had the Falcons winning 54% of the time. But for the Steelers/Chiefs game, and frankly the Redskins/Cowboys and Browns/Lions games and others, I think that the win probabilities, or at least the scoring distributions, were way off.

My model samples 10,000 times from distributions that estimate the final score of each team. You can see how my model came to the probability of .94 in favor of the Steelers.

It's hard to know whether the model's failure to predict the winner of this game is simply because this happened to be one of those 6% of games in which the Steelers land toward the left of their curve and the Chiefs toward the right of theirs.

My suspicion, however, is that a team should very rarely be given a probability greater than 90% to win any game. There are too many unpredictable factors. So I may introduce a fudge factor into the next version of the model for cases where the disparity is high. On the other hand, the Seahawks/Vikings and Saints/Buccaneers games were also predicted to be blowouts, and the model was pretty close to the mark. I still question the distributions though, especially at the higher ends.

So how did I come up with this first version of the model? Well, I was originally expecting to use many factors, such as yards/att for passing and rushing, interceptions, sacks, aspects of the offensive line, home/away status, and others. So I decided to run a stepwise regression with a dozen obvious factors, in which the predicted variable was win-loss record (I used 2008 data so that I had an entire season's data set to work with).


It's pretty clear from the beginning that some of the factors appear to be important (coefficient confidence intervals do not overlap with the vertical 0 line) and others are not. But this is what happens when I include in the model a simple factor that is simply the average points scored minus average points allowed:

Amazingly, nothing else matters. This one variable captures 83% of the variance for the entire season:

So while I would much rather make a model that is able to predict the outcome of games based on the underlying mechanics of the game, I'm starting by just focusing on how many points each team scores and how many points they allow.

In this version of the model I compare each team's performance with the overall averages and then adjust the parameters for any given matchup. For instance, a team that does much better on defense than average will shift its opponent's likely score downward. I then produce constants for each team based on their opponent and sample 10,000 times from a modified normal distribution with a standard deviation based on that team's sample standard deviation. The win probability for a team is simply the number of samples in which its score was higher than its opponent's, divided by 10,000.
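The sampling step can be sketched roughly as follows. This is a minimal illustration rather than the actual model code: the function names, the fixed seed, and the standard deviation of 8 points are assumptions, the means are simply the predicted Steelers and Chiefs scores from the table above, and the redraw limits of 0 and 65 are the ones I mention in the list of open questions.

```python
import random


def sample_score(mean, sd, rng, low=0.0, high=65.0):
    # Draw one score from a normal distribution, redrawing any value
    # that falls outside the allowed range.
    while True:
        score = rng.gauss(mean, sd)
        if low <= score <= high:
            return score


def win_probability(mean_a, sd_a, mean_b, sd_b, n_sims=10_000, seed=0):
    # Fraction of simulated games in which team A outscores team B.
    rng = random.Random(seed)
    wins_a = 0
    for _ in range(n_sims):
        if sample_score(mean_a, sd_a, rng) > sample_score(mean_b, sd_b, rng):
            wins_a += 1
    return wins_a / n_sims


# Predicted means of 27 (Steelers) and 9 (Chiefs); the sd of 8 is a placeholder.
p_steelers = win_probability(27, 8, 9, 8)
print(p_steelers)
```

With means that far apart and a modest spread, nearly every simulated game goes to the favorite, which is how probabilities above .90 arise.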

A few things I plan to consider (in addition to making large win probabilities and really high predicted scores appear less frequently):

1) What sort of distribution should I use? I've simply been sampling from normal distributions (telling it to redo any sample where a score is less than zero or greater than 65). Looking at the distributions of scores, it may make more sense to use a Gamma distribution or at least to transform the values before sampling from a normal.
2) Should I weight more recent games more strongly to capture trends in these parameters?
3) What other factors should I include in the model? (e.g. does it really matter if a game is played at home vs. away? Also, what effects should coaching or historical rivalries have on games predicted to be particularly close?)
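As a rough sketch of the first idea, a Gamma distribution can be matched to a team's mean and standard deviation by the method of moments, which removes the need to redraw negative scores. The function names and the example mean of 21 and standard deviation of 10 are assumptions for illustration, not values from the model.

```python
import random


def gamma_params(mean, sd):
    # Method of moments: a Gamma(shape, scale) distribution has
    # mean = shape * scale and variance = shape * scale**2.
    shape = (mean / sd) ** 2
    scale = sd ** 2 / mean
    return shape, scale


def sample_gamma_scores(mean, sd, n, seed=0):
    # Draw n strictly positive scores with the requested mean and sd.
    rng = random.Random(seed)
    shape, scale = gamma_params(mean, sd)
    return [rng.gammavariate(shape, scale) for _ in range(n)]


scores = sample_gamma_scores(21, 10, 10_000)
print(min(scores), sum(scores) / len(scores))
```

Unlike the truncated normal, this distribution is skewed to the right, so it leaves some room for the occasional very high score while never going below zero.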

A new year, a new project


So the election has come and gone. A year has passed and I have been itching to play around with other predictive statistical models. I decided to turn to professional football. I think that it will be difficult to produce a model that is very accurate for a couple of reasons. First, there are very few games in a football season (at least compared to other sports) so there is greater error in parameter estimation. Second, there is so much chaos in a football game; a few seemingly small decisions may determine the outcome of the game. Of course there are also the larger decisions or rare events that no one can foresee (e.g. the end of the New England vs. Indianapolis game). So I'm not really convinced that the model will do too well but it should be fun nonetheless.

Like the election forecast model, this operates using a parametric bootstrap. In the election model I used polls to estimate win probability for each state. In this model I use performance data from earlier in the season to estimate points gained and points allowed by each team in each subsequent game (along with win probability). The NFL is in week 11. So there is still time to tweak the model to improve its performance through the end of the season. Hopefully it will do as well as the "experts" at ESPN, which is not always too impressive.

Details about this model (version 1.0) along with predictions for week 11 will be posted soon.

Tuesday, November 11, 2008

Election Night Post-Mortem (I)



Now that we have had a week to digest the election results, I would like to do a little post-election analysis of the results.  My model predicted Obama would win with a probability  > 99.9%. While I don't know how to verify this number, I can compare my electoral vote estimates with the actual result.  

Not all of the results were immediately available.  Several close contests included IN, NC, and 1 NE district (all went to Obama) and MO (probably will go to McCain).  This brings Obama's final electoral vote tally to 365.

My first question is: Assuming that the actual outcome came from a distribution of Obama electoral vote probabilities identical to that of my model's (null hypothesis), what is the probability of having an actual Obama electoral vote count as extreme as 365?

To answer this we do a simple z-test: Z = (365 - 352.1) / 24.39 = 0.5289, where 352.1 and 24.39 are the mean and standard deviation of my simulated electoral vote distribution. This translates into a one-sided probability of p = 0.30. A typical significance threshold would be alpha = 0.05. Obviously the p-value obtained here is well above that, so we do not have enough evidence to reject the null hypothesis that the actual outcome came from my simulated distribution.
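The calculation is easy to reproduce with Python's standard library. This sketch uses only the three numbers quoted above (365, 352.1 and 24.39) and assumes the simulated distribution is approximately normal; the variable names are mine.

```python
from statistics import NormalDist

actual_ev = 365    # Obama's final electoral vote tally
model_mean = 352.1  # mean of the simulated electoral vote distribution
model_sd = 24.39    # standard deviation of that distribution

z = (actual_ev - model_mean) / model_sd

# One-sided: probability of a result at least this far above the mean.
p_one_sided = 1 - NormalDist().cdf(z)
# Two-sided: probability of a result at least this far from the mean in either direction.
p_two_sided = 2 * p_one_sided

print(round(z, 4), round(p_one_sided, 2), round(p_two_sided, 2))
```

Either way the p-value is far above 0.05, so the conclusion does not depend on whether the test is treated as one-sided or two-sided.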

Perhaps a clearer way to see this is by simply marking the bin in my distribution which contains the actual outcome (Obama 365).  
It's clear from this that the final outcome is right in the fat part of my bell curve.

It seems then, that these results suggest:
 (1) The polls were quite accurate in predicting the outcome and (2) The model adequately translates accurate polling data into a valid prediction of cumulative electoral vote outcome.
OR
(1) The polls were poor or modest predictors of outcome BUT (2) The simulation model is robust to somewhat inaccurate polls.
OR
(1) The polls were so accurate that they made up for (2) a model that otherwise would have been a poor or moderate predictor of outcome.

All of these can be examined to some extent.  
States above the line are those in which Obama overperformed the pollster trend margin. States below the line are those in which Obama underperformed it. I haven't yet examined these data extensively (nor have I done a comparable analysis for the realclearpolitics data, the other polling source in my model). At a glance, however, it seems that while the polls look like pretty good predictors, Obama overperformed in more of the deep blue states (e.g. VT, HI, RI, MA) and underperformed in deep red states (e.g. OK, UT, AK, AR). Those designated as toss-ups seemed to fit the unity line quite well (states that hover near 0,0).

Off the top of my head, it makes sense that the pollster trends would be the most accurate for toss-ups because these states tend to be more extensively polled, especially near the time of the election. The tendency for strong red states to become more red and strong blue states to become more blue might reflect the superior ground game of the dominant party in those states, leading to higher voter turnout for that party. This explanation is somewhat unsatisfying to me though, as likely voter models (which many pollsters use) are generally designed to account for turnout discrepancies. Also, if the results happened to be in the opposite direction I would have said that this too would make sense, as I would expect greater complacency among Democrats in strong blue states and Republicans in strong red ones. I guess that's a problem with any post-hoc explanation.

I intend to examine some of these issues more fully later. One last point though. I had mentioned that the site fivethirtyeight.com seemed to have the best election model out there. The sophistication of Nate Silver's model has actually won him a great deal of publicity lately, including interviews on cable news channels and a recent NY Times article. His final prediction had Obama winning 98.9% of the time with an average of 348.6 electoral votes. As I mentioned before, I know of no way to evaluate the accuracy of a win probability. His electoral vote prediction of 348.6 was similar to that of my simple bare-bones model (slightly less accurate even, although the difference is not statistically significant). I wonder whether my model was as accurate as his because of the particular electoral landscape in this election, or whether the added sophistication in his model (with demographic information and weights associated with each pollster, etc.) simply reaches a point of diminishing returns and does not affect the outcome that much.

Whatever the answer, this whole experience has reinforced my novice interest in playing with statistics and modelling and I intend to continue doing this sort of thing in the future.