Monday, October 27, 2008

2008 Presidential Election Simulation (III): One Week To Go

With one week left, Barack Obama seems to be holding on to his lead. For this simulation I decided to average the state-by-state win probabilities based on realclearpolitics.com and pollster.com polling aggregates.

The results: Obama wins more than 99.9% of the time. I have him averaging 345 electoral votes, with a 95% confidence interval of (298, 392).
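For anyone curious how a summary like this can be computed, here is a minimal sketch. It assumes the simulated electoral vote totals are available as a list, and it assumes the interval is the mean plus or minus 1.96 standard deviations, which would be consistent with the symmetric bounds quoted above. The function name and the toy input are hypothetical, for illustration only.

```python
import statistics

def summarize_totals(ev_totals, z=1.96):
    """Return (mean, lower, upper) for a list of simulated electoral vote totals.

    Assumption: the interval is mean +/- z sample standard deviations
    (a normal approximation), not a percentile-based interval.
    """
    mean = statistics.mean(ev_totals)
    sd = statistics.stdev(ev_totals)  # sample standard deviation
    return mean, mean - z * sd, mean + z * sd

# Toy input only; these are not real simulation results.
mean, lower, upper = summarize_totals([300, 340, 380])
print(mean, lower, upper)
```

A percentile-based interval (the 2.5th and 97.5th percentiles of the simulated totals) would be an equally reasonable choice and would not need to be symmetric.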

All data to the right of the vertical dotted line represent Obama simulation "election" wins. Those few results to the left of the dotted line are wins for McCain. As with my previous simulations, these data represent the results of 10,000 simulated elections.

The chart below represents the current status of the race based on the win probabilities computed from my model. The states are ordered by probability, from most likely Obama wins to most likely McCain wins. This demonstrates the challenge McCain faces. The states colored in green are places where Obama currently leads but that are the likeliest of this group for McCain to pick off. To get to 270, McCain must either win all of those green states (NM, NH, CO, OH, NV, FL, NC, and MO) or some other combination that includes even more strongly blue states, such as PA. Despite this week's broadcast of This American Life, McCain is on the wrong part of the normal curve in PA, winning it just 5% of the time.



These data are a snapshot of what would happen if the election were held today. Who knows what events will occur to change the dynamics of the race in the next week. I am skeptical that McCain can do much of anything at this point to win. Only a low-probability crisis or surprise about Obama would likely change the outcome. I will crunch the numbers on Election Day and examine how close my electoral vote projections are to the election results.

Friday, October 24, 2008

2008 Presidential Election Status (Compared with 2004)

In the preceding post, I discussed the commanding lead that Obama has sustained and alluded to a changing electoral landscape in which (at least this time) the outcome will not simply hang on the results in FL, OH, and PA.

To provide context for where things were immediately before the 2004 election, I decided to dig up state polling data from Real Clear Politics.  This page gives the RCP Average polling margin for 18 states (those states for which the race was somewhat tighter) in the 2004 election.  Those states were FL, OH, PA, WI, IA, MN, MI, MO, NM, NV, CO, NH, ME, WV, OR, NJ, AR, and HI.

Below is a histogram of the polling error (the RCP Average margin compared with the actual margin) in these states. Negative values indicate states where Bush polled better than his actual result, and positive values indicate states where Kerry did.

From this we can see that the RCP Average was generally pretty accurate, with the notable exception of HI (far left of the histogram) which RCP predicted for Bush by 0.9 points but Kerry won by 9.

Since not all states carry equal weight, I decided to weight these errors by electoral votes to give a theoretical RCP error impact (simply, percent margin error * state electoral votes).
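As a rough illustration of this weighting, here is a sketch using the three states whose poll and result margins are quoted in this post. It assumes margins are expressed as Kerry minus Bush in percentage points, that the error is the poll margin minus the actual margin, and that the percentage is treated as a fraction before multiplying. Florida's 27 electoral votes in 2004 are not stated in the post, and the function name is hypothetical.

```python
# (poll margin, actual margin, electoral votes); margins are Kerry minus Bush, in points.
states = {
    "FL": (-0.6, -5.0, 27),  # polled Bush +0.6, Bush won by 5
    "HI": (-0.9, 9.0, 4),    # polled Bush +0.9, Kerry won by 9
    "WI": (-0.9, 0.4, 10),   # polled Bush +0.9, Kerry won by 0.4
}

def error_impact(poll_margin, actual_margin, electoral_votes):
    """Polling error as a fraction, weighted by the state's electoral votes."""
    return (poll_margin - actual_margin) / 100 * electoral_votes

impacts = {name: error_impact(*row) for name, row in states.items()}
for name, value in impacts.items():
    print(f"{name}: {value:+.2f}")
```

Under these assumptions Florida alone contributes about +1.19, while Hawaii and Wisconsin contribute small negative values despite being the two missed calls.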


Again, we see no major bias in the data: the average theoretical error impact is near zero (0.07) and the cumulative error impact is 1.32. This indicates that the polling theoretically overestimated Kerry's electoral vote total by 1.32 votes. This positive value is almost entirely due to FL, which was polling at Bush +0.6 but was won by Bush by a margin of 5. Of course, this error did not affect the outcome in FL, so it is important not to misinterpret these theoretical error impact scores as reflecting actual differences in electoral votes. In fact, there were only two states for which the polling data picked the wrong winner: HI (4 EV) and WI (10 EV). Kerry won both, and both were predicted for Bush. While polling for HI was way off (see above), the RCP Average for WI was Bush by 0.9, while Kerry won it by 0.4. Although the predicted winner was wrong, the RCP Average was still fairly accurate there. Together, the actual impact of polling error was therefore Kerry +14 EVs.

So what did the polls look like in 2004?  Below is a histogram of the state margins (Kerry - Bush) based on the RCP Averages for 18 competitive states just prior to the election:

The average margin in 2004 for these states was slightly in favor of Bush, by about 0.39 points. Kerry led in the polls in 7 states while Bush was ahead in 11. Again, because these states have different EV values, the predicted electoral vote count among these 18 states was Bush 109, Kerry 78. The actual outcome for these states was Bush 95, Kerry 92, still a large enough margin for Bush to win the election.

Below is a histogram of the same 18 states indicating the current RCP Average margin for the 2008 election. (Note: states like VA and NC, which may flip to the Democrats this year, were not among those included in the 2004 list of 18 and are therefore not included here either.)

One obvious difference between this histogram and the one above is a rightward shift in polling margin, to an average Obama lead of more than 9 points. Based on state electoral vote values, this predicts that Obama wins all but two of the 18 states, receiving an amazing 176 electoral votes to McCain's 11.

In summary, the 2004 polling data indicate that RCP Averages are pretty close to the actual results. If the current pattern, or anything close to it, holds up for the 2008 election, Obama will win far more than the 270 EVs he needs, with a high probability of a landslide.

Thursday, October 16, 2008

Obama doesn't need Ohio or Florida (although he may still get one or both of them)

In the past two weeks, the numbers have become far more favorable for Obama across the board. This includes leads in Florida and Ohio as well as previously red states like Virginia, North Carolina, and others.

The most recent polls seem to have the race tightening somewhat in Ohio in particular and to some extent in Florida (graphs created by pollster.com, using trend lines that are maximally sensitive to local changes as well as noise).




Nevertheless, Obama still wins if he is able to hang on to New Hampshire (Obama +7.3 to +10.4), New Mexico (Obama +7.5 to +8.4), Virginia (Obama +7.7 to +8.1), and Colorado (Obama +5.8 to +6.2), ignoring all other toss-up states.





Of these, the margin in New Hampshire may be tightening somewhat, although Obama still has a healthy lead.



Additional toss-ups which lean in Obama's direction (when smoothing is maximally sensitive to local changes) include Nevada, North Carolina, Missouri, and of course still Ohio and Florida. In recent days, North Dakota and West Virginia have also become toss-ups, and Obama still threatens to pick up Indiana. With the four I mentioned and graphed above (NM, NH, VA, and CO), however, all these other states, including the ostensibly critical Florida and Ohio, simply become gravy for Obama.

Here's a wonderful graph from Charles Franklin that puts this into perspective. Read his full article here.

So the story line of the last two elections (namely, that it all comes down to who can win two out of Ohio, Pennsylvania, and Florida) has changed. There are many ways Obama can win whether or not he gets Ohio or Florida, and with his double-digit lead in Pennsylvania, that state is unlikely to be a deciding factor (any more than, say, New York or Maryland).

Monday, October 6, 2008

Presidential Tape Measure


The New York Times recently published a chart with all major party presidential candidate heights and weights since the 1896 election. They explicitly ask:

Does candidate height and weight play a role in electoral success?
I decided to run a linear regression using height and weight to examine the role of each variable. It seems that while neither significantly contributes (p > 0.05), there is a trend for weight (p = 0.10). Adding height to the model does not improve it, which makes some sense given the highly significant correlation between height and weight.

In this group, winners were on average nearly 12 lbs heavier than losers, although again, this is not enough to reject the possibility that the difference is due to chance.

One last note: I looked to see whether heights and weights have changed much over time. It seems that there has been no change in average presidential candidate weight since 1896. However, presidential candidates have gotten significantly taller (statistically speaking) through the 20th century. From 1896 to 1960 (the beginning of the TV era), presidents averaged about 5'10.6". Since then, they have averaged 6'0.5". I suppose these data suggest that modern presidents, being taller at about the same weight, have been more physically fit than early-to-mid 20th century ones.

Saturday, October 4, 2008

2008 Presidential Election Simulation (II)

I last posted election simulation results a few weeks ago. At that point Democrats were holding their heads in their hands as McCain coasted through a post-convention bump and the dominant news stories were about the "lipstick on a pig" comment and Obama's supposed kindergarten sex ed program. Since then we have had a financial crisis, McCain "suspended" his campaign, we have had presidential and vice presidential debates, and today a Wall Street/economy bailout plan finally passed in the House after initially being rejected. Over this period there has been a great deal of movement in poll numbers, generally in a favorable direction for Obama.

Here is another presidential election simulation based on current polling data, predicting what would happen if the election were held today.  Keep in mind that as polling numbers change in the next month, so too will these probability values.

When I previously explained my methodology, I may have glossed over the details a little bit.  Here is hopefully a clearer explanation of my procedure.

Taking the current polling margins, I compute a win probability for each state based on a normal cumulative distribution function that approximates Charles Franklin's curve. His curve can be found here. Below is my approximation of his curve, using the parameters mu=-1, sigma=6.8.
I then use the current state-level polling data to determine win probabilities for each state.  I create two separate models, one based on the Real Clear Politics averages and the other based on the latest regression analysis margins from Pollster.com.
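The margin-to-probability step can be sketched as below. The parameters mu=-1 and sigma=6.8 are the ones given above; the function name, and the reading that the margin is Obama minus McCain in points with the output being Obama's win probability, are my assumptions for illustration.

```python
import math

MU = -1.0     # curve centre, from the parameters above
SIGMA = 6.8   # curve spread, from the parameters above

def win_probability(margin, mu=MU, sigma=SIGMA):
    """Normal CDF evaluated at a polling margin (in points).

    Assumption: margin is Obama minus McCain, and the result is read
    as the probability that Obama carries the state.
    """
    z = (margin - mu) / sigma
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

for margin in (-10, -1, 0, 5, 10):
    print(margin, round(win_probability(margin), 3))
```

Under this reading a 10-point lead maps to a win probability a little under 95%, and the probability is exactly 50% at a margin of -1.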

I then simulate 10,000 elections based on these probabilities.  The percentage of the time Obama gets more than 269 electoral votes is taken to be his overall win probability if the election were held today.
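Here is a minimal sketch of that simulation loop. It assumes each state is drawn independently from its own win probability, which is a simplification I am labeling as an assumption; the function name and the three-bloc toy map are hypothetical and are not real polling data.

```python
import random

def simulate(states, n_sims=10_000, seed=0):
    """states maps name -> (win probability, electoral votes).

    Returns (share of simulations with more than 269 EVs, mean EV total).
    Assumption: states are drawn independently of one another.
    """
    rng = random.Random(seed)
    wins = 0
    total_evs = 0
    for _ in range(n_sims):
        # A state is won in this simulated election if a uniform draw falls below p.
        evs = sum(ev for p, ev in states.values() if rng.random() < p)
        total_evs += evs
        if evs > 269:
            wins += 1
    return wins / n_sims, total_evs / n_sims

# Hypothetical three-bloc map summing to 538 EVs; not real polling data.
toy_map = {"safe blue": (1.0, 260), "toss-up": (0.5, 20), "safe red": (0.0, 258)}
win_rate, mean_evs = simulate(toy_map)
print(win_rate, mean_evs)
```

In the toy map the candidate wins only when the single toss-up bloc breaks their way, so the win rate should land near 50%.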

Here are the results:
Based on the margins of polling averages from Real Clear Politics, Obama wins an astonishing 98% of the time (average 328 electoral votes).  Note that results to the left of the dotted line are McCain wins while those to the right of the dotted line are Obama wins.
Based on the margins of pollster.com regression lines, Obama wins 93% of the time (average 312 electoral votes).


So at this point the race is Obama's to lose.  Nevertheless, after seeing such dramatic movement in the polls in the last three weeks we might expect that these numbers could shift back.  In fact, I expect Obama's lead to decrease in the next month if for no other reason than the fact that races tend to get tighter as elections get nearer.

Wednesday, October 1, 2008

Will Power: Dopamine and Temporal Discounting

Temporal discounting (a.k.a. delay discounting and hyperbolic discounting) is a concept in psychology and behavioral economics that describes the tendency for animals and people to prefer smaller rewards that will happen sooner (SS) over larger rewards that will occur later (LL). This concept has been developed by George Ainslie, who wrote Breakdown of Will, a book I recommend. This approximates the concept of "impulsive choice," which is integral to our understanding of drug addiction. It is also why we may value being healthy and physically fit over eating that piece of chocolate cake, but we still end up eating the cake when it's in front of us.

Temporal discounting is not just associated with pathologies; it also describes a normal tendency to behave irrationally. For example, most people offered $50 now versus $100 (inflation adjusted) in two years would choose the $50. On the other hand, if they could choose between $50 in five years and $100 in seven years, the vast majority would choose the latter option. How is this possible? In five years, after all, they would be in the identical situation as in the first scenario, in which they make the opposite choice.

The rate (shape) of a value curve as a function of delay can explain this discrepancy. Most behavioral data indicate that the change in value as a function of delay follows a hyperbolic function (as opposed to an exponential one, for example).

This figure, from Ross et al. (University of Alabama), shows that when a delay is short, the hyperbolic function allows an SS reward to be more strongly valued than an LL reward, even though the smaller reward is less strongly valued most of the time.

Depending on the individual and the reward involved, the value curve can look dramatically different. A smoker may be certain he is smoking his last cigarette and feel very much in control. Hours later, that same smoker, because of a sharply scooped-out hyperbolic value curve, may find himself lighting up once again. This is one way to understand "will power": the sharper the hyperbola, the more impulsive the behavior appears.
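To make the preference reversal concrete, here is a small sketch using the $50 versus $100 example from above. It assumes the commonly used hyperbolic form V = A / (1 + kD) and, for contrast, an exponential form V = A * exp(-rD). Neither formula nor the values of k and r come from this post; they are illustrative choices, with delays in years.

```python
import math

def hyperbolic(amount, delay, k=1.0):
    """Hyperbolic value: amount / (1 + k * delay). k is an illustrative choice."""
    return amount / (1.0 + k * delay)

def exponential(amount, delay, r=0.5):
    """Exponential value: amount * exp(-r * delay). r is an illustrative choice."""
    return amount * math.exp(-r * delay)

def prefers_larger_later(value, front_delay):
    """True if $100 two years after front_delay is valued above $50 at front_delay."""
    return value(100, front_delay + 2) > value(50, front_delay)

for name, value in (("hyperbolic", hyperbolic), ("exponential", exponential)):
    print(name, prefers_larger_later(value, 0), prefers_larger_later(value, 5))
```

With these parameters the hyperbolic chooser takes the $50 when it is available now but switches to the $100 when both options are pushed five years out, while the exponential chooser makes the same choice in both cases, because pushing both rewards back by the same amount leaves the ratio of their values unchanged.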

In the July 30, 2008 issue of the Journal of Neuroscience, Kobayashi & Schultz published a paper showing that dopamine neurons, which are known to be associated with reward prediction error and incentive salience, also encode temporal discounting in a Pavlovian conditioning task. This suggests that prior to making the choice, our dopamine neurons are likely firing less to the idea of $100 in two years than to the idea of $50 now.
If you believe in "free will" (I will do another post on this topic sometime), how do you square it with the potential role of dopamine neuronal activity in constraining value? In fairness, it remains to be proven that the activity of these dopamine neurons causes the behavioral phenomenon. Nevertheless, future work will undoubtedly examine how well we can predict a subsequent choice from the early activity of dopamine neurons in response to differently valued and differently delayed rewards, before those rewards are presented as a choice. If we are able to do this accurately, it seems to me we will have strong neurobiological evidence for value constraints that precede choice and will.