Posted by: Paul Hewitt | May 26, 2009

Calibration = Prediction Market Accuracy?

In response to a recent paper I wrote on prediction market accuracy (or lack thereof), I received counterarguments claiming that a prediction market may still be “accurate” even though it fails to predict the actual outcome.  The argument was that “accuracy” is found in the calibration of the market predictions to the actual outcomes.  Let’s look at this concept as it applies to two types of markets:  a pari-mutuel horse race and a “winner-take-all” prediction market (using sales as an example).

Pari-mutuel Calibration

Pari-mutuel horse race markets (betting pools) are very well-calibrated (there is ample evidence of this).  That is, the odds generated from bets placed do, in fact, reflect the actual distribution of outcomes averaged over a large number of trials (races).  We shouldn’t find this particularly remarkable, as long as the market (bettor pools) possesses a reasonable degree of information “completeness”.  This probably holds true, given the fairly large number of diverse track bettors for most races.  Consequently, horses with a 10% chance of winning, based on the bets placed, will win about 10% of the races.  Here, the results are averaged over many, many races, just as they are when a coin is tossed many times and “heads” comes up 50% of the time.  Having a high degree of calibration in these markets ensures that the odds are “fair” to the bettors.
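To make the idea concrete, here is a rough simulation sketch (using made-up races, not real racing data).  It assumes the pari-mutuel odds equal each horse’s true win probability (the information “completeness” assumption above), groups horses into buckets by implied probability, and checks that, say, the “10% horses” win about 10% of the time:

```python
import random
from collections import defaultdict

random.seed(42)

def simulate_race(n_horses=8):
    # Assumption: the market's implied odds equal the true win probabilities.
    weights = [random.random() for _ in range(n_horses)]
    total = sum(weights)
    probs = [w / total for w in weights]              # implied probabilities
    winner = random.choices(range(n_horses), weights=probs)[0]
    return probs, winner

buckets = defaultdict(lambda: [0, 0])                 # bucket -> [wins, entries]
for _ in range(50_000):
    probs, winner = simulate_race()
    for horse, p in enumerate(probs):
        b = round(p, 1)                               # 0.1 = "about a 10% horse"
        buckets[b][1] += 1
        if horse == winner:
            buckets[b][0] += 1

for b in sorted(buckets):
    wins, n = buckets[b]
    print(f"implied {b:.1f} -> observed win rate {wins / n:.3f} ({n} entries)")
```

Over many races, the observed win rate in each bucket tracks the implied probability, which is all that “well-calibrated” means; it says nothing about whether any single race was predicted correctly.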

If we want to cash the most winning tickets, we would place bets on the favourite in every race.  The favourite has the highest likelihood of winning.  This doesn’t mean that the favourite will win, just that the odds of that horse winning are better than those for any of the other horses.  If there were no track “take” (i.e. no cost to play), it would be a zero-sum game.  You could bet on any (or all) horses and expect to come out “even” in the long run.  Not much fun in that!

While pari-mutuel horse race markets are set up for the primary purpose of wagering, they do provide a frequency distribution of bets placed on each of the horses, which provides some predictive information about the future outcome (winner).  However, pari-mutuel horse races are different from “winner-take-all” prediction markets that attempt to predict the actual future value of a continuous variable (future sales for example).  In a horse race, the possible outcomes are discrete (horses).  In the horse race, unless the horse with the highest likelihood of winning does win, the market has failed to predict accurately, despite the fact that the pari-mutuel market is “well-calibrated.”  This is fine for a betting pool, but it is of little use in a corporate prediction market.

Enterprise Prediction Market Calibration

Many enterprise prediction markets are formulated as “winner-take-all” bets, which provide distributions of predictions about uncertain outcomes, somewhat similar to those of horse race markets.  Ideally, we would like these distributions to accurately reflect the distribution of actual outcomes that are being predicted.  Seems obvious enough, but how do we know when a prediction market is well-calibrated with the distribution of actual outcomes?  I have yet to see a study that has run similar enterprise prediction markets enough times to obtain an accurate distribution of actual outcomes that could be compared with prediction market distributions.  Aren’t we really assuming that prediction markets are well-calibrated?

In a prediction market, the decision-maker is hoping to derive an accurate prediction of the actual outcome.  Given that we are looking at an uncertain outcome, there will always be an error factor associated with the prediction.  If the most likely state (or share) does not capture the actual outcome, we hope that the next most likely state will.  That is, we want the most likely state to be as close as possible to the actual outcome.  Contrast this with a horse race.  There is no decision-maker other than the bettor.  The bettor selects a horse to win.  If the horse does not win, it doesn’t matter which of the other horses actually won; the bettor loses.  In an enterprise prediction market, the decision-maker does care which of the other “horses” (states) “wins.”

The difference is that an enterprise prediction market usually attempts to predict a continuous variable, such as quarterly sales, whereas a horse race market attempts to predict a discrete outcome.  In such a prediction market, we can derive the average sales forecast figure.  This is the figure representing the best estimate of the future outcome.  This is the figure that must be “accurate” for it to be useful in decision-making.  In a horse race market, such an average is meaningless (i.e. the 2.6th horse?), because the horse numbers (or names, or positions, etc…) are not related in any meaningful way.
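For illustration, here is how that average forecast could be derived from a winner-take-all sales market.  The bucket midpoints and contract prices below are made-up numbers; the prices are assumed to sum to one, so they act as probabilities:

```python
# Hypothetical winner-take-all market on quarterly sales: each contract
# pays off if sales land in its bucket. The probability-weighted average
# of the bucket midpoints gives the market's point forecast.
sales_buckets = {      # bucket midpoint ($M) -> contract price
    9.0: 0.05,
    10.0: 0.20,
    11.0: 0.40,
    12.0: 0.25,
    13.0: 0.10,
}
forecast = sum(mid * price for mid, price in sales_buckets.items())
print(round(forecast, 2))  # 11.15
```

The same probability-weighted average computed over horse numbers in a race market would be exactly the meaningless “2.6th horse” above, because horse numbers, unlike sales figures, are not ordered in any meaningful way.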

The value of calibration is that it verifies the extent to which the distribution of predictions (bets) matches the actual distribution of outcomes.  This tells us the extent to which we may rely on the prediction market distribution as a proxy for the underlying uncertainty of the actual outcome.  It also tells us (if well-calibrated), how much uncertainty exists surrounding the future outcome.  If there is a great deal of uncertainty, the prediction market will not be very useful.

Now let’s look at a prediction market that has near perfect calibration, but a nearly flat distribution of bets (opinions).  Some might argue that the market is “accurate”, but it is useless for decision-making purposes.  The market is telling us that the outcome is too unpredictable.  However, the market could be used for betting, because it is well-calibrated. Think of a betting market on the outcome of rolling a fair die.


The point of this discussion is that prediction markets should be well-calibrated, but this is not a sufficient condition for their usefulness.  They must also provide accurate predictions, with relatively tight distributions.  The maximum allowable dispersion of the distribution will depend on the materiality of the forecast error.  That is, the prediction should be accurate enough that the likely forecast error would not have caused the decision-maker to make a different decision, had the true value been known in advance.
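A rough sketch of that materiality test (again with made-up bucket prices, and a hypothetical threshold standing in for whatever forecast error would actually change the decision) is to compare the standard deviation of the market distribution against the largest error the decision could tolerate:

```python
import math

# Hypothetical sales market: bucket midpoint ($M) -> contract price.
sales_buckets = {9.0: 0.05, 10.0: 0.20, 11.0: 0.40, 12.0: 0.25, 13.0: 0.10}

mean = sum(m * p for m, p in sales_buckets.items())
variance = sum(p * (m - mean) ** 2 for m, p in sales_buckets.items())
std_dev = math.sqrt(variance)

materiality_threshold = 1.5  # assumed, in $M: the largest tolerable error
print(f"forecast {mean:.2f}, std dev {std_dev:.2f}")
print("usable for decision-making" if std_dev <= materiality_threshold
      else "too dispersed: treat as an uncertainty signal only")
```

A flatter distribution of prices would push the standard deviation up past the threshold, which is the case described below: the market still conveys information, but about the degree of uncertainty rather than the outcome itself.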

Where the distribution of the prediction market is not tight, the market may still have some use, but not so much for being able to predict the outcome.  Instead, the market will be providing information about the degree of uncertainty surrounding the outcome.  This may indicate the need for greater care in assessing risks and the need for more extensive contingency planning.  A flatter distribution may indicate that the market is not functioning properly (lack of information completeness, perhaps).  Alternatively, a flat distribution may indicate that the variable being predicted is, simply, not predictable.


