Posted by: Paul Hewitt | May 3, 2009

Prediction Market Accuracy and Usefulness

Consensus and Differences of Opinion in Electronic Prediction Markets, by Thomas S. Gruca, Joyce E. Berg and Michael Cipriano (2005)

I came across an obscure paper that delivers some interesting findings about the capabilities of prediction markets in the real world. Google Scholar indicates that this paper has only six citations, yet I found it to be very useful, because it involves a real world case study that examines three aspects of prediction markets:

  1. How well do prediction markets capture private information held by traders?
  2. Do prediction market prices reflect the dispersion of trader forecasts in addition to the consensus?
  3. How does the composition of the trader pool affect the disclosure of private information?

The authors conclude that prediction markets are able to aggregate privately held information quite well, they are able to aggregate information about the consensus of private information and its dispersion, and that ‘open’ markets result in better predictions than ‘closed’ markets of homogeneous traders.  Consequently, corporate prediction markets should not be restricted to in-house participants.  In this blog, I critically examine these conclusions and provide additional insight into the issues raised.

Background

The authors start with the premise reached by Plott and Sunder (1982, 1988), who showed that markets are able to disseminate information from “informed” traders to uninformed traders. Where there is perfect information (no uncertainty), it is effectively communicated from the informed to the uninformed. Where the information is “complete” (the sum of all information reveals the true state), market prices accurately predict the outcome. Where there is uncertainty or the information set is not complete, prices may deviate from their expected values and lose the power to predict accurately (Sunder 1995). Their conclusions were based on laboratory experiments involving a simple, hypothetical situation.

The authors of the current paper decided to test these conclusions in the real world. They chose to run a series of markets, similar to those run by the Hollywood Stock Exchange (HSX), involving predictions of four-week box office receipts for 11 different movie openings (November 1998 – November 2002). Each market involved 4 – 6 “winner-take-all” securities. Trading took place on the Iowa Electronic Markets (IEM), using its continuous double-auction mechanism with real money trades. Trading commenced between four and 14 days before each movie opened in the theatres. A market prediction was obtained immediately before each movie opened, though trading continued during the movie’s run.

In order to test the market’s ability to aggregate private information held by traders, the authors collected forecasts from traders before they started trading.   This provided a measure of the private information held by the traders (as opposed to public information revealed by prices or other means).   Most of the traders were marketing students who completed a project in which they were asked to forecast movie box office receipts, performing their own analyses, using any information they could find.  There were four “closed” markets, in which all of the traders were students who had submitted their private forecasts before trading.  There were also seven “open” markets in which other self-selecting traders were allowed to participate, using their own money.   Here, the term “forecasts” refers to the students’ prior forecasts, and “predictions” refers to the prediction markets’ predictions.  This will make it easier to follow the analyses.

Do Prediction Markets accurately incorporate Private Information?

Yes. The authors compared the means of the students’ forecasts before trading in the market with the mean prediction implied by the market prices just before the movie opened.  They found a correlation of 0.99, indicating that the prediction market prices were accurately reflecting the private information held by the traders.
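To make "the mean prediction implied by the market prices" concrete, here is a minimal sketch of how a mean (and a standard deviation) can be read off a set of winner-take-all contract prices. It assumes that normalized prices can be treated as probabilities and that each contract's range can be represented by a single midpoint. The prices and midpoints are made up for illustration, the function name is hypothetical, and this is not necessarily the authors' exact procedure.

```python
from math import sqrt


def implied_moments(prices, midpoints):
    """Return (mean, std) of the distribution implied by contract prices.

    Assumption: each winner-take-all contract covers one range of outcomes,
    represented here by a single midpoint, and its normalized price is read
    as the probability that the outcome lands in that range.
    """
    if len(prices) != len(midpoints):
        raise ValueError("need one midpoint per contract")
    total = sum(prices)
    if total <= 0:
        raise ValueError("prices must sum to a positive number")
    probs = [p / total for p in prices]  # normalize so probabilities sum to 1
    mean = sum(q * m for q, m in zip(probs, midpoints))
    var = sum(q * (m - mean) ** 2 for q, m in zip(probs, midpoints))
    return mean, sqrt(var)


# Hypothetical market: five contracts on four-week receipts ($ millions).
prices = [0.10, 0.20, 0.40, 0.20, 0.10]
midpoints = [10.0, 30.0, 50.0, 70.0, 90.0]
mean, std = implied_moments(prices, midpoints)
print(f"implied mean = {mean:.1f}, implied std = {std:.1f}")
```

Open-ended contracts (for example, "more than $80 million") have no natural midpoint, so any representative value chosen for them is a further assumption that affects both numbers.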

Do Prediction Markets reflect the Dispersion of Traders’ Forecasts (based on private information)?

The traders’ private information was incorporated and reflected in their forecasts (made prior to trading).  The degree of dispersion of these forecasts is described by the standard deviation.  Similarly, the authors calculated the standard deviation implied by the contract prices obtained from the prediction market.   They found that the market standard deviation was smaller than that for the students’ forecasts in every market, indicating a tighter distribution in the prediction markets and, presumably, a less uncertain prediction.  Some of the reasons put forth to explain the tighter distribution were that:

  • extreme forecasts get changed by some traders, when they see the other traders’ forecasts, as reflected in market prices;
  • the number of contracts in the market may have affected the standard deviation, and
  • the assumption of a normal distribution may affect the true standard deviation.

So, they compared the actual market contract prices with those that would be expected if the entire distribution of students’ point forecasts (private, prior) were used to determine the contract prices.  That is, using the frequency data from the point forecasts, they estimated the probability of each contract paying off.  The expected contract prices should correspond to those observed in the market, if the entire distribution of students’ private information is being reflected in the contract prices.  Here, they found that the correlations were significant in 7 of the 11 markets, with the average being 0.81.  However, the correlations were particularly poor in two markets.  They cite three possible reasons for the poor correlations:

  • Additional information was obtained by traders after their point forecasts were made (and reflected in market prices only);
  • Other, non-student, traders (no prior forecast) were more influential in setting market prices than were the student traders (these markets appear to have been dominated by non-student traders, who had very different information), or
  • There was a market failure.

No conclusion was reached.  We might say that if either of the first two explanations is true, that is a good thing.  We want prediction markets to incorporate new information and the information provided by new participants.   Also, we want the market to determine which traders will be most influential in setting prices, based on their own individual predictions and degrees of certainty.  That is, just because the students did some research doesn’t mean that their forecasts should dominate in the prediction market.  They may not be very good forecasters.
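The comparison of expected and observed contract prices described above can be sketched as follows. The expected price of each contract is taken to be the share of point forecasts that fall in its range, and the result is correlated with the observed prices. All numbers, the contract boundaries, and the function names are hypothetical. This is a sketch of the idea, not a reconstruction of the paper's data.

```python
from math import sqrt


def expected_prices(forecasts, upper_bounds):
    """Share of point forecasts falling in each contract's range.

    upper_bounds holds the ascending upper edges of every contract except
    the last one, which is treated as open-ended (an assumption).
    """
    counts = [0] * (len(upper_bounds) + 1)
    for f in forecasts:
        # Index of the contract = number of boundaries the forecast reaches.
        idx = sum(1 for b in upper_bounds if f >= b)
        counts[idx] += 1
    n = len(forecasts)
    return [c / n for c in counts]


def pearson(xs, ys):
    """Plain Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    if sx == 0 or sy == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / (sx * sy)


# Hypothetical student point forecasts ($ millions) and four contracts:
# under 20, 20 to under 30, 30 to under 40, and 40 or more.
forecasts = [12, 18, 22, 25, 27, 31, 33, 38, 45, 55]
expected = expected_prices(forecasts, upper_bounds=[20, 30, 40])

# Hypothetical observed market prices for the same four contracts.
observed = [0.15, 0.35, 0.30, 0.20]
r = pearson(expected, observed)
print(expected, round(r, 2))
```

With only 4 to 6 contracts per market, a correlation like this rests on very few points, which is worth keeping in mind when reading the per-market results.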

Does the Composition of Traders Affect Market Accuracy?

There were two classes of markets – ‘open’ and ‘closed’.  The closed markets included only students who had completed the project of forecasting movie receipts before they began trading.  Open markets included other real money traders, who self-selected into the markets.

In order to estimate the accuracy of the prediction markets, the authors looked at the absolute percentage error of the predictions and forecasts (private, priors). They found a mean absolute percentage error (MAPE) of 0.29, or 29%, across all markets. The MAPE for the seven open markets was 17%, but for the four closed markets it was 50%. The authors conclude that adding traders to the mix improves the accuracy of the prediction markets. They imply that corporate prediction markets should consider opening the markets to traders not normally involved with the forecast, in order to improve the accuracy of the predictions.

There are several problems with this analysis. The authors’ conclusion is wrong.   Looking at all of the students’ forecasts, we find that the MAPE was 33%.  We also find that it was 57% when they were in ‘closed’ markets, but only 20% when they were in ‘open’ markets.  The students did not know which market they would be in prior to making their forecasts, so it should be irrelevant.  We need to look, solely, at the overall accuracy.

By applying a bit of my own math, I find that the percentage improvement of the market predictions over the initial student forecasts is about 11.7%, and it does not matter much whether the market is open or closed.  Both open and closed markets experienced gains in accuracy (11.5% and 12.0%, respectively).  However, two of the seven open markets actually had a higher error than the initial forecasts made by the students prior to the market opening.  This was not explained by the authors.  I will provide one explanation, below.  We cannot attribute any effect on accuracy to whether the market was ‘open’.  Instead, the average error appears to be more dependent on the particular movie’s receipts being forecasted.  Some movies are harder to predict than others.  Maybe these markets are not appropriate for obtaining useful predictions, given the makeup of the trader pool.

UPON FURTHER EXAMINATION…

I took the data disclosed in this paper and ran it through my own analyses. I segregated the open and closed market data, so that all analyses could be compared between the two groups, if necessary. I calculated the average percentage error for the student forecasts and for the market predictions, to see how much of an improvement (if any) was obtained by running the prediction markets. I calculated the decrease in the standard deviation between the student forecasts and the market contract prices, to see whether the prediction market helped to reduce the uncertainty of the prediction over the students’ initial forecasts.

The authors calculated the percentage error with the actual outcome in the denominator. They also looked only at the absolute error (i.e. it didn’t matter whether the market under- or over-estimated the outcome). If the Hollywood executives were to use the forecasts or market predictions in their decision-making, the error should be calculated using the forecast figure as the base (denominator), as this is the figure they would be using to make decisions. I made this adjustment. I already had the standard deviations for each market, for the students’ forecasts and for the market predictions. Armed with this, I thought it would be interesting to see whether the prediction markets outperformed the students in their forecasts of the actual movie receipts.
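The difference between the two denominators is easy to see with a small sketch. The figures below are invented for illustration and the function names are hypothetical; they only show that the same pair of predictions produces a different average error depending on whether the actual outcome or the prediction is used as the base, and that signed errors can partly cancel.

```python
def pct_error(prediction, actual, base="actual"):
    """Signed percentage error; base selects the denominator."""
    if base == "actual":
        denom = actual
    elif base == "prediction":
        denom = prediction
    else:
        raise ValueError("base must be 'actual' or 'prediction'")
    return (prediction - actual) / denom


def mean_abs_pct_error(predictions, actuals, base="actual"):
    """Average of the absolute percentage errors across markets."""
    errs = [abs(pct_error(p, a, base)) for p, a in zip(predictions, actuals)]
    return sum(errs) / len(errs)


def mean_signed_pct_error(predictions, actuals, base="actual"):
    """Average of the signed errors; over- and under-estimates offset."""
    errs = [pct_error(p, a, base) for p, a in zip(predictions, actuals)]
    return sum(errs) / len(errs)


# Hypothetical predictions and outcomes ($ millions): one over-estimate
# and one under-estimate.
predictions = [60.0, 80.0]
actuals = [40.0, 100.0]

mape_actual_base = mean_abs_pct_error(predictions, actuals, base="actual")
mape_prediction_base = mean_abs_pct_error(predictions, actuals, base="prediction")
signed_actual_base = mean_signed_pct_error(predictions, actuals, base="actual")
print(mape_actual_base, mape_prediction_base, signed_actual_base)
```

In this made-up case the error is 35% with the actual outcome as the base and about 29% with the prediction as the base, while the signed average is only 15% because the two errors point in opposite directions.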

Would Hollywood executives rely on these prediction markets?

The answer has to be ‘no’.

As mentioned above, the average absolute error of the market predictions was 29%, which is only an 11.7% improvement over the students’ initial forecasts.  This shows that prediction markets do bring about some improvement in forecasts of the future, but is it good enough to be used in decision-making?  The answer has to be ‘no’ in the case of predicting future movie receipts (at least with these trader pools).

Using the absolute percentage error disguises the fact that the errors go both ways (some outcomes were under-estimated and others were over-estimated). Further, the prediction markets provide no guidance as to which way the error is likely to fall. Therefore, the real error is much larger than the absolute value of the percentage error. It is, perhaps, as much as twice the error calculated by the authors. Consequently, the real prediction market error might be as high as 58% in these markets.

We also saw that the predictions in two of the markets were worse than the initial forecasts (and we don’t know why this happened).  This speaks to the consistency issue.   If prediction markets cannot provide consistently accurate predictions in similar situations, how can they be relied upon for decision-making purposes?

What went wrong?

The authors considered the information that students obtained through their research and analyses as being “private”.  Except to the extent there may have been “collusion” in the development of individual forecasts (i.e. “study groups”), the students’ conclusions were privately held.  However, students would not be privy to industry information that would be available to Hollywood executives, film distributors, theatre owners, film critics, etc.  Instead, the students only had access to publicly available information on which to base their forecasts.  So, I think it is safe to say that the information available to the traders (collectively) was not “complete.”

Since completeness is a pre-condition for market prices to predict the true outcome, it is not surprising that these markets failed to accurately predict movie receipts. The trader pool was not diverse enough to have in its possession enough information to predict the outcome accurately.

These markets showed that prediction markets are able to reflect participant information fairly accurately, but if there isn’t enough information from the traders, the prediction may not be very good.  The conclusion has to be that diversity in the trader pool must be sufficient to include most of the relevant information needed to make an accurate prediction.

Perhaps a reduction in uncertainty has value?

In my analysis, I calculated the improvement of the dispersion in the prediction markets, relative to the initial forecasts.  Overall, the standard deviation in the prediction markets was about 35% tighter than that of the student forecasts.  It appears that trading in a prediction market helps to focus the participants’ estimates closer to the mean.   On the face of it, we would say this is a good thing.  The market is less uncertain about the forecast than a flatter distribution would indicate.  But, in these markets, the predictions have very large errors.  In a word, they were inaccurate.

Let’s examine this from a decision-making point of view. We would expect a range, one standard deviation around the mean, to capture the actual outcome 68% of the time, if the distribution is normal. The actual movie receipts were contained within this range for the students’ mean forecasts in 7 of 11 markets. That is perhaps about what one might expect, given that the students were not “experts” in forecasting movie receipts. Here’s the kicker: the actual receipts fell outside the corresponding range around the market prediction in 8 of the 11 prediction markets! Put another way, had the executives been making decisions based on a range of potential movie receipts within one standard deviation of the market prediction, they would have expected that range to capture the outcome 68% of the time. This did not happen in these markets. We aren’t even looking at whether this level of accuracy is adequate for their decision-making purposes. (I doubt it would have been.) So, even though the prediction markets had tighter distributions, they did not appear to be usefully more accurate than the students’ forecasts.
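The one-standard-deviation check can be sketched in a few lines. The example below uses invented numbers and hypothetical function names, and it assumes a normal distribution for the 68% benchmark. It shows how tightening the standard deviation around the same inaccurate means lowers the number of markets in which the range captures the outcome.

```python
from math import erf, sqrt

# Probability that a normal variable lands within one standard deviation
# of its mean: erf(1 / sqrt(2)), roughly 0.68.
NORMAL_ONE_SD = erf(1 / sqrt(2))


def within_one_sd(actual, mean, sd):
    """True if the actual outcome lies within mean plus or minus one sd."""
    return abs(actual - mean) <= sd


def coverage(actuals, means, sds):
    """Return (hits, markets): how often the one-sd range held the outcome."""
    hits = sum(within_one_sd(a, m, s) for a, m, s in zip(actuals, means, sds))
    return hits, len(actuals)


# Hypothetical outcomes and predictions ($ millions) for three markets.
actuals = [40.0, 100.0, 75.0]
means = [60.0, 80.0, 70.0]

wide = coverage(actuals, means, sds=[25.0, 15.0, 10.0])   # looser distribution
tight = coverage(actuals, means, sds=[15.0, 10.0, 4.0])   # same means, tighter
print(wide, tight, round(NORMAL_ONE_SD, 4))
```

For comparison with the 68% benchmark, the counts reported above work out to roughly 64% (7 of 11) for the student forecasts and roughly 27% (3 of 11) for the market predictions.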

We find that a tighter distribution around an inaccurate forecast can make for very poor decisions.

It makes no sense to be “more sure” (or less uncertain) of a wrong forecast.
